# Tuning Pipeline

In [0]:
from sklearn import set_config; set_config(display='diagram')

👇 Consider the following dataset.

In [0]:
import pandas as pd

data = pd.read_csv("data.csv")

data.head()

- Each observation represents a player
- Each column is a characteristic of performance

The target `target_5y` indicates whether the player lasts less than 5 years (`0`) or 5 years or more (`1`) as a professional.

In [0]:
X = data.drop(columns="target_5y")
y = data['target_5y']

## Pipeline

👇 Here is a simple pipeline to start from

In [0]:
from sklearn.svm import SVC
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.impute import SimpleImputer

# Preprocessing pipe
preprocessor = Pipeline([
    ('imputer', SimpleImputer(strategy='mean')),
    ('scaling', MinMaxScaler())
])

# Final pipe
pipe = Pipeline([
    ('preprocessing', preprocessor),
    ('model_svm', SVC())])

pipe

## Fine Tuning

Our task is to assist the recruitment process of promising young players.  
The model should **limit false alarms as much as possible** to avoid recruiting players that will flop.
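
One way to read "false alarms" is as false positives: players predicted to last 5 years or more (`1`) who actually do not. Under that reading, the quantity of interest is precision, TP / (TP + FP). The sketch below uses made-up labels (not taken from `data.csv`) only to show how scikit-learn's `precision_score` reflects false positives.

```python
from sklearn.metrics import precision_score

# Hypothetical labels, for illustration only (not from data.csv)
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 0]

# True positives: 3 (positions 0, 2, 6); false positives: 1 (position 1)
# The missed positive at position 3 is a false negative and does not affect precision
precision = precision_score(y_true, y_pred)
print(precision)  # 0.75
```

Each extra false positive lowers this score, while missed good players (false negatives) leave it unchanged.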

❓ **Fine-tune this pipeline so as to maximize your objective**

- Use the `scoring` metric appropriate for the task
- Grid search for the optimal:
    - imputing `strategy`
    - `kernel`
    - regularization factor `C`...
- Store your search results in a variable named `search`
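
Parameters of nested pipeline steps are addressed by joining step names with a double underscore, for example `preprocessing__imputer__strategy`. The sketch below shows this naming on a synthetic dataset; `X_demo`, `y_demo`, `pipe_demo`, `search_demo`, the grid values, and the choice of `scoring='precision'` are all assumptions made for illustration, not the expected solution.

```python
from sklearn.datasets import make_classification
from sklearn.impute import SimpleImputer
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

# Synthetic stand-in for X and y (assumption: the real data comes from data.csv)
X_demo, y_demo = make_classification(n_samples=200, n_features=5, random_state=0)

# Same step names as the pipeline defined in this notebook
pipe_demo = Pipeline([
    ('preprocessing', Pipeline([
        ('imputer', SimpleImputer(strategy='mean')),
        ('scaling', MinMaxScaler()),
    ])),
    ('model_svm', SVC()),
])

# Keys follow the pattern <step>__<substep>__<parameter>; values are illustrative
param_grid = {
    'preprocessing__imputer__strategy': ['mean', 'median'],
    'model_svm__kernel': ['linear', 'rbf'],
    'model_svm__C': [0.1, 1, 10],
}

# Exhaustive search over the 2 x 2 x 3 combinations with 3-fold cross-validation
search_demo = GridSearchCV(pipe_demo, param_grid, scoring='precision', cv=3)
search_demo.fit(X_demo, y_demo)

print(search_demo.best_params_)
print(search_demo.best_score_)
```

With the default `refit=True`, the best pipeline refitted on all the data is then available as `search_demo.best_estimator_`.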

In [0]:
from sklearn import set_config; set_config(display='text')
# YOUR CODE BELOW

In [0]:
from nbresult import ChallengeResult

result = ChallengeResult('solution',
    search = search
)
result.write()
print(result.check())


## Export

Once you have built your optimal pipeline, export it as a pickle file

In [0]:
import pickle

# Export pipeline as pickle file
# (pipe_tuned must be defined beforehand, e.g. pipe_tuned = search.best_estimator_)
with open("pipeline.pkl", "wb") as file:
    pickle.dump(pipe_tuned, file)
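
To check that an exported pipeline can be reused, it can be loaded back and compared with the original. The following is a minimal sketch on a small stand-in pipeline; `pipe_demo`, `X_demo`, `y_demo`, and the temporary file path are hypothetical and used only so the example runs on its own.

```python
import os
import pickle
import tempfile

from sklearn.datasets import make_classification
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

# Small fitted pipeline standing in for the tuned one (assumption)
X_demo, y_demo = make_classification(n_samples=100, n_features=4, random_state=0)
pipe_demo = Pipeline([('scaling', MinMaxScaler()), ('model_svm', SVC())])
pipe_demo.fit(X_demo, y_demo)

# Write to a temporary location so no existing file is overwritten
path = os.path.join(tempfile.mkdtemp(), "pipeline_demo.pkl")
with open(path, "wb") as file:
    pickle.dump(pipe_demo, file)

# Load the pipeline back and compare its predictions with the original
with open(path, "rb") as file:
    pipe_loaded = pickle.load(file)

same_predictions = (pipe_loaded.predict(X_demo) == pipe_demo.predict(X_demo)).all()
print(same_predictions)
```

A pickled pipeline should be loaded with the same scikit-learn version that produced it, and only from sources you trust.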

🏁 Congratulations! Don't forget to add, commit and push your notebook.