# Module 30 Topic Review  
## ML pipelines, grid-searching, and pickles!

### Pipelines in SciKit-Learn

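Pipelines provide a DRY (don't repeat yourself) way to write model development workflows in an easily repeatable, scalable, and readable format. Each step is a `(name, estimator)` tuple, and calling `fit` on the pipeline runs the steps in order.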

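```Python
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.tree import DecisionTreeClassifier

# Each step is a (name, estimator) tuple; the names are reused later as parameter prefixes
pipe = Pipeline([('mms', MinMaxScaler()),
                 ('tree', DecisionTreeClassifier(random_state=123))])

# X_train and y_train are assumed to come from an earlier train/test split
pipe.fit(X_train, y_train)
```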
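For a version that runs end to end, here is a minimal sketch on synthetic data. The dataset from `make_classification`, the sample sizes, and the variable names `test_accuracy` and `step_names` are assumptions for illustration only; the original data is not shown in this review.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (assumption: the original dataset is not shown)
X, y = make_classification(n_samples=200, n_features=5, random_state=123)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=123)

pipe = Pipeline([('mms', MinMaxScaler()),
                 ('tree', DecisionTreeClassifier(random_state=123))])
pipe.fit(X_train, y_train)

# The scaler is fitted on the training data only, then reused to transform the test data
test_accuracy = pipe.score(X_test, y_test)

# Steps can be looked up by the names given in the constructor
step_names = list(pipe.named_steps)
print(step_names, round(test_accuracy, 3))
```

Because scaling lives inside the pipeline, the same object can be passed to cross-validation or a grid search without leaking test-fold statistics into the scaler.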

### Hyperparameter tuning with GridSearchCV

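Grid search provides a DRY way to iterate over candidate models: `GridSearchCV` fits the pipeline once per hyperparameter combination and cross-validation fold, then reports the combination with the best score. Pipeline hyperparameters are addressed as `<step name>__<parameter>`.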

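```Python
from sklearn.model_selection import GridSearchCV

# Keys follow the '<step name>__<parameter>' convention of the pipeline
grid = [{'tree__max_depth': [None, 2, 6, 10],
         'tree__min_samples_split': [5, 10]}]

gridsearch = GridSearchCV(estimator=pipe,
                          param_grid=grid,
                          scoring='accuracy',
                          cv=5)

# Fits 4 x 2 = 8 parameter combinations, each with 5-fold cross-validation
gridsearch.fit(X_train, y_train)
```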
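Once the search has been fitted, the results can be read off the fitted object. The sketch below is self-contained and uses synthetic data from `make_classification` as a stand-in (an assumption, since the original data is not shown); `n_candidates` and `best_pipe` are illustrative names.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (assumption: the original dataset is not shown)
X_train, y_train = make_classification(n_samples=200, n_features=5, random_state=123)

pipe = Pipeline([('mms', MinMaxScaler()),
                 ('tree', DecisionTreeClassifier(random_state=123))])

grid = [{'tree__max_depth': [None, 2, 6, 10],
         'tree__min_samples_split': [5, 10]}]

gridsearch = GridSearchCV(estimator=pipe, param_grid=grid,
                          scoring='accuracy', cv=5)
gridsearch.fit(X_train, y_train)

# Best hyperparameter combination and its mean cross-validated accuracy
print(gridsearch.best_params_)
print(round(gridsearch.best_score_, 3))

# One entry per parameter combination that was tried
n_candidates = len(gridsearch.cv_results_['params'])

# With the default refit=True, the best pipeline is refitted on all of X_train
best_pipe = gridsearch.best_estimator_
```

`best_estimator_` is itself a fitted `Pipeline`, so it can be used for prediction or saved to disk like any other model object.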

### Deploying model objects with Pickle/joblib

The `pickle` library serializes Python objects to disk (instead of keeping them only in memory) so they can be reloaded and reused later.

```Python
import pickle

data_object = {'key': 'value', 'key2': 123}

with open('data.pickle', 'wb') as f:  # open a new file for writing in binary mode
    pickle.dump(data_object, f)  # dump the Python object into the pickle file

with open('data.pickle', 'rb') as f:  # open the existing pickle file for reading
    data_object2 = pickle.load(f)  # load the file contents and assign them to a variable

data_object2
```

#### Use joblib for sklearn objects

```Python
import joblib

# `model` is assumed to be an already fitted sklearn regressor (e.g. LinearRegression)
with open('regression_model.pkl', 'wb') as f:  # create the new file
    joblib.dump(model, f)  # dump the model object into the file

with open('regression_model.pkl', 'rb') as f:  # open the saved file
    model2 = joblib.load(f)  # load the saved model into a variable

print(f"Loaded model is y = {model2.coef_[0]}x + {model2.intercept_}")
# Loaded model is y = 1.0x + 1.0

model2.predict([[10], [11], [12]])
# array([11., 12., 13.])
```
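The snippet above does not show how `model` was created. As a minimal sketch, assuming it is a `LinearRegression` fitted on points that lie on the line y = x + 1 (hypothetical training data chosen to match the output shown), the full save and load round trip could look like this. The temporary directory is only there to keep the example from leaving files behind.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical training data lying exactly on the line y = x + 1
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([1.0, 2.0, 3.0, 4.0])
model = LinearRegression().fit(X, y)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'regression_model.pkl')
    joblib.dump(model, path)    # joblib also accepts a filename directly
    model2 = joblib.load(path)  # reload the fitted model from disk

# The reloaded model should predict exactly like the original one
predictions = model2.predict([[10], [11], [12]])
same = np.allclose(predictions, model.predict([[10], [11], [12]]))
print(predictions.round(2), same)
```

Passing a path to `joblib.dump` and `joblib.load` avoids managing the file handle by hand; opening the file yourself, as in the snippet above, works as well.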