[![Open in Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/justmarkham/scikit-learn-tips/master?filepath=notebooks%2F49_tune_multiple_models.ipynb)

[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/justmarkham/scikit-learn-tips/blob/master/notebooks/49_tune_multiple_models.ipynb)

# 🤖⚡ scikit-learn tip #49 ([video](https://www.youtube.com/watch?v=v2QpvCJ1ar8&list=PL5-da3qGB5ID7YYAqireYEew2mWVvgmj6&index=49))

You can tune 2+ models using the same grid search! Here's how:

1. Create multiple parameter dictionaries
2. Specify the model within each dictionary
3. Put the dictionaries in a list

See example 👇

In [1]:
import pandas as pd
df = pd.read_csv('http://bit.ly/kaggletrain')

In [2]:
cols = ['Sex', 'Name', 'Age']
X = df[cols]
y = df['Survived']

In [3]:
from sklearn.preprocessing import OneHotEncoder
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.impute import SimpleImputer
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import Pipeline
from sklearn.model_selection import GridSearchCV

In [4]:
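# this will be the first Pipeline step
# 'Name' is passed as a string (not a list) because CountVectorizer expects 1D input,
# whereas OneHotEncoder and SimpleImputer expect 2D input
ct = ColumnTransformer(
    [('ohe', OneHotEncoder(), ['Sex']),
     ('vectorizer', CountVectorizer(), 'Name'),
     ('imputer', SimpleImputer(), ['Age'])])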

In [5]:
# each of these models will take a turn as the second Pipeline step
# the liblinear solver supports both the 'l1' and 'l2' penalties
clf1 = LogisticRegression(solver='liblinear', random_state=1)
clf2 = RandomForestClassifier(random_state=1)

In [6]:
# create the Pipeline
pipe = Pipeline([('preprocessor', ct), ('classifier', clf1)])

In [7]:
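# create the parameter dictionary for clf1
# keys are the step name, two underscores, then the parameter name
params1 = {}
params1['preprocessor__vectorizer__ngram_range'] = [(1, 1), (1, 2)]
params1['classifier__penalty'] = ['l1', 'l2']
params1['classifier__C'] = [0.1, 1, 10]
# the 'classifier' key places this model in the Pipeline step named 'classifier'
params1['classifier'] = [clf1]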
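
Not sure whether a key is spelled correctly? Here's a minimal sketch of one way to check, assuming the same step names as above. It rebuilds the Pipeline so that it runs on its own, and the `keys` and `valid_names` variables are just illustrative names. `get_params()` lists the names that `set_params()` (and so the grid search) will accept:

```python
from sklearn.compose import ColumnTransformer
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# same structure as the Pipeline in this tip (no data is needed for this check)
ct = ColumnTransformer(
    [('ohe', OneHotEncoder(), ['Sex']),
     ('vectorizer', CountVectorizer(), 'Name'),
     ('imputer', SimpleImputer(), ['Age'])])
clf1 = LogisticRegression(solver='liblinear', random_state=1)
pipe = Pipeline([('preprocessor', ct), ('classifier', clf1)])

# every name returned by get_params() is a valid key for a parameter dictionary
valid_names = pipe.get_params().keys()

# illustrative list: the keys used in the parameter dictionary for clf1
keys = ['preprocessor__vectorizer__ngram_range',
        'classifier__penalty',
        'classifier__C',
        'classifier']
for key in keys:
    print(key, key in valid_names)
```

Any key that prints `False` would cause an error during the grid search, so this is a cheap check to run before fitting.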

In [8]:
# create the parameter dictionary for clf2
params2 = {}
params2['preprocessor__vectorizer__ngram_range'] = [(1, 1), (1, 2)]
params2['classifier__n_estimators'] = [100, 200]
params2['classifier__min_samples_leaf'] = [1, 2]
params2['classifier'] = [clf2]

In [9]:
# create a list of parameter dictionaries
params = [params1, params2]
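
To see how many combinations a list of dictionaries produces, here's a small sketch using `ParameterGrid`, which expands a grid in the same way as a grid search. The dictionaries are rebuilt here so that the block runs on its own, and `n1`, `n2`, and `n_total` are illustrative names:

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import ParameterGrid

clf1 = LogisticRegression(solver='liblinear', random_state=1)
clf2 = RandomForestClassifier(random_state=1)

params1 = {
    'preprocessor__vectorizer__ngram_range': [(1, 1), (1, 2)],
    'classifier__penalty': ['l1', 'l2'],
    'classifier__C': [0.1, 1, 10],
    'classifier': [clf1],
}
params2 = {
    'preprocessor__vectorizer__ngram_range': [(1, 1), (1, 2)],
    'classifier__n_estimators': [100, 200],
    'classifier__min_samples_leaf': [1, 2],
    'classifier': [clf2],
}

# within a dictionary, the values are multiplied together
n1 = len(ParameterGrid(params1))  # 2 * 2 * 3 * 1
n2 = len(ParameterGrid(params2))  # 2 * 2 * 2 * 1

# across dictionaries, the combinations are added, not multiplied
n_total = len(ParameterGrid([params1, params2]))
print(n1, n2, n_total)  # 12 8 20
```

Because the dictionaries are searched separately, the logistic regression parameters are never paired with the random forest (and vice versa).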

In [10]:
# this will search every parameter combination within each dictionary
# (12 combinations for clf1 + 8 for clf2 = 20 candidates, each cross-validated)
grid = GridSearchCV(pipe, params)
grid.fit(X, y)
grid.best_params_

Out[10]:
{'classifier': LogisticRegression(C=10, penalty='l1', random_state=1, solver='liblinear'),
 'classifier__C': 10,
 'classifier__penalty': 'l1',
 'preprocessor__vectorizer__ngram_range': (1, 2)}
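
`best_params_` only reports the winning combination. To see how the other model compared, here's a rough sketch that groups `cv_results_` by model. It's an illustration rather than part of the original example: it uses synthetic data from `make_classification` and a simpler Pipeline so that it runs without downloading anything, and names like `X_demo` and `best_per_model` are made up for this sketch:

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# synthetic stand-in data (assumption: not the Titanic data used in this tip)
X_demo, y_demo = make_classification(n_samples=200, n_features=5, random_state=1)

clf1 = LogisticRegression(solver='liblinear', random_state=1)
clf2 = RandomForestClassifier(random_state=1)
pipe = Pipeline([('scaler', StandardScaler()), ('classifier', clf1)])

# one dictionary per model, with the model specified inside each dictionary
params = [
    {'classifier': [clf1], 'classifier__C': [0.1, 1, 10]},
    {'classifier': [clf2], 'classifier__n_estimators': [10, 20]},
]
grid = GridSearchCV(pipe, params)
grid.fit(X_demo, y_demo)

# one row per parameter combination
results = pd.DataFrame(grid.cv_results_)

# label each row with the class name of the model it used
results['model'] = [type(p['classifier']).__name__ for p in grid.cv_results_['params']]

# best mean cross-validated score achieved by each model
best_per_model = results.groupby('model')['mean_test_score'].max()
print(best_per_model)
```

The same grouping should work on the `grid` object from this tip, since `cv_results_['params']` always holds one dictionary per combination.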

### Want more tips? [View all tips on GitHub](https://github.com/justmarkham/scikit-learn-tips) or [Sign up to receive 2 tips by email every week](https://scikit-learn.tips) 💌

© 2020 [Data School](https://www.dataschool.io). All rights reserved.