Suppose we want to prepare data and compare several scalers and classifiers on a classification problem. We will tune the classifiers' hyperparameters with grid search.

Preparing the data:

In [1]:
from sklearn.datasets import make_classification


X, y = make_classification()
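
As a quick check of what this call produces, the sketch below inspects the generated arrays. With its default arguments, `make_classification` returns 100 samples with 20 features and two classes. The `random_state=0` argument is an addition for reproducibility and is not part of the original call.

```python
import numpy as np
from sklearn.datasets import make_classification

# random_state is an assumption added here so that the data is reproducible;
# the original call leaves it unset
X, y = make_classification(random_state=0)

print(X.shape)       # feature matrix: (n_samples, n_features)
print(y.shape)       # one label per sample
print(np.unique(y))  # distinct class labels
```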

Setting steps for our pipelines and parameters for grid search:

In [2]:
from reskit.core import Pipeliner


from sklearn.preprocessing import StandardScaler
from sklearn.preprocessing import MinMaxScaler

from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC


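# solver='liblinear' supports both the 'l1' and 'l2' penalties used below;
# the default solver in recent scikit-learn versions ('lbfgs') rejects 'l1'
classifiers = [('LR', LogisticRegression(solver='liblinear')),
               ('SVC', SVC())]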

scalers = [('standard', StandardScaler()),
           ('minmax', MinMaxScaler())]

steps = [('scaler', scalers),
         ('classifier', classifiers)]

param_grid = {'LR': {'penalty': ['l1', 'l2']},
              'SVC': {'kernel': ['linear', 'poly', 'rbf', 'sigmoid']}}

Setting up cross-validation: one splitter for the hyperparameter grid search and another for evaluating the models with the selected hyperparameters:

In [3]:
from sklearn.model_selection import StratifiedKFold

grid_cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
eval_cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=1)
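
The two splitters differ only in `random_state`, so the evaluation folds are not the same partition as the tuning folds. A minimal sketch to confirm this, and to check that stratification keeps the class balance in every fold, could look as follows. The dataset seed and the variable names `grid_folds`, `eval_folds`, `fold_balance` and `same_partition` are illustrative assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import StratifiedKFold

# random_state on the dataset is an assumption added for reproducibility
X, y = make_classification(random_state=0)

grid_cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
eval_cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=1)

# Collect the test indices of every fold for both splitters
grid_folds = [test for _, test in grid_cv.split(X, y)]
eval_folds = [test for _, test in eval_cv.split(X, y)]

# Share of class 1 in each tuning fold; stratification keeps it close to y.mean()
fold_balance = [y[test].mean() for test in grid_folds]
print(y.mean(), fold_balance)

# True only if both splitters produced identical folds
same_partition = all(np.array_equal(a, b)
                     for a, b in zip(grid_folds, eval_folds))
print(same_partition)
```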

Creating the plan of our experiments:

In [4]:
pipe = Pipeliner(steps=steps, grid_cv=grid_cv, eval_cv=eval_cv, param_grid=param_grid)
pipe.plan_table

,scaler,classifier
0,standard,LR
1,standard,SVC
2,minmax,LR
3,minmax,SVC
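
The plan table contains every combination of one scaler and one classifier. The sketch below reproduces that enumeration with `itertools.product`. It only illustrates the idea and is not taken from `Pipeliner`'s implementation; the names `scaler_names`, `classifier_names` and `plan` are hypothetical.

```python
from itertools import product

# Step names copied from the `scalers` and `classifiers` lists above
scaler_names = ['standard', 'minmax']
classifier_names = ['LR', 'SVC']

# Cartesian product: one row per (scaler, classifier) pair
plan = list(product(scaler_names, classifier_names))

for index, (scaler, classifier) in enumerate(plan):
    print(index, scaler, classifier)
```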


To tune the models' hyperparameters and evaluate the tuned models, run:

In [5]:
pipe.get_results(X, y, scoring=['roc_auc'])

Line: 1/4
Line: 2/4
Line: 3/4
Line: 4/4


,scaler,classifier,grid_roc_auc_mean,grid_roc_auc_std,grid_roc_auc_best_params,eval_roc_auc_mean,eval_roc_auc_std,eval_roc_auc_scores
0,standard,LR,0.972,0.0466476,{'penalty': 'l1'},0.932,0.0667533,[ 0.96 0.98 0.8 0.97 0.95]
1,standard,SVC,0.948,0.0487442,{'kernel': 'rbf'},0.912,0.0570614,[ 0.96 0.99 0.83 0.88 0.9 ]
2,minmax,LR,0.968,0.049558,{'penalty': 'l1'},0.966,0.028,[ 0.94 1. 0.94 1. 0.95]
3,minmax,SVC,0.958,0.056,{'kernel': 'rbf'},0.954,0.0349857,[ 0.96 0.99 0.89 0.98 0.95]
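
To make one row of this table concrete, here is a rough plain scikit-learn sketch of the `minmax` + `LR` line: a grid search over `penalty` on `grid_cv`, followed by a fresh cross-validation of the best setting on `eval_cv`. This is an assumption about the procedure based on the column names and the description above, not `Pipeliner`'s actual code. The dataset seed and `solver='liblinear'` (needed for the `'l1'` penalty in recent scikit-learn versions) are additions, so the scores will not match the table.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler

# random_state on the dataset is an assumption added for reproducibility
X, y = make_classification(random_state=0)

grid_cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
eval_cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=1)

# One line of the plan: min-max scaling followed by logistic regression
pipeline = Pipeline([('scaler', MinMaxScaler()),
                     ('classifier', LogisticRegression(solver='liblinear'))])

# Step 1: choose the penalty by grid search on the tuning folds
search = GridSearchCV(pipeline,
                      param_grid={'classifier__penalty': ['l1', 'l2']},
                      scoring='roc_auc', cv=grid_cv)
search.fit(X, y)

# Step 2: score the selected configuration on the separate evaluation folds
best_pipeline = pipeline.set_params(**search.best_params_)
eval_scores = cross_val_score(best_pipeline, X, y,
                              scoring='roc_auc', cv=eval_cv)

print(search.best_params_, search.best_score_)
print(eval_scores.mean(), eval_scores.std(), eval_scores)
```

Using a second splitter for the final scores means the reported evaluation is not computed on exactly the folds that were used to pick the hyperparameters.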
