## Hyperparameter Optimization Techniques
- GridSearchCV
- RandomizedSearchCV
- Bayesian Optimization: automated hyperparameter tuning (Hyperopt)
- Sequential Model-Based Optimization (tuning a scikit-learn estimator with skopt)
- Optuna: automated hyperparameter tuning
- Genetic Algorithms (TPOT Classifier)
### References
- https://github.com/fmfn/BayesianOptimization
- https://github.com/hyperopt/hyperopt
- https://www.jeremyjordan.me/hyperparameter-tuning/
- https://optuna.org/
- https://towardsdatascience.com/hyperparameters-optimization-526348bb8e2d (by Pier Paolo Ippolito)
- https://scikit-optimize.github.io/stable/auto_examples/hyperparameter-optimization.html

In [1]:
import warnings
warnings.filterwarnings('ignore')

In [2]:
import pandas as pd
df=pd.read_csv('diabetes.csv')
df.head()

Unnamed: 0,Pregnancies,Glucose,BloodPressure,SkinThickness,Insulin,BMI,DiabetesPedigreeFunction,Age,Outcome
0,6,148,72,35,0,33.6,0.627,50,1
1,1,85,66,29,0,26.6,0.351,31,0
2,8,183,64,0,0,23.3,0.672,32,1
3,1,89,66,23,94,28.1,0.167,21,0
4,0,137,40,35,168,43.1,2.288,33,1


In [3]:
## The Glucose, Insulin and SkinThickness columns contain 0 values, which are not plausible measurements,
## so we replace them with the column median (computed over the whole column, zeros included)
import numpy as np
df['Glucose']=np.where(df['Glucose']==0,df['Glucose'].median(),df['Glucose'])
df['Insulin']=np.where(df['Insulin']==0,df['Insulin'].median(),df['Insulin'])
df['SkinThickness']=np.where(df['SkinThickness']==0,df['SkinThickness'].median(),df['SkinThickness'])
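
Because the medians above are taken over the whole column, the zeros being replaced also pull the median down. The sketch below uses a small made-up column (the `toy` frame and its values are hypothetical, not taken from `diabetes.csv`) to compare that median with one computed over the non-zero values only. It is one possible alternative, not the approach this notebook uses.

```python
import numpy as np
import pandas as pd

# Hypothetical toy column; 0 stands in for a missing measurement.
toy = pd.DataFrame({"Insulin": [0, 0, 0, 94, 168, 88]})

# Median over all rows, zeros included (the behaviour of the cell above).
median_with_zeros = toy["Insulin"].median()

# Median over the observed (non-zero) values only.
median_observed = toy.loc[toy["Insulin"] != 0, "Insulin"].median()

# Replace zeros with the median of the observed values.
imputed = np.where(toy["Insulin"] == 0, median_observed, toy["Insulin"])

print("median incl. zeros:", median_with_zeros)
print("median excl. zeros:", median_observed)
print("imputed column    :", imputed.tolist())
```

Which variant is appropriate depends on how many zeros a column has; with many zeros the two medians can differ a lot.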

In [4]:
df.Glucose.isnull().sum()

0

In [5]:
#### Independent and dependent features
X=df.drop('Outcome',axis=1)
y=df.Outcome

In [6]:
X

Unnamed: 0,Pregnancies,Glucose,BloodPressure,SkinThickness,Insulin,BMI,DiabetesPedigreeFunction,Age
0,6,148.0,72,35.0,30.5,33.6,0.627,50
1,1,85.0,66,29.0,30.5,26.6,0.351,31
2,8,183.0,64,23.0,30.5,23.3,0.672,32
3,1,89.0,66,23.0,94.0,28.1,0.167,21
4,0,137.0,40,35.0,168.0,43.1,2.288,33
...,...,...,...,...,...,...,...,...
763,10,101.0,76,48.0,180.0,32.9,0.171,63
764,2,122.0,70,27.0,30.5,36.8,0.340,27
765,5,121.0,72,23.0,112.0,26.2,0.245,30
766,1,126.0,60,23.0,30.5,30.1,0.349,47


In [7]:
print(X.head())
print(y.head())

   Pregnancies  Glucose  BloodPressure  SkinThickness  Insulin   BMI  \
0            6    148.0             72           35.0     30.5  33.6   
1            1     85.0             66           29.0     30.5  26.6   
2            8    183.0             64           23.0     30.5  23.3   
3            1     89.0             66           23.0     94.0  28.1   
4            0    137.0             40           35.0    168.0  43.1   

   DiabetesPedigreeFunction  Age  
0                     0.627   50  
1                     0.351   31  
2                     0.672   32  
3                     0.167   21  
4                     2.288   33  
0    1
1    0
2    1
3    0
4    1
Name: Outcome, dtype: int64


In [8]:
#### Train Test Split
from sklearn.model_selection import train_test_split
X_train,X_test,y_train,y_test=train_test_split(X,y,test_size=0.20,random_state=33)

In [9]:
print(X_train.shape)
print(y_train.shape)
print(X_test.shape)
print(y_test.shape)

(614, 8)
(614,)
(154, 8)
(154,)


In [10]:
from sklearn.ensemble import RandomForestClassifier

In [11]:
rf=RandomForestClassifier(n_estimators=10)

In [12]:
rf.fit(X_train,y_train)

RandomForestClassifier(n_estimators=10)

In [13]:
prediction=rf.predict(X_test)

In [14]:
y.value_counts()

0    500
1    268
Name: Outcome, dtype: int64

In [15]:
from sklearn.metrics import confusion_matrix,classification_report,accuracy_score
print(confusion_matrix(y_test,prediction))
print(accuracy_score(y_test,prediction))
print(classification_report(y_test,prediction))

[[87 12]
 [26 29]]
0.7532467532467533
              precision    recall  f1-score   support

           0       0.77      0.88      0.82        99
           1       0.71      0.53      0.60        55

    accuracy                           0.75       154
   macro avg       0.74      0.70      0.71       154
weighted avg       0.75      0.75      0.74       154



The main parameters used by a Random Forest Classifier are:
 - criterion = the function used to evaluate the quality of a split
 - max_depth = maximum number of levels allowed in each tree
 - max_features = maximum number of features considered when splitting a node
 - min_samples_leaf = minimum number of samples that must be present in a tree leaf
 - min_samples_split = minimum number of samples a node must hold before it can be split
 - n_estimators = number of trees in the ensemble
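
As a quick illustration of the list above, the sketch below sets each of these parameters explicitly and reads them back with `get_params()`. The specific values are arbitrary choices for the example (they mirror scikit-learn's usual defaults as far as I know), not tuned settings.

```python
from sklearn.ensemble import RandomForestClassifier

# Illustrative values only; none of these are tuned for the diabetes data.
rf_example = RandomForestClassifier(
    n_estimators=100,      # number of trees in the ensemble
    criterion="gini",      # split-quality function
    max_depth=None,        # None lets each tree grow until its leaves are pure
    max_features="sqrt",   # features considered at each split
    min_samples_split=2,   # samples a node needs before it can be split
    min_samples_leaf=1,    # samples that must remain in each leaf
    random_state=0,
)

params = rf_example.get_params()
for name in ("n_estimators", "criterion", "max_depth",
             "max_features", "min_samples_split", "min_samples_leaf"):
    print(f"{name} = {params[name]}")
```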
    

In [17]:
## Manual Hyperparameter Tuning
model=RandomForestClassifier(n_estimators=500,criterion='gini',
                            max_features='sqrt',min_samples_leaf=10,random_state=100).fit(X_train,y_train)
predictions=model.predict(X_test)
print(confusion_matrix(y_test,predictions))
print(accuracy_score(y_test,predictions))
print(classification_report(y_test,predictions))

[[87 12]
 [28 27]]
0.7402597402597403
              precision    recall  f1-score   support

           0       0.76      0.88      0.81        99
           1       0.69      0.49      0.57        55

    accuracy                           0.74       154
   macro avg       0.72      0.68      0.69       154
weighted avg       0.73      0.74      0.73       154



### RandomizedSearchCV

In [22]:
import numpy as np
from sklearn.model_selection import RandomizedSearchCV
## Number of trees in the random forest
n_estimators=[int(x) for x in np.linspace(start=200,stop=2000,num=10)]
## Number of features to consider at every split
## (note: 'auto' is deprecated in scikit-learn 1.1 and removed in 1.3)
max_features=['auto','sqrt','log2']
## Maximum number of levels in a tree
max_depth=[int(x) for x in np.linspace(10,1000,10)]
## Minimum number of samples required to split a node
## (note: an integer value must be >= 2, so candidates that draw 1 fail to fit)
min_samples_split=[1,2,4,5,6,7,10,14]
## Minimum number of samples required at each leaf node
min_samples_leaf=[1,2,3,4,6,8,9]
## create the random grid
random_grid={'n_estimators':n_estimators,
             'max_features':max_features,
             'max_depth':max_depth,
             'min_samples_split':min_samples_split,
             'min_samples_leaf':min_samples_leaf,
             'criterion':['entropy','gini']
    
}
print(random_grid)

{'n_estimators': [200, 400, 600, 800, 1000, 1200, 1400, 1600, 1800, 2000], 'max_features': ['auto', 'sqrt', 'log2'], 'max_depth': [10, 120, 230, 340, 450, 560, 670, 780, 890, 1000], 'min_samples_split': [1, 2, 4, 5, 6, 7, 10, 14], 'min_samples_leaf': [1, 2, 3, 4, 6, 8, 9], 'criterion': ['entropy', 'gini']}
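
To put `n_iter` in perspective, the sketch below counts how many combinations the grid printed above contains and draws 100 of them with `ParameterSampler`, which to my understanding is the sampler `RandomizedSearchCV` relies on. The dictionary is copied from the printed output; nothing is fitted here.

```python
from sklearn.model_selection import ParameterGrid, ParameterSampler

# Copied from the printed random_grid above.
random_grid = {
    "n_estimators": [200, 400, 600, 800, 1000, 1200, 1400, 1600, 1800, 2000],
    "max_features": ["auto", "sqrt", "log2"],
    "max_depth": [10, 120, 230, 340, 450, 560, 670, 780, 890, 1000],
    "min_samples_split": [1, 2, 4, 5, 6, 7, 10, 14],
    "min_samples_leaf": [1, 2, 3, 4, 6, 8, 9],
    "criterion": ["entropy", "gini"],
}

# Size of the full Cartesian product of all value lists.
total = len(ParameterGrid(random_grid))

# Draw 100 distinct combinations, as a randomized search with n_iter=100 would.
samples = list(ParameterSampler(random_grid, n_iter=100, random_state=100))

print("combinations in the full grid:", total)
print("combinations sampled         :", len(samples))
print("fraction explored            :", len(samples) / total)
```

An exhaustive grid search over the same space would need every one of those combinations times the number of folds, which is why the random search is run first.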


In [23]:
rf=RandomForestClassifier()
rf_randomcv=RandomizedSearchCV(estimator=rf,param_distributions=random_grid,
                               n_iter=100,random_state=100,n_jobs=-1,verbose=2,cv=3)
## fit the randomized model
rf_randomcv.fit(X_train,y_train)


Fitting 3 folds for each of 100 candidates, totalling 300 fits


[Parallel(n_jobs=-1)]: Using backend LokyBackend with 2 concurrent workers.
[Parallel(n_jobs=-1)]: Done  37 tasks      | elapsed:  1.8min
[Parallel(n_jobs=-1)]: Done 158 tasks      | elapsed:  9.6min
[Parallel(n_jobs=-1)]: Done 300 out of 300 | elapsed: 18.9min finished


RandomizedSearchCV(cv=3, estimator=RandomForestClassifier(), n_iter=100,
                   n_jobs=-1,
                   param_distributions={'criterion': ['entropy', 'gini'],
                                        'max_depth': [10, 120, 230, 340, 450,
                                                      560, 670, 780, 890,
                                                      1000],
                                        'max_features': ['auto', 'sqrt',
                                                         'log2'],
                                        'min_samples_leaf': [1, 2, 3, 4, 6, 8,
                                                             9],
                                        'min_samples_split': [1, 2, 4, 5, 6, 7,
                                                              10, 14],
                                        'n_estimators': [200, 400, 600, 800,
                                                         1000, 1200, 1400, 1600,
                  

In [25]:
rf_randomcv.best_params_

{'n_estimators': 200,
 'min_samples_split': 10,
 'min_samples_leaf': 3,
 'max_features': 'sqrt',
 'max_depth': 1000,
 'criterion': 'gini'}

In [29]:
rf_randomcv.best_estimator_

RandomForestClassifier(max_depth=1000, max_features='sqrt', min_samples_leaf=3,
                       min_samples_split=10, n_estimators=200)

In [30]:
best_random_grid=rf_randomcv.best_estimator_

In [32]:
from sklearn.metrics import confusion_matrix,accuracy_score,classification_report
y_pred=best_random_grid.predict(X_test)
## scikit-learn metrics take the true labels first, then the predictions
print(confusion_matrix(y_test,y_pred))
print(f"Accuracy score {accuracy_score(y_test,y_pred)}")
print(f"classification report: {classification_report(y_test,y_pred)}")

[[87 12]
 [25 30]]
Accuracy score 0.7597402597402597
classification report:               precision    recall  f1-score   support

           0       0.78      0.88      0.82        99
           1       0.71      0.55      0.62        55

    accuracy                           0.76       154
   macro avg       0.75      0.71      0.72       154
weighted avg       0.75      0.76      0.75       154
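
scikit-learn's metric functions expect the true labels first and the predictions second. The sketch below uses synthetic label arrays (hypothetical, built only to reproduce the four cell counts of this test split) to show that swapping the two arguments transposes the confusion matrix. The same swap exchanges precision and recall in the classification report, while accuracy is unaffected.

```python
import numpy as np
from sklearn.metrics import accuracy_score, confusion_matrix

# Synthetic labels giving 87 true negatives, 12 false positives,
# 25 false negatives and 30 true positives.
y_true = np.array([0] * 87 + [0] * 12 + [1] * 25 + [1] * 30)
y_hat = np.array([0] * 87 + [1] * 12 + [0] * 25 + [1] * 30)

cm = confusion_matrix(y_true, y_hat)          # rows = true labels
cm_swapped = confusion_matrix(y_hat, y_true)  # rows = predicted labels

print(cm)
print(cm_swapped)
print(accuracy_score(y_true, y_hat) == accuracy_score(y_hat, y_true))
```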



### GridSearch CV

In [33]:
rf_randomcv.best_params_

{'n_estimators': 200,
 'min_samples_split': 10,
 'min_samples_leaf': 3,
 'max_features': 'sqrt',
 'max_depth': 1000,
 'criterion': 'gini'}

In [34]:
from sklearn.model_selection import GridSearchCV

In [35]:
[rf_randomcv.best_params_['min_samples_leaf'],
                               rf_randomcv.best_params_['min_samples_leaf']+2,
                               rf_randomcv.best_params_['min_samples_leaf']+4,
                               rf_randomcv.best_params_['min_samples_leaf']-5,
                               rf_randomcv.best_params_['min_samples_leaf']-6,
                               rf_randomcv.best_params_['min_samples_leaf']-1,
                               rf_randomcv.best_params_['min_samples_leaf']-3,
                               rf_randomcv.best_params_['min_samples_leaf']-2],

([3, 5, 7, -2, -3, 2, 0, 1],)

In [36]:
[rf_randomcv.best_params_['min_samples_split']-3,
                                rf_randomcv.best_params_['min_samples_split']-9,
                                rf_randomcv.best_params_['min_samples_split']+1,
                                rf_randomcv.best_params_['min_samples_split']+8,
                                rf_randomcv.best_params_['min_samples_split']+3,
                                rf_randomcv.best_params_['min_samples_split']+1,
                                rf_randomcv.best_params_['min_samples_split']+10]

[7, 1, 11, 18, 13, 11, 20]

In [37]:
[rf_randomcv.best_params_['n_estimators']+100,
                            rf_randomcv.best_params_['n_estimators']-100,
                            rf_randomcv.best_params_['n_estimators']-300,
                            rf_randomcv.best_params_['n_estimators']+400,
                            rf_randomcv.best_params_['n_estimators']+300],

([300, 100, -100, 600, 500],)

In [46]:
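## Caution: offsets from the best values can leave the valid range
## (min_samples_leaf >= 1, min_samples_split >= 2, n_estimators >= 1).
## Such candidates fail to fit and are scored as NaN (the default error_score);
## the repeated n_estimators-300 entry is also evaluated twice.
param_grid={'criterion':[rf_randomcv.best_params_['criterion']],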
            'max_depth':[rf_randomcv.best_params_['max_depth']],
            'max_features':[rf_randomcv.best_params_['max_features']],
            'min_samples_leaf':[rf_randomcv.best_params_['min_samples_leaf'],
                               rf_randomcv.best_params_['min_samples_leaf']+2,
                               rf_randomcv.best_params_['min_samples_leaf']+4,
                               rf_randomcv.best_params_['min_samples_leaf']-5,
                               rf_randomcv.best_params_['min_samples_leaf']-6],
            'min_samples_split':[rf_randomcv.best_params_['min_samples_split'],
                                rf_randomcv.best_params_['min_samples_split']-9,
                                rf_randomcv.best_params_['min_samples_split']+8,
                                rf_randomcv.best_params_['min_samples_split']+3,
                                rf_randomcv.best_params_['min_samples_split']+10],
            'n_estimators':[rf_randomcv.best_params_['n_estimators'],
                            rf_randomcv.best_params_['n_estimators']+100,
                            rf_randomcv.best_params_['n_estimators']-100,
                            rf_randomcv.best_params_['n_estimators']-300,
                            rf_randomcv.best_params_['n_estimators']-300,
                            rf_randomcv.best_params_['n_estimators']+400]
            
    
}
print(param_grid)

{'criterion': ['gini'], 'max_depth': [1000], 'max_features': ['sqrt'], 'min_samples_leaf': [3, 5, 7, -2, -3], 'min_samples_split': [10, 1, 18, 13, 20], 'n_estimators': [200, 300, 100, -100, -100, 600]}
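
The printed grid contains negative and duplicate values, because fixed offsets are applied to the best parameters without any bounds check. One way to avoid this is sketched below: a small hypothetical helper, `neighbourhood` (not part of scikit-learn or of this notebook), that drops values below a minimum and removes duplicates. The best values are hard-coded from the random-search result above.

```python
def neighbourhood(center, offsets, minimum):
    """Return the sorted, de-duplicated values center + offset that are >= minimum."""
    return sorted({center + o for o in offsets if center + o >= minimum})

# Best values reported by the random search above.
best = {"min_samples_leaf": 3, "min_samples_split": 10, "n_estimators": 200}

# Same offsets as in the notebook's param_grid, filtered to valid values.
safe_grid = {
    "min_samples_leaf": neighbourhood(best["min_samples_leaf"], [0, 2, 4, -5, -6], minimum=1),
    "min_samples_split": neighbourhood(best["min_samples_split"], [0, -9, 8, 3, 10], minimum=2),
    "n_estimators": neighbourhood(best["n_estimators"], [0, 100, -100, -300, -300, 400], minimum=1),
}

print(safe_grid)
```

With the invalid and repeated entries removed, the grid shrinks and every remaining candidate can be fitted.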


In [49]:
# Number of candidates in param_grid: 1*1*1*5*5*6 = 150
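
The candidate count can also be computed rather than worked out by hand. The sketch below copies the printed `param_grid` and multiplies the list lengths. Duplicate and invalid entries are counted too, since the grid search treats every listed value as a separate candidate.

```python
from math import prod

# Copied from the printed param_grid above.
param_grid = {
    "criterion": ["gini"],
    "max_depth": [1000],
    "max_features": ["sqrt"],
    "min_samples_leaf": [3, 5, 7, -2, -3],
    "min_samples_split": [10, 1, 18, 13, 20],
    "n_estimators": [200, 300, 100, -100, -100, 600],
}

cv_folds = 10  # matches cv=10 in the GridSearchCV call

# One candidate per element of the Cartesian product of the value lists.
n_candidates = prod(len(values) for values in param_grid.values())
n_fits = n_candidates * cv_folds

print(n_candidates, "candidates,", n_fits, "fits")
```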

In [50]:
## fit the grid_search to the data
rf=RandomForestClassifier()
grid_search=GridSearchCV(estimator=rf,param_grid=param_grid,cv=10,n_jobs=1,verbose=2)
grid_search.fit(X_train,y_train)

Fitting 10 folds for each of 150 candidates, totalling 1500 fits
[CV] criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=3, min_samples_split=10, n_estimators=200 


[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.


[CV]  criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=3, min_samples_split=10, n_estimators=200, total=   0.8s
[CV] criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=3, min_samples_split=10, n_estimators=200 


[Parallel(n_jobs=1)]: Done   1 out of   1 | elapsed:    0.7s remaining:    0.0s


[CV]  criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=3, min_samples_split=10, n_estimators=200, total=   0.7s
[CV] criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=3, min_samples_split=10, n_estimators=200 
[CV]  criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=3, min_samples_split=10, n_estimators=200, total=   0.7s
[CV] criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=3, min_samples_split=10, n_estimators=200 
[CV]  criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=3, min_samples_split=10, n_estimators=200, total=   0.8s
[CV] criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=3, min_samples_split=10, n_estimators=200 
[CV]  criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=3, min_samples_split=10, n_estimators=200, total=   0.7s
[CV] criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=3, min_samples_split=10, n_estimators=200 
...
[CV]  criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=7, min_samples_split=13, n_estimators=200, total=   0.7s
[CV] criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=7, min_samples_split=13, n_estimators=200 
[CV]  criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=7, min_samples_split=13, n_estimators=200, total=   0.7s
[CV] criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=7, min_samples_split=13, n_estimators=200 
[CV]  criter

[CV]  criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=7, min_samples_split=13, n_estimators=600, total=   2.0s
[CV] criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=7, min_samples_split=13, n_estimators=600 
[CV]  criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=7, min_samples_split=13, n_estimators=600, total=   2.1s
[CV] criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=7, min_samples_split=13, n_estimators=600 
[CV]  criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=7, min_samples_split=13, n_estimators=600, total=   2.0s
[CV] criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=7, min_samples_split=13, n_estimators=600 
[CV]  criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=7, min_samples_split=13, n_estimators=600, total=   2.1s
[CV] criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=7, min_samples_split=13, n_estimators=600 
[CV]  criter

[CV]  criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=7, min_samples_split=20, n_estimators=100, total=   0.4s
[CV] criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=7, min_samples_split=20, n_estimators=100 
[CV]  criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=7, min_samples_split=20, n_estimators=100, total=   0.3s
[CV] criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=7, min_samples_split=20, n_estimators=100 
[CV]  criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=7, min_samples_split=20, n_estimators=100, total=   0.4s
[CV] criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=7, min_samples_split=20, n_estimators=100 
[CV]  criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=7, min_samples_split=20, n_estimators=100, total=   0.4s
[CV] criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=7, min_samples_split=20, n_estimators=100 
[CV]  criter

[CV]  criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=7, min_samples_split=20, n_estimators=600, total=   2.0s
[CV] criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=7, min_samples_split=20, n_estimators=600 
[CV]  criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=7, min_samples_split=20, n_estimators=600, total=   2.0s
[CV] criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-2, min_samples_split=10, n_estimators=200 
[CV]  criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-2, min_samples_split=10, n_estimators=200, total=   0.2s
[CV] criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-2, min_samples_split=10, n_estimators=200 
[CV]  criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-2, min_samples_split=10, n_estimators=200, total=   0.3s
[CV] criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-2, min_samples_split=10, n_estimators=200 
[CV]  c

[CV]  criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-2, min_samples_split=10, n_estimators=600, total=   0.5s
[CV] criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-2, min_samples_split=10, n_estimators=600 
[CV]  criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-2, min_samples_split=10, n_estimators=600, total=   0.6s
[CV] criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-2, min_samples_split=10, n_estimators=600 
[CV]  criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-2, min_samples_split=10, n_estimators=600, total=   0.6s
[CV] criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-2, min_samples_split=10, n_estimators=600 
[CV]  criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-2, min_samples_split=10, n_estimators=600, total=   0.5s
[CV] criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-2, min_samples_split=10, n_estimators=600 
[CV]

[CV]  criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-2, min_samples_split=1, n_estimators=100, total=   0.1s
[CV] criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-2, min_samples_split=1, n_estimators=100 
[CV]  criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-2, min_samples_split=1, n_estimators=100, total=   0.1s
[CV] criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-2, min_samples_split=1, n_estimators=100 
[CV]  criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-2, min_samples_split=1, n_estimators=100, total=   0.1s
[CV] criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-2, min_samples_split=1, n_estimators=100 
[CV]  criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-2, min_samples_split=1, n_estimators=100, total=   0.1s
[CV] criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-2, min_samples_split=1, n_estimators=100 
[CV]  criter

[CV]  criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-2, min_samples_split=1, n_estimators=600, total=   0.6s
[CV] criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-2, min_samples_split=18, n_estimators=200 
[CV]  criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-2, min_samples_split=18, n_estimators=200, total=   0.2s
[CV] criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-2, min_samples_split=18, n_estimators=200 
[CV]  criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-2, min_samples_split=18, n_estimators=200, total=   0.2s
[CV] criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-2, min_samples_split=18, n_estimators=200 
[CV]  criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-2, min_samples_split=18, n_estimators=200, total=   0.2s
[CV] criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-2, min_samples_split=18, n_estimators=200 
[CV] 

[CV]  criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-2, min_samples_split=18, n_estimators=600, total=   0.5s
[CV] criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-2, min_samples_split=18, n_estimators=600 
[CV]  criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-2, min_samples_split=18, n_estimators=600, total=   0.6s
[CV] criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-2, min_samples_split=18, n_estimators=600 
[CV]  criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-2, min_samples_split=18, n_estimators=600, total=   0.6s
[CV] criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-2, min_samples_split=18, n_estimators=600 
[CV]  criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-2, min_samples_split=18, n_estimators=600, total=   0.6s
[CV] criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-2, min_samples_split=18, n_estimators=600 
[CV]

[CV]  criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-2, min_samples_split=13, n_estimators=100, total=   0.1s
[CV] criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-2, min_samples_split=13, n_estimators=100 
[CV]  criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-2, min_samples_split=13, n_estimators=100, total=   0.1s
[CV] criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-2, min_samples_split=13, n_estimators=100 
[CV]  criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-2, min_samples_split=13, n_estimators=100, total=   0.1s
[CV] criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-2, min_samples_split=13, n_estimators=100 
[CV]  criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-2, min_samples_split=13, n_estimators=100, total=   0.1s
[CV] criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-2, min_samples_split=13, n_estimators=100 
[CV]

[CV]  criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-2, min_samples_split=13, n_estimators=600, total=   0.6s
[CV] criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-2, min_samples_split=13, n_estimators=600 
[CV]  criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-2, min_samples_split=13, n_estimators=600, total=   0.5s
[CV] criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-2, min_samples_split=20, n_estimators=200 
[CV]  criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-2, min_samples_split=20, n_estimators=200, total=   0.2s
[CV] criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-2, min_samples_split=20, n_estimators=200 
[CV]  criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-2, min_samples_split=20, n_estimators=200, total=   0.2s
[CV] criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-2, min_samples_split=20, n_estimators=200 
[CV]

[CV]  criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-2, min_samples_split=20, n_estimators=600, total=   0.6s
[CV] criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-2, min_samples_split=20, n_estimators=600 
[CV]  criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-2, min_samples_split=20, n_estimators=600, total=   0.7s
[CV] criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-2, min_samples_split=20, n_estimators=600 
[CV]  criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-2, min_samples_split=20, n_estimators=600, total=   0.6s
[CV] criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-2, min_samples_split=20, n_estimators=600 
[CV]  criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-2, min_samples_split=20, n_estimators=600, total=   0.6s
[CV] criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-2, min_samples_split=20, n_estimators=600 
[CV]

[CV]  criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-3, min_samples_split=10, n_estimators=100, total=   0.1s
[CV] criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-3, min_samples_split=10, n_estimators=100 
[CV]  criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-3, min_samples_split=10, n_estimators=100, total=   0.1s
[CV] criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-3, min_samples_split=10, n_estimators=100 
[CV]  criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-3, min_samples_split=10, n_estimators=100, total=   0.1s
[CV] criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-3, min_samples_split=10, n_estimators=100 
[CV]  criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-3, min_samples_split=10, n_estimators=100, total=   0.1s
[CV] criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-3, min_samples_split=10, n_estimators=100 
[CV]

[CV]  criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-3, min_samples_split=10, n_estimators=600, total=   0.6s
[CV] criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-3, min_samples_split=10, n_estimators=600 
[CV]  criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-3, min_samples_split=10, n_estimators=600, total=   0.6s
[CV] criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-3, min_samples_split=10, n_estimators=600 
[CV]  criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-3, min_samples_split=10, n_estimators=600, total=   0.6s
[CV] criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-3, min_samples_split=1, n_estimators=200 
[CV]  criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-3, min_samples_split=1, n_estimators=200, total=   0.2s
[CV] criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-3, min_samples_split=1, n_estimators=200 
[CV]  c

[CV]  criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-3, min_samples_split=1, n_estimators=-100, total=   0.0s
[CV] criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-3, min_samples_split=1, n_estimators=-100 
[CV]  criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-3, min_samples_split=1, n_estimators=-100, total=   0.0s
[CV] criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-3, min_samples_split=1, n_estimators=-100 
[CV]  criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-3, min_samples_split=1, n_estimators=-100, total=   0.0s
[CV] criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-3, min_samples_split=1, n_estimators=-100 
[CV]  criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-3, min_samples_split=1, n_estimators=-100, total=   0.0s
[CV] criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-3, min_samples_split=1, n_estimators=-100 
[CV]

[CV]  criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-3, min_samples_split=18, n_estimators=300, total=   0.3s
[CV] criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-3, min_samples_split=18, n_estimators=300 
[CV]  criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-3, min_samples_split=18, n_estimators=300, total=   0.4s
[CV] criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-3, min_samples_split=18, n_estimators=300 
[CV]  criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-3, min_samples_split=18, n_estimators=300, total=   0.3s
[CV] criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-3, min_samples_split=18, n_estimators=300 
[CV]  criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-3, min_samples_split=18, n_estimators=300, total=   0.4s
[CV] criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-3, min_samples_split=18, n_estimators=300 
[CV]

[CV]  criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-3, min_samples_split=18, n_estimators=-100, total=   0.0s
[CV] criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-3, min_samples_split=18, n_estimators=-100 
[CV]  criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-3, min_samples_split=18, n_estimators=-100, total=   0.0s
[CV] criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-3, min_samples_split=18, n_estimators=600 
[CV]  criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-3, min_samples_split=18, n_estimators=600, total=   0.6s
[CV] criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-3, min_samples_split=18, n_estimators=600 
[CV]  criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-3, min_samples_split=18, n_estimators=600, total=   0.6s
[CV] criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-3, min_samples_split=18, n_estimators=600 
[

[CV]  criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-3, min_samples_split=13, n_estimators=100, total=   0.1s
[CV] criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-3, min_samples_split=13, n_estimators=100 
[CV]  criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-3, min_samples_split=13, n_estimators=100, total=   0.1s
[CV] criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-3, min_samples_split=13, n_estimators=100 
[CV]  criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-3, min_samples_split=13, n_estimators=100, total=   0.1s
[CV] criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-3, min_samples_split=13, n_estimators=100 
[CV]  criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-3, min_samples_split=13, n_estimators=100, total=   0.1s
[CV] criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-3, min_samples_split=13, n_estimators=100 
[CV]

[CV]  criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-3, min_samples_split=13, n_estimators=600, total=   0.6s
[CV] criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-3, min_samples_split=13, n_estimators=600 
[CV]  criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-3, min_samples_split=13, n_estimators=600, total=   0.6s
[CV] criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-3, min_samples_split=13, n_estimators=600 
[CV]  criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-3, min_samples_split=13, n_estimators=600, total=   0.7s
[CV] criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-3, min_samples_split=13, n_estimators=600 
[CV]  criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-3, min_samples_split=13, n_estimators=600, total=   0.6s
[CV] criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-3, min_samples_split=13, n_estimators=600 
[CV]

[CV]  criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-3, min_samples_split=20, n_estimators=100, total=   0.1s
[CV] criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-3, min_samples_split=20, n_estimators=100 
[CV]  criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-3, min_samples_split=20, n_estimators=100, total=   0.1s
[CV] criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-3, min_samples_split=20, n_estimators=100 
[CV]  criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-3, min_samples_split=20, n_estimators=100, total=   0.1s
[CV] criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-3, min_samples_split=20, n_estimators=-100 
[CV]  criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-3, min_samples_split=20, n_estimators=-100, total=   0.0s
[CV] criterion=gini, max_depth=1000, max_features=sqrt, min_samples_leaf=-3, min_samples_split=20, n_estimators=-100 
[

[Parallel(n_jobs=1)]: Done 1500 out of 1500 | elapsed: 11.1min finished


GridSearchCV(cv=10, estimator=RandomForestClassifier(), n_jobs=1,
             param_grid={'criterion': ['gini'], 'max_depth': [1000],
                         'max_features': ['sqrt'],
                         'min_samples_leaf': [3, 5, 7, -2, -3],
                         'min_samples_split': [10, 1, 18, 13, 20],
                         'n_estimators': [200, 300, 100, -100, -100, 600]},
             verbose=2)
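
The grid printed above contains values that scikit-learn does not accept for a random forest: negative `min_samples_leaf` (-2, -3), `min_samples_split=1` (integer values must be at least 2) and `n_estimators=-100`. Values like these can appear when a grid is built by adding and subtracting fixed offsets around an earlier best value. A minimal sketch of one way to guard against that, assuming a hypothetical helper `neighborhood` (not part of scikit-learn) and made-up centers and lower bounds chosen only for illustration:

```python
def neighborhood(center, offsets, minimum):
    """Return sorted, de-duplicated candidates around `center`, none below `minimum`."""
    candidates = {max(minimum, center + off) for off in offsets}
    return sorted(candidates)

# Hypothetical centers and bounds; replace them with the values from your own search.
param_grid = {
    "min_samples_leaf": neighborhood(3, [-2, 0, 2, 4], minimum=1),
    "min_samples_split": neighborhood(3, [-2, -1, 0, 1, 2], minimum=2),
    "n_estimators": neighborhood(200, [-200, -100, 0, 100, 200], minimum=50),
}
print(param_grid)
```

Clamping also removes duplicate candidates, so the search does not spend fits on the same combination twice.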

In [51]:
grid_search.best_estimator_

RandomForestClassifier(max_depth=1000, max_features='sqrt', min_samples_leaf=7,
                       min_samples_split=20, n_estimators=600)

In [52]:
best_grid=grid_search.best_estimator_

In [53]:
best_grid

RandomForestClassifier(max_depth=1000, max_features='sqrt', min_samples_leaf=7,
                       min_samples_split=20, n_estimators=600)

In [54]:
y_pred=best_grid.predict(X_test)
print(confusion_matrix(y_test,y_pred))
print(f"Accuracy Score: {accuracy_score(y_test,y_pred)}")
print(f"Classification report: {classification_report(y_test,y_pred)}")

[[86 13]
 [25 30]]
Accuracy Score: 0.7532467532467533
Classification report:               precision    recall  f1-score   support

           0       0.77      0.87      0.82        99
           1       0.70      0.55      0.61        55

    accuracy                           0.75       154
   macro avg       0.74      0.71      0.72       154
weighted avg       0.75      0.75      0.75       154



### Automated Hyperparameter Tuning
Automated hyperparameter tuning can be done using techniques such as
- Bayesian Optimization
- Gradient Descent
- Evolutionary Algorithms


## Bayesian Optimization
Bayesian optimization builds a probabilistic model of the objective function and uses it to decide which input values to try next. The aim is to find the input that gives the lowest possible output value. It often reaches good settings in fewer evaluations than random, grid, or manual search, which can mean better performance in the testing phase and reduced optimization time. In Hyperopt, Bayesian optimization is run by passing three main arguments to the function `fmin`:

- Objective Function = defines the loss function to minimize.
- Domain Space = defines the range of input values to test (in Bayesian optimization this space defines a probability distribution for each of the hyperparameters used).
- Optimization Algorithm = defines the search algorithm used to select the input values for each new iteration.
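
To make the three ingredients concrete without any dependency, here is a standard-library sketch. The optimization algorithm in it is plain random search, used only as a stand-in: it is not Bayesian optimization and not Hyperopt's TPE. The toy objective, the `space` layout and the `random_search` helper are all illustrative assumptions.

```python
import random

def objective(params):
    """Loss to minimize: a simple bowl with its minimum at x = 2."""
    return (params["x"] - 2.0) ** 2

# Domain space: each hyperparameter name maps to a sampler (here a uniform range).
space = {"x": lambda rng: rng.uniform(-10.0, 10.0)}

def random_search(fn, space, max_evals, seed=0):
    """Stand-in optimization algorithm: draws independent samples.

    A Bayesian method would instead use the losses seen so far
    to decide where to sample next.
    """
    rng = random.Random(seed)
    best_params, best_loss = None, float("inf")
    for _ in range(max_evals):
        params = {name: draw(rng) for name, draw in space.items()}
        loss = fn(params)
        if loss < best_loss:
            best_params, best_loss = params, loss
    return best_params, best_loss

best_params, best_loss = random_search(objective, space, max_evals=200)
print(best_params, best_loss)
```

Swapping the search routine while keeping the objective and the space unchanged is the same separation of concerns that `fmin` exposes through its `fn`, `space` and `algo` arguments.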

In [55]:
## pip install hyperopt
from hyperopt import hp,fmin,tpe,STATUS_OK,Trials

In [60]:
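space={'criterion':hp.choice('criterion',['entropy','gini']),
       ## quniform returns floats (e.g. 690.0); recent scikit-learn versions require an int for max_depth
       'max_depth':hp.quniform('max_depth',10,1200,10),
       ## 'auto' is only accepted by older scikit-learn releases (it was removed in 1.3)
       'max_features':hp.choice('max_features',['auto','sqrt','log2',None]),
       ## floats are interpreted by scikit-learn as fractions of the training samples
       'min_samples_leaf':hp.uniform('min_samples_leaf',0,0.5),
       'min_samples_split':hp.uniform('min_samples_split',0,1),
       'n_estimators':hp.choice('n_estimators',[10,50,300,750,1200,1300,1500])
}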
print(space)

{'criterion': <hyperopt.pyll.base.Apply object at 0x0E7C1FE8>, 'max_depth': <hyperopt.pyll.base.Apply object at 0x0E7B8118>, 'max_features': <hyperopt.pyll.base.Apply object at 0x0E7B8760>, 'min_samples_leaf': <hyperopt.pyll.base.Apply object at 0x0E7B8778>, 'min_samples_split': <hyperopt.pyll.base.Apply object at 0x0E7B89D0>, 'n_estimators': <hyperopt.pyll.base.Apply object at 0x0E7B8B80>}


In [61]:
space

{'criterion': <hyperopt.pyll.base.Apply at 0xe7c1fe8>,
 'max_depth': <hyperopt.pyll.base.Apply at 0xe7b8118>,
 'max_features': <hyperopt.pyll.base.Apply at 0xe7b8760>,
 'min_samples_leaf': <hyperopt.pyll.base.Apply at 0xe7b8778>,
 'min_samples_split': <hyperopt.pyll.base.Apply at 0xe7b89d0>,
 'n_estimators': <hyperopt.pyll.base.Apply at 0xe7b8b80>}

In [62]:
space['criterion']

<hyperopt.pyll.base.Apply at 0xe7c1fe8>
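
The `Apply` objects printed above are not values. They are nodes of an expression graph that Hyperopt samples from on every trial. As a rough standard-library imitation of what the three expression types used here draw (an illustration only, not Hyperopt's implementation; `quniform` follows the documented rule `round(uniform(low, high) / q) * q`):

```python
import random

rng = random.Random(42)

def choice(options):
    """Like hp.choice: pick one of the listed options."""
    return rng.choice(options)

def uniform(low, high):
    """Like hp.uniform: a float drawn uniformly between low and high."""
    return rng.uniform(low, high)

def quniform(low, high, q):
    """Like hp.quniform: a uniform draw rounded to a multiple of q (still a float)."""
    return float(round(rng.uniform(low, high) / q) * q)

sample = {
    "criterion": choice(["entropy", "gini"]),
    "max_depth": quniform(10, 1200, 10),
    "min_samples_leaf": uniform(0, 0.5),
    "n_estimators": choice([10, 50, 300, 750, 1200, 1300, 1500]),
}
print(sample)
```

Because `quniform` yields floats, integer-only parameters such as `max_depth` generally need an `int(...)` cast before they are passed to an estimator.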

In [64]:
## objective function: hyperopt calls it with one sampled set of hyperparameters
from sklearn.model_selection import cross_val_score

def objective(space):
    model=RandomForestClassifier(criterion=space['criterion'],
                                max_depth=int(space['max_depth']),  ## quniform yields floats
                                max_features=space['max_features'],
                                min_samples_leaf=space['min_samples_leaf'],
                                min_samples_split=space['min_samples_split'],
                                n_estimators=space['n_estimators'])
    ## mean accuracy over 5 cross-validation folds on the training set
    accuracy=cross_val_score(model,X_train,y_train,cv=5).mean()

    ## fmin minimizes the loss, so return the accuracy as a negative value
    return {'loss': -accuracy, 'status':STATUS_OK}

In [65]:
from sklearn.model_selection import cross_val_score
trials=Trials()
best=fmin(fn=objective,
         space=space,
         algo=tpe.suggest,
         max_evals=80,
         trials=trials)
best

100%|███████████████████████████████████████████████| 80/80 [24:16<00:00, 18.20s/trial, best loss: -0.7769159003065441]


{'criterion': 0,
 'max_depth': 690.0,
 'max_features': 3,
 'min_samples_leaf': 0.0005655075909129226,
 'min_samples_split': 0.14004675535372835,
 'n_estimators': 3}
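
`fmin` always minimizes, which is why the objective returns the negated accuracy and the progress bar reports a negative `best loss`. A small sketch of how the best loss translates back into an accuracy; the first two losses are made up for illustration, and only the last one is the value reported by the run above:

```python
# Losses in the form the objective returns them (negated mean CV accuracy).
# The first two values are made up; the last is the best loss from the run above.
trial_losses = [-0.7410, -0.7655, -0.7769159003065441]

best_loss = min(trial_losses)   # the minimizer keeps the smallest loss
best_accuracy = -best_loss      # undo the negation to read it as an accuracy

print(f"best loss: {best_loss:.4f}, mean CV accuracy: {best_accuracy:.4f}")
```

Note that this figure is a cross-validated accuracy on the training set, so it is not directly comparable to the accuracy measured on `X_test`.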

In [66]:
## hp.choice returns the index of the selected option, so map the indices back to values
crit={0:'entropy',1:'gini'}
feat={0:'auto',1:'sqrt',2:'log2',3:None}
est={0: 10, 1: 50, 2: 300, 3: 750, 4: 1200, 5: 1300, 6: 1500}
print(crit[best['criterion']])
print(feat[best['max_features']])
print(est[best['n_estimators']])

entropy
None
750
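
`hp.choice` reports the index of the selected option rather than the option itself, which is why lookup dictionaries are needed here. Hyperopt also provides `space_eval(space, best)` for this purpose. A dependency-free sketch of the same decoding, which keeps the option lists in one place so the search space and the lookup cannot drift apart (`choices`, `decode` and `best_indices` are illustrative names, not part of Hyperopt):

```python
choices = {
    "criterion": ["entropy", "gini"],
    "max_features": ["auto", "sqrt", "log2", None],
    "n_estimators": [10, 50, 300, 750, 1200, 1300, 1500],
}

def decode(best, choices):
    """Replace hp.choice indices with the options they refer to."""
    decoded = dict(best)
    for name, options in choices.items():
        decoded[name] = options[int(best[name])]
    return decoded

# The result returned by fmin in the run above.
best_indices = {
    "criterion": 0,
    "max_depth": 690.0,
    "max_features": 3,
    "min_samples_leaf": 0.0005655075909129226,
    "min_samples_split": 0.14004675535372835,
    "n_estimators": 3,
}
decoded = decode(best_indices, choices)
print(decoded)
```

Parameters that were not defined with `hp.choice`, such as `max_depth`, pass through unchanged.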


In [67]:
best['min_samples_leaf']

0.0005655075909129226

In [70]:
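## refit a forest on the training set with the best hyperparameters found by hyperopt
trainedforest=RandomForestClassifier(criterion=crit[best['criterion']],
                                    max_depth=int(best['max_depth']),  ## quniform yields floats
                                    max_features=feat[best['max_features']],
                                    min_samples_leaf=best['min_samples_leaf'],
                                    min_samples_split=best['min_samples_split'],
                                    n_estimators=est[best['n_estimators']]).fit(X_train,y_train)
predictionforest=trainedforest.predict(X_test)
## note: predictions are passed first here (scikit-learn's convention is y_true, y_pred),
## so the matrix below is transposed and precision/recall are exchanged
print(confusion_matrix(predictionforest,y_test))
print(accuracy_score(predictionforest,y_test))
print(classification_report(predictionforest,y_test))
acc5=accuracy_score(y_test,predictionforest)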

[[86 25]
 [13 30]]
0.7532467532467533
              precision    recall  f1-score   support

           0       0.87      0.77      0.82       111
           1       0.55      0.70      0.61        43

    accuracy                           0.75       154
   macro avg       0.71      0.74      0.72       154
weighted avg       0.78      0.75      0.76       154
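
The cell above passes the predictions first and the true labels second, whereas scikit-learn's metric functions take `(y_true, y_pred)`. Accuracy is symmetric, but the confusion matrix comes out transposed and per-class precision and recall trade places. That is why this report differs from the grid-search report even though the four counts are the same. A pure-Python sketch of the effect, using label vectors rebuilt from the counts shown above (the `confusion` helper is illustrative, not scikit-learn's function):

```python
def confusion(first, second):
    """2x2 count matrix with rows indexed by `first` and columns by `second`."""
    matrix = [[0, 0], [0, 0]]
    for a, b in zip(first, second):
        matrix[a][b] += 1
    return matrix

# Label vectors that reproduce the counts above:
# 86 true negatives, 13 false positives, 25 false negatives, 30 true positives.
y_true = [0] * 86 + [0] * 13 + [1] * 25 + [1] * 30
y_pred = [0] * 86 + [1] * 13 + [0] * 25 + [1] * 30

conventional = confusion(y_true, y_pred)   # rows = true labels
swapped = confusion(y_pred, y_true)        # rows = predictions

tp = conventional[1][1]
precision_1 = tp / (tp + conventional[0][1])   # TP / (TP + FP)
recall_1 = tp / (tp + conventional[1][0])      # TP / (TP + FN)

print(conventional, swapped)
print(round(precision_1, 2), round(recall_1, 2))
```

With the conventional order, class 1 has precision 0.70 and recall 0.55, matching the grid-search report; the swapped call reports the two the other way around.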

