### Boosting

Important Parameters

- **max_depth** : Maximum depth of each individual tree ( how many levels of splits it may have )

- **n_estimators** : Number of boosting stages ( N decision trees built sequentially, each one correcting the errors of the trees before it )

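- **learning_rate** : Shrinks the contribution of each tree. Small values converge slowly and usually need more estimators; values that are too large can overshoot the optimal solution ( minimum ) and never reach it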
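
To make the interplay between `n_estimators` and `learning_rate` concrete, here is a minimal sketch of boosting in plain Python. It is a toy for squared-error regression with depth-1 trees, not what `GradientBoostingClassifier` does internally; the helper names (`fit_stump`, `boost`, `mse`) and the data are made up for illustration.

```python
# Toy gradient boosting for squared-error regression with depth-1 trees (stumps).
# Illustrative assumption: scikit-learn uses a different loss and real decision trees.

def fit_stump(xs, residuals):
    """Find the single split that best fits the residuals.

    Returns (threshold, left_value, right_value).
    """
    best = None
    # Skip the largest x so the right-hand side is never empty.
    for t in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lv = sum(left) / len(left)
        rv = sum(right) / len(right)
        err = sum((r - lv) ** 2 for r in left) + sum((r - rv) ** 2 for r in right)
        if best is None or err < best[0]:
            best = (err, t, lv, rv)
    return best[1:]


def boost(xs, ys, n_estimators, learning_rate):
    """Return training predictions after n_estimators sequential stumps."""
    base = sum(ys) / len(ys)          # start from the mean of the targets
    preds = [base] * len(ys)
    for _ in range(n_estimators):
        # Each new stump is fitted to what the ensemble still gets wrong.
        residuals = [y - p for y, p in zip(ys, preds)]
        t, lv, rv = fit_stump(xs, residuals)
        # learning_rate scales down how much of the stump's correction is applied.
        preds = [p + learning_rate * (lv if x <= t else rv)
                 for x, p in zip(xs, preds)]
    return preds


def mse(ys, preds):
    return sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys)


xs = [0, 1, 2, 3, 4, 5, 6, 7]
ys = [x * x for x in xs]

# Training error for a fixed learning rate and a growing number of stumps.
errors = {n: mse(ys, boost(xs, ys, n, learning_rate=0.1)) for n in (1, 10, 100)}
for n, err in errors.items():
    print(n, round(err, 3))
```

With a small learning rate, each stump removes only a fraction of the remaining error, which is why low `learning_rate` values tend to pair with high `n_estimators` values in a grid search.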

In [1]:
import joblib
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV

import warnings
warnings.filterwarnings('ignore',category=FutureWarning)
warnings.filterwarnings('ignore',category=DeprecationWarning)

X_train = pd.read_csv('../Data/X_train.csv')
y_train = pd.read_csv('../Data/Y_train.csv')

In [2]:
def performance(results):
    """Print the best parameters, then mean CV accuracy (+/- 2 std) per combination."""
    print(f'Best Parameters : {results.best_params_}\n')
    means = results.cv_results_['mean_test_score']
    stds = results.cv_results_['std_test_score']
    params = results.cv_results_['params']
    # Distinct loop names avoid shadowing the arrays being iterated over
    for mean, std, param in zip(means, stds, params):
        print(f'{round(mean,2)} | (+/-{round(std*2,2)}) for {param}')
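
With many parameter combinations, a printed list like the one `performance` produces gets long. One possible alternative is to load `cv_results_` into a DataFrame and sort by rank. The sketch below is self-contained and runs on synthetic data from `make_classification`, with a deliberately small grid; the names `search` and `table` are illustrative and not part of the notebook.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in for the notebook's training data (assumption).
X, y = make_classification(n_samples=200, n_features=8, random_state=0)

grid = {'n_estimators': [10, 50], 'learning_rate': [0.01, 0.1]}
search = GridSearchCV(GradientBoostingClassifier(random_state=0), grid, cv=3)
search.fit(X, y)

# cv_results_ is a dict of arrays, so it converts directly to a DataFrame.
table = (
    pd.DataFrame(search.cv_results_)
    [['params', 'mean_test_score', 'std_test_score', 'rank_test_score']]
    .sort_values('rank_test_score', kind='stable')
    .reset_index(drop=True)
)
print(table.to_string(index=False))
```

The first row of `table` is the best-scoring combination, matching `search.best_score_`.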

In [3]:
gbc = GradientBoostingClassifier()
parameters = {
    'n_estimators':[5,50,250,500],
    'max_depth':[1,3,5,7,9],
    'learning_rate':[0.01,0.1,1,10,100]
}

gscv = GridSearchCV(gbc, parameters, cv=5)
gscv.fit(X_train, y_train.values.ravel())

performance(gscv)  # prints its own report; wrapping it in print() would also print "None"

Best Parameters : {'learning_rate': 0.01, 'max_depth': 3, 'n_estimators': 500}

0.62 | (+/-0.01) for {'learning_rate': 0.01, 'max_depth': 1, 'n_estimators': 5}
0.8 | (+/-0.12) for {'learning_rate': 0.01, 'max_depth': 1, 'n_estimators': 50}
0.8 | (+/-0.12) for {'learning_rate': 0.01, 'max_depth': 1, 'n_estimators': 250}
0.81 | (+/-0.12) for {'learning_rate': 0.01, 'max_depth': 1, 'n_estimators': 500}
0.62 | (+/-0.01) for {'learning_rate': 0.01, 'max_depth': 3, 'n_estimators': 5}
0.81 | (+/-0.07) for {'learning_rate': 0.01, 'max_depth': 3, 'n_estimators': 50}
0.83 | (+/-0.07) for {'learning_rate': 0.01, 'max_depth': 3, 'n_estimators': 250}
0.84 | (+/-0.08) for {'learning_rate': 0.01, 'max_depth': 3, 'n_estimators': 500}
0.62 | (+/-0.01) for {'learning_rate': 0.01, 'max_depth': 5, 'n_estimators': 5}
0.82 | (+/-0.05) for {'learning_rate': 0.01, 'max_depth': 5, 'n_estimators': 50}
0.82 | (+/-0.04) for {'learning_rate': 0.01, 'max_depth': 5, 'n_estimators': 250}
0.83 | (+/-0.05) for {'learni

In [4]:
gscv.best_estimator_

GradientBoostingClassifier(learning_rate=0.01, n_estimators=500)

Write the Best Model to a **Pickle** File

In [5]:
joblib.dump(gscv.best_estimator_,'../Data/GBC_Model.pkl')

['../Data/GBC_Model.pkl']
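
A saved model can be read back with `joblib.load`. The following is a minimal round-trip sketch under stated assumptions: it trains a small model on synthetic data and writes to a temporary directory rather than `../Data/`, so it runs without the notebook's files. The variable names are illustrative.

```python
import os
import tempfile

import joblib
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

# Synthetic stand-in for the notebook's training data (assumption).
X, y = make_classification(n_samples=100, n_features=5, random_state=0)
model = GradientBoostingClassifier(n_estimators=20, random_state=0).fit(X, y)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'GBC_Model.pkl')
    joblib.dump(model, path)        # same call the notebook uses to save
    restored = joblib.load(path)    # read the fitted estimator back

# The restored estimator should predict exactly like the original.
same = bool((restored.predict(X) == model.predict(X)).all())
print(same)
```

In the notebook's setting, the equivalent call would be `joblib.load('../Data/GBC_Model.pkl')`. Pickled estimators are best loaded with the same scikit-learn version that saved them.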