## Boosting Algorithms in Sklearn
**Boosting algorithms are a family of ensemble machine learning algorithms.**

In this notebook we'll use these five algorithms:
1. Gradient Boosting, the built-in boosting algorithm (sklearn.ensemble)
2. XGBoost (Regression and Classification)
3. LightGBM (Regression and Classification)
4. CatBoost (Regression and Classification)
5. AdaBoost (Regression and Classification)

### About Gradient Boosting

Gradient boosting refers to a class of ensemble machine learning algorithms that can be used for classification or regression predictive modeling problems.

Gradient boosting is also known as gradient tree boosting, stochastic gradient boosting (an extension), and gradient boosting machines, or GBM for short.

Ensembles are constructed from decision tree models. Trees are added one at a time to the ensemble and fit to correct the prediction errors made by prior models. This is a type of ensemble machine learning model referred to as boosting.

Models are fit using any arbitrary differentiable loss function and a gradient descent optimization algorithm. This gives the technique its name, “gradient boosting”: the loss is minimized by following its gradient as the model is fit, much like training a neural network.
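
The idea of adding trees one at a time to correct earlier errors can be sketched in a few lines. The block below is a minimal illustration for squared-error loss only, using one-split "stumps" on a single synthetic feature. The helper names (`fit_stump`, `predict_stump`, `boost`) and the toy data are assumptions made for this example, not scikit-learn internals.

```python
import numpy as np

def fit_stump(x, residual):
    """Find the single threshold on a 1-D feature that best fits the residual."""
    best = None
    for threshold in np.unique(x)[:-1]:  # skip the max so the right side is never empty
        left = residual[x <= threshold]
        right = residual[x > threshold]
        sse = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if best is None or sse < best[0]:
            best = (sse, threshold, left.mean(), right.mean())
    _, threshold, left_value, right_value = best
    return threshold, left_value, right_value

def predict_stump(stump, x):
    threshold, left_value, right_value = stump
    return np.where(x <= threshold, left_value, right_value)

def boost(x, y, n_rounds=50, learning_rate=0.1):
    """Each round fits a stump to the current residuals and adds a shrunken copy."""
    baseline = y.mean()
    prediction = np.full_like(y, baseline, dtype=float)
    stumps = []
    for _ in range(n_rounds):
        residual = y - prediction  # negative gradient of 0.5 * (y - f)^2
        stump = fit_stump(x, residual)
        prediction += learning_rate * predict_stump(stump, x)
        stumps.append(stump)
    return baseline, stumps, prediction

# Synthetic data (assumed for illustration): a noisy sine curve
rng = np.random.default_rng(0)
x = rng.uniform(0, 6, size=200)
y = np.sin(x) + rng.normal(scale=0.1, size=200)

baseline, stumps, fitted = boost(x, y)
mse_start = np.mean((y - baseline) ** 2)
mse_end = np.mean((y - fitted) ** 2)
print(f"MSE before boosting: {mse_start:.4f}, after: {mse_end:.4f}")
```

For squared error the negative gradient is simply the residual, which is why fitting residuals and "following the gradient" coincide here; other losses would change only the `residual` line.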

Gradient boosting is an effective machine learning algorithm and is often the main, or one of the main, algorithms used to win machine learning competitions (like Kaggle) on tabular and similar structured datasets.

### Gradient Boost

In [2]:
#Importing all modules and dataset
from sklearn.datasets import load_breast_cancer
from sklearn.datasets import fetch_california_housing
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split, cross_val_score, RepeatedStratifiedKFold


**Classification Dataset**

In [52]:
BreastCancer = load_breast_cancer()
BreastCancer_df = pd.DataFrame(BreastCancer.data, columns = BreastCancer.feature_names)
BreastCancer_df["Target"] = BreastCancer.target
print(BreastCancer_df.shape)
BreastCancer_df.head()

(569, 31)


| | mean radius | mean texture | mean perimeter | mean area | mean smoothness | mean compactness | mean concavity | mean concave points | mean symmetry | mean fractal dimension | ... | worst texture | worst perimeter | worst area | worst smoothness | worst compactness | worst concavity | worst concave points | worst symmetry | worst fractal dimension | Target |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 17.99 | 10.38 | 122.8 | 1001.0 | 0.1184 | 0.2776 | 0.3001 | 0.1471 | 0.2419 | 0.07871 | ... | 17.33 | 184.6 | 2019.0 | 0.1622 | 0.6656 | 0.7119 | 0.2654 | 0.4601 | 0.1189 | 0 |
| 1 | 20.57 | 17.77 | 132.9 | 1326.0 | 0.08474 | 0.07864 | 0.0869 | 0.07017 | 0.1812 | 0.05667 | ... | 23.41 | 158.8 | 1956.0 | 0.1238 | 0.1866 | 0.2416 | 0.186 | 0.275 | 0.08902 | 0 |
| 2 | 19.69 | 21.25 | 130.0 | 1203.0 | 0.1096 | 0.1599 | 0.1974 | 0.1279 | 0.2069 | 0.05999 | ... | 25.53 | 152.5 | 1709.0 | 0.1444 | 0.4245 | 0.4504 | 0.243 | 0.3613 | 0.08758 | 0 |
| 3 | 11.42 | 20.38 | 77.58 | 386.1 | 0.1425 | 0.2839 | 0.2414 | 0.1052 | 0.2597 | 0.09744 | ... | 26.5 | 98.87 | 567.7 | 0.2098 | 0.8663 | 0.6869 | 0.2575 | 0.6638 | 0.173 | 0 |
| 4 | 20.29 | 14.34 | 135.1 | 1297.0 | 0.1003 | 0.1328 | 0.198 | 0.1043 | 0.1809 | 0.05883 | ... | 16.67 | 152.2 | 1575.0 | 0.1374 | 0.205 | 0.4 | 0.1625 | 0.2364 | 0.07678 | 0 |


In [13]:
np.random.seed(42)
X, y = BreastCancer_df.drop("Target",axis=1), BreastCancer_df.Target
X_train,X_test,y_train,y_test = train_test_split(X,y,test_size=0.2)
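
The cells in this notebook make the split repeatable by seeding NumPy's global random state. As a hedged alternative, `train_test_split` also accepts a `random_state` argument, which keeps the seed local to the call. The sketch below checks that two calls with the same seed give identical splits; the `stratify=y` argument is an optional addition (not used in the cells here) that keeps the class ratio similar in both parts.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

X_bc, y_bc = load_breast_cancer(return_X_y=True)

# Same seed passed explicitly to both calls
split_a = train_test_split(X_bc, y_bc, test_size=0.2, random_state=42, stratify=y_bc)
split_b = train_test_split(X_bc, y_bc, test_size=0.2, random_state=42, stratify=y_bc)

# Compare X_train, X_test, y_train, y_test pairwise
same_split = all(np.array_equal(a, b) for a, b in zip(split_a, split_b))
X_tr, X_te, y_tr, y_te = split_a
print(same_split, X_tr.shape, X_te.shape)
```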

In [14]:
clf = GradientBoostingClassifier(random_state=42)
clf.fit(X_train,y_train)

In [15]:
y_preds = clf.predict(X_test)
y_preds

array([1, 0, 0, 1, 1, 0, 0, 0, 0, 1, 1, 0, 1, 0, 1, 0, 1, 1, 1, 0, 1, 1,
       0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 0, 1, 0, 1, 1, 0, 1, 1,
       1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 0, 0, 1, 1, 0, 0, 1, 1, 1,
       0, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 0, 0, 0, 0, 0,
       1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 1, 1, 0, 0, 1,
       0, 1, 1, 0])

In [16]:
GB_clf_score = clf.score(X_test,y_test)
GB_clf_score

0.956140350877193

In [20]:
cv = RepeatedStratifiedKFold(n_splits = 10, n_repeats=3, random_state = 42)
GB_clf_cv = np.mean(cross_val_score(clf,X,y,cv=cv)) # mean accuracy over 10 folds x 3 repeats = 30 fits
GB_clf_cv

0.959628237259816
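
A single mean hides how much the score varies between folds. The following sketch, on a small synthetic dataset from `make_classification` (an assumption chosen so it runs quickly), shows that `RepeatedStratifiedKFold(n_splits=10, n_repeats=3)` yields 30 individual scores and reports their spread alongside the mean.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score

# Small synthetic problem so the 30 fits finish quickly
X_demo, y_demo = make_classification(n_samples=300, n_features=10, random_state=42)
clf_demo = GradientBoostingClassifier(n_estimators=20, random_state=42)

cv_demo = RepeatedStratifiedKFold(n_splits=10, n_repeats=3, random_state=42)
scores = cross_val_score(clf_demo, X_demo, y_demo, cv=cv_demo)

print(f"{len(scores)} scores, accuracy = {scores.mean():.3f} +/- {scores.std():.3f}")
```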

**Regression Dataset**

In [32]:
CaliforniaHousing = fetch_california_housing()
ch_df = pd.DataFrame(CaliforniaHousing.data, columns = CaliforniaHousing.feature_names)
ch_df["Target"] = CaliforniaHousing.target
ch_df.head()

| | MedInc | HouseAge | AveRooms | AveBedrms | Population | AveOccup | Latitude | Longitude | Target |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 8.3252 | 41.0 | 6.984127 | 1.02381 | 322.0 | 2.555556 | 37.88 | -122.23 | 4.526 |
| 1 | 8.3014 | 21.0 | 6.238137 | 0.97188 | 2401.0 | 2.109842 | 37.86 | -122.22 | 3.585 |
| 2 | 7.2574 | 52.0 | 8.288136 | 1.073446 | 496.0 | 2.80226 | 37.85 | -122.24 | 3.521 |
| 3 | 5.6431 | 52.0 | 5.817352 | 1.073059 | 558.0 | 2.547945 | 37.85 | -122.25 | 3.413 |
| 4 | 3.8462 | 52.0 | 6.281853 | 1.081081 | 565.0 | 2.181467 | 37.85 | -122.25 | 3.422 |


In [33]:
np.random.seed(42)
X, y = ch_df.drop("Target",axis= 1), ch_df.Target
X_train,X_test,y_train,y_test = train_test_split(X,y,test_size=0.2)
X_train.shape, X_test.shape, y_train.shape, y_test.shape, ch_df.shape

((16512, 8), (4128, 8), (16512,), (4128,), (20640, 9))

In [34]:
model = GradientBoostingRegressor(random_state = 42)
model.fit(X_train,y_train)

In [38]:
model_score = model.score(X_test,y_test)
# RepeatedStratifiedKFold stratifies on class labels, so it does not apply to a continuous target; use plain 5-fold CV instead
GB_model_cv = np.mean(cross_val_score(model,X,y,cv=5))
GB_model_cv, model_score

(0.6698291094959469, 0.7756446042829697)
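
For a continuous target, `RepeatedKFold` is the non-stratified counterpart of the repeated splitter used for classification. The sketch below uses synthetic data from `make_regression` (an assumption, to avoid downloading the housing data) and shuffled, repeated folds. The gap between the cross-validated and hold-out R² above may partly come from `cv=5` using contiguous, unshuffled folds, though that is a guess rather than something verified here.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import RepeatedKFold, cross_val_score

# Synthetic regression problem, small enough to run in a moment
X_reg, y_reg = make_regression(n_samples=300, n_features=8, noise=10.0, random_state=42)
reg_demo = GradientBoostingRegressor(n_estimators=30, random_state=42)

# 5 shuffled folds, repeated twice with different shuffles
cv_reg = RepeatedKFold(n_splits=5, n_repeats=2, random_state=42)
r2_scores = cross_val_score(reg_demo, X_reg, y_reg, cv=cv_reg)

print(f"{len(r2_scores)} scores, R^2 = {r2_scores.mean():.3f} +/- {r2_scores.std():.3f}")
```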

**Let's compare this result with RandomForestRegressor**

In [26]:
from sklearn.ensemble import RandomForestRegressor
model = RandomForestRegressor(random_state=42)
model.fit(X_train,y_train)
model.score(X_test,y_test)

0.8051230593157366
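
When several estimators are compared, it helps to fit them all on exactly the same split. Here is one possible pattern, sketched on synthetic data (the dataset and the `candidates` dictionary are assumptions for illustration); the resulting ranking says nothing about the housing data.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.model_selection import train_test_split

X_cmp, y_cmp = make_regression(n_samples=500, n_features=8, noise=15.0, random_state=42)
X_tr, X_te, y_tr, y_te = train_test_split(X_cmp, y_cmp, test_size=0.2, random_state=42)

candidates = {
    "GradientBoostingRegressor": GradientBoostingRegressor(random_state=42),
    "RandomForestRegressor": RandomForestRegressor(random_state=42),
}

# Fit every candidate on the same training part and score on the same test part
results = {name: est.fit(X_tr, y_tr).score(X_te, y_te) for name, est in candidates.items()}

for name, r2 in sorted(results.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: R^2 = {r2:.3f}")
```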

## Let's use the HistGradientBoosting algorithms
The scikit-learn library provides an alternate implementation of the gradient boosting algorithm, referred to as histogram-based gradient boosting.

This is an alternate approach to implement gradient tree boosting inspired by the LightGBM library (described more later). This implementation is provided via the HistGradientBoostingClassifier and HistGradientBoostingRegressor classes.

The primary benefit of the histogram-based approach to gradient boosting is speed. These implementations are designed to be much faster to fit on training data.
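
The speed-up comes from bucketing each continuous feature into a small number of integer bins before any tree is grown, so a split search only has to consider bin boundaries instead of every distinct value. The sketch below is a simplified illustration of that binning step using quantile edges; `bin_feature` is a hypothetical helper written for this example and is not scikit-learn's actual implementation.

```python
import numpy as np

def bin_feature(values, max_bins=255):
    """Map continuous values to integer bin indices using quantile edges."""
    quantiles = np.linspace(0, 1, max_bins + 1)[1:-1]  # interior cut points only
    edges = np.unique(np.quantile(values, quantiles))
    return np.searchsorted(edges, values, side="right"), edges

rng = np.random.default_rng(0)
feature = rng.lognormal(size=10_000)  # skewed synthetic feature

binned, edges = bin_feature(feature, max_bins=16)
print(f"{len(np.unique(feature))} distinct values -> {len(np.unique(binned))} bins")
print("candidate split points:", len(edges))
```

With 16 bins, a split search on this feature checks 15 thresholds rather than roughly 10,000.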

In [40]:
from sklearn.ensemble import HistGradientBoostingRegressor
np.random.seed(42)
model = HistGradientBoostingRegressor()
model.fit(X_train,y_train)
model.score(X_test,y_test)

0.8355440864062386

In [42]:
from sklearn.ensemble import HistGradientBoostingClassifier
np.random.seed(42)
X, y = BreastCancer_df.drop("Target",axis=1), BreastCancer_df.Target
X_train,X_test,y_train,y_test = train_test_split(X,y,test_size=0.2)
clf = HistGradientBoostingClassifier()
clf.fit(X_train,y_train)
clf.score(X_test,y_test) # this time the score is higher than plain gradient boosting (0.956)

0.9736842105263158

## Gradient Boosting with `XGBoost`

XGBoost, which is short for “Extreme Gradient Boosting,” is a library that provides an efficient implementation of the gradient boosting algorithm.

The main benefits of the XGBoost implementation are computational efficiency and, often, better model performance.

`pip install xgboost` or `conda install xgboost`

**Classification**

In [45]:
from xgboost import XGBClassifier
np.random.seed(42)
X, y = BreastCancer_df.drop("Target",axis=1), BreastCancer_df.Target
X_train,X_test,y_train,y_test = train_test_split(X,y,test_size=0.2)
clf = XGBClassifier()
clf.fit(X_train,y_train)
clf.score(X_test,y_test)

0.956140350877193

**Regression**

In [46]:
from xgboost import XGBRegressor
np.random.seed(42)
X, y = ch_df.drop("Target",axis= 1), ch_df.Target
X_train,X_test,y_train,y_test = train_test_split(X,y,test_size=0.2)
model = XGBRegressor()
model.fit(X_train,y_train)
model.score(X_test,y_test)

0.828616180679985

In [47]:
# 'reg:squarederror' is already the default objective, so setting it explicitly gives the same score
model = XGBRegressor(objective='reg:squarederror')
model.fit(X_train,y_train)
model.score(X_test,y_test)

0.828616180679985

## Gradient Boosting With LightGBM

LightGBM, short for Light Gradient Boosted Machine, is a library developed at Microsoft that provides an efficient implementation of the gradient boosting algorithm.

The primary benefit of LightGBM is its changes to the training algorithm, which make training dramatically faster and, in many cases, result in a more effective model.

For more technical details on the LightGBM algorithm, see the paper:
https://papers.nips.cc/paper_files/paper/2017/hash/6449f44a102fde848669bdd9eb6b76fa-Abstract.html

`pip install lightgbm` or `conda install lightgbm`

**Classification**

In [49]:
from lightgbm import LGBMClassifier
np.random.seed(42)
X, y = BreastCancer_df.drop("Target",axis=1), BreastCancer_df.Target
X_train,X_test,y_train,y_test = train_test_split(X,y,test_size=0.2)
clf = LGBMClassifier()
clf.fit(X_train,y_train)
clf.score(X_test,y_test)

0.9649122807017544

**Regression**

In [50]:
from lightgbm import LGBMRegressor
np.random.seed(42)
X, y = ch_df.drop("Target",axis= 1), ch_df.Target
X_train,X_test,y_train,y_test = train_test_split(X,y,test_size=0.2)
model = LGBMRegressor()
model.fit(X_train,y_train)
model.score(X_test,y_test)

0.8360449251645318

## Gradient Boosting with CatBoost
CatBoost is a third-party library developed at `Yandex` that provides an efficient implementation of the gradient boosting algorithm.

> The primary benefit of CatBoost (in addition to computational speed improvements) is support for categorical input variables. This gives the library its name, CatBoost, for “Category Gradient Boosting.”

For more technical details on the CatBoost algorithm, see the paper:
https://arxiv.org/abs/1810.11363

`pip install catboost` or `conda install catboost`
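
Neither dataset in this notebook has categorical columns, so the feature that gives CatBoost its name is never exercised. The sketch below passes a string column directly through `cat_features` without one-hot encoding. The toy frame, its column names (`color`, `size`) and the target rule are all made up for illustration, and the import is guarded so the block is skipped when `catboost` is not installed.

```python
import numpy as np
import pandas as pd

try:
    from catboost import CatBoostClassifier
except ImportError:  # catboost is an optional third-party dependency
    CatBoostClassifier = None

rng = np.random.default_rng(42)
n_rows = 200
toy = pd.DataFrame({
    "color": rng.choice(["red", "green", "blue"], size=n_rows),  # categorical, kept as strings
    "size": rng.normal(size=n_rows),                             # numeric
})
# Hypothetical target that depends on both the category and the numeric feature
target = ((toy["color"] == "red") | (toy["size"] > 1.0)).astype(int)

accuracy = None
if CatBoostClassifier is not None:
    cat_clf = CatBoostClassifier(
        iterations=50,
        verbose=0,                  # silence the per-iteration training log
        random_seed=42,
        allow_writing_files=False,  # do not create a catboost_info directory
    )
    cat_clf.fit(toy, target, cat_features=["color"])
    accuracy = cat_clf.score(toy, target)
    print(f"training accuracy: {accuracy:.3f}")
```

Setting `verbose=0` also avoids the long iteration-by-iteration log that the default settings print during `fit`.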

**Classification**

In [51]:
from catboost import CatBoostClassifier
np.random.seed(42)
X, y = BreastCancer_df.drop("Target",axis=1), BreastCancer_df.Target
X_train,X_test,y_train,y_test = train_test_split(X,y,test_size=0.2)
clf = CatBoostClassifier()
clf.fit(X_train,y_train)
clf.score(X_test,y_test)

Learning rate set to 0.00736
0:	learn: 0.6812707	total: 135ms	remaining: 2m 14s
1:	learn: 0.6692554	total: 143ms	remaining: 1m 11s
2:	learn: 0.6584000	total: 150ms	remaining: 49.9s
...
902:	learn: 0.0108464	total: 5.27s	remaining: 566ms
903:	learn: 0.0108370	total: 5.27s	remaining: 560ms
...

0.9736842105263158

**Regression**

In [53]:
from catboost import CatBoostRegressor
np.random.seed(42)
X, y = ch_df.drop("Target",axis= 1), ch_df.Target
X_train,X_test,y_train,y_test = train_test_split(X,y,test_size=0.2)
model = CatBoostRegressor()
model.fit(X_train,y_train)
model.score(X_test,y_test)

Learning rate set to 0.063766
0:	learn: 1.1147097	total: 8.56ms	remaining: 8.55s
1:	learn: 1.0778029	total: 16.8ms	remaining: 8.4s
2:	learn: 1.0430071	total: 23.9ms	remaining: 7.94s
...
997:	learn: 0.3430989	total: 4.24s	remaining: 8.49ms
998:	learn: 0.3430292	total: 4.24s	remaining: 4.24ms
999:	learn: 0.3429485	total: 4.24s	remaining: 0us


0.8492011667520589

For more information on this topic, see this page:

https://machinelearningmastery.com/gradient-boosting-with-scikit-learn-xgboost-lightgbm-and-catboost/

## AdaBoost Algorithm - Ensemble learning
AdaBoost, short for Adaptive Boosting, is an ensemble boosting classifier proposed by Yoav Freund and Robert Schapire in 1996. It is an iterative ensemble method that builds a strong, high-accuracy classifier by combining multiple weak (poorly performing) classifiers. The basic idea is to set the weights of the classifiers and to re-weight the training samples in each iteration so that hard-to-classify observations are predicted accurately. Any machine learning algorithm can be used as the base classifier, provided it accepts weights on the training set. AdaBoost should meet two conditions:

1. The classifier should be trained iteratively on variously weighted training examples.
2. In each iteration, it tries to provide an excellent fit for these examples by minimizing training error.
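
The re-weighting loop described above can be sketched from scratch. This is a minimal version of discrete AdaBoost with decision stumps and labels in {-1, +1}. The helper names and the toy circular dataset are assumptions for illustration, and scikit-learn's `AdaBoostClassifier` may differ in its details.

```python
import numpy as np

def stump_predict(X, feature, threshold, polarity):
    """Predict +1 on one side of the threshold and -1 on the other."""
    return np.where(polarity * (X[:, feature] - threshold) > 0, 1, -1)

def best_stump(X, y, weights):
    """Return (error, feature, threshold, polarity) with the lowest weighted error."""
    best = (np.inf, 0, 0.0, 1)
    for feature in range(X.shape[1]):
        for threshold in np.unique(X[:, feature]):
            for polarity in (1, -1):
                pred = stump_predict(X, feature, threshold, polarity)
                error = weights[pred != y].sum()
                if error < best[0]:
                    best = (error, feature, threshold, polarity)
    return best

def adaboost(X, y, n_rounds=20):
    weights = np.full(len(y), 1.0 / len(y))  # start with uniform sample weights
    ensemble = []
    for _ in range(n_rounds):
        error, feature, threshold, polarity = best_stump(X, y, weights)
        error = np.clip(error, 1e-10, 1 - 1e-10)   # avoid division by zero
        alpha = 0.5 * np.log((1 - error) / error)  # vote weight of this stump
        pred = stump_predict(X, feature, threshold, polarity)
        weights *= np.exp(-alpha * y * pred)       # misclassified samples gain weight
        weights /= weights.sum()
        ensemble.append((alpha, feature, threshold, polarity))
    return ensemble, weights

def ensemble_predict(ensemble, X):
    votes = sum(alpha * stump_predict(X, f, t, p) for alpha, f, t, p in ensemble)
    return np.where(votes >= 0, 1, -1)

# Toy data: points outside a circle are +1, inside are -1 (no single stump can separate them)
rng = np.random.default_rng(0)
X_toy = rng.normal(size=(200, 2))
y_toy = np.where(X_toy[:, 0] ** 2 + X_toy[:, 1] ** 2 > 1.5, 1, -1)

ensemble, final_weights = adaboost(X_toy, y_toy)
train_accuracy = np.mean(ensemble_predict(ensemble, X_toy) == y_toy)
print(f"training accuracy with {len(ensemble)} stumps: {train_accuracy:.3f}")
```

Each stump alone is a weak classifier on this data; the weighted vote across rounds is what produces the stronger combined model.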

In [54]:
from sklearn.ensemble import AdaBoostClassifier
np.random.seed(42)
X, y = BreastCancer_df.drop("Target",axis=1), BreastCancer_df.Target
X_train,X_test,y_train,y_test = train_test_split(X,y,test_size=0.2)
clf = AdaBoostClassifier()
clf.fit(X_train,y_train)
clf.score(X_test,y_test)

0.9736842105263158

In [57]:
from sklearn.ensemble import AdaBoostRegressor
np.random.seed(42)
X, y = ch_df.drop("Target",axis= 1), ch_df.Target
X_train,X_test,y_train,y_test = train_test_split(X,y,test_size=0.2)
model = AdaBoostRegressor()
model.fit(X_train,y_train)
model.score(X_test,y_test) # boosting is not always better: this is the lowest regression score in the notebook

0.45596647284058933