# 1. Bagging, Boosting, Stacking
Source (EN): https://stats.stackexchange.com/questions/18891/bagging-boosting-and-stacking-in-machine-learning  

Bagging:
1. Parallel ensemble: each model is built independently, on its own bootstrap sample of the training data.
2. Aims to decrease variance, not bias.
3. Suited to high-variance, low-bias base models (complex models such as fully grown decision trees).
4. A tree-based example is the random forest.

Boosting:
1. Sequential ensemble: each new model is added to do well where the previous models fall short.
2. Aims primarily to decrease bias rather than variance.
3. Suited to low-variance, high-bias base models (simple models such as decision stumps).
4. A tree-based example is gradient boosting.

![Bagging](Figure/Bagging_vs_Boosting.png)
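The sequential idea behind boosting can be sketched by hand. The following is a minimal illustration, assuming squared-error loss: each new stump is fitted to the residuals of the current ensemble, which is gradient boosting in miniature. The synthetic sine data, the 20 rounds, and the learning rate of 0.5 are arbitrary choices made for this sketch, not values from the sources above.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Synthetic 1-D regression data (assumption: chosen only for illustration).
rng = np.random.RandomState(0)
X = np.sort(rng.uniform(0, 6, size=200)).reshape(-1, 1)
y = np.sin(X).ravel() + rng.normal(scale=0.1, size=200)

learning_rate = 0.5
prediction = np.full_like(y, y.mean())      # start from a constant model
train_mse = [np.mean((y - prediction) ** 2)]

for _ in range(20):
    residual = y - prediction               # where the current ensemble still errs
    stump = DecisionTreeRegressor(max_depth=1).fit(X, residual)
    prediction += learning_rate * stump.predict(X)
    train_mse.append(np.mean((y - prediction) ** 2))

print("training MSE, first vs last round: %.3f -> %.3f"
      % (train_mse[0], train_mse[-1]))
```

Each stump alone is a high-bias model; adding them one after another lowers the training error step by step, which is the bias reduction the list refers to.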

# 2. Application Example
Source: Python Machine Learning

## 2.1 Stacking
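Same training samples, different base models. Note that the cells below combine the base models with a soft-voting `VotingClassifier`, which averages their predicted probabilities; it does not train a meta-learner on the base models' outputs, as stacking in the strict sense does.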

In [26]:
import warnings
warnings.filterwarnings("ignore")

In [2]:
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.preprocessing import LabelEncoder

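iris = datasets.load_iris()
# Keep only versicolor and virginica (rows 50 onward) and two features:
# sepal width (column 1) and petal length (column 2).
X, y = iris.data[50:, [1, 2]], iris.target[50:]
# Re-encode the remaining labels 1 and 2 as 0 and 1.
le = LabelEncoder()
y = le.fit_transform(y)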

In [9]:
X_train,X_test,y_train,y_test = train_test_split(X,y,test_size=0.5,random_state=1)

In [10]:
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
import numpy as np

In [11]:
# Three base classifiers of different types.
clf1 = LogisticRegression(penalty='l2', C=0.001, random_state=0)
clf2 = DecisionTreeClassifier(max_depth=1, criterion='entropy', random_state=0)
clf3 = KNeighborsClassifier(n_neighbors=1, p=2, metric='minkowski')
# Logistic regression and KNN are scale-sensitive, so they get a StandardScaler;
# the decision tree is scale-invariant and is used without one.
pipe1 = Pipeline([['sc', StandardScaler()], ['clf', clf1]])
pipe3 = Pipeline([['sc', StandardScaler()], ['clf', clf3]])

In [12]:
clf_labels = ['LR','DT','KNN']

In [15]:
for clf,label in zip([pipe1,clf2,pipe3],clf_labels):
    scores = cross_val_score(estimator=clf,X=X_train,y=y_train,cv=10,scoring='roc_auc')
    print("ROC AUC: %0.2f (+/- %0.2f) [%s]"% (scores.mean(), scores.std(), label))

ROC AUC: 0.92 (+/- 0.20) [LR]
ROC AUC: 0.92 (+/- 0.15) [DT]
ROC AUC: 0.93 (+/- 0.10) [KNN]




In [17]:
from sklearn.ensemble import VotingClassifier

In [20]:
mv_clf = VotingClassifier(estimators=[('LR', pipe1), ('DT', clf2), ('KNN', pipe3)], voting='soft')
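
What `voting='soft'` does can be illustrated with plain arithmetic. The sketch below uses made-up probabilities for a single sample (they are hypothetical, not outputs of the classifiers above) to show that averaging probabilities can pick a different class than counting each model's label.

```python
import numpy as np

# Hypothetical class-probability outputs of three classifiers for one sample.
probas = np.array([[0.90, 0.10],
                   [0.40, 0.60],
                   [0.45, 0.55]])

soft_vote = probas.mean(axis=0)         # soft voting: average the probabilities
hard_votes = probas.argmax(axis=1)      # hard voting: each model casts one label

print("soft voting picks class", soft_vote.argmax())
print("hard voting picks class", np.bincount(hard_votes).argmax())
```

Here one very confident model outweighs two lukewarm ones under soft voting, while hard voting follows the two-to-one majority.
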

In [25]:
score = cross_val_score(estimator=mv_clf,X=X_train,y=y_train,cv=10,scoring='roc_auc')

In [22]:
print("ROC AUC: %0.2f (+/- %0.2f) [%s]"% (score.mean(), score.std(), 'Voting'))


ROC AUC: 0.97 (+/- 0.10) [Voting]
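
For comparison, stacking in the strict sense trains a meta-learner on the base models' cross-validated predictions. The following is a sketch only, assuming scikit-learn 0.22 or later, where `StackingClassifier` is available. The logistic-regression meta-learner, `cv=5`, and the stratified split are choices made here for illustration; they are not taken from the book, and no result is claimed for this configuration.

```python
from sklearn import datasets
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Same subset of iris as above: versicolor vs. virginica, two features.
iris = datasets.load_iris()
X, y = iris.data[50:, [1, 2]], iris.target[50:]
# Assumption: stratify so every CV fold contains both classes.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=1, stratify=y)

base_models = [
    ('LR', make_pipeline(StandardScaler(),
                         LogisticRegression(C=0.001, random_state=0))),
    ('DT', DecisionTreeClassifier(max_depth=1, criterion='entropy',
                                  random_state=0)),
    ('KNN', make_pipeline(StandardScaler(),
                          KNeighborsClassifier(n_neighbors=1))),
]

# The final estimator is fitted on out-of-fold predictions of the base models.
stack_clf = StackingClassifier(estimators=base_models,
                               final_estimator=LogisticRegression(),
                               cv=5)

scores = cross_val_score(stack_clf, X_train, y_train, cv=10, scoring='roc_auc')
print("ROC AUC: %0.2f (+/- %0.2f) [Stacking]" % (scores.mean(), scores.std()))
```
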


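## 2.2 Bagging

Sampling with replacement: the base model is the same, but each copy is trained on a different bootstrap sample of the training data.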
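
The mechanism can be written out by hand before using the library. This is a minimal sketch on synthetic data (the dataset, the 25 trees, and the hard majority vote are assumptions made for illustration). scikit-learn's `BaggingClassifier` averages predicted probabilities when the base estimator provides them, so the vote here is a simplification.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

# Synthetic two-class data (assumption: stands in for a real dataset).
X_demo, y_demo = make_classification(n_samples=200, n_features=2,
                                     n_informative=2, n_redundant=0,
                                     random_state=1)
rng = np.random.RandomState(1)
n_samples = X_demo.shape[0]

trees, unique_fractions = [], []
for _ in range(25):
    # Bootstrap: draw n_samples row indices with replacement.
    idx = rng.randint(0, n_samples, size=n_samples)
    unique_fractions.append(np.unique(idx).size / n_samples)
    tree_i = DecisionTreeClassifier(criterion='entropy', random_state=0)
    trees.append(tree_i.fit(X_demo[idx], y_demo[idx]))

# Aggregate: majority vote over the 25 trees (labels are 0/1, odd count, no ties).
votes = np.stack([t.predict(X_demo) for t in trees])    # shape (25, 200)
y_bagged = (votes.mean(axis=0) > 0.5).astype(int)

print("mean fraction of distinct rows per bootstrap sample: %.2f"
      % np.mean(unique_fractions))
```

Because rows are drawn with replacement, each bootstrap sample leaves out roughly a third of the original rows, which is what makes the trees differ from one another.
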

In [28]:
import pandas as pd
df_wine = pd.read_csv('https://archive.ics.uci.edu/ml/machine-learning-databases/wine/wine.data', header=None)

In [29]:
df_wine.columns = ['Class label', 'Alcohol',
                   'Malic acid', 'Ash',
                   'Alcalinity of ash',
                   'Magnesium', 'Total phenols',
                   'Flavanoids', 'Nonflavanoid phenols',
                   'Proanthocyanins',
                   'Color intensity', 'Hue',
                   'OD280/OD315 of diluted wines',
                   'Proline']

In [30]:
df_wine = df_wine[df_wine['Class label'] != 1]

In [31]:
y = df_wine['Class label'].values
X = df_wine[['Alcohol', 'Hue']].values

In [32]:
from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import train_test_split
le = LabelEncoder()
y = le.fit_transform(y)
X_train, X_test, y_train, y_test  = train_test_split(X,y,test_size=0.4,random_state=1)

In [33]:
from sklearn.ensemble import BaggingClassifier
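# Unpruned tree: a low-bias, high-variance base model.
tree = DecisionTreeClassifier(criterion='entropy', max_depth=None)
# Note: `estimator` was called `base_estimator` before scikit-learn 1.2.
bag = BaggingClassifier(estimator=tree,
                        n_estimators=500,
                        max_samples=1.0,
                        max_features=1.0,
                        bootstrap=True,
                        bootstrap_features=False,
                        n_jobs=1,
                        random_state=1)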

In [34]:
from sklearn.metrics import accuracy_score
tree = tree.fit(X_train, y_train)
y_train_pred = tree.predict(X_train)
y_test_pred = tree.predict(X_test)
tree_train = accuracy_score(y_train, y_train_pred)
tree_test = accuracy_score(y_test, y_test_pred)
print('Decision tree train/test accuracies %.3f/%.3f' % (tree_train, tree_test))

Decision tree train/test accuracies 1.000/0.854


In [35]:
bag = bag.fit(X_train, y_train)
y_train_pred = bag.predict(X_train)
y_test_pred = bag.predict(X_test)
bag_train = accuracy_score(y_train, y_train_pred)
bag_test = accuracy_score(y_test, y_test_pred)
print('Bagging train/test accuracies %.3f/%.3f'% (bag_train, bag_test))

Bagging train/test accuracies 1.000/0.896


## 2.3 AdaBoost

In [37]:
from sklearn.ensemble import AdaBoostClassifier
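# Decision stump: a high-bias, low-variance base model.
tree = DecisionTreeClassifier(criterion='entropy',
                              max_depth=1)
# Note: `estimator` was called `base_estimator` before scikit-learn 1.2.
ada = AdaBoostClassifier(estimator=tree,
                         n_estimators=500,
                         learning_rate=0.1,
                         random_state=0)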
tree = tree.fit(X_train, y_train)
y_train_pred = tree.predict(X_train)
y_test_pred = tree.predict(X_test)
tree_train = accuracy_score(y_train, y_train_pred)
tree_test = accuracy_score(y_test, y_test_pred)
print('Decision tree train/test accuracies %.3f/%.3f'% (tree_train, tree_test))

Decision tree train/test accuracies 0.845/0.854


In [38]:
ada = ada.fit(X_train, y_train)
y_train_pred = ada.predict(X_train)
y_test_pred = ada.predict(X_test)
ada_train = accuracy_score(y_train, y_train_pred)
ada_test = accuracy_score(y_test, y_test_pred)
print('AdaBoost train/test accuracies %.3f/%.3f'% (ada_train, ada_test))

AdaBoost train/test accuracies 1.000/0.875
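
To see how AdaBoost shifts attention to the hard samples, one round of the textbook discrete AdaBoost weight update (labels coded as -1/+1) can be worked through in NumPy. The ten labels and predictions below are made up for illustration, and scikit-learn's `AdaBoostClassifier` uses a multi-class variant whose update differs in detail, so this is a sketch of the idea rather than of the library's internals.

```python
import numpy as np

# Hypothetical labels and weak-learner predictions for 10 samples.
y_true = np.array([1, 1, 1, -1, -1, -1, 1, 1, 1, -1])
y_pred = np.array([1, 1, 1, -1, -1, -1, -1, -1, -1, -1])  # misses samples 6-8

weights = np.full(y_true.size, 1 / y_true.size)     # start with uniform weights
miss = y_pred != y_true
epsilon = weights[miss].sum()                        # weighted error rate (0.3)
alpha = 0.5 * np.log((1 - epsilon) / epsilon)        # coefficient of this learner

# Correct samples (y_true * y_pred = +1) shrink, misclassified ones grow.
weights = weights * np.exp(-alpha * y_true * y_pred)
weights /= weights.sum()                             # renormalise to sum to 1

print("alpha = %.3f" % alpha)
print("weight of a correct sample:       %.3f" % weights[0])
print("weight of a misclassified sample: %.3f" % weights[6])
```

After the update the misclassified samples together carry half of the total weight, so the next weak learner cannot do well by repeating the same mistakes.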
