[View in Colaboratory](https://colab.research.google.com/github/yikayiyo/yikayiyo_py/blob/master/stacking.ipynb)

Stacking (stacked generalization) is based on a simple idea: instead of using trivial functions (such as hard voting) to aggregate the predictions of all predictors in an ensemble, why don't we **train a model to perform this aggregation**?  

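A common approach is to use a hold-out set.  
The training data is split into two subsets. The first subset trains the first-layer learners; these learners then predict on the second subset, giving one prediction per learner for each sample, so each sample is described by a vector of predictions.  
These vectors are then used as the input features, with the original labels as targets, to train the blender (the second-layer model), which outputs the prediction.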
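The hold-out procedure described above can be sketched with plain scikit-learn. This is only a minimal illustration under stated assumptions: the choice of first-layer learners, the use of predicted class-1 probabilities (rather than hard labels) as blender features, the fixed `random_state`, and the helper name `meta_features` are all choices made here, not taken from the notebook.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)

# Keep a test set aside, then split the remainder into two halves:
# subset 1 trains the first-layer learners, subset 2 (hold-out) trains the blender.
X_rest, X_test, y_rest, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y)
X_layer1, X_hold, y_layer1, y_hold = train_test_split(
    X_rest, y_rest, test_size=0.5, random_state=0, stratify=y_rest)

# First layer (an assumed pair of learners, chosen only for illustration).
first_layer = [
    make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    RandomForestClassifier(n_estimators=50, random_state=0),
]
for clf in first_layer:
    clf.fit(X_layer1, y_layer1)


def meta_features(models, X):
    """Hypothetical helper: one column per learner, its predicted P(class 1)."""
    return np.column_stack([m.predict_proba(X)[:, 1] for m in models])


# Second layer: the blender sees only first-layer predictions on the
# hold-out subset, paired with the original labels.
blender = LogisticRegression()
blender.fit(meta_features(first_layer, X_hold), y_hold)

# At prediction time, new samples pass through the first layer, then the blender.
Z_test = meta_features(first_layer, X_test)
blend_accuracy = blender.score(Z_test, y_test)
print("blender accuracy:", blend_accuracy)
```

Because the blender never sees predictions made on data the first-layer learners were trained on, its inputs resemble what it will receive at test time.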


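In general, several such blenders can be trained (using different algorithms yields different blenders), and their results are then combined to produce the final prediction.  
In that case the data must be split into three sets: the first trains the first-layer learners, the second trains the second-layer blenders, and the third trains the final meta-learner in the third layer.  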
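The three-set variant can be sketched the same way. The models in each layer, the equal three-way split, and the helper names `proba_columns` and `predict_stack` are assumptions made for this example; the original text does not prescribe any of them.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier


def proba_columns(models, X):
    """Hypothetical helper: stack each model's predicted P(class 1) as a column."""
    return np.column_stack([m.predict_proba(X)[:, 1] for m in models])


X, y = load_breast_cancer(return_X_y=True)
X_rest, X_test, y_rest, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y)

# Three disjoint training sets, one per layer.
X1, X23, y1, y23 = train_test_split(
    X_rest, y_rest, test_size=2 / 3, random_state=0, stratify=y_rest)
X2, X3, y2, y3 = train_test_split(
    X23, y23, test_size=0.5, random_state=0, stratify=y23)

# Layer 1: base learners trained on set 1.
layer1 = [
    RandomForestClassifier(n_estimators=50, random_state=0),
    GradientBoostingClassifier(random_state=0),
    KNeighborsClassifier(),
]
for model in layer1:
    model.fit(X1, y1)

# Layer 2: several blenders (different algorithms) trained on
# layer-1 predictions for set 2.
layer2 = [LogisticRegression(), DecisionTreeClassifier(max_depth=3, random_state=0)]
Z2 = proba_columns(layer1, X2)
for blender in layer2:
    blender.fit(Z2, y2)

# Layer 3: the final meta-learner trained on the blenders' predictions for set 3.
meta = LogisticRegression()
meta.fit(proba_columns(layer2, proba_columns(layer1, X3)), y3)


def predict_stack(X_new):
    """Pass new samples through layer 1, layer 2, then the meta-learner."""
    return meta.predict(proba_columns(layer2, proba_columns(layer1, X_new)))


stack_accuracy = float(np.mean(predict_stack(X_test) == y_test))
print("three-layer stack accuracy:", stack_accuracy)
```

Each extra layer costs another slice of training data, so on a dataset this small the deeper stack is not guaranteed to beat a single blender.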

Open-source packages for stacking include DESlib (Python, used below for its dynamic selection techniques) and StackNet (Java).
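Besides the packages listed above, scikit-learn itself (version 0.22 or later, which is an assumption about the reader's environment) provides `StackingClassifier`. It is not part of the original list, and it builds the blender's training features with cross-validation rather than a single hold-out set, so the sketch below is a related alternative rather than the exact procedure described earlier. The estimator choices are illustrative.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y)

stack = StackingClassifier(
    estimators=[
        ("lr", make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))),
        ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
    ],
    final_estimator=LogisticRegression(),  # plays the role of the blender
    cv=5,  # out-of-fold predictions become the blender's training features
)
stack.fit(X_train, y_train)

cv_stack_accuracy = stack.score(X_test, y_test)
print("StackingClassifier accuracy:", cv_stack_accuracy)
```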

In [56]:
! pip install deslib



In [0]:
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import Perceptron
from sklearn.calibration import CalibratedClassifierCV
from sklearn.ensemble import BaggingClassifier

# Import dynamic classifier selection (DCS) techniques from DESlib
from deslib.dcs.ola import OLA
from deslib.dcs.a_priori import APriori
from deslib.dcs.mcb import MCB

# Import dynamic ensemble selection (DES) techniques from DESlib
from deslib.des.des_p import DESP
from deslib.des.knora_u import KNORAU
from deslib.des.knora_e import KNORAE
from deslib.des.meta_des import METADES

In [0]:
data = load_breast_cancer()
X = data.data
y = data.target
# Train : test = 9 : 1
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.1)

# Scale the variables to have zero mean and unit variance
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)  # transform the test data with statistics fitted on the training data

# Split the training set into two equal halves: one trains the first-layer learners, the other (DSEL) trains the blender
X_train, X_dsel, y_train, y_dsel = train_test_split(X_train, y_train, test_size=0.5)

In [59]:
X_train.shape

(256, 30)

In [60]:
from sklearn import linear_model
model1 = linear_model.LogisticRegression(C=1.0,
                         multi_class='multinomial',
                         penalty='l1', solver='saga', tol=0.1)
model1.fit(X_train,y_train)

LogisticRegression(C=1.0, class_weight=None, dual=False, fit_intercept=True,
          intercept_scaling=1, max_iter=100, multi_class='multinomial',
          n_jobs=1, penalty='l1', random_state=None, solver='saga',
          tol=0.1, verbose=0, warm_start=False)

In [61]:
from sklearn.ensemble import RandomForestClassifier
model2 = RandomForestClassifier(n_estimators=10)
model2.fit(X_train, y_train)


RandomForestClassifier(bootstrap=True, class_weight=None, criterion='gini',
            max_depth=None, max_features='auto', max_leaf_nodes=None,
            min_impurity_decrease=0.0, min_impurity_split=None,
            min_samples_leaf=1, min_samples_split=2,
            min_weight_fraction_leaf=0.0, n_estimators=10, n_jobs=1,
            oob_score=False, random_state=None, verbose=0,
            warm_start=False)

In [62]:
from sklearn.ensemble import ExtraTreesClassifier
model3 = ExtraTreesClassifier(n_estimators=20)
model3.fit(X_train,y_train)

ExtraTreesClassifier(bootstrap=False, class_weight=None, criterion='gini',
           max_depth=None, max_features='auto', max_leaf_nodes=None,
           min_impurity_decrease=0.0, min_impurity_split=None,
           min_samples_leaf=1, min_samples_split=2,
           min_weight_fraction_leaf=0.0, n_estimators=20, n_jobs=1,
           oob_score=False, random_state=None, verbose=0, warm_start=False)

In [63]:
from sklearn.ensemble import GradientBoostingClassifier
model4 = GradientBoostingClassifier(n_estimators=100, learning_rate=1.0,max_depth=1, random_state=0)
model4.fit(X_train,y_train)
model4.score(X_test,y_test)

0.9649122807017544

In [64]:
# Pool of fitted first-layer learners (model3, the extra-trees model, is left out)
pool_classifiers = [model1, model2, model4]

# MCB and A Priori are dynamic classifier selection techniques from DESlib
mcb = MCB(pool_classifiers)
apriori = APriori(pool_classifiers)

# Fit the selection techniques on the held-out DSEL half
mcb.fit(X_dsel, y_dsel)
apriori.fit(X_dsel, y_dsel)

print("base learners' score:")
print('LR :', model1.score(X_test, y_test))
print('RF:', model2.score(X_test, y_test))
# print('Extra tree:', model3.score(X_test, y_test))
print('GB:', model4.score(X_test, y_test))

# Calculate classification accuracy of each technique
print('Evaluating DS techniques:')
print('Classification accuracy MCB: ', mcb.score(X_test, y_test))
print('Classification accuracy A priori: ', apriori.score(X_test, y_test))

base learners' score:
LR : 0.9824561403508771
RF: 0.9649122807017544
GB: 0.9649122807017544
Evaluating DS techniques:
Classification accuracy MCB:  0.9473684210526315
Classification accuracy A priori:  0.9649122807017544


In [65]:
desp = DESP(pool_classifiers)
desp.fit(X_dsel, y_dsel)
print('Classification accuracy DESP: ', desp.score(X_test, y_test))

Classification accuracy DESP:  0.9649122807017544


In [66]:
desp.predict(X_test)

array([0, 0, 1, 1, 1, 1, 0, 1, 0, 1, 1, 1, 0, 0, 1, 1, 0, 0, 1, 0, 1, 0,
       0, 1, 1, 0, 0, 1, 1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 1, 1, 1, 1, 1, 1,
       0, 1, 1, 0, 1, 0, 0, 1, 0, 1, 1, 0, 1])