### Stacking
Stacking is an ensemble learning technique that uses the predictions of multiple models (for example a decision tree, k-NN, or SVM) as the inputs to a new model, often called the meta-model. The meta-model then makes the final predictions on the test set. Below is a step-by-step explanation of a simple stacked ensemble:

    Step 1. The train set is split into 10 parts.
    Step 2. A base model (say, a decision tree) is fitted on 9 parts and predictions are made for the 10th part. This is repeated so that each part is held out once, giving a prediction for every row of the train set.
    Step 3. The base model (here, the decision tree) is then fitted on the whole train set.
    Step 4. Using this model, predictions are made on the test set.
    Step 5. Steps 2 to 4 are repeated for another base model (say, k-NN), resulting in another set of predictions for the train set and the test set.
    Step 6. The predictions for the train set are used as features to build a new model (the meta-model).
    Step 7. This model makes the final predictions, using the base models' test-set predictions as its features.
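
The steps above can be sketched by hand with scikit-learn. This is a minimal illustration rather than the notebook's own method: the synthetic data from `make_classification`, the `max_depth=5` setting, and the use of class-1 probabilities (instead of hard labels) as meta-features are all assumptions made for the example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic data stands in for a real dataset (assumption for illustration).
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# Two base models, as in the steps: a decision tree and k-NN.
base_models = [
    DecisionTreeClassifier(max_depth=5, random_state=0),
    KNeighborsClassifier(),
]

train_meta, test_meta = [], []
for model in base_models:
    # Steps 1-2: 10-fold out-of-fold predictions for every training row.
    oof = cross_val_predict(
        model, X_train, y_train, cv=10, method="predict_proba"
    )[:, 1]
    train_meta.append(oof)
    # Steps 3-4: refit on the whole train set, then predict the test set.
    model.fit(X_train, y_train)
    test_meta.append(model.predict_proba(X_test)[:, 1])

# Step 6: base-model predictions become the meta-model's features.
train_meta = np.column_stack(train_meta)
test_meta = np.column_stack(test_meta)
meta_model = LogisticRegression()
meta_model.fit(train_meta, y_train)

# Step 7: final predictions from the base models' test-set predictions.
final_pred = meta_model.predict(test_meta)
stacked_accuracy = (final_pred == y_test).mean()
print("Hand-rolled stacking accuracy:", stacked_accuracy)
```

Because the meta-model is trained only on out-of-fold predictions, it never sees a base-model prediction for a row that the base model was fitted on.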

In [1]:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LogisticRegression
# StackingClassifier comes from mlxtend; install it with: pip install mlxtend
from mlxtend.classifier import StackingClassifier

In [2]:
data = pd.read_csv('bank.csv')
data.head()

Unnamed: 0,age,job,marital,education,default,balance,housing,loan,contact,day,month,duration,campaign,pdays,previous,poutcome,deposit
0,59,0,1,1,0,2343,1,0,2,5,8,1042,1,-1,0,3,1
1,56,0,1,1,0,45,0,0,2,5,8,1467,1,-1,0,3,1
2,41,9,1,1,0,1270,1,0,2,5,8,1389,1,-1,0,3,1
3,55,7,1,1,0,2476,1,0,2,5,8,579,1,-1,0,3,1
4,54,0,1,2,0,184,0,0,2,5,8,673,2,-1,0,3,1


In [3]:
# Features: every column but the last (this keeps 'Unnamed: 0'); target: 'deposit'
X = data.iloc[:, :-1]
y = data.iloc[:, -1]

In [4]:
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

In [5]:
# Base (first-level) classifiers: four decision trees that differ only in
# their random_state. Their predictions become the meta-classifier's inputs.
classifier1 = DecisionTreeClassifier(random_state=0)
classifier2 = DecisionTreeClassifier(random_state=1)
classifier3 = DecisionTreeClassifier(random_state=2)
classifier4 = DecisionTreeClassifier(random_state=3)
classifier_list = [classifier1, classifier2, classifier3, classifier4]

In [6]:
# The base classifiers' predictions are fed into another algorithm, which
# makes the final prediction.
# In stacking this is called the meta-classifier.
# Here we use LogisticRegression as the meta-classifier.
m_classifier = LogisticRegression(random_state=0)

In [7]:
# Combine the base classifiers and the meta-classifier, then fit on the train set
sclf = StackingClassifier(classifiers=classifier_list,
                          meta_classifier=m_classifier)
sclf.fit(X_train, y_train)



StackingClassifier(average_probas=False,
                   classifiers=[DecisionTreeClassifier(class_weight=None,
                                                       criterion='gini',
                                                       max_depth=None,
                                                       max_features=None,
                                                       max_leaf_nodes=None,
                                                       min_impurity_decrease=0.0,
                                                       min_impurity_split=None,
                                                       min_samples_leaf=1,
                                                       min_samples_split=2,
                                                       min_weight_fraction_leaf=0.0,
                                                       presort=False,
                                                       random_state=0,
                                                    

In [8]:
# score() returns the mean accuracy on the held-out test set
s_score = sclf.score(X_test, y_test)
print("Stacking Score: ", s_score)

Stacking Score:  0.7751567632128994
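
As far as I understand the mlxtend documentation, its `StackingClassifier` trains the meta-classifier on predictions the base classifiers make for the same rows they were fitted on, which differs from the out-of-fold procedure in Steps 1 to 7 (mlxtend provides `StackingCVClassifier` for that). One way to get the cross-validated variant is scikit-learn's own `sklearn.ensemble.StackingClassifier` with a `cv` argument. The sketch below uses synthetic data and a tree plus k-NN pair as assumptions for illustration; it is not a rerun of the `bank.csv` experiment.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic data stands in for bank.csv (assumption for illustration).
X_demo, y_demo = make_classification(n_samples=500, n_features=10, random_state=0)
Xd_train, Xd_test, yd_train, yd_test = train_test_split(
    X_demo, y_demo, test_size=0.3, random_state=0
)

# cv=10 makes the meta-classifier learn from 10-fold out-of-fold predictions.
cv_stack = StackingClassifier(
    estimators=[
        ("tree", DecisionTreeClassifier(max_depth=5, random_state=0)),
        ("knn", KNeighborsClassifier()),
    ],
    final_estimator=LogisticRegression(),
    cv=10,
)
cv_stack.fit(Xd_train, yd_train)

# score() returns mean accuracy on the held-out rows.
cv_score = cv_stack.score(Xd_test, yd_test)
print("Cross-validated stacking score:", cv_score)
```

Mixing different kinds of base model, as here, usually gives the meta-classifier more varied inputs than several decision trees that differ only in `random_state`.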
