# Stacking
Whereas bagging and boosting typically build an ensemble from homogeneous weak learners, stacking usually combines heterogeneous base learners. The base learners are trained independently on the same dataset (so they can be trained in parallel), and a meta-learner is then trained to produce the final prediction from their outputs. The meta-learner takes the base learners' predictions as its input features and the ground-truth values as its target, and it attempts to learn how best to combine those predictions into a better output prediction.

![1_T-JHq4AK3dyRNi7gpn9-Xw.png](attachment:1_T-JHq4AK3dyRNi7gpn9-Xw.png)
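The procedure described above can be sketched by hand before turning to scikit-learn's built-in class. This is a minimal illustration, not the notebook's method: the synthetic data from `make_classification`, the two base learners, the 5-fold setting, and all variable names are assumptions chosen for the example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict, train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in data (an assumption; not the heart dataset used below).
X_demo, y_demo = make_classification(n_samples=300, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X_demo, y_demo, test_size=0.2, random_state=0)

base_learners = [
    RandomForestClassifier(n_estimators=10, random_state=0),
    KNeighborsClassifier(n_neighbors=10),
]

# Step 1: out-of-fold class-1 probabilities become the meta-learner's features.
meta_train = np.column_stack([
    cross_val_predict(m, X_tr, y_tr, cv=5, method="predict_proba")[:, 1]
    for m in base_learners
])

# Step 2: train the meta-learner on those predictions against the true labels.
meta_learner = LogisticRegression().fit(meta_train, y_tr)

# Step 3: refit each base learner on all training rows, then stack their
# predictions on the test rows and let the meta-learner combine them.
meta_test = np.column_stack([
    m.fit(X_tr, y_tr).predict_proba(X_te)[:, 1] for m in base_learners
])
stacked_pred = meta_learner.predict(meta_test)
```

Using out-of-fold predictions in step 1 keeps the meta-learner from being trained on predictions the base learners made for rows they had already seen.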

In [2]:
import numpy as np
import pandas as pd

In [3]:
df = pd.read_csv(r'G:\course\Youtobe ML\ML Algorithm mathematical and geometric intuition - Copy\Ensemble Learning\Stacking\heart.csv')
df.head()

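| | age | sex | cp | trestbps | chol | fbs | restecg | thalach | exang | oldpeak | slope | ca | thal | target |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 63 | 1 | 3 | 145 | 233 | 1 | 0 | 150 | 0 | 2.3 | 0 | 0 | 1 | 1 |
| 1 | 37 | 1 | 2 | 130 | 250 | 0 | 1 | 187 | 0 | 3.5 | 0 | 0 | 2 | 1 |
| 2 | 41 | 0 | 1 | 130 | 204 | 0 | 0 | 172 | 0 | 1.4 | 2 | 0 | 2 | 1 |
| 3 | 56 | 1 | 1 | 120 | 236 | 0 | 1 | 178 | 0 | 0.8 | 2 | 0 | 2 | 1 |
| 4 | 57 | 0 | 0 | 120 | 354 | 0 | 1 | 163 | 1 | 0.6 | 2 | 0 | 2 | 1 |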


In [4]:
X = df.drop(columns=['target'])
y = df['target']

In [5]:
from sklearn.model_selection import train_test_split
# Hold out 20% of the rows for testing; random_state makes the split reproducible
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=8)

In [6]:
print(X_train.shape)

(242, 13)


In [7]:
from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import GradientBoostingClassifier

In [8]:
# Base learners: (name, estimator) pairs of different model families
estimators = [
    ('rf', RandomForestClassifier(n_estimators=10, random_state=42)),
    ('knn', KNeighborsClassifier(n_neighbors=10)),
    ('gbdt', GradientBoostingClassifier())
]

In [9]:
from sklearn.ensemble import StackingClassifier

clf = StackingClassifier(
    estimators=estimators,
    final_estimator=LogisticRegression(),  # meta-learner
    cv=10  # 10-fold cross-validated base predictions are used to train the meta-learner
)

In [10]:
clf.fit(X_train, y_train)
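To see what the meta-learner actually receives after fitting, one option is to inspect `transform` and the fitted `final_estimator_`. The sketch below runs on synthetic data so that it is self-contained; the data, the `demo_clf` name, `cv=5`, and the added `random_state` values are assumptions for illustration rather than part of this notebook.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    GradientBoostingClassifier,
    RandomForestClassifier,
    StackingClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier

# Synthetic binary-classification data (an assumption, not heart.csv).
X_toy, y_toy = make_classification(n_samples=200, n_features=8, random_state=0)

demo_clf = StackingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=10, random_state=0)),
        ("knn", KNeighborsClassifier(n_neighbors=10)),
        ("gbdt", GradientBoostingClassifier(random_state=0)),
    ],
    final_estimator=LogisticRegression(),
    cv=5,
).fit(X_toy, y_toy)

# One column per base learner: for a binary target, each column holds that
# learner's predicted probability of the positive class.
meta_features = demo_clf.transform(X_toy)
print(meta_features.shape)

# Logistic-regression weights the meta-learner assigns to each base learner.
print(demo_clf.final_estimator_.coef_)
```

The same two calls should work on the fitted `clf` above, for example `clf.transform(X_test)`.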

In [12]:
y_pred = clf.predict(X_test)

In [13]:
from sklearn.metrics import accuracy_score
accuracy_score(y_test,y_pred)

0.8688524590163934
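A single score on a small test split says little about whether stacking helps more than any one base learner. A rough way to check is to cross-validate each base learner and the stacked model on the same data. The sketch below uses synthetic data and hypothetical names (`X_cmp`, `y_cmp`, `scores`), so its numbers would not match the heart-dataset result above.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    GradientBoostingClassifier,
    RandomForestClassifier,
    StackingClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in data (an assumption; swap in X and y from the notebook).
X_cmp, y_cmp = make_classification(n_samples=300, n_features=10, random_state=1)

base = [
    ("rf", RandomForestClassifier(n_estimators=10, random_state=42)),
    ("knn", KNeighborsClassifier(n_neighbors=10)),
    ("gbdt", GradientBoostingClassifier(random_state=42)),
]
stack = StackingClassifier(estimators=base, final_estimator=LogisticRegression(), cv=5)

# Mean 5-fold accuracy for each base learner and for the stacked ensemble.
scores = {
    name: cross_val_score(model, X_cmp, y_cmp, cv=5, scoring="accuracy").mean()
    for name, model in base + [("stack", stack)]
}
for name, score in scores.items():
    print(f"{name}: {score:.3f}")
```

If the stacked model does not beat the best base learner here, that is not unusual: the gain from stacking depends on how different the base learners' errors are.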