# Stacking Lab


### Intro and objectives


### In this lab you will learn:
1. A basic example of a stacking-based ensemble classifier

### What I hope you'll get out of this lab
* Worked examples
* How to interpret the results obtained

In [1]:
import sys

assert sys.version_info >= (3, 7)

In [2]:
!pip3 install -U scikit-learn

In [3]:
from packaging import version
import sklearn

assert version.parse(sklearn.__version__) >= version.parse("1.1")

In [4]:
sklearn.__version__

'1.2.1'

### In this lab we will learn how to apply stacking

In [5]:
from sklearn.ensemble import RandomForestClassifier
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn import datasets
import numpy as np

## Let's load the iris dataset

In [6]:
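# NOTE: this synthetic data is not used in the lab; X and y are
# overwritten by the iris data in the next cell.
np.random.seed(42)
X = np.random.rand(100, 1) - 0.5
y = 3 * X[:, 0] ** 2 + 0.05 * np.random.randn(100)  # y = 3x² + Gaussian noise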

In [7]:
iris = datasets.load_iris()
X = iris.data
y = iris.target

In [8]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42, stratify=y)
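Because the split uses `stratify=y`, each class should appear in the same proportion in the training and test sets. A minimal sketch to check this, assuming the same split parameters as the cell above (the names `train_counts` and `test_counts` are introduced here for illustration):

```python
import numpy as np
from sklearn import datasets
from sklearn.model_selection import train_test_split

# Reload iris so this cell runs on its own
iris = datasets.load_iris()
X, y = iris.data, iris.target

# Same split as in the lab: 30% test, stratified by class label
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)

# Count how many samples of each class ended up in each subset
train_counts = np.bincount(y_train)
test_counts = np.bincount(y_test)

print("train class counts:", train_counts)
print("test class counts: ", test_counts)
```

Iris has 50 samples per class, so a stratified 70/30 split gives 35 training and 15 test samples for each class.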

## Let's create three classifiers: two base estimators and a final estimator

In [9]:
# Create a random forest classifier
#
rnf_clf = RandomForestClassifier(n_estimators=100, random_state=123)

In [10]:
# Create a gradient boosting classifier
grad_clf = GradientBoostingClassifier(n_estimators=100, learning_rate=1.0, max_depth=1, random_state=0)

In [11]:
# Create a Logistic regression classifier
#
lg_clf = LogisticRegression(random_state=123)

## Let's stack the previous classifiers

In [12]:
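# Create a stacking classifier
#
# The base estimators are the random forest and scikit-learn's gradient
# boosting classifier (labelled 'gb'; this is not XGBoost). The logistic
# regression is the final estimator, trained on the base estimators'
# cross-validated predictions.
estimators = [
    ('rf', rnf_clf),
    ('gb', grad_clf)
]
sclf = StackingClassifier(estimators=estimators,
                          final_estimator=lg_clf,
                          cv=10)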
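To see what the stacking classifier does internally, the steps can be sketched by hand. This is only an approximation of the idea, not scikit-learn's actual implementation: it assumes the base estimators contribute `predict_proba` outputs and that the final estimator is trained on out-of-fold predictions. The names `base_models`, `meta_train`, `meta_test`, and `meta_model` are made up for this sketch.

```python
import numpy as np
from sklearn import datasets
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict, train_test_split

iris = datasets.load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.3, random_state=42, stratify=iris.target
)

# Base estimators with the same settings as in the lab
base_models = [
    RandomForestClassifier(n_estimators=100, random_state=123),
    GradientBoostingClassifier(
        n_estimators=100, learning_rate=1.0, max_depth=1, random_state=0
    ),
]

# Step 1: out-of-fold class probabilities for every training sample.
# Each sample is predicted by a model that did not see it during fitting.
oof_blocks = [
    cross_val_predict(model, X_train, y_train, cv=10, method="predict_proba")
    for model in base_models
]
meta_train = np.hstack(oof_blocks)  # 3 probabilities per base model

# Step 2: fit the final estimator on the stacked probabilities
meta_model = LogisticRegression(random_state=123).fit(meta_train, y_train)

# Step 3: refit each base model on the full training set, then build the
# same kind of features for the test set
test_blocks = [
    model.fit(X_train, y_train).predict_proba(X_test) for model in base_models
]
meta_test = np.hstack(test_blocks)

manual_test_accuracy = meta_model.score(meta_test, y_test)
print("meta-feature shape (train):", meta_train.shape)
print(f"manual stacking test accuracy: {manual_test_accuracy:0.2f}")
```

Training the final estimator on out-of-fold predictions matters: if it were trained on predictions the base models made for samples they had already seen, it would learn to trust them more than it should.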

## Let's fit and evaluate the stacking classifier


In [13]:
sclf.fit(X_train, y_train)
print(f"\nStacking classifier training Accuracy: {sclf.score(X_train, y_train):0.2f}")
print(f"Stacking classifier test Accuracy: {sclf.score(X_test, y_test):0.2f}")


Stacking classifier training Accuracy: 1.00
Stacking classifier test Accuracy: 0.91
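One way to interpret these numbers is to compare the stack against its base estimators fitted on their own. The sketch below assumes the same data split and hyperparameters as above; the `test_scores` dictionary is introduced here for illustration, and the scores you get may differ from the output shown above depending on your scikit-learn version.

```python
from sklearn import datasets
from sklearn.ensemble import (
    GradientBoostingClassifier,
    RandomForestClassifier,
    StackingClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

iris = datasets.load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.3, random_state=42, stratify=iris.target
)

rf = RandomForestClassifier(n_estimators=100, random_state=123)
gb = GradientBoostingClassifier(
    n_estimators=100, learning_rate=1.0, max_depth=1, random_state=0
)
# StackingClassifier fits clones of rf and gb, so fitting them separately
# in the loop below does not interfere with the stack
stack = StackingClassifier(
    estimators=[("rf", rf), ("gb", gb)],
    final_estimator=LogisticRegression(random_state=123),
    cv=10,
)

# Fit each model on the training set and record its test accuracy
test_scores = {}
for name, model in [("random forest", rf), ("gradient boosting", gb), ("stacking", stack)]:
    model.fit(X_train, y_train)
    test_scores[name] = model.score(X_test, y_test)
    print(f"{name:>17} test accuracy: {test_scores[name]:0.2f}")
```

Keep in mind that the test set has only 45 samples, so each misclassified sample changes the accuracy by about 0.02. A test accuracy of 0.91 corresponds to 41 of 45 correct, and small differences between models on a set this size should be read with caution.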


## Let's make some predictions

In [14]:
iris.data[90:110]

array([[5.5, 2.6, 4.4, 1.2],
       [6.1, 3. , 4.6, 1.4],
       [5.8, 2.6, 4. , 1.2],
       [5. , 2.3, 3.3, 1. ],
       [5.6, 2.7, 4.2, 1.3],
       [5.7, 3. , 4.2, 1.2],
       [5.7, 2.9, 4.2, 1.3],
       [6.2, 2.9, 4.3, 1.3],
       [5.1, 2.5, 3. , 1.1],
       [5.7, 2.8, 4.1, 1.3],
       [6.3, 3.3, 6. , 2.5],
       [5.8, 2.7, 5.1, 1.9],
       [7.1, 3. , 5.9, 2.1],
       [6.3, 2.9, 5.6, 1.8],
       [6.5, 3. , 5.8, 2.2],
       [7.6, 3. , 6.6, 2.1],
       [4.9, 2.5, 4.5, 1.7],
       [7.3, 2.9, 6.3, 1.8],
       [6.7, 2.5, 5.8, 1.8],
       [7.2, 3.6, 6.1, 2.5]])

In [15]:
iris.target[90:110]

array([1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2])

In [16]:
sclf.predict(iris.data[90:110])

array([1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 1, 2, 2, 2])
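Comparing the two arrays by eye is error-prone, so here is a small sketch that locates the disagreements. It copies the true labels and predictions from the outputs above rather than refitting the model; the names `y_true`, `y_pred`, and `mismatched_rows` are introduced for this example.

```python
import numpy as np

# True labels and predictions for rows 90..109, copied from the outputs above
y_true = np.array([1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2])
y_pred = np.array([1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 1, 2, 2, 2])

# Positions where the prediction differs, shifted back to dataset row numbers
start_row = 90
mismatched_rows = np.flatnonzero(y_true != y_pred) + start_row

# Fraction of the 20 rows predicted correctly
slice_accuracy = np.mean(y_true == y_pred)

print("misclassified rows:", mismatched_rows)
print(f"accuracy on this slice: {slice_accuracy:0.2f}")
```

The only disagreement is row 106, `[4.9, 2.5, 4.5, 1.7]`, a virginica sample (class 2) predicted as versicolor (class 1). Note that this slice mixes training and test samples, so its accuracy is not an estimate of generalization performance.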