<font color="red" size=6>Ensemble methods</font>
<p><font color="yellow" size=5>6_Stacking</font>

Stacking (stacked generalization) is an ensemble learning method that combines multiple machine learning models (called base models or level-0 models) into a stronger overall predictor. The predictions of the base models are used as input features for a higher-level model (called the meta-model or level-1 model), which learns how to combine them effectively.

<font color="pink" size=4> Key Concepts:</font>
<ol>
     <li><font color="orange">Base Models (Level-0 Models):</font> These are individual models trained on the original training data. They can be of different types (e.g., decision trees, support vector machines, logistic regression, etc.).</li>
     <li><font color="orange">Meta-Model (Level-1 Model):</font> This model takes the predictions of the base models as features and learns to combine them. It can be any machine learning model (e.g., linear regression, logistic regression, etc.).</li></ol>
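
To make the two levels concrete, here is a minimal hand-rolled sketch of the idea, assuming two illustrative base models and 5-fold out-of-fold predictions. The variable names (`level0`, `train_meta`, `test_meta`, `level1`) are hypothetical and chosen only for this example; it is not how `StackingClassifier` is implemented internally, just the same principle written out.

```python
import numpy as np
from sklearn.datasets import load_wine
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# Level-0: base models of different types (illustrative choices).
level0 = [DecisionTreeClassifier(random_state=42), KNeighborsClassifier()]

# Out-of-fold class probabilities: each training row is predicted by a
# model copy that did not see that row during fitting.
train_meta = np.hstack([
    cross_val_predict(m, X_train, y_train, cv=5, method="predict_proba")
    for m in level0
])

# Refit every base model on the full training set, then predict the test set
# to build the meta-model's test features.
test_meta = np.hstack([
    m.fit(X_train, y_train).predict_proba(X_test) for m in level0
])

# Level-1: the meta-model learns how to combine the base-model predictions.
level1 = LogisticRegression(max_iter=1000)
level1.fit(train_meta, y_train)

manual_accuracy = level1.score(test_meta, y_test)
print(f"Manual stacking accuracy: {manual_accuracy:.4f}")
```

The meta-model never sees the original 13 wine features here, only the six probability columns (three classes times two base models).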

In [1]:
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

In [2]:
# 1. Load the Wine dataset
data = load_wine()
X = data.data
y = data.target

# 2. Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

In [3]:
# 3. Define base models
base_learners = [
    ('lr', LogisticRegression(max_iter=1000)),
    ('dt', DecisionTreeClassifier(random_state=42)),
    ('knn', KNeighborsClassifier())
]

In [4]:
# 4. Define meta-model
meta_model = LogisticRegression(max_iter=1000)


In [5]:
# 5. Create the stacking classifier
stacking_model = StackingClassifier(estimators=base_learners, final_estimator=meta_model)


In [6]:
# 6. Train the stacking model
stacking_model.fit(X_train, y_train)

STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
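
The warning above suggests scaling the data. One way to do that, sketched here as an assumption rather than as the notebook's original approach, is to wrap the scale-sensitive base models in a pipeline with `StandardScaler`. The names `scaled_base_learners` and `scaled_stacking_model` are hypothetical. With standardized features the logistic regression should usually converge well within the iteration limit.

```python
from sklearn.datasets import load_wine
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# Logistic regression and k-NN are sensitive to feature scale, so each gets
# its own scaler; the decision tree does not need one.
scaled_base_learners = [
    ("lr", make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))),
    ("dt", DecisionTreeClassifier(random_state=42)),
    ("knn", make_pipeline(StandardScaler(), KNeighborsClassifier())),
]

scaled_stacking_model = StackingClassifier(
    estimators=scaled_base_learners,
    final_estimator=LogisticRegression(max_iter=1000),
)
scaled_stacking_model.fit(X_train, y_train)

scaled_accuracy = scaled_stacking_model.score(X_test, y_test)
print(f"Accuracy with scaled base models: {scaled_accuracy:.4f}")
```

Putting the scaler inside each pipeline means it is refit on the training portion of every cross-validation fold, so no statistics from held-out rows leak into the base models.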

In [7]:
# 7. Make predictions on the test set
y_pred = stacking_model.predict(X_test)

In [8]:
# 8. Evaluate the model
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy:.4f}")

Accuracy: 0.9815


In [9]:
# Display the classification report and confusion matrix
print("\nClassification Report:")
print(classification_report(y_test, y_pred))

print("\nConfusion Matrix:")
print(confusion_matrix(y_test, y_pred))


Classification Report:
              precision    recall  f1-score   support

           0       1.00      0.95      0.97        19
           1       0.95      1.00      0.98        21
           2       1.00      1.00      1.00        14

    accuracy                           0.98        54
   macro avg       0.98      0.98      0.98        54
weighted avg       0.98      0.98      0.98        54


Confusion Matrix:
[[18  1  0]
 [ 0 21  0]
 [ 0  0 14]]


<font color="pink" size=4>Hyperparameters in Stacking:</font>

The StackingClassifier has several key hyperparameters that you can adjust:
<ol>
    <li><font color="orange">estimators:</font> List of base models (base learners) used to generate predictions. Each item in the list is a tuple with the name and the model.</li>
    <li><font color="orange">final_estimator:</font> The meta-model that combines the predictions of the base models.</li>
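    <li><font color="orange">cv:</font> Cross-validation splitting strategy used to generate the out-of-fold base-model predictions on which the meta-model is trained. You can specify the number of folds or a custom cross-validation splitter; the default is 5-fold cross-validation. The base models themselves are refit on the full training set afterward.</li>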
    <li><font color="orange">n_jobs:</font> Number of jobs to run in parallel. Setting it to -1 will use all available cores.</li>
    <li><font color="orange">verbose:</font> Controls the verbosity of the training process.</li></ol>
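
The parameters listed above can be set together as in the following sketch. The splitter settings and the name `tuned_stack` are illustrative assumptions, not recommended values.

```python
from sklearn.datasets import load_wine
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_wine(return_X_y=True)

# A custom splitter passed as `cv` instead of a plain number of folds.
cv_strategy = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)

tuned_stack = StackingClassifier(
    estimators=[
        ("dt", DecisionTreeClassifier(random_state=42)),
        ("knn", KNeighborsClassifier()),
    ],
    final_estimator=LogisticRegression(max_iter=1000),
    cv=cv_strategy,
    n_jobs=1,   # set to -1 to use all available cores
    verbose=0,  # raise for more detailed training output
)
tuned_stack.fit(X, y)

# transform() returns the base-model predictions that the meta-model receives.
meta_feature_shape = tuned_stack.transform(X).shape
print(meta_feature_shape)
```

On this three-class dataset each base model contributes one probability column per class, which is why the meta-model sees six features for two base models.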

<font color="pink" size=4>Advantages of Stacking:</font>
<ol>
    <li><font color="orange">Improved Performance:</font> By combining multiple models, stacking often matches or exceeds the best individual base model, although an improvement is not guaranteed.</li>
    <li><font color="orange">Flexibility:</font> Stacking can work with any machine learning algorithm as long as you have a way to generate predictions.</li>
    <li><font color="orange">Better Generalization:</font> Because the meta-model learns how much to trust each base model, errors made by one model can be offset by the others, which can improve generalization when the base models are diverse.</li></ol>
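
Whether stacking actually helps on a given dataset is an empirical question. A rough way to check, sketched here with hypothetical names (`candidates`, `cv_scores`) and with scaled pipelines as an assumption, is to cross-validate each base model and the stacked ensemble side by side.

```python
from sklearn.datasets import load_wine
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

X, y = load_wine(return_X_y=True)

base_learners = [
    ("lr", make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))),
    ("dt", DecisionTreeClassifier(random_state=42)),
    ("knn", make_pipeline(StandardScaler(), KNeighborsClassifier())),
]

# Evaluate every base model on its own, plus the stacked ensemble.
candidates = dict(base_learners)
candidates["stacking"] = StackingClassifier(
    estimators=base_learners,
    final_estimator=LogisticRegression(max_iter=1000),
)

# Mean 5-fold cross-validated accuracy for each candidate.
cv_scores = {
    name: cross_val_score(model, X, y, cv=5).mean()
    for name, model in candidates.items()
}
for name, score in cv_scores.items():
    print(f"{name:>8}: {score:.4f}")
```

If the stacked score is no better than the best base model, the simpler model is usually the more practical choice.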

<font color="pink" size=4>Disadvantages of Stacking:</font>
<ol>
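    <li><font color="orange">Increased Complexity:</font> Stacking introduces additional complexity because it involves training, tuning, and maintaining multiple base models plus a meta-model, and the resulting ensemble is harder to interpret than a single model.</li>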
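    <li><font color="orange">Computationally Expensive:</font> Every base model and the meta-model must be trained, and the base models are also refit once per cross-validation fold (with the default 5 folds, each base model is fitted six times), so stacking costs more than training a single model.</li>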
    <li><font color="orange">Overfitting:</font> If not implemented correctly (e.g., by not using cross-validation when generating base model predictions), stacking can lead to overfitting.</li></ol>
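
The overfitting risk can be illustrated with a single base model. In this sketch (variable names are hypothetical), an unpruned decision tree predicts the same rows it was trained on and, separately, rows held out by cross-validation. In-sample predictions tend to look far more reliable than they are, so a meta-model trained on them would learn to over-trust the tree.

```python
from sklearn.datasets import load_wine
from sklearn.metrics import accuracy_score
from sklearn.model_selection import cross_val_predict
from sklearn.tree import DecisionTreeClassifier

X, y = load_wine(return_X_y=True)
tree = DecisionTreeClassifier(random_state=42)

# Leaky meta-features: the tree predicts rows it has already memorized.
in_sample_pred = tree.fit(X, y).predict(X)

# Honest meta-features: each row is predicted by a tree that never saw it.
out_of_fold_pred = cross_val_predict(tree, X, y, cv=5)

in_sample_acc = accuracy_score(y, in_sample_pred)
out_of_fold_acc = accuracy_score(y, out_of_fold_pred)
print(f"In-sample accuracy:   {in_sample_acc:.4f}")
print(f"Out-of-fold accuracy: {out_of_fold_acc:.4f}")
```

`StackingClassifier` uses the out-of-fold approach through its `cv` parameter, which is why the leak mainly arises in hand-written stacking code.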