## Implement a stacking ensemble model using the sklearn library in Python. Your task is to use base models consisting of a decision tree and a logistic regression model, then stack them using a logistic regression model as the final estimator. Train your ensemble model on a dataset of your choice (e.g., Iris or any suitable dataset) and evaluate its accuracy on a test set. Be sure to split the data properly into training and testing sets and report the accuracy of the model.

In [3]:
# Stacking is an ensemble learning technique that combines the predictions of multiple models (base learners) using another model (a meta-learner, or final estimator) to improve overall performance.
#First, import the initial libraries and the Wine dataset from sklearn.
import numpy as np
import pandas as pd
from sklearn.datasets import load_wine #Wine dataset
from sklearn.model_selection import train_test_split #Used to split the data into training and test sets.



In [4]:
#Load X and y, i.e. the feature matrix and the target labels of the Wine dataset.
wine = load_wine()
X = wine.data
y = wine.target

#X → feature matrix (chemical properties of wine).

#y → target values of wine class (wine class: 0, 1, or 2).

In [12]:
#Now we do the test and train split of the dataset.
#Splits data into 80% training and 20% testing.
#random_state=42 ensures reproducibility.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
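
As a quick sanity check on the split, the sketch below repeats the same call and prints the resulting sizes and per-class counts. The names `train_counts` and `test_counts` are introduced here for illustration only. Because the call does not pass `stratify`, class proportions in the two sets are not guaranteed to match; passing `stratify=y` would be an optional variation, not what this notebook does.

```python
import numpy as np
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split

# Same data and split parameters as in the notebook.
X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Per-class sample counts (illustrative names, not part of the original notebook).
train_counts = np.bincount(y_train, minlength=3)
test_counts = np.bincount(y_test, minlength=3)

print("Training samples:", X_train.shape[0])
print("Test samples:", X_test.shape[0])
print("Training class counts:", train_counts)
print("Test class counts:", test_counts)
```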


In [13]:
#Import the classifiers used as base learners and the StackingClassifier used to build the stacking ensemble.
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import StackingClassifier

In [20]:
#Base models: Decision Tree + Logistic Regression.
base_learners = [
    ('decision_tree', DecisionTreeClassifier(random_state=42)),
    ('logistic_reg', LogisticRegression(max_iter=10000, random_state=42))
]


In [21]:
#The meta-model (final estimator) is also a Logistic Regression.
final_estimator = LogisticRegression(max_iter=10000, random_state=42)


In [24]:
#Create the stacking model with a cross-validation parameter.
#cv=5 means 5-fold cross-validation, which is used to generate the base learners' predictions that train the final estimator.
stacking_clf = StackingClassifier(
    estimators=base_learners,
    final_estimator=final_estimator,
    cv=5
)
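
To make the role of `cv=5` more concrete, here is a rough sketch of what the stacking step does conceptually. Each base learner produces out-of-fold class probabilities on the training data, these are concatenated into meta-features, and the final estimator is trained on them. This is an approximation built from `cross_val_predict`, not the internals of `StackingClassifier`, so its result may differ slightly. The names `base_models`, `meta_train`, `meta_test`, `meta_model`, and `manual_accuracy` are made up for illustration.

```python
import numpy as np
from sklearn.base import clone
from sklearn.datasets import load_wine
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict, train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Same two base learner types as in the notebook (illustrative list name).
base_models = [
    DecisionTreeClassifier(random_state=42),
    LogisticRegression(max_iter=10000, random_state=42),
]

# Out-of-fold probabilities: every training row is predicted by a model
# that did not see that row while fitting.
meta_train = np.hstack([
    cross_val_predict(clone(m), X_train, y_train, cv=5, method="predict_proba")
    for m in base_models
])

# Refit each base model on the full training set to build test meta-features.
fitted = [clone(m).fit(X_train, y_train) for m in base_models]
meta_test = np.hstack([m.predict_proba(X_test) for m in fitted])

# The final estimator learns from the base learners' probabilities only.
meta_model = LogisticRegression(max_iter=10000, random_state=42)
meta_model.fit(meta_train, y_train)
manual_accuracy = meta_model.score(meta_test, y_test)

print("Meta-feature shape:", meta_train.shape)
print("Accuracy of the hand-rolled stack:", manual_accuracy)
```

With two base learners and three wine classes, the meta-feature matrix has six columns: one probability per class per base learner.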


In [25]:
#Fit the ensemble on the training data.
stacking_clf.fit(X_train, y_train)


In [26]:
#Predict the classes of the test set with the trained model.
y_pred = stacking_clf.predict(X_test)


In [30]:
#To evaluate the model's performance, import accuracy_score (along with classification_report and confusion_matrix) from sklearn.metrics.
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

In [37]:
#Now we calculate the accuracy of our trained model.
accuracy = accuracy_score(y_test, y_pred)*100
print("Test Set Accuracy of Stacking Ensemble (Wine Dataset): ", accuracy,"%")


Test Set Accuracy of Stacking Ensemble (Wine Dataset):  100.0 %
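
Accuracy alone does not show which classes get confused, and with a test set this small a single number can be noisy. `classification_report` and `confusion_matrix` are imported above but never used. A minimal sketch of how they could be applied follows. It rebuilds the same model so that it runs on its own, and `cm` is an illustrative name.

```python
from sklearn.datasets import load_wine
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

wine = load_wine()
X_train, X_test, y_train, y_test = train_test_split(
    wine.data, wine.target, test_size=0.2, random_state=42
)

# Same ensemble configuration as in the notebook.
stacking_clf = StackingClassifier(
    estimators=[
        ('decision_tree', DecisionTreeClassifier(random_state=42)),
        ('logistic_reg', LogisticRegression(max_iter=10000, random_state=42)),
    ],
    final_estimator=LogisticRegression(max_iter=10000, random_state=42),
    cv=5,
)
stacking_clf.fit(X_train, y_train)
y_pred = stacking_clf.predict(X_test)

# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_test, y_pred, labels=[0, 1, 2])
print(cm)

# Per-class precision, recall, and F1 score.
print(classification_report(
    y_test, y_pred, labels=[0, 1, 2], target_names=wine.target_names
))
```

Off-diagonal entries in the confusion matrix, if any, identify the specific pairs of wine classes that the ensemble mixes up.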
