# Ensembles Assignment
- Import all necessary libraries.

In [6]:
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier, VotingClassifier, StackingClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score
from sklearn.datasets import fetch_openml


# 1. Load the MNIST dataset (given below) and split it into training, validation, and test sets.

In [3]:
mnist = fetch_openml('mnist_784', version=1)
# mnist_df = pd.DataFrame(mnist.data, columns=mnist.feature_names)
mnist.keys()



dict_keys(['data', 'target', 'frame', 'categories', 'feature_names', 'target_names', 'DESCR', 'details', 'url'])

# The first train_test_split splits the dataset into a training set (X_train, y_train) and a temporary set (X_temp, y_temp), holding out 20% of the data. The second train_test_split then splits the temporary set in half into a validation set (X_val, y_val) and a test set (X_test, y_test). The result is an 80% / 10% / 10% split into training, validation, and test sets.

In [10]:
X, y = mnist["data"], mnist["target"]

# Split the dataset into training, validation, and test sets
X_train, X_temp, y_train, y_temp = train_test_split(X, y, test_size=0.2, random_state=42)
X_val, X_test, y_val, y_test = train_test_split(X_temp, y_temp, test_size=0.5, random_state=42)
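
As a quick check of the proportions produced by the two successive splits, the arithmetic can be sketched without loading any data. The helper name `split_sizes` and the figure of 70,000 samples (the size of `mnist_784`) are assumptions for illustration; scikit-learn's own rounding may differ by a sample when the sizes do not divide evenly.

```python
from fractions import Fraction

def split_sizes(n_samples, first_test_size=Fraction(1, 5), second_test_size=Fraction(1, 2)):
    """Return (n_train, n_val, n_test) for two successive hold-out splits."""
    n_temp = int(n_samples * first_test_size)   # held out by the first split
    n_train = n_samples - n_temp
    n_test = int(n_temp * second_test_size)     # second split halves the held-out part
    n_val = n_temp - n_test
    return n_train, n_val, n_test

n_train, n_val, n_test = split_sizes(70_000)
print(n_train, n_val, n_test)  # 56000 7000 7000
```

With `test_size=0.2` followed by `test_size=0.5`, this corresponds to an 80% / 10% / 10% split.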

# I trained the individual classifiers on the training set and evaluated them on the validation set to check their individual performance before using the voting classifier.

# The validation set (X_val, y_val) is used here rather than the test set because this is still the development phase: comparing models and tuning parameters against the validation set keeps the test set untouched for the final evaluation and reduces the risk of overfitting to it. Diverse models capture different aspects of the data and tend to make different errors, which is what an ensemble can exploit.

In [13]:
rf_clf = RandomForestClassifier(n_estimators=100, random_state=42)
nn_clf = MLPClassifier(hidden_layer_sizes=(100,), max_iter=20, random_state=42)
svm_clf = SVC(probability=True, random_state=42)

classifiers = [rf_clf, nn_clf, svm_clf]

for clf in classifiers:
    clf.fit(X_train, y_train)
    y_val_pred = clf.predict(X_val)
    print(f"{clf.__class__.__name__} accuracy: {accuracy_score(y_val, y_val_pred)}")


RandomForestClassifier accuracy: 0.9648571428571429




MLPClassifier accuracy: 0.951
SVC accuracy: 0.975


# The voting ensemble combines the predictions of all three models, which can improve overall accuracy and robustness. I evaluate it on data it was not trained on to check its ability to generalize. I chose the soft voting rule based on its performance: instead of taking the majority of the predicted labels, it averages the class probabilities predicted by each classifier and picks the class with the highest average probability. Soft voting often outperforms hard voting because it gives more weight to confident predictions, provided the classifiers' probability estimates are reasonably well calibrated.

In [14]:
voting_clf = VotingClassifier(
    estimators=[('rf', rf_clf), ('nn', nn_clf), ('svm', svm_clf)],
    voting='soft'
)

voting_clf.fit(X_train, y_train)
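
The difference between the two voting rules can be seen on a toy example. The following is a minimal sketch with made-up probabilities and hypothetical helper names (`hard_vote`, `soft_vote`); it is not how `VotingClassifier` is implemented internally.

```python
from collections import Counter

def hard_vote(probas):
    """Majority vote over each classifier's most probable class."""
    votes = [max(range(len(p)), key=p.__getitem__) for p in probas]
    return Counter(votes).most_common(1)[0][0]

def soft_vote(probas):
    """Class with the highest probability averaged over classifiers."""
    n_classes = len(probas[0])
    avg = [sum(p[c] for p in probas) / len(probas) for c in range(n_classes)]
    return max(range(n_classes), key=avg.__getitem__)

# Hypothetical probabilities from three classifiers for a two-class problem.
probas = [
    [0.90, 0.10],  # very confident in class 0
    [0.40, 0.60],  # mildly prefers class 1
    [0.45, 0.55],  # mildly prefers class 1
]
print(hard_vote(probas))  # 1
print(soft_vote(probas))  # 0
```

Here two mildly confident classifiers outvote a very confident one under hard voting, while soft voting lets the confident prediction dominate the average.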



# Evaluate the voting classifier on the validation set.
# The voting classifier combines the strengths of all three models. With the soft voting rule, it predicts the class with the highest average probability across the classifiers. Its validation accuracy (97.27%) is higher than that of the Random Forest (96.49%) and the MLP (95.1%), but slightly below that of the SVC on its own (97.5%).

In [17]:
# Evaluate the voting classifier on the validation set
y_val_pred_voting = voting_clf.predict(X_val)
print(f"Voting Classifier accuracy: {accuracy_score(y_val, y_val_pred_voting)}")

Voting Classifier accuracy: 0.9727142857142858
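
To compare the validation accuracies printed above on a common footing, they can be converted to counts of correct predictions. This is a small sketch that assumes a validation set of 7,000 samples (10% of the 70,000 samples in `mnist_784`); the constant `N_VAL` is introduced here for illustration only.

```python
# Validation accuracies as printed above.
accuracies = {
    "RandomForestClassifier": 0.9648571428571429,
    "MLPClassifier": 0.951,
    "SVC": 0.975,
    "VotingClassifier": 0.9727142857142858,
}
N_VAL = 7_000  # assumed validation-set size (10% of 70,000 samples)

# Convert each accuracy to a number of correctly classified samples.
correct = {name: round(acc * N_VAL) for name, acc in accuracies.items()}
for name, n in sorted(correct.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {n} / {N_VAL} correct")
```

Under that assumption, the SVC classifies 6,825 validation samples correctly and the voting classifier 6,809, a gap of 16 samples.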


# After training the individual classifiers, I combined them by stacking. Each classifier's predictions on the validation set form a new training set, with one feature per classifier. A Random Forest blender is trained on these features with the validation labels, so it learns how to weigh and combine the individual classifiers. The stacking ensemble is then evaluated on the test set: each classifier predicts on the test set, and those predictions are the blender's input. The same idea is then repeated with scikit-learn's StackingClassifier for comparison.

In [18]:
# Generate predictions on the validation set for blender training
X_val_blend = np.column_stack([clf.predict(X_val) for clf in classifiers])

# Train a blender on the new training set
blender = RandomForestClassifier(n_estimators=100, random_state=42)
blender.fit(X_val_blend, y_val)

# Evaluate the stacking ensemble on the test set
X_test_blend = np.column_stack([clf.predict(X_test) for clf in classifiers])
y_test_pred_stack = blender.predict(X_test_blend)
print(f"Stacking Ensemble accuracy: {accuracy_score(y_test, y_test_pred_stack)}")

# Use StackingClassifier instead
stacking_clf = StackingClassifier(estimators=[('rf', rf_clf), ('nn', nn_clf), ('svm', svm_clf)],
                                  final_estimator=RandomForestClassifier(n_estimators=100, random_state=42))
stacking_clf.fit(X_train, y_train)

# Evaluate the stacking classifier on the test set
y_test_pred_stacking = stacking_clf.predict(X_test)
print(f"Stacking Classifier accuracy: {accuracy_score(y_test, y_test_pred_stacking)}")


Stacking Ensemble accuracy: 0.9725714285714285




Stacking Classifier accuracy: 0.9801428571428571


# The Stacking Ensemble accuracy is 97.26%, while the Stacking Classifier accuracy is higher at 98.01%. Both use a Random Forest as the final estimator, so the difference most likely comes from how that estimator is trained. In the manual stacking ensemble, the blender sees only the three hard class predictions per sample and is trained on the validation set alone (10% of the data). StackingClassifier, by default, trains its final estimator on cross-validated (5-fold) predictions over the whole training set and uses each classifier's predict_proba output when it is available, so the final estimator gets both more training samples and richer features. This probably explains its better accuracy on the test set.
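
One concrete difference between the two approaches is the input the final estimator receives. The manual blender above is fed one predicted label per base classifier, whereas `StackingClassifier` with its default `stack_method='auto'` uses `predict_proba` outputs when the base estimators provide them. A minimal sketch of the two feature layouts, with made-up probabilities and hypothetical helper names:

```python
def hard_label_features(probas_per_clf):
    """One feature per classifier: the index of its most probable class."""
    return [max(range(len(p)), key=p.__getitem__) for p in probas_per_clf]

def proba_features(probas_per_clf):
    """All class probabilities from every classifier, concatenated."""
    return [prob for p in probas_per_clf for prob in p]

# Hypothetical outputs of three classifiers for one sample, 10 classes each.
rest = [0.02] * 8  # remaining probability mass spread over classes 2..9
sample = [
    [0.80, 0.04] + rest,  # confident in class 0
    [0.43, 0.41] + rest,  # narrowly prefers class 0
    [0.30, 0.54] + rest,  # prefers class 1
]

print(hard_label_features(sample))  # [0, 0, 1]
print(len(proba_features(sample)))  # 30
```

With hard labels, the final estimator cannot tell a confident prediction from a near tie; with probabilities it sees 30 features per sample instead of 3 and can use that information.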