In [1]:
from sklearn.datasets import fetch_openml
from sklearn.model_selection import train_test_split
# Load MNIST dataset
mnist = fetch_openml('mnist_784', version=1)
X, y = mnist["data"], mnist["target"]

# Split the data into training, validation, and test sets
X_train, X_temp, y_train, y_temp = train_test_split(X, y, test_size=0.2, random_state=42)
X_val, X_test, y_val, y_test = train_test_split(X_temp, y_temp, test_size=0.5, random_state=42)
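
As a quick sanity check on the split proportions, the sketch below repeats the same two-stage split on a stand-in array of 70,000 rows (the size of `mnist_784`), so nothing is downloaded. The names `X_demo`, `y_demo`, and `split_sizes` are hypothetical and used only for illustration.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Stand-in for MNIST: 70,000 row indices with made-up labels (illustration only).
n_samples = 70_000
X_demo = np.arange(n_samples).reshape(-1, 1)
y_demo = np.arange(n_samples) % 10

# Same two-stage split as in the cell above: hold out 20%, then halve the holdout.
X_tr, X_tmp, y_tr, y_tmp = train_test_split(X_demo, y_demo, test_size=0.2, random_state=42)
X_va, X_te, y_va, y_te = train_test_split(X_tmp, y_tmp, test_size=0.5, random_state=42)

split_sizes = (len(X_tr), len(X_va), len(X_te))
print(split_sizes)  # (56000, 7000, 7000)
```

The result is an 80/10/10 split: 56,000 training, 7,000 validation, and 7,000 test samples.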



The next cell initializes and trains three classifiers (Random Forest, MLP, and SVM) on the training set (`X_train`, `y_train`), evaluates each one on the validation set (`X_val`, `y_val`), and prints the accuracy scores.


In [2]:
from sklearn.ensemble import RandomForestClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

# Initialize classifiers
rf_classifier = RandomForestClassifier(n_estimators=100, random_state=42)
mlp_classifier = MLPClassifier(hidden_layer_sizes=(100,), max_iter=100, random_state=42)
svm_classifier = SVC(probability=True, random_state=42)

# Train classifiers
rf_classifier.fit(X_train, y_train)
mlp_classifier.fit(X_train, y_train)
svm_classifier.fit(X_train, y_train)

# Evaluate on validation set
rf_val_preds = rf_classifier.predict(X_val)
mlp_val_preds = mlp_classifier.predict(X_val)
svm_val_preds = svm_classifier.predict(X_val)

# Print accuracy for each classifier
print("Random Forest Accuracy:", accuracy_score(y_val, rf_val_preds))
print("MLP Accuracy:", accuracy_score(y_val, mlp_val_preds))
print("SVM Accuracy:", accuracy_score(y_val, svm_val_preds))


Random Forest Accuracy: 0.9648571428571429
MLP Accuracy: 0.9595714285714285
SVM Accuracy: 0.975


Random Forest (RF): Achieved a validation accuracy of 96.49%. Random Forest is an ensemble learning method that builds multiple decision trees and combines their predictions. It handles complex relationships in data and tends to perform robustly with little tuning.

MLP (Multi-Layer Perceptron): Achieved a validation accuracy of 95.96%. This MLP is a neural network with a single hidden layer of 100 neurons, trained with `max_iter=100` on unscaled pixel values. Neural networks can capture intricate patterns, but this configuration likely needs further tuning (for example input scaling or more training iterations) to perform better.

SVM (Support Vector Machine): Achieved the highest validation accuracy, 97.50%. `SVC` uses an RBF kernel by default, so it finds a maximum-margin boundary in the kernel-induced feature space rather than a linear hyperplane in pixel space. The high accuracy suggests that the SVM discriminated well between the classes in the validation set.
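
One adjustment often tried for an MLP on image data is scaling the pixel values before training. The sketch below is an assumption, not part of the original notebook: it uses scikit-learn's small bundled `load_digits` dataset as a fast stand-in for MNIST, and the names `scaled_mlp` and `scaled_mlp_acc` are hypothetical. Whether scaling helps on the full MNIST run would have to be measured.

```python
from sklearn.datasets import load_digits
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler

# Small bundled digits dataset as a fast stand-in for MNIST (illustration only).
X_small, y_small = load_digits(return_X_y=True)
X_tr, X_va, y_tr, y_va = train_test_split(X_small, y_small, test_size=0.2, random_state=42)

# Scale pixels to [0, 1] before the MLP; inside a pipeline the scaler
# is fit on the training data only, so nothing leaks from validation.
scaled_mlp = make_pipeline(
    MinMaxScaler(),
    MLPClassifier(hidden_layer_sizes=(100,), max_iter=500, random_state=42),
)
scaled_mlp.fit(X_tr, y_tr)

scaled_mlp_acc = accuracy_score(y_va, scaled_mlp.predict(X_va))
print("Scaled MLP accuracy on the digits validation split:", scaled_mlp_acc)
```

Wrapping the scaler and the classifier in one pipeline also lets the pair be passed to an ensemble as a single estimator.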

In [3]:
from sklearn.ensemble import VotingClassifier

# Create a voting ensemble (soft or hard voting)
voting_classifier = VotingClassifier(estimators=[
    ('rf', rf_classifier),
    ('mlp', mlp_classifier),
    ('svm', svm_classifier)],
    voting='soft')  # Use 'hard' for hard voting

# Train the voting ensemble
voting_classifier.fit(X_train, y_train)

# Evaluate on the test set
voting_test_preds = voting_classifier.predict(X_test)

# Print accuracy for the ensemble
print("Voting Ensemble Accuracy:", accuracy_score(y_test, voting_test_preds))


Voting Ensemble Accuracy: 0.977


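This cell creates a voting ensemble with scikit-learn's `VotingClassifier`. It combines the three classifier types used above (Random Forest, MLP, and SVM). Note that `VotingClassifier.fit` clones and refits each base estimator on the training set rather than reusing the already-fitted models. With `voting='soft'`, the ensemble averages the predicted class probabilities of its members. The ensemble is then evaluated on the test set.

The voting ensemble achieved an accuracy of 97.70% on the test set. This is higher than each individual classifier's accuracy on the validation set (the best was the SVM at 97.50%), which suggests that combining the classifiers helps. The comparison is not like-for-like, however: the individual classifiers were scored on the validation set and the ensemble on the test set, so scoring all models on the same set would be needed to confirm the gain.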
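
To make the difference between the two voting modes concrete, here is a minimal NumPy sketch for a single sample. The probabilities are made up for illustration, and the code only mirrors the idea behind `voting='hard'` and `voting='soft'` (with equal weights); it is not scikit-learn's implementation.

```python
import numpy as np

# Made-up class probabilities from three classifiers for ONE sample, three classes.
probas = np.array([
    [0.45, 0.55, 0.00],  # classifier A narrowly prefers class 1
    [0.40, 0.60, 0.00],  # classifier B narrowly prefers class 1
    [0.90, 0.10, 0.00],  # classifier C strongly prefers class 0
])

# Hard voting: each classifier casts one vote for its top class; the majority wins.
votes = probas.argmax(axis=1)
hard_winner = int(np.bincount(votes, minlength=probas.shape[1]).argmax())

# Soft voting: average the probabilities across classifiers, then take the top class.
soft_winner = int(probas.mean(axis=0).argmax())

print(hard_winner, soft_winner)  # 1 0
```

Hard voting picks class 1 because two of three classifiers prefer it, while soft voting picks class 0 because the third classifier's high confidence dominates the average.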


In [4]:
import numpy as np
# Get predictions on validation set
rf_val_preds = rf_classifier.predict(X_val)
mlp_val_preds = mlp_classifier.predict(X_val)
svm_val_preds = svm_classifier.predict(X_val)

# Create a new training set for the blender
blender_X_train = np.column_stack((rf_val_preds, mlp_val_preds, svm_val_preds))
blender_y_train = y_val


In [6]:
# from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression

# Create a blender classifier (you can use any classifier, e.g., Logistic Regression)
blender_classifier = LogisticRegression(max_iter=1000, random_state=42)

# Train the blender
blender_classifier.fit(blender_X_train, blender_y_train)

# Get predictions from individual classifiers on the test set
rf_test_preds = rf_classifier.predict(X_test)
mlp_test_preds = mlp_classifier.predict(X_test)
svm_test_preds = svm_classifier.predict(X_test)

# Create input for blender from test set predictions
blender_X_test = np.column_stack((rf_test_preds, mlp_test_preds, svm_test_preds))

# Get predictions from the blender on the test set
blender_test_preds = blender_classifier.predict(blender_X_test)

# Print accuracy for the stacking ensemble
print("Stacking Ensemble Accuracy:", accuracy_score(y_test, blender_test_preds))


Stacking Ensemble Accuracy: 0.9627142857142857


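The stacking ensemble achieved an accuracy of 96.27% on the test set. That is below the voting ensemble (97.70% on the same test set) and within the range of the individual classifiers' validation accuracies (95.96% to 97.50%). One likely reason is that the blender sees only the predicted labels of each classifier, which a logistic regression treats as numeric magnitudes even though the digit classes are categorical.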
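
A common alternative is to feed the blender each base model's class probabilities instead of its predicted labels. The sketch below illustrates that idea under stated assumptions: it uses the small bundled `load_digits` dataset as a stand-in for MNIST, only two base models to keep it fast, and hypothetical names (`proba_features`, `proba_blender`, `proba_blend_acc`). It is not the notebook's method, and its accuracy says nothing about what the same change would yield on MNIST.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Stand-in data with a train / validation / test split (illustration only).
X_small, y_small = load_digits(return_X_y=True)
X_tr, X_tmp, y_tr, y_tmp = train_test_split(X_small, y_small, test_size=0.4, random_state=42)
X_va, X_te, y_va, y_te = train_test_split(X_tmp, y_tmp, test_size=0.5, random_state=42)

# Fit the base models on the training set only.
base_models = [
    RandomForestClassifier(n_estimators=100, random_state=42),
    SVC(probability=True, random_state=42),
]
for model in base_models:
    model.fit(X_tr, y_tr)


def proba_features(models, X):
    """Concatenate each model's class probabilities into one feature matrix."""
    return np.hstack([m.predict_proba(X) for m in models])


# Train the blender on held-out validation probabilities, then score on the test set.
proba_blender = LogisticRegression(max_iter=1000, random_state=42)
proba_blender.fit(proba_features(base_models, X_va), y_va)

proba_blend_preds = proba_blender.predict(proba_features(base_models, X_te))
proba_blend_acc = accuracy_score(y_te, proba_blend_preds)
print("Probability-based blender accuracy (digits stand-in):", proba_blend_acc)
```

With ten classes and two base models, the blender receives 20 features per sample rather than two label columns.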

Stacking aims to leverage the strengths of diverse models, and its success can depend on factors such as the choice of base classifiers, the complexity of the problem, and the quality of predictions from individual models.
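
scikit-learn also provides `StackingClassifier` (the import that is commented out in the blender cell), which trains the final estimator on cross-validated predictions of the base estimators instead of a single held-out set. The following is a minimal sketch, assuming the bundled `load_digits` data as a stand-in and a reduced set of base models for speed; `stacking_classifier` and `stacking_acc` are hypothetical names.

```python
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Stand-in data (illustration only).
X_small, y_small = load_digits(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X_small, y_small, test_size=0.2, random_state=42)

# The final estimator is fit on out-of-fold predictions of the base estimators.
stacking_classifier = StackingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=50, random_state=42)),
        ("svm", SVC(probability=True, random_state=42)),
    ],
    final_estimator=LogisticRegression(max_iter=1000),
    cv=3,
)
stacking_classifier.fit(X_tr, y_tr)

stacking_acc = stacking_classifier.score(X_te, y_te)
print("StackingClassifier accuracy (digits stand-in):", stacking_acc)
```

Because the meta-features come from cross-validation, no separate validation set has to be reserved for the blender.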

## Conclusion

## Further Analysis and Considerations:
Stacking can be sensitive to the choice of base models and the diversity of their predictions. Experimenting with different base models or tuning their hyperparameters might lead to improvements.

The performance of the stacking ensemble might benefit from additional fine-tuning, such as adjusting hyperparameters of the blender model or exploring different types of blenders.
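
As one way to tune the blender, the sketch below runs a small grid search over the regularization strength `C` of a logistic-regression blender. Everything here is an illustrative assumption: the `load_digits` stand-in data, the two base models, the grid values, and the names `meta_X` and `blender_search`.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Stand-in data (illustration only).
X_small, y_small = load_digits(return_X_y=True)
X_tr, X_va, y_tr, y_va = train_test_split(X_small, y_small, test_size=0.3, random_state=42)

# Fit base models on the training set, then build blender features on held-out data.
base_models = [
    RandomForestClassifier(n_estimators=50, random_state=42),
    KNeighborsClassifier(),
]
for model in base_models:
    model.fit(X_tr, y_tr)
meta_X = np.hstack([m.predict_proba(X_va) for m in base_models])

# Cross-validated search over the blender's regularization strength.
blender_search = GridSearchCV(
    LogisticRegression(max_iter=1000),
    param_grid={"C": [0.01, 0.1, 1.0, 10.0]},
    cv=3,
    scoring="accuracy",
)
blender_search.fit(meta_X, y_va)
print(blender_search.best_params_, blender_search.best_score_)
```

The same pattern applies to other blender types by swapping the estimator and its parameter grid.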

The performance of stacking varies across datasets: the dataset characteristics, the number of instances, and the inherent complexity of the problem can all affect how well ensemble methods work.

In this case the stacking ensemble (96.27%) trailed the voting ensemble (97.70%) on the test set. Stacking nonetheless remains a powerful technique that can yield improvements in other scenarios, particularly when the base models are diverse.

## Final Thoughts:
The choice between different ensemble methods (voting, stacking) and individual classifiers depends on the specific characteristics of the dataset and the nature of the problem being addressed.

Experimentation, hyperparameter tuning, and understanding the strengths and weaknesses of each model are essential steps in building effective ensembles.

The reported accuracy scores indicate the relative performance of the models and ensembles, with the caveat that the individual classifiers were scored on the validation set and the ensembles on the test set. They can guide further refinement and exploration in the machine learning workflow.