# Exercise 8
Load the MNIST data (introduced in Chapter 3), and split it into a training set, a validation set, and a test set (e.g., use 50,000 instances for training, 10,000 for validation, and 10,000 for testing). Then train various classifiers, such as a Random Forest classifier, an Extra-Trees classifier, and an SVM classifier. Next, try to combine them into an ensemble that outperforms each individual classifier on the validation set, using soft or hard voting. Once you have found one, try it on the test set. How much better does it perform compared to the individual classifiers?
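
Before the solution, a minimal sketch of how the two voting modes mentioned above differ. The helper names `hard_vote` and `soft_vote` and the probabilities are made up for illustration; they are not scikit-learn functions. Ties in `hard_vote` are broken by first-seen order, an arbitrary choice for this sketch.

```python
from collections import Counter

def hard_vote(predictions):
    """Return the most common class label among the classifiers' predictions."""
    return Counter(predictions).most_common(1)[0][0]

def soft_vote(probabilities):
    """Average the per-class probabilities across classifiers and return the top class."""
    n_classes = len(probabilities[0])
    averaged = [
        sum(p[k] for p in probabilities) / len(probabilities)
        for k in range(n_classes)
    ]
    return max(range(n_classes), key=averaged.__getitem__)

# Hypothetical class probabilities from three classifiers for one image (classes 0, 1, 2)
probas = [
    [0.45, 0.55, 0.00],  # classifier A: narrowly prefers class 1
    [0.40, 0.60, 0.00],  # classifier B: narrowly prefers class 1
    [0.95, 0.05, 0.00],  # classifier C: strongly prefers class 0
]

# Each classifier's hard prediction is its most probable class
labels = [max(range(len(p)), key=p.__getitem__) for p in probas]

print("Hard vote:", hard_vote(labels))  # majority of the predicted labels
print("Soft vote:", soft_vote(probas))  # class with the highest mean probability
```

In this made-up case the two modes disagree: two classifiers narrowly pick class 1, but the third is confident enough in class 0 to win the averaged vote. Soft voting needs every classifier to output class probabilities, which for `SVC` means setting `probability=True`.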

In [1]:
# load libraries
from sklearn.datasets import fetch_openml
import numpy as np
import matplotlib.pyplot as plt
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from sklearn.ensemble import RandomForestClassifier, ExtraTreesClassifier
import warnings
# suppress library warnings to keep the notebook output readable
warnings.filterwarnings("ignore")
# load MNIST dataset
mnist = fetch_openml('mnist_784', version=1)
# set up data
X = np.array(mnist.data)
y = np.array(mnist.target).astype(int)

# split the data
X_train, X_valid, y_train, y_valid = train_test_split(X, y, test_size=2/7)  # hold out 20k of the 70k samples
X_valid, X_test, y_valid, y_test = train_test_split(X_valid, y_valid, test_size=0.5)  # split the 20k into 10k validation and 10k test

In [3]:
# build the SVC
svm_clf = Pipeline([
    ("scaler", StandardScaler()),
    ("svc", SVC(gamma = "auto", random_state = 42))
])
# Tree-based models are insensitive to feature scaling, so no scaler is needed
# Keep the default hyperparameters; no point in tuning them just yet
rdm_for = RandomForestClassifier(random_state = 42)
# Now for Extra Trees Classifier
extra = ExtraTreesClassifier(random_state = 42)

In [4]:
# train all the models
svm_clf.fit(X_train, y_train)
rdm_for.fit(X_train, y_train)
extra.fit(X_train, y_train)

ExtraTreesClassifier(random_state=42)

In [5]:
# make predictions for all values
svm_pred = svm_clf.predict(X_valid)
rdm_pred = rdm_for.predict(X_valid)
extra_pred = extra.predict(X_valid)

print("SVM Accuracy: ", accuracy_score(y_valid, svm_pred))
print("Random Forest Accuracy: ", accuracy_score(y_valid, rdm_pred))
print("Extra Trees Accuracy: ", accuracy_score(y_valid, extra_pred))

SVM Accuracy:  0.9655
Random Forest Accuracy:  0.9672
Extra Trees Accuracy:  0.9711


In [6]:
# let's use a simple voting classifier ensemble for our models
from sklearn.ensemble import VotingClassifier
voting_clf = VotingClassifier(
    estimators = [('svm', svm_clf),('rf', rdm_for),('extra',extra)],
    voting = 'hard')
voting_clf.fit(X_train, y_train)

VotingClassifier(estimators=[('svm',
                              Pipeline(steps=[('scaler', StandardScaler()),
                                              ('svc',
                                               SVC(gamma='auto',
                                                   random_state=42))])),
                             ('rf', RandomForestClassifier(random_state=42)),
                             ('extra', ExtraTreesClassifier(random_state=42))])

In [7]:
voting_pred = voting_clf.predict(X_valid)

print("Ensemble Accuracy: ", accuracy_score(y_valid, voting_pred))

Ensemble Accuracy:  0.9728


In [8]:
# seeing how it would perform using the test set data
voting_clf.score(X_test, y_test)

0.9746

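As the accuracy scores show, my ensemble outperforms each of the three classifiers working by themselves on the validation set (0.9728 versus 0.9711 for the best individual model, Extra Trees), and it reaches 0.9746 on the test set. The individual classifiers were not scored on the test set here, so the gain there is not measured directly. This should also be taken with a grain of salt because accuracy is a single number: a confusion matrix, ROC curves, and a classification report would show where the ensemble still makes mistakes.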
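
As a rough sketch of that follow-up, `confusion_matrix` and `classification_report` from `sklearn.metrics` can be applied to any pair of true and predicted labels. The toy labels below are invented so the example runs quickly; in this notebook they would be replaced by `y_test` and `voting_clf.predict(X_test)`.

```python
from sklearn.metrics import confusion_matrix, classification_report

# Made-up labels for three classes, standing in for y_test and the ensemble's predictions
y_true = [0, 0, 1, 1, 2, 2, 2]
y_pred = [0, 1, 1, 1, 2, 2, 0]

# Rows are true classes, columns are predicted classes
cm = confusion_matrix(y_true, y_pred)
print(cm)

# Per-class precision, recall, and F1 score
print(classification_report(y_true, y_pred))
```

Off-diagonal entries of the matrix show which classes get confused with each other, which a single accuracy number hides.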

# Exercise 9
Run the individual classifiers from the previous exercise to make predictions on the validation set, and create a new training set with the resulting predictions: each training instance is a vector containing the set of predictions from all your classifiers for an image, and the target is the image’s class. Train a classifier on this new training set. Congratulations, you have just trained a blender, and together with the classifiers it forms a stacking ensemble! Now evaluate the ensemble on the test set. For each image in the test set, make predictions with all your classifiers, then feed the predictions to the blender to get the ensemble’s predictions. How does it compare to the voting classifier you trained earlier?
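
To make the data flow concrete before running it on MNIST, here is a toy sketch of the same idea in plain Python. Everything in it is hypothetical: the three rule-based functions stand in for trained classifiers, and `LookupBlender` is a deliberately simple blender that memorizes the most common target for each tuple of predictions. It is not the blender used below.

```python
from collections import Counter, defaultdict

# Three hypothetical "base classifiers": fixed rules standing in for trained models
def clf_a(x):
    return int(x[0] > 0.5)

def clf_b(x):
    return int(x[1] > 0.5)

def clf_c(x):
    return int(x[0] + x[1] > 1.0)

base_classifiers = [clf_a, clf_b, clf_c]

def meta_features(X):
    """One row per instance: the tuple of predictions from every base classifier."""
    return [tuple(clf(x) for clf in base_classifiers) for x in X]

class LookupBlender:
    """Toy blender: remembers the most common target for each prediction tuple."""

    def fit(self, Z, y):
        votes = defaultdict(Counter)
        for z, target in zip(Z, y):
            votes[z][target] += 1
        self.table_ = {z: c.most_common(1)[0][0] for z, c in votes.items()}
        # Fall back to the overall most common class for unseen prediction tuples
        self.default_ = Counter(y).most_common(1)[0][0]
        return self

    def predict(self, Z):
        return [self.table_.get(z, self.default_) for z in Z]

# Held-out validation set: the true class is 1 only when both features are high
X_valid = [(0.9, 0.8), (0.7, 0.9), (0.9, 0.05), (0.2, 0.9), (0.1, 0.2), (0.6, 0.3)]
y_valid = [1, 1, 0, 0, 0, 0]

# The blender is trained on the base classifiers' predictions, not on the raw features
blender = LookupBlender().fit(meta_features(X_valid), y_valid)

# At test time: predict with every base classifier, then feed those predictions to the blender
X_test = [(0.8, 0.7), (0.95, 0.2), (0.3, 0.3)]
stacked_pred = blender.predict(meta_features(X_test))
print(stacked_pred)
```

The key point is that the blender only ever sees the base classifiers' outputs, and that it is fitted on data the base classifiers were not trained on.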

In [11]:
# create an estimators list object
estimators = [svm_clf, rdm_for, extra]
# make a 10,000 x 3 empty matrix
X_val_predictions = np.empty((len(X_valid), len(estimators)), dtype=np.float32)

# for loop to make things easier and faster :)
for index, estimator in enumerate(estimators):
    X_val_predictions[:, index] = estimator.predict(X_valid)

In [12]:
# create a blender using a random forest classifier
rnd_forest_blender = RandomForestClassifier(oob_score=True, random_state=42)
rnd_forest_blender.fit(X_val_predictions, y_valid)

RandomForestClassifier(oob_score=True, random_state=42)

In [13]:
# out-of-bag estimate of the blender's accuracy on the validation predictions
rnd_forest_blender.oob_score_

0.9713

In [15]:
X_test_predictions = np.empty((len(X_test), len(estimators)), dtype=np.float32)

for index, estimator in enumerate(estimators):
    X_test_predictions[:,index] = estimator.predict(X_test)

In [16]:
# have stacked ensemble predict X_test
y_pred = rnd_forest_blender.predict(X_test_predictions)

# Accuracy score of stacked ensemble on X_test
print("Stacked Ensemble Accuracy: ", accuracy_score(y_test, y_pred))

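Stacked Ensemble Accuracy:  0.973

On the test set, the stacked ensemble (0.973) scores slightly below the hard voting classifier from Exercise 8 (0.9746), so the blender does not improve on simple voting here.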
