### Exercise 8
Load the MNIST data (introduced in Chapter 3), and split it into a training set, a validation set, and a
test set (e.g., use 50,000 instances for training, 10,000 for validation, and 10,000 for testing). Then
train various classifiers, such as a Random Forest classifier, an Extra-Trees classifier, and an SVM.
Next, try to combine them into an ensemble that outperforms them all on the validation set, using a **soft or hard voting classifier**. Once you have found one, try it on the test set. How much better does it
perform compared to the individual classifiers?

In [1]:
from sklearn.datasets import fetch_openml
mnist = fetch_openml('mnist_784', version=1, as_frame=False)

In [2]:
# split MNIST data into a training set, a validation set, and a test set
from sklearn.model_selection import train_test_split
X_train_val, X_test, y_train_val, y_test = train_test_split(
    mnist.data, mnist.target, test_size=10000, random_state=42)
X_train, X_val, y_train, y_val = train_test_split(
    X_train_val, y_train_val, test_size=10000, random_state=42)
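As a quick sanity check, the two-step split can be sketched on a stand-in array with the same number of rows as MNIST (70,000). The names `X_demo` and `y_demo` are hypothetical placeholders, used so the sketch runs without downloading the dataset.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical stand-in with MNIST's row count (70,000) but a single feature.
X_demo = np.arange(70000).reshape(-1, 1)
y_demo = np.arange(70000) % 10

# First hold out 10,000 test instances, then 10,000 validation instances.
X_tv, X_te, y_tv, y_te = train_test_split(
    X_demo, y_demo, test_size=10000, random_state=42)
X_tr, X_va, y_tr, y_va = train_test_split(
    X_tv, y_tv, test_size=10000, random_state=42)

split_sizes = (len(X_tr), len(X_va), len(X_te))
print(split_sizes)  # (50000, 10000, 10000)
```

An integer `test_size` is interpreted as an absolute number of instances, which is why the training set ends up with 50,000 rows.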

In [3]:
# shuffle the training set (optional: train_test_split already shuffles by default)
import numpy as np
shuffle_index = np.random.permutation(len(X_train))
X_train, y_train = X_train[shuffle_index], y_train[shuffle_index]

#### Train various classifiers, such as a Random Forest classifier, an Extra-Trees classifier, and an SVM

In [4]:
from sklearn.ensemble import RandomForestClassifier
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.svm import LinearSVC
rnd_clf = RandomForestClassifier(n_estimators=100, random_state=42)
extra_tree_clf = ExtraTreesClassifier(n_estimators=100, random_state=42)
svm_clf = LinearSVC(max_iter=100, tol=20, random_state=42)

In [5]:
estimators = [rnd_clf, extra_tree_clf, svm_clf]
for estimator in estimators:
    print("Training the ", estimator)
    estimator.fit(X_train, y_train)

Training the  RandomForestClassifier(random_state=42)
Training the  ExtraTreesClassifier(random_state=42)
Training the  LinearSVC(max_iter=100, random_state=42, tol=20)


In [6]:
# check each model's accuracy on the validation set
[estimator.score(X_val, y_val) for estimator in estimators]

[0.9698, 0.9715, 0.8616]

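The linear SVM scores far lower (0.8616) than the two tree ensembles (about 0.97), so it may hurt the ensemble rather than help it.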
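One plausible reason for the weak SVM score is the combination of unscaled pixel features and a very loose tolerance (`tol=20` makes the solver stop early). The sketch below compares the notebook's settings with a scaled pipeline, using scikit-learn's small `load_digits` dataset instead of MNIST. Whether the same change helps on MNIST is an assumption that this notebook did not measure, and the names `loose_svm` and `scaled_svm` are illustrative.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

# Small stand-in dataset (8x8 digits), not MNIST.
X_d, y_d = load_digits(return_X_y=True)
X_d_train, X_d_val, y_d_train, y_d_val = train_test_split(
    X_d, y_d, test_size=0.25, random_state=42)

# Same settings as the notebook's svm_clf.
loose_svm = LinearSVC(max_iter=100, tol=20, random_state=42)
# Assumed alternative: standardize features and keep the default tolerance.
scaled_svm = make_pipeline(StandardScaler(),
                           LinearSVC(max_iter=1000, random_state=42))

svm_scores = {}
for name, model in [("loose", loose_svm), ("scaled", scaled_svm)]:
    model.fit(X_d_train, y_d_train)
    svm_scores[name] = model.score(X_d_val, y_d_val)
print(svm_scores)
```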

In [8]:
from sklearn.ensemble import VotingClassifier
named_estimators = [
    ("rnd_clf", rnd_clf),
    ("extra_tree_clf", extra_tree_clf),
    ("svm_clf", svm_clf)
]

In [9]:
voting_clf = VotingClassifier(named_estimators)

In [10]:
voting_clf.fit(X_train, y_train)

VotingClassifier(estimators=[('rnd_clf',
                              RandomForestClassifier(random_state=42)),
                             ('extra_tree_clf',
                              ExtraTreesClassifier(random_state=42)),
                             ('svm_clf',
                              LinearSVC(max_iter=100, random_state=42,
                                        tol=20))])

In [11]:
voting_clf.score(X_val, y_val)

0.9697

In [42]:
# the fitted sub-estimators were trained on integer-encoded labels, so cast y_val to integers
[estimator.score(X_val, y_val.astype(np.uint8)) for estimator in voting_clf.estimators_]

[0.9698, 0.9715]

Let's remove the SVM and see whether the voting classifier's performance improves.

In [13]:
voting_clf.set_params(svm_clf=None)

VotingClassifier(estimators=[('rnd_clf',
                              RandomForestClassifier(random_state=42)),
                             ('extra_tree_clf',
                              ExtraTreesClassifier(random_state=42)),
                             ('svm_clf', None)])

In [14]:
voting_clf.estimators

[('rnd_clf', RandomForestClassifier(random_state=42)),
 ('extra_tree_clf', ExtraTreesClassifier(random_state=42)),
 ('svm_clf', None)]

In [15]:
voting_clf.estimators_

[RandomForestClassifier(random_state=42),
 ExtraTreesClassifier(random_state=42),
 LinearSVC(max_iter=100, random_state=42, tol=20)]

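Updating `estimators` does not change the already-fitted list `estimators_`, which still contains the SVM. We can either fit the `VotingClassifier` again, or just remove the SVM from the list of trained estimators: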

In [16]:
del voting_clf.estimators_[2]

Let's evaluate the `VotingClassifier` again:

In [17]:
voting_clf.score(X_val, y_val)

0.9706
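The other option mentioned above, refitting, can be sketched as follows. In recent scikit-learn releases the string `"drop"` is the documented way to exclude an estimator (passing `None` was deprecated), so this sketch assumes such a version. It uses the small `load_digits` dataset and fewer trees to stay fast, and `demo_voting` is a hypothetical name.

```python
from sklearn.datasets import load_digits
from sklearn.ensemble import (ExtraTreesClassifier, RandomForestClassifier,
                              VotingClassifier)
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC

# Small stand-in dataset (8x8 digits), not MNIST.
X_d, y_d = load_digits(return_X_y=True)
X_d_train, X_d_val, y_d_train, y_d_val = train_test_split(
    X_d, y_d, test_size=0.25, random_state=42)

demo_voting = VotingClassifier([
    ("rnd_clf", RandomForestClassifier(n_estimators=20, random_state=42)),
    ("extra_tree_clf", ExtraTreesClassifier(n_estimators=20, random_state=42)),
    ("svm_clf", LinearSVC(max_iter=100, tol=20, random_state=42)),
])

# Mark the SVM as dropped, then fit: only the two forests are trained.
demo_voting.set_params(svm_clf="drop")
demo_voting.fit(X_d_train, y_d_train)

n_fitted = len(demo_voting.estimators_)
print(n_fitted, demo_voting.score(X_d_val, y_d_val))
```

Refitting costs more time than deleting an entry from `estimators_`, but it keeps `estimators` and `estimators_` consistent with each other.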

Let's try using a soft voting classifier

In [18]:
voting_clf.voting = "soft"

In [19]:
voting_clf.score(X_val, y_val)

0.9722
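The difference between the two voting modes can be illustrated with a toy calculation. The probabilities below are made up for illustration and do not come from the MNIST models. Soft voting requires every estimator to expose `predict_proba`, which Random Forest and Extra-Trees do and `LinearSVC` does not.

```python
import numpy as np

# Hypothetical class probabilities from three classifiers for one instance
# (rows: classifiers, columns: classes 0, 1, 2).
probas = np.array([
    [0.45, 0.55, 0.00],
    [0.40, 0.60, 0.00],
    [0.90, 0.10, 0.00],
])

# Hard voting: each classifier votes for its most likely class; majority wins.
votes = probas.argmax(axis=1)  # [1, 1, 0]
hard_prediction = np.bincount(votes, minlength=probas.shape[1]).argmax()

# Soft voting: average the probabilities, then pick the most likely class.
mean_proba = probas.mean(axis=0)  # roughly [0.583, 0.417, 0.0]
soft_prediction = mean_proba.argmax()

print(hard_prediction, soft_prediction)  # 1 0
```

Here one very confident classifier outweighs two lukewarm ones under soft voting, while hard voting counts each vote equally.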

Once you have found one, try it on the test set. How much better does it perform compared to the individual classifiers?

In [20]:
voting_clf.voting = "hard"
voting_clf.score(X_test, y_test)

0.966

In [43]:
[estimator.score(X_test, y_test.astype(np.uint8)) for estimator in voting_clf.estimators_]

[0.9655, 0.9691]

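With hard voting, the ensemble reaches 0.966 on the test set: slightly better than the Random Forest (0.9655) but worse than the Extra-Trees classifier (0.9691), so it does not beat the best individual model here. Note that this cell evaluates hard voting, even though soft voting scored higher on the validation set (0.9722 versus 0.9706).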
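To make the comparison concrete, the test accuracies printed above can be converted to error rates. The dictionary keys are illustrative labels; the numbers are copied from the outputs.

```python
# Test-set accuracies copied from the notebook outputs above.
test_accuracy = {
    "random_forest": 0.9655,
    "extra_trees": 0.9691,
    "hard_voting": 0.966,
}

# Error rate = 1 - accuracy (rounded to avoid floating-point noise).
error_rate = {name: round(1 - acc, 4) for name, acc in test_accuracy.items()}

# Relative change of the ensemble's error versus the best single model.
best_single = min(error_rate["random_forest"], error_rate["extra_trees"])
relative_change = (error_rate["hard_voting"] - best_single) / best_single

print(error_rate)
print(f"{relative_change:+.1%}")  # +10.0%
```

A positive value means the ensemble makes more errors than the best individual classifier.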

### Exercise 9
Run the individual classifiers from the previous exercise to **make predictions** on the validation set,
and create **a new training set** with the **resulting predictions**: each training instance is a vector containing the set of predictions from all your classifiers for an image, and the target is the image’s class. Train a classifier on this new training set. Congratulations, you have just trained a **blender**, and together with the classifiers they form a
**stacking ensemble**! <br/>

Now let’s evaluate the ensemble on the test set. For each image in the test set,
make predictions with all your classifiers, then feed the predictions to the blender to get the ensemble’s predictions. How does it compare to the **voting classifier** you trained earlier?

In [22]:
# allocate an uninitialized array; its values are arbitrary until filled in below
X_val_predictions = np.empty((len(X_val), len(estimators)), dtype=np.float32)
X_val_predictions

array([[1.1e-44, 0.0e+00, 5.6e-45],
       [0.0e+00, 1.1e-44, 0.0e+00],
       [9.8e-45, 0.0e+00, 9.8e-45],
       ...,
       [0.0e+00, 9.8e-45, 0.0e+00],
       [1.4e-45, 0.0e+00, 2.8e-45],
       [0.0e+00, 7.0e-45, 0.0e+00]], dtype=float32)

In [23]:
# make predictions on the validation set; X_val_predictions becomes the blender's training set
for index, estimator in enumerate(estimators):
    X_val_predictions[:, index] = estimator.predict(X_val)
X_val_predictions

array([[5., 5., 5.],
       [8., 8., 8.],
       [2., 2., 2.],
       ...,
       [7., 7., 7.],
       [6., 6., 6.],
       [7., 7., 7.]], dtype=float32)

In [24]:
rnd_blender = RandomForestClassifier(n_estimators=200, oob_score=True, random_state=42)

In [25]:
rnd_blender.fit(X_val_predictions, y_val)

RandomForestClassifier(n_estimators=200, oob_score=True, random_state=42)

In [26]:
rnd_blender.oob_score_

0.9693

In [27]:
# let’s evaluate the ensemble on the test set
X_test_predictions = np.empty((len(X_test), len(estimators)), dtype=np.float32)

for index, estimator in enumerate(estimators):
    X_test_predictions[:, index] = estimator.predict(X_test)

In [28]:
y_pred = rnd_blender.predict(X_test_predictions)

In [29]:
from sklearn.metrics import accuracy_score
accuracy_score(y_test, y_pred)

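0.9681

The stacking ensemble reaches 0.9681 on the test set, slightly better than the hard voting classifier (0.966) but still below the best individual model, the Extra-Trees classifier (0.9691).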
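The manual hold-out blending above can also be approximated with scikit-learn's built-in `StackingClassifier`. This is an alternative and not what the notebook does: it trains the blender on cross-validated predictions rather than on a separate validation set, and by default it feeds class probabilities to the blender when they are available. The sketch uses the small `load_digits` dataset and fewer trees to stay fast.

```python
from sklearn.datasets import load_digits
from sklearn.ensemble import (ExtraTreesClassifier, RandomForestClassifier,
                              StackingClassifier)
from sklearn.model_selection import train_test_split

# Small stand-in dataset (8x8 digits), not MNIST.
X_d, y_d = load_digits(return_X_y=True)
X_d_train, X_d_test, y_d_train, y_d_test = train_test_split(
    X_d, y_d, test_size=0.25, random_state=42)

stacking_clf = StackingClassifier(
    estimators=[
        ("rnd_clf", RandomForestClassifier(n_estimators=20, random_state=42)),
        ("extra_tree_clf", ExtraTreesClassifier(n_estimators=20, random_state=42)),
    ],
    # The blender: trained on out-of-fold predictions of the base estimators.
    final_estimator=RandomForestClassifier(n_estimators=50, random_state=42),
    cv=3,
)
stacking_clf.fit(X_d_train, y_d_train)

stacking_score = stacking_clf.score(X_d_test, y_d_test)
print(stacking_score)
```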