# 1. 

If you have trained five different models on the exact same training data, and they all achieve 95% precision, is there any chance that you can combine these models to get better results? If so, how? If not, why?

## My Solution 

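Yes, there is a good chance. Combining the models into a voting ensemble often gives better results than the best individual model, as long as the models make different kinds of errors. The gain is largest when the models are very different from one another (e.g., an SVM, a Decision Tree, and a Logistic Regression classifier). Since these five models were all trained on the same data, how much we gain depends on how different the algorithms are.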
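To get a feel for why combining can help, here is a rough calculation. It assumes, unrealistically, that the models make independent errors on a binary task and that each one is right 95% of the time (the question speaks of precision; accuracy is used here to keep the arithmetic simple). The function name `majority_vote_accuracy` is only for illustration.

```python
from math import comb

def majority_vote_accuracy(n_models: int, p_correct: float) -> float:
    """Probability that a strict majority of n independent models is correct."""
    majority = n_models // 2 + 1
    return sum(
        comb(n_models, k) * p_correct**k * (1 - p_correct)**(n_models - k)
        for k in range(majority, n_models + 1)
    )

print(round(majority_vote_accuracy(5, 0.95), 4))  # 0.9988
```

In practice, models trained on the same data make correlated errors, so the real gain is smaller than this idealized figure.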


# 2. 
What is the difference between hard and soft voting classifiers?

## My solution 

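Hard voting takes the class predicted by each model and uses the majority prediction as the end result. Soft voting instead averages the class probabilities estimated by each model and picks the class with the highest average probability. This gives more weight to confident votes and often leads to better results, but it only works if every model can estimate class probabilities (e.g., for an SVC in Scikit-Learn you must set probability=True).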
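The difference shows up when a minority of models is very confident. Below is a small sketch with made-up probabilities for a single instance (the numbers are hypothetical and chosen so that the two schemes disagree).

```python
import numpy as np

# Hypothetical class probabilities for one instance from three classifiers.
# Columns are [P(class 0), P(class 1)].
probas = np.array([
    [0.45, 0.55],
    [0.45, 0.55],
    [0.90, 0.10],
])

# Hard voting: each classifier votes for its most likely class
votes = probas.argmax(axis=1)
hard_prediction = np.bincount(votes).argmax()

# Soft voting: average the probabilities, then pick the most likely class
soft_prediction = probas.mean(axis=0).argmax()

print(hard_prediction, soft_prediction)  # 1 0
```

Two lukewarm votes for class 1 win under hard voting, while the one confident vote for class 0 wins under soft voting.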

# 3. 

Is it possible to speed up training of a bagging ensemble by distributing it across multiple servers? What about pasting ensembles, boosting ensembles, Random Forests, or stacking ensembles?

## My Solution 

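From what I recall, each predictor in a bagging ensemble is trained independently of the others, so training can be distributed across multiple servers. (This is not the same as out-of-core learning, which is about training on data that does not fit in memory.)

Pasting differs from bagging only in that it samples the training instances without replacement. Its predictors are still independent of one another, so it can be distributed in the same way. Boosting requires us to train the models in sequence, each one built on the errors of the previous one, so distributing it across multiple servers gains very little.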

Random Forests use bagging on the trees they train. Since the trees do not need to communicate with one another during training, we can distribute them over multiple servers.

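Stacking ensembles can be run across multiple servers, but only one layer at a time. The predictors in the first layer are independent of each other, so they can be trained in parallel on the first training subset. The blender can only be trained once those predictors have made their predictions on the second (held-out) subset.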
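The independence of bagging and pasting predictors can be sketched as follows. This is a toy illustration, not a real distributed setup: threads stand in for servers, the helper `train_one` and the sample sizes are made up, and a small synthetic dataset is used.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np
from sklearn.datasets import make_moons
from sklearn.tree import DecisionTreeClassifier

X, y = make_moons(n_samples=500, noise=0.3, random_state=42)

def train_one(seed, replace):
    """Train one tree on a random sample; needs nothing from the other trees."""
    rng = np.random.default_rng(seed)
    # replace=True gives bagging, replace=False gives pasting
    idx = rng.choice(len(X), size=300, replace=replace)
    return DecisionTreeClassifier(random_state=seed).fit(X[idx], y[idx])

# Each call is independent, so the workers could just as well be servers
with ThreadPoolExecutor(max_workers=4) as pool:
    bagging_trees = list(pool.map(lambda s: train_one(s, True), range(10)))
    pasting_trees = list(pool.map(lambda s: train_one(s, False), range(10)))

# Aggregate by majority vote (binary labels, so a mean above 0.5 means class 1)
votes = np.mean([tree.predict(X) for tree in bagging_trees], axis=0)
ensemble_pred = (votes > 0.5).astype(int)
print((ensemble_pred == y).mean())
```

A boosting ensemble could not be written this way, because each predictor needs the previous one's errors before it can start training.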



# 4. 

What is the benefit of out-of-bag evaluation?

## My solution 

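With out-of-bag evaluation, each predictor in a bagging ensemble is evaluated on the training instances that were not in its bootstrap sample (roughly 37% of them). This gives a fairly unbiased estimate of the ensemble's performance without needing a separate validation set, so more instances are left for training.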
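As a quick check of this idea, the sketch below compares the out-of-bag estimate with the accuracy on a held-out set. The synthetic dataset and the variable names are assumptions made for the example.

```python
from sklearn.datasets import make_moons
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_moons(n_samples=1000, noise=0.3, random_state=42)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=42)

# oob_score=True scores each training instance using only the trees
# that did not see it in their bootstrap sample
forest = RandomForestClassifier(n_estimators=200, oob_score=True, random_state=42)
forest.fit(X_tr, y_tr)

print("OOB estimate: ", forest.oob_score_)
print("Test accuracy:", forest.score(X_te, y_te))
```

The two numbers are usually close, which is what makes the out-of-bag score a reasonable stand-in for a validation score.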

# 5. 

What makes Extra-Trees more random than regular Random Forests? How can this extra randomness help? Are Extra-Trees slower or faster than regular Random Forests?

## My solution 

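When a regular Random Forest grows a tree, it considers a random subset of features at each node and searches for the best threshold for each one (e.g., the one that minimizes Gini impurity). Extra-Trees also use a random subset of features, but they pick a random threshold for each feature instead of searching for the best one. This extra randomness acts as a form of regularization: it trades a bit more bias for a lower variance, which helps if the Random Forest overfits. Because searching for the best threshold is one of the most time-consuming parts of growing a tree, Extra-Trees are faster to train than Random Forests. They are neither faster nor slower when making predictions.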
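One way to check the training-speed claim is to time both classifiers on the same data. The sketch below uses a synthetic dataset whose size is an arbitrary choice; the timings depend on the machine, so no expected output is shown.

```python
import time

from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier

X, y = make_classification(n_samples=2000, n_features=40,
                           n_informative=10, random_state=42)

fit_times = {}
for model in (RandomForestClassifier(n_estimators=100, random_state=42),
              ExtraTreesClassifier(n_estimators=100, random_state=42)):
    start = time.perf_counter()
    model.fit(X, y)
    fit_times[type(model).__name__] = time.perf_counter() - start

for name, seconds in fit_times.items():
    print(f"{name}: {seconds:.2f}s")
```

Note that Scikit-Learn's ExtraTreesClassifier does not bootstrap the training set by default, so the two models also differ in that respect.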

# 6. 

If your AdaBoost ensemble underfits the training data, which hyperparameters should you tweak and how?

## My solution 

If your AdaBoost ensemble is underfitting the training set, you can try increasing the number of estimators or reducing the regularization of the base estimator (e.g., allowing deeper trees). You can also try slightly increasing the learning rate.
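In Scikit-Learn terms, these adjustments could look like the sketch below. The dataset and the specific hyperparameter values are illustrative assumptions, not tuned settings.

```python
from sklearn.datasets import make_moons
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_moons(n_samples=500, noise=0.3, random_state=42)

# A deliberately weak setup: few, heavily regularized estimators
weak = AdaBoostClassifier(DecisionTreeClassifier(max_depth=1),
                          n_estimators=5, learning_rate=0.5, random_state=42)
# More estimators, a less regularized base estimator, a higher learning rate
stronger = AdaBoostClassifier(DecisionTreeClassifier(max_depth=3),
                              n_estimators=200, learning_rate=1.0, random_state=42)

for name, model in (("weak", weak), ("stronger", stronger)):
    model.fit(X, y)
    print(name, "training accuracy:", model.score(X, y))
```

Comparing the two training accuracies shows whether the changes reduce underfitting. A validation set is still needed to make sure the second model does not overfit instead.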

# 7. 

If your Gradient Boosting ensemble overfits the training set, should you increase or decrease the learning rate?

## My solution 

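Decrease the learning rate. You can also use early stopping to find the right number of predictors, since an overfitting ensemble probably has too many.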
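A lower learning rate is often paired with early stopping. The sketch below uses GradientBoostingClassifier's built-in early stopping on a synthetic dataset; the specific values are assumptions chosen for illustration.

```python
from sklearn.datasets import make_moons
from sklearn.ensemble import GradientBoostingClassifier

X, y = make_moons(n_samples=1000, noise=0.3, random_state=42)

gbrt = GradientBoostingClassifier(
    learning_rate=0.05,       # smaller steps: each tree contributes less
    n_estimators=500,         # upper bound on the number of trees
    n_iter_no_change=10,      # stop once the validation score stalls for 10 rounds
    validation_fraction=0.1,  # share of the training data held out for that check
    random_state=42,
)
gbrt.fit(X, y)
print("Trees actually trained:", gbrt.n_estimators_)
```

With a smaller learning rate the ensemble needs more trees to fit the data, so the two hyperparameters are best tuned together.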

# 8. 
Load the MNIST data, and split it into a training set, a validation set, and a test set (e.g., use 50,000 instances for training, 10,000 for validation, and 10,000 for testing). Then train various classifiers, such as a Random Forest classifier, an Extra-Trees classifier, and an SVM classifier. Next, try to combine them into an ensemble that outperforms each individual classifier on the validation set, using soft or hard voting. Once you have found one, try it on the test set. How much better does it perform compared to the individual classifiers?



In [1]:
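# Fetch the MNIST dataset as NumPy arrays
import numpy as np
from sklearn.datasets import fetch_openml
mnist = fetch_openml('mnist_784', version=1, as_frame=False)
# The labels are loaded as strings; convert them to integers
X, y = mnist["data"], mnist["target"].astype(np.uint8)
X.shape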

(70000, 784)

In [2]:
# Training was taking too long, so I used fewer instances than the exercise
# suggests: 20,000 for training, 5,000 for validation, and 5,000 for testing.
X_train, X_val, X_test = X[:20000], X[20000:25000], X[25000:30000]
y_train, y_val, y_test = y[:20000], y[20000:25000], y[25000:30000]

In [4]:
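from sklearn import svm
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.ensemble import VotingClassifier

rf_cl = RandomForestClassifier()
# Import the svm module rather than the SVC class, so that binding the
# instance to the name SVC does not shadow the class
SVC = svm.SVC()
et_cl = ExtraTreesClassifier()

# Hard voting: the ensemble predicts the class that gets the most votes
voting_clf = VotingClassifier(
    estimators=[('et', et_cl), ('rf', rf_cl), ('svc', SVC)],
    voting='hard')
voting_clf.fit(X_train, y_train)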

VotingClassifier(estimators=[('et',
                              ExtraTreesClassifier(bootstrap=False,
                                                   ccp_alpha=0.0,
                                                   class_weight=None,
                                                   criterion='gini',
                                                   max_depth=None,
                                                   max_features='auto',
                                                   max_leaf_nodes=None,
                                                   max_samples=None,
                                                   min_impurity_decrease=0.0,
                                                   min_impurity_split=None,
                                                   min_samples_leaf=1,
                                                   min_samples_split=2,
                                                   min_weight_fraction_leaf=0.0,
                                 

In [7]:
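from sklearn.metrics import accuracy_score
# VotingClassifier fits clones of its estimators, so each individual
# classifier must be fit separately before it can be evaluated on its own.
# Note: the scores printed below are measured on the test set.
for clf in (et_cl, rf_cl, SVC, voting_clf):
    clf.fit(X_train, y_train)
    y_pred = clf.predict(X_test)
    print(clf.__class__.__name__, accuracy_score(y_test, y_pred))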

ExtraTreesClassifier 0.9558
RandomForestClassifier 0.9516
SVC 0.9654
VotingClassifier 0.9578
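In the results above, the hard voting ensemble (0.9578) scores slightly below the SVC on its own (0.9654), so it does not beat the best individual classifier. One variation worth trying is soft voting, which requires the SVC to estimate class probabilities. The sketch below uses Scikit-Learn's small built-in digits dataset rather than MNIST so that it runs quickly. The variable names are illustrative, and whether soft voting actually helps on MNIST would have to be checked on the validation set.

```python
from sklearn import svm
from sklearn.datasets import load_digits
from sklearn.ensemble import (ExtraTreesClassifier, RandomForestClassifier,
                              VotingClassifier)
from sklearn.model_selection import train_test_split

X_digits, y_digits = load_digits(return_X_y=True)
X_tr, X_va, y_tr, y_va = train_test_split(X_digits, y_digits, random_state=42)

soft_voting_clf = VotingClassifier(
    estimators=[
        ("et", ExtraTreesClassifier(random_state=42)),
        ("rf", RandomForestClassifier(random_state=42)),
        # probability=True lets the SVC provide predict_proba for soft voting
        ("svc", svm.SVC(probability=True, random_state=42)),
    ],
    voting="soft",
)
soft_voting_clf.fit(X_tr, y_tr)
print("Soft voting validation accuracy:", soft_voting_clf.score(X_va, y_va))
```

Setting probability=True makes the SVC slower to train, because it runs an internal cross-validation to calibrate the probabilities.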


# 9. 

Run the individual classifiers from the previous exercise to make predictions on the validation set, and create a new training set with the resulting predictions: each training instance is a vector containing the set of predictions from all your classifiers for an image, and the target is the image’s class. Train a classifier on this new training set. Congratulations, you have just trained a blender, and together with the classifiers it forms a stacking ensemble! Now evaluate the ensemble on the test set. For each image in the test set, make predictions with all your classifiers, then feed the predictions to the blender to get the ensemble’s predictions. How does it compare to the voting classifier you trained earlier?

In [10]:
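import numpy as np

# Inputs to the blender: one column of predictions per fitted estimator
estimators = [et_cl, rf_cl,  SVC, voting_clf]
X_val_predictions = np.empty((len(X_val), len(estimators)), dtype=np.float32)

# Fill each column with that estimator's predicted class for the validation images
for index, estimator in enumerate(estimators):
    X_val_predictions[:, index] = estimator.predict(X_val)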

In [11]:
rnd_forest_blender = RandomForestClassifier(n_estimators=100, oob_score=True, random_state=42)
rnd_forest_blender.fit(X_val_predictions, y_val)
rnd_forest_blender.oob_score_

0.9672

In [None]:
X_test_predictions = np.empty((len(X_test), len(estimators)), dtype=np.float32)

for index, estimator in enumerate(estimators):
    X_test_predictions[:, index] = estimator.predict(X_test)

In [None]:
y_pred = rnd_forest_blender.predict(X_test_predictions)

In [None]:
accuracy_score(y_test, y_pred)
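For comparison, Scikit-Learn (0.22 and later) also provides a StackingClassifier that automates these steps. It is not the same procedure as the cells above: instead of a single held-out validation set, it trains the blender on cross-validated (out-of-fold) predictions. The sketch below is an assumption-laden illustration on the small built-in digits dataset, not a rerun of the MNIST experiment.

```python
from sklearn import svm
from sklearn.datasets import load_digits
from sklearn.ensemble import (ExtraTreesClassifier, RandomForestClassifier,
                              StackingClassifier)
from sklearn.model_selection import train_test_split

X_digits, y_digits = load_digits(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X_digits, y_digits, random_state=42)

stacking_clf = StackingClassifier(
    estimators=[
        ("et", ExtraTreesClassifier(random_state=42)),
        ("rf", RandomForestClassifier(random_state=42)),
        ("svc", svm.SVC(random_state=42)),
    ],
    # The blender; it is trained on out-of-fold predictions of the estimators
    final_estimator=RandomForestClassifier(random_state=42),
    cv=5,
)
stacking_clf.fit(X_tr, y_tr)
print("Stacking test accuracy:", stacking_clf.score(X_te, y_te))
```

The resulting score can be compared with a voting classifier built from the same three estimators to see whether the blender adds anything.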