#### 1. If you have trained five different models on the exact same training data, and they all achieve 95% precision, is there any chance that you can combine these models to get better results? If so, how? If not, why?

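Yes, there is a good chance. Even weak learners that are right only 51% of the time can reach a noticeably higher accuracy (around 75% in the classic example of 1,000 such classifiers) when enough of them are combined by majority vote, provided their errors are reasonably independent. Here the models already achieve 95% each, so combining them into a voting ensemble will often give even better results. The aggregated answer is often better than the best individual one; this is called the **wisdom of the crowd**. The gain is largest when the models are very different from one another (different algorithms, or trained on different subsets of the data), because they then make different kinds of errors.
<br><br>The whole field of Ensemble Learning is based on this very idea.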
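
To put a number on the wisdom-of-the-crowd effect, here is a minimal sketch that computes the accuracy of a majority vote. It assumes binary classifiers that are all right with the same probability and, crucially, whose errors are fully independent. That is an idealization: models trained on the same data make correlated errors, so real gains are smaller. The function name `majority_vote_accuracy` is made up for this illustration.

```python
from math import exp, lgamma, log

def majority_vote_accuracy(n_classifiers, p_correct):
    """Probability that a strict majority of independent binary classifiers is right."""
    n, p = n_classifiers, p_correct
    total = 0.0
    # Sum the binomial probabilities of k correct votes for every winning k.
    # Log-space arithmetic avoids overflow for large n.
    for k in range(n // 2 + 1, n + 1):
        log_pmf = (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
                   + k * log(p) + (n - k) * log(1 - p))
        total += exp(log_pmf)
    return total

# Weak learners (51% each): accuracy grows with the size of the crowd.
for n in (1, 5, 1001):
    print(n, "classifiers at 51%:", round(majority_vote_accuracy(n, 0.51), 4))

# The situation from the question: five models at 95% each.
print("5 classifiers at 95%:", round(majority_vote_accuracy(5, 0.95), 4))
```

An odd number of voters is used so that ties cannot occur.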

#### 2. What's the difference between hard and soft voting classifiers?

A **Hard Voting Classifier** counts the votes each class gets from the learners in the ensemble and picks the class with the most votes.<br>A **Soft Voting Classifier** instead averages the class probabilities estimated by all the classifiers in the ensemble and picks the class with the highest average probability. This gives more weight to highly confident votes and often performs better, but it requires every classifier in the ensemble to have a `predict_proba` method.
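
The two voting schemes can disagree. The sketch below uses made-up probabilities from three hypothetical classifiers for a single instance: two of them lean slightly towards class 1, while the third is very confident in class 0.

```python
# Hypothetical class probabilities from three classifiers for one instance
# (columns: class 0, class 1). The numbers are made up for illustration.
probas = [
    [0.90, 0.10],  # classifier 1: very confident in class 0
    [0.45, 0.55],  # classifier 2: leans slightly towards class 1
    [0.40, 0.60],  # classifier 3: leans slightly towards class 1
]
n_classes = len(probas[0])

# Hard voting: each classifier casts one vote for its most probable class.
votes = [max(range(n_classes), key=lambda c: p[c]) for p in probas]
hard_prediction = max(range(n_classes), key=votes.count)

# Soft voting: average the probabilities per class, then take the argmax.
mean_proba = [sum(p[c] for p in probas) / len(probas) for c in range(n_classes)]
soft_prediction = max(range(n_classes), key=lambda c: mean_proba[c])

print("hard voting predicts class", hard_prediction)
print("soft voting predicts class", soft_prediction)
```

Hard voting follows the two lukewarm votes, whereas soft voting lets the single confident classifier tip the average.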

#### 3. Is it possible to speed up training of a bagging ensemble by distributing it across multiple servers? What about pasting ensembles, boosting ensembles, Random Forests, or stacking ensembles?

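Yes for all the ensembles mentioned except boosting. **In bagging ensembles, pasting ensembles, and Random Forests, every predictor is trained independently of the others, so they can be trained in parallel.** Distributing them across multiple servers will therefore speed up training. In a stacking ensemble, the predictors within a given layer are likewise independent and can be trained in parallel, but each layer can only be trained after the layer before it.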

However, **Boosting** trains its predictors sequentially, each one working on the mistakes of its predecessor. Since every predictor depends on the previous one, distributing training across multiple servers gains nothing.
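
On a single machine, the same distinction shows up in scikit-learn's API. The sketch below uses `n_jobs`, which spreads work over CPU cores rather than servers, as a small-scale stand-in; the synthetic dataset and the `_demo` names are assumptions made for the illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier

# Small synthetic dataset, only for illustration.
X_demo, y_demo = make_classification(n_samples=500, n_features=20, random_state=42)

# The trees of a Random Forest are independent, so they can be built concurrently.
forest = RandomForestClassifier(n_estimators=100, n_jobs=2, random_state=42)
forest.fit(X_demo, y_demo)

# Gradient Boosting fits each tree to the errors of the ones before it;
# scikit-learn's GradientBoostingClassifier has no n_jobs parameter.
boosted = GradientBoostingClassifier(n_estimators=100, random_state=42)
boosted.fit(X_demo, y_demo)

print(len(forest.estimators_), "independent trees,",
      boosted.n_estimators_, "sequential boosting stages")
```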

#### 4. What is the benefit of out-of-bag evaluation?

In a **bagging** ensemble, each predictor is trained on a bootstrap sample drawn with replacement, so some training instances are picked several times for a given predictor while others are not picked at all for it. The instances a predictor never saw during training are its **out-of-bag instances**, and they can serve as a validation set for that predictor: each instance is evaluated using only the predictors that were not trained on it.

**Since those predictors have never seen these instances during training, the result is a fairly unbiased estimate of the ensemble's performance.** There is no need to hold out a separate validation set, so more instances are available for training, which may slightly improve the ensemble's performance.
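
A quick way to see how many instances end up out-of-bag: when a bootstrap sample of m instances is drawn from m instances, each predictor misses roughly 37% of the training set (the fraction tends to 1/e as m grows). A minimal sketch in plain Python; the sizes and the seed are arbitrary choices for the illustration.

```python
import random

random.seed(42)  # fixed seed so the sketch is repeatable

n_instances = 10_000
n_predictors = 5
oob_fractions = []

for i in range(n_predictors):
    # Bootstrap sample: draw n_instances indices with replacement.
    in_bag = {random.randrange(n_instances) for _ in range(n_instances)}
    # Everything that was never drawn is out-of-bag for this predictor.
    oob_fraction = 1 - len(in_bag) / n_instances
    oob_fractions.append(oob_fraction)
    print(f"predictor {i}: {oob_fraction:.1%} of the training set is out-of-bag")
```

In scikit-learn, this evaluation is requested with `oob_score=True` on `BaggingClassifier` or `RandomForestClassifier`, and the result is stored in `oob_score_`.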

#### 5. What makes Extra-Trees more random than regular Random Forests? How can this extra randomness help? Are Extra-Trees slower or faster than regular Random Forests?

Like a Random Forest, **Extra-Trees** consider only a random subset of the features at each node. The extra randomness comes from the thresholds: instead of searching for the best possible threshold for each candidate feature (using the Gini or entropy criterion), they draw a random threshold for each feature and keep the best of these random splits.

They thereby skip the search for the best threshold of every candidate feature at every node, which is one of the most time-consuming parts of growing a tree, so Extra-Trees train much faster than regular Random Forests. At prediction time, however, they are neither faster nor slower. **This technique trades more bias for a lower variance.**

**Additionally, the extra randomness acts like a form of `regularization`-- if a random forest overfits the training data, Extra-Trees might perform better.**
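
The difference between the two ways of choosing a threshold can be sketched on a single feature. This is a toy illustration with binary labels and made-up data, not scikit-learn's implementation; all function names are hypothetical.

```python
import random

def gini_of_split(values, labels, threshold):
    """Weighted Gini impurity of the two children produced by `value <= threshold`."""
    def gini(subset):
        if not subset:
            return 0.0
        p1 = sum(subset) / len(subset)  # fraction of class 1 (binary labels)
        return 2 * p1 * (1 - p1)
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    n = len(labels)
    return len(left) / n * gini(left) + len(right) / n * gini(right)

def best_threshold(values, labels):
    """Random-Forest style: evaluate every midpoint between sorted values."""
    xs = sorted(set(values))
    candidates = [(a + b) / 2 for a, b in zip(xs, xs[1:])]
    return min(candidates, key=lambda t: gini_of_split(values, labels, t))

def random_threshold(values, rng):
    """Extra-Trees style: draw one threshold uniformly between min and max."""
    return rng.uniform(min(values), max(values))

rng = random.Random(0)
feature = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
target = [0, 0, 0, 1, 0, 1, 1, 1]

t_best = best_threshold(feature, target)
t_rand = random_threshold(feature, rng)
print("exhaustive search:", t_best, gini_of_split(feature, target, t_best))
print("random draw:      ", t_rand, gini_of_split(feature, target, t_rand))
```

The exhaustive search evaluates every candidate threshold, while the random draw evaluates one. The random split is never purer than the best one, which is where the extra bias comes from.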

#### 6. If your AdaBoost ensemble underfits the training data, which hyperparameters should you tweak and how?

Adding more predictors to the ensemble, i.e. increasing **n_estimators**, can reduce the underfitting.

**Reducing the regularization of the base estimator** or **slightly increasing the learning rate** are viable options too.
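
As a rough sketch of those knobs in scikit-learn, the snippet below compares a heavily constrained AdaBoost ensemble with a loosened one on a small synthetic dataset. The dataset and the specific values are assumptions chosen for the illustration, not recommendations.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier

# Small synthetic dataset, only for illustration.
X_demo, y_demo = make_classification(n_samples=500, n_features=20, random_state=0)

# A heavily constrained ensemble: few stumps and a small learning rate.
constrained = AdaBoostClassifier(DecisionTreeClassifier(max_depth=1),
                                 n_estimators=10, learning_rate=0.1, random_state=0)

# Loosened up: more estimators, deeper base trees, larger learning rate.
flexible = AdaBoostClassifier(DecisionTreeClassifier(max_depth=3),
                              n_estimators=200, learning_rate=1.0, random_state=0)

constrained_acc = constrained.fit(X_demo, y_demo).score(X_demo, y_demo)
flexible_acc = flexible.fit(X_demo, y_demo).score(X_demo, y_demo)
print("training accuracy:", constrained_acc, "->", flexible_acc)
```

Only the training accuracy is compared here because the question is about underfitting; a validation set is still needed to check that the looser model does not overfit.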

#### 7. If your Gradient Boosting ensemble overfits the training set, should you increase or decrease the learning rate?

In case of overfitting, the learning rate should be decreased, so that each successive predictor contributes less to the ensemble.

You could also use early stopping to find the right number of predictors (you probably have too many).
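
Both remedies can be combined in scikit-learn's `GradientBoostingClassifier`. A minimal sketch on a synthetic dataset; the parameter values are arbitrary choices for the illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

# Small synthetic dataset, only for illustration.
X_demo, y_demo = make_classification(n_samples=1000, n_features=20, random_state=0)

gbrt = GradientBoostingClassifier(
    learning_rate=0.05,       # smaller steps: each tree contributes less
    n_estimators=500,         # upper bound on the number of trees
    validation_fraction=0.2,  # held out internally to monitor the loss
    n_iter_no_change=10,      # stop after 10 rounds without improvement
    random_state=0,
)
gbrt.fit(X_demo, y_demo)
print("trees actually trained:", gbrt.n_estimators_)
```

With `n_iter_no_change` set, training stops as soon as the internal validation score stops improving, so `n_estimators` acts only as a ceiling.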

### 8. 
Load the MNIST data, and split it into a training set, a validation set, and a test set (use 50,000 instances for training, 10,000 for testing, and 10,000 for validation). Then train various classifiers, such as a Random Forest classifier, an Extra-Trees classifier, and an SVM classifier. Next, try to combine them into an ensemble that outperforms each individual classifier on the validation set, using soft or hard voting. Once you've found one, try it on the test set. How much better does it perform compared to the individual classifiers?

In [1]:
import pandas as pd
import numpy as np
from sklearn.datasets import fetch_openml
from sklearn.model_selection import train_test_split

from sklearn.ensemble import RandomForestClassifier
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.svm import SVC

from sklearn.ensemble import VotingClassifier



In [4]:
## loading the MNIST dataset

mnist = fetch_openml('mnist_784', version=1)
X, Y = mnist["data"], mnist["target"]


## Split off 50,000 training instances; the remaining 20,000 are held out
x_train, x_rest, y_train, y_rest = train_test_split(X, Y, train_size=50000,
                                                    random_state=1999)
print(f"Training Instances: {x_train.shape[0]}")


## Now, split the held-out instances into test and validation subsets
x_test, x_val, y_test, y_val = train_test_split(x_rest, y_rest, train_size=10000,
                                                random_state=1999)
print(f"Test Instances: {x_test.shape[0]}\nValidation Instances: {x_val.shape[0]}")

Training Instances: 50000
Test Instances: 10000
Validation Instances: 10000


#### Training various classifiers:

In [5]:
## Training a RandomForestClassifier

rand_clf = RandomForestClassifier(random_state=42)
rand_clf.fit(x_train, y_train)

In [6]:
## Training an ExtraTreesClassifier

extra_trees_clf = ExtraTreesClassifier(random_state=42)
extra_trees_clf.fit(x_train, y_train)

In [7]:
## Training a SupportVectorClassifier

svm_clf = SVC(kernel='rbf', probability=True, random_state=42)
svm_clf.fit(x_train, y_train)

#### Ensemble of the above classifiers:

In [8]:
soft_voting_clf = VotingClassifier(estimators=[('RandomForestClassifier', rand_clf), 
                                               ('ExtraTreesClassifier', extra_trees_clf),
                                               ('SVC', svm_clf)], voting='soft')
soft_voting_clf.fit(x_train, y_train)

#### Saving these models locally:

In [9]:
import os, joblib

os.makedirs("artifacts", exist_ok=True)

## Pass the path directly so joblib opens and closes each file itself
joblib.dump(rand_clf, "./artifacts/RandomForestClf_mod.joblib")
joblib.dump(extra_trees_clf, "./artifacts/ExtraTreesClf_mod.joblib")
joblib.dump(svm_clf, "./artifacts/SVC_mod.joblib")
joblib.dump(soft_voting_clf, "./artifacts/SoftVotingClf_mod.joblib")

#### Evaluation on the Validation set

In [10]:
print("Performances:")
print("RandomForestClassifier: ", rand_clf.score(x_val, y_val))
print("ExtraTreesClassifier: ", extra_trees_clf.score(x_val, y_val))
print("SupportVectorClassifier: ", svm_clf.score(x_val, y_val))

print("\nEnsembleOfAllThree Classifier: ", soft_voting_clf.score(x_val, y_val))

Performances:
RandomForestClassifier:  0.964
ExtraTreesClassifier:  0.9691
SupportVectorClassifier:  0.9783

EnsembleOfAllThree Classifier:  0.9785


#### Now, Evaluation on the test set:

In [11]:
print("Performances on the TestSet:")
print("RandomForestClassifier: ", rand_clf.score(x_test, y_test))
print("ExtraTreesClassifier: ", extra_trees_clf.score(x_test, y_test))
print("SupportVectorClassifier: ", svm_clf.score(x_test, y_test))

print("\nEnsembleOfAllThree Classifier: ", soft_voting_clf.score(x_test, y_test))

Performances on the TestSet:
RandomForestClassifier:  0.9677
ExtraTreesClassifier:  0.9715
SupportVectorClassifier:  0.9768

EnsembleOfAllThree Classifier:  0.9782


#### => A measly improvement (0.9782 vs. 0.9768 for the best individual classifier, the SVC), but point proved! : )

### 9. 
Run the individual classifiers from above to make predictions on the validation set, and create a new training set with the resulting predictions: each training instance is a vector containing the set of predictions from all your classifiers for an image, and the target is the image's class.<br>Train a classifier on this new training set. Congrats, you have just trained a blender, and together with the classifiers it forms a stacking ensemble!<br>Now, evaluate the ensemble on the test set. For each image in the test set, make predictions with all your classifiers, then feed the predictions to the blender to get the ensemble's predictions. How does it compare to the voting classifier you trained earlier?

#### Treating predictions from each classifier as features to be trained by the blender:

In [22]:
f1 = rand_clf.predict(x_val).astype(int)
f2 = extra_trees_clf.predict(x_val).astype(int)
f3 = svm_clf.predict(x_val).astype(int)

X_new = pd.DataFrame(np.c_[f1, f2, f3], columns=["yhat__rand_clf", "yhat__extra_trees_clf",
                                                "yhat__svc"])

In [23]:
## Let's first convert each target set into 'int' datatype

y_train = y_train.astype('int')
y_test = y_test.astype('int')
y_val = y_val.astype('int')

#### Training a new classifier, the `blender`, on the new training set:

In [27]:
import xgboost as xg
from sklearn.metrics import accuracy_score
from sklearn.model_selection import GridSearchCV

grid_params = {
    "eta": [.3, .5, 1],
    "colsample_bytree": [.2, .5, 1],
    "colsample_bylevel": [.5, 1],
    "colsample_bynode": [.5, 1],
    "max_depth": [3, 6],
    "random_state": [42],
    "n_estimators": [100, 300, 500]
}

grid_search = GridSearchCV(xg.XGBClassifier(), param_grid=grid_params, cv=5, verbose=2)
grid_search.fit(X_new, y_val)

Fitting 5 folds for each of 216 candidates, totalling 1080 fits
[CV] END colsample_bylevel=0.5, colsample_bynode=0.5, colsample_bytree=0.2, eta=0.3, max_depth=3, n_estimators=100, random_state=42; total time=   0.6s
[CV] END colsample_bylevel=0.5, colsample_bynode=0.5, colsample_bytree=0.2, eta=0.3, max_depth=3, n_estimators=100, random_state=42; total time=   0.6s
[CV] END colsample_bylevel=0.5, colsample_bynode=0.5, colsample_bytree=0.2, eta=0.3, max_depth=3, n_estimators=100, random_state=42; total time=   0.5s
[CV] END colsample_bylevel=0.5, colsample_bynode=0.5, colsample_bytree=0.2, eta=0.3, max_depth=3, n_estimators=100, random_state=42; total time=   0.5s
[CV] END colsample_bylevel=0.5, colsample_bynode=0.5, colsample_bytree=0.2, eta=0.3, max_depth=3, n_estimators=100, random_state=42; total time=   0.5s
[CV] END colsample_bylevel=0.5, colsample_bynode=0.5, colsample_bytree=0.2, eta=0.3, max_depth=3, n_estimators=300, random_state=42; total time=   1.8s

In [28]:
## Best params

grid_search.best_params_

{'colsample_bylevel': 0.5,
 'colsample_bynode': 0.5,
 'colsample_bytree': 0.2,
 'eta': 0.3,
 'max_depth': 3,
 'n_estimators': 300,
 'random_state': 42}

In [29]:
## Best estimator

blender_xgb = grid_search.best_estimator_
blender_xgb

In [30]:
## Saving the best XGBoostClassifier locally

joblib.dump(blender_xgb, "./artifacts/blender_xgb.joblib")

#### Prediction:

In [31]:
def evaluate_stacking_ensemble(x_sample=x_test, y_sample=y_test):
    """
    Evaluate the stacking ensemble on (x_sample, y_sample) with the same
    two-layer pipeline that was used for training.
    """
    # Predictions from layer 1
    f1 = rand_clf.predict(x_sample).astype('int')
    f2 = extra_trees_clf.predict(x_sample).astype('int')
    f3 = svm_clf.predict(x_sample).astype('int')
    
    # Turn the layer-1 predictions into features for the second layer
    X_new = pd.DataFrame(np.c_[f1, f2, f3], columns=["yhat__rand_clf", "yhat__extra_trees_clf",
                                                     "yhat__svc"])
    
    # Final predictions via the blender
    y_pred = blender_xgb.predict(X_new)
    
    # Score against the labels that belong to x_sample, not always y_test
    return accuracy_score(y_sample, y_pred)

evaluate_stacking_ensemble()

0.9758

#### => Our `VotingClassifier` prevailed (0.9782 vs. 0.9758 on the test set).
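
For comparison, scikit-learn also ships a ready-made `StackingClassifier`. It differs from the hand-rolled blender above in two ways: by default it feeds the blender class probabilities rather than hard labels, and it trains the blender on out-of-fold predictions instead of a separate validation set. The sketch below is a different variant, not a reproduction of the pipeline above. It assumes the small 8x8 digits dataset as a quick stand-in for MNIST and a logistic-regression blender.

```python
from sklearn.datasets import load_digits
from sklearn.ensemble import (ExtraTreesClassifier, RandomForestClassifier,
                              StackingClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Small 8x8 digits dataset as a quick stand-in for MNIST.
X_digits, y_digits = load_digits(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X_digits, y_digits, random_state=42)

stacking_clf = StackingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(random_state=42)),
        ("et", ExtraTreesClassifier(random_state=42)),
        ("svc", SVC(probability=True, random_state=42)),
    ],
    final_estimator=LogisticRegression(max_iter=1000),  # the blender
    cv=5,  # the blender is trained on out-of-fold predictions
)
stacking_clf.fit(X_tr, y_tr)
print("stacking accuracy:", stacking_clf.score(X_te, y_te))
```

Whether this variant would beat the voting classifier on MNIST is not something the results above can tell; it would have to be run and compared on the same test set.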