### ENSEMBLE TECHNIQUES DEMYSTIFIED

Let me guess why you're here: either you're in a data science competition and read somewhere that the winners almost always use ensembles, or you're just a curious data scientist who wants to learn about them.

Whichever you are, knowledge of ensembles is one of the most important skills every data scientist or machine learning engineer should have.
A well-built ensemble usually outperforms a single model, and it is a recommended technique for squeezing out more accuracy or lower error from a machine learning project.

If you've done a couple of data science projects, then you have probably used a type of ensemble. Popular algorithms like RandomForest, AdaBoost, XGBoost or CatBoost are different implementations of ensembles.

In this article, we will walk through the basic concept of ensembles and you'll learn just enough to construct good ones. So let's begin.

##### What you will learn:

1. A brief introduction to ensembles
2. Introduction to the sample data
3. Simple Ensemble Techniques
     a. Averaging
     b. Weighted Average
     c. Max Voting
4. Advanced Ensemble Techniques
     a. Bagging
     b. Boosting
     c. Stacking

#### A brief introduction to Ensembles

Suppose you want to save money to buy a new laptop, but you do not know how much it sells for and as such cannot set a saving target. Of course, we assume you're out of power and access to the internet because of an alien invasion the previous day.
One reasonable way to find out the price is to ask someone, presumably a friend (we don't want you stopping people on their way to work to ask about laptop prices!).

You find your friend and ask him; he thinks for a moment and mumbles something around $900. Your friend may love tech gadgets, but he certainly does not know the actual price of the laptop. All you really care about is that he gives you an estimate close to the true price (your friend is a single model).

Next, you find five of your colleagues from work. Luckily, they are arguing about Mac vs PC for software development. You jump in and pose the question: what is the price of your dream laptop? Well, as you guessed, they all had a thing or two to say.

Person 1 said $1000

Person 2 said $950

Person 3 said $800

Person 4 said $1100

Person 5 said $900

You look at the prices from the five people and notice that they all fall within a certain range ($800 to $1100). You decide to take the average, which is $950. Well, I'm happy to inform you that you just created your first ensemble: a combination of five predictions, averaged.
Surely you would trust this average more than your one friend, right? I would, unless your friend is the dealer.

That, my friend, is the intuition behind ensembles, although what we illustrated above is just a simple type of ensemble called __averaging__, used for regression problems. As we proceed, we'll see other techniques.
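
The laptop story can be written down in a few lines. This is only an illustrative sketch: the variable names are made up for this example, and the numbers are the five estimates quoted above.

```python
# Price estimates from the five colleagues (each one plays the role of a model)
estimates = [1000, 950, 800, 1100, 900]

# An averaging ensemble simply takes the mean of the individual predictions
ensemble_estimate = sum(estimates) / len(estimates)

print(ensemble_estimate)  # 950.0
```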

Formally, [Wikipedia](https://en.wikipedia.org/wiki/Ensemble_learning) describes an ensemble as a method that uses multiple learning algorithms to obtain better predictive performance than could be obtained from any of the constituent learning algorithms alone.


#### Introduction to the sample data

For this tutorial, we're going to use a [German bank Credit data](). Let's take a peek at the data.



In [3]:
import pandas as pd
import numpy as np
from statistics import mode

german_cred = pd.read_csv('credit_preped.csv')
german_cred.head()

Unnamed: 0,customer_id,loan_duration_mo,loan_amount,payment_pcnt_income,time_in_residence,age_yrs,number_loans,dependents,bad_credit,checking_account_status_0 - 200 DM,...,home_ownership_own,home_ownership_rent,job_category_highly skilled,job_category_skilled,job_category_unemployed-unskilled-non-resident,job_category_unskilled-resident,telephone_none,telephone_yes,foreign_worker_no,foreign_worker_yes
0,1122334,6,1169,4,4,67,2,1,0,0,...,1,0,0,1,0,0,0,1,0,1
1,6156361,48,5951,2,2,22,1,1,1,1,...,1,0,0,1,0,0,1,0,0,1
2,2051359,12,2096,2,3,49,1,2,0,0,...,1,0,0,0,0,1,1,0,0,1
3,8740590,42,7882,2,4,45,1,2,0,0,...,0,0,0,1,0,0,1,0,0,1
4,3924540,24,4870,3,4,53,2,2,1,0,...,0,0,0,1,0,0,1,0,0,1


Note: The data has already been cleaned and all features have been converted to numerical types. The data was originally meant for a classification problem,
i.e. the task was to predict whether a customer's loan application would result in good or bad credit.

Since we'll be explaining ensembles for regression as well as classification tasks, we'll often rephrase the problem using the same data. For classification ensembles, we'll use the feature __bad_credit__ as the target, and for regression tasks we'll use __age_yrs__.

First, we'll import some modules we'll be using and then we'll drop the customer_id column

In [4]:
german_cred.drop('customer_id', axis=1, inplace=True)

#Metric calculations
from sklearn.model_selection import KFold, cross_val_score
from sklearn.metrics import mean_absolute_error, accuracy_score


from sklearn.preprocessing import RobustScaler
from sklearn.model_selection import train_test_split

#set seed
rand_seed = 234
#call the function (assigning to np.random.seed would overwrite it instead of seeding)
np.random.seed(rand_seed)

Before we start creating our ensembles, let's train single models and get their performance.
For classification we use three simple models: a Support Vector Classifier (SVC), Logistic Regression and a K-Nearest Neighbors Classifier.
For the regression task, we'll use Linear Regression, a Support Vector Regressor (SVR) and a K-Nearest Neighbors Regressor.

We append these models to a list and loop over each as we train and cross validate.

In [5]:
#Import single models
from sklearn.linear_model import LinearRegression, LogisticRegression
from sklearn.svm import SVR, SVC
from sklearn.neighbors import KNeighborsRegressor,KNeighborsClassifier

#Classification models
log_cf = LogisticRegression(solver='lbfgs', random_state=rand_seed)
svc_cf = SVC(gamma='scale', random_state=rand_seed)
knn_cf = KNeighborsClassifier()

classification_models = [log_cf, svc_cf, knn_cf]

#Regression models
linear_reg = LinearRegression()
svr_reg = SVR(gamma='scale')
knn_reg = KNeighborsRegressor()

regression_models = [linear_reg, svr_reg, knn_reg]


Next, we define some functions:
The first is used to standardize the data set. 

The second to split data into train and validation set.

The third and fourth measure performance for regression and classification tasks respectively.

The final function is used to train and validate models.

In [6]:
#Define a function to standardize the data set
def standardize_data(df):
    scaler = RobustScaler()
    data = scaler.fit_transform(df)
    return data

#Create a function to split our data into train and validation sets for both tasks

def get_split_data(features, target_name=None):
    ## Get the target column
    target = features[target_name]
    ## Drop the target from the data
    temp_data = features.drop(target_name, axis=1)
    temp_data = standardize_data(temp_data)
    
    #split data
    X_train, X_val, y_train, y_val = train_test_split(temp_data, target, test_size=0.1)
    return (X_train, X_val, y_train, y_val)
    
def get_mae(pred, true_value):
    return mean_absolute_error(true_value, pred)


def get_acc(pred, true_value):
    return accuracy_score(true_value, pred) * 100

# A Function to train and cross validate a model
def model_train(model, features=None, target_name=None, nfolds = 10, task = 'class'):
    ## Get the target column
    target = features[target_name]
    ## Drop the target from the data
    temp_data = features.drop(target_name, axis=1)
    temp_data = standardize_data(temp_data)
    
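    if task == 'reg':
        #cross_val_score returns one negated MAE per fold; multiply by -1 to get positive errors
        score = -1 * (cross_val_score(model, temp_data, target, cv=nfolds, scoring='neg_mean_absolute_error'))
        #Note: score[0] is the first fold only; use score.mean() for the average over all folds
        print("Mean Absolute Error of {} is {}".format(model.__class__.__name__, round(score[0], 4)))
        print("-------------------------------------")

    else:
        score = cross_val_score(model, temp_data, target, cv=nfolds, scoring='accuracy')
        #Note: score[0] is the first fold only; use score.mean() for the average over all folds
        print("Accuracy of {} is {} %".format(model.__class__.__name__, round(score[0] * 100)))
        print("-------------------------------------")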



Now that we have our functions, let's test the base models. Remember, we use the feature __bad_credit__ for classification and the feature __age_yrs__ for regression.

In [7]:
#Classification 
for model in classification_models:
    model_train(model, features=german_cred, target_name='bad_credit')
    
#Regression  
for model in regression_models:
    model_train(model, features=german_cred, target_name='age_yrs', task='reg')

Accuracy of LogisticRegression is 82.0 %
-------------------------------------
Accuracy of SVC is 82.0 %
-------------------------------------
Accuracy of KNeighborsClassifier is 74.0 %
-------------------------------------
Mean Absolute Error of LinearRegression is 8.0344
-------------------------------------
Mean Absolute Error of SVR is 8.8995
-------------------------------------
Mean Absolute Error of KNeighborsRegressor is 8.508
-------------------------------------


Note: We are not trying to compare algorithms in this article, so we'll not perform heavy hyperparameter tuning, and most of the time we'll just use the out-of-the-box parameters.
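
One detail worth knowing about the numbers printed above: `cross_val_score` returns one score per fold, and the helper function prints only the first of them (`score[0]`). A common alternative is to report the mean and spread over all folds. The sketch below shows this on synthetic stand-in data generated with `make_classification`, which is an assumption made for this example and not the credit data.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic stand-in data (an assumption for this sketch, not the credit data)
X, y = make_classification(n_samples=500, n_features=10, random_state=0)

model = LogisticRegression(solver='lbfgs', max_iter=1000)

# One accuracy value per fold
scores = cross_val_score(model, X, y, cv=10, scoring='accuracy')

print("first fold only :", round(scores[0] * 100, 1))
print("mean of 10 folds:", round(scores.mean() * 100, 1))
print("std of 10 folds :", round(scores.std() * 100, 1))
```

The mean over folds is usually a steadier estimate than any single fold, which can swing by several points on a data set of this size.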

Now that we have our base models, let's learn about ensembles.

### Simple Ensemble Techniques

#### Averaging

Averaging is the simplest and most intuitive type of ensemble for regression tasks. Just like the name implies, it combines predictions from different models and takes the average/mean.
For example, since we're predicting age, if our three base models predicted 24, 23 and 26 respectively, we would take the average as (24 + 23 + 26) / 3 which is approx. 24.3. This becomes our prediction. 
Let's see this in code:

In [104]:
#get the data sets
X_train, X_val, y_train, y_val = get_split_data(german_cred, target_name='age_yrs')

#fit base models
linear_reg.fit(X_train, y_train)
knn_reg.fit(X_train, y_train)
svr_reg.fit(X_train, y_train)

#make predictions with trained models
pred1 = linear_reg.predict(X_val)
pred2 = knn_reg.predict(X_val)
pred3 = svr_reg.predict(X_val)

#Take average as final prediction
avgpred = (pred1 + pred2 + pred3) / 3

Now, do you think the average prediction does better than the single model? Well, let's find out. We'll calculate the mean absolute error of the individual models and compare with the average.

We'll see that the average prediction gives us the lowest MAE and as such does better.

In [117]:
print("Linear Regression Model")
print(get_mae(pred1, y_val))
print("KNN Regression Model")
print(get_mae(pred2, y_val))
print("SVR Regression Model")
print(get_mae(pred3, y_val))
print("Average Model")
print(get_mae(avgpred, y_val))

Linear Regression Model
6.86126953125
KNN Regression Model
7.078
SVR Regression Model
6.9133617373257
Average Model
6.786586857234646


#### Weighted Average

Weighted Average is a modification of Averaging. The intuition behind it is that some of the base models we want to average may have higher predictive power than others, so a plain average may not capture each model's individual ability. In cases like this, we assign different weights to different models based on their predictive ability.

Note: Weights here simply mean non-negative decimal numbers that add up to 1.
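
To make the arithmetic concrete, here is a minimal sketch using the three example predictions from the Averaging section (24, 23 and 26). The weights of 0.5, 0.25 and 0.25 are illustrative choices for this example only.

```python
import numpy as np

# Example predictions from three models for a single customer
preds = np.array([24, 23, 26])

# Illustrative weights: non-negative and summing to 1
weights = np.array([0.5, 0.25, 0.25])
assert np.isclose(weights.sum(), 1.0)

simple_avg = preds.mean()                          # every model counts equally
weighted_avg = np.average(preds, weights=weights)  # first model counts twice as much

print(round(simple_avg, 2))    # 24.33
print(round(weighted_avg, 2))  # 24.25
```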

Looking at the MAEs of our base regression models above, we see that the Linear Regression model does better than the others, so let's assign a higher weight to it. 

We demonstrate this below by assigning 0.5 to the linear model and 0.25 to each of the other two. This can be interpreted as saying "give the linear model twice the weight of each of the other two".

We'll observe that the weighted average does better than the single models, and slightly better than the averaging ensemble.

In [120]:
#fit base models
linear_reg.fit(X_train, y_train)
knn_reg.fit(X_train, y_train)
svr_reg.fit(X_train, y_train)

#make predictions with trained models
pred1 = linear_reg.predict(X_val)
pred2 = knn_reg.predict(X_val)
pred3 = svr_reg.predict(X_val)

#Take weighted average as final prediction
w_avgpred = (0.5 * pred1 + 0.25 * pred2 + 0.25 * pred3)

print("Linear Regression Model")
print(get_mae(pred1, y_val))
print("KNN Regression Model")
print(get_mae(pred2, y_val))
print("SVR Regression Model")
print(get_mae(pred3, y_val))
print("Weighted Average Model")
print(get_mae(w_avgpred, y_val))

Linear Regression Model
6.86126953125
KNN Regression Model
7.078
SVR Regression Model
6.9133617373257
Weighted Average Model
6.753724322613486


#### Max Voting

Max voting is similar to averaging, except that it is used for classification problems. In max voting, as the name implies, we train multiple models, make predictions and then take the most popular (modal) class as the predicted class.

Let's understand this better with an example: suppose you ask your friends to rate the laptop you are saving up for.

On a scale of 1 to 5, your friends rated it as follows:

Friend 1 = 3

Friend 2 = 5

Friend 3 = 4

Friend 4 = 3

Friend 5 = 3

Now, looking at the ratings, if we use max voting, we simply pick the rating that occurs most often, which is 3 (it received three of the five votes).
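
As a quick check of the idea, the sketch below counts the votes for the five ratings above, then applies the same rule row by row to class predictions. The three prediction arrays are hypothetical values invented for this example.

```python
from collections import Counter

import numpy as np

# Ratings given by the five friends
ratings = [3, 5, 4, 3, 3]

# most_common(1) returns a list holding the (value, count) pair with the highest count
winner, votes = Counter(ratings).most_common(1)[0]
print(winner, votes)  # 3 3

# The same idea applied row by row to class predictions from three hypothetical classifiers
pred_a = np.array([0, 1, 1, 0])
pred_b = np.array([0, 0, 1, 1])
pred_c = np.array([1, 1, 1, 0])
stacked = np.column_stack([pred_a, pred_b, pred_c])

# np.bincount counts how often each class label occurs in a row; argmax picks the most frequent
majority = np.array([np.bincount(row).argmax() for row in stacked])
print(majority)  # [0 1 1 0]
```

With an odd number of binary classifiers there can be no ties; with more classes or an even number of models, you need to decide how ties are broken (here `argmax` picks the smallest label).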

We demonstrate this in code below.

Note: we are now working on a classification task, so we're predicting __bad_credit__

In [79]:
#get the data sets
X_train, X_val, y_train, y_val = get_split_data(german_cred, target_name='bad_credit')

#fit single models
log_cf.fit(X_train, y_train)
knn_cf.fit(X_train, y_train)
svc_cf.fit(X_train, y_train)

#make predictions with trained models
pred1 = log_cf.predict(X_val)
pred2 = knn_cf.predict(X_val)
pred3 = svc_cf.predict(X_val)

#Take max voting as final prediction
maxpred = []

for i in range(0, len(X_val)):
    #calculate the mode and append to maxpred vector
    maxpred.append(mode([pred1[i], pred2[i], pred3[i]]))
    
    
print("Logistic Regression Model")
print(get_acc(pred1, y_val))
print("KNN Classifier Model")
print(get_acc(pred2, y_val))
print("SVC Classifier Model")
print(get_acc(pred3, y_val))

print("Max Voting Model")
print(get_acc(np.array(maxpred), y_val))

Logistic Regression Model
70.0
KNN Classifier Model
69.0
SVC Classifier Model
71.0
Max Voting Model
70.0


For convenience, it is worth mentioning that sklearn has an implementation of max voting (VotingClassifier) that you can use. An example of using this module is shown below.

In [80]:
#Import the module
from sklearn.ensemble import VotingClassifier

#Pass the classifiers as a list of tuples with model names and the models themselves
max_model = VotingClassifier(estimators=[('logistic_reg', log_cf), ('KNN Classifier', knn_cf), ("SVC", svc_cf)], voting='hard')
max_model.fit(X_train, y_train)

print("Max Voting in sklearn")
print(get_acc(max_model.predict(X_val), y_val))

Max Voting in sklearn
70.0


### Advanced Ensemble Techniques

#### Bagging

The intuition behind bagging (Bootstrap Aggregating) is quite simple. It is similar to averaging, with one small change to the data set we use to train the models.

In averaging we train multiple models on the same data set and take the average, but in bagging we train each model on a different sub-sample of the original data set before combining the predictions.

One question you may ask is: if we train on subsets of a data set, are we not still training on the same data set? Will the results not be similar?
The answer is a qualified "no". We are not training on exactly the same data set. The sub-samples are similar, but because each one is drawn with replacement (a bootstrap sample), it leaves out roughly a third (about 37%) of the original observations on average, so each model sees a somewhat different view of the data.
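
The two ingredients, bootstrap sampling and aggregation, can be sketched with plain NumPy. This is a toy illustration under stated assumptions: the data is synthetic, the base model is a straight line fitted with `np.polyfit`, and none of the names come from the credit data example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression data: y = 2x + 1 plus noise
n = 1000
x = rng.uniform(0, 10, size=n)
y = 2 * x + 1 + rng.normal(0, 2, size=n)

n_models = 25
coefs = []
unique_fractions = []

for _ in range(n_models):
    # Bootstrap sample: draw n row indices WITH replacement
    idx = rng.integers(0, n, size=n)
    unique_fractions.append(len(np.unique(idx)) / n)

    # Fit a simple base model (a straight line) on this bootstrap sample
    slope, intercept = np.polyfit(x[idx], y[idx], deg=1)
    coefs.append((slope, intercept))

# Aggregate: average the predictions of all base models
x_new = np.array([2.0, 5.0, 8.0])
all_preds = np.array([s * x_new + b for s, b in coefs])
bagged_pred = all_preds.mean(axis=0)

print("mean share of distinct rows per sample:", round(float(np.mean(unique_fractions)), 3))
print("bagged predictions:", bagged_pred)
```

The share of distinct rows should come out close to 0.63, which is where the "about 37% left out" figure comes from.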

More on why bagging works can be found in this [paper]()

While it is possible to write your own bagging algorithm, it is advisable to use already existing bagging implementations. 
Many popular algorithms like RandomForest and ExtraTrees are implementations of bagging. 

In the sklearn library, it is also possible to create your own bagging classifier or regressor from a specified base model.

Let's see this in code below.
First, we import some bagging implementations from sklearn. We also import the bagging meta-estimators, which allow us to choose our own base model.

Note: As earlier said, the goal of the tutorial is to teach you the techniques of ensembling, not how to fine tune the models. So the MAEs or Accuracy may be high or low depending on the default parameter we use. 

In [8]:
#Bagging and Boosting models for both classification and regression problems
from sklearn.ensemble import RandomForestRegressor, ExtraTreesRegressor, BaggingRegressor
from sklearn.ensemble import RandomForestClassifier, ExtraTreesClassifier, BaggingClassifier
from sklearn.ensemble import GradientBoostingRegressor, AdaBoostRegressor
#import xgboost as xgb


#bagging algorithms for regression
rand_forest_reg = RandomForestRegressor(n_estimators=100, random_state=rand_seed)
extra_tree_reg = ExtraTreesRegressor(n_estimators=100,random_state=rand_seed)
#We use SVR as our base model for bagging
bagging_meta_reg = BaggingRegressor(svr_reg, n_estimators=100, random_state=rand_seed)

#bagging algorithms for classification
rand_forest_cf = RandomForestClassifier(n_estimators=100, random_state=rand_seed)
extra_tree_cf = ExtraTreesClassifier(n_estimators=100, random_state=rand_seed)
#We use svc as our base model for bagging
bagging_meta_cf = BaggingClassifier(svc_cf, n_estimators=10, random_state=rand_seed)


Next, we get our train and validation set for regression task, train the model and then test the performance.

In [32]:
#get data for regression task
X_train, X_val, y_train, y_val = get_split_data(german_cred, target_name='age_yrs')

#Train and fit these models
rand_forest_reg.fit(X_train, y_train)
extra_tree_reg.fit(X_train, y_train)
bagging_meta_reg.fit(X_train, y_train)

#check their performance
print("MAE of Random Forest is : ", get_mae(rand_forest_reg.predict(X_val), y_val))
print("MAE of Extra Trees is : ", get_mae(extra_tree_reg.predict(X_val), y_val))
print("MAE of Bagging estimator is : ", get_mae(bagging_meta_reg.predict(X_val), y_val))

MAE of Random Forest is :  6.6429
MAE of Extra Trees is :  7.4214
MAE of Bagging estimator is :  6.29636897349609



We do the same for the classification task

In [34]:
#get data for classification task
X_train, X_val, y_train, y_val = get_split_data(german_cred, target_name='bad_credit')

#Train and fit these models
rand_forest_cf.fit(X_train, y_train)
extra_tree_cf.fit(X_train, y_train)
bagging_meta_cf.fit(X_train, y_train)

#check their performance
print("ACC of Random Forest is : ", get_acc(rand_forest_cf.predict(X_val), y_val))
print("ACC of Extra Trees is : ", get_acc(extra_tree_cf.predict(X_val), y_val))
print("ACC of Bagging estimator is : ", get_acc(bagging_meta_cf.predict(X_val), y_val))

ACC of Random Forest is :  74.0
ACC of Extra Trees is :  72.0
ACC of Bagging estimator is :  75.0


#### Boosting

Boosting is another popular and effective ensembling technique. In boosting, multiple models are trained sequentially, and the goal is for each model to do better than its predecessors. This means we take into account the areas where the previous models performed poorly and improve on those areas. If we keep improving on the failures of the predecessors, it can be shown theoretically that the error on the training data can be driven very low. But the world is never perfect, and a perfect model is rarely achievable in practice.
Boosting works really well and is definitely one of the go-to techniques in data science competitions.

There are many implementations of boosting, including XGBoost, LightGBM, AdaBoost and CatBoost.
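
To show what "improving on the failures of the predecessors" can look like, here is a minimal sketch of one flavour of boosting: gradient boosting for squared error, where each new model is fitted to the residuals of the current ensemble. It is a teaching sketch under stated assumptions (synthetic one-feature data, one-split "stump" models, hypothetical helper names `fit_stump` and `predict_stump`), not how the libraries above are implemented.

```python
import numpy as np

def fit_stump(x, residual):
    """Find the single split on x that best fits the residual (least squares)."""
    xs = np.unique(x)
    thresholds = (xs[:-1] + xs[1:]) / 2  # midpoints between neighbouring values
    best = None
    for t in thresholds:
        left, right = residual[x <= t], residual[x > t]
        sse = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if best is None or sse < best[0]:
            best = (sse, t, left.mean(), right.mean())
    return best[1:]  # (threshold, left value, right value)

def predict_stump(stump, x):
    threshold, left_value, right_value = stump
    return np.where(x <= threshold, left_value, right_value)

rng = np.random.default_rng(0)
x = rng.uniform(0, 6, size=200)
y = np.sin(x) + rng.normal(0, 0.1, size=200)

learning_rate = 0.1
prediction = np.full_like(y, y.mean())  # start from a constant model
mse_history = [np.mean((y - prediction) ** 2)]

for _ in range(50):
    residual = y - prediction        # where the current ensemble is still wrong
    stump = fit_stump(x, residual)   # the next model focuses on those errors
    prediction += learning_rate * predict_stump(stump, x)
    mse_history.append(np.mean((y - prediction) ** 2))

print("MSE before boosting:", round(mse_history[0], 4))
print("MSE after 50 rounds:", round(mse_history[-1], 4))
```

The training error shrinks round after round, which is exactly why boosted models need a validation set: the error on unseen data does not keep falling forever.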

Let's see some of these boosting algorithms in code.

In [9]:
#Import boosting regression algorithms
# from xgboost import XGBRegressor
# from lightgbm import LGBMRegressor
from sklearn.ensemble import AdaBoostRegressor, GradientBoostingRegressor

#Import boosting classification algorithms
# from xgboost import XGBClassifier
# from lightgbm import LGBMClassifier
from sklearn.ensemble import AdaBoostClassifier, GradientBoostingClassifier

#Regression
#The base model is passed positionally: the keyword is `base_estimator` in older sklearn releases and `estimator` in newer ones
ada_reg = AdaBoostRegressor(svr_reg, n_estimators=100, random_state=rand_seed)
gb_reg = GradientBoostingRegressor(n_estimators=100, random_state=rand_seed)

#Classification
ada_cf = AdaBoostClassifier(log_cf, random_state=rand_seed)
gb_cf = GradientBoostingClassifier(random_state=rand_seed)


Now that we have initialized our boosting algorithms, we simply train them and calculate performance on both the regression and classification tasks.

In [10]:
#get data for regression task
X_train, X_val, y_train, y_val = get_split_data(german_cred, target_name='age_yrs')

#Train and fit these models
ada_reg.fit(X_train, y_train)
gb_reg.fit(X_train, y_train)

#check their performance
print("MAE of AdaBoost is : ", get_mae(ada_reg.predict(X_val), y_val))
print("MAE of Gradient Boosting is : ", get_mae(gb_reg.predict(X_val), y_val))


#get data for classification task
X_train, X_val, y_train, y_val = get_split_data(german_cred, target_name='bad_credit')

#Train and fit these models
ada_cf.fit(X_train, y_train)
gb_cf.fit(X_train, y_train)

#check their performance
print("ACC of AdaBoost is : ", get_acc(ada_cf.predict(X_val), y_val))
print("ACC of Gradient Boosting is : ", get_acc(gb_cf.predict(X_val), y_val))



MAE of AdaBoost is :  7.366257179997746
MAE of Gradient Boosting is :  6.850286630903513
ACC of AdaBoost is :  77.0
ACC of Gradient Boosting is :  78.0


Out of the box, without tuning hyperparameters, we can see that boosting performed better on the classification task than both bagging and the single models.

#### Stacking


Stacking is an advanced ensemble technique that has also been shown to give higher performance. It is behind many of the winning solutions in data science competitions on Kaggle.

In the stacking approach, we train the base algorithms, called first-level learners, on the original training data set and make predictions with them. The predictions from these base learners are then combined to make up a new training data set for another algorithm called a meta-learner. I.e. the output of the first-level learners serves as input for the meta-learner.

The first-level learners are often made up of different, simple and diverse algorithms, although it is possible to create stacked ensembles from the same learning algorithm.

The stacking procedure may be described as follows:


1. Split the total training set into two disjoint sets (here **train** and **test**)

2. Train several base models on the first part (**train**)

3. Test these base models on the second part (**test**)

4. Use the predictions from 3) as the inputs, and the correct responses (target) as the outputs  to train a higher level learner called **meta-model**.

The first three steps are done iteratively. If we take for example a 10-fold stacking, we first split the training data into 10 folds. Then we do 10 iterations. In each iteration, we train every base model on 9 folds and predict on the remaining fold (the holdout fold).

So we can be sure, after 10 iterations, that the entire data set has been used to get test predictions, which we use as new features to train our meta-model in step 4.

For the prediction part, we average the predictions of all base models on the test data and use them as **meta-features**, on which the final prediction is made with the meta-model.
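
The procedure above can be sketched compactly with sklearn's `cross_val_predict`, which returns exactly the out-of-fold predictions described in steps 1 to 3. The sketch rests on a few assumptions: the data is synthetic (`make_regression`), `Ridge` is an arbitrary choice of meta-model, and at prediction time the base models are refitted on the full training set instead of averaging the per-fold models, which is a common variant.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import KFold, cross_val_predict, train_test_split
from sklearn.neighbors import KNeighborsRegressor

# Synthetic stand-in data (an assumption for this sketch)
X, y = make_regression(n_samples=300, n_features=8, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

base_models = [LinearRegression(), KNeighborsRegressor()]
kfold = KFold(n_splits=5, shuffle=True, random_state=0)

# Steps 1-3: every training row is predicted by a model that never saw it
oof = np.column_stack([
    cross_val_predict(model, X_train, y_train, cv=kfold) for model in base_models
])

# Step 4: the meta-model learns how to combine the base predictions
meta_model = Ridge()
meta_model.fit(oof, y_train)

# Prediction: refit each base model on all training data, then build
# the meta-features for the unseen test rows
test_meta = np.column_stack([
    model.fit(X_train, y_train).predict(X_test) for model in base_models
])
stacked_pred = meta_model.predict(test_meta)

print("MAE of stacked model:", mean_absolute_error(y_test, stacked_pred))
```

Recent sklearn versions also ship `StackingRegressor` and `StackingClassifier` in `sklearn.ensemble`, which wrap this pattern behind the usual `fit`/`predict` interface.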

You do not need to code a stacking ensemble yourself, as there already exist many efficient implementations. Some popular ones are [ML Ensemble](http://ml-ensemble.com/) and [H2O](https://en.wikipedia.org/wiki/H2O_(software)).
__NOTE:__ For real projects, it is advisable to use one of these more efficient implementations.

In this article we'll write our own simple stacking ensemble just to demonstrate the idea.


In [11]:
from sklearn.model_selection import KFold

def stackingModel(base_models, meta_model, features, target, nfolds=10):
    #Split data into folds
    kfold = KFold(n_splits=nfolds, shuffle=True, random_state=rand_seed)
    #initialize arrays to hold predictions
    test_predictions = np.zeros((features.shape[0], len(base_models)))
    train_predictions = np.zeros((features.shape[0], len(base_models)))
    
    # Train base models
    for i, model in enumerate(base_models):
        for train_index, test_index in kfold.split(features, target):
            #Fit train data on the model
            model.fit(np.array(features)[train_index], np.array(target)[train_index])
            
            #Make prediction on the holdout data
            y_pred = model.predict(np.array(features)[test_index])
            #make predictions on train data
            t_pred = model.predict(np.array(features)[train_index])
            
            #Append the prediction to out of folds
            test_predictions[test_index, i] = y_pred
            #Append predictions to train predictions
            train_predictions[train_index, i] = t_pred


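    # Train the meta-model on the base models' predictions.
    # Note: train_predictions holds in-fold (already seen) predictions, overwritten on each fold;
    # a stricter variant would fit the meta-model on the out-of-fold test_predictions instead.
    meta_model.fit(train_predictions, target)
    # Make final predictions from the out-of-fold predictions
    # (np.mean over a one-element list returns test_predictions unchanged)
    final_preds = meta_model.predict(np.mean([test_predictions], axis=0))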
    
    return final_preds

Now that we have a simple stacking ensemble, let's train and test it on a regression task first, then on a classification task.

In [12]:
#get data for regression task
target = german_cred['age_yrs']
data = german_cred.drop('age_yrs', axis=1)
data = standardize_data(data)

#first level learners
base_learners = [linear_reg, svr_reg, knn_reg]
#meta learner
meta_ln = svr_reg

pred = stackingModel(base_learners, meta_ln, data, target)

#check performance
print("MAE of Stacking Model is : ", get_mae(pred ,target))


#get data for classification task
target = german_cred['bad_credit']
data = german_cred.drop('bad_credit', axis=1)
data = standardize_data(data)

#first level learners
base_learners = [log_cf, svc_cf, knn_cf]
#meta learner
meta_ln = svc_cf

pred = stackingModel(base_learners, meta_ln, data, target)

#check performance
print("ACC of Stacking Model is : ", get_acc(pred ,target))

MAE of Stacking Model is :  7.23564330035632
ACC of Stacking Model is :  74.9


In the stacking ensemble above, we created just 2 levels: level 1 for the base models and level 2 for the meta-model.

You can create as many levels as you wish, but make sure the models are diverse, as stacking performs better on a diverse set of base learners.


### Final Thoughts

Ensembles are tried and tested methods for improving the performance of your machine learning models, and they are often the difference between first and second place in a data science competition. In this article, we have covered some of the basic ideas behind ensembles. It is worth mentioning here that we can combine ensembles to create more complex ensembles. While this may help sometimes, the performance often drops. Remember the "No Free Lunch" theorem? Well, that applies here.

I hope this article has given you a solid background in ensembles. Please clap and share. If you have any questions, suggestions or corrections, do share them in the comment section below.

[LINK](https://github.com/mediumtutorial) to Notebook on Github. Don't forget to leave a star if this helped you.

[LINK](https://www.academia.edu/39060549/An_Empirical_Study_of_Ensemble_Techniques_Bagging_Boosting_and_Stacking?source=swp_share) to my paper on ensemble techniques.

CONNECT WITH ME: 
TWITTER: @risin_developer
Linkedin: https://www.linkedin.com/in/risingdeveloper

HAPPY CODING!
