# Ensemble Learning Algorithm

    Ensemble models in machine learning combine the predictions of multiple models to improve overall performance.
    
**This can be achieved in various ways:-**

### 1). Max Voting:

    a). The max voting method is generally used for classification problems.
    b). In this technique, multiple models each make a prediction for every data point.
    c). The prediction of each model is counted as a vote.
    d). The prediction that receives the most votes from the models is used as the final prediction.
    
    Example:- When you ask your friends to suggest a mobile phone, you get several suggestions:

                Friend 1 -> iPhone
                Friend 2 -> Oppo
                Friend 3 -> One Plus
                Friend 4 -> One Plus
                Friend 5 -> Redmi

    - In this case One Plus gets the most votes, so you decide to purchase One Plus.
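
The vote count in this example can be sketched in a few lines of Python. This is only an illustration of max voting on the friends' suggestions; the variable names are chosen here and are not part of any library.

```python
from collections import Counter

# Suggestions from the five friends in the example.
votes = ["iPhone", "Oppo", "One Plus", "One Plus", "Redmi"]

# Count how often each phone was suggested and pick the most frequent one.
vote_counts = Counter(votes)
winner, n_votes = vote_counts.most_common(1)[0]

print(winner, n_votes)
```

With models instead of friends, `votes` would hold the class label predicted by each model for a single data point.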

### 2). Averaging:

    a). As in the max voting technique, multiple models each make a prediction for every data point.
    b). In this method we take the average of the predictions from all the models and use it as the final prediction.
    c). Averaging can be used for making predictions in regression problems or for calculating the probabilities in
        classification problems.
    
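    Example:- You have asked 5 friends which mobile phone to purchase,

                F1 -> iPhone
                F2 -> Redmi
                F3 -> One Plus
                F4 -> One Plus
                F5 -> Oppo

    - Averaging needs numbers, so each suggestion is counted as 1 vote for that phone and divided by the 5 friends:-

                iPhone = 1/5, Redmi = 1/5, One Plus = 2/5, Oppo = 1/5
                (1 + 1 + 2 + 1)/5 = 1

    - These averaged values act as the predicted probabilities, and the phone with the highest value (One Plus) is the
      final prediction.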
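
The averaging step can be sketched in plain Python. Phone names cannot be averaged directly, so this sketch assumes each suggestion is encoded as a one-hot vote (1 for the suggested phone, 0 for the others); that encoding is an assumption made here for illustration.

```python
phones = ["iPhone", "Redmi", "One Plus", "Oppo"]
suggestions = ["iPhone", "Redmi", "One Plus", "One Plus", "Oppo"]

# One-hot encode each suggestion: 1 for the suggested phone, 0 otherwise (assumed encoding).
one_hot = [[1 if s == phone else 0 for phone in phones] for s in suggestions]

# Average the votes column by column to get one score per phone.
averages = [sum(column) / len(one_hot) for column in zip(*one_hot)]

for phone, avg in zip(phones, averages):
    print(f"{phone}: {avg:.2f}")
```

For a regression problem the encoding step is not needed: the numeric predictions of the models are averaged directly.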

### 3). Weighted Average:

    a). The weighted average technique is an extension of the averaging method.
    b). Each model is assigned a weight that defines its importance for the final prediction.

    - For instance, suppose 4 of your friends have prior experience of using a mobile phone while 3 of them have no prior
      experience.
    - In this scenario the opinions of the 4 experienced friends are weighted higher than those of the other 3.
        
            F1 - IPhone
            F2 - Redmi
            F3 - Oppo
            F4 - Vivo
            F5 - Samsung
            F6 - One Plus
            F7 - One Plus
            
                  F1   F2   F3   F4   F5   F6   F7
        Weight:  0.23 0.23 0.23 0.23 0.21 0.10 0.10
        Rating:   1    2    2    3    4    5    5
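
As a rough sketch, the weighted average of the ratings in the table can be computed as follows. The weights as listed sum to 1.33 rather than 1, so the sketch divides by the total weight; that normalization is an assumption made here, not something the table states.

```python
# Weights and ratings for friends F1..F7, copied from the table.
weights = [0.23, 0.23, 0.23, 0.23, 0.21, 0.10, 0.10]
ratings = [1, 2, 2, 3, 4, 5, 5]

# Weighted average: sum of weight * rating, divided by the total weight.
weighted_sum = sum(w * r for w, r in zip(weights, ratings))
weighted_avg = weighted_sum / sum(weights)

# Plain average for comparison, where every friend counts equally.
simple_avg = sum(ratings) / len(ratings)

print(f"weighted: {weighted_avg:.2f}, simple: {simple_avg:.2f}")
```

The weighted result (about 2.77) is lower than the plain average (about 3.14) because the friends with the highest ratings carry the smallest weights.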

### Hard Voting Vs Soft Voting

**Hard Voting:-**

    - In classification, a voting ensemble combines the predictions of several models into one prediction.
    - A hard voting ensemble sums the votes for crisp class labels.
    - It predicts the class with the most votes.
    - It works with any classifier, since only the predicted class labels are needed.

**Soft Voting:-**

    - A soft voting ensemble sums the predicted probabilities for each class label.
    - It requires classifiers that can output class probabilities.
    - It predicts the class with the largest summed probability.
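
The two schemes can give different answers for the same data point. The sketch below uses made-up probabilities from three hypothetical classifiers (the numbers are illustrative, not results from any real model) to show one such case.

```python
from collections import Counter

# Hypothetical probabilities [P(class 0), P(class 1)] from three classifiers.
probabilities = [
    [0.45, 0.55],
    [0.45, 0.55],
    [0.90, 0.10],
]

# Hard voting: each classifier votes for its most probable class.
hard_votes = [row.index(max(row)) for row in probabilities]
hard_prediction = Counter(hard_votes).most_common(1)[0][0]

# Soft voting: sum the probabilities per class and take the largest total.
summed = [sum(column) for column in zip(*probabilities)]
soft_prediction = summed.index(max(summed))

print("hard:", hard_prediction, "soft:", soft_prediction)
```

Two classifiers lean slightly towards class 1, so hard voting picks class 1, while the third classifier is very confident about class 0, which is enough for soft voting to pick class 0.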

## Advanced Technique In Ensemble Learning Algorithm

### 1) Stacking:-

    a). Stacking, also known as "Stacked Generalization", is an ensemble technique that combines multiple classification or
        regression models via a meta-classifier or meta-regressor.
    b). The base-level models are trained on the complete training set, then the meta-model is trained on features that are
        the outputs of the base-level models.
    c). The base level often consists of different learning algorithms, and therefore a stacking ensemble is often
        heterogeneous (different).

**Meta Classifier:-**
        
    - It is a classifier that makes the final prediction by using the predictions of all the base classifiers as features.
    - It takes the classes predicted by the various classifiers and picks the final one as the result.
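
A minimal sketch of stacking with a meta-classifier, assuming scikit-learn's `StackingClassifier` and the Iris data. The choice of base models and of logistic regression as the meta-classifier is an assumption made for illustration. Note that scikit-learn fits the meta-classifier on cross-validated predictions of the base models rather than on predictions for data the base models have already seen.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()
X, y = iris.data[:, 1:3], iris.target

# Heterogeneous base-level models (illustrative choice).
base_models = [
    ("knn", KNeighborsClassifier()),
    ("nb", GaussianNB()),
    ("tree", DecisionTreeClassifier(random_state=1)),
]

# The meta-classifier uses the base models' predictions as its features.
stack_clf = StackingClassifier(
    estimators=base_models,
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,
)
stack_clf.fit(X, y)

train_accuracy = stack_clf.score(X, y)
print(f"Training accuracy: {train_accuracy:.2f}")
```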
        
**Meta Regressor:-**
        
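    - In stacking, a meta-regressor is a regression model that makes the final numeric prediction by using the predictions
      of all the base regressors as features.
    - It should not be confused with meta-regression in statistics, which is a meta-analysis technique used to combine,
      compare and synthesize research findings, to reconcile conflicting studies or corroborate consistent ones.
    - It combines the outputs of multiple models to capture the overall trend of the data.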
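
For a regression task, the stacking setup described in this section can be sketched with scikit-learn's `StackingRegressor`. The synthetic data from `make_regression`, the base models, and linear regression as the final estimator are all assumptions chosen for illustration.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import StackingRegressor
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.neighbors import KNeighborsRegressor
from sklearn.tree import DecisionTreeRegressor

# Synthetic regression data (illustrative only).
X, y = make_regression(n_samples=200, n_features=4, noise=10.0, random_state=1)

# Heterogeneous base-level regressors.
base_models = [
    ("ridge", Ridge()),
    ("knn", KNeighborsRegressor()),
    ("tree", DecisionTreeRegressor(random_state=1)),
]

# The final estimator is fitted on the base regressors' predictions.
stack_reg = StackingRegressor(
    estimators=base_models,
    final_estimator=LinearRegression(),
    cv=5,
)
stack_reg.fit(X, y)

predictions = stack_reg.predict(X)
r2 = stack_reg.score(X, y)
print(f"Training R^2: {r2:.2f}")
```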

## Example : Max Voting

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import cross_val_score
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.tree import DecisionTreeClassifier
```

```python
iris = load_iris()
# Keep only two features: sepal width and petal length (columns 1 and 2).
X = iris.data[:, 1:3]
Y = iris.target
```

```python
# Five different base classifiers to combine in the voting ensemble.
clf1 = LogisticRegression(random_state=1)
clf2 = RandomForestClassifier(random_state=1)
clf3 = KNeighborsClassifier()
clf4 = GaussianNB()
clf5 = DecisionTreeClassifier(random_state=1)
```

```python
# 5-fold cross-validated accuracy of each base classifier on its own.
labels = ['Logistic', 'Random Forest', 'KNN', 'Gaussian', 'Decision Tree']
for clf, label in zip([clf1, clf2, clf3, clf4, clf5], labels):
    score = cross_val_score(clf, X, Y, cv=5, scoring='accuracy')
    print("Accuracy Score: %0.2f (+/- %0.2f)[%s]" % (score.mean(), score.std(), label))
```

```
Accuracy Score: 0.95 (+/- 0.04)[Logistic]
Accuracy Score: 0.94 (+/- 0.04)[Random Forest]
Accuracy Score: 0.95 (+/- 0.04)[KNN]
Accuracy Score: 0.91 (+/- 0.04)[Gaussian]
Accuracy Score: 0.91 (+/- 0.03)[Decision Tree]
```


```python
# Hard voting: majority vote over the predicted class labels.
voting_clf_hard = VotingClassifier(estimators=[
    (labels[0], clf1),
    (labels[1], clf2),
    (labels[2], clf3),
    (labels[3], clf4),
    (labels[4], clf5)], voting='hard')

# Soft voting: class with the highest summed predicted probability.
voting_clf_soft = VotingClassifier(estimators=[
    (labels[0], clf1),
    (labels[1], clf2),
    (labels[2], clf3),
    (labels[3], clf4),
    (labels[4], clf5)], voting='soft')
```

```python
# Compare each base classifier with the hard and soft voting ensembles.
labels_new = ['Logistic', 'Random Forest', 'KNN', 'Gaussian', 'Decision Tree', 'Hard Voting', 'Soft Voting']
for clf, label in zip([clf1, clf2, clf3, clf4, clf5, voting_clf_hard, voting_clf_soft], labels_new):
    scores = cross_val_score(clf, X, Y, cv=5, scoring='accuracy')
    print("Accuracy Score: %0.2f (+/- %0.2f)[%s]" % (scores.mean(), scores.std(), label))
```

```
Accuracy Score: 0.95 (+/- 0.04)[Logistic]
Accuracy Score: 0.94 (+/- 0.04)[Random Forest]
Accuracy Score: 0.95 (+/- 0.04)[KNN]
Accuracy Score: 0.91 (+/- 0.04)[Gaussian]
Accuracy Score: 0.91 (+/- 0.03)[Decision Tree]
Accuracy Score: 0.95 (+/- 0.04)[Hard Voting]
Accuracy Score: 0.95 (+/- 0.04)[Soft Voting]
```
