[Please follow this.](https://towardsdatascience.com/simple-guide-for-ensemble-learning-methods-d87cc68705a2)

A collection of several models working together on the same dataset is called an ensemble.

# Voting

1. Voting uses multiple classification or regression models to make predictions, then selects the final output according to a voting scheme.

## Hard Voting
1. Each model in the ensemble predicts a class label, and the label that receives the most votes (a simple majority vote) becomes the final prediction.

## Soft Voting
1. Averages the class probabilities predicted by the individual models and picks the class with the highest average probability.

2. Only possible when every classifier in the ensemble can estimate class probabilities.
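The difference between the two schemes can be worked through by hand. The sketch below uses made-up probabilities from three hypothetical classifiers for a single sample; the numbers are illustrative assumptions, not results from any real model.

```python
import numpy as np

# Hypothetical predicted probabilities from three classifiers for ONE sample
# over two classes (columns: class 0, class 1).
probas = np.array([
    [0.55, 0.45],  # classifier A narrowly prefers class 0
    [0.60, 0.40],  # classifier B prefers class 0
    [0.05, 0.95],  # classifier C strongly prefers class 1
])

# Hard voting: each classifier casts one vote for its most likely class.
votes = probas.argmax(axis=1)
hard_prediction = np.bincount(votes, minlength=probas.shape[1]).argmax()

# Soft voting: average the probabilities, then pick the most likely class.
mean_proba = probas.mean(axis=0)
soft_prediction = mean_proba.argmax()

print("hard:", hard_prediction, "soft:", soft_prediction)
```

Here the two schemes disagree: two of three classifiers vote for class 0, but the one confident classifier pulls the averaged probability toward class 1.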


[`sklearn.ensemble.VotingClassifier`](https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.VotingClassifier.html) and [`sklearn.ensemble.VotingRegressor`](https://scikit-learn.org/1.5/modules/generated/sklearn.ensemble.VotingRegressor.html) accept any scikit-learn classifiers and regressors, respectively. Note that `estimators` has no default value, so the user must provide a list of `(name, estimator)` tuples.
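As a rough illustration of the regression side, the sketch below fits a `VotingRegressor` on synthetic data. The dataset, the choice of base models, and the estimator names (`"lin"`, `"rf"`) are assumptions made for the example.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor, VotingRegressor
from sklearn.linear_model import LinearRegression

# Synthetic regression data, used only for illustration.
X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)

# `estimators` is required: a list of (name, estimator) tuples.
voting_reg = VotingRegressor(
    estimators=[
        ("lin", LinearRegression()),
        ("rf", RandomForestRegressor(n_estimators=50, random_state=0)),
    ]
)
voting_reg.fit(X, y)

# The ensemble prediction is the average of the fitted base models' predictions.
ensemble_pred = voting_reg.predict(X[:3])
manual_avg = np.mean([est.predict(X[:3]) for est in voting_reg.estimators_], axis=0)
print(ensemble_pred, manual_avg)
```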

An analogy

Take a look at the following picture: a group of blindfolded children play a game of "Touch and tell" while examining an elephant, which none of them has ever seen before. Each child forms a different idea of what an elephant looks like, because each one touches a different part of it. If we ask them to submit a report describing the elephant, each individual report will describe only one part accurately, but by combining their observations they can produce a much more accurate description of the whole animal.
Similarly, ensemble learning methods employ a group of models whose combined result is often more accurate than the prediction of any single model.

<img src="elephant-ensemble.png" />

```python
from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.ensemble import VotingClassifier

# Three different kinds of base classifiers
log_clf = LogisticRegression()
rnd_clf = RandomForestClassifier()
svm_clf = SVC()

# Hard voting: the final label is the majority vote of the three classifiers
voting_clf = VotingClassifier(
    estimators=[('lr', log_clf), ('rf', rnd_clf), ('svc', svm_clf)],
    voting='hard')
```

* The accuracy of the `VotingClassifier` is often higher than that of the individual classifiers, although this is not guaranteed.

* Make sure to include diverse classifiers, so that models prone to the same kinds of errors do not reinforce each other's mistakes.
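To check how the ensemble compares with its members on a given problem, each model can be fitted and scored on a held-out test set. The sketch below assumes a synthetic `make_moons` dataset and fixed random seeds; the scores will differ on other data, and the ensemble will not necessarily come out on top.

```python
from sklearn.datasets import make_moons
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic two-class dataset, chosen only for illustration.
X, y = make_moons(n_samples=500, noise=0.3, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

log_clf = LogisticRegression()
rnd_clf = RandomForestClassifier(random_state=42)
svm_clf = SVC(random_state=42)

voting_clf = VotingClassifier(
    estimators=[("lr", log_clf), ("rf", rnd_clf), ("svc", svm_clf)],
    voting="hard",
)

# Fit every model on the same training split and score it on the test split.
scores = {}
for name, clf in [("lr", log_clf), ("rf", rnd_clf), ("svc", svm_clf), ("voting", voting_clf)]:
    clf.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, clf.predict(X_test))
    print(f"{name}: {scores[name]:.3f}")
```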

# Simple Ensemble Techniques

## Taking MODE of results
1. Use the most frequently occurring output across all regressors/classifiers as the final prediction.

2. Can be used for both regression and classification problems, but is better suited to the latter, since continuous predictions rarely repeat exactly.
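A minimal sketch of mode-based combination with NumPy follows. The prediction matrix is made up, and `mode_vote` is a hypothetical helper written for this example; ties are broken in favour of the smallest label.

```python
import numpy as np

# Made-up class predictions: rows are models, columns are samples.
predictions = np.array([
    [0, 1, 2, 1],
    [0, 2, 2, 1],
    [1, 1, 2, 0],
])

def mode_vote(column):
    """Return the most frequent value in a column (smallest value wins ties)."""
    values, counts = np.unique(column, return_counts=True)
    return values[counts.argmax()]

# Apply the vote to each sample (each column) independently.
final = np.apply_along_axis(mode_vote, axis=0, arr=predictions)
print(final)  # [0 1 2 1]
```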

## Taking MEAN of results
1. Average the predictions from all the models and use the result as the final prediction.

2. A natural fit for regression problems.
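As a small worked example, averaging can be done in one line with NumPy. The prediction values below are invented for illustration.

```python
import numpy as np

# Made-up regression predictions: rows are models, columns are samples.
predictions = np.array([
    [2.0, 10.0, 5.5],
    [3.0, 12.0, 4.5],
    [4.0, 11.0, 5.0],
])

# Average across models (axis 0) to get one prediction per sample.
final = predictions.mean(axis=0)
print(final)  # [ 3. 11.  5.]
```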

## Taking WEIGHTED MEAN of the results

1. Each model is assigned a weight reflecting its importance for the prediction, and the final prediction is the weighted average of the models' outputs.
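A weighted mean can be sketched with `np.average`. The predictions and the weights below are assumptions chosen for the example; in practice the weights might come from validation performance.

```python
import numpy as np

# Made-up regression predictions: rows are models, columns are samples.
predictions = np.array([
    [2.0, 10.0, 5.5],
    [3.0, 12.0, 4.5],
    [4.0, 11.0, 5.0],
])

# Hypothetical weights, one per model; the first model is trusted most.
weights = np.array([0.5, 0.3, 0.2])

# Weighted average across models (axis 0).
final = np.average(predictions, axis=0, weights=weights)
print(final)  # [ 2.7 10.8  5.1]
```

In scikit-learn, `VotingClassifier` and `VotingRegressor` expose a `weights` parameter that plays the same role.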