# Ensembles

Before looking at Random Forest, we explore ensembles in general.
An ensemble works best when its individual models differ (by algorithm or by training set), so that they tend to make different errors.
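To see why differing models help, here is a minimal sketch under a strong, idealized assumption: the models make fully independent errors on a binary problem and each is correct with the same probability `p`. The function name `majority_vote_accuracy` is made up for this illustration. Under that assumption, the majority vote of three models that are each 70% accurate is right about 78.4% of the time.

```python
from math import comb

def majority_vote_accuracy(p, n):
    """Probability that more than half of n independent models are correct,
    when each one is correct with probability p (binary problem, n odd)."""
    k_min = n // 2 + 1  # smallest number of correct votes that forms a majority
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(k_min, n + 1))

# Accuracy of the majority vote for growing ensembles of 70%-accurate models.
for n in (1, 3, 11, 101):
    print(n, majority_vote_accuracy(0.7, n))
```

Real models trained on the same data are correlated, so the gain in practice is smaller than this idealized figure suggests.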

## Random Forest
Random Forest is an ensemble of decision trees: the same algorithm, trained on different random samples of the training set.
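The "same algorithm, random training sets" idea can be sketched by hand. This is only an illustration of bootstrap sampling plus majority voting, not how `RandomForestClassifier` is implemented internally (it also randomizes the features considered at each split). The tree count of 25 and the seeds are arbitrary choices for the sketch.

```python
import numpy as np
from sklearn.datasets import make_moons
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X, y = make_moons(n_samples=200, noise=0.15, random_state=0)

# Train 25 trees, each on its own bootstrap sample (drawn with replacement).
trees = []
for _ in range(25):
    idx = rng.integers(0, len(X), size=len(X))
    trees.append(DecisionTreeClassifier(random_state=0).fit(X[idx], y[idx]))

# Majority vote over the trees; labels are 0/1, so a mean above 0.5 means class 1.
votes = np.stack([tree.predict(X) for tree in trees])  # shape (25, 200)
majority = (votes.mean(axis=0) > 0.5).astype(int)
print("training accuracy of the vote:", (majority == y).mean())
```

An odd number of trees avoids ties in the vote.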

## Voting algorithms
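Hard voting: each model casts one vote and the class with the most votes wins.
Soft voting: average the predicted class probabilities across the models and choose the class with the highest average. Every model must be able to output probabilities.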
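The two rules can disagree. The sketch below uses made-up probabilities for a single sample (the numbers are hypothetical, chosen so that two lukewarm models are outvoted by one confident model under soft voting).

```python
import numpy as np

# Hypothetical predicted probabilities for ONE sample from three models.
# Rows are models, columns are classes 0 and 1.
probas = np.array([
    [0.55, 0.45],  # model A leans slightly towards class 0
    [0.55, 0.45],  # model B leans slightly towards class 0
    [0.10, 0.90],  # model C is very confident in class 1
])

# Hard voting: each model votes for its most probable class, majority wins.
votes = probas.argmax(axis=1)  # votes are 0, 0, 1
hard_choice = np.bincount(votes, minlength=2).argmax()

# Soft voting: average the probabilities, then pick the most probable class.
mean_proba = probas.mean(axis=0)  # approximately [0.4, 0.6]
soft_choice = mean_proba.argmax()

print("hard:", hard_choice, "soft:", soft_choice)  # hard: 0 soft: 1
```

Soft voting gives more weight to confident models, which helps only if their probabilities are reasonably well calibrated.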



In [1]:
# First example demonstrates that three heads are better than one.
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.ensemble import VotingClassifier
from sklearn.metrics import accuracy_score

def compare_accuracy(classifiers):
    # Fit each classifier on the global training split and print its test accuracy.
    for classifier in classifiers:
        classifier.fit(X_train, y_train)
        y_pred = classifier.predict(X_test)
        print(classifier.__class__.__name__, accuracy_score(y_test, y_pred))

from sklearn.datasets import make_moons
X,y = make_moons(n_samples=200, noise=0.15)
# This is a wrapper for ShuffleSplit
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = \
    train_test_split(X, y, test_size=0.20, random_state=42)

# Hard voting.
lrc = LogisticRegression()  # inappropriate: linear model on nonlinear data
rfc = RandomForestClassifier()
svm = SVC()
vot = VotingClassifier(
    estimators=[('LRC',lrc),('RFC',rfc),('SVM',svm)],
    voting='hard')
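vot.fit(X_train, y_train)  # fits clones of the sub-models; lrc, rfc and svm themselves stay unfitted
compare_accuracy([lrc, rfc, svm, vot])  # refits every model, including vot
# The book says the ensemble is always best. Here it only ties the best single models.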

LogisticRegression 0.85
RandomForestClassifier 0.975
SVC 0.975
VotingClassifier 0.975


In [17]:
# Soft voting.
lrc = LogisticRegression()
rfc = RandomForestClassifier()
svm = SVC(probability=True)
vot = VotingClassifier(
    estimators=[('LRC',lrc),('RFC',rfc),('SVM',svm)],
    voting='soft')
compare_accuracy([lrc,rfc,svm,vot])

LogisticRegression 0.85
RandomForestClassifier 0.975
SVC 1.0
VotingClassifier 0.975


## Conclusion
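The book says the ensemble is always best. That was not true in my runs: with hard voting the ensemble (0.975) only tied the Random Forest and the SVC, and with soft voting it (0.975) scored below the SVC alone (1.0). The test set has only 40 samples, so the gap between 0.975 and 1.0 is a single misclassified sample.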