## Voting Classifier

### Hard Voting

In [8]:
from sklearn.datasets import make_moons
from sklearn.ensemble import VotingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC 

In [9]:
X, y = make_moons(n_samples = 500, noise = 0.30, random_state = 42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state = 42)

In [10]:
voting_clf = VotingClassifier(
    estimators = [
        ('lr', LogisticRegression(random_state = 42)),
        ('rf', RandomForestClassifier(random_state = 42)),
        ('svc', SVC(random_state = 42))
    ]
)

voting_clf.fit(X_train, y_train)

In [11]:
# score each fitted classifier on the test data
for name, clf in voting_clf.named_estimators_.items():
    print(name, ":", clf.score(X_test, y_test))

lr : 0.864
rf : 0.896
svc : 0.896


In [12]:
# class predicted by the voting classifier for the first test sample
voting_clf.predict(X_test[:1])

array([1])

In [13]:
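# predictions of each fitted estimator
# NOTE: X[:1] is the first sample of the full dataset, not the test sample X_test[:1] used in the previous cell
[clf.predict(X[:1]) for clf in voting_clf.estimators_]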

[array([1]), array([1]), array([1])]

In [14]:
voting_clf.score(X_test, y_test)

0.912

The VotingClassifier as a whole scores 0.912 on the test set, higher than any of the individual classifiers (0.864, 0.896 and 0.896). This is hard voting: each classifier casts one vote and the majority class wins. Next, let us try soft voting and compare the results.
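
To make the hard voting rule concrete, here is a minimal sketch of majority voting in plain Python. The function `hard_vote` and the sample votes are hypothetical and only illustrate the idea; they are not part of scikit-learn. Breaking ties in favour of the smallest label is an assumption meant to mirror the ascending-order tie-break that scikit-learn documents.

```python
from collections import Counter

def hard_vote(predictions):
    """Return the label predicted by the most classifiers (hypothetical helper)."""
    counts = Counter(predictions)
    # Iterate labels in ascending order so that a tie goes to the smallest label.
    return max(sorted(counts), key=counts.get)

# Three classifiers vote on one sample: two say class 1, one says class 0.
votes = [1, 1, 0]
print(hard_vote(votes))
```

Note that the confidence of each classifier plays no role here; only the predicted labels are counted.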

### Soft Voting

In [24]:
# switch the VotingClassifier to soft voting (the default is hard voting)
voting_clf.voting = "soft"
# SVC does not estimate class probabilities by default, so enable them before refitting
voting_clf.named_estimators["svc"].probability = True

voting_clf.fit(X_train, y_train)
voting_clf.score(X_test, y_test)

0.92

Soft voting scores 0.92 on the test set, slightly better than hard voting (0.912).
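
Soft voting averages the class probabilities estimated by each classifier and picks the class with the highest average, so a very confident classifier can outweigh two hesitant ones. The sketch below shows this in plain Python; `soft_vote` and the probability values are hypothetical, chosen only to show a case where soft and hard voting disagree.

```python
def soft_vote(probas):
    """Average per-class probabilities and return (winning class index, averages)."""
    n_clf = len(probas)
    n_classes = len(probas[0])
    avg = [sum(p[k] for p in probas) / n_clf for k in range(n_classes)]
    winner = max(range(n_classes), key=lambda k: avg[k])
    return winner, avg

# Made-up probabilities [P(class 0), P(class 1)] for a single sample.
probas = [
    [0.55, 0.45],  # classifier A: weakly favours class 0
    [0.60, 0.40],  # classifier B: weakly favours class 0
    [0.05, 0.95],  # classifier C: strongly favours class 1
]
winner, avg = soft_vote(probas)
print(winner, avg)
```

With these numbers a hard vote would pick class 0 (two votes to one), while the averaged probabilities favour class 1.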

## Bagging (Bootstrap Aggregating)

In [53]:
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier

bag_clf = BaggingClassifier(DecisionTreeClassifier(), n_estimators = 500, n_jobs = -1, oob_score = True, random_state = 42)
bag_clf.fit(X_train, y_train)
bag_clf.oob_score_

0.896

The out-of-bag (OOB) estimate of accuracy is 0.896. Let's compare it with the accuracy on the test set.

In [55]:
from sklearn.metrics import accuracy_score

y_pred = bag_clf.predict(X_test)
accuracy_score(y_test, y_pred)

0.912

The test accuracy (0.912) is slightly higher than the OOB estimate (0.896).
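
OOB evaluation is possible because each predictor is trained on a bootstrap sample: instances are drawn with replacement, so some training instances are never drawn for a given predictor and can serve as held-out data for it. The sketch below simulates one bootstrap draw in plain Python to estimate that left-out fraction. `oob_fraction` is a hypothetical helper, and the 375 is the training set size implied by the default 75/25 split of 500 samples.

```python
import random

def oob_fraction(n_samples, seed=0):
    """Fraction of instances never drawn in one bootstrap sample of size n_samples."""
    rng = random.Random(seed)
    # Draw n_samples indices with replacement and keep the distinct ones.
    drawn = {rng.randrange(n_samples) for _ in range(n_samples)}
    return 1 - len(drawn) / n_samples

frac = oob_fraction(375)
print(f"left out of this bootstrap sample: {frac:.1%}")
```

For large samples this fraction approaches 1/e, roughly 37%, so each predictor typically leaves a bit over a third of the training instances unseen.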

### Random Forest

In [75]:
rf_clf = RandomForestClassifier(n_estimators = 600, max_leaf_nodes = 12, n_jobs = -1, random_state = 42)
rf_clf.fit(X_train, y_train)
rf_y_pred = rf_clf.predict(X_test)
accuracy_score(y_test, rf_y_pred)

0.92

In [81]:
rf_clf.feature_importances_

array([0.41360197, 0.58639803])
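
The importances are normalised so that they sum to 1, which makes them easy to read as shares. A small sketch for ranking them follows; the values are copied from the output above, and the names `x1` and `x2` are hypothetical labels for the two `make_moons` coordinates.

```python
importances = [0.41360197, 0.58639803]  # copied from rf_clf.feature_importances_ above
names = ["x1", "x2"]                    # hypothetical names for the two features

# Sort features from most to least important.
ranked = sorted(zip(names, importances), key=lambda pair: pair[1], reverse=True)
for name, score in ranked:
    print(f"{name}: {score:.1%}")

total = sum(importances)
```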

## Boosting

### AdaBoost

In [115]:
from sklearn.ensemble import AdaBoostClassifier

ada_clf = AdaBoostClassifier(
    DecisionTreeClassifier(max_depth = 1), n_estimators = 30,
    learning_rate = 0.2, random_state = 42)

ada_clf.fit(X_train, y_train)
ada_y_pred = ada_clf.predict(X_test)
accuracy_score(y_test, ada_y_pred)

0.92
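
To give a feel for what boosting does between rounds, here is a sketch of a single instance-weight update in the classic discrete AdaBoost formulation. It is an illustration under assumptions, not scikit-learn's internal code: `adaboost_round` is a hypothetical helper, and the five weights and the correctness flags are made up. The `learning_rate` of 0.2 matches the value used above.

```python
import math

def adaboost_round(weights, correct, learning_rate=1.0):
    """One discrete AdaBoost update; assumes the weighted error is strictly between 0 and 1."""
    # Weighted error rate of the current weak learner.
    err = sum(w for w, ok in zip(weights, correct) if not ok) / sum(weights)
    # Predictor weight: larger when the weak learner is more accurate.
    alpha = learning_rate * math.log((1 - err) / err)
    # Boost the weights of misclassified instances, then normalise to sum to 1.
    boosted = [w if ok else w * math.exp(alpha) for w, ok in zip(weights, correct)]
    total = sum(boosted)
    return alpha, [w / total for w in boosted]

weights = [0.2] * 5                        # five instances, equal initial weights
correct = [True, True, True, True, False]  # the last instance was misclassified
alpha, new_weights = adaboost_round(weights, correct, learning_rate=0.2)
print(alpha, new_weights)
```

After the update the misclassified instance carries more weight than the others, so the next weak learner pays more attention to it. A smaller learning rate makes that shift more gradual.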