# Ensemble Methods:

### Voting:

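- Generally used for classification tasks.
- Combines the predictions of several different models into a single model.
- Like bagging, it aggregates the predictions of several models; unlike bagging, the base models are usually of different types and are trained on the same data.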

**1. Hard Voting:**
- Each model casts a vote for its predicted class label, and the class with the most votes wins (majority vote).
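
As a minimal sketch of the majority rule, the snippet below counts the votes of three hypothetical models. The function name `hard_vote` and the toy predictions are assumptions made for illustration; they are not part of scikit-learn.

```python
from collections import Counter


def hard_vote(predictions):
    """Return the majority class for each sample.

    `predictions` holds one list of predicted labels per model.
    """
    # zip(*predictions) regroups the labels so each tuple holds all votes for one sample.
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]


# Hypothetical predictions from three models on four samples.
model_a = [0, 1, 1, 0]
model_b = [0, 1, 0, 0]
model_c = [1, 1, 0, 1]

print(hard_vote([model_a, model_b, model_c]))  # [0, 1, 0, 0]
```

With an odd number of models and two classes there are no ties. In this sketch a tie would be resolved in favour of the label that appears first among the votes.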

**2. Soft Voting:**
- Each model provides a probability distribution over the classes. The probabilities are averaged across models, and the class with the highest average probability is chosen.
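
The averaging step can be sketched in a few lines of plain Python. The function name `soft_vote` and the probabilities are hypothetical, chosen so that the result differs from what a majority vote would give.

```python
def soft_vote(probabilities):
    """Average per-model class probabilities; return (predicted class, averages)."""
    n_models = len(probabilities)
    # Average the probability assigned to each class across all models.
    averaged = [sum(class_probs) / n_models for class_probs in zip(*probabilities)]
    # Pick the class index with the highest averaged probability.
    predicted = max(range(len(averaged)), key=averaged.__getitem__)
    return predicted, averaged


# Hypothetical [P(class 0), P(class 1)] from three models for a single sample.
probs = [
    [0.55, 0.45],  # model A leans towards class 0
    [0.60, 0.40],  # model B leans towards class 0
    [0.10, 0.90],  # model C is confident in class 1
]

label, averaged = soft_vote(probs)
print(label)  # 1
```

Two of the three models favour class 0, so a majority vote would return 0. The averaged probabilities favour class 1 because model C is much more confident than the other two.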

---

In [14]:
import os, warnings
import numpy as np
import pandas as pd
from sklearn.datasets import load_breast_cancer

from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import GaussianNB
# Ensemble wrapper that combines the base models by voting
from sklearn.ensemble import VotingClassifier

# Silence library warnings (such as possible convergence warnings) to keep the output readable
warnings.filterwarnings('ignore')

Load the data:

In [15]:
data = load_breast_cancer()

X, y = data.data, data.target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=40)

Create the base models:

In [16]:
models = [
    ('gnb', GaussianNB()),
    ('lr', LogisticRegression()),
    ('dt', DecisionTreeClassifier(max_depth=3, random_state=42)),
    ('knn3', KNeighborsClassifier(n_neighbors=3)),
    ('knn10', KNeighborsClassifier(n_neighbors=10))
]

In [17]:
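# Soft voting averages the class probabilities predicted by the base models
soft_voting = VotingClassifier(estimators=models, voting='soft')
# Hard voting takes the majority of the predicted class labels
hard_voting = VotingClassifier(estimators=models, voting='hard')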

In [20]:
soft_voting.fit(X_train, y_train)

In [21]:
hard_voting.fit(X_train, y_train)

---
Model Evaluation:

In [22]:
from sklearn.metrics import accuracy_score, classification_report

In [23]:
y_pred_soft = soft_voting.predict(X_test)
y_pred_hard = hard_voting.predict(X_test)

In [24]:
print(f'Accuracy [Hard Voting]: {accuracy_score(y_test, y_pred_hard)}')
print(f'Accuracy [Soft Voting]: {accuracy_score(y_test, y_pred_soft)}')

Accuracy [Hard Voting]: 0.9649122807017544
Accuracy [Soft Voting]: 0.9649122807017544
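
Both ensembles reach the same accuracy on this split. One way to put that number in context is to score each fitted base model on the same test set. The sketch below is an illustration only: it assumes a reduced set of three base models to stay short, and it relies on the `named_estimators_` attribute that `VotingClassifier` exposes after fitting. The scores it prints are not reported here.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import VotingClassifier
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Same dataset and split as in the cells above.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=40
)

# Assumption: a subset of the base models, chosen only to keep the sketch short.
ensemble = VotingClassifier(
    estimators=[
        ('gnb', GaussianNB()),
        ('dt', DecisionTreeClassifier(max_depth=3, random_state=42)),
        ('knn3', KNeighborsClassifier(n_neighbors=3)),
    ],
    voting='hard',
)
ensemble.fit(X_train, y_train)

# The fitted sub-models are trained on label-encoded targets. For the 0/1 labels
# used here the encoding leaves the labels unchanged, so scoring against y_test is valid.
scores = {
    name: est.score(X_test, y_test)
    for name, est in ensemble.named_estimators_.items()
}
scores['ensemble'] = ensemble.score(X_test, y_test)

for name, acc in scores.items():
    print(f'{name}: {acc:.4f}')
```

If the ensemble does not beat its strongest member, the base models are probably making similar errors, and voting has little disagreement to resolve.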


In [27]:
print(f"""classification report [HARD]: 
{classification_report(y_test, y_pred_hard)}""")

classification report [HARD]: 
              precision    recall  f1-score   support

           0       0.97      0.92      0.95        39
           1       0.96      0.99      0.97        75

    accuracy                           0.96       114
   macro avg       0.97      0.95      0.96       114
weighted avg       0.97      0.96      0.96       114



In [28]:
print(f"""classification report [SOFT]: 
{classification_report(y_test, y_pred_soft)}""")

classification report [SOFT]: 
              precision    recall  f1-score   support

           0       0.97      0.92      0.95        39
           1       0.96      0.99      0.97        75

    accuracy                           0.96       114
   macro avg       0.97      0.95      0.96       114
weighted avg       0.97      0.96      0.96       114



---
By Kirtan Ghelani $@SculptSoft$