# Ensemble Methods
Ensemble learning combines the predictions of several models instead of relying on just one. Each model may not be very strong on its own, but combining their outputs usually gives a more accurate and more stable prediction than any single model.

## Types of Ensemble Learning in ML
There are 3 main types of ensemble methods:
1. Bagging (Bootstrap Aggregating):
Models are trained independently on different random subsets of the training data. Their results are then combined by averaging (for regression) or voting (for classification).
Examples include: Bagging Classifier and Bagging Regressor (each with a choice of base model), and Random Forest.

2. Boosting:
Models are trained one after another. Each new model focuses on fixing the errors made by the previous ones. The final prediction is a weighted combination of all models.
Examples include: AdaBoost (Adaptive Boosting), Gradient Boosting, XGBoost (Extreme Gradient Boosting).

3. Stacking (Stacked Generalization):
Multiple different models (often of different types) are trained, and their predictions are used as inputs to a final model, called a meta-model. The meta-model learns how to best combine the predictions of the base models, aiming for better performance than any individual model.
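
Stacking is the only one of the three types without an example in this notebook, so here is a minimal sketch of it using scikit-learn's `StackingClassifier`. The synthetic data from `make_classification`, the choice of base models, and the variable names (`X_demo`, `Xs_train`, `stacking_classifier`, and so on) are assumptions made for illustration, not part of the Titanic workflow used later.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic data stands in for a real dataset (an assumption for this sketch).
X_demo, y_demo = make_classification(n_samples=500, n_features=10, random_state=42)
Xs_train, Xs_test, ys_train, ys_test = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42
)

# Base models of different types. Their cross-validated predictions
# become the input features of the meta-model.
base_models = [
    ("knn", KNeighborsClassifier(n_neighbors=5)),
    ("rf", RandomForestClassifier(n_estimators=50, random_state=42)),
]

stacking_classifier = StackingClassifier(
    estimators=base_models,
    final_estimator=LogisticRegression(),  # the meta-model
    cv=5,
)
stacking_classifier.fit(Xs_train, ys_train)

stacking_accuracy = stacking_classifier.score(Xs_test, ys_test)
print("Stacking accuracy:", stacking_accuracy)
```

The `cv=5` setting means the meta-model is trained on predictions the base models made for rows they did not see during fitting, which helps keep the meta-model from simply trusting an overfitted base model.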


# Bagging Explained
1. Bootstrap Sampling: Creates ‘N’ new training sets by drawing rows from the original training data at random with replacement. A row can therefore appear several times in one sample and not at all in another.
2. Base Model Training: For each bootstrapped sample we train a base model independently on that subset of data. Because the models do not depend on each other, they can be trained in parallel, which reduces training time. Different ML models can be used as base learners to add variety and robustness.
3. Prediction Aggregation: To make a prediction on test data, the predictions of all base models are combined, by voting for classification or by averaging for regression.
4. Out-of-Bag (OOB) Evaluation: Because sampling is done with replacement, some rows are left out of the training subset of each base model. These “out-of-bag” samples can be used to estimate the model’s performance without a separate validation set or cross-validation.
5. Final Prediction: After aggregating the predictions from all the base models, Bagging produces a final prediction.
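
The steps above can be sketched by hand to make the mechanics visible. This is a simplified illustration, not how `BaggingClassifier` is implemented internally: the synthetic data, the decision-tree base learner, and names such as `X_demo`, `oob_fractions`, and `bagged_pred` are assumptions chosen for the example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(42)

# Synthetic binary classification data (an assumption for this sketch).
X_demo, y_demo = make_classification(n_samples=300, n_features=8, random_state=42)
n_samples = X_demo.shape[0]
n_models = 25  # odd, so a majority vote can never tie

models = []
oob_fractions = []
for _ in range(n_models):
    # Step 1: draw n_samples row indices with replacement.
    idx = rng.integers(0, n_samples, size=n_samples)

    # Step 4: rows that were never drawn are out-of-bag for this model.
    oob_mask = np.ones(n_samples, dtype=bool)
    oob_mask[idx] = False
    oob_fractions.append(oob_mask.mean())

    # Step 2: train one base model on the bootstrapped sample.
    tree = DecisionTreeClassifier(max_depth=3, random_state=0)
    models.append(tree.fit(X_demo[idx], y_demo[idx]))

# Steps 3 and 5: collect every model's prediction and take a majority vote.
all_preds = np.array([m.predict(X_demo) for m in models])  # (n_models, n_samples)
votes_for_one = all_preds.sum(axis=0)
bagged_pred = (votes_for_one > n_models / 2).astype(int)

print("Mean out-of-bag fraction:", np.mean(oob_fractions))
print("Training accuracy of the vote:", (bagged_pred == y_demo).mean())
```

With replacement sampling, each bootstrap sample leaves out roughly 37% of the rows on average (the limit of (1 - 1/n)^n is 1/e), which is what makes OOB evaluation possible.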

In [2]:
from sklearn.ensemble import BaggingClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
import seaborn as sns

In [3]:
df = sns.load_dataset('titanic')
df.head()

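| | survived | pclass | sex | age | sibsp | parch | fare | embarked | class | who | adult_male | deck | embark_town | alive | alone |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 3 | male | 22.0 | 1 | 0 | 7.25 | S | Third | man | True | | Southampton | no | False |
| 1 | 1 | 1 | female | 38.0 | 1 | 0 | 71.2833 | C | First | woman | False | C | Cherbourg | yes | False |
| 2 | 1 | 3 | female | 26.0 | 0 | 0 | 7.925 | S | Third | woman | False | | Southampton | yes | True |
| 3 | 1 | 1 | female | 35.0 | 1 | 0 | 53.1 | S | First | woman | False | C | Southampton | yes | False |
| 4 | 0 | 3 | male | 35.0 | 0 | 0 | 8.05 | S | Third | man | True | | Southampton | no | True |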


In [77]:
# Drop columns with too many missing values or those that are irrelevant
df.drop(['deck', 'embark_town', 'alive', 'who', 'adult_male', 'class'], axis=1, inplace=True)

# Drop rows with missing values 
df.dropna(inplace=True)

# Encode categorical variables
df['sex'] = df['sex'].map({'male': 0, 'female': 1})
df['embarked'] = df['embarked'].map({'S': 0, 'C': 1, 'Q': 2})

# Features and target
X = df[['pclass', 'sex', 'age', 'sibsp', 'parch', 'fare', 'embarked']]
y = df['survived']

In [53]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

In [78]:
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

In [79]:
base_classifier = KNeighborsClassifier(n_neighbors=8)

In [80]:
bagging_classifier = BaggingClassifier(base_classifier, n_estimators=50, random_state=42)
bagging_classifier.fit(X_train_scaled, y_train)

BaggingClassifier(base_estimator=KNeighborsClassifier(n_neighbors=8),
                  n_estimators=50, random_state=42)

In [81]:
y_pred = bagging_classifier.predict(X_test_scaled)
accuracy = accuracy_score(y_test, y_pred)
print("Accuracy:", accuracy)

Accuracy: 0.8111888111888111
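
Step 4 of the bagging walkthrough mentions out-of-bag evaluation, which scikit-learn exposes through the `oob_score=True` option and the fitted `oob_score_` attribute. The sketch below uses synthetic data and the hypothetical name `oob_bagging` so it runs on its own; the same option could be added to the `bagging_classifier` cell above.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.neighbors import KNeighborsClassifier

# Synthetic data stands in for the scaled Titanic features (an assumption).
X_demo, y_demo = make_classification(n_samples=500, n_features=10, random_state=42)

oob_bagging = BaggingClassifier(
    KNeighborsClassifier(n_neighbors=8),
    n_estimators=50,
    oob_score=True,   # score each row using only models that did not train on it
    random_state=42,
)
oob_bagging.fit(X_demo, y_demo)

print("OOB accuracy estimate:", oob_bagging.oob_score_)
```

Because every row is scored only by models that never saw it, the OOB estimate behaves like a validation score without holding out any data.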


In [60]:
from sklearn.ensemble import RandomForestClassifier

rf = RandomForestClassifier(n_estimators=100, random_state=42)

rf.fit(X_train, y_train)

y_pred = rf.predict(X_test)

# Evaluate the model
print(f"Accuracy: {accuracy_score(y_test, y_pred)}")


Accuracy: 0.7832167832167832


# Boosting Explained
1. Initialization - Choose a weak learner and give all training examples equal weights.
2. Iterative Training - After a weak learner is trained, its performance is evaluated. The error is calculated by summing the weights of the misclassified instances, and in the next iteration a new weak learner tries to minimize the weighted error by focusing on the misclassified instances.
3. Weights Updating - After each iteration, the weights of misclassified examples are increased so that subsequent models focus more on those examples. Correctly classified examples typically have their weights reduced or kept the same.
4. Final Model - After a predefined number of iterations, or once no significant improvement is made, the weak learners are combined to form the final strong learner.
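
The four steps above describe an AdaBoost-style procedure, and a hand-written version can make the weight updates concrete. The following is a minimal sketch of discrete AdaBoost with decision stumps, assuming synthetic data and labels recoded to -1/+1; the names `weights`, `alphas`, `stumps`, and `boosted_pred` are illustrative and this is not the scikit-learn implementation.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary data (an assumption); labels recoded to -1/+1.
X_demo, y01 = make_classification(n_samples=300, n_features=8, random_state=42)
y_pm = np.where(y01 == 1, 1, -1)
n_samples = X_demo.shape[0]

# Step 1: every training example starts with the same weight.
weights = np.full(n_samples, 1.0 / n_samples)
stumps, alphas = [], []

for _ in range(20):
    # Train a weak learner (a depth-1 tree) on the weighted data.
    stump = DecisionTreeClassifier(max_depth=1, random_state=0)
    stump.fit(X_demo, y_pm, sample_weight=weights)
    pred = stump.predict(X_demo)

    # Step 2: weighted error = sum of the weights of misclassified rows.
    err = np.sum(weights[pred != y_pm])
    err = np.clip(err, 1e-10, 1 - 1e-10)  # guard against log(0) and division by zero
    alpha = 0.5 * np.log((1 - err) / err)  # this learner's say in the final vote

    # Step 3: raise weights of misclassified rows, lower the rest, renormalize.
    weights = weights * np.exp(-alpha * y_pm * pred)
    weights = weights / weights.sum()

    stumps.append(stump)
    alphas.append(alpha)

# Step 4: the final model is the sign of the alpha-weighted vote.
scores = sum(a * s.predict(X_demo) for a, s in zip(alphas, stumps))
boosted_pred = np.where(scores >= 0, 1, -1)
boosted_accuracy = (boosted_pred == y_pm).mean()

print("Training accuracy of the boosted stumps:", boosted_accuracy)
```

Gradient boosting, used in the next cell, follows the same sequential idea but fits each new model to the gradient of a loss function rather than reweighting examples.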

In [66]:
from sklearn.ensemble import GradientBoostingClassifier

gb_classifier = GradientBoostingClassifier(n_estimators=100, learning_rate=0.3, max_depth=3, random_state=42)

gb_classifier.fit(X_train, y_train)

y_pred = gb_classifier.predict(X_test)

# Evaluate the model
print(f"Accuracy: {accuracy_score(y_test, y_pred)}")

Accuracy: 0.7622377622377622


In [74]:
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier

base_classifier = DecisionTreeClassifier(max_depth=3)

ada_classifier = AdaBoostClassifier(base_classifier, n_estimators=100, learning_rate=1.0, random_state=42)

ada_classifier.fit(X_train, y_train)

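# Predict with the AdaBoost model so the accuracy below refers to it,
# not to the earlier gradient boosting predictions still stored in y_pred
y_pred = ada_classifier.predict(X_test)
print(f"Accuracy: {accuracy_score(y_test, y_pred)}")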
