
**Day 55: Ensemble Learning** (By: Loga Aswin)

**Ensemble learning**

> A machine learning technique that combines predictions from multiple models to improve accuracy.


> Aims to mitigate errors or biases that may exist in individual models.


> Utilizes the strengths of different models to create a more precise prediction.





**Simple Ensemble Techniques:**


> **Max Voting:** Each model's prediction counts as one 'vote', and the class that the majority of the models agree on becomes the final prediction.





In [14]:
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Generating some sample data
X, y = make_classification(n_samples=1000, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Initializing models
model1 = LogisticRegression()
model2 = DecisionTreeClassifier()
model3 = SVC(probability=True)

# Max Voting classifier
model = VotingClassifier(estimators=[('lr', model1), ('dt', model2), ('svc', model3)], voting='hard')

# Training model
model.fit(X_train, y_train)

# Predicting test results
y_pred = model.predict(X_test)

# Calculating accuracy
accuracy = accuracy_score(y_test, y_pred)
print("Max Voting Accuracy:", accuracy)


Max Voting Accuracy: 0.855
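
> To make the voting step concrete, here is a minimal sketch of majority voting done by hand with NumPy. The prediction values and the helper name `majority_vote` are made up for illustration and are not part of scikit-learn.

```python
import numpy as np

# Hypothetical class predictions from three models for five samples
# (rows = models, columns = samples); the values are made up.
preds = np.array([
    [0, 1, 1, 0, 1],   # model A
    [0, 0, 1, 1, 1],   # model B
    [1, 1, 0, 0, 1],   # model C
])

def majority_vote(predictions):
    """Return the most frequent class label for each sample (column)."""
    n_classes = predictions.max() + 1
    # Count the votes per class for one sample, then keep the class with most votes.
    return np.array([
        np.bincount(column, minlength=n_classes).argmax()
        for column in predictions.T
    ])

final = majority_vote(preds)
print(final)   # [0 1 1 0 1]
```

> With three voters and two classes there can be no tie. With an even number of models, `argmax` would silently pick the lowest class label on a tie.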



> **Averaging**: Averaging aggregates predictions by taking the average probability (for classification) or the mean prediction (for regression) across multiple models.


> [`probability=True` enables class-probability estimates in models that do not produce them by default, such as `SVC`. Soft voting needs these probabilities.]






In [13]:
from sklearn.ensemble import VotingClassifier

# Averaging classifier
model = VotingClassifier(estimators=[('lr', model1), ('dt', model2), ('svc', model3)], voting='soft')

# Train model
model.fit(X_train, y_train)

# Predicting test result
y_pred = model.predict(X_test)

# Calculating accuracy
accuracy = accuracy_score(y_test, y_pred)
print("Averaging Accuracy:", accuracy)


Averaging Accuracy: 0.87
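
> Averaging can give a different answer than max voting, because a single confident model can outweigh two hesitant ones. The sketch below shows this with made-up probabilities for a binary problem; it only illustrates the idea and is not the scikit-learn implementation.

```python
import numpy as np

# Hypothetical P(class = 1) from three models for four samples (made-up values).
proba_class1 = np.array([
    [0.90, 0.40, 0.45, 0.20],   # model A
    [0.60, 0.45, 0.40, 0.30],   # model B
    [0.30, 0.95, 0.90, 0.60],   # model C
])

# Soft voting: average the probabilities first, then threshold at 0.5.
avg = proba_class1.mean(axis=0)
soft = (avg > 0.5).astype(int)

# Hard voting: each model votes with its own 0.5 threshold, majority wins.
votes = (proba_class1 > 0.5).astype(int)
hard = (votes.sum(axis=0) >= 2).astype(int)

print("soft:", soft)   # soft: [1 1 1 0]
print("hard:", hard)   # hard: [1 0 0 0]
```

> Samples 2 and 3 flip to class 1 under averaging because model C is very confident while models A and B are close to 0.5.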


> **Weighted Averaging**: Each model is assigned a weight that reflects its importance, and the final prediction is the weighted average of the models' predicted probabilities.



In [16]:
# Define weights for models
weights = [0.3, 0.4, 0.3]

model = VotingClassifier(estimators=[('lr', model1), ('dt', model2), ('svc', model3)], voting='soft', weights=weights)

# Training model
model.fit(X_train, y_train)

# Predicting test results
y_pred = model.predict(X_test)

# Calculating accuracy
accuracy = accuracy_score(y_test, y_pred)
print("Weighted Averaging Accuracy:", accuracy)

Weighted Averaging Accuracy: 0.895
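
> The weighted average itself is a one-liner in NumPy. The sketch below uses the same weights as the cell above on made-up probabilities, and also shows that only the ratio between the weights matters.

```python
import numpy as np

# Hypothetical P(class = 1) from three models for three samples (made-up values).
proba_class1 = np.array([
    [0.80, 0.30, 0.55],   # model A
    [0.20, 0.70, 0.40],   # model B
    [0.60, 0.40, 0.45],   # model C
])
weights = [0.3, 0.4, 0.3]

# Weighted average across models (axis 0).
weighted = np.average(proba_class1, axis=0, weights=weights)

# np.average divides by the sum of the weights, so rescaling them changes nothing.
rescaled = np.average(proba_class1, axis=0, weights=[3, 4, 3])

print(weighted)   # approximately [0.50 0.49 0.46]
```

> In practice the weights are usually chosen from validation performance rather than from the test set, so that the reported test accuracy stays an honest estimate.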


**Advanced Ensemble Techniques:**


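> **Stacking**: A new model (the meta-model) is trained on the predictions of the base models, usually out-of-fold predictions obtained through cross-validation.


> **Blending**: Similar to stacking, but the meta-model is trained on the base models' predictions for a separate holdout (validation) set taken from the training data.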
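
> Both techniques can be sketched in a few lines. Stacking uses scikit-learn's `StackingClassifier`; blending is written by hand in its common holdout form, which is an assumption about the variant meant here. The choice of base models, the 25% holdout size and the helper `meta_features` are illustrative only.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

base_models = [
    ("dt", DecisionTreeClassifier(random_state=1)),
    ("svc", SVC(random_state=1)),
]

# Stacking: the meta-model learns from out-of-fold predictions (cv=5).
# StackingClassifier clones the base models, so the originals stay unfitted.
stack = StackingClassifier(estimators=base_models,
                           final_estimator=LogisticRegression(), cv=5)
stack.fit(X_train, y_train)
stack_acc = stack.score(X_test, y_test)

# Blending: base models are fit on one part of the training data,
# the meta-model on their predictions for the held-out part.
X_fit, X_hold, y_fit, y_hold = train_test_split(
    X_train, y_train, test_size=0.25, random_state=1)
fitted = [model.fit(X_fit, y_fit) for _, model in base_models]

def meta_features(models, X):
    """Stack each model's class predictions as one column (hypothetical helper)."""
    return np.column_stack([m.predict(X) for m in models])

blender = LogisticRegression()
blender.fit(meta_features(fitted, X_hold), y_hold)
blend_acc = blender.score(meta_features(fitted, X_test), y_test)

print("Stacking accuracy:", stack_acc)
print("Blending accuracy:", blend_acc)
```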

**Algorithms based on Bagging and Boosting:**

> **Bagging**: Multiple subsets are created from the original dataset, selecting observations with replacement. A base model is created on each of these subsets.

In [22]:
from sklearn.ensemble import BaggingClassifier
from sklearn import tree
model = BaggingClassifier(tree.DecisionTreeClassifier(random_state=1))
model.fit(X_train, y_train)
model.score(X_test, y_test)

0.875

In [24]:
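from sklearn.ensemble import BaggingRegressor
# Note: y still holds the 0/1 class labels from above, so this only demonstrates the API.
# For regressors, score() returns R^2, not accuracy.
model = BaggingRegressor(tree.DecisionTreeRegressor(random_state=1))
model.fit(X_train, y_train)
model.score(X_test,y_test)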

0.6504873882021907
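
> The "subsets with replacement" that bagging relies on are bootstrap samples. A minimal sketch of drawing one, assuming a dataset of 1000 rows identified by their index:

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples = 1000

# One bootstrap sample: draw n_samples row indices with replacement.
idx = rng.integers(0, n_samples, size=n_samples)

# Some rows are drawn several times, others not at all.
unique_fraction = np.unique(idx).size / n_samples

# Rows that were never drawn are "out-of-bag" for this base model.
oob_mask = np.ones(n_samples, dtype=bool)
oob_mask[idx] = False

print(f"unique rows in the sample: {unique_fraction:.3f}")
print(f"out-of-bag rows: {oob_mask.sum()}")
```

> For large datasets the fraction of unique rows approaches 1 - 1/e (about 0.632), so each base model sees a noticeably different view of the data.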

> **Boosting**: A sequential process, where each subsequent model attempts to correct the errors of the previous model.

**AdaBoost:**


> **AdaBoost** (Adaptive Boosting) is an ensemble learning algorithm that combines multiple weak learners to create a strong learner.


> It is an iterative algorithm that builds weak learners sequentially: after each round, the samples that were misclassified receive larger weights, so the next weak learner focuses on the hardest examples.


> AdaBoost often resists overfitting in practice, but it is sensitive to noisy data and outliers, because samples that keep being misclassified receive ever larger weights.
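
> One round of this reweighting can be written out by hand. The sketch below follows the classic binary AdaBoost formulation with labels in {-1, +1}; the labels and weak-learner predictions are made up, and scikit-learn's internal variant may differ in detail.

```python
import numpy as np

# Toy labels and a hypothetical weak learner's predictions (made-up values).
y_true = np.array([ 1,  1, -1, -1,  1, -1,  1, -1])
y_weak = np.array([ 1, -1, -1, -1,  1,  1,  1, -1])   # samples 1 and 5 are wrong

w = np.full(y_true.size, 1 / y_true.size)   # start with uniform sample weights
miss = y_weak != y_true

err = w[miss].sum()                          # weighted error rate (0.25 here)
alpha = 0.5 * np.log((1 - err) / err)        # this learner's weight in the final vote

# y_true * y_weak is -1 for mistakes and +1 for correct predictions, so
# mistakes are scaled up by exp(alpha) and correct samples down by exp(-alpha).
w_new = w * np.exp(-alpha * y_true * y_weak)
w_new /= w_new.sum()                         # renormalize to sum to 1

print("alpha:", round(alpha, 3))
print("new weights:", np.round(w_new, 3))
```

> After the update the two misclassified samples carry half of the total weight, which is what forces the next weak learner to pay attention to them.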







In [26]:
# Sample code for classification
from sklearn.ensemble import AdaBoostClassifier
model = AdaBoostClassifier(random_state=1)
model.fit(X_train, y_train)
model.score(X_test,y_test)

0.87

**Sample code for regression problem:**

In [28]:
from sklearn.ensemble import AdaBoostRegressor
model = AdaBoostRegressor()
model.fit(X_train, y_train)
model.score(X_test,y_test)

0.4132620642489211

**Gradient Boosting Machines (GBM)**:



> **Gradient Boosting Machines (GBM)** is an ensemble learning algorithm that builds a sequence of weak learners, where each weak learner is fit to the negative gradient of the loss function with respect to the current ensemble's predictions (for squared-error loss, these are simply the residuals).


> GBM can achieve high accuracy on a variety of tasks, but its results depend on tuning settings such as the learning rate, the number of trees, and the tree depth.
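
> For squared-error loss the negative gradient is just the residual, so the core loop of gradient boosting fits each new tree to what the current ensemble still gets wrong. The sketch below is a bare-bones illustration on synthetic data (a noisy sine curve, chosen here only as an example), not a replacement for `GradientBoostingRegressor`.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.1, size=200)

learning_rate = 0.1
prediction = np.full_like(y, y.mean())        # stage 0: a constant model
initial_mse = np.mean((y - prediction) ** 2)

trees = []
for _ in range(50):
    residuals = y - prediction                # negative gradient of 0.5 * (y - F)^2
    tree = DecisionTreeRegressor(max_depth=2, random_state=0).fit(X, residuals)
    prediction += learning_rate * tree.predict(X)   # small step towards the residuals
    trees.append(tree)

final_mse = np.mean((y - prediction) ** 2)
print(f"training MSE: {initial_mse:.4f} -> {final_mse:.4f}")
```

> A smaller `learning_rate` makes each step more cautious and usually needs more trees to reach the same training error.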


In [30]:
from sklearn.ensemble import GradientBoostingClassifier
model= GradientBoostingClassifier(learning_rate=0.01,random_state=1)
model.fit(X_train, y_train)
model.score(X_test,y_test)

0.89

In [32]:
# Sample code for Regressor
from sklearn.ensemble import GradientBoostingRegressor
model= GradientBoostingRegressor()
model.fit(X_train, y_train)
model.score(X_test,y_test)

0.6132034021878043

**XGBoost:**


> **XGBoost** is an optimized implementation of gradient boosting that includes several improvements, such as:

1.   Parallel processing: XGBoost parallelizes the work of building each tree across CPU cores, which typically makes it faster than a sequential GBM implementation.
2.   Regularization: XGBoost adds L1 and L2 penalties on the leaf weights to its objective, which helps to reduce overfitting.

[*XGBoost handles missing values (`NaN`) natively, so you do not have to impute them beforehand.]

In [33]:
import xgboost as xgb
model=xgb.XGBClassifier(random_state=1,learning_rate=0.01)
model.fit(X_train, y_train)
model.score(X_test,y_test)

0.88

In [34]:
import xgboost as xgb
model=xgb.XGBRegressor()
model.fit(X_train, y_train)
model.score(X_test,y_test)

0.6251452430469582

**LightGBM:**


> **LightGBM** is another optimized version of GBM that is known for its speed and efficiency.


> It grows trees leaf-wise (best-first) rather than level-wise: at each step it splits the leaf that gives the largest reduction in loss.


> **LightGBM** also includes several other optimizations that often make it faster than XGBoost, such as:



1.   Parallel processing: LightGBM can be trained on multiple CPU cores or on GPUs, which can significantly reduce training time.
2.   Histogram-based tree learning: LightGBM buckets continuous feature values into a small number of discrete bins, so far fewer candidate split points have to be evaluated than with exact split search.
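
> The idea behind histogram-based split finding can be illustrated with NumPy alone. The sketch below is a simplified stand-in, not LightGBM's actual implementation: it bins one synthetic feature into 16 quantile bins (an arbitrary choice) and scores splits on the raw target with a squared-error criterion.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=1000)                       # one feature
y = np.where(x > 6.0, 2.0, 0.0) + rng.normal(scale=0.1, size=x.size)  # step at x = 6

n_bins = 16
# Inner bin edges from quantiles; each value becomes a small integer bin index.
edges = np.quantile(x, np.linspace(0, 1, n_bins + 1)[1:-1])   # 15 edges
bins = np.digitize(x, edges)                                   # values in 0..15

# Build the histogram in one pass: per-bin target sums and counts.
sum_y = np.bincount(bins, weights=y, minlength=n_bins)
count = np.bincount(bins, minlength=n_bins)

# Scan the 15 bin boundaries instead of roughly 1000 distinct feature values.
left_sum, left_cnt = np.cumsum(sum_y)[:-1], np.cumsum(count)[:-1]
right_sum, right_cnt = sum_y.sum() - left_sum, count.sum() - left_cnt

# Squared-error split score: larger means lower variance within the two sides.
gain = left_sum**2 / left_cnt + right_sum**2 / right_cnt
best_threshold = edges[int(np.argmax(gain))]

print(f"candidate splits: {edges.size}, best threshold: {best_threshold:.2f}")
```

> The chosen threshold can only land on a bin edge, so it is close to the true step at 6 but not exact. That small loss of precision is the price paid for the much cheaper search.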











In [36]:
import lightgbm as lgb

model = lgb.LGBMClassifier(n_estimators=100, learning_rate=0.1, random_state=42)

model.fit(X_train, y_train)

y_pred = model.predict(X_test)

accuracy = accuracy_score(y_test, y_pred)
print("LightGBM Accuracy:", accuracy)

[LightGBM] [Info] Number of positive: 393, number of negative: 407
[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000193 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 5100
[LightGBM] [Info] Number of data points in the train set: 800, number of used features: 20
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.491250 -> initscore=-0.035004
[LightGBM] [Info] Start training from score -0.035004
LightGBM Accuracy: 0.895
