# Boosting 


Boosting is a powerful machine learning ensemble technique designed to improve the accuracy of models by combining several weak learners (models that perform slightly better than random guessing) to form a strong learner. 

A single weak model may not be enough for a complex problem. In such cases, we aggregate several weak models into a more powerful and accurate model. This process of aggregating weak models to create a strong one is what boosting does.

Boosting is an ensemble modeling technique that attempts to build a strong classifier from a number of weak classifiers. It does so by building weak models in series. First, a model is built from the training data. Then a second model is built that tries to correct the errors of the first. This procedure continues, and models are added until either the complete training set is predicted correctly or the maximum number of models is reached.

The key idea behind boosting is to iteratively adjust the weights of incorrectly classified instances, so that the model focuses more on the harder-to-classify cases in subsequent rounds. 

![image.png](attachment:49f1c8c4-1583-456e-ae4f-01c00fddbdbd.png)

## Steps in Boosting
## Initialize Weights:

- Start by assigning equal weights to all instances in the training data.
- These weights determine the importance of each data point in subsequent steps.
- If there are $n$ training examples, each instance has an initial weight of $1/n$.
## Train the First Weak Learner:

- A weak learner (often a decision tree of depth one, also called a decision stump) is trained on the weighted dataset.
- The model makes predictions, and its performance is evaluated (i.e., the correctly and incorrectly classified instances are identified).
- Being a weak learner, this model may have low accuracy, but it should perform slightly better than random guessing.
## Calculate the Error:
- The error is calculated based on the weights of misclassified instances.
- The error rate $\epsilon$ is computed as the weighted sum of the misclassified instances:

 - ![image.png](attachment:51b48605-bc1f-4710-95a5-d17b6f88f1ea.png)
- If the error is 0.5 or greater (no better than random guessing in binary classification), the model is discarded, and another model is trained.
- If it is less than 0.5, the model is accepted, and the process continues.
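
As a minimal sketch of the steps so far, assuming NumPy and made-up toy data (`y_true` and `y_pred` are hypothetical values, not the output of a real model), the initial weights and the weighted error could be computed like this:

```python
import numpy as np

# Toy labels and the predictions of a hypothetical weak learner.
y_true = np.array([1, 1, 0, 0, 1])
y_pred = np.array([1, 0, 0, 0, 1])  # the second instance is misclassified

# Initialize weights: every instance starts with weight 1/n.
n = len(y_true)
weights = np.full(n, 1.0 / n)

# Weighted error: total weight of the misclassified instances,
# divided by the total weight (which is 1 while weights are normalized).
misclassified = y_pred != y_true
error = np.sum(weights[misclassified]) / np.sum(weights)

print(f"weighted error: {error:.2f}")
```

With one of five equally weighted instances misclassified, the error comes out at about 0.2, so this learner would be accepted.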

## Update the Weights:
- Increase the weights of the instances that were misclassified so that the next learner will focus more on these harder-to-classify instances.
![image.png](attachment:9cf72101-f755-4d84-a566-3ae496252472.png)
- This increases the influence of misclassified points for the next round of learning.
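
The update can be sketched in a few lines. This assumes the standard discrete AdaBoost rule with labels encoded as -1/+1, which may differ in notation from the formula pictured above; the toy arrays are hypothetical.

```python
import numpy as np

y_true = np.array([1, 1, -1, -1, 1])   # labels encoded as -1 / +1
y_pred = np.array([1, -1, -1, -1, 1])  # hypothetical weak learner output
weights = np.full(len(y_true), 1.0 / len(y_true))

# Weighted error and the learner's vote weight (alpha).
error = np.sum(weights[y_pred != y_true])
alpha = 0.5 * np.log((1.0 - error) / error)

# y_true * y_pred is +1 for a correct prediction and -1 for a wrong one,
# so wrong predictions are scaled up and correct ones are scaled down.
new_weights = weights * np.exp(-alpha * y_true * y_pred)
new_weights /= new_weights.sum()  # renormalize so the weights sum to 1

print(new_weights)
```

In this toy case the single misclassified instance ends up with half of the total weight, while the four correct ones share the other half.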


## Train the Next Weak Learner:

- Using the updated weights, train another weak learner that will now focus more on the instances that were incorrectly classified by the previous model.
- The process of training, calculating errors, and updating weights continues for a predetermined number of iterations or until the model achieves a satisfactory performance.
## Combine Weak Learners:

- After a set number of weak learners have been trained, their outputs are combined to form the final strong learner.
- The combination is typically a weighted vote (in classification) or a weighted sum (in regression), where more accurate models receive larger weights.
## Final Prediction:

- For classification, the final model combines predictions of all weak learners using a weighted majority vote.
- For regression, the final prediction is the weighted average of predictions from all weak learners.
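
Putting the steps together, the following is a rough from-scratch sketch of the whole loop for binary classification. It assumes scikit-learn decision stumps, labels encoded as -1/+1, and the standard discrete AdaBoost formulas; the function names `fit_adaboost` and `predict_adaboost` are made up for illustration and are not part of any library.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier


def fit_adaboost(X, y, n_rounds=20):
    """Train decision stumps in sequence; y must contain -1 / +1 labels."""
    n = len(y)
    weights = np.full(n, 1.0 / n)           # equal initial weights
    learners, alphas = [], []
    for _ in range(n_rounds):
        stump = DecisionTreeClassifier(max_depth=1)
        stump.fit(X, y, sample_weight=weights)
        pred = stump.predict(X)
        error = np.sum(weights[pred != y])  # weighted error
        if error >= 0.5:
            # A deterministic stump would be refit identically,
            # so this sketch stops instead of retrying.
            break
        error = max(error, 1e-10)           # avoid division by zero
        alpha = 0.5 * np.log((1.0 - error) / error)
        weights *= np.exp(-alpha * y * pred)  # boost misclassified points
        weights /= weights.sum()
        learners.append(stump)
        alphas.append(alpha)
    return learners, alphas


def predict_adaboost(learners, alphas, X):
    """Weighted majority vote of all weak learners."""
    scores = sum(a * m.predict(X) for m, a in zip(learners, alphas))
    return np.where(scores >= 0, 1, -1)


X, y01 = make_classification(n_samples=300, n_features=10, random_state=0)
y = np.where(y01 == 1, 1, -1)               # re-encode 0/1 labels as -1/+1

learners, alphas = fit_adaboost(X, y)
train_acc = np.mean(predict_adaboost(learners, alphas, X) == y)
print(f"{len(learners)} stumps, training accuracy {train_acc:.3f}")
```

The accuracy reported here is measured on the training data only, so it says nothing about generalization; a held-out test set is needed for that.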


# Types of Boosting Algorithms


There are several types of boosting algorithms. Some of the best-known and most useful are:



## Gradient Boosting
Gradient boosting builds the final model as the sum of several weak learners trained on the same dataset, following the idea of stagewise addition. The first weak learner is not trained on the dataset; in a regression problem with squared-error loss, it simply returns the mean of the target column. The residuals of this first prediction are then calculated and used as the target column for the next weak learner. The second weak learner is trained the same way, its residuals become the target for the third, and so on. In principle this continues until the residuals reach zero; in practice, training stops after a fixed number of learners or once the residuals stop shrinking. The dataset for gradient boosting must consist of numerical or categorical data, and the loss function used to generate the residuals must be differentiable.
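
The residual-fitting idea can be sketched as follows for squared-error regression. This is an illustrative sketch, not a full gradient boosting implementation: the synthetic sine data, the tree depth, the number of rounds, and the `learning_rate` shrinkage factor are all assumptions made for the example.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Synthetic regression data (assumed for illustration).
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.1, size=200)

learning_rate = 0.1  # shrinks each tree's contribution
n_rounds = 50

# Stage 0: a constant model that returns the mean of the target.
prediction = np.full_like(y, y.mean())
initial_mse = np.mean((y - prediction) ** 2)

trees = []
for _ in range(n_rounds):
    residuals = y - prediction              # what is still unexplained
    tree = DecisionTreeRegressor(max_depth=2)
    tree.fit(X, residuals)                  # residuals are the new target
    prediction += learning_rate * tree.predict(X)
    trees.append(tree)

final_mse = np.mean((y - prediction) ** 2)
print(f"training MSE: {initial_mse:.4f} -> {final_mse:.4f}")
```

Each round fits a small tree to the current residuals and adds a scaled copy of its output to the running prediction, so the training error shrinks step by step.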
## XGBoost
XGBoost, short for eXtreme Gradient Boosting, is a regularised variant of gradient boosting. The key distinction between XGBoost and standard gradient boosting is that XGBoost adds a regularisation term to its objective, which penalises overly complex trees and helps reduce overfitting. Because of this, XGBoost often outperforms a standard gradient boosting method. It is also usually faster, but that comes from engineering optimisations such as parallelised tree construction rather than from regularisation itself. It is often reported to work well on datasets that contain both numerical and categorical variables.
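
To make the regularisation concrete, the objective XGBoost minimises is commonly written in the following form. The symbols are introduced here for illustration and do not appear elsewhere in this notebook: $l$ is a differentiable loss, $f_k$ is the $k$-th tree, $T$ is its number of leaves, $w$ is its vector of leaf weights, and $\gamma$ and $\lambda$ are regularisation strengths.

$$
\mathcal{L} = \sum_{i=1}^{n} l\left(y_i, \hat{y}_i\right) + \sum_{k=1}^{K} \Omega(f_k),
\qquad
\Omega(f) = \gamma T + \frac{1}{2} \lambda \lVert w \rVert^2
$$

The first sum measures how well the ensemble fits the training data, while the second penalises trees with many leaves or large leaf weights.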
## AdaBoost
AdaBoost also works on the principle of stagewise addition, where multiple weak learners are combined into a strong learner. Unlike gradient boosting and XGBoost, each weak learner receives a weight, alpha, that is calculated from its error. The value of alpha is inversely related to the error of the weak learner, so more accurate learners have more say in the final prediction.
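
A small sketch of this relationship, assuming the standard discrete AdaBoost formula for alpha (the helper name `learner_weight` is made up for this example):

```python
import math


def learner_weight(error):
    """Vote weight (alpha) for a weak learner with the given weighted error."""
    return 0.5 * math.log((1.0 - error) / error)


# Alpha shrinks as the error grows and reaches zero at error = 0.5.
for error in (0.1, 0.3, 0.5):
    print(f"error={error:.1f} -> alpha={learner_weight(error):.3f}")
```

A learner that is no better than random guessing (error 0.5) gets a weight of zero and contributes nothing to the final vote.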
## CatBoost
The way decision trees are grown inside CatBoost is the primary distinction that sets it apart from competitors: the trees CatBoost creates are symmetric. CatBoost also has a dedicated approach for handling categorical features, so it tends to work very well on categorical datasets. The categorical features are encoded based on the output column, so the target values are taken into account when encoding them, which can increase accuracy on categorical datasets.
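
The idea of encoding a category from the output column can be illustrated with a simplified ordered target statistic. This is a plain-Python sketch of the general idea and not CatBoost's actual implementation (which, among other things, works with random permutations of the data). The function name and the `prior` and `prior_weight` parameters are assumptions made for the example.

```python
def ordered_target_encode(categories, targets, prior=0.5, prior_weight=1.0):
    """Encode each row using only the targets of earlier rows
    that share the same category, smoothed towards a prior."""
    sums, counts = {}, {}
    encoded = []
    for cat, target in zip(categories, targets):
        s = sums.get(cat, 0.0)
        c = counts.get(cat, 0)
        # The current row's own target is NOT used, which avoids leakage.
        encoded.append((s + prior_weight * prior) / (c + prior_weight))
        sums[cat] = s + target
        counts[cat] = c + 1
    return encoded


colors = ["red", "blue", "red", "red", "blue"]
labels = [1, 0, 1, 0, 1]
print(ordered_target_encode(colors, labels))
```

The first time a category appears it receives the prior; later rows receive a smoothed average of the targets seen so far for that category.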

## AdaBoost with Decision Tree Weak Learner

```python
# Import the necessary libraries
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Generate a synthetic binary classification dataset
X, y = make_classification(n_samples=1000, n_features=20, n_informative=15, n_redundant=5, random_state=42)

# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Define the weak learner (a simple decision tree with max depth of 1)
weak_learner = DecisionTreeClassifier(max_depth=1)

# Initialize the AdaBoost classifier with the weak learner
adaboost = AdaBoostClassifier(estimator=weak_learner, n_estimators=50, learning_rate=1.0, random_state=42)

# Train the AdaBoost model
adaboost.fit(X_train, y_train)

# Predict on the test set
y_pred = adaboost.predict(X_test)

# Display the predicted labels
y_pred
```

array([0, 1, 0, 0, 1, 1, 1, 0, 1, 0, 1, 0, 1, 1, 1, 1, 0, 1, 1, 1, 1, 0,
       1, 1, 0, 0, 1, 0, 0, 0, 1, 1, 1, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 1,
       0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1,
       1, 1, 0, 1, 1, 0, 1, 0, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 1, 1, 1, 0,
       1, 1, 0, 1, 0, 1, 1, 0, 1, 0, 0, 1, 0, 0, 0, 1, 0, 1, 1, 1, 1, 1,
       1, 1, 1, 0, 1, 1, 1, 0, 1, 0, 0, 1, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1,
       0, 1, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1, 1, 0, 0, 0,
       0, 1, 0, 0, 0, 1, 1, 1, 0, 1, 1, 1, 0, 0, 0, 1, 1, 0, 1, 0, 1, 0,
       1, 1, 0, 0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 1,
       0, 1, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 1, 1, 1, 0, 1, 1, 0, 0, 1, 0,
       0, 1, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 1, 1, 0, 1, 1, 1, 0, 1, 0, 1,
       1, 1, 0, 1, 1, 0, 1, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 0,
       0, 1, 0, 0, 1, 0, 1, 1, 0, 0, 0, 1, 0, 1, 1, 1, 1, 1, 0, 0, 1, 0,
       0, 0, 0, 0, 1, 1, 0, 1, 1, 1, 0, 1, 1, 0])