### AdaBoost Classifier

![Adaboost](https://almablog-media.s3.ap-south-1.amazonaws.com/image_28_7cf514b000.png)

### The core idea behind AdaBoost

AdaBoost, short for Adaptive Boosting, is a machine learning algorithm that belongs to the family of ensemble methods. Ensemble methods combine multiple weak learners (models that perform slightly better than random guessing) to create a strong learner (a model with higher accuracy). The core idea behind AdaBoost can be summarized as follows:

1. **Weighted Training**: AdaBoost assigns a weight to each training example. Initially, all examples have equal weight. After each iteration, the weights of incorrectly classified examples are increased, and the weights of correctly classified examples are decreased.

2. **Sequential Learning**: AdaBoost builds a sequence of weak learners, typically decision trees with a single split, called "decision stumps." Each weak learner focuses on the examples that the previous ones misclassified, attempting to correct their mistakes.

3. **Weighted Voting**: AdaBoost combines the weak learners into a strong classifier by weighting their votes based on their accuracy. The more accurate classifiers are given higher weights in the final model.

4. **Final Model Creation**: After a predefined number of iterations or when a perfect fit is achieved, AdaBoost combines the predictions of all weak learners to create the final model. Typically, the predictions are combined using a weighted majority vote, where each learner's weight is derived from its weighted error rate: the lower the error, the larger the weight.

5. **Robustness**: AdaBoost is often resistant to overfitting in practice, particularly when the weak learners are simple and the labels are clean. By iteratively reweighting the examples, it learns to prioritize the difficult examples, which can improve the overall generalization of the model.

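6. **Limitations**: AdaBoost can be sensitive to noisy data and outliers. Because misclassified examples keep gaining weight, mislabeled points and outliers can come to dominate later iterations. If the weak learners are too complex or the dataset contains outliers, AdaBoost may overfit the training data.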

In summary, AdaBoost combines the strengths of multiple weak learners to create a strong, robust model capable of handling complex datasets and achieving high accuracy. Its sequential learning approach and weighted voting mechanism make it particularly effective for classification tasks.
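The steps listed above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not a reference implementation: labels are encoded as -1/+1, the weak learners are exhaustively searched decision stumps, sample weights follow the standard exponential update, and the function names (`stump_predict`, `best_stump`, `adaboost_fit`, `adaboost_predict`) as well as the toy data are made up for this sketch.

```python
import math

def stump_predict(stump, x):
    """Return +1 or -1 from a one-split stump (feature index, threshold, polarity)."""
    feature, threshold, polarity = stump
    return polarity if x[feature] < threshold else -polarity

def best_stump(X, y, w):
    """Try every feature, threshold, and polarity; keep the lowest weighted error."""
    best, best_err = None, float("inf")
    for feature in range(len(X[0])):
        values = sorted({x[feature] for x in X})
        # Candidates: one threshold below the minimum, then midpoints between neighbours.
        thresholds = [values[0] - 1.0] + [(a + b) / 2 for a, b in zip(values, values[1:])]
        for threshold in thresholds:
            for polarity in (1, -1):
                stump = (feature, threshold, polarity)
                err = sum(wi for xi, yi, wi in zip(X, y, w)
                          if stump_predict(stump, xi) != yi)
                if err < best_err:
                    best, best_err = stump, err
    return best, best_err

def adaboost_fit(X, y, n_rounds=10, eps=1e-10):
    """Train up to n_rounds stumps; returns a list of (alpha, stump) pairs."""
    n = len(X)
    w = [1.0 / n] * n                      # equal initial weights
    ensemble = []
    for _ in range(n_rounds):
        stump, err = best_stump(X, y, w)   # next weak learner on the reweighted data
        if err >= 0.5:                     # no better than chance: stop
            break
        # eps guards against division by zero when err is 0 (an assumption of this sketch).
        alpha = 0.5 * math.log((1 - err + eps) / (err + eps))
        ensemble.append((alpha, stump))
        if err == 0:                       # perfect fit: stop
            break
        # Raise the weights of misclassified points, lower the rest, then renormalise.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, xi))
             for xi, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def adaboost_predict(ensemble, x):
    """Weighted majority vote; a tie is resolved as +1 in this sketch."""
    score = sum(alpha * stump_predict(stump, x) for alpha, stump in ensemble)
    return 1 if score >= 0 else -1

# Hypothetical 1-D data that no single stump can separate.
X = [[1], [2], [3], [4], [5], [6]]
y = [1, 1, -1, -1, 1, 1]
model = adaboost_fit(X, y, n_rounds=3)
print([adaboost_predict(model, x) for x in X])  # [1, 1, -1, -1, 1, 1]
```

On this toy data every individual stump misclassifies at least two points, yet the weighted vote of three stumps classifies all six correctly, which is the "weak learners into a strong learner" idea in miniature.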

### Example

The following example shows how AdaBoost works, using the task of classifying whether a person will buy a product based on two features: age and income.

Imagine we have the following dataset:

Now, let's say we want to use AdaBoost to build a classifier to predict whether a person will buy a product based on their age and income.

1. **Initial Weights**: Initially, each data point is assigned equal weight. So, each person in our dataset has a weight of 1/6.

2. **First Weak Learner**: We start with a simple decision stump, a decision tree with just one split. Let's say our first weak learner decides based on age alone. It chooses an age threshold of 35 and predicts "Yes" if the age is below 35 and "No" otherwise.

3. **Training the First Weak Learner**: The first weak learner misclassifies three points (persons 4, 5, and 6), so their weights are increased. The weights of correctly classified points are decreased.

4. **Second Weak Learner**: Now, we train a second weak learner. This time, it chooses to focus on income alone and selects an income threshold of $50,000.

5. **Training the Second Weak Learner**: The second weak learner also misclassifies three points (persons 1, 2, and 3), so their weights are increased while the weights of correctly classified points are decreased.

6. **Combining Weak Learners**: After training multiple weak learners, AdaBoost combines them into a strong classifier. Each weak learner's vote is weighted based on its weighted error rate: the lower the error, the larger the vote. In our example, whichever stump (age or income) has the lower weighted error receives the higher weight.

7. **Final Prediction**: To make predictions, AdaBoost combines the predictions of all weak learners using their weighted votes. For each new data point, AdaBoost applies all the weak learners and combines their predictions to make a final decision.

This process continues until a predetermined number of weak learners are trained or until a perfect fit is achieved. The final model is then used to predict whether a new person will buy the product based on their age and income.
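To make the reweighting step concrete, here is a small sketch of a single round on six equally weighted people. The outcome used here (two of six people misclassified) is a made-up input, chosen so that the weighted error stays below 0.5 and the stump receives a positive vote weight; it is not the stump from the walkthrough. The helper name `reweight` is hypothetical, and the update shown is the standard exponential rule followed by normalization.

```python
import math

def reweight(weights, correct):
    """One AdaBoost round: return (error, alpha, new normalised weights).

    Assumes 0 < error < 1, otherwise the logarithm is undefined.
    """
    error = sum(w for w, ok in zip(weights, correct) if not ok)
    alpha = 0.5 * math.log((1 - error) / error)
    # Correct points are scaled by exp(-alpha), misclassified ones by exp(+alpha).
    scaled = [w * math.exp(-alpha if ok else alpha) for w, ok in zip(weights, correct)]
    total = sum(scaled)
    return error, alpha, [w / total for w in scaled]

weights = [1 / 6] * 6
# Hypothetical first-stump outcome: persons 5 and 6 are misclassified.
correct = [True, True, True, True, False, False]
error, alpha, weights = reweight(weights, correct)
print(round(error, 3), round(alpha, 3), [round(w, 3) for w in weights])
# 0.333 0.347 [0.125, 0.125, 0.125, 0.125, 0.25, 0.25]
```

After the update the two misclassified people together hold half of the total weight, so the next stump is pushed to get them right.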

The AdaBoost algorithm combines the predictions of weak learners using a weighted majority vote. Here's a simplified formula for combining weak learners in AdaBoost:

Let's say we have $T$ weak learners indexed by $t = 1, 2, \dots, T$.

For a binary classification problem, each weak learner $t$ outputs a prediction $h_t(x)$, where $h_t(x)$ is either $-1$ or $1$.

The final prediction for a new data point $x$ is obtained by summing the weighted predictions of all weak learners and taking the sign:

$$H(x) = \operatorname{sign}\left(\sum_{t=1}^{T} \alpha_t \, h_t(x)\right)$$

Where:
- $H(x)$ is the final prediction for $x$,
- $\alpha_t$ is the weight assigned to weak learner $t$ based on its accuracy,
- $\operatorname{sign}(z)$ is the sign function that returns $-1$ if $z < 0$, $0$ if $z = 0$, and $1$ if $z > 0$.
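The weighted vote above translates almost directly into code. The following is a minimal sketch; the function names and the three learner weights are illustrative values chosen for this example, not results from a trained model.

```python
def sign(z):
    """Return -1, 0, or 1 according to the sign of z."""
    return (z > 0) - (z < 0)

def weighted_vote(alphas, predictions):
    """Combine weak-learner outputs (each -1 or +1) using their weights."""
    return sign(sum(a * h for a, h in zip(alphas, predictions)))

alphas = [0.35, 0.55, 0.80]                 # hypothetical learner weights
print(weighted_vote(alphas, [1, 1, -1]))    # 1: the two weaker learners outvote the strongest
print(weighted_vote(alphas, [1, -1, -1]))   # -1
print(weighted_vote([0.5, 0.5], [1, -1]))   # 0: an exact tie
```

Note that the vote is not a simple head count: a single learner with a large enough weight can overrule several weaker ones.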

The weight $\alpha_t$ for each weak learner is computed from its weighted error rate:

$$\alpha_t = \frac{1}{2} \ln\left(\frac{1 - \mathrm{error}_t}{\mathrm{error}_t}\right)$$

Where:
- $\mathrm{error}_t$ is the weighted error rate of weak learner $t$, calculated as the sum of weights of misclassified samples divided by the total weight of all samples.

This weight $\alpha_t$ determines the influence of weak learner $t$ in the final classification. A higher $\alpha_t$ means that the weak learner's predictions have a stronger influence on the final prediction, while a learner with an error rate of 0.5 (no better than random guessing) gets a weight of 0.
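A few sample values show how quickly the learner weight grows as the error shrinks. This is a small sketch; the name `learner_weight` and the `eps` clipping (which avoids a division by zero when the error is exactly 0 or 1) are assumptions of this example rather than part of the formula.

```python
import math

def learner_weight(error, eps=1e-10):
    """Learner weight alpha = 0.5 * ln((1 - error) / error)."""
    error = min(max(error, eps), 1 - eps)   # keep the ratio finite
    return 0.5 * math.log((1 - error) / error)

for e in (0.1, 0.3, 0.5):
    print(e, round(learner_weight(e), 3))
# 0.1 1.099
# 0.3 0.424
# 0.5 0.0
```

An error above 0.5 yields a negative weight, which effectively flips that learner's vote.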