<h1 style="background-color: #f8f0fa;
            border-left: 5px solid #1b4332;
            font-family: 'Trebuchet MS', sans-serif;
            border-right: 5px solid #1b4332;
            padding: 12px;
            border-radius: 50px 50px;
            color: #1b4332;
            text-align:center;
            font-size:45px;"><strong>😊AdaBoost Algorithm from Scratch🌟</strong></h1>
<hr style="border-top: 5px solid #264653;">

This notebook provides a step-by-step explanation and implementation of the **AdaBoost algorithm** using Python. 
We’ll go through the algorithm, building it from scratch, with a simple example to clarify the concepts of 
decision stumps, weighted errors, weight updates, and the final classifier. 

AdaBoost, or **Adaptive Boosting**, is an ensemble learning method that combines multiple weak classifiers 
to form a strong classifier. In this example, we use **decision stumps** (simple decision rules) as the weak classifiers.




## AdaBoost Algorithm Steps

Here’s a step-by-step guide on how the AdaBoost algorithm works:

1. **Initialize Weights**: Start by assigning equal weights to all training samples.

2. **Train Weak Classifiers**: In each round, fit the candidate weak classifiers (e.g., decision stumps) using the current sample weights.

3. **Calculate Error for Each Weak Classifier**: For each classifier $h_t$, calculate the weighted error 
as the sum of the weights of the samples it misclassifies.

4. **Calculate Classifier Weight**: The classifier’s weight depends on its error rate. A more accurate classifier 
receives a higher weight.

5. **Update Sample Weights**: Increase the weights of misclassified samples, emphasizing challenging samples in 
subsequent rounds.

6. **Final Classifier**: Combine all weak classifiers using their weights to form the final strong classifier.
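
The steps above can be written compactly as formulas. This is only a sketch: the symbols $N$ (number of samples), $w_i^{(t)}$ (weight of sample $i$ in round $t$), $\varepsilon_t$ (weighted error), $Z_t$ (normalizing constant), and $T$ (number of rounds) are notation assumed here, and the weight update is written in the form this notebook implements in Step 5, where only misclassified samples are scaled up before normalizing.

$$
\begin{aligned}
w_i^{(1)} &= \frac{1}{N} \\
\varepsilon_t &= \sum_{i=1}^{N} w_i^{(t)} \, \mathbb{1}\left[h_t(x_i) \neq y_i\right] \\
\alpha_t &= \frac{1}{2} \ln \left( \frac{1 - \varepsilon_t}{\varepsilon_t} \right) \\
w_i^{(t+1)} &= \frac{w_i^{(t)} \exp\left(\alpha_t \, \mathbb{1}\left[h_t(x_i) \neq y_i\right]\right)}{Z_t} \\
H(x) &= \operatorname{sign}\left( \sum_{t=1}^{T} \alpha_t \, h_t(x) \right)
\end{aligned}
$$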



## Example Dataset

A dataset with **10 samples** and **4 features** will be used to illustrate AdaBoost.

| Sample | Feature 1 | Feature 2 | Feature 3 | Feature 4 | Label |
|--------|-----------|-----------|-----------|-----------|-------|
| 1      | 1         | 2         | 1         | 3         | +1    |
| 2      | 2         | 1         | 3         | 1         | -1    |
| 3      | 1         | 1         | 2         | 2         | +1    |
| 4      | 2         | 2         | 1         | 3         | -1    |
| 5      | 1         | 3         | 1         | 1         | +1    |
| 6      | 3         | 1         | 2         | 2         | -1    |
| 7      | 2         | 3         | 3         | 1         | +1    |
| 8      | 3         | 2         | 1         | 2         | -1    |
| 9      | 1         | 2         | 3         | 3         | +1    |
| 10     | 2         | 1         | 2         | 3         | -1    |


In [1]:

import pandas as pd

# Create the dataset
data = {
    'Sample': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    'Feature_1': [1, 2, 1, 2, 1, 3, 2, 3, 1, 2],
    'Feature_2': [2, 1, 1, 2, 3, 1, 3, 2, 2, 1],
    'Feature_3': [1, 3, 2, 1, 1, 2, 3, 1, 3, 2],
    'Feature_4': [3, 1, 2, 3, 1, 2, 1, 2, 3, 3],
    'Label': [1, -1, 1, -1, 1, -1, 1, -1, 1, -1]
}

df = pd.DataFrame(data)
df.set_index('Sample', inplace=True)
df


| Sample | Feature_1 | Feature_2 | Feature_3 | Feature_4 | Label |
|--------|-----------|-----------|-----------|-----------|-------|
| 1      | 1         | 2         | 1         | 3         | 1     |
| 2      | 2         | 1         | 3         | 1         | -1    |
| 3      | 1         | 1         | 2         | 2         | 1     |
| 4      | 2         | 2         | 1         | 3         | -1    |
| 5      | 1         | 3         | 1         | 1         | 1     |
| 6      | 3         | 1         | 2         | 2         | -1    |
| 7      | 2         | 3         | 3         | 1         | 1     |
| 8      | 3         | 2         | 1         | 2         | -1    |
| 9      | 1         | 2         | 3         | 3         | 1     |
| 10     | 2         | 1         | 2         | 3         | -1    |



## Step 1: Initialize Weights

With 10 samples, each starts with a weight of $\frac{1}{10} = 0.1$.


In [8]:

import numpy as np

# Initialize weights equally
n_samples = len(df)
weights = np.full(n_samples, 1 / n_samples)
weights


array([0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1])


## Step 2: Train a Weak Classifier (Decision Stump)

A decision stump is a simple rule like "If Feature 1 = 1, predict +1; else -1."
We build one such stump per feature (each testing whether that feature equals 1) and calculate its weighted error.


In [9]:

# Define function to calculate weighted error
def weighted_error(predictions, weights, true_labels):
    return np.sum(weights[predictions != true_labels])

X = df[['Feature_1', 'Feature_2', 'Feature_3', 'Feature_4']].values
y = df['Label'].values

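# One stump per feature: predict +1 where the feature equals 1, otherwise -1
errors = []
for feature_idx in range(X.shape[1]):
    predictions = np.where(X[:, feature_idx] == 1, 1, -1)
    error = weighted_error(predictions, weights, y)
    errors.append((feature_idx + 1, error))  # store the 1-based feature number
errors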


[(1, 0.1), (2, 0.7), (3, 0.5), (4, 0.4)]
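
The stumps above only test whether a feature equals 1. A more general search, sketched here as an assumption and not as part of the notebook's method, could also try every observed value of each feature and both polarities. The helper names `stump_predict` and `best_stump_search` are hypothetical.

```python
import numpy as np

# Toy dataset from the notebook (rows = samples, columns = Feature_1..Feature_4)
X = np.array([
    [1, 2, 1, 3], [2, 1, 3, 1], [1, 1, 2, 2], [2, 2, 1, 3], [1, 3, 1, 1],
    [3, 1, 2, 2], [2, 3, 3, 1], [3, 2, 1, 2], [1, 2, 3, 3], [2, 1, 2, 3],
])
y = np.array([1, -1, 1, -1, 1, -1, 1, -1, 1, -1])
weights = np.full(len(y), 1 / len(y))


def stump_predict(X, feature_idx, value, polarity):
    """Predict `polarity` where the feature equals `value`, else `-polarity`."""
    return np.where(X[:, feature_idx] == value, polarity, -polarity)


def best_stump_search(X, y, weights):
    """Hypothetical helper: try every (feature, value, polarity) rule."""
    best = None
    for feature_idx in range(X.shape[1]):
        for value in np.unique(X[:, feature_idx]):
            for polarity in (1, -1):
                preds = stump_predict(X, feature_idx, value, polarity)
                error = np.sum(weights[preds != y])
                # Keep the rule with the lowest weighted error seen so far
                if best is None or error < best[3]:
                    best = (feature_idx, int(value), polarity, float(error))
    return best


best = best_stump_search(X, y, weights)
print(best)  # (0-based feature index, value, polarity, weighted error)
```

With equal weights on this dataset, the search settles on the same rule the notebook uses: Feature 1 equal to 1 predicts +1.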


## Step 3: Select the Best Classifier Based on Weighted Error

The decision stump with the lowest error is selected as the weak classifier for this round.


In [4]:

# Select the best classifier based on minimum error
best_stump = min(errors, key=lambda x: x[1])
best_stump


(1, 0.1)

Note that the selected stump (Feature 1) misclassifies only sample 7, which is why its weighted error is 0.1.
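
To check which samples the selected stump gets wrong, the predictions can be compared against the labels directly. This is a small self-contained sketch; the arrays are copied from the dataset table.

```python
import numpy as np

# Feature_1 column and labels copied from the dataset table
feature_1 = np.array([1, 2, 1, 2, 1, 3, 2, 3, 1, 2])
y = np.array([1, -1, 1, -1, 1, -1, 1, -1, 1, -1])

# Stump from Step 2: predict +1 when Feature_1 == 1, otherwise -1
predictions = np.where(feature_1 == 1, 1, -1)

# 1-based sample numbers where the stump disagrees with the label
misclassified = np.flatnonzero(predictions != y) + 1
print(misclassified)  # [7]
```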


## Step 4: Calculate Classifier Weight

The weight $\alpha_t$ of each classifier depends on its error rate:

$$
\alpha_t = \frac{1}{2} \ln \left( \frac{1 - \text{Error}}{\text{Error}} \right)
$$


In [5]:

# Calculate classifier weight
best_error = best_stump[1]
# Guard against division by zero: a perfect stump (error 0) gets a very large weight
alpha = 0.5 * np.log((1 - best_error) / best_error) if best_error > 0 else 1e10
alpha


1.0986122886681098
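
To see how the classifier weight responds to the error rate, the formula can be evaluated at a few error values. This is a sketch: `classifier_weight` and its `eps` clipping parameter are hypothetical, chosen here as an alternative to the `1e10` fallback used in the cell above.

```python
import numpy as np


def classifier_weight(error, eps=1e-10):
    """Hypothetical helper: alpha = 0.5 * ln((1 - error) / error).

    The error is clipped to [eps, 1 - eps] so that an error of exactly
    0 or 1 yields a large finite weight instead of a division by zero.
    """
    error = np.clip(error, eps, 1 - eps)
    return 0.5 * np.log((1 - error) / error)


for error in (0.1, 0.3, 0.5, 0.7):
    print(f"error={error:.1f} -> alpha={classifier_weight(error):+.4f}")
```

The weight is positive for errors below 0.5, zero at exactly 0.5 (no better than chance), and negative above 0.5, which effectively flips that classifier's vote.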


## Step 5: Update Sample Weights

Increase weights for misclassified samples to give them more importance in the next round.


In [None]:

feature_idx = best_stump[0] - 1  # Convert to 0-indexed
predictions = np.where(X[:, feature_idx] == 1, 1, -1)

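# Update weights: misclassified samples are multiplied by exp(alpha),
# correctly classified samples are multiplied by exp(0) = 1
weights *= np.exp(alpha * (predictions != y))
weights /= np.sum(weights)  # Normalize so the weights sum to 1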
weights


array([0.08333333, 0.08333333, 0.08333333, 0.08333333, 0.08333333,
       0.08333333, 0.25      , 0.08333333, 0.08333333, 0.08333333])

Sample 7 now carries a larger weight (0.25 versus about 0.083 for every other sample), so misclassifying it again would cost more in the next round.
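
For comparison, the commonly cited discrete AdaBoost update multiplies each weight by $\exp(-\alpha_t \, y_i \, h_t(x_i))$, which shrinks correctly classified samples as well as growing misclassified ones. The cell above takes a simpler route and only scales up the mistakes. The sketch below computes both variants side by side on this round's predictions; the variable names `w_notebook` and `w_textbook` are made up for illustration.

```python
import numpy as np

y = np.array([1, -1, 1, -1, 1, -1, 1, -1, 1, -1])
# Predictions of the Feature_1 == 1 stump (only sample 7 is wrong)
predictions = np.array([1, -1, 1, -1, 1, -1, -1, -1, 1, -1])
weights = np.full(len(y), 0.1)
alpha = 0.5 * np.log(0.9 / 0.1)

# Notebook variant: scale up misclassified samples only, then normalize
w_notebook = weights * np.exp(alpha * (predictions != y))
w_notebook /= w_notebook.sum()

# Textbook variant: y * h is +1 for a correct prediction and -1 for a mistake
w_textbook = weights * np.exp(-alpha * y * predictions)
w_textbook /= w_textbook.sum()

print(np.round(w_notebook, 4))
print(np.round(w_textbook, 4))
```

With the textbook variant, the misclassified samples end up holding half of the total weight after normalization, so sample 7 receives 0.5 instead of 0.25.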


## Step 6: Combine Weak Classifiers

The final model takes the sign of the sum of the weak classifiers' predictions, each weighted by its $\alpha_t$.
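
A minimal sketch of how several rounds could be chained and then combined, assuming the same "feature equals 1" stumps and the same weight update as the earlier steps. The names `fit_adaboost`, `predict_adaboost`, `n_rounds`, and `eps` are hypothetical and do not appear in the notebook.

```python
import numpy as np

# Toy dataset from the notebook (rows = samples, columns = Feature_1..Feature_4)
X = np.array([
    [1, 2, 1, 3], [2, 1, 3, 1], [1, 1, 2, 2], [2, 2, 1, 3], [1, 3, 1, 1],
    [3, 1, 2, 2], [2, 3, 3, 1], [3, 2, 1, 2], [1, 2, 3, 3], [2, 1, 2, 3],
])
y = np.array([1, -1, 1, -1, 1, -1, 1, -1, 1, -1])


def stump_predict(X, feature_idx):
    """Notebook-style stump: +1 where the feature equals 1, otherwise -1."""
    return np.where(X[:, feature_idx] == 1, 1, -1)


def fit_adaboost(X, y, n_rounds=3, eps=1e-10):
    """Hypothetical helper: returns a list of (feature_idx, alpha) pairs."""
    weights = np.full(len(y), 1 / len(y))
    ensemble = []
    for _ in range(n_rounds):
        # Pick the feature whose stump has the lowest weighted error
        errors = [np.sum(weights[stump_predict(X, j) != y])
                  for j in range(X.shape[1])]
        feature_idx = int(np.argmin(errors))
        error = np.clip(errors[feature_idx], eps, 1 - eps)
        alpha = 0.5 * np.log((1 - error) / error)
        ensemble.append((feature_idx, alpha))
        # Same update as Step 5: boost the misclassified samples, then normalize
        weights = weights * np.exp(alpha * (stump_predict(X, feature_idx) != y))
        weights /= weights.sum()
    return ensemble


def predict_adaboost(ensemble, X):
    """Sign of the alpha-weighted vote over all stumps in the ensemble."""
    votes = sum(alpha * stump_predict(X, j) for j, alpha in ensemble)
    return np.sign(votes)


ensemble = fit_adaboost(X, y, n_rounds=3)
train_accuracy = np.mean(predict_adaboost(ensemble, X) == y)
print(ensemble)
print(train_accuracy)
```

Because every stump here is restricted to the "equals 1" test, extra rounds have few distinct rules to choose from; a richer stump family would give boosting more to work with.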



## Step 7: Test the Final Model

Using the final model, we predict labels for new samples. Only one boosting round was run above, so the ensemble here consists of the single selected stump.


In [11]:

# Example test data
test_data = np.array([[2, 2, 2, 2], [1, 3, 1, 1], [3, 1, 3, 2]])
test_data


array([[2, 2, 2, 2],
       [1, 3, 1, 1],
       [3, 1, 3, 2]])

In [None]:
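# The ensemble holds one (feature index, alpha) pair per round; only one round was run above
stumps = [(best_stump[0] - 1, alpha)]
# Weighted vote: sign of the alpha-weighted sum of the stump predictions
test_predictions = np.sign(np.sum([a * np.where(test_data[:, idx] == 1, 1, -1)
                                   for idx, a in stumps], axis=0))
test_predictions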

array([-1.,  1., -1.])