# Implement AdaBoost Fit Method

Write a Python function `adaboost_fit` that implements the fit method for an AdaBoost classifier built from single-feature threshold classifiers (decision stumps). The function takes a 2D NumPy array `X` of shape `(n_samples, n_features)` holding the dataset, a 1D NumPy array `y` of shape `(n_samples,)` holding the labels (`1` or `-1`, as in the example below), and an integer `n_clf` giving the number of classifiers to train. It should initialize the sample weights, search every feature for the best threshold, compute the weighted error, update the weights, and return a list of classifiers with their parameters.

Example:
```py
import numpy as np

X = np.array([[1, 2], [2, 3], [3, 4], [4, 5]])
y = np.array([1, 1, -1, -1])
n_clf = 3

clfs = adaboost_fit(X, y, n_clf)
print(clfs)
# Output (example format, actual values may vary):
# [{'polarity': 1, 'threshold': 2, 'feature_index': 0, 'alpha': 0.5},
#  {'polarity': -1, 'threshold': 3, 'feature_index': 1, 'alpha': 0.3},
#  {'polarity': 1, 'threshold': 4, 'feature_index': 0, 'alpha': 0.2}]
```

## Understanding AdaBoost

AdaBoost, short for Adaptive Boosting, is an ensemble learning method that combines multiple weak classifiers to create a strong classifier. The basic idea is to fit a sequence of weak learners on weighted versions of the data.
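
For orientation, the combined classifier is commonly written as a weighted vote of the weak learners. The symbols $T$ (the number of rounds, `n_clf` in this task), $h_t$ (the $t$-th weak classifier), and $\alpha_t$ (its weight) are notation introduced here for illustration. The fit method only has to produce the $h_t$ and $\alpha_t$, not the vote itself:

$$H(x) = \operatorname{sign}\left(\sum_{t=1}^{T} \alpha_t \, h_t(x)\right)$$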

Here's how to implement the fit method for an AdaBoost classifier:

- **Initialize Weights**: Start by initializing the sample weights uniformly:

    $w_i = \frac{1}{N}$, where $w_i$ is the weight of the $i$-th sample and $N$ is the total number of samples.
 
- **Iterate Through Classifiers**: For each of the `n_clf` classifiers, try every unique value of every feature as a threshold and keep the feature, threshold, and polarity that give the lowest weighted error.

- **Calculate Error and Flip Polarity**: Compute the weighted error of each candidate threshold as the total weight of the samples it misclassifies:
    
    $error = \sum_{i=1}^{N} w_i \cdot \mathbb{1}(y_i \neq h(x_i))$, where $\mathbb{1}$ is the indicator function, $y_i$ is the true label, and $h(x_i)$ is the predicted label.

    If $error > 0.5$, set $error = 1 - error$ and flip the polarity.

- **Calculate Alpha**: Compute the weight (alpha) of the classifier based on its error rate:
    
    $\alpha = \frac{1}{2} \ln\left(\frac{1 - error}{error + 10^{-10}}\right)$, where the small constant $10^{-10}$ avoids division by zero when the error is exactly 0.
 
- **Update Weights**: Adjust the sample weights based on the classifier's performance and normalize them:

    $w_i = w_i \cdot \exp(-\alpha \cdot y_i \cdot h(x_i))$
    
    $w_i = \frac{w_i}{\sum_{j=1}^{N} w_j}$
 
- **Save Classifier**: Store the classifier with its parameters.
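
The steps above use $h(x_i)$ without spelling it out. One way to write the decision stump, matching the rule used in the weight-update step of the solution code below, is the following. The symbols $j$ (chosen feature index), $\theta$ (threshold), and $p \in \{-1, 1\}$ (polarity) are introduced here for illustration:

$$h(x) = \begin{cases} -1 & \text{if } p \cdot x_j < p \cdot \theta \\ +1 & \text{otherwise} \end{cases}$$

With $p = 1$, samples whose feature value is below the threshold are labeled $-1$; with $p = -1$, samples whose feature value is above the threshold are labeled $-1$.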

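Because a misclassified sample has $y_i \cdot h(x_i) = -1$, its weight is multiplied by $e^{\alpha}$, while the weight of a correctly classified sample is multiplied by $e^{-\alpha}$. For $\alpha > 0$, this shifts weight toward the misclassified samples, so subsequent rounds focus on them, thereby improving the overall performance.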
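
To make the arithmetic concrete, here is a minimal sketch of a single boosting round on four samples. The labels and the stump predictions `h` are illustrative values chosen for this example, not taken from the test cases, and the weak learner is assumed to get only the last sample wrong:

```python
import numpy as np

# Toy setup (illustrative values): four samples, and a hypothetical
# weak learner that misclassifies only the last one.
y = np.array([1, 1, -1, -1])
h = np.array([1, 1, -1, 1])

# Uniform initial weights: w_i = 1 / N
w = np.full(len(y), 1 / len(y))

# Weighted error: total weight of the misclassified samples
error = np.sum(w[y != h])

# Classifier weight, with the same 1e-10 guard as in the formula
alpha = 0.5 * np.log((1 - error) / (error + 1e-10))

# Multiplicative update followed by normalization
w_new = w * np.exp(-alpha * y * h)
w_new /= np.sum(w_new)

print(f"error = {error:.2f}, alpha = {alpha:.4f}")
print("updated weights:", np.round(w_new, 4))
```

Under these assumptions the error is 0.25, alpha is about 0.5493, and after normalization the misclassified sample carries about half of the total weight (0.5), while each correctly classified sample drops to about 0.167.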

In [1]:
import numpy as np
import math

def adaboost_fit(X, y, n_clf):
    n_samples, n_features = np.shape(X)
    # 1. Initialize weights uniformly: w_i = 1 / N
    w = np.full(n_samples, (1 / n_samples))
    clfs = []
    
    # 2. Iterate through classifiers: in each round, search every feature
    # and every unique threshold for the stump with the lowest weighted error.
    for _ in range(n_clf):
        clf = {}
        min_error = float('inf')
        
        for feature_i in range(n_features):
            unique_values = np.unique(X[:, feature_i])
            
            for threshold in unique_values:
                p = 1
                prediction = np.ones(np.shape(y))
                prediction[X[:, feature_i] < threshold] = -1
				
                # 3. Calculate error and flip polarity: if the weighted error exceeds 0.5,
                # the opposite prediction is better, so use 1 - error and polarity -1.
                error = sum(w[y != prediction])
                if error > 0.5:
                    error = 1 - error
                    p = -1
                
                if error < min_error:
                    clf['polarity'] = p
                    clf['threshold'] = threshold
                    clf['feature_index'] = feature_i
                    min_error = error
        
        # 4. Calculate alpha: the classifier's weight, based on its error rate
        # (the 1e-10 term avoids division by zero when the error is 0).
        clf['alpha'] = 0.5 * math.log((1.0 - min_error) / (min_error + 1e-10))
		
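        # 5. Update weights: recompute the chosen stump's predictions, then reweight
        # and normalize the sample weights. Note that for polarity -1 this rule labels
        # x > threshold as -1, so samples equal to the threshold are treated differently
        # from the flipped prediction assumed in the error search above.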
        predictions = np.ones(np.shape(y))
        negative_idx = (clf['polarity'] * X[:, clf['feature_index']] < clf['polarity'] * clf['threshold'])
        predictions[negative_idx] = -1
		
        w *= np.exp(-clf['alpha'] * y * predictions)
        w /= np.sum(w)
        clfs.append(clf)

    return clfs

In [2]:
X = np.array([[1, 2], [2, 3], [3, 4], [4, 5]])
y = np.array([1, 1, -1, -1])
n_clf = 3
clfs = adaboost_fit(X, y, n_clf)

print('Input:\n\
X = np.array([[1, 2], [2, 3], [3, 4], [4, 5]])\n\
y = np.array([1, 1, -1, -1])\n\
n_clf = 3\n\
clfs = adaboost_fit(X, y, n_clf)\n\
print(clfs)')
print()
print('Output:')
print(clfs)
print()
print('Expected:')
print([{'polarity': -1, 'threshold': 3, 'feature_index': 0, 'alpha': 11.512925464970229}, {'polarity': -1, 'threshold': 3, 'feature_index': 0, 'alpha': 11.512924909859024}, {'polarity': -1, 'threshold': 1, 'feature_index': 0, 'alpha': 11.512925464970229}])


Input:
X = np.array([[1, 2], [2, 3], [3, 4], [4, 5]])
y = np.array([1, 1, -1, -1])
n_clf = 3
clfs = adaboost_fit(X, y, n_clf)
print(clfs)

Output:
[{'polarity': -1, 'threshold': 3, 'feature_index': 0, 'alpha': 11.512925464970229}, {'polarity': -1, 'threshold': 3, 'feature_index': 0, 'alpha': 11.512924909859024}, {'polarity': -1, 'threshold': 1, 'feature_index': 0, 'alpha': 11.512925464970229}]

Expected:
[{'polarity': -1, 'threshold': 3, 'feature_index': 0, 'alpha': 11.512925464970229}, {'polarity': -1, 'threshold': 3, 'feature_index': 0, 'alpha': 11.512924909859024}, {'polarity': -1, 'threshold': 1, 'feature_index': 0, 'alpha': 11.512925464970229}]


In [3]:
X = np.array([[8, 7], [3, 4], [5, 9], [4, 0], [1, 0], [0, 7], [3, 8], [4, 2], [6, 8], [0, 2]])
y = np.array([1, -1, 1, -1, 1, -1, -1, -1, 1, 1])
n_clf = 2
clfs = adaboost_fit(X, y, n_clf)

print('Input:\n\
X = np.array([[8, 7], [3, 4], [5, 9], [4, 0], [1, 0], [0, 7], [3, 8], [4, 2], [6, 8], [0, 2]])\n\
y = np.array([1, -1, 1, -1, 1, -1, -1, -1, 1, 1])\n\
n_clf = 2\n\
clfs = adaboost_fit(X, y, n_clf)\n\
print(clfs)')
print()
print('Output:')
print(clfs)
print()
print('Expected:')
print([{'polarity': 1, 'threshold': 5, 'feature_index': 0, 'alpha': 0.6931471803099453}, {'polarity': -1, 'threshold': 3, 'feature_index': 0, 'alpha': 0.5493061439673882}])

Input:
X = np.array([[8, 7], [3, 4], [5, 9], [4, 0], [1, 0], [0, 7], [3, 8], [4, 2], [6, 8], [0, 2]])
y = np.array([1, -1, 1, -1, 1, -1, -1, -1, 1, 1])
n_clf = 2
clfs = adaboost_fit(X, y, n_clf)
print(clfs)

Output:
[{'polarity': 1, 'threshold': 5, 'feature_index': 0, 'alpha': 0.6931471803099453}, {'polarity': -1, 'threshold': 3, 'feature_index': 0, 'alpha': 0.5493061439673882}]

Expected:
[{'polarity': 1, 'threshold': 5, 'feature_index': 0, 'alpha': 0.6931471803099453}, {'polarity': -1, 'threshold': 3, 'feature_index': 0, 'alpha': 0.5493061439673882}]
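
The task only asks for the fit step, but a prediction helper makes the returned dictionaries easier to interpret. The following is a minimal sketch, not part of the original solution. `adaboost_predict` is a hypothetical name, and the sketch assumes each stump predicts with the same rule that `adaboost_fit` uses in its weight-update step. The classifier list is hand-written with illustrative values so the block runs on its own:

```python
import numpy as np

def adaboost_predict(X, clfs):
    """Hypothetical helper: alpha-weighted vote of decision stumps."""
    X = np.asarray(X)
    scores = np.zeros(X.shape[0])
    for clf in clfs:
        p = clf['polarity']
        # Same stump rule as the weight-update step of the fit code
        pred = np.ones(X.shape[0])
        pred[p * X[:, clf['feature_index']] < p * clf['threshold']] = -1
        scores += clf['alpha'] * pred
    # A score of exactly 0 is mapped to +1; that tie-break is a choice made for this sketch
    return np.where(scores >= 0, 1, -1)

# Hand-written classifiers in the same dictionary format (illustrative values)
clfs = [
    {'polarity': 1, 'threshold': 3, 'feature_index': 0, 'alpha': 0.8},
    {'polarity': -1, 'threshold': 4, 'feature_index': 1, 'alpha': 0.3},
]
X = np.array([[1, 2], [2, 3], [3, 4], [4, 5]])
print(adaboost_predict(X, clfs))
```

For these two stumps the summed scores are `[-0.5, -0.5, 1.1, 0.5]`, so the first two samples are labeled `-1` and the last two `1`. In practice the list returned by `adaboost_fit(X, y, n_clf)` would be passed in place of the hand-written one.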
