Modify the AdaBoost scratch code in our lecture such that:
- Notice that if <code>err</code> = 0, then $\alpha$ will be undefined, thus attempt to fix this by adding some very small value to the lower term
- Notice that sklearn version of AdaBoost has a parameter <code>learning_rate</code>.  This is in fact the $\frac{1}{2}$ in front of the $\alpha$ calculation.  Attempt to change this $\frac{1}{2}$ into a parameter called <code>eta</code>, and try different values of it and see whether accuracy is improved.  Note that sklearn default this value to 1.
- Observe that we are actually using sklearn DecisionTreeClassifier.  If we take a look at it closely, it is actually using weighted gini index, instead of weighted errors that we learn above.  Attempt to write your own class of <code>class Stump</code> that actually uses weighted errors, instead of weighted gini index.   To check whether your stump really works, it should give you still relatively the same accuracy.  In addition, if you do not change y to -1, it will result in very bad accuracy.  Unlike sklearn version of DecisionTree, it will STILL work even y is not change to -1 since it uses gini index
- Put everything into a class

In [41]:
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split


X, y = make_classification(n_samples = 500, random_state= 1)
y = np.where(y==0, -1, 1)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size= 0.2, random_state= 42)

# X_train.shape

In [42]:
from sklearn.metrics import accuracy_score
from sklearn.metrics import classification_report


class DecisionStump:
    def __init__(self):
        self.polarity = 1
        self.threshold = None
        self.feature_index = None
        self.alpha = None

In [43]:
class AdaBoost:
    def __init__(self, S= 5, eta = 0.5):
        self.S = S
        self.eta = eta
        
    def fit(self, X, y):
        m, n = X.shape
        
        W = np.full(m, 1/m)
        
        self.clfs = []
         
        for _ in range(self.S):
            clf = DecisionStump()
            min_err = np.inf
            for feature in range(n):
                feature_vals = np.sort(np.unique(X[:, feature]))
                thresholds = (feature_vals[:-1] + feature_vals[1:])/2
                for threshold in thresholds:
                    for polarity in [1, -1]:
                        y_hat = np.ones(len(y))
                        y_hat[polarity * X[:, feature] < polarity * threshold] = -1
                        err = W[(y_hat != y)].sum()

                        if err < min_err:
                            clf.polarity = polarity
                            clf.threshold = threshold
                            clf.feature_index = feature
                            min_err = err
            
            eps = 1e-10
            clf.alpha = self.eta * np.log((1-min_err)/ (min_err + eps))
            W = W * np.exp(-clf.alpha * y_hat * y)
            W = W/ sum(W)

            self.clfs.append(clf)
        
    def predict(self, X):
        m, n = X.shape
        y_hat = np.zeros(m)
        for clf in self.clfs:
            pred = np.ones(m)
            pred[clf.polarity * X[:, clf.feature_index] < clf.polarity * clf.threshold] = -1
            y_hat += clf.alpha * pred
            
        return np.sign(y_hat)

In [46]:
model = AdaBoost(S = 30)
model.fit(X_train, y_train)
y_hat = model.predict(X_test)
print(classification_report(y_test, y_hat))

              precision    recall  f1-score   support

          -1       0.74      0.98      0.84        52
           1       0.97      0.62      0.76        48

    accuracy                           0.81       100
   macro avg       0.85      0.80      0.80       100
weighted avg       0.85      0.81      0.80       100

