Modify the AdaBoost scratch code from our lecture so that:
- If <code>err</code> = 0, $\alpha$ is undefined because of a division by zero. Fix this by adding a very small value to the denominator.
- The sklearn version of AdaBoost has a <code>learning_rate</code> parameter, which plays the role of the $\frac{1}{2}$ in front of the $\alpha$ calculation.  Replace this $\frac{1}{2}$ with a parameter called <code>eta</code>, try different values of it, and see whether accuracy improves.  Note that sklearn defaults this value to 1.
- We have been using sklearn's <code>DecisionTreeClassifier</code>, which splits on the weighted Gini index rather than the weighted error we learned above.  Write your own <code>class Stump</code> that uses the weighted error instead.  To check that your stump works, it should give roughly the same accuracy.  Note that your stump gives very bad accuracy if you do not recode the 0 labels in y as -1.  sklearn's decision tree, by contrast, STILL works even when y is not changed to -1, because it uses the Gini index.
- Put everything into a class
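
The first two bullets can be sketched together in a few lines. This is a minimal illustration rather than the lecture's code: the function name `alpha` and the default values of `eta` and `eps` are assumptions chosen for the example.

```python
import math

def alpha(err, eta=0.5, eps=1e-10):
    """Stump weight: eta * log((1 - err) / (err + eps)).

    eta and eps are illustrative names and defaults; eps keeps the
    denominator non-zero when err == 0.
    """
    return eta * math.log((1 - err) / (err + eps))

print(alpha(0.0))           # large but finite instead of a division by zero
print(alpha(0.5))           # close to 0: a chance-level stump gets almost no say
print(alpha(0.2, eta=1.0))  # eta=1 gives twice the weight of the classic 1/2
```

Because `eta` only rescales every $\alpha$, its effect on accuracy comes through the sample-weight updates, so it is worth trying several values rather than assuming larger is better.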

In [5]:
from sklearn.model_selection import train_test_split
from sklearn.datasets import make_moons
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import classification_report

# Note: this make_moons split is overwritten by the make_classification data below.
X, y = make_moons(n_samples=500, noise=0.30, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

X, y = make_classification(n_samples=500, random_state=1)
y = np.where(y == 0, -1, 1)  # recode labels: 0 becomes -1, 1 stays 1

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)

In [17]:
class Stump:
    def __init__(self):
        self.feature_index = 0
        self.threshold = 0
        self.polarity = 0
    
    def fit(self, X, y, weight):
        min_err = np.inf
        for feature in range(X.shape[1]):
            feature_vals = np.unique(X[:, feature])  # np.unique already returns sorted values
            for threshold_idx in range(len(feature_vals) - 1):
                threshold = (feature_vals[threshold_idx] + feature_vals[threshold_idx+1]) / 2
                for polarity in [1, -1]:
                    yhat = np.ones(len(y))
                    yhat[polarity * X[:, feature] < polarity * threshold] = -1
                    err = weight[(yhat != y)].sum()

                    if err < min_err:
                        self.polarity = polarity
                        self.threshold = threshold
                        self.feature_index = feature
                        min_err = err

    def predict(self, X):
        # Apply the same rule used in fit(): predict -1 where
        # polarity * x < polarity * threshold, and +1 elsewhere.
        predict = np.ones(X.shape[0])
        feature_vals = X[:, self.feature_index]
        predict[self.polarity * feature_vals < self.polarity * self.threshold] = -1
        return predict
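
The `Stump` above scores each candidate split by its weighted error. For contrast, the sketch below computes a sample-weighted Gini impurity for the same kind of split. It is an illustration under simple assumptions, not sklearn's actual implementation, and the helper names `weighted_error` and `weighted_gini` and the toy data are made up for the example.

```python
import numpy as np

def weighted_error(y, yhat, w):
    # Total weight of the samples whose prediction differs from the label.
    return w[yhat != y].sum()

def weighted_gini(y, left_mask, w):
    # Weighted Gini impurity of a binary split: each side's impurity is
    # computed from class weights, then averaged by that side's total weight.
    total = w.sum()
    gini = 0.0
    for mask in (left_mask, ~left_mask):
        w_side = w[mask].sum()
        if w_side == 0:
            continue  # an empty side contributes nothing
        p = np.array([w[mask & (y == c)].sum() for c in np.unique(y)]) / w_side
        gini += (w_side / total) * (1.0 - np.sum(p ** 2))
    return gini

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([-1, -1, 1, -1])
w = np.array([0.1, 0.1, 0.2, 0.6])

left = x < 2.5
yhat = np.where(left, -1, 1)       # stump predictions are always -1 or +1
y01 = np.where(y == -1, 0, 1)      # the same labels, encoded as 0/1

print(weighted_error(y, yhat, w), weighted_error(y01, yhat, w))
print(weighted_gini(y, left, w), weighted_gini(y01, left, w))
```

The Gini value is identical for both label encodings because it only looks at how the classes are mixed on each side. The weighted error changes, because a stump that predicts -1 can never match a label stored as 0.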

In [14]:
class AdaBoost:
    def __init__(self, S=20, learning_rate=1):
        self.S = S
        self.learning_rate = learning_rate
        self.models = [Stump() for _ in range(S)]

    def fit(self, X, y):
        m = X.shape[0]
        W = np.full(m, 1/m)

        self.a_js = np.zeros(self.S)

        for j, model in enumerate(self.models):
            
            model.fit(X, y, W)
            
            yhat = model.predict(X) 
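            err = W[(yhat != y)].sum()
            # Guard against a division by zero when the stump makes no mistakes.
            err = err if err != 0 else 0.000001

            # learning_rate scales the classic 1/2 * log((1 - err) / err),
            # so the effective factor here is learning_rate / 2.
            a_j = self.learning_rate * np.log((1 - err) / err) / 2
            self.a_js[j] = a_j

            # Reweight: samples the stump got wrong gain weight when a_j > 0;
            # then renormalize so the weights sum to 1.
            W = W * np.exp(-a_j * y * yhat)
            W = W / W.sum()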
        return self.a_js
    
    def predict(self, X):
        Hx = 0
        for i, model in enumerate(self.models):
            yhat = model.predict(X)
            Hx += self.a_js[i] * yhat
        return np.sign(Hx)
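
To make the weight update in `fit` concrete, here is one boosting round worked through on five samples. The labels and the stump output `yhat` are made up for illustration, and `learning_rate` is left out so that the classic $\frac{1}{2}$ factor is used.

```python
import numpy as np

y    = np.array([1, 1, -1, -1,  1])
yhat = np.array([1, 1, -1, -1, -1])    # hypothetical stump output with one mistake
W = np.full(len(y), 1 / len(y))        # start from uniform weights

err = W[yhat != y].sum()               # weighted error of this stump
a_j = 0.5 * np.log((1 - err) / err)    # its say in the final vote

W = W * np.exp(-a_j * y * yhat)        # shrink correct samples, grow the wrong one
W = W / W.sum()                        # renormalize to sum to 1
print(err, a_j)
print(W)
```

After the update, the single misclassified sample carries half of the total weight, which is what pushes the next stump to focus on it.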

In [19]:
model = AdaBoost(S=20, learning_rate=0.5)
model.fit(X_train, y_train)
yhat = model.predict(X_test)
print(classification_report(y_test, yhat))

              precision    recall  f1-score   support

          -1       0.94      0.96      0.95        79
           1       0.96      0.93      0.94        71

   micro avg       0.95      0.95      0.95       150
   macro avg       0.95      0.95      0.95       150
weighted avg       0.95      0.95      0.95       150

