### ===Task===

Your work: Let's modify the above scratch code:
- Notice that if <code>err</code> = 0, then $\alpha$ will be undefined, thus attempt to fix this by adding some very small value to the lower term
- Notice that sklearn version of AdaBoost has a parameter <code>learning_rate</code>.  This is in fact the $\frac{1}{2}$ in front of the $\alpha$ calculation.  Attempt to change this $\frac{1}{2}$ into a parameter called <code>eta</code>, and try different values of it and see whether accuracy is improved.  Note that sklearn default this value to 1.
- Observe that we are actually using sklearn DecisionTreeClassifier.  If we take a look at it closely, it is actually using weighted gini index, instead of weighted errors that we learn above.   Attempt to write your own class of <code>class Stump</code> that actually uses weighted errors, instead of weighted gini index
- Put everything into a class

In [2]:
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
import numpy as np

X, y = make_classification(n_samples=500, random_state=1)
y = np.where(y==0,-1,1)  #change our y to be -1 if it is 0, otherwise 1

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)

In [10]:
class Strump:
    def __init__(self):
        self.polarity = 1
        self.feature_index = None
        self.threshold = None
        self.alpha = None

In [11]:
class Adaboost:
    def __init__(self, eta = 0.5, S = 20):
        self.eta = eta
        self.S = S

    def fit(self, X, y):
        m, n = X.shape

        W = np.full(m, 1/m)

        self.clfs = []

        for i in range(self.S):
            model = Strump()
            min_err = 1
            for feature in range(n):
                X_sorted = np.sort(X[:, feature])
                thd_list = (X_sorted[:-1]+X_sorted[1:])/2
                for thd in thd_list:
                    for polarity in [1, -1]:
                        yhat = np.ones(m)
                        yhat[polarity*X[:,feature] < polarity*thd] = -1
                        # Give min limit of err to be the most min foalt
                        err = max(W[(yhat != y)].sum().sum(),sys.float_info.min)
                        if err < min_err:
                            model.polarity = polarity
                            model.threshold = thd
                            model.feature_index = feature
                            yhat_best = yhat
                            min_err = err
            model.alpha = np.log ((1 - err) / err) * self.eta
            W = (W * np.exp(model.alpha * y * yhat_best)) 
            W = W / sum (W)
            self.clfs.append(model)

    def predict(self, X):
        m = X.shape[0]
        yhat = np.zeros(m)
        for model in self.clfs:
            h = np.ones(m)
            h[model.polarity*X[:,model.feature_index] < model.polarity*model.threshold] = -1
            yhat += model.alpha * h
        return np.sign(yhat)

In [13]:
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report


X, y = make_classification(n_samples=500, random_state=1)
y = np.where(y==0,-1,1)  #change our y to be -1 if it is 0, otherwise 1

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)

model = Adaboost(S=10)
model.fit(X_train, y_train)
yhat = model.predict(X_test)
print(classification_report(y_test, yhat))

              precision    recall  f1-score   support

          -1       0.94      0.95      0.94        79
           1       0.94      0.93      0.94        71

    accuracy                           0.94       150
   macro avg       0.94      0.94      0.94       150
weighted avg       0.94      0.94      0.94       150

