# Boosting With Neural Networks

## Markdown Section

### Representation

Let $H$ be the class of base, un-boosted hypotheses. Then, $E$ is defined as the ensemble of $H$ weak learners of size $T$.

$$E(H, T) = {x \to sign(Σ(w_t * h_t(x))) : w ∈ R^T, ∀t, h_t ∈ H}$$



### Check Version

In [1]:
from __future__ import print_function
from packaging.version import parse as Version
from platform import python_version

OK = '\x1b[42m[ OK ]\x1b[0m'
FAIL = "\x1b[41m[FAIL]\x1b[0m"

try:
    import importlib
except ImportError:
    print(FAIL, "Python version 3.12.5 is required,"
                " but %s is installed." % sys.version)

def import_version(pkg, min_ver, fail_msg=""):
    mod = None
    try:
        mod = importlib.import_module(pkg)
        if pkg in {'PIL'}:
            ver = mod.VERSION
        else:
            ver = mod.__version__
        if Version(ver) == Version(min_ver):
            print(OK, "%s version %s is installed."
                  % (lib, min_ver))
        else:
            print(FAIL, "%s version %s is required, but %s installed."
                  % (lib, min_ver, ver))
    except ImportError:
        print(FAIL, '%s not installed. %s' % (pkg, fail_msg))
    return mod


# first check the python version
pyversion = Version(python_version())

if pyversion >= Version("3.12.5"):
    print(OK, "Python version is %s" % pyversion)
elif pyversion < Version("3.12.5"):
    print(FAIL, "Python version 3.12.5 is required,"
                " but %s is installed." % pyversion)
else:
    print(FAIL, "Unknown Python version: %s" % pyversion)


print()
requirements = {'matplotlib': "3.9.1", 'numpy': "2.0.1",'sklearn': "1.5.1",
                'pandas': "2.2.2"}

# now the dependencies
for lib, required_version in list(requirements.items()):
    import_version(lib, required_version)

## Model Section

### Weak Learner: One Layer Neural Network

In [10]:
import numpy as np
import random

def l2_loss(predictions,Y):
    '''
        Computes L2 loss (sum squared loss) between true values, Y, and predictions.
        :param Y: A 1D Numpy array with real values (float64)
        :param predictions: A 1D Numpy array of the same size of Y
        :return: L2 loss using predictions for Y.
    '''

    return np.sum((predictions - Y)**2)

class OneLayerNN:
    '''
        One layer neural network trained with Stocastic Gradient Descent (SGD)
    '''
    def __init__(self):
        '''
        @attrs:
            weights: The weights of the neural network model.
            batch_size: The number of examples in each batch
            learning_rate: The learning rate to use for SGD
            epochs: The number of times to pass through the dataset
            v: The resulting predictions computed during the forward pass
        '''
        # initialize self.weights in train()
        self.weights = None
        self.learning_rate = 0.001
        self.epochs = 25
        self.batch_size = 1

        # initialize self.v in forward_pass()
        self.v = None

    def train(self, X, Y, print_loss=False):
        '''
        Trains the OneLayerNN model using SGD.
        :param X: 2D Numpy array where each row contains an example
        :param Y: 1D Numpy array containing the corresponding values for each example
        :param print_loss: If True, print the loss after each epoch.
        :return: None
        '''
        # Initialize weights
        input_size = X.shape[1]
        self.weights = np.random.uniform(0,1,(1, input_size))
        #print("Weights Initial Shape: ", self.weights.shape)

        # Train network for certain number of epochs
        for epoch in range(self.epochs):

            # Shuffle the examples (X) and labels (Y)
            rand_index = np.arange(X.shape[0])
            np.random.shuffle(rand_index)
            X_s = X[rand_index]
            Y_s = Y[rand_index]

             # iterate through the examples in batch size increments
            for i in range((int(np.ceil(X_s.shape[0] / self.batch_size)))):
                X_batch = X_s[i * self.batch_size : (i + 1) * self.batch_size]
                Y_batch = Y_s[i * self.batch_size : (i + 1) * self.batch_size]

                #Perform the forward and backward pass on the current batch
                self.forward_pass(X_batch)
                self.backward_pass(X_batch, Y_batch)

            # Print the loss after every epoch
            if print_loss:
                print('Epoch: {} | Loss: {}'.format(epoch, self.loss(X, Y)))

    def forward_pass(self, X):
        '''
        Computes the predictions for a single layer given examples X and
        stores them in self.v
        :param X: 2D Numpy array where each row contains an example.
        :return: None
        '''

        self.v = np.dot(self.weights, X.T) # + bias (no bias for now?)
        #print("v shape: ", self.v.shape)



    def backward_pass(self, X, Y):
        '''
        Computes the weights gradient and updates self.weights
        :param X: 2D Numpy array where each row contains an example
        :param Y: 1D Numpy array containing the corresponding values for each example
        :return: None
        '''
        # Compute the gradients for the model's weights using backprop
        grads = self.backprop(X,Y)

        # Update the weights using gradient descent
        self.gradient_descent(grads)



    def backprop(self, X, Y):
        '''
        Returns the average weights gradient for the given batch
        :param X: 2D Numpy array where each row contains an example.
        :param Y: 1D Numpy array containing the corresponding values for each example
        :return: A 1D Numpy array representing the weights gradient
        '''
        # Compute the average weights gradient
        # Refer to the SGD algorithm in slide 12 in Lecture 17: Backpropagation

        # The gradient dL/dw = -2 * xi * predictons (sum(yi-h(xi)))
        m = X.shape[0]
        grad_W = (2/m) * np.dot(X.T, self.v - Y)

        #print("Shape of grad_W: ", grad_W.T.shape)

        return grad_W.T

    def gradient_descent(self, grad_W):
        '''
        Updates the weights using the given gradient
        :param grad_W: A 1D Numpy array representing the weights gradient
        :return: None
        '''
        # Update the weights using the given gradient and the learning rate
        # Refer to the SGD algorithm in slide 12 in Lecture 17: Backpropagation
        self.weights = self.weights - (self.learning_rate * grad_W)
        #print(self.weights)


    def loss(self, X, Y):
        '''
        Returns the total squared error on some dataset (X, Y).
        :param X: 2D Numpy array where each row contains an example
        :param Y: 1D Numpy array containing the corresponding values for each example
        :return: A float which is the squared error of the model on the dataset
        '''
        # Perform the forward pass and compute the l2 loss
        self.forward_pass(X)
        return l2_loss(self.v, Y)

    def average_loss(self, X, Y):
        '''
        Returns the mean squared error on some dataset (X, Y).
        MSE = Total squared error/# of examples
        :param X: 2D Numpy array where each row contains an example
        :param Y: 1D Numpy array containing the corresponding values for each example
        :return: A float which is the mean squared error of the model on the dataset
        '''
        return self.loss(X, Y) / X.shape[0]

    def predict(self, X):
        '''
        Returns the predicted values for some dataset (X).
        :param X: 2D Numpy array where each row contains an example
        :return: 1D Numpy array containing the predicted values for each example
        '''
        self.forward_pass(X)
        return self.v

### Boosting Model

In [44]:
class Boosted_NN:
  def __init__(self, n_estimators=50, learning_rate=0.01, random_state=1):
    self.n_estimators = n_estimators
    self.learning_rate = learning_rate
    self.random_state = random_state
    self.estimator_weights = np.zeros(self.n_estimators)

    # Initialize the estimators
    self.estimators = []
    for i in range(self.n_estimators):
      self.estimators.append(OneLayerNN())

  def train(self, X, y):
    '''
    Trains/Fits the Boosting Model using AdaBoost.
    :param X: 2D Numpy array where each row contains an example
    :param Y: 1D Numpy array containing the corresponding values for each example
    '''
    # Initialize the weights
    num_inputs = X.shape[0]
    self.estimator_weights = 1/num_inputs * np.ones(num_inputs)

    # For each round/weak learner
    for i in range(self.n_estimators):

      # Use the weak learner
      weak_learner = self.estimators[i]

      # Fit the weak learner
      weak_learner.train(X, y)

      y_pred = weak_learner.predict(X).reshape(num_inputs)

      loss = l2_loss(y, y_pred)
      e_t = (self.estimator_weights[i] * loss) / loss
      print("weighted error", e_t)
      w_t = 0.5 * np.log((1 / e_t) - 1)
      print("w_t", w_t)
      
  
      self.estimator_weights *= np.exp(-w_t * y * y_pred)
      self.estimator_weights /= np.sum(self.estimator_weights)
      print("sum of weights (should be 1)", np.sum(self.estimator_weights))


  def loss(self, X, Y):
    print("X: ", X)
    print("Y: ", Y)
    print("first estimator loss: ", self.estimators[0].loss(X,Y))
    print("second estimator loss: ", self.estimators[1].loss(X,Y))
    estimator_loss = np.array([e.loss(X, Y) for e in self.estimators])
    print(estimator_loss)
    return np.dot(estimator_loss, self.estimator_weights)


  def predict(self, X):
    '''
    Returns the predicted values for some dataset (X).
    :param X: 2D Numpy array where each row contains an example
    :return: 1D Numpy array containing the predicted values for each example
    '''
    y_pred = np.array([e.predict(X) for e in self.estimators])
    return np.dot(self.estimator_weights, y_pred)


## Accuracy on Data Sets

In [45]:
from sklearn.model_selection import train_test_split
import os 


def test_models(dataset, test_size=0.2):
    '''
        Tests OneLayerNN, Boost on a given dataset.
        :param dataset The path to the dataset
        :return None
    '''

    # Check if the file exists
    if not os.path.exists(dataset):
        print('The file {} does not exist'.format(dataset))
        exit()

    # Load in the dataset
    data = np.loadtxt(dataset, skiprows = 1)
    X, Y = data[:, 1:], data[:, 0]

    # Normalize the features
    X = (X-np.mean(X, axis=0))/np.std(X, axis=0)

    X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=test_size)

    print('Running models on {} dataset'.format(dataset))

    # Add a bias
    X_train_b = np.append(X_train, np.ones((len(X_train), 1)), axis=1)
    X_test_b = np.append(X_test, np.ones((len(X_test), 1)), axis=1)

    #### 1-Layer NN ######
    print('----- 1-Layer NN -----')
    nnmodel = OneLayerNN()
    nnmodel.train(X_train_b, Y_train, print_loss=False)
    print('Average Training Loss:', nnmodel.average_loss(X_train_b, Y_train))
    print('Average Testing Loss:', nnmodel.average_loss(X_test_b, Y_test))

    #### 2-Layer NN ######
    print('----- Boosted Neural Network -----')
    model = Boosted_NN()

    model.train(X_train_b, Y_train)

    print('Average Training Loss:', model.average_loss(X_train_b, Y_train))
    print('Average Testing Loss:', model.average_loss(X_test_b, Y_test))

test_models('wine.txt')

Running models on wine.txt dataset
----- 1-Layer NN -----
Average Training Loss: 0.5638270145889568
Average Testing Loss: 0.5734783902819962
----- Boosted Neural Network -----
weighted error 0.00025523226135783564
w_t 4.136540666832915
sum of weights (should be 1) 0.9999999999999999
weighted error 2.9764082517316005e-40
w_t 45.50634321541903
sum of weights (should be 1) 1.0
weighted error 0.0
w_t inf
sum of weights (should be 1) nan


  w_t = 0.5 * np.log((1 / e_t) - 1)
  self.estimator_weights /= np.sum(self.estimator_weights)


weighted error nan
w_t nan
sum of weights (should be 1) nan
weighted error nan
w_t nan
sum of weights (should be 1) nan
weighted error nan
w_t nan
sum of weights (should be 1) nan
weighted error nan
w_t nan
sum of weights (should be 1) nan
weighted error nan
w_t nan
sum of weights (should be 1) nan
weighted error nan
w_t nan
sum of weights (should be 1) nan
weighted error nan
w_t nan
sum of weights (should be 1) nan
weighted error nan
w_t nan
sum of weights (should be 1) nan
weighted error nan
w_t nan
sum of weights (should be 1) nan
weighted error nan
w_t nan
sum of weights (should be 1) nan
weighted error nan
w_t nan
sum of weights (should be 1) nan
weighted error nan
w_t nan
sum of weights (should be 1) nan
weighted error nan
w_t nan
sum of weights (should be 1) nan
weighted error nan
w_t nan
sum of weights (should be 1) nan
weighted error nan
w_t nan
sum of weights (should be 1) nan
weighted error nan
w_t nan
sum of weights (should be 1) nan
weighted error nan
w_t nan
sum of weight

AttributeError: 'Boosted_NN' object has no attribute 'average_loss'