Michael O'Hanlon\
Professor Monogioudis\
CS301101\
11/17/2022

Assignment #3: Electromyography and Gradient Boosting

# Background and Information

Gradient Boosting is a very popular and effective machine learning technique. It combines several weak models, or learners, into a single strong model, where a weak model is one with poor accuracy. If a model performs only slightly better than random guessing, it can be classified as a weak model. So, the motivation behind Gradient Boosting is to take several models with poor accuracy and combine them into a single strong model with good accuracy, which is what we want of course!
\
\
\
What follows are the four steps of the Gradient Boosting Algorithm.
\
\
\
The Gradient Boosting Algorithm:\
![](https://drive.google.com/uc?export=view&id=1-4Ai4qRnqSM6I74AYL9qPcihuX9cLGct)

When M is sufficiently large, the result is a strong composite model which can be used to make predictions.
\
\
\
For the classification task, the loss function is defined as:\
![](https://drive.google.com/uc?export=view&id=1MIuSSvHCDbI2H7IH3H5ru-FkVu2rYAGi)

However, in the case of multiclass classification, which will be applied later, the loss function is defined as:\
![](https://drive.google.com/uc?export=view&id=10T9G2VYHVJR-CgvRmb4kRAROfkfvV4U-)
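
As a rough illustration, the multiclass case can be sketched in JAX. This sketch assumes the loss pictured above is the usual categorical cross-entropy over one-hot labels and softmax probabilities. The function name `multiclass_cross_entropy` and the toy inputs are hypothetical and are not part of the assignment code.

```python
import jax
import jax.numpy as jnp

def multiclass_cross_entropy(y_onehot, logits):
    # Assumed form: -sum_i sum_k y_ik * log(p_ik), with p = softmax(logits)
    log_proba = jax.nn.log_softmax(logits, axis=-1)
    return -jnp.sum(y_onehot * log_proba)

# Toy example: 2 samples, 3 classes (hypothetical values)
labels = jnp.array([0, 2])
y_onehot = jax.nn.one_hot(labels, 3)
logits = jnp.zeros((2, 3))  # uniform predictions over the 3 classes

loss = multiclass_cross_entropy(y_onehot, logits)
# Gradient with respect to the raw scores
grad = jax.grad(multiclass_cross_entropy, argnums=1)(y_onehot, logits)
```

With one-hot labels, the gradient with respect to the raw scores works out to `softmax(logits) - y_onehot`, the multiclass analogue of "prediction minus label".
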
\
\
\
The trick behind Gradient Boosting is to fit each new model to the *residual errors* left by the previous model. As expected from its name, Gradient Boosting computes the gradient of the loss with respect to the current predictions at every step. Each new model is fit to this gradient, so adding it to the ensemble moves the predictions in the direction that reduces the loss.
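
The link between gradients and residuals can be checked numerically. The sketch below assumes a binary cross-entropy loss evaluated on raw scores passed through a sigmoid. The function name `binary_cross_entropy` and the sample values are illustrative only.

```python
import jax
import jax.numpy as jnp

def binary_cross_entropy(y_true, scores):
    # Binary cross-entropy, with raw scores mapped to probabilities by a sigmoid
    p = jax.nn.sigmoid(scores)
    return -jnp.sum(y_true * jnp.log(p) + (1 - y_true) * jnp.log(1 - p))

y_true = jnp.array([1.0, 0.0, 1.0, 0.0])
scores = jnp.array([0.5, -1.0, -0.3, 2.0])  # hypothetical current predictions

# Negative gradient of the loss with respect to the current predictions
neg_grad = -jax.grad(binary_cross_entropy, argnums=1)(y_true, scores)
# Residual between the labels and the predicted probabilities
residual = y_true - jax.nn.sigmoid(scores)
```

Under these assumptions `neg_grad` and `residual` agree, which is why fitting a weak learner to the gradient amounts to fitting it to the residual errors.
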
\
\
\
An important part of using Gradient Boosting is tuning its two hyperparameters, `learning_rate` and `n_estimators`. The first, `learning_rate`, scales how much each weak learner contributes to the ensemble. The second, `n_estimators`, sets the number of models in the ensemble. The two must be tuned together to prevent overfitting and to ensure the resulting model is accurate.
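
The interplay between the two hyperparameters can be seen in a deliberately idealized sketch. It assumes a hypothetical weak learner that reproduces the residual exactly, which real trees only approximate. The function name `remaining_error` and the toy targets are illustrative.

```python
import jax.numpy as jnp

def remaining_error(y, learning_rate, n_estimators):
    # Start from the mean of the targets
    pred = jnp.full_like(y, jnp.mean(y))
    for _ in range(n_estimators):
        # Idealized weak learner (assumption): it reproduces the residual exactly
        residual = y - pred
        # Shrink the learner's contribution by the learning rate
        pred = pred + learning_rate * residual
    # Largest absolute training error left after all rounds
    return float(jnp.max(jnp.abs(y - pred)))

y = jnp.array([0.0, 1.0, 1.0, 0.0, 1.0])
few = remaining_error(y, learning_rate=0.1, n_estimators=10)
many = remaining_error(y, learning_rate=0.1, n_estimators=100)
fast = remaining_error(y, learning_rate=0.5, n_estimators=10)
```

In this idealized setting the training error shrinks by a factor of `1 - learning_rate` per round, so a smaller learning rate needs more estimators to reach the same fit. The sketch only tracks training error and says nothing about overfitting on held-out data.
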
\
\
\
In this notebook, Gradient Boosting will be implemented from scratch using JAX libraries. The implementation will then be applied to an Electromyography dataset to see its performance in action.

# Coding from scratch using JAX

In [None]:
"""
Implementation of Gradient Boosting from scratch.
Utilizes JAX libraries to perform classification.

Video Reference: 
https://www.youtube.com/watch?v=SstuvS-tVc0&list=WL&index=2&t=7s&ab_channel=AleksaGordi%C4%87-TheAIEpiphany

Useful Links:
https://towardsdatascience.com/gradient-boosting-classification-explained-through-python-60cc980eeb3d
https://www.simplilearn.com/gradient-boosting-algorithm-in-python-article
https://github.com/groverpr/Machine-Learning/blob/master/notebooks/01_Gradient_Boosting_Scratch.ipynb

https://gkaissis.github.io/post/2020-03-15-rfgb/
https://github.com/eriklindernoren/ML-From-Scratch/blob/master/mlfromscratch/supervised_learning/gradient_boosting.py
"""

#Necessary imports
import jax
import jax.numpy as jnp
import numpy as np
from sklearn.tree import DecisionTreeRegressor

#Loss function for classification task
def CrossEntropy(y_true:jnp.array, y_proba:jnp.array):
    y_proba = jnp.clip(y_proba, 1e-5, 1 - 1e-5)
    return jnp.sum(- y_true * jnp.log(y_proba) - (1 - y_true) * jnp.log(1 - y_proba))

#Class for Gradient Boosting
class GradientBooster:
    #Construct a new GradientBooster
    def __init__(self, n_estimators, learning_rate, **kwargs):
        self.n_estimators = n_estimators
        self.learning_rate = learning_rate
        self.loss = CrossEntropy

        #Create all of the estimators to use together
        self.estimators = []
        for _ in range(self.n_estimators):
            self.estimators.append(DecisionTreeRegressor(**kwargs))

    #Function to train the classifier with given Xs and ys
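    def fit(self, X:np.array, y:np.array):
        #Start from raw scores of zero, the same starting point used in predict
        y_pred = np.zeros(np.shape(y), dtype=np.float32)
        #Gradient of the loss with respect to the raw scores (sigmoid maps scores to probabilities)
        grad_fn = jax.grad(lambda y_true, scores: self.loss(y_true, jax.nn.sigmoid(scores)), argnums=1)
        for estimator in self.estimators:
            gradient = np.asarray(grad_fn(y.astype(np.float32), y_pred))
            #Fit the next tree to the gradient, then step against it
            estimator.fit(X, gradient)
            y_pred -= (self.learning_rate * estimator.predict(X))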

    #Function to make predictions based on X data
    def predict(self, X:np.array):
        y_pred = np.zeros(X.shape[0], dtype=np.float32)
        for estimator in self.estimators:
            y_pred -= (self.learning_rate * estimator.predict(X))

        #Return the prediction
        return np.where(1/(1 + np.exp(-y_pred))>.5, 1, 0)

# EMG Dataset

In [None]:
"""
Load the EMG dataset into the Colab environment.
Apply the Gradient Boosting implementation to the dataset.

20 Classes (10 Aggressive, 10 Normal)
Load the data into a csv and put in Google Drive
Add a column to the data indicating its class
Load the data from all 4 sub folders
Separate the data into X (features) and y (class)
Split the data into training and test
"""

#Necessary imports
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

#Mount Google Drive
from google.colab import drive
drive.mount('/content/drive')

#Create a gradient booster that creates 100 estimators with a learning rate of 0.1
gradBooster = GradientBooster(100, 0.1)