<a href="https://colab.research.google.com/github/ArcturusMajere/CS575/blob/main/2023_FA_CS_575_HA_02_JohnnyS.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Johnny Stuto CS 575: 2023-FA-CS-575-HA-02-JohnnyS

## Neural Network Implementation


### Problem Definition: Address the issue of sleep/wake prediction based on data from wrist health monitors.

## Data Source: origins and method of data acquisition

- The dataset is a time series of roughly 1 GB (986.46 MB) representing approximately 35 individuals.
- Parquet is an open-source file format built for flat columnar data storage.
- Parquet works well with large, complex data and is known for its data compression and its support for many encoding types.
- Data found at: https://www.kaggle.com/competitions/child-mind-institute-detect-sleep-states/data
- Size: 986.46 MB

## Covariates
#### Z-angle:
Corresponds to the angle between the accelerometer axis perpendicular to the skin surface and the horizontal plane.

#### ENMO:
The Euclidean Norm Minus One (ENMO), expressed in g with negative values rounded to zero, has been shown to correlate with the magnitude of acceleration and human energy expenditure. With x, y, and z the accelerations along the three accelerometer axes in g, ENMO is computed as follows:

$ \text{ENMO} = \max\left(\sqrt{x^2 + y^2 + z^2} - 1,\ 0\right) $
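
As a minimal sketch of the ENMO definition, with negative values set to zero as described in the text, the function below computes it with NumPy. The function name `enmo` and the sample readings are hypothetical and only serve as illustration; they are not taken from the dataset.

```python
import numpy as np

def enmo(x, y, z):
    """ENMO in g: Euclidean norm of the three axes minus 1 g, floored at zero."""
    x, y, z = (np.asarray(v, dtype=float) for v in (x, y, z))
    norm = np.sqrt(x ** 2 + y ** 2 + z ** 2)
    return np.maximum(norm - 1.0, 0.0)

# Hypothetical readings: device at rest, device moving, and a norm below 1 g.
samples_x = np.array([0.0, 0.6, 0.1])
samples_y = np.array([0.0, 0.8, 0.2])
samples_z = np.array([1.0, 1.0, 0.3])
enmo_values = enmo(samples_x, samples_y, samples_z)
print(enmo_values)
```

The first and third samples come out as zero: the first because the norm is exactly 1 g, the third because the negative result is clipped.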

#### sss_std:
Seconds since start, standardized (time unit of the session).


## Procedures employed for data cleaning, normalization, or transformation

See the other notebooks.

Implementation: Neural network structure
- Input layer: 3 neurons (as specified)
- Hidden layer 1: 8 neurons (from the `hidden_sizes` list)
- Hidden layer 2: 4 neurons (from the `hidden_sizes` list)
- Hidden layer 3: 2 neurons (from the `hidden_sizes` list)
- Output layer: 1 neuron (as specified)

So the neural network structure is 3-8-4-2-1.
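
To make the 3-8-4-2-1 structure concrete, the sketch below lists the weight and bias shapes it implies and counts the trainable parameters. It assumes one weight matrix of shape (fan_in, fan_out) and one bias row per pair of adjacent layers; the variable names are illustrative.

```python
layers = [3, 8, 4, 2, 1]  # input, three hidden layers, output

# One weight matrix and one bias row per pair of adjacent layers.
weight_shapes = [(layers[i], layers[i + 1]) for i in range(len(layers) - 1)]
bias_shapes = [(1, layers[i + 1]) for i in range(len(layers) - 1)]

n_weights = sum(rows * cols for rows, cols in weight_shapes)
n_biases = sum(cols for _, cols in bias_shapes)
n_params = n_weights + n_biases

print(weight_shapes)
print(n_weights, n_biases, n_params)
```

With so few parameters, the long training times reported in this notebook are likely driven by the number of rows and the Python-level batch loop rather than by the model size.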

Learning algorithm details

Initialization (`__init__`):
Sets up the neural network structure based on the given input size, hidden layer sizes, and output size.
Initializes the weights of each layer with random values drawn from a standard normal distribution and the biases with zeros.


Activation Functions:
- sigmoid: implements the sigmoid activation function.
- sigmoid_derivative: computes the derivative of the sigmoid function from the already-activated output a, as a * (1 - a).
- tanh: implements the hyperbolic tangent activation function.
- tanh_derivative: computes the derivative of the tanh function from the pre-activation input x, as 1 - tanh(x)^2.
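
The two derivative helpers take different kinds of input, which is easy to get wrong when swapping activations. The sketch below checks both against central finite differences. The standalone function names are illustrative stand-ins that mirror the class's static methods.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def sigmoid_derivative_from_output(a):
    # Expects a = sigmoid(z), not z itself.
    return a * (1 - a)

def tanh_derivative(x):
    # Expects the pre-activation input x.
    return 1 - np.tanh(x) ** 2

z = np.linspace(-3, 3, 7)
eps = 1e-6

# Central finite differences as an independent reference.
num_sigmoid = (sigmoid(z + eps) - sigmoid(z - eps)) / (2 * eps)
num_tanh = (np.tanh(z + eps) - np.tanh(z - eps)) / (2 * eps)

sigmoid_ok = np.allclose(sigmoid_derivative_from_output(sigmoid(z)), num_sigmoid, atol=1e-6)
tanh_ok = np.allclose(tanh_derivative(z), num_tanh, atol=1e-6)
print(sigmoid_ok, tanh_ok)
```

Passing the raw pre-activation to the sigmoid helper, or the activated output to the tanh helper, would give wrong gradients without raising an error.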

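Forward Propagation (forward):
Computes the output of the neural network for given input data, X, applying the sigmoid activation at every layer and storing the intermediate values for the backward pass.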

Backward Propagation (backward):
Computes the gradients for weights and biases from the difference between predicted and actual values.
Uses the chain rule and the intermediate values stored during the forward pass.
Updates weights and biases by subtracting the computed gradients, scaled by the learning rate.
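
The backward pass can be written out as the equations below. This is a sketch under an assumption: the source does not name the loss, but an output error of prediction minus target is what a sigmoid output with binary cross-entropy would give. The symbols are notation introduced here: $\delta^{(l)}$ is the error at layer $l$, $a^{(l)}$ its activation with $a^{(0)} = X$, $L$ the output layer, $m$ the batch size, and $\eta$ the learning rate. Rows are samples, matching the `np.dot(a, W)` convention of the code.

```latex
\begin{aligned}
\delta^{(L)} &= a^{(L)} - y \\
\delta^{(l)} &= \left(\delta^{(l+1)} \, W^{(l+1)\top}\right) \odot a^{(l)} \odot \left(1 - a^{(l)}\right),
  \quad l = L-1, \dots, 1 \\
\frac{\partial \mathcal{L}}{\partial W^{(l)}} &= \frac{1}{m}\, a^{(l-1)\top} \delta^{(l)}, \qquad
\frac{\partial \mathcal{L}}{\partial b^{(l)}} = \frac{1}{m} \sum_{i=1}^{m} \delta^{(l)}_{i} \\
W^{(l)} &\leftarrow W^{(l)} - \eta \, \frac{\partial \mathcal{L}}{\partial W^{(l)}}, \qquad
b^{(l)} \leftarrow b^{(l)} - \eta \, \frac{\partial \mathcal{L}}{\partial b^{(l)}}
\end{aligned}
```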

Training (train):
Trains the neural network using mini-batch gradient descent, reshuffling the data at the start of every epoch.
Optionally computes and prints the accuracy for the current epoch.
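
The mini-batch loop described above can be sketched on its own as a small generator. The helper name `iterate_minibatches` and the toy arrays are hypothetical. Unlike the class, this sketch shuffles an index array instead of reordering the data in place.

```python
import numpy as np

def iterate_minibatches(X, y, batch_size, rng):
    """Yield shuffled (X_batch, y_batch) pairs that together cover every row once."""
    indices = rng.permutation(X.shape[0])
    for start in range(0, X.shape[0], batch_size):
        batch = indices[start:start + batch_size]
        yield X[batch], y[batch]

# Toy data standing in for the real features: 10 rows, 3 columns.
rng = np.random.default_rng(0)
X_toy = np.arange(30, dtype=float).reshape(10, 3)
y_toy = np.arange(10).reshape(-1, 1)

batches = list(iterate_minibatches(X_toy, y_toy, 4, rng))
batch_sizes = [len(xb) for xb, _ in batches]
seen_rows = sorted(int(v) for _, yb in batches for v in yb.ravel())
print(batch_sizes)
```

The last batch is smaller whenever the batch size does not divide the number of rows, which is also how the slicing in `train` behaves.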

Prediction (predict):
Uses the trained neural network to predict the output for given input data, X.
Rounds the predicted probabilities to 0 or 1 for binary classification.
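
One detail of rounding predictions is worth knowing: `np.round` rounds exact halves to the nearest even value, so a probability of exactly 0.5 becomes 0. The sketch below contrasts this with an explicit threshold. The probabilities are made up for illustration.

```python
import numpy as np

probs = np.array([[0.2], [0.5], [0.7]])  # hypothetical sigmoid outputs

rounded = np.round(probs)                   # halves go to the nearest even value
thresholded = (probs >= 0.5).astype(float)  # explicit rule: 0.5 counts as class 1

print(rounded.ravel())
print(thresholded.ravel())
```

Outputs of exactly 0.5 are rare in practice, so the two approaches usually agree. An explicit threshold makes the decision rule visible and easy to tune.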

This class defines a simple feedforward neural network with the capability to save and load models, train on batches of data, and make predictions. The primary activation function used is sigmoid, but tanh is also defined for potential use.


Experiments:
- cross-validation split = 3

Results: best score

Function 'train_kfold' with Tanh activation took 1310.465219 seconds to execute.

| Parameter      | Value        |
|----------------|--------------|
| learning_rate  | 0.01         |
| epochs         | 100          |
| batch_size     | 256          |
| n_splits       |  3           |
| hidden nodes   | [8,4]        |
|      | Fold       | Test Accuracy (%) |
|------|------------|-------------------|
| 0    | Fold 1     | 90.82             |
| 1    | Fold 2     | 90.92             |
| 2    | Fold 3     | 90.52             |

### average = 0.91
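
As a quick check of the reported average, the sketch below recomputes it from the three fold accuracies in the table. The variable names are illustrative.

```python
import numpy as np

fold_accuracy_pct = np.array([90.82, 90.92, 90.52])  # values from the results table

mean_pct = fold_accuracy_pct.mean()
spread_pct = fold_accuracy_pct.max() - fold_accuracy_pct.min()

print(round(mean_pct / 100, 2), round(spread_pct, 2))
```

The folds differ by less than half a percentage point, so the average is not dominated by a single split.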

Conclusion:
- Accuracy is about 90%, which is pretty good for a NumPy-only neural network applied to a time series.

In [1]:
import numpy as np
import pandas as pd
import time
import gc
import pickle  # needed by NN.save_model and NN.load_model
from datetime import datetime
import matplotlib.pyplot as plt
import warnings
warnings.simplefilter(action='ignore', category=Warning)
from sklearn.model_selection import train_test_split, KFold
from sklearn.utils import shuffle

In [2]:
!gdown 1yRg_tb2HD4VFMlnxnUkU_D4XdTSAOv8m
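file_path1 = '/content/X_full.csv'
Z = pd.read_csv(file_path1)
# Columns excluded from the feature matrix (identifiers, raw signals, target, unscaled time fields)
drop_cols = ['series_id', 'step', 'timestamp', 'anglez', 'enmo', 'awake',
             'seconds_since_start', 'sss', 'mss', 'hss']
# Extract the target before it is dropped from the frame
y = np.array(Z['awake'])
Z.drop(columns=drop_cols, inplace=True)
X = np.array(Z)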

Downloading...
From: https://drive.google.com/uc?id=1yRg_tb2HD4VFMlnxnUkU_D4XdTSAOv8m
To: /content/X_full.csv
100% 2.19G/2.19G [00:16<00:00, 134MB/s]


In [3]:
def timer(func):
    def wrapper(*args, **kwargs):
        start = time.time()
        result = func(*args, **kwargs)
        end = time.time()
        elapsed = end - start
        print(f"Function '{func.__name__}' took {elapsed:.6f} seconds to execute.")
        return result
    return wrapper

In [4]:
class NN:
    def __init__(self, input_size, hidden_sizes, output_size):
        self.layers = [input_size] + hidden_sizes + [output_size]
        self.weights = []
        self.biases = []

        # Initialize weights & biases
        for i in range(len(self.layers) - 1):
            self.weights.append(np.random.randn(self.layers[i], self.layers[i+1]))
            self.biases.append(np.zeros((1, self.layers[i+1])))

    def save_model(self, filename):
        with open(filename, 'wb') as file:
            pickle.dump([self.weights, self.biases], file)

    def load_model(self, filename):
        with open(filename, 'rb') as file:
            self.weights, self.biases = pickle.load(file)

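    # Activations. sigmoid_derivative expects the activated output a = sigmoid(z);
    # tanh_derivative expects the pre-activation input x.
    @staticmethod
    def sigmoid(x):
        return 1 / (1 + np.exp(-x))

    @staticmethod
    def sigmoid_derivative(a):
        return a * (1 - a)

    @staticmethod
    def tanh(x):
        return np.tanh(x)

    @staticmethod
    def tanh_derivative(x):
        return 1 - np.tanh(x)**2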


    def forward(self, X):
        self.a = [X]
        self.z = []
        for i in range(len(self.layers) - 1):
            z_temp = np.dot(self.a[-1], self.weights[i]) + self.biases[i]
            a_temp = self.sigmoid(z_temp)
            self.z.append(z_temp)
            self.a.append(a_temp)
        return self.a[-1]

    def backward(self, X, y, learning_rate):
        m = X.shape[0]
        # Output-layer error (prediction minus target) and its weight/bias gradients
        self.dz = [self.a[-1] - y]
        self.dw = [np.dot(self.a[-2].T, self.dz[-1]) / m]
        self.db = [np.sum(self.dz[-1], axis=0) / m]

        # Propagate the error backwards through the hidden layers (chain rule)
        for i in range(len(self.layers)-3, -1, -1):
            dz_temp = np.dot(self.dz[-1], self.weights[i+1].T) * self.sigmoid_derivative(self.a[i+1])
            dw_temp = np.dot(self.a[i].T, dz_temp) / m
            db_temp = np.sum(dz_temp, axis=0) / m
            self.dz.append(dz_temp)
            self.dw.append(dw_temp)
            self.db.append(db_temp)

        # Gradients were collected from the last layer to the first; restore layer order
        self.dw = self.dw[::-1]
        self.db = self.db[::-1]

        # Update weights & biases
        for i in range(len(self.weights)):
            self.weights[i] -= learning_rate * self.dw[i]
            self.biases[i] -= learning_rate * self.db[i]

    def train(self, X, y, learning_rate, epochs, batch_size):
        m = X.shape[0]
        for epoch in range(epochs):
            indices = np.arange(m)
            np.random.shuffle(indices)
            X = X[indices]
            y = y[indices]
            for i in range(0, m, batch_size):
                X_mini = X[i:i+batch_size]
                y_mini = y[i:i+batch_size]
                self.forward(X_mini)
                self.backward(X_mini, y_mini, learning_rate)
            # Training-set accuracy after this epoch (uncomment the print to display it)
            predictions = self.forward(X)
            epoch_accuracy = np.mean(np.round(predictions) == y)
            #print(f"Epoch {epoch+1}/{epochs} - Accuracy: {epoch_accuracy:.4f}")

    def predict(self, X):
        predictions = self.forward(X)
        return np.round(predictions)


In [5]:
# Parameters
learning_rate = 0.001
epochs = 20
batch_size = 256
kf = KFold(n_splits=3)
latent1 = [8,4,2] #.91
latent2 = [3,6,9]
#M = 'nn1_model.pkl'
nn = NN(input_size=3, hidden_sizes=latent1, output_size=1)

In [6]:
@timer
def train_kfold():
    fold_accuracies = []

    for train_index, test_index in kf.split(X):
        X_train, X_test = X[train_index], X[test_index]
        y_train, y_test = y[train_index], y[test_index]
        # Note: the same global `nn` is reused, so each fold continues from the previous fold's weights
        nn.train(X_train, y_train.reshape(-1, 1), learning_rate, epochs, batch_size)
        #nn.save_model(M)
        preds = nn.predict(X_test)
        #print(metrics(y_test,preds))
        fold_acc = np.mean(preds == y_test.reshape(-1, 1))
        fold_accuracies.append(fold_acc)

    return fold_accuracies

fold_accuracies = train_kfold()

results_df = pd.DataFrame({
    'Fold': ['Fold 1', 'Fold 2', 'Fold 3'],
    'Test Accuracy (%)': [round(val * 100, 2) for val in fold_accuracies]})

print(np.round(np.mean(fold_accuracies), 2))
results_df

#nn.load_model('nn_model.pkl')

Function 'train_kfold' took 1044.229539 seconds to execute.
0.89


|      | Fold       | Test Accuracy (%) |
|------|------------|-------------------|
| 0    | Fold 1     | 85.92             |
| 1    | Fold 2     | 90.56             |
| 2    | Fold 3     | 90.69             |
