# Assignment 4 - Neural Network

## Grade: 100 pts + 10 Bonus 

This notebook contains the questions for Assignment 4. 

***You must upload this completed Jupyter Notebook file as your submission (other file types are not permitted and will result in a grade of 0).***

* If you have trouble running neural network models on your laptop, you can use online platforms, like **[Google Colab](https://colab.research.google.com/)**.
* All figures should have x- and y-axis labels and an appropriate title.
* **Ensure that your code runs correctly by choosing "Kernel -> Restart and Cell -> Run All" before submitting.**

In [1]:
# You are allowed to use other libraries as needed

import warnings 
warnings.filterwarnings('ignore')

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline

from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score, recall_score, precision_score
from sklearn.preprocessing import StandardScaler

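# Note: keras.wrappers.scikit_learn was removed from recent Keras releases;
# if this import fails, scikeras.wrappers.KerasClassifier is a possible replacement.
from keras.wrappers.scikit_learn import KerasClassifier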
from tensorflow.keras import Model, Sequential
from tensorflow.keras.layers import Dense, Input 
import tensorflow as tf

import time

# Add other imports here if any (for example, PyTorch)

## Data set 
Modern vehicles are highly connected, so protecting the in-vehicle network from cyber-attacks is an important issue. Controller Area Network (CAN) is the standard protocol for in-vehicle networks, but its lack of security features leaves vehicles vulnerable to attacks. The message injection attack is a representative attack type that injects fabricated messages to deceive electronic control units (ECUs) or cause malfunctions. In this notebook, you will develop ML models to detect different types of CAN attacks and protect vehicle networks.

### Source
The dataset (CAN-intrusion-dataset-10000.csv) was constructed by logging CAN traffic via the OBD-II port of a real vehicle while message injection attacks were being performed. The classification goal is to distinguish cyber-attacks from normal traffic by classifying the data samples. The dataset includes over 10,000 records and 10 attributes (including the target variable "Label").

### Variables
The attributes are defined as follows.

* CAN ID : identifier of CAN message.
* DATA[0~7] : data value (byte), ranging from 0 to 255. They have been converted from hexadecimal numbers to decimal numbers.  
* Label : 0 indicates 'Normal', and 1 indicates an attack (including DoS, Fuzzy, Gear, or RPM).
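
As a small illustration of the hexadecimal-to-decimal conversion mentioned for the DATA bytes, the sketch below converts a made-up eight-byte payload. The payload values are hypothetical and are not taken from the dataset.

```python
# Hypothetical CAN payload: eight bytes written as hexadecimal strings.
raw_payload = ["05", "20", "ea", "0a", "20", "1a", "00", "7f"]

# int(text, 16) parses a hexadecimal string into a decimal integer.
data_decimal = [int(byte, 16) for byte in raw_payload]

print(data_decimal)  # every value falls in the range 0 to 255
```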

## Question 1: Load Datasets (15pts)
A) Load the Dataset CAN-intrusion-dataset-10000.csv 

B) Split the data into equal-sized training and test sets (use random_state = 1, and do not shuffle the data).

C) How many observations do you have in your training set?  

D) How many observations for each class in your training set?

E) Z-standardize the input features of the training and test sets.
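
As a rough illustration of parts B) through E), the sketch below works on one made-up feature column and label list (not the CAN data) using only the standard library. It assumes that an unshuffled equal split takes the first half of the rows for training, and that z-standardization applies the training-set mean and population standard deviation to both sets, which is what `StandardScaler` does when it is fit on the training data.

```python
from collections import Counter
from statistics import mean, pstdev

# Hypothetical toy data: one feature column and its binary labels.
feature = [4.0, 8.0, 6.0, 2.0, 10.0, 12.0, 0.0, 6.0]
labels = [0, 1, 0, 0, 1, 1, 0, 1]

# Unshuffled 50/50 split: first half for training, second half for testing.
half = len(feature) // 2
train_x, test_x = feature[:half], feature[half:]
train_y, test_y = labels[:half], labels[half:]

# Number of training observations and observations per class.
print("Training observations:", len(train_x))
class_counts = Counter(train_y)
print("Per class:", dict(class_counts))

# Z-standardize with statistics computed on the training set only,
# then reuse the same mean and standard deviation for the test set.
mu = mean(train_x)
sigma = pstdev(train_x)  # population standard deviation (ddof = 0)
train_z = [(value - mu) / sigma for value in train_x]
test_z = [(value - mu) / sigma for value in test_x]
```

Reusing the training statistics for the test set keeps information about the test data out of the preprocessing step.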

In [None]:
### Q1A) 

In [None]:
### Q1B) 

In [None]:
### Q1C) 

In [None]:
### Q1D) 

In [None]:
### Q1E) 

## Question 2: Logistic Regression (20pts)
A) Fit an L1-regularized logistic regression model on all the training data, and then get the predicted labels for each item of the test set.

B) Print out the precision, recall, and F1-score of the test set.

C) Print out the model execution time (including both training and testing time) in milliseconds. Please keep two decimal places.
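
For reference, the sketch below shows how precision, recall, and F1-score follow from the counts of true positives, false positives, and false negatives, and how an elapsed time can be reported in milliseconds with two decimal places. The label lists are made up, the helper `precision_recall_f1` is a hypothetical name, and the timed region here is only a stand-in: in the assignment it would wrap model training and prediction.

```python
import time

def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall, and F1 for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

# Hypothetical true and predicted labels (1 = attack, 0 = normal).
y_true = [0, 0, 1, 1, 1, 0, 1, 0]
y_pred = [0, 1, 1, 1, 0, 0, 1, 0]

start = time.perf_counter()  # start of the timed region
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
elapsed_ms = (time.perf_counter() - start) * 1000.0  # seconds to milliseconds

print(f"Precision: {precision:.2f}, Recall: {recall:.2f}, F1: {f1:.2f}")
print(f"Execution time: {elapsed_ms:.2f} ms")
```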

In [None]:
### Q2A) 

In [None]:
### Q2B) 

In [None]:
### Q2C) 

## Question 3: Single Layer Neural Networks (30 pts)
In this task we aim to build models with better performance, using "deep" learning. __You may use PyTorch or Keras libraries for building deep learning models.__ 

A) Implement a single-layer neural network model that is used to classify the CAN intrusion data samples into normal and anomalous classes (0: normal, 1: attack). Use the standardized training set from Q1E) to train the network.

The details of the model are as follows:
* Use a sigmoid as the output-layer activation function to enable non-linearity.
* Use the binary cross-entropy loss as a training criterion.
* Use the stochastic gradient descent (SGD) optimizer with a learning rate of 0.1.
* Run the model for 10 iterations/epochs.

B) Record the loss for each iteration, and make a plot of iterations/epochs vs. loss (binary cross-entropy).

C) Print out the precision, recall, and F1-score of the test set.

D) Print out the model execution time (including both training and testing time) in milliseconds. Please keep two decimal places.
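
To make the training loop concrete, the following is a minimal sketch of a single-layer network (one sigmoid unit) trained with per-sample stochastic gradient descent on the binary cross-entropy loss. It uses a made-up two-feature toy dataset and plain Python rather than Keras or PyTorch, and it visits samples in a fixed order so the result is reproducible; these are assumptions of the sketch, not requirements of the assignment.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def binary_cross_entropy(target, prob, eps=1e-12):
    # Clip the probability to avoid log(0).
    prob = min(max(prob, eps), 1.0 - eps)
    return -(target * math.log(prob) + (1 - target) * math.log(1.0 - prob))

# Hypothetical, already standardized toy data (two features per sample).
X = [(-1.2, -0.8), (-0.9, -1.1), (-1.0, -0.5),
     (1.1, 0.9), (0.8, 1.2), (1.0, 0.6)]
y = [0, 0, 0, 1, 1, 1]

weights = [0.0, 0.0]
bias = 0.0
learning_rate = 0.1
epochs = 10
losses = []  # mean loss recorded once per epoch

for epoch in range(epochs):
    total_loss = 0.0
    for features, target in zip(X, y):
        z = sum(w * x for w, x in zip(weights, features)) + bias
        prob = sigmoid(z)
        total_loss += binary_cross_entropy(target, prob)
        # For sigmoid + binary cross-entropy, dLoss/dz = prob - target.
        error = prob - target
        weights = [w - learning_rate * error * x
                   for w, x in zip(weights, features)]
        bias -= learning_rate * error
    losses.append(total_loss / len(X))

for epoch, loss in enumerate(losses, start=1):
    print(f"Epoch {epoch}: loss = {loss:.4f}")
```

The `losses` list could then be plotted against `range(1, epochs + 1)` with `plt.plot`, adding axis labels and a title.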

In [None]:
### Q3A)

In [None]:
### Q3B)

In [None]:
### Q3C)

In [None]:
### Q3D)

## Question 4: Multi-Layer Perceptron (MLP) (35 pts)

A) Implement a Multi-Layer Perceptron (MLP) model (at least two hidden layers) that is used to classify the CAN intrusion data samples into normal and anomalous classes (0: normal, 1: attack). Use the standardized training set from Q1E) to train the network.

The details of the model are as follows:
* Each hidden layer has 8 neurons/units.
* Use the tanh function as the activation function for hidden layers.
* Use a sigmoid as the output-layer activation function to enable non-linearity.
* Use the stochastic gradient descent (SGD) optimizer with a learning rate of 0.1.
* Run the model for 10 iterations/epochs.

B) Record the loss for each iteration, and make a plot of iterations/epochs vs. loss (binary cross-entropy).

C) Print out the precision, recall, and F1-score of the test set.

D) Print out the model execution time (including both training and testing time) in milliseconds. Please keep two decimal places.

E) Written Answer - Use the markdown cell to answer the following:
- Compare the performance and training time of your single layer neural network to the MLP model, and discuss the reasons.
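
When comparing the two models, it can help to see how the architecture translates into computation. The sketch below runs a forward pass through an untrained MLP with two hidden tanh layers of 8 units and a sigmoid output, and counts its trainable parameters. It assumes 9 input features (CAN ID plus DATA[0~7]), random initial weights, and a made-up input vector; training is omitted.

```python
import math
import random

random.seed(1)  # fixed seed so the random weights are reproducible

n_inputs = 9  # assumption: CAN ID plus eight DATA bytes
layer_sizes = [n_inputs, 8, 8, 1]  # input, two hidden layers, output

def init_layer(n_in, n_out):
    """Return (weights, biases) with one weight row per output unit."""
    weights = [[random.uniform(-0.5, 0.5) for _ in range(n_in)]
               for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases

layers = [init_layer(n_in, n_out)
          for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]

def forward(x):
    """Propagate one sample through the network and return P(attack)."""
    activation = list(x)
    for index, (weights, biases) in enumerate(layers):
        z = [sum(w * a for w, a in zip(row, activation)) + b
             for row, b in zip(weights, biases)]
        if index == len(layers) - 1:
            activation = [1.0 / (1.0 + math.exp(-v)) for v in z]  # sigmoid
        else:
            activation = [math.tanh(v) for v in z]  # hidden layers
    return activation[0]

# Weights plus biases per layer: (9*8 + 8) + (8*8 + 8) + (8*1 + 1).
n_params = sum(len(w) * len(w[0]) + len(b) for w, b in layers)

sample = [0.5, -1.0, 0.3, 0.0, 1.2, -0.7, 0.1, 0.9, -0.4]  # hypothetical
probability = forward(sample)
print(f"Parameters: {n_params}, predicted probability: {probability:.4f}")
```

Under the same 9-feature assumption, a single sigmoid unit has only 9 weights and 1 bias, so the MLP has to update many more parameters per training step.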

In [None]:
### Q4A)

In [None]:
### Q4B)

In [None]:
### Q4C)

In [None]:
### Q4D)

#### Q4E)  
Written answer here


## Question 5: Hyperparameter Optimization (10 Bonus pts)
A) Build a Grid_Search_NN_model that has the same architecture as the MLP model from Question 4. Use grid search to tune two hyperparameters:
* The number of neurons on the hidden layers of your MLP model (find the best number among 8, 16, 32). Each hidden layer should have the same number of neurons/nodes, so only one hyperparameter is needed to tune the number of neurons.
* Learning rate of the SGD optimizer (find the best value among the two numbers 0.01 and 0.1). 

B) Implement grid search to identify optimal hyperparameter values, and print out the best hyperparameter values and the best cross-validation accuracy.

You can use GridSearchCV with 3-fold cross-validation and KerasClassifier on the standardized training set to do this.

C) Build the optimized MLP model on the training set by passing the detected best hyperparameter values to the Grid_Search_NN_model. Print out the precision, recall, and F1-score of the optimized MLP model on the test set.

PS: If this part takes too long to run, you may skip this question.
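
To get a feel for the cost of this search before running it, the sketch below enumerates the hyperparameter combinations and the 3-fold splits in plain Python. The helper `k_fold_indices` is a hypothetical name, and it uses plain contiguous folds as a simplification; `GridSearchCV` uses stratified folds by default for classifiers. No model is trained here.

```python
from itertools import product

# Hyperparameter grid described in part A).
param_grid = {
    "hidden_neurons": [8, 16, 32],
    "learning_rate": [0.01, 0.1],
}

# Every combination of the listed values is one candidate model.
candidates = [dict(zip(param_grid, values))
              for values in product(*param_grid.values())]

def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous (train, validation) folds."""
    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        validation = list(range(start, start + size))
        train = [j for j in range(n_samples)
                 if j < start or j >= start + size]
        folds.append((train, validation))
        start += size
    return folds

n_folds = 3
folds = k_fold_indices(10, n_folds)  # 10 is a hypothetical sample count

# Each candidate is trained once per fold during the search.
total_fits = len(candidates) * n_folds

for candidate in candidates:
    print(candidate)
print(f"{len(candidates)} candidates x {n_folds} folds = {total_fits} fits")
```

With 10 epochs per fit, this amounts to 18 short training runs, plus one final refit on the full training set if the search is configured to refit the best model.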

In [None]:
### Q5A)
def Grid_Search_NN_model(hidden_neurons = 8, learning_rate = 0.1):
    #write function here
    
    return myGSModel

In [None]:
### Q5B)
# Run gridsearch here

In [None]:
### Q5C)

## Make sure to add sufficient comments to your code, and run the entire code before submitting.