# Machine Learning (CS535): Assignment 3
## Neural Networks
#### Name: Muhammad Waleed
#### Roll Number: 22030017

### Instructions


*   The aim of this assignment is to learn machine learning tools - Keras, Sklearn and PyTorch.
*   You must use the Python programming language.
*   You can add as many code/markdown cells as required.
*   ALL cells must be run (and outputs visible) in order to get credit for your work.
*   Please use procedural programming style and comment your code thoroughly.
*   There are three parts of this assignment. The import statements for the required libraries are already given.
*   **Carefully read the submission instructions and plagiarism policy.**
*   Deadline to submit this assignment is 17th November 2022, 11:55pm on LMS.
*   TAs will not be allowed to debug your code or answer how to solve a question.

### Submission Instructions

You should submit your notebook file (.ipynb), Python script (.py) and dataset (.csv) on LMS.
Please name your files Name_RollNo_Assignment3. Zip these files in a folder and name
the folder Name_RollNo_Assignment3. If you don't know how to save .ipynb as .py see
[this](https://i.stack.imgur.com/L1rQH.png). Failing to submit any one of them might result in the reduction of marks.

### Plagiarism Policy

The code $\color{red}{\text{MUST}}$ be done independently. Any plagiarism or cheating of work from others
or the internet will be immediately referred to the DC. If you are confused about what
constitutes plagiarism, it is your responsibility to consult with the instructor or the TA
in a timely manner. **PLEASE DO NOT LOOK AT ANYONE ELSE'S CODE
NOR DISCUSS IT WITH THEM.**


### Introduction
In this assignment, you will be implementing a neural network for the provided dataset using Sklearn, Keras and PyTorch. A description of the problem statement is given at the start of each part.

Have fun!

In [None]:
# Importing the modules
import numpy as np
import pandas as pd

## Part 1: Feature Extraction
You are given the [MNIST audio dataset](https://drive.google.com/file/d/1imgBIVbgtGPV31r64UghSK3161Qfumsm/view?usp=sharing), which contains audio recordings of speakers saying digits (0 to 9) out loud. Use the following line of code to read the audio file:
```python
audio, sr = librosa.load(file_path, sr=16000)
```
You need to extract MFCC features for each audio file; the feature extraction code is given (you can read about MFCC [here](https://link.springer.com/content/pdf/bbm:978-3-319-49220-9/1.pdf)). The length of each feature vector will be 13. You need to save all the feature vectors in a csv file, with the ith column representing the ith feature and each row representing an audio file. Add a 'y' column at the end of the csv file to hold the labels. Your csv file should look like this:

| x1 | x2 | x3 | x4 | x5 | x6 | x7 | x8 | x9 | x10 | x11 | x12 | x13 | y |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| -11.347038 | -8.070062 | -0.915299 | 6.859546 | 8.754656 | -3.440287 | -5.738487 | -21.853178 | -9.859462 | 3.584948 | -2.661195 | 1.023747 | -4.574332 | 2 |

Split the dataset into train and test with 80:20 ratio. Print the train data size and test data size.
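
The csv layout described above can be sketched as follows. This is a minimal illustration rather than the assignment's solution: `features_to_frame` is a hypothetical helper, and the feature values are random stand-ins for real MFCC vectors.

```python
import numpy as np
import pandas as pd


def features_to_frame(feature_rows, labels):
    """Build a table with columns x1..xN followed by a final 'y' label column."""
    features = np.asarray(feature_rows, dtype=float)
    n_features = features.shape[1]
    columns = [f"x{i}" for i in range(1, n_features + 1)]
    frame = pd.DataFrame(features, columns=columns)
    frame["y"] = list(labels)
    return frame


# Assumption: 4 audio files with 13 features each, filled with random numbers
rng = np.random.default_rng(0)
demo_features = rng.normal(size=(4, 13))
demo_labels = [0, 1, 2, 3]

demo_frame = features_to_frame(demo_features, demo_labels)
# index=False keeps the row index out of the file, so no extra column appears on reload
csv_text = demo_frame.to_csv(index=False)
print(csv_text.splitlines()[0])
```

Writing with `index=False` is a design choice: it avoids the unnamed index column that otherwise has to be dropped after `pd.read_csv`.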

In [None]:
# Installing the python_speech_features
!pip install python_speech_features

In [None]:
# Importing the modules
from glob import glob
import python_speech_features as mfcc
import librosa
from sklearn.model_selection import train_test_split
import os

In [None]:
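def get_MFCC(audio, sr):
    """Return one 13-value MFCC vector for a recording (mean over all frames)."""
    # 25 ms windows with a 10 ms step, 13 cepstral coefficients per frame
    features = mfcc.mfcc(audio, sr, winlen=0.025, winstep=0.01, numcep=13, appendEnergy=True)
    # Average over the frame axis so every file yields a vector of the same length
    return np.mean(features, axis=0)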
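
The averaging step in `get_MFCC` is what turns recordings of different durations into vectors of equal length. The sketch below shows only that pooling step on synthetic arrays; `pool_frames` is a hypothetical name, and the frame counts are made up.

```python
import numpy as np


def pool_frames(frame_features):
    """Collapse a (frames, coefficients) matrix into one vector by averaging over frames."""
    frame_features = np.asarray(frame_features, dtype=float)
    return frame_features.mean(axis=0)


# Assumption: two clips of different length, each with 13 coefficients per frame
short_clip = np.ones((50, 13))
long_clip = np.ones((120, 13)) * 2.0

short_vector = pool_frames(short_clip)
long_vector = pool_frames(long_clip)
print(short_vector.shape, long_vector.shape)
```

Mean pooling discards the order of frames, which is a simplification; it is used here because the classifiers in Part 2 expect a fixed-size input.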

In [None]:
# Mount your Google Drive
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [None]:
# Iterating over the dataset and collecting the features for the csv file
file_path = "/content/drive/MyDrive/ML/output/data"
file_name, y = [], []
x1, x2, x3, x4, x5, x6, x7, x8, x9, x10, x11, x12, x13 = ([] for _ in range(13))
columns = [x1, x2, x3, x4, x5, x6, x7, x8, x9, x10, x11, x12, x13]

for each_folder in os.listdir(file_path):
  # Skip text files that sit next to the audio folders
  if not each_folder.endswith(".txt"):
    sub_path = os.path.join(file_path, each_folder)
    for each_voice in os.listdir(sub_path):
      audio, sr = librosa.load(os.path.join(sub_path, each_voice), sr=16000)
      m = get_MFCC(audio, sr)
      file_name.append(each_voice)
      # The digit label is the prefix of the file name, e.g. "0_01_0.wav" -> 0
      y.append(int(each_voice.split("_")[0]))
      # Append the ith coefficient to the ith feature column
      for column, value in zip(columns, m):
        column.append(float(value))

In [None]:
# Saving the dataframe to csv (the row index is written as the first, unnamed column)
data = {'File': file_name, 'x1': x1, 'x2': x2, 'x3': x3, 'x4': x4, 'x5': x5, 'x6': x6, 'x7': x7, 'x8': x8, 'x9': x9, 'x10': x10, 'x11': x11, 'x12': x12, 'x13': x13, "y": y}
df = pd.DataFrame(data)
df.to_csv('out.csv')

In [None]:
# Reading from the csv file
df = pd.read_csv("/content/drive/MyDrive/ML/out.csv")
# Drop the first column, which holds the row index saved by to_csv
df = df.drop(df.columns[[0]], axis=1)
df.head()

Unnamed: 0,File,x1,x2,x3,x4,x5,x6,x7,x8,x9,x10,x11,x12,x13,y
0,0_01_0.wav,-10.679614,-1.126155,-2.9215,7.777644,-4.234647,-1.907632,-15.338491,-3.18081,2.313479,-10.125013,-3.45576,4.89624,-12.021362,0
1,0_01_1.wav,-11.423947,-1.890179,-1.148646,10.610124,1.231129,-4.728646,-13.27278,1.20289,0.686237,-5.113064,2.27228,6.805703,-8.340184,0
2,0_01_10.wav,-10.253554,-2.093445,-5.647451,6.346362,-1.961829,-1.76137,-14.366297,-6.844078,-0.555402,-5.644106,-2.729775,4.582464,-7.00511,0
3,0_01_11.wav,-10.208797,-3.407698,-8.520919,3.171731,-1.332095,-7.890483,-19.013466,-1.75054,8.268122,-3.398884,-2.734833,8.183276,-12.306567,0
4,0_01_12.wav,-10.512413,-4.236322,-0.662954,3.462385,-4.762044,-3.514585,-16.976979,-0.667198,1.33608,-7.423439,-2.892023,5.20013,-8.368142,0


In [None]:
# Splitting the dataset into training and testing (80:20)
x = df.iloc[:, 1:14]  # columns x1..x13 (column 0 is the file name)
y = df.iloc[:, -1]    # last column holds the digit label
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2)

print("x_train:", x_train.shape)
print("x_test:", x_test.shape)
print("y_train:", y_train.shape)
print("y_test", y_test.shape)

x_train: (24000, 13)
x_test: (6000, 13)
y_train: (24000,)
y_test (6000,)
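
The split above is random on every run. A minimal sketch of a reproducible, class-balanced variant is shown here on synthetic data; the seed value and the `demo_*` names are assumptions for illustration only.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Assumption: 1000 synthetic samples, 13 features, 100 samples per digit
rng = np.random.default_rng(0)
demo_x = rng.normal(size=(1000, 13))
demo_y = np.repeat(np.arange(10), 100)

# random_state fixes the shuffle; stratify keeps the class proportions in both parts
demo_x_train, demo_x_test, demo_y_train, demo_y_test = train_test_split(
    demo_x, demo_y, test_size=0.2, random_state=42, stratify=demo_y
)

print(demo_x_train.shape, demo_x_test.shape)
print(np.bincount(demo_y_test))
```

Fixing `random_state` makes reported metrics repeatable across runs, and `stratify` avoids a test set that under-represents a digit by chance.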


## Part 2: Neural Network Implementation

### Task 2.1:  Scikit-learn

In this part you will use [Scikit-learn](https://scikit-learn.org/stable/index.html) to implement a [neural network](https://scikit-learn.org/stable/modules/generated/sklearn.neural_network.MLPClassifier.html) and apply it to the MNIST audio dataset (provided in Part 1). Split the training dataset into train and evaluation data with a 90:10 ratio. Run evaluation on X_eval while training on X_train. Tune the hyperparameters to get the best possible classification accuracy. You need to report accuracy, recall, precision and F1 score on the test dataset and print the confusion matrix.
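
One way to read the workflow described above is sketched below: hold out a test set, carve an evaluation set out of the training data, pick hyperparameters on the evaluation set, and report metrics on the test set only once. The data comes from `make_classification`, and the two candidate layer sizes are arbitrary assumptions, not tuned values.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score, precision_recall_fscore_support
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Assumption: a small synthetic 3-class problem with 13 features
demo_x, demo_y = make_classification(
    n_samples=600, n_features=13, n_informative=8, n_classes=3, random_state=0
)

# 80:20 train/test, then 90:10 train/eval inside the training part
x_tr_full, x_te, y_tr_full, y_te = train_test_split(demo_x, demo_y, test_size=0.2, random_state=0)
x_tr, x_ev, y_tr, y_ev = train_test_split(x_tr_full, y_tr_full, test_size=0.1, random_state=0)

# Choose the hidden-layer layout that scores best on the evaluation split
best_score, best_model = -1.0, None
for hidden in [(16,), (32, 16)]:
    candidate = MLPClassifier(hidden_layer_sizes=hidden, max_iter=300, random_state=0)
    candidate.fit(x_tr, y_tr)
    score = candidate.score(x_ev, y_ev)
    if score > best_score:
        best_score, best_model = score, candidate

# The test split is touched only after the model has been chosen
test_pred = best_model.predict(x_te)
accuracy = accuracy_score(y_te, test_pred)
precision, recall, f1, _ = precision_recall_fscore_support(
    y_te, test_pred, average="macro", zero_division=0
)
print(f"accuracy={accuracy:.3f} precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Keeping the test split out of the tuning loop means the final numbers are not biased by the hyperparameter search.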

In [None]:
# Importing the necessary modules
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import classification_report
from sklearn.metrics import confusion_matrix

In [None]:
# Splitting the dataset into train and test (90:10)
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.1)

print("x_train:", x_train.shape)
print("x_test:", x_test.shape)
print("y_train:", y_train.shape)
print("y_test", y_test.shape)

x_train: (27000, 13)
x_test: (3000, 13)
y_train: (27000,)
y_test (3000,)


In [None]:
# Fitting the dataset
clf = MLPClassifier(random_state=1, max_iter=300).fit(x_train, y_train)

In [None]:
# Predicting on test dataset
pred = clf.predict(x_test)
print(classification_report(y_test, pred))
print(confusion_matrix(y_test, pred))

              precision    recall  f1-score   support

           0       0.89      0.89      0.89       295
           1       0.95      0.87      0.91       311
           2       0.86      0.94      0.90       326
           3       0.94      0.89      0.92       287
           4       0.94      0.98      0.96       272
           5       0.98      0.98      0.98       303
           6       1.00      0.99      0.99       336
           7       0.96      0.96      0.96       314
           8       0.95      0.95      0.95       275
           9       0.93      0.93      0.93       281

    accuracy                           0.94      3000
   macro avg       0.94      0.94      0.94      3000
weighted avg       0.94      0.94      0.94      3000

[[262   2  22   2   2   0   0   2   0   3]
 [  6 272   5   0  12   3   0   1   0  12]
 [ 11   2 307   5   0   0   0   0   1   0]
 [  2   0  18 256   0   0   0   0  10   1]
 [  1   3   0   0 267   1   0   0   0   0]
 [  2   0   0   0   3 296 
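
The confusion matrix and the classification report carry the same information. In scikit-learn's convention rows are true classes and columns are predicted classes, so class 0 above has 262 correct predictions out of the 295 samples in its row, which gives the reported recall of 0.89. The sketch below derives per-class precision and recall from a small made-up matrix; `per_class_scores` is a hypothetical helper.

```python
import numpy as np


def per_class_scores(cm):
    """Return (precision, recall) arrays from a confusion matrix with true classes as rows."""
    cm = np.asarray(cm, dtype=float)
    true_pos = np.diag(cm)
    precision = true_pos / cm.sum(axis=0)  # column sums: how often each class was predicted
    recall = true_pos / cm.sum(axis=1)     # row sums: how many samples each class really has
    return precision, recall


# Assumption: an illustrative 3-class matrix, not taken from the results above
demo_cm = [[8, 2, 0],
           [1, 9, 0],
           [1, 0, 9]]

demo_precision, demo_recall = per_class_scores(demo_cm)
print(np.round(demo_precision, 2), np.round(demo_recall, 2))
```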

### Task 2.2: Tensorflow Keras

In this part you will use [Keras](https://www.tensorflow.org/api_docs/python/tf/keras/Sequential) to implement a [neural network](https://machinelearningmastery.com/tutorial-first-neural-network-python-keras/) and apply it to the MNIST audio dataset (provided in Part 1). Split the training dataset into train and evaluation data with a 90:10 ratio. Run evaluation on X_eval while training on X_train. Tune the hyperparameters to get the best possible classification accuracy. You need to report accuracy, recall, precision and F1 score on the test dataset and print the confusion matrix.

In [None]:
# Importing the modules
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras import optimizers
from tensorflow import keras

In [None]:
# Splitting the dataset to train and test

x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.1)

print("x_train:", x_train.shape)
print("x_test:", x_test.shape)
print("y_train:", y_train.shape)
print("y_test", y_test.shape)

x_train: (27000, 13)
x_test: (3000, 13)
y_train: (27000,)
y_test (3000,)


In [None]:
# Converting output to categorical 
y_train = keras.utils.to_categorical(y_train, 10)
y_test = keras.utils.to_categorical(y_test, 10)
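
For intuition, the one-hot encoding produced by the cell above can be reproduced in plain NumPy. This is a sketch of the idea only, not Keras's implementation; `one_hot` is a hypothetical helper.

```python
import numpy as np


def one_hot(labels, num_classes):
    """Map integer class labels to rows with a single 1 in the label's column."""
    labels = np.asarray(labels, dtype=int)
    return np.eye(num_classes, dtype=np.float32)[labels]


demo_labels = np.array([2, 0, 9])
encoded = one_hot(demo_labels, 10)
# argmax along each row recovers the original integer labels
decoded = np.argmax(encoded, axis=1)
print(encoded.shape, decoded)
```

The `argmax` step is the same operation used later to turn predicted scores and one-hot test labels back into digits for the classification report.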

In [None]:
LEARNING_RATE = 0.001
BATCH_SIZE = 100
EPOCHS = 60

# Stacking the layers for model: 13 inputs -> 60 -> 40 -> 15 -> 10 outputs
model = Sequential()
model.add(Dense(60, input_shape=(13,), activation='relu'))
# Later layers infer their input size from the previous layer
model.add(Dense(40, activation='relu'))
model.add(Dense(15, activation='relu'))
# One output unit per digit; softmax is the more common pairing with categorical_crossentropy
model.add(Dense(10, activation='sigmoid'))
model.compile(loss='categorical_crossentropy', optimizer=keras.optimizers.Adam(learning_rate=LEARNING_RATE), metrics=['accuracy'])
model.fit(x_train, y_train, epochs=EPOCHS, batch_size=BATCH_SIZE)


Epoch 1/60
...
Epoch 60/60


<keras.callbacks.History at 0x7f175e728750>
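
The final `Dense` layer above uses a sigmoid activation together with `categorical_crossentropy`. A NumPy sketch of how sigmoid and softmax outputs differ is given below. The loss function is a textbook formulation written for illustration and is not claimed to match Keras's internal computation; all names and numbers are assumptions.

```python
import numpy as np


def softmax(logits):
    """Turn each row of scores into probabilities that sum to 1."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def sigmoid(logits):
    """Squash each score independently into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-logits))


def categorical_crossentropy(probs, one_hot_targets, eps=1e-7):
    """Mean negative log-probability assigned to the true class."""
    probs = np.clip(probs, eps, 1.0)
    return float(-(one_hot_targets * np.log(probs)).sum(axis=1).mean())


demo_logits = np.array([[2.0, 1.0, 0.1]])
demo_target = np.array([[1.0, 0.0, 0.0]])

softmax_probs = softmax(demo_logits)
sigmoid_scores = sigmoid(demo_logits)
print(softmax_probs.sum(), sigmoid_scores.sum())
print(categorical_crossentropy(softmax_probs, demo_target))
```

Both activations rank the classes identically here, so `argmax` predictions agree; the difference is that only the softmax row forms a probability distribution over the ten digits.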

In [None]:
# Predicting and measuring
pred = model.predict(x_test)
# Convert predicted scores and one-hot labels back to class indices
pred = np.argmax(pred, axis=1)
y_test = np.argmax(y_test, axis=1)
print(classification_report(y_test, pred))
print(confusion_matrix(y_test, pred))

              precision    recall  f1-score   support

           0       0.84      0.90      0.87       298
           1       0.90      0.95      0.92       303
           2       0.83      0.89      0.86       304
           3       0.85      0.83      0.84       281
           4       0.97      0.95      0.96       305
           5       0.96      0.96      0.96       286
           6       0.99      0.98      0.99       317
           7       0.97      0.94      0.96       311
           8       0.93      0.86      0.89       292
           9       0.92      0.91      0.92       303

    accuracy                           0.92      3000
   macro avg       0.92      0.92      0.92      3000
weighted avg       0.92      0.92      0.92      3000

[[267   2  18   2   2   4   0   2   0   1]
 [  4 287   0   0   3   0   0   0   0   9]
 [ 29   2 270   0   1   0   0   0   0   2]
 [  5   0  28 233   0   1   0   1  12   1]
 [  5   7   0   0 289   3   0   0   0   1]
 [  1   2   0   0   2 274 

### Task 2.3: PyTorch

In this part you will use [PyTorch](https://pytorch.org/docs/stable/nn.html) to implement a [neural network](https://medium.com/analytics-vidhya/a-simple-neural-network-classifier-using-pytorch-from-scratch-7ebb477422d2) and apply it to the MNIST audio dataset (provided in Part 1). Split the training dataset into train and evaluation data with a 90:10 ratio. Run evaluation on X_eval while training on X_train. You need to use DataLoader to generate batches of data. Tune the hyperparameters to get the best possible classification accuracy. You need to report training loss, training accuracy, validation loss and validation accuracy after each epoch in the following format:
```
Epoch 1/2
loss: 78.67749792151153 - accuracy: 0.6759259259259259 - val_loss: 6.320814955048263 - val_accuracy: 0.7356481481481482
Epoch 2/2
loss: 48.70551285566762 - accuracy: 0.7901234567901234 - val_loss: 6.073690168559551 - val_accuracy: 0.7791666666666667
```
You need to report accuracy, recall, precision and F1 score on the test dataset and print the confusion matrix.
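
The per-epoch line above needs one loss and one accuracy value per split, aggregated over all batches. The sketch below shows one possible aggregation in plain Python: a sample-weighted mean of the batch losses and a fractional accuracy. Whether the expected format uses a mean or a summed loss is not stated, so the weighting is an assumption, as are the helper names and the numbers.

```python
def summarize_epoch(batch_stats):
    """Aggregate (mean_batch_loss, n_correct, n_samples) tuples into (loss, accuracy)."""
    total_loss = 0.0
    total_correct = 0
    total_samples = 0
    for mean_loss, n_correct, n_samples in batch_stats:
        # Weight each batch's mean loss by its size so a short last batch counts less
        total_loss += mean_loss * n_samples
        total_correct += n_correct
        total_samples += n_samples
    return total_loss / total_samples, total_correct / total_samples


def format_epoch(epoch, epochs, train_stats, val_stats):
    """Render the two-line report for one epoch."""
    loss, acc = summarize_epoch(train_stats)
    val_loss, val_acc = summarize_epoch(val_stats)
    return (f"Epoch {epoch}/{epochs}\n"
            f"loss: {loss} - accuracy: {acc} - val_loss: {val_loss} - val_accuracy: {val_acc}")


# Assumption: made-up batch statistics, two training batches and one validation batch
demo_train = [(0.5, 80, 100), (0.25, 45, 50)]
demo_val = [(0.5, 40, 50)]

demo_loss, demo_acc = summarize_epoch(demo_train)
demo_line = format_epoch(1, 2, demo_train, demo_val)
print(demo_line)
```

In a training loop the tuples would be collected per batch, for instance from `loss.item()`, the count of correct predictions, and `labels.size(0)`.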

In [None]:
#importing the modules
import torch
import torch.nn as nn
from torch.utils.data import Dataset, DataLoader
from sklearn.metrics import accuracy_score

In [None]:
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.1)

x_train = x_train.to_numpy()
x_test = x_test.to_numpy()
y_train = y_train.to_numpy()
y_test = y_test.to_numpy()

print("x_train:", x_train.shape)
print("x_test:", x_test.shape)
print("y_train:", y_train.shape)
print("y_test", y_test.shape)

x_train: (27000, 13)
x_test: (3000, 13)
y_train: (27000,)
y_test (3000,)


In [None]:
class Data(Dataset):
    def __init__(self, X_train, y_train):
      # Features as float32 tensors, labels as int64 class indices (required by nn.CrossEntropyLoss)
      self.X = torch.from_numpy(X_train.astype(np.float32))
      self.y = torch.from_numpy(y_train).type(torch.LongTensor)
      self.len = self.X.shape[0]

    def __getitem__(self, index):
      # Return one (features, label) pair
      return self.X[index], self.y[index]

    def __len__(self):
      # Number of samples in the dataset
      return self.len

In [None]:
class NeuralNetwork(nn.Module):
    def __init__(self):
        super(NeuralNetwork, self).__init__()
        # Fully connected layers: 13 -> 300 -> 200 -> 50 -> 10
        self.linear1 = nn.Linear(13, 300)
        self.linear2 = nn.Linear(300, 200)
        self.linear3 = nn.Linear(200, 50)
        self.linear6 = nn.Linear(50, 10)

    def forward(self, x):
        # Only linear1 and linear2 are applied here, so the output has 200 values per sample;
        # linear3 and linear6 are declared but unused
        x = torch.sigmoid(self.linear1(x))
        x = self.linear2(x)
        return x
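
Because `forward` applies only the first two layers, the network emits 200 scores per sample rather than 10. The loss still runs because labels 0 to 9 are valid indices into 200 outputs, but 190 of the outputs never correspond to a digit. The shape bookkeeping can be checked with a NumPy sketch that uses random matrices in place of the real `nn.Linear` weights; activations are omitted because they do not change shapes, and `output_width` is a hypothetical helper.

```python
import numpy as np

# (in_features, out_features) of the four layers declared in __init__
layer_shapes = [(13, 300), (300, 200), (200, 50), (50, 10)]


def output_width(n_layers_used, batch_size=4):
    """Return the number of outputs per sample after chaining the first n layers."""
    rng = np.random.default_rng(0)
    activations = rng.normal(size=(batch_size, layer_shapes[0][0]))
    for in_dim, out_dim in layer_shapes[:n_layers_used]:
        weights = rng.normal(size=(in_dim, out_dim))
        activations = activations @ weights
    return activations.shape[1]


width_as_written = output_width(2)   # the two layers that forward() applies
width_all_layers = output_width(4)   # all four declared layers chained
print(width_as_written, width_all_layers)
```

Chaining all four layers would give one output per digit, which is the usual layout for a 10-class classifier.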

In [None]:
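def get_loss(testdata, testloader, model):
  """Return (accuracy in whole percent, scaled loss) for the batches in testloader."""
  correct, total = 0, 0
  running_loss = 0.0
  # no need to calculate gradients during inference
  with torch.no_grad():
    for inputs, labels in testloader:
      # calculate output by running through the network
      outputs = model(inputs)
      # criterion is the global loss function defined before training
      loss = criterion(outputs, labels)
      # the predicted class is the index of the largest output
      _, predicted = torch.max(outputs, 1)
      # update results
      total += labels.size(0)
      correct += (predicted == labels).sum().item()
      running_loss += loss.item()

  # integer percentage; plain correct // total would floor every accuracy below 100% to 0
  # the summed batch loss is scaled by a fixed constant (2000) and rounded to 3 decimals
  return 100 * correct // total, round(running_loss / 2000, 3)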


In [None]:
# Set the parameters accordingly
LEARNING_RATE = 0.1
BATCH_SIZE = 100
EPOCHS = 50

In [None]:
# Initialize the model
model = NeuralNetwork()

# Wrap the train and test splits in datasets and batch them with DataLoader
traindata = Data(x_train, y_train)
trainloader = DataLoader(traindata, batch_size=BATCH_SIZE, shuffle=True, num_workers=2)

testdata = Data(x_test, y_test)
testloader = DataLoader(testdata, batch_size=BATCH_SIZE, shuffle=True, num_workers=2)

# Set the loss function and optimizer (defined once, using LEARNING_RATE)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=LEARNING_RATE)

In [None]:
epochs = EPOCHS
for epoch in range(epochs):
  running_loss = 0.0
  for i, data in enumerate(trainloader, 0):
    inputs, labels = data
    # reset the gradients so those from the previous batch do not accumulate
    optimizer.zero_grad()
    # forward propagation
    outputs = model(inputs)
    loss = criterion(outputs, labels)
    # backward propagation
    loss.backward()
    # optimize
    optimizer.step()
    running_loss += loss.item()
  train_accuracy, train_loss = get_loss(traindata, trainloader, model)
  test_accuracy, test_loss = get_loss(testdata, testloader, model)
  print(f"Epoch {epoch+1}/{epochs} \nloss: {train_loss} - accuracy: {train_accuracy} - val_loss: {test_loss} - val_accuracy: {test_accuracy}\n")

Epoch 1/50 
loss: 0.063 - accuracy: 84 - val_loss: 0.007 - val_accuracy: 83

Epoch 2/50 
loss: 0.053 - accuracy: 86 - val_loss: 0.006 - val_accuracy: 85

Epoch 3/50 
loss: 0.047 - accuracy: 87 - val_loss: 0.005 - val_accuracy: 87

Epoch 4/50 
loss: 0.044 - accuracy: 88 - val_loss: 0.005 - val_accuracy: 88

Epoch 5/50 
loss: 0.04 - accuracy: 89 - val_loss: 0.005 - val_accuracy: 88

Epoch 6/50 
loss: 0.038 - accuracy: 90 - val_loss: 0.005 - val_accuracy: 88

Epoch 7/50 
loss: 0.036 - accuracy: 91 - val_loss: 0.004 - val_accuracy: 89

Epoch 8/50 
loss: 0.034 - accuracy: 91 - val_loss: 0.004 - val_accuracy: 90

Epoch 9/50 
loss: 0.032 - accuracy: 92 - val_loss: 0.004 - val_accuracy: 90

Epoch 10/50 
loss: 0.03 - accuracy: 92 - val_loss: 0.004 - val_accuracy: 91

Epoch 11/50 
loss: 0.03 - accuracy: 92 - val_loss: 0.004 - val_accuracy: 91

Epoch 12/50 
loss: 0.028 - accuracy: 93 - val_loss: 0.004 - val_accuracy: 91

Epoch 13/50 
loss: 0.027 - accuracy: 93 - val_loss: 0.003 - val_accuracy: 91

## Part 3: Theoretical

**Q1**: Compare the tools you used above. State the advantages and disadvantages.

**Ans**: In terms of ease of implementation, Scikit-learn's MLPClassifier is the easiest, followed by Keras, with PyTorch last. PyTorch requires a relatively more complex implementation than the other two, since the dataset class, the model and the training loop all have to be written by hand.

In terms of processing speed, PyTorch is generally considered faster than Keras.

Debugging is much easier in PyTorch than in Keras.

Keras is better suited to smaller datasets when a quick implementation is needed.

**Q2**: What is the purpose of the data loader in PyTorch?

**Ans**: The DataLoader provides an iterable over the given dataset. On each iteration it yields two tensors: the first holds the features and the second holds the output labels. The number of samples in each iteration depends on the batch size.
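
The batching behaviour described in this answer can be illustrated without PyTorch. The generator below is a simplified stand-in written for this explanation, not the library's implementation; the real `DataLoader` also handles worker processes and collation, among other things. All names are hypothetical.

```python
import numpy as np


def iterate_batches(features, labels, batch_size, shuffle=False, seed=0):
    """Yield (features, labels) pairs of at most batch_size samples each."""
    n = len(features)
    order = np.arange(n)
    if shuffle:
        # Visit the samples in a random order, as shuffle=True does during training
        np.random.default_rng(seed).shuffle(order)
    for start in range(0, n, batch_size):
        idx = order[start:start + batch_size]
        yield features[idx], labels[idx]


# Assumption: 10 samples with 5 features each
demo_x = np.arange(50, dtype=np.float32).reshape(10, 5)
demo_y = np.arange(10)

batches = list(iterate_batches(demo_x, demo_y, batch_size=4))
batch_sizes = [len(batch_y) for _, batch_y in batches]
print(batch_sizes)
```

With 10 samples and a batch size of 4, the last batch is smaller than the others, which is why per-epoch metrics are usually weighted by batch size.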