# CS7643 Final Project

This notebook walks through the models we experimented with on our dataset, which contains 21,782 training samples of different sounds.

## Group Members
Zach Halaby
Michael Marzec
Shayan Mukhtar

## Dataset

The dataset is Google's AudioSet, released under a Creative Commons Attribution 4.0 International (CC BY 4.0) license, and was downloaded from: https://research.google.com/audioset/download.html

## Colab Prep

GitHub <--> Colab instructions: https://medium.com/analytics-vidhya/how-to-use-google-colab-with-github-via-google-drive-68efb23a42d

In [1]:
### If using Google Colab
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


### Initial Cloning

In [10]:
### cd to github drive (create the folder, if it doesn't already exist)
%cd /content/drive/MyDrive/Github

# git clone -b branch https://{git_token}@github.com/{username}/{repository}
# See instructions file if you're unfamiliar with generating git_tokens

username = 'shayanmukhtar'
repository = 'cs7643_final'
# git_token = 

# !git clone -b UPDATE_BRANCH https://{git_token}@github.com/{username}/{repository}



/content/drive/MyDrive/Github
Cloning into 'cs7643_final'...
remote: Enumerating objects: 66, done.
remote: Counting objects: 100% (66/66), done.
remote: Compressing objects: 100% (43/43), done.
remote: Total 66 (delta 29), reused 57 (delta 20), pack-reused 0
Unpacking objects: 100% (66/66), done.
Checking out files: 100% (17/17), done.


### Git Commands

#### access git (via Google Drive)

In [11]:
repository = 'cs7643_final'
%cd {repository}

%ls -a

/content/drive/My Drive/Github/cs7643_final
cs7643_final_project.ipynb  .ipynb_checkpoints/  models/    utils/
.git/                       main_scratch.py      README.md


#### Commit / Status / Push

In [13]:
!git status

On branch marzec
Your branch is up to date with 'origin/marzec'.

Untracked files:
  (use "git add <file>..." to include in what will be committed)

	cs7643_final_project (1).ipynb

nothing added to commit but untracked files present (use "git add" to track)


In [None]:
!git add .

In [None]:
!git commit -m "[message]"

In [None]:
!git push

## Instructions
Cells whose title contains "Mandatory" must be run. Cells whose title contains "Optional" can be skipped.

## Mandatory - Imports

Let's start by importing the necessary packages:

In [18]:
!pip install torchmetrics
!pip install tfrecord

import os.path

import torch
import torchmetrics
import tfrecord
import numpy as np
from os import walk

from torch.utils.data import DataLoader

from models import LinearModel
from models import SimpleConvolutionModel
from utils import utils
from utils import dataloader
from torch import nn
from torch import optim

# Tqdm progress bar
from tqdm import tqdm_notebook

Collecting tfrecord
  Downloading tfrecord-1.14.1.tar.gz (15 kB)
Collecting crc32c
  Downloading crc32c-2.2.post0-cp37-cp37m-manylinux2010_x86_64.whl (49 kB)
Building wheels for collected packages: tfrecord
  Building wheel for tfrecord (setup.py) ... done
  Created wheel for tfrecord: filename=tfrecord-1.14.1-py3-none-any.whl size=15654 sha256=06658a35b35544e2be637fda539e8f567affd3fd5147433a23ab802f957c2293
  Stored in directory: /root/.cache/pip/wheels/07/63/59/2a382bd2e3051f622bf8742e79f2641d78b29523680f57bf74
Successfully built tfrecord
Installing collected packages: crc32c, tfrecord
Successfully installed crc32c-2.2.post0 tfrecord-1.14.1


## Mandatory - Load Training Data

Load the training data into memory

In [19]:
# Figure out which device this notebook is being run from
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print("You are using device: %s" % device)

# Load the training data from disk. Note this training data was created
# by converting the tfrecord files from the original dataset
training_data = utils.load_pytorch_tensor('./utils/balanced_train_data.pt')
training_label = utils.load_pytorch_tensor('./utils/balanced_train_label.pt')

# Turn this multi-class problem into a binary classification problem:
# samples whose labels contain the chosen class are labeled True and all
# others False. Class index 0 selects the speech class
training_label, count = utils.convert_multiclass_to_binary(0, training_label)
print("Total of " + str(count) + " positive examples out of " + str(training_label.shape[0]) + " samples")

# convert the training data to floating point
training_data = np.float32(training_data)

# split the training data into two parts, one for training and the other for validation
data_train, label_train, data_val, label_val = utils.split_data_train_val(training_data, training_label)

# Load the dataset into an iterable object from which batches can be made
train_dataset = dataloader.MusicDataset(data_train, label_train)
val_dataset = dataloader.MusicDataset(data_val, label_val)

You are using device: cuda
Total of 5668 positive examples out of 21782 samples
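
The `utils` helpers used in the cell above are not reproduced in this notebook. As a rough illustration only, the label conversion and the train/validation split might look like the sketch below. It assumes the labels are stored as a multi-hot matrix of shape `(num_samples, num_classes)` and that the split is a shuffled 80/20 partition; neither detail is confirmed here, and the `_sketch` function names are hypothetical stand-ins, not the project's actual code.

```python
import numpy as np

def convert_multiclass_to_binary_sketch(class_idx, labels):
    """Return (binary_labels, positive_count) for one target class.

    Assumption: `labels` is a multi-hot array of shape (num_samples, num_classes).
    """
    binary = (labels[:, class_idx] > 0).astype(np.int64)
    return binary, int(binary.sum())

def split_data_train_val_sketch(data, labels, val_fraction=0.2, seed=0):
    """Shuffle once, then hold out `val_fraction` of the samples for validation.

    Assumption: the 80/20 ratio and the fixed seed are illustrative choices.
    """
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(data))
    n_val = int(len(data) * val_fraction)
    val_idx, train_idx = order[:n_val], order[n_val:]
    return data[train_idx], labels[train_idx], data[val_idx], labels[val_idx]

# Toy example: 5 samples, 3 classes, class 0 plays the role of "speech"
toy_labels = np.array([[1, 0, 0], [0, 1, 0], [1, 1, 0], [0, 0, 1], [0, 0, 0]])
toy_data = np.arange(5 * 4, dtype=np.float32).reshape(5, 4)

binary, count = convert_multiclass_to_binary_sketch(0, toy_labels)
x_tr, y_tr, x_va, y_va = split_data_train_val_sketch(toy_data, binary)
print(binary.tolist(), count)
print(x_tr.shape, x_va.shape)
```

With the real data, only about a quarter of the samples are positive (5668 of 21782), so a stratified split may be worth considering to keep that ratio the same in both partitions.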


## Optional - Simple Linear Model

Let's get a training loop running with this simple linear model, which consists of an input layer, a ReLU activation, and an output layer.
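
The `LinearModel` class itself lives in the repository's `models` package and is not shown here. A minimal sketch of a model matching that description could look as follows. It assumes each sample is a `10 x 128` array (inferred from the `10*128` input size used in this notebook) that gets flattened before the first layer; the class name `LinearModelSketch` is hypothetical.

```python
import torch
from torch import nn

class LinearModelSketch(nn.Module):
    """Flatten -> Linear -> ReLU -> Linear, mirroring the description above."""

    def __init__(self, input_size, hidden_size, num_classes):
        super().__init__()
        self.net = nn.Sequential(
            nn.Flatten(),                        # (batch, 10, 128) -> (batch, 1280)
            nn.Linear(input_size, hidden_size),  # input layer
            nn.ReLU(),
            nn.Linear(hidden_size, num_classes), # output layer
        )

    def forward(self, x):
        # Returns raw logits; nn.CrossEntropyLoss applies log-softmax internally
        return self.net(x)

sketch_model = LinearModelSketch(10 * 128, 64, 2)
dummy_batch = torch.randn(32, 10, 128)  # assumed input shape
logits = sketch_model(dummy_batch)
print(logits.shape)
```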

In [None]:
# Linear Model Hyperparameters

BATCH_SIZE = 32
LEARNING_RATE = 1e-3
HIDDEN_LAYER_SIZE = 64
NUM_EPOCHS = 10

In [None]:
# Linear Model boilerplate code

train_loader = DataLoader(train_dataset, batch_size=BATCH_SIZE, shuffle=True)
val_loader = DataLoader(val_dataset, batch_size=BATCH_SIZE, shuffle=False)  # no need to shuffle for validation

linear_model = LinearModel.LinearModel(10*128, HIDDEN_LAYER_SIZE, 2)
optimizer = optim.Adam(linear_model.parameters(), lr=LEARNING_RATE)
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer)
criterion = nn.CrossEntropyLoss()

In [None]:
for epoch_idx in range(NUM_EPOCHS):
    print("-----------------------------------")
    print("Epoch %d" % (epoch_idx+1))
    print("-----------------------------------")
    
    train_loss, avg_train_loss = utils.train(linear_model, train_loader, optimizer, criterion)
    scheduler.step(train_loss)

    val_loss, avg_val_loss = utils.evaluate(linear_model, val_loader, criterion)

    avg_train_loss = avg_train_loss.item()
    avg_val_loss = avg_val_loss.item()
    print("Training Loss: %.4f. Validation Loss: %.4f. " % (avg_train_loss, avg_val_loss))

-----------------------------------
Epoch 1
-----------------------------------
Training Loss: 1.7526. Validation Loss: 0.5098.
-----------------------------------
Epoch 2
-----------------------------------
Training Loss: 0.4300. Validation Loss: 0.4178.
-----------------------------------
Epoch 3
-----------------------------------
Training Loss: 0.4108. Validation Loss: 0.4182.
-----------------------------------
Epoch 4
-----------------------------------
Training Loss: 0.4079. Validation Loss: 0.4267.
-----------------------------------
Epoch 5
-----------------------------------
Training Loss: 0.4061. Validation Loss: 0.3990.
-----------------------------------
Epoch 6
-----------------------------------
Training Loss: 0.3921. Validation Loss: 0.4018.
-----------------------------------
Epoch 7
-----------------------------------
Training Loss: 0.3956. Validation Loss: 0.4062.
-----------------------------------
Epoch 8
-----------------------------------
Training Loss: 0.3921. Validation Loss: 0.4089.
-----------------------------------
Epoch 9
-----------------------------------
Training Loss: 0.3916. Validation Loss: 0.3938.
-----------------------------------
Epoch 10
-----------------------------------
Training Loss: 0.3888. Validation Loss: 0.4522. 
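
The `utils.train` and `utils.evaluate` helpers called in the loop above are defined in the repository and are not reproduced in this notebook. The sketch below shows one way such helpers could be written. It assumes they return the summed loss and the per-batch mean as tensors, which would be consistent with the `.item()` calls in the loop; the `_sketch` names and the toy data are hypothetical.

```python
import torch
from torch import nn, optim
from torch.utils.data import DataLoader, TensorDataset

def train_sketch(model, loader, optimizer, criterion):
    """Run one training epoch; return (summed loss, mean loss per batch) as tensors."""
    model.train()
    total = torch.zeros(())
    for data, target in loader:
        optimizer.zero_grad()
        loss = criterion(model(data), target)
        loss.backward()
        optimizer.step()
        total += loss.detach()  # detach so the running sum holds no graph
    return total, total / len(loader)

@torch.no_grad()
def evaluate_sketch(model, loader, criterion):
    """Run one pass without gradient tracking; same return convention as train_sketch."""
    model.eval()
    total = torch.zeros(())
    for data, target in loader:
        total += criterion(model(data), target)
    return total, total / len(loader)

# Toy usage with random data shaped like the assumed inputs (10 x 128 per sample)
torch.manual_seed(0)
toy_set = TensorDataset(torch.randn(64, 10, 128), torch.randint(0, 2, (64,)))
toy_loader = DataLoader(toy_set, batch_size=32)
toy_model = nn.Sequential(nn.Flatten(), nn.Linear(1280, 2))
toy_optimizer = optim.Adam(toy_model.parameters(), lr=1e-3)
toy_criterion = nn.CrossEntropyLoss()

toy_total, toy_avg = train_sketch(toy_model, toy_loader, toy_optimizer, toy_criterion)
val_total, val_avg = evaluate_sketch(toy_model, toy_loader, toy_criterion)
print("train avg: %.4f, eval avg: %.4f" % (toy_avg.item(), val_avg.item()))
```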


## Optional - Simple Convolutional Model

Convolution is an established method for sound identification and has been used on this dataset before. The idea is to slide a learnable 1-D kernel across the audio features. In this simple convolutional model, all 10 seconds of each clip are flattened into a single tensor, and a 1-D kernel is strided across it.
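
The actual `SimpleConvolutionModel` architecture is defined in the repository and is not shown here. The sketch below illustrates the general idea under several assumptions: a `10 x 128` input per clip, a single convolution layer with 8 output channels, max pooling, and a linear classifier head. Only the kernel size and dropout rate correspond to parameters used in this notebook; everything else, including the class name, is illustrative.

```python
import torch
from torch import nn

class SimpleConvolutionSketch(nn.Module):
    """1-D convolution over a flattened clip, followed by a small classifier head."""

    def __init__(self, kernel_size=3, dropout_rate=0.2, num_classes=2):
        super().__init__()
        self.features = nn.Sequential(
            # With an odd kernel size, this padding keeps the length at 1280
            nn.Conv1d(1, 8, kernel_size, padding=kernel_size // 2),
            nn.ReLU(),
            nn.MaxPool1d(4),            # 1280 -> 320 positions
            nn.Dropout(dropout_rate),
        )
        self.classifier = nn.Linear(8 * 320, num_classes)

    def forward(self, x):
        # Flatten the clip into one long sequence with a single input channel
        x = x.flatten(start_dim=1).unsqueeze(1)   # (batch, 10, 128) -> (batch, 1, 1280)
        x = self.features(x)                      # (batch, 8, 320)
        return self.classifier(x.flatten(start_dim=1))

conv_sketch = SimpleConvolutionSketch(kernel_size=3, dropout_rate=0.2)
conv_logits = conv_sketch(torch.randn(4, 10, 128))
print(conv_logits.shape)
```

Flattening the clip treats the boundary between consecutive frames like any other pair of neighboring positions; convolving along the time axis with the 128 features as channels is an alternative layout worth comparing.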

In [None]:
# Simple Convolution Model Hyperparameters

BATCH_SIZE = 32
LEARNING_RATE = 1e-3
NUM_EPOCHS = 10

START_KERNEL_SIZE = 3
DROPOUT_RATE = 0.2

In [None]:
# Convolution Boilerplate code
train_loader = DataLoader(train_dataset, batch_size=BATCH_SIZE, shuffle=True)
val_loader = DataLoader(val_dataset, batch_size=BATCH_SIZE, shuffle=False)  # no need to shuffle for validation

convolution_model = SimpleConvolutionModel.SimpleConvolutionModel(START_KERNEL_SIZE, DROPOUT_RATE)
optimizer = optim.Adam(convolution_model.parameters(), lr=LEARNING_RATE)
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer)
criterion = nn.CrossEntropyLoss()

In [None]:
for epoch_idx in range(NUM_EPOCHS):
    print("-----------------------------------")
    print("Epoch %d" % (epoch_idx+1))
    print("-----------------------------------")
    
    train_loss, avg_train_loss = utils.train(convolution_model, train_loader, optimizer, criterion)
    scheduler.step(train_loss)

    val_loss, avg_val_loss = utils.evaluate(convolution_model, val_loader, criterion)

    avg_train_loss = avg_train_loss.item()
    avg_val_loss = avg_val_loss.item()
    print("Training Loss: %.4f. Validation Loss: %.4f. " % (avg_train_loss, avg_val_loss))

-----------------------------------
Epoch 1
-----------------------------------
Training Loss: 95.4416. Validation Loss: 51.7843.
-----------------------------------
Epoch 2
-----------------------------------
Training Loss: 34.2707. Validation Loss: 16.0627.
-----------------------------------
Epoch 3
-----------------------------------
Training Loss: 7.4483. Validation Loss: 3.1207.
-----------------------------------
Epoch 4
-----------------------------------
Training Loss: 1.5687. Validation Loss: 0.8310.
-----------------------------------
Epoch 5
-----------------------------------
Training Loss: 0.6056. Validation Loss: 0.5321.
-----------------------------------
Epoch 6
-----------------------------------
Training Loss: 0.4478. Validation Loss: 0.5142.
-----------------------------------
Epoch 7
-----------------------------------
Training Loss: 0.4225. Validation Loss: 0.4646.
-----------------------------------
Epoch 8
-----------------------------------
Training Loss: 0.4186. Validation Loss: 0.4541.
-----------------------------------
Epoch 9
-----------------------------------
Training Loss: 0.4154. Validation Loss: 0.5714.
-----------------------------------
Epoch 10
-----------------------------------
Training Loss: 0.4187. Validation Loss: 0.4841. 


## Optional - Model Evaluation

Pick one of the models trained above, run the evaluation data through it, and compute its metrics.
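
`utils.evaluate_with_metrics` is defined in the repository and only its accuracy output appears in this notebook. Because only about a quarter of the samples are positive, precision and recall for the speech class can be more informative than accuracy alone. The sketch below computes all three from raw logits; the function name `binary_metrics_sketch` is hypothetical and this is not the project's implementation.

```python
import torch

def binary_metrics_sketch(logits, targets):
    """Accuracy, precision and recall for the positive class (label 1)."""
    preds = logits.argmax(dim=1)
    tp = ((preds == 1) & (targets == 1)).sum().item()  # true positives
    fp = ((preds == 1) & (targets == 0)).sum().item()  # false positives
    fn = ((preds == 0) & (targets == 1)).sum().item()  # false negatives
    accuracy = (preds == targets).float().mean().item()
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall

# Toy example: predictions are [0, 1, 0, 1] against targets [0, 1, 1, 0]
toy_logits = torch.tensor([[2.0, 0.1], [0.2, 1.5], [1.0, 0.3], [0.1, 0.9]])
toy_targets = torch.tensor([0, 1, 1, 0])
acc, prec, rec = binary_metrics_sketch(toy_logits, toy_targets)
print(acc, prec, rec)
```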

In [None]:
# First, load the evaluation data
eval_data = utils.load_pytorch_tensor('./utils/eval_data.pt')
eval_label = utils.load_pytorch_tensor('./utils/eval_label.pt')

# make this multi-classification problem a binary classification problem
eval_label, count = utils.convert_multiclass_to_binary(0, eval_label)
print("Total of " + str(count) + " positive examples out of " + str(eval_label.shape[0]) + " samples")

eval_data = np.float32(eval_data)

# Next, pick the model you want to evaluate
# options: linear_model, convolution_model
model = linear_model 

# push the eval data through the model
eval_dataset = dataloader.MusicDataset(eval_data, eval_label)
eval_loader = DataLoader(eval_dataset, batch_size=BATCH_SIZE, shuffle=False)  # order does not matter for evaluation

avg_acc = utils.evaluate_with_metrics(model, eval_loader)

print("Model achieved an average accuracy of %.4f on evaluation data" % avg_acc.item())


Total of 5233 positive examples out of 19976 samples
Model achieved an average accuracy of 0.7380 on evaluation data
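
To put this number in context, it helps to compare it with the accuracy of a trivial classifier that always predicts the negative class. The short calculation below uses only the counts and the accuracy printed above.

```python
positives = 5233            # positive examples, from the cell output above
total = 19976               # evaluation samples, from the cell output above
reported_accuracy = 0.7380  # accuracy printed above

# Accuracy of a classifier that labels every sample as negative
majority_baseline = (total - positives) / total
print("Always-negative baseline: %.4f" % majority_baseline)
print("Gap to reported accuracy: %.4f" % abs(reported_accuracy - majority_baseline))
```

The baseline comes out to about 0.7380, the same as the reported accuracy to four decimal places. This suggests, though it does not prove, that the model rarely predicts the speech class on the evaluation set. Inspecting a confusion matrix or the positive-class recall would confirm whether that is the case.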
