<a href="https://colab.research.google.com/github/HumanAndMachineHearing/Practical_2023/blob/main/Assignment_3_Students.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **Programming Assignment 3. Data pre-processing and exploring the ResNet-18 model**

## 1. Introduction

In Session 3, you will explore the structure of the ResNet-18 model, get acquainted with the files you need to train it, and prepare your datasets (that is, a one-channel dataset and a three-channel dataset).

* The paper by He et al. (2016) in which ResNet was originally introduced can be found [here](https://openaccess.thecvf.com/content_cvpr_2016/papers/He_Deep_Residual_Learning_CVPR_2016_paper.pdf).
* A helpful explanation of the ResNet paper can be found [here](https://debuggercafe.com/residual-neural-networks-resnets-paper-explanation/).
* If you would like to know more about implementing ResNet-18 from scratch, have a look at the following [blog post](https://debuggercafe.com/implementing-resnet18-in-pytorch-from-scratch/). The model code used in this assignment is largely based on this implementation.

**Practical Report**


<p align = 'justify'> For this session, you are expected to add the output of the exercises defined in this notebook, together with your answers, to the Practical Report. A link to the templates for the practical reports can be found in the Readme file.



## Preparation

Before starting the exercises, import the libraries that are essential for this exercise.

In [None]:
# Machine learning framework
import torch

# Library for audio and signal processing with PyTorch
import torchaudio
import torchaudio.transforms as T # Common audio processing and feature extraction transforms
import torchaudio.functional as F # The same features implemented as standalone functions

# For manipulating directory paths
import os

# For working with datasets
import pandas as pd

# Plotting library
import matplotlib.pyplot as plt
# To embed plots within the notebook
%matplotlib inline

# Scientific and vector computation for Python
import numpy as np

## 2. Preparing data and metadata

Generate a list of the names of all sound clips in the data folder so that the data can be accessed easily later.

In [None]:
# Mount Google Drive
from google.colab import drive
drive.mount('/content/drive')

# Define file paths to the folder with .wav files and the folder with the metadata
filepath_snds = '/content/drive/MyDrive/soundclass_resnet18/audio'
filepath_metadata = '/content/drive/MyDrive/soundclass_resnet18'

# Name of the csv file that lists the sound clips
filename_csv = 'esc50_edited.csv'

# Load csv file (semicolon-separated) from the metadata folder
csv_data = pd.read_csv(os.path.join(filepath_metadata, filename_csv), sep=';')

# Convert filenames to list
file_IDs = csv_data['filename'].to_list()

## 3. Generate two audio feature data sets consisting of a single channel

Write a routine that converts the .wav files into the desired sound feature with the required dimensions, using the code from the previous sessions. Do this for two different sound features. Save each dataset in a separate folder.

* Dimensions of the one-channel dataset: [1, 128, 128]

In [None]:
# Routine to extract single sound feature and store in folder

for count, value in enumerate(file_IDs):
  # Add code to extract audio features

  # Use torch.save() to save the resulting tensors (by convention with a .pt file extension).
  pass  # Placeholder so that the cell runs; remove it once you have added your code

## 4. The ResNet-18 model

<p align="justify"> To train the ResNet-18 model, we use the files listed below. Have a look at the files to get acquainted with the materials.

* **resnet18.py**: This file contains the code for building the ResNet basic blocks and the code for the ResNet module.
* **utils_resnet_esc50.py**: This file contains the utility functions, i.e., the function definitions for loading the training and validation datasets.
* **training_utils_resnet_esc50_v2.py**: This file contains the function definitions for training and validation.
* **train_resnet_esc50.py**: This script contains everything that is needed for training, i.e., importing the required modules, setting the training parameters, loading the model, and executing training.
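
To make the idea of a basic block more concrete, the sketch below shows a generic residual basic block in PyTorch, following the general design described by He et al. (2016). It is not the code from **resnet18.py**, which may differ in its details; the class name `BasicBlock`, its arguments, and the example tensor sizes are assumptions chosen for illustration.

```python
import torch
import torch.nn as nn


class BasicBlock(nn.Module):
    """Generic residual block: two 3x3 convolutions plus a skip connection (illustrative sketch)."""

    def __init__(self, in_channels, out_channels, stride=1):
        super().__init__()
        self.conv1 = nn.Conv2d(in_channels, out_channels, kernel_size=3,
                               stride=stride, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(out_channels)
        self.conv2 = nn.Conv2d(out_channels, out_channels, kernel_size=3,
                               stride=1, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(out_channels)
        self.relu = nn.ReLU()

        # If the block changes the spatial size or the number of channels,
        # the input is projected with a 1x1 convolution so it can be added to the output
        self.shortcut = None
        if stride != 1 or in_channels != out_channels:
            self.shortcut = nn.Sequential(
                nn.Conv2d(in_channels, out_channels, kernel_size=1,
                          stride=stride, bias=False),
                nn.BatchNorm2d(out_channels),
            )

    def forward(self, x):
        identity = x if self.shortcut is None else self.shortcut(x)
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        # Residual connection: add the (possibly projected) input before the final activation
        return self.relu(out + identity)


# Example: a block that doubles the channels and halves the spatial resolution
block = BasicBlock(64, 128, stride=2)
x = torch.randn(2, 64, 32, 32)  # [batch, channels, height, width]
y = block(x)
print(y.shape)  # torch.Size([2, 128, 16, 16])
```

The addition of `identity` to the output of the two convolutions is the residual connection that gives ResNet its name; when comparing with **resnet18.py**, look for where this addition takes place and how the blocks are stacked.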

**Exercise 3.1:**
<p align = "justify"> (A) Describe the ResNet-18 model architecture in your practical report using the information from the sources mentioned at the top of the page.
<p align = "justify"> (B) Explain why ResNet-18 is a suitable model for the present task and dataset or, if you think another model would be better suited, describe why you think your proposed model would give better results than ResNet-18.