# Implementing ANNs using TensorFlow - Final Project
## Using a Convolutional Neural Network for Genre Classification in Music
---

by Group 03 - Lea Doppertin and Sven Groen

We are trying to reproduce the results achieved by Huang, Serafini and Pugh [1]:

http://cs229.stanford.edu/proj2018/report/21.pdf

---

# Introduction

## Importing Libraries

In [122]:
import tensorflow as tf
import pandas as pd
import os
import numpy as np
import librosa as lb
import matplotlib.pyplot as plt
import IPython.display as ipd
import random

# The Dataset

The GTZAN dataset, which was also used by Huang, Serafini and Pugh [1], consists of 1000 WAV audio files, each 30 seconds long and sorted by musical genre. In [1], the authors preprocessed the audio in order to achieve better results:

>"From each clip, we sampled a
contiguous 2-second window at four random locations, thus augmenting our data to 8000 clips of two seconds each.
Since this data was sampled at 22050HZ, this leaves us with 44100 features for the raw audio input. We restricted
our windows to two seconds to limit the number of features. We found that 44100 features was the perfect balance
between length of audio sample and dimension of feature space. Thus after pre-processing our input is of shape (8000,
44100), where each feature denotes the amplitude at a certain timestep out of the 44100. We also used 100 samples of
un-augmented data each of our cross validation and test sets."

## Loading the Data in Python

### Creating a CSV for easier use

Having the data in a pandas DataFrame makes it easier to access the files.

In [72]:
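# Build the path with os.path.join so it works on every operating system
genre_directory = os.path.join(os.getcwd(), "data", "genres")

# Get the labels (all genres): the names of the immediate subdirectories
for dirpath, dirnames, filenames in os.walk(genre_directory):
    labels = sorted(dirnames)  # sorted for a reproducible order
    break

# Build a list of (filename, relative path, genre) tuples
data = []
for genre in labels:
    path = os.path.join(genre_directory, genre)
    for dirpath, dirnames, filenames in os.walk(path):
        for file in sorted(filenames):
            data.append((file, "data/genres/" + genre + "/" + file, genre))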

In [80]:
df = pd.DataFrame(data, columns = ["filename","path","genre"])
df.head()


| | filename | path | genre |
|---|---|---|---|
| 0 | blues.00000.wav | data/genres/blues/blues.00000.wav | blues |
| 1 | blues.00001.wav | data/genres/blues/blues.00001.wav | blues |
| 2 | blues.00002.wav | data/genres/blues/blues.00002.wav | blues |
| 3 | blues.00003.wav | data/genres/blues/blues.00003.wav | blues |
| 4 | blues.00004.wav | data/genres/blues/blues.00004.wav | blues |


In [8]:
print("Our Dataset consists of {} files.".format(len(df)))

Our Dataset consists of 1000 files.
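As a quick illustration of why the DataFrame format is convenient, the following minimal sketch counts the files per genre and selects all paths of one genre. It uses a small hand-made stand-in called `toy_df` (a hypothetical name); the three rows are illustrative only and are not the real dataset listing.

```python
import pandas as pd

# Illustrative rows only; the real DataFrame is built from the files on disk
rows = [
    ("blues.00000.wav", "data/genres/blues/blues.00000.wav", "blues"),
    ("blues.00001.wav", "data/genres/blues/blues.00001.wav", "blues"),
    ("rock.00000.wav", "data/genres/rock/rock.00000.wav", "rock"),
]
toy_df = pd.DataFrame(rows, columns=["filename", "path", "genre"])

# Number of files per genre
files_per_genre = toy_df["genre"].value_counts()
print(files_per_genre)

# All paths that belong to one genre
blues_paths = toy_df.loc[toy_df["genre"] == "blues", "path"].tolist()
print(blues_paths)
```

The same two calls should work unchanged on the full `df`, where they can serve as a sanity check that the directory walk picked up every file.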


### Loading the Data

In [133]:
signals = []
for file in df["path"]:
    # librosa resamples to 22050 Hz and converts to mono by default
    signal, sampling_rate = lb.load(file)
    signals.append(signal)


In [136]:
# Let's listen to a random example
# (random.choice works on a list of arrays; np.random.choice expects a 1-D array)
ipd.Audio(random.choice(signals), rate=sampling_rate)

#### 2-second random samples

We need to sample a contiguous 2-second window at four random locations in each clip.
To get the duration of a signal in seconds, we divide its length (the number of samples) by the sampling rate:

$\frac{len(signal)}{sampling\_rate} = duration_{seconds}$

$\Leftrightarrow len(signal) = duration_{seconds} \cdot sampling\_rate$


In [139]:
len_signal = 2 * sampling_rate
print("To get 2-second random clips from our audio signal, we have to look at {} contiguous entries in our signal array.".format(len_signal))

To get 2-second random clips from our audio signal, we have to look at 44100 contiguous entries in our signal array.


To achieve this, we pick a random starting index (at most 28 seconds into the clip, so that a full 2-second window still fits) and return the 2 seconds of signal that follow it:

In [146]:
# Cut a random contiguous clip of `duration` seconds out of a longer signal
def get_rand_sample(signal, duration, sr=22050):
    len_signal = int(duration * sr)
    # The latest valid start index is len(signal) - len_signal, so the clip always fits
    rnd_start_index = np.random.randint(len(signal) - len_signal + 1)

    return signal[rnd_start_index: rnd_start_index + len_signal]


example = get_rand_sample(signals[5], 2, sr=sampling_rate)
assert len(example) / sampling_rate == 2.0
ipd.Audio(example, rate=sampling_rate)


Now we have to draw four random windows of 44100 contiguous samples from each signal.
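One possible way to do this is sketched below. This is a minimal example under stated assumptions, not the procedure from [1]: the helper name `sample_windows` is hypothetical, and the loaded GTZAN signals are replaced by three 30-second clips of random noise so that the snippet runs on its own.

```python
import numpy as np

def sample_windows(signal, n_windows=4, duration=2, sr=22050, rng=None):
    """Return n_windows random contiguous windows of `duration` seconds each."""
    rng = np.random.default_rng() if rng is None else rng
    window_len = int(duration * sr)
    # Highest valid start index such that the window still fits in the signal
    max_start = len(signal) - window_len
    starts = rng.integers(0, max_start + 1, size=n_windows)
    return np.stack([signal[s:s + window_len] for s in starts])

# Stand-in for the loaded signals: three 30-second clips of noise (assumption)
rng = np.random.default_rng(0)
fake_signals = [rng.standard_normal(30 * 22050).astype(np.float32) for _ in range(3)]

# Four windows per clip, stacked into one array of shape (n_clips * 4, 44100)
augmented = np.concatenate([sample_windows(s, rng=rng) for s in fake_signals])
print(augmented.shape)  # (12, 44100)
```

Passing a seeded generator keeps the random windows reproducible between runs. Applied to the real `signals` list, the same loop would give one row per window, and each genre label would need to be repeated four times to stay aligned with the rows.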

## Exploring the Dataset

# Preprocessing the Data


## Generating the Images

# The Convolutional Network

# Performance

# Results

# References

[1] Huang, Serafini and Pugh. Music Genre Classification. CS229 Final Report. http://cs229.stanford.edu/proj2018/report/21.pdf