# The Task

The purpose of this notebook is to build a pipeline that classifies a humming or whistling audio clip as either the Harry Potter theme tune or the StarWars theme tune.

The dataset used is the MLEnd Hums and Whistles dataset. For more information, see https://lnkd.in/eqAiwu23

# The Pipeline

We experimented with various architectures, reviewed the validation accuracy of each, and used it to decide on the final pipeline. Our final pipeline:

*   Split each input audio file into two halves
*   Extract the tempo feature for each half
*   Feed the extracted features into an SVM model with C=1 that predicts whether the input has a Potter or a StarWars label



# Splitting the Audio File

We will split each audio signal in half.

Following the coursework explanation notebook and [1], we load the signal with

`x, fs = librosa.load(files[n],sr=fs)`

and take the two sections as:

```
# first half of the signal
x1 = x[0:int(0.5*len(x))]

# second half of the signal
x2 = x[int(0.5*len(x)):len(x)]
```
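The same idea extends to any number of sections. The following is a minimal sketch, not the notebook's own code: the helper name `split_into_sections` is hypothetical, and it uses integer arithmetic for the section boundaries, a slight variation on the float-based indexing above. A short list stands in for the audio signal so that the example runs without any data.

```python
def split_into_sections(x, s):
    """Split the sequence x into s consecutive sections of (almost) equal length."""
    n = len(x)
    # Integer boundaries avoid floating-point rounding; every sample lands in exactly one section.
    bounds = [(j * n) // s for j in range(s + 1)]
    return [x[bounds[j]:bounds[j + 1]] for j in range(s)]


samples = list(range(7))  # stand-in for an audio signal (hypothetical data)
x1, x2 = split_into_sections(samples, 2)
print(x1, x2)
```

Because the function only relies on `len` and slicing, it should work equally on the NumPy array returned by `librosa.load`.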

# Extracting the Tempo Feature

We refer to [2] for information on tempo. 

```
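# Example: estimate the tempo of one audio file
import librosa

n = 5
fs = None  # keep the file's native sampling rate [1]
x, fs = librosa.load(files[n], sr=fs)
# Pass the signal by keyword; newer librosa releases no longer accept it positionally
tempo = librosa.beat.tempo(y=x, sr=fs)
print(tempo)       # the tempo estimate as a one-element array
print(sum(tempo))  # the same value as a single number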
```

We use the sum so that we get a single value rather than a one-element array; from here on, "the tempo" means this summed output. According to the librosa documentation [2], because we do not set the `aggregate` parameter to `None`, we get one overall tempo value rather than a value for each frame.
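To make the role of the sum concrete, here is a small sketch of an equivalent conversion that also checks the assumption behind it. The helper `tempo_to_scalar` is hypothetical (it is not part of librosa), and the value `117.45` is made up to stand in for a real tempo estimate.

```python
def tempo_to_scalar(tempo_output):
    """Collapse a one-element tempo estimate into a plain float."""
    values = list(tempo_output)
    if len(values) != 1:
        # Several values would mean per-frame tempo, where a sum is no longer a tempo.
        raise ValueError(f"expected one aggregated tempo value, got {len(values)}")
    return float(values[0])


example_output = [117.45]  # made-up value standing in for the librosa output
tempo_value = tempo_to_scalar(example_output)
print(tempo_value, sum(example_output))
```

For a one-element output the two approaches agree; the explicit check only matters if the aggregation were ever switched off, in which case the sum would silently add up the per-frame values.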





# The Machine Learning Model

**Support Vector Machines** - We will use an SVM model and experiment with varying the kernel and the regularisation parameter C [3]. It is unlikely that we will find a set of features for which the Potter and StarWars audio files are linearly separable, so an SVM is a good option: a kernel can map the data to a new space where it might be linearly separable.

In scikit-learn the strength of the regularisation is inversely proportional to C, so reducing C below one gives a softer margin and a boundary that follows the individual training samples less closely. This could be useful because there are likely to be some outlier training samples, and a boundary fitted to all of the training samples would not work well more generally [4], [5].

We will experiment with C = 1 and C = 0.9, and with kernel = 'rbf' (default), kernel = 'linear', kernel = 'poly' and kernel = 'sigmoid'.
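The experiment described above could be sketched as follows. This is an illustration under stated assumptions rather than the code we actually ran: the function `compare_svm_settings` is a hypothetical helper, and the data is synthetic (two made-up tempo-like features per sample) so that the block runs without the audio files.

```python
from itertools import product

import numpy as np
from sklearn.metrics import accuracy_score
from sklearn.svm import SVC


def compare_svm_settings(X_train, y_train, X_val, y_val,
                         Cs=(1, 0.9),
                         kernels=("rbf", "linear", "poly", "sigmoid")):
    """Fit one SVC per (C, kernel) pair and return its validation accuracy."""
    results = {}
    for C, kernel in product(Cs, kernels):
        model = SVC(C=C, kernel=kernel)
        model.fit(X_train, y_train)
        results[(C, kernel)] = accuracy_score(y_val, model.predict(X_val))
    return results


# Synthetic stand-in data: alternating labels, with one class shifted upwards.
rng = np.random.default_rng(0)
y_demo = np.arange(60) % 2 == 0
X_demo = rng.normal(loc=120.0, scale=20.0, size=(60, 2)) + 30.0 * y_demo[:, None]

results = compare_svm_settings(X_demo[:40], y_demo[:40], X_demo[40:], y_demo[40:])
for (C, kernel), acc in results.items():
    print(f"C={C}, kernel={kernel}: validation accuracy {acc:.2f}")
```

The accuracies printed here describe only the synthetic data and say nothing about the hum and whistle recordings.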

Note: in this notebook we have only included the model with the final chosen hyperparameters.

# The Code

The code will not run unless the required data is available under the following variable names.

`filescombinedTrain` contains approx. 70% of the Potter files and 70% of the StarWars files.

`filescombinedTest` contains approx. 30% of the Potter files and 30% of the StarWars files.

`filesPOTTER` contains all the Potter files.

`filesSTARWARS` contains all the StarWars files.

Note: we referred to [6] to take a sample of one of the sets of files so that we had the same number of Potter and StarWars files.
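One way these lists could be prepared is sketched below. This is an assumption about the preparation step, not the exact code used: the helper `balance_and_split`, the fixed seed, and the placeholder file names are all hypothetical. It uses `random.sample` [6] to draw the same number of files from each class and then cuts each class at roughly 70%.

```python
import random


def balance_and_split(potter_files, starwars_files, train_fraction=0.7, seed=0):
    """Sample equally many files per class, then split each class into train/test."""
    rng = random.Random(seed)  # fixed seed so the split is repeatable
    n = min(len(potter_files), len(starwars_files))
    potter = rng.sample(list(potter_files), n)
    starwars = rng.sample(list(starwars_files), n)
    cut = round(train_fraction * n)
    train = potter[:cut] + starwars[:cut]
    test = potter[cut:] + starwars[cut:]
    return train, test


# Placeholder file names; the real lists would hold paths to the audio files.
demo_potter = [f"potter_{i}.wav" for i in range(10)]
demo_starwars = [f"starwars_{i}.wav" for i in range(14)]
demo_train, demo_test = balance_and_split(demo_potter, demo_starwars)
print(len(demo_train), len(demo_test))
```

Splitting each class separately keeps the Potter and StarWars proportions the same in the training and test lists.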



In [None]:
import librosa
import numpy as np

In [None]:
#This is an updated version of the code provided in Principles of Machine Learning Module, Queen Mary University of London

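def getXy(s, files):
  """Return the per-section tempo features X and the boolean labels y (True = Potter)."""
  X, y = [], []
  for file in files:
    yi = file in filesPOTTER  # True if the file is in the Potter folder, otherwise False [7]

    fs = None  # keep the file's native sampling rate [1]
    x, fs = librosa.load(file, sr=fs)

    xi = []
    for j in range(s):
      # j-th of the s consecutive sections of the signal
      xj = x[int((j/s)*len(x)):int(((j+1)/s)*len(x))]
      # pass the signal by keyword; newer librosa releases no longer accept it positionally
      tempo_j = sum(librosa.beat.tempo(y=xj, sr=fs))  # [2]
      xi.append(tempo_j)  # the tempo feature for each of the s sections of the audio

    X.append(xi)
    y.append(yi)

  return np.array(X), np.array(y)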

In [None]:
#This is an updated version of the code provided in Principles of Machine Learning Module, Queen Mary University of London

X_train,y_train = getXy(2,filescombinedTrain)

In [None]:
#This is an updated version of the code provided in Principles of Machine Learning Module, Queen Mary University of London

X_test,y_test = getXy(2,filescombinedTest)

In [None]:
#referred to [3], [8], [9] and code provided in Principles of Machine Learning Module, Queen Mary University of London

from sklearn import svm
from sklearn.metrics import accuracy_score

model = svm.SVC(C=1)  # default kernel='rbf'
model.fit(X_train, y_train)

yt_p = model.predict(X_train)  # predictions on the training set
yv_p = model.predict(X_test)   # predictions on the test set

print('Training Accuracy', accuracy_score(y_train, yt_p))
print('Test Accuracy', accuracy_score(y_test, yv_p))

# References

[1] Brian McFee, Alexandros Metsai, Matt McVicar, Stefan Balke, Carl Thomé, Colin Raffel, Frank Zalkow, Ayoub Malek, Dana, Kyungyun Lee, Oriol Nieto, Dan Ellis, Jack Mason, Eric Battenberg, Scott Seyfarth, Ryuichi Yamamoto, viktorandreevichmorozov, Keunwoo Choi, Josh Moore, … Thassilo, librosa/librosa: 0.9.1 (0.9.1), Zenodo, 2022, https://doi.org/10.5281/zenodo.6097378 ; librosa.load documentation: https://librosa.org/doc/0.9.1/generated/librosa.load.html?highlight=librosa%20load#librosa.load

[2] Brian McFee, Alexandros Metsai, Matt McVicar, Stefan Balke, Carl Thomé, Colin Raffel, Frank Zalkow, Ayoub Malek, Dana, Kyungyun Lee, Oriol Nieto, Dan Ellis, Jack Mason, Eric Battenberg, Scott Seyfarth, Ryuichi Yamamoto, viktorandreevichmorozov, Keunwoo Choi, Josh Moore, … Thassilo, librosa/librosa: 0.9.1 (0.9.1), Zenodo, 2022, https://doi.org/10.5281/zenodo.6097378 ; librosa.beat.tempo documentation: https://librosa.org/doc/latest/generated/librosa.beat.tempo.html#librosa.beat.tempo

[3] scikit-learn developers (BSD License), sklearn.svm.SVC, scikit-learn, 2007-2021, https://scikit-learn.org/stable/modules/generated/sklearn.svm.SVC.html (accessed 10th December 2021)

[4] Using information from Data Mining Module, Queen Mary University of London

[5] scikit-learn developers (BSD License), 1.4.6. Kernel functions, scikit-learn, 2007-2021, https://scikit-learn.org/stable/modules/svm.html#svm-kernels (accessed 10th December 2021)

[6] Python Software Foundation, random - Generate pseudo-random numbers, Python, 2001-2021, https://docs.python.org/3/library/random.html?highlight=random%20sample 

[7] W3Schools, Python Booleans, 1999-2022, https://www.w3schools.com/python/python_booleans.asp (accessed 8th June 2022)

[8] datacamp, Support Vector Machines with Scikit-learn Tutorial, 2019, https://www.datacamp.com/tutorial/svm-classification-scikit-learn-python 

[9] scikit-learn developers (BSD License), sklearn.metrics.accuracy_score, 2007-2022, https://scikit-learn.org/stable/modules/generated/sklearn.metrics.accuracy_score.html 