# Music Similarity

This tutorial shows how a model can be trained end-to-end and how it can then be used for prediction.  
It is split into two parts:

1. Theory

2. Tutorial  
2.1 How to train a model  
2.2 Predicting the classes  
2.3 How to store and load the trained model  

## 1. Theory

TODO

## 2. Tutorial

First, we import the necessary modules and define the class names and the dataset paths.

`MusicSimModel` is the classifier and `train` computes the Gaussian mixture parameters and the SVC.  
The dataset we'll be using consists of two classes: "shounen" and "shoujo".

In [1]:
import sys
from pathlib import Path

# Make the modules in ../analysis importable
sys.path.append('../analysis')

from classification import MusicSimModel
from training import train

class_names = ('shounen', 'shoujo')
shounen_data_path = Path('/tmp/AnimeThemes/Shounen/wav')
shoujo_data_path = Path('/tmp/AnimeThemes/Shoujo/wav')

We'll collect the paths of all samples in `samples` and the corresponding class label of each sample in `class_labels`. "shounen" is given the label `1` and "shoujo" the label `2`.

In [2]:
samples = []
class_labels = []

for song_path in shounen_data_path.glob('*'):
    samples.append(str(song_path.resolve()))
    class_labels.append(1)

for song_path in shoujo_data_path.glob('*'):
    samples.append(str(song_path.resolve()))
    class_labels.append(2)

print(f'Total samples={len(samples)}')
print(f'Shounen class={class_labels.count(1)}')
print(f'Shoujo class={class_labels.count(2)}')

Total samples=238
Shounen class=152
Shoujo class=86
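
The same collection step can be wrapped in a small helper. The following is a minimal sketch, not part of the project: `collect_samples` and its `pattern` parameter are hypothetical names, and the demonstration runs on throwaway directories with empty placeholder files instead of the real dataset. Unlike the `glob('*')` used above, the sketch only picks up files ending in `.wav`.

```python
import tempfile
from pathlib import Path


def collect_samples(class_dirs, pattern='*.wav'):
    """Return (samples, class_labels) for a {label: directory} mapping.

    Hypothetical helper, not part of the tutorial's modules.
    """
    samples, class_labels = [], []
    for label, directory in class_dirs.items():
        # sorted() makes the order reproducible; glob order is otherwise arbitrary
        for song_path in sorted(Path(directory).glob(pattern)):
            samples.append(str(song_path.resolve()))
            class_labels.append(label)
    return samples, class_labels


# Demonstration on temporary directories containing empty placeholder files
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for name, count in (('Shounen', 3), ('Shoujo', 2)):
        (root / name).mkdir()
        for i in range(count):
            (root / name / f'song{i}.wav').touch()
        (root / name / 'notes.txt').touch()  # skipped by the '*.wav' pattern

    samples, class_labels = collect_samples(
        {1: root / 'Shounen', 2: root / 'Shoujo'}
    )

print(f'Total samples={len(samples)}')
print(f'Shounen class={class_labels.count(1)}')
print(f'Shoujo class={class_labels.count(2)}')
```

With the real dataset, the mapping would be `{1: shounen_data_path, 2: shoujo_data_path}`.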


## 2.1 How to train a model

Now we can use `train` to compute the Gaussian mixture parameters and the SVC, and then build the model from them.

In [3]:
svc, gmm_parameters = train(samples, class_labels, processes=4)

model = MusicSimModel(gmm_parameters, class_labels, class_names, svc)

Computing Gaussian Mixture Models with procceses=4
    

Computing SVC gramm matrix


Applying rbf
Training SVC


## 2.2 Predicting the classes

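We can now use the trained model to predict which class a sample belongs to.
For this example we'll use a shoujo song located in `/tmp/`. `predict_file()` takes a list of file paths and returns one result per file: the probability and the predicted class. The predicted class is used below as an index into `class_names`, so it is not the same number as the label in `class_labels`: here `1` maps to `class_names[1]`, i.e. "shoujo".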

In [4]:
# MusicSimModel currently only has the model parameters
# so we have to create the mixture models using load()
model.load()

# Now let's try predicting the class of a sample
sample = '/tmp/Cardcaptor Sakura.wav'
results = model.predict_file([sample])
prob, predicted_class = results[0]

print(f'predicted_class={predicted_class}')
print(f'probability={prob}')

print(f'class={model.class_names[predicted_class]}')

predicted_class=1
probability=0.7317474968214651
class=shoujo
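
Since `predict_file()` accepts a list, several files can be handled in one call. The sketch below shows one way to turn the results into readable lines. `describe_predictions` is a hypothetical helper, and the sketch assumes that the results come back in the same order as the input paths and that each result is a `(probability, class index)` pair, as in the cell above. It uses stand-in values instead of a trained model so that it runs on its own.

```python
def describe_predictions(paths, results, class_names):
    """Pair each input path with its predicted class name and probability."""
    described = []
    for path, (prob, predicted_class) in zip(paths, results):
        described.append((path, class_names[predicted_class], prob))
    return described


# Stand-in values shaped like the output above; with a trained model
# this would be: results = model.predict_file(paths)
class_names = ('shounen', 'shoujo')
paths = ['/tmp/Cardcaptor Sakura.wav']
results = [(0.7317474968214651, 1)]

described = describe_predictions(paths, results, class_names)
for path, name, prob in described:
    print(f'{path}: {name} ({prob:.1%})')
```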


In [5]:
%timeit model.predict_file([sample])

3.2 s ± 88.7 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


## 2.3 How to store and load the trained model

The model is stored by pickling the `MusicSimModel` object.  
The `train_and_store()` helper function can be used to train a model and store it in one step.  

Using the samples we've already defined in section 2, an end-to-end example would be:

```python
path = '/path/to/model/'
filename = 'model.pickle'

from training import train_and_store

# store a model
train_and_store(samples, class_labels, path, filename, processes=4)
```

If, however, you already have a trained model, you can store it yourself by pickling it to a file:

```python
import pickle

with open(path + filename, 'wb') as f:
    pickle.dump(model, f)
```

Once a model has been saved at a given path, it can be loaded again via:

```python
import pickle

# load a model
with open(path + filename, 'rb') as f:
    model: MusicSimModel = pickle.load(f)
```
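
The store and load steps can be checked with a quick round trip. This is a minimal sketch under stated assumptions: `save_model` and `load_model` are hypothetical helpers, and a plain dictionary stands in for the trained `MusicSimModel` so that the example runs without the project's modules.

```python
import pickle
import tempfile
from pathlib import Path


def save_model(model, path):
    """Pickle `model` to `path`, creating parent directories if needed."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open('wb') as f:
        pickle.dump(model, f)


def load_model(path):
    """Load a pickled model from `path`."""
    with Path(path).open('rb') as f:
        return pickle.load(f)


# A plain dict stands in for the trained MusicSimModel in this sketch
stand_in = {'class_names': ('shounen', 'shoujo')}

with tempfile.TemporaryDirectory() as tmp:
    model_path = Path(tmp) / 'models' / 'model.pickle'
    save_model(stand_in, model_path)
    restored = load_model(model_path)

print(restored['class_names'])
```

When unpickling a real `MusicSimModel`, the `classification` module has to be importable, because pickle stores a reference to the class and not its code.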