# Predicting Classical Composers from Spectrograms

Let's try to predict a composer from spectrograms of their compositions. We'll strip away everything about a composition except how its frequency content changes over time when the piece is played by a single instrument (an organ).

In [10]:
%reload_ext autoreload
%autoreload 2
%matplotlib inline

### Imports

In [11]:
from fastai import *
from fastai.vision import *

## Data
I found a great data set on a subreddit for DJs who use MIDI files to sample classical music.

It included 370,000 MIDI files, which I filtered down to classical compositions.

Unfortunately, the data is a mess. Some folders are composers, some are compositions, and there's no clear order to any of the files.

![messy data](./normalization_1.png)

We need to get things in a roughly sensible order. I moved compositions to the folder of their composer, merged duplicate folders, and extracted any zipped MIDI files. This gives us one folder per composer, named after that composer.

We'll need to normalize our folders and files next.

### Normalization
Since we'll use folder names as our labels, let's normalize our folder names first.

```py
from pathlib import Path

# Downcases the names of all directories inside the current directory.

path = Path('.')
dirs = [entry for entry in path.iterdir() if entry.is_dir()]
for directory in dirs:
    print('Changing', directory)
    directory.rename(directory.with_name(directory.name.lower()))
print('Done')
```

Next, let's normalize our file names to get them ready for conversion.

```py
#!/usr/bin/env python3

from pathlib import Path
from sys import argv

# Recursive renaming!
# Renames every MIDI file in each subfolder to the format [foldername]_[filenumber].mid
# WARNING! By default, operates on the current directory. Otherwise pass in a folder name
# as first argv.
def rename_files(folder_path, folder_name):
    # Match on the extension, so names that merely contain "mid" are left alone
    files = [file for file in folder_path.iterdir() if file.is_file() and file.suffix.lower() in ('.mid', '.midi')]
    for (x, file) in enumerate(files):
        print('Renaming', file)
        file.rename(folder_path/f'{folder_name}_{x}.mid')
    print('Done')

def recursive_rename(dir_path):
    folders = [folder for folder in dir_path.iterdir() if folder.is_dir()]
    for folder in folders:
        print('Renaming:', folder.name)
        rename_files(folder, folder.name)

# argv[1] is a plain string, so wrap it in Path before using it
root_path = Path(argv[1]) if len(argv) > 1 else Path('.')
recursive_rename(root_path)
```
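
Since the rename script changes files in place, it can help to preview the result first. Here's a minimal dry-run sketch of the same `[foldername]_[filenumber]` scheme. The `planned_renames` helper and the demo file names are made up for illustration and are not part of the script above.

```py
from pathlib import Path
import tempfile

def planned_renames(root):
    """Return (old_path, new_path) pairs without touching the filesystem."""
    plan = []
    for folder in sorted(p for p in Path(root).iterdir() if p.is_dir()):
        midis = sorted(f for f in folder.iterdir()
                       if f.is_file() and f.suffix.lower() == '.mid')
        for number, midi in enumerate(midis):
            plan.append((midi, folder / f'{folder.name}_{number}.mid'))
    return plan

# Demo on a throwaway directory with made-up file names.
with tempfile.TemporaryDirectory() as tmp:
    composer = Path(tmp) / 'mozart'
    composer.mkdir()
    for name in ('K545.MID', 'requiem.mid', 'notes.txt'):
        (composer / name).touch()
    preview = [(old.name, new.name) for old, new in planned_renames(tmp)]
    print(preview)
```

The text file is skipped, and nothing is renamed until you decide the plan looks right.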

### MIDI to WAV
Next we need to convert our MIDI files to WAV files, so we can process them into a spectrogram.

MIDI files not only encode the notes of a given composition, but also data about the instruments. Our source data has symphonies, pieces for solo guitar, piano concertos, and just about everything in between. Some composers wrote heavily for certain instrument groups, so we *could* use this instrumentation data to predict.

But for this project we'll see if we can find patterns in only the notes themselves. To do this, we need to render every MIDI file with the same synth, so all compositions will use the same organ-like sound.

We'll use a [NodeJS package called synth](https://github.com/patrickroberts/synth-js) to handle the conversion.

Here's a script that calls a companion bash script to convert all midi files in a directory. The bash and python scripts should be in the same directory. You can call it like this: ```./convert_all_midi_files_script.py ./my-compositions-directory-here```

```py
#!/usr/bin/env python3

from pathlib import Path
from sys import argv
import subprocess

# Takes in a directory and converts all MIDI files in its subfolders to .wav

def convert_all_midi_to_wav(folder_path):
    files = [file for file in folder_path.iterdir() if file.is_file() and file.suffix == '.mid']
    for file in files:
        # synth adds .mid to the input file by default, so we remove it first
        resolved = file.with_suffix('')
        # Check if the .wav exists, so we can safely re-run the script without
        # re-converting any files that were already converted
        if not file.with_suffix('.wav').is_file():
            print('Converting...', resolved.name)
            # Pass the arguments as a list so paths containing spaces stay in one piece
            subprocess.call(['./convert_midi_to_wav.sh', str(resolved)])


def do_recursively_on_folders(dir_path, action):
    folders = [folder for folder in dir_path.iterdir() if folder.is_dir()]
    for folder in folders:
        action(folder)

if len(argv) < 2:
    raise Exception('Please specify a path')
path = Path(argv[1])
do_recursively_on_folders(path, convert_all_midi_to_wav)

```

Here's the companion bash script.

```sh
#!/usr/bin/env bash

# Quote the argument so paths containing spaces are passed as one word
synth -i "$1"
```

Conversion may take a long time! Don't worry if something fails or the script gets interrupted. You can call the script as many times as you need without worrying about overwriting or losing the files you've already converted.

![converting to .wav files](./conversion.gif)
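
The script is safe to re-run because it skips any MIDI file that already has a WAV next to it. As a rough sketch of that check (the `pending_conversions` helper and the file names are hypothetical), you can list what's still left to convert:

```py
from pathlib import Path
import tempfile

def pending_conversions(folder):
    """List MIDI files in `folder` that have no matching .wav next to them."""
    return sorted(f for f in Path(folder).iterdir()
                  if f.suffix.lower() == '.mid'
                  and not f.with_suffix('.wav').exists())

# Demo on a throwaway directory with made-up file names.
with tempfile.TemporaryDirectory() as tmp:
    for name in ('bach_0.mid', 'bach_0.wav', 'bach_1.mid'):
        (Path(tmp) / name).touch()
    todo = [f.name for f in pending_conversions(tmp)]
    print(todo)  # bach_0.mid already has a .wav, so only bach_1.mid is listed
```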

### WAV to Spectrogram

With our data organized by composer and converted to WAV files, it's time to make our spectrograms!

A [spectrogram](https://en.wikipedia.org/wiki/Spectrogram) is a visual representation of sound and its properties. 

Spectrograms plot frequency on the y axis and time on the x axis. The more intense the color at a point, the more energy the sound has at that frequency at that moment.
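
To make that concrete, here's a toy sketch in plain Python. It cuts a signal into short frames and measures how strong each frequency is in each frame. The sample rate, frame size, and test tones are arbitrary choices for the demo. A real spectrogram function such as `scipy.signal.spectrogram` also applies windowing and overlapping frames, which this sketch leaves out.

```py
import cmath
import math

SAMPLE_RATE = 8000   # samples per second (arbitrary choice for the demo)
FRAME = 256          # samples per time slice

def tone(freq, n_samples):
    """A pure sine wave at `freq` Hz."""
    return [math.sin(2 * math.pi * freq * i / SAMPLE_RATE) for i in range(n_samples)]

def frame_magnitudes(frame):
    """Strength of each frequency bin in one frame (naive DFT)."""
    n = len(frame)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(frame)))
            for k in range(n // 2)]

# A 500 Hz tone followed by a 1000 Hz tone, two frames each.
samples = tone(500, 2 * FRAME) + tone(1000, 2 * FRAME)

# One list of magnitudes per time slice: time runs along the outer list,
# frequency along the inner one.
spectrogram = [frame_magnitudes(samples[start:start + FRAME])
               for start in range(0, len(samples), FRAME)]

# The strongest frequency in each time slice, converted from bin index to Hz.
peaks_hz = [max(range(len(column)), key=column.__getitem__) * SAMPLE_RATE / FRAME
            for column in spectrogram]
print(peaks_hz)  # [500.0, 500.0, 1000.0, 1000.0]
```

Plotted as an image, the first half would show a bright line at 500 Hz and the second half a bright line at 1000 Hz.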

Here's another script to convert all our WAV files to spectrograms. This will likely also take a while!

```py
#!/usr/bin/env python3

from pathlib import Path
from sys import argv
from scipy import signal
from scipy.io import wavfile
import numpy as np
import matplotlib.pyplot as plt

def create_spectrograms(folder_path):
    files = [file for file in folder_path.iterdir() if file.is_file() and file.suffix == '.wav']
    for file in files:
        png_path = file.with_suffix('.png')
        # Skip files that already have a spectrogram, so the script is safe to re-run
        if not png_path.is_file():
            print('making spectro for', png_path.stem)
            sample_rate, samples = wavfile.read(file)
            if samples.ndim > 1:
                # Stereo file: average the channels so we analyze a single signal
                samples = samples.mean(axis=1)
            frequencies, times, spectrogram = signal.spectrogram(samples, sample_rate)
            # The tiny offset avoids taking log(0) in silent passages
            plt.pcolormesh(np.log(spectrogram + 1e-10))
            plt.axis('off')
            plt.ylim(ymax=40)  # keep only the lowest 40 frequency bins
            # plt.show() # Uncomment if you want to check out the plot.
            # saves the file as a .png with transparent background (no border)
            plt.savefig(png_path, transparent=True)
            # Clear the figure so the next spectrogram isn't drawn on top of this one
            plt.clf()

def do_recursively_on_folders(dir_path, action):
    folders = [folder for folder in dir_path.iterdir() if folder.is_dir()]
    for folder in folders:
        action(folder)

if (len(argv) < 2):
    raise Exception('Please specify a path')
path = Path(argv[1])
do_recursively_on_folders(path, create_spectrograms)
```

Converting...
![spectrogram conversion](./spectro-conversion.gif)

Look at these beautiful spectrograms!

Here's a Mozart, which looks pretty tame and orderly.
![mozart spectrogram](./mozart_sample.png)

And here's early-20th century composer [Charles Griffes](https://en.wikipedia.org/wiki/Charles_Tomlinson_Griffes), looking far more intensely [Modern](https://en.wikipedia.org/wiki/Modernism_(music)).
![charles griffes spectrogram](./griffes_sample.png)

### Moving Data to GCP
We need to get our data to our GPU-enabled computer. I'm using a compute instance on [GCP](https://cloud.google.com) to train our model. This step assumes you're using GCP too. If you're not, just ignore this step!

Let's zip up all our files to make them smaller and easier to send.

```sh
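# Recursively packs our 'data' directory into a gzipped tarball named 'compositions.tar.gz', excluding the wav and mid files we made in previous steps.
# The --exclude options come before the path and are quoted so the shell doesn't expand them.
tar -czvf compositions.tar.gz --exclude='*.wav' --exclude='*.mid' ./data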

```
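
If you'd rather stay in Python, the standard library's `tarfile` module can build a similar archive. This is only a sketch: the `make_archive` and `drop_audio` helpers and the demo files are made up for illustration.

```py
import tarfile
import tempfile
from pathlib import Path

SKIP = {'.wav', '.mid'}

def drop_audio(member):
    """Filter for TarFile.add: returning None leaves the member out."""
    return None if Path(member.name).suffix in SKIP else member

def make_archive(data_dir, archive_path):
    with tarfile.open(archive_path, 'w:gz') as tar:
        tar.add(data_dir, arcname='data', filter=drop_audio)

# Demo on a throwaway directory with made-up file names.
with tempfile.TemporaryDirectory() as tmp:
    composer = Path(tmp) / 'data' / 'bach'
    composer.mkdir(parents=True)
    for name in ('bach_0.mid', 'bach_0.wav', 'bach_0.png'):
        (composer / name).touch()
    archive = Path(tmp) / 'compositions.tar.gz'
    make_archive(Path(tmp) / 'data', archive)
    with tarfile.open(archive) as tar:
        archived = sorted(tar.getnames())
    print(archived)  # only the folders and the .png make it into the archive
```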
Cool. The easiest way to get our data to GCP is with [scp](https://www.garron.me/en/articles/scp.html).

Yep, you guessed it, here's another script to send your files off to Google Cloud.

```sh
#! /bin/bash
gcloud compute scp --recurse ./your-data-path-here your-gcp-instance-name-here:~/
```

You may need to move the files from your home directory (`~/`) on the GCP instance to another location, depending on where your notebook expects them.

## Training
Sweet. We're ready to load our data into a model and start training!

We'll start with [Resnet 34](https://www.kaggle.com/pytorch/resnet34), a model pre-trained for image recognition.

Let's load in our data!

In [21]:
path = Path('/compositions')

We'll use fastai's `ImageDataBunch` class to get training and validation sets from our data. We'll train on 70% of the data and validate on 30%.


In [19]:
data = ImageDataBunch.from_folder(path, valid_pct=0.3, ds_tfms=get_transforms(), size=224)

`.from_folder` looks through a directory whose subfolders are named by class (here, by composer) and labels each image with its folder name. Because we pass `valid_pct=0.3`, it randomly holds out 30% of the images as the validation set instead of expecting separate `train` and `valid` folders.

`ImageDataBunch` takes a few additional parameters to help prepare our data for our model. `get_transforms()` returns fastai's default set of image transforms, which we apply to our images using the `ds_tfms` parameter. These include transforms like flipping, rotating, zooming and cropping the images.

We'll also set `size` to 224, since 224x224 is the image size Resnet34 was pre-trained on.
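
To make the `valid_pct=0.3` split concrete, here's a rough sketch of a random 70/30 hold-out in plain Python. It only illustrates the idea and is not fastai's implementation. The `hold_out` helper, the seed, and the file names are assumptions for the demo.

```py
import random

def hold_out(items, valid_pct=0.3, seed=42):
    """Shuffle a copy of `items`, then set aside `valid_pct` of them for validation."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_valid = int(len(shuffled) * valid_pct)
    return shuffled[n_valid:], shuffled[:n_valid]

files = [f'bach_{i}.png' for i in range(10)]
train, valid = hold_out(files)
print(len(train), len(valid))  # 7 3
```

Every file ends up in exactly one of the two sets, so the model is never validated on an image it trained on.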

### Sanity Check!
Before we continue let's sanity check our data.

First, let's peek at a few rows to make sure they look right:

In [23]:
data.show_batch(rows=5, figsize=(7,6))

Looking good. Let's make sure our class labels look alright:

In [24]:
print(data.classes)
len(data.classes),data.c

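Cool! Let's go ahead and create our learner and train it for 5 epochs (5 passes through the data). Note that `ConvLearner` is the name used by early fastai v1 releases; later v1 releases renamed it to `create_cnn` and then `cnn_learner`.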

In [26]:
learn = ConvLearner(data, models.resnet34, metrics=error_rate)
learn.fit_one_cycle(5)

Let's save the model so we can use it in future sessions without having to wait for training again.

In [27]:
learn.save('res-34')

## Model Interpretation and Analysis

Let's see how well we're doing. Specifically, we want to understand where our model is confused. If the model's confusion seems reasonable, we're doing well.

In [None]:
interp = ClassificationInterpretation.from_learner(learn)

First we'll plot our `top_losses`. These are the images with the highest loss, where the model's prediction was furthest from the true label.

In [None]:
interp.plot_top_losses(10, figsize=(15,11))
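
As a rough illustration of what "highest loss" means, here's a sketch that ranks a few predictions by cross-entropy loss. The file names and probabilities are made up, and the assumption is that the learner uses a cross-entropy style loss, which is the usual choice for classification.

```py
import math

# Hypothetical predictions: (file, true composer, predicted probabilities).
predictions = [
    ('mozart_3.png', 'mozart', {'mozart': 0.90, 'griffes': 0.10}),
    ('griffes_7.png', 'griffes', {'mozart': 0.95, 'griffes': 0.05}),
    ('mozart_9.png', 'mozart', {'mozart': 0.40, 'griffes': 0.60}),
]

def cross_entropy(true_label, probs):
    """Loss grows as the probability given to the true label shrinks."""
    return -math.log(probs[true_label])

ranked = sorted(predictions, key=lambda p: cross_entropy(p[1], p[2]), reverse=True)
worst_first = [name for name, _, _ in ranked]
print(worst_first)  # ['griffes_7.png', 'mozart_9.png', 'mozart_3.png']
```

The Griffes piece comes first because the model was confidently wrong about it.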

Next, we'll plot a confusion matrix, which shows how often each actual label was predicted as each other label. The sharp diagonal line holds the correct predictions; the cells above and below it show how many times a pair of labels was confused.

As you can see....

In [28]:
interp.plot_confusion_matrix(figsize=(12,12), dpi=60)

Let's check out our most confused pairs to see what we can learn.

In [None]:
interp.most_confused(min_val=3)

Interesting!
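
To show how a confusion matrix and a "most confused" list fit together, here's a small sketch in plain Python. The composer labels and predictions are made up for the demo, and this `most_confused` is a stand-in that mirrors the idea rather than fastai's code.

```py
from collections import Counter

# Hypothetical (actual, predicted) pairs from a validation set.
results = [
    ('mozart', 'mozart'), ('mozart', 'haydn'), ('mozart', 'haydn'),
    ('haydn', 'haydn'), ('haydn', 'mozart'),
    ('griffes', 'griffes'), ('griffes', 'griffes'),
]

counts = Counter(results)
classes = sorted({label for pair in results for label in pair})

# Rows are actual labels, columns are predicted labels.
matrix = [[counts[(actual, predicted)] for predicted in classes]
          for actual in classes]

def most_confused(min_val=1):
    """Off-diagonal pairs seen at least `min_val` times, most frequent first."""
    pairs = [(actual, predicted, n) for (actual, predicted), n in counts.items()
             if actual != predicted and n >= min_val]
    return sorted(pairs, key=lambda pair: pair[2], reverse=True)

print(classes)           # ['griffes', 'haydn', 'mozart']
print(matrix)            # [[2, 0, 0], [0, 1, 1], [0, 2, 1]]
print(most_confused(2))  # [('mozart', 'haydn', 2)]
```

Correct predictions sit on the diagonal, and every off-diagonal cell is one entry in the most-confused list.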

### Improving the Model with Resnet 50
Let's take another pass at this problem using Resnet 50, a larger and deeper pre-trained model for image classification.

We'll follow the same process as last time, just with slightly different parameters.

In [29]:
data = ImageDataBunch.from_folder(path, valid_pct=0.3, ds_tfms=get_transforms(), size=299, bs=48)
# Same arguments as before, plus larger 299x299 images and an explicit batch size of 48
data.normalize(imagenet_stats)

# Make our Learner with resnet50
learn = ConvLearner(data, models.resnet50, metrics=error_rate)
learn.fit_one_cycle(5)
learn.save('res-50')

It looks like things have improved!