# Predicting Classical Composers from Spectrograms

Let's try to accurately predict composers from spectrograms of their compositions. We're going to see if we can predict a composer without any data about instrumentation or dynamics, just the notes themselves.

In [2]:
%reload_ext autoreload
%autoreload 2
%matplotlib inline

### Imports

In [3]:
from fastai import *
from fastai.vision import *

## Data
The data was found on a subreddit for DJs who use MIDI files to sample classical music.

It included 370,000 MIDI files, which were filtered to classical compositions and organized by composer.

The data is a mess. Some folders are composers, some are compositions, and there's no clear order to any of the files.

![messy data](./normalization_1.png)

We needed two conversion steps to prepare this data for image classification.

### Normalization
We'll use folder names as our labels, so let's normalize our folder names first.

```py
from pathlib import Path

# Lowercases the names of all directories inside the current directory.

path = Path('.')
# Collect the directories first so renaming doesn't disturb the iteration.
dirs = [entry for entry in path.iterdir() if entry.is_dir()]
for folder in dirs:
    target = folder.with_name(folder.name.lower())
    if target.name != folder.name:
        print('Changing', folder)
        folder.rename(target)
print('Done')
```

Next, let's normalize our file names to get them ready for conversion.

```py
#!/usr/bin/env python3

from pathlib import Path
from sys import argv

# Recursive Renaming!
# Renames all MIDI files in each subfolder to the format [foldername]_[filenumber].mid
# WARNING! By default, operates on the current directory. Otherwise pass in a folder name
# as first argv. Run it once on fresh data: re-running on files that already follow
# this format can overwrite them.
def rename_files(folder_path, folder_name):
    # Match on the extension so names that merely contain "mid" are skipped.
    files = [file for file in folder_path.iterdir()
             if file.is_file() and file.suffix.lower() in ('.mid', '.midi')]
    for (x, file) in enumerate(files):
        print('Renaming', file)
        file.rename(folder_path / f'{folder_name}_{x}.mid')
    print('Done')

def recursive_rename(dir_path):
    folders = [folder for folder in dir_path.iterdir() if folder.is_dir()]
    for folder in folders:
        print('Renaming:', folder.name)
        rename_files(folder, folder.name)

# argv entries are strings, so wrap the argument in Path before iterating.
root_path = Path(argv[1]) if len(argv) > 1 else Path('.')
recursive_rename(root_path)
```

### MIDI to WAV
Next we need to convert our MIDI files to WAV files, so we can process them into spectrograms.

MIDI files encode not only the notes of a given composition but also data about the instruments. Since we're looking for patterns in the notes themselves, we're going to convert all MIDI files using the same synth. All compositions will use an organ-like sound.

We'll use a [NodeJS package called synth](https://github.com/patrickroberts/synth-js) to handle the conversion.

Here's a script that calls its companion bash script to convert all MIDI files in each subfolder of a directory. The bash and Python scripts should be in the same directory. You can call it like this: `./convert_all_midi_files.py ./my-compositions-directory`

```py
#!/usr/bin/env python3

from pathlib import Path
from sys import argv
import subprocess

# Takes in a directory and converts all MIDI files in its subfolders to .wav

# The companion bash script lives next to this file.
CONVERT_SCRIPT = Path(__file__).resolve().parent / 'convert_midi_to_wav.sh'

def convert_all_midi_to_wav(folder_path):
    files = [file for file in folder_path.iterdir() if file.is_file() and file.suffix == '.mid']
    for file in files:
        # synth adds .mid to the input file by default, so we remove it first
        resolved = file.with_suffix('')
        # Check if .wav exists, so we can safely re-run the script without re-converting
        # any files that were already converted
        if not file.with_suffix('.wav').is_file():
            print('Converting...', resolved.name)
            # Passing a list keeps paths that contain spaces in one piece.
            subprocess.call([str(CONVERT_SCRIPT), str(resolved)])


def do_recursively_on_folders(dir_path, action):
    folders = [folder for folder in dir_path.iterdir() if folder.is_dir()]
    for folder in folders:
        action(folder)

if len(argv) < 2:
    raise Exception('Please specify a path')
path = Path(argv[1])
do_recursively_on_folders(path, convert_all_midi_to_wav)
```

Here's the companion bash script.
```sh
#!/usr/bin/env bash

synth -i "$1"
```

Conversion may take a long time! Don't worry if something fails or gets interrupted. You can call the script as many times as you need without worrying about overwriting or losing the files you've already converted.
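
Re-running is safe because the Python script checks for an existing `.wav` before converting. As a minimal sketch of that same check (the helper name `pending_conversions` is hypothetical and not part of the scripts above), you could list the files that still need converting before starting another run:

```python
from pathlib import Path

def pending_conversions(folder_path):
    """Return the .mid files in folder_path that have no matching .wav yet."""
    folder = Path(folder_path)
    return sorted(
        midi for midi in folder.iterdir()
        if midi.is_file()
        and midi.suffix == '.mid'
        and not midi.with_suffix('.wav').is_file()
    )
```

Calling `pending_conversions('./my-compositions-directory/mozart')` (the `mozart` folder is just an example) returns the remaining work for that composer. An empty list means the folder is fully converted.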

![converting to .wav files](./conversion.gif)

### WAV to Spectrogram

With our data organized by composer and converted to .wav files, it's time to make our spectrograms!

A [spectrogram](https://en.wikipedia.org/wiki/Spectrogram) is a visual representation of how the frequency content of a sound changes over time.

Spectrograms plot frequency on the y axis and time on the x axis. The color intensity at each point shows how strongly that frequency is present at that moment.
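
To make that concrete, here's a minimal sketch that builds a spectrogram for a synthetic tone instead of a real recording. The sample rate and the 440 Hz pitch are assumptions chosen for the demo. It uses the same `scipy.signal.spectrogram` call as the conversion script.

```python
import numpy as np
from scipy import signal

sample_rate = 8000                         # samples per second (assumed for this demo)
t = np.arange(sample_rate) / sample_rate   # one second of time stamps
tone = np.sin(2 * np.pi * 440 * t)         # a pure 440 Hz sine wave

frequencies, times, spectrogram = signal.spectrogram(tone, sample_rate)

# Rows are frequency bins and columns are time slices.
# Average over time, then find the frequency bin holding the most energy.
peak_frequency = frequencies[spectrogram.mean(axis=1).argmax()]
print(spectrogram.shape, peak_frequency)
```

The peak should land in the frequency bin closest to 440 Hz. In a plot of this array, that would show up as a single bright horizontal line.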

Here's how we can convert all our .wav files to spectrograms.

```py
#!/usr/bin/env python3

from pathlib import Path
from sys import argv
from scipy import signal
from scipy.io import wavfile
import numpy as np
import matplotlib.pyplot as plt

def create_spectrograms(folder_path):
    files = [file for file in folder_path.iterdir() if file.is_file() and file.suffix == '.wav']
    for file in files:
        output = file.with_suffix('.png')
        # Skip files that already have a spectrogram, so the script can be re-run safely
        if not output.is_file():
            print('making spectro for', file)
            sample_rate, samples = wavfile.read(file)
            if samples.ndim > 1:
                # Assumption: multi-channel audio is averaged down to mono
                samples = samples.mean(axis=1)
            frequencies, times, spectrogram = signal.spectrogram(samples, sample_rate)
            # The small offset avoids taking log(0) in silent passages
            plt.pcolormesh(np.log(spectrogram + 1e-10))
            plt.axis('off')
            plt.ylim(top=40)  # keep only the lowest 40 frequency bins
            # plt.show() # Uncomment if you want to check out the plot.
            # Saves the plot as a .png with a transparent background
            plt.savefig(output, transparent=True)
            # Close the figure so plots don't pile up across files
            plt.close()

def do_recursively_on_folders(dir_path, action):
    folders = [folder for folder in dir_path.iterdir() if folder.is_dir()]
    for folder in folders:
        action(folder)

if len(argv) < 2:
    raise Exception('Please specify a path')
path = Path(argv[1])
do_recursively_on_folders(path, create_spectrograms)
```

Look at these beautiful spectrograms!

Here's a Mozart piece, which looks pretty tame and orderly.
![mozart spectrogram](./mozart_sample.png)

And here's early-20th-century composer [Charles Griffes](https://en.wikipedia.org/wiki/Charles_Tomlinson_Griffes), looking far more intense and Romantic.
![charles griffes spectrogram](./griffes_sample.png)

In [4]:
data = Path('/data')

In [5]:
data

PosixPath('/data')

### Labels

In [None]:
labels = 
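
Because the renaming step gave every file a name of the form `[foldername]_[filenumber]`, a label can be recovered from the file name alone. Here's a minimal sketch, assuming the spectrograms kept those names with a `.png` extension. The helper `label_from_filename` and the example paths are hypothetical.

```python
import re
from pathlib import Path

# Greedy match up to the last underscore, then the file number and extension.
LABEL_PATTERN = re.compile(r'^(.+)_\d+\.png$')

def label_from_filename(filename):
    """Extract the composer label from a name like 'mozart_12.png'."""
    match = LABEL_PATTERN.match(Path(filename).name)
    if match is None:
        raise ValueError(f'Unexpected file name: {filename}')
    return match.group(1)

print(label_from_filename('/data/mozart/mozart_12.png'))
print(label_from_filename('/data/charles_griffes/charles_griffes_3.png'))
```

Since the files also sit in one folder per composer, the parent folder name would give the same label.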

### Create Learner and Training

Create ConvLearner.

Train for some number of epochs.

### Model Analysis
What is the model most confused about?
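
One way to answer this is to count which (actual, predicted) pairs of composers the model gets wrong most often. fastai v1 offers this through `ClassificationInterpretation`. The sketch below shows the same idea in plain Python, with made-up labels standing in for real validation results. The function and the data are illustrative only.

```python
from collections import Counter

def most_confused(actual, predicted, min_count=1):
    """Return (actual, predicted, count) for misclassified pairs, most frequent first."""
    errors = Counter(
        (a, p) for a, p in zip(actual, predicted) if a != p
    )
    return [
        (a, p, count)
        for (a, p), count in errors.most_common()
        if count >= min_count
    ]

# Made-up example labels, not real results.
actual    = ['mozart', 'mozart', 'haydn',  'griffes', 'haydn', 'mozart']
predicted = ['haydn',  'mozart', 'mozart', 'griffes', 'haydn', 'haydn']
print(most_confused(actual, predicted))
```

Pairs near the top of the list are the composers whose spectrograms the model finds hardest to tell apart.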

### Model Tuning
Unfreeze the model.
Find an optimal learning rate.
