# Demo: using the root-note detection model

In this notebook we demonstrate how to use our trained model to predict the root note of a new piece of music.

In [1]:
import pickle
import pandas as pd
from collections import Counter

from sklearn.ensemble import RandomForestClassifier

from music21 import corpus
from music21.analysis.discrete import KrumhanslSchmuckler, SimpleWeights, AardenEssen, BellmanBudge, TemperleyKostkaPayne

### Step 1: load the model

In the main notebook `root_note_detection.ipynb` we experimented with several models for root-note detection and saved the best one. We used code like this to save it:

In [2]:
model_filename = "./models/SMOTE_DS_RandomForestClassifier.h5"

```python
rf = RandomForestClassifier()
rf.fit(Xtrain, ytrain)
# pickle.dump needs an open binary file object, not a filename
with open(model_filename, "wb") as f:
    pickle.dump(rf, f)
```

A user does not need to retrain the model. We can load the saved model (which is in our GitHub repo under the filename shown above) with code like this:

In [3]:
with open(model_filename, "rb") as f:
    rf = pickle.load(f)

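Only unpickle files from sources you trust, and load the model with the same scikit-learn version that saved it. See the scikit-learn notes on the security and maintainability limitations of pickled models:
https://scikit-learn.org/stable/model_persistence.html#security-maintainability-limitations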
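The save and load steps can be sketched as a round trip. This is a minimal sketch using only the standard library: the dictionary `stand_in_model` and the temporary file `model.pkl` are placeholders introduced here for illustration, not our trained random forest or the path in our repo. The point it shows is that `pickle.dump` and `pickle.load` both take an open binary file object rather than a filename.

```python
import os
import pickle
import tempfile

# Stand-in for a trained model (assumption: any picklable object
# is saved and loaded the same way).
stand_in_model = {"name": "stand-in", "classes": list(range(12))}

with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical path, used only for this sketch.
    path = os.path.join(tmp_dir, "model.pkl")

    # Save: open in binary write mode and pass the file object to pickle.
    with open(path, "wb") as f:
        pickle.dump(stand_in_model, f)

    # Load: open in binary read mode; the file is closed on leaving the block.
    with open(path, "rb") as f:
        restored = pickle.load(f)

print(restored == stand_in_model)
```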


### Step 2: load a piece of music to be analysed

Next, we'll load a piece of music whose root note we want to detect. We'll choose the first piece in the *O'Neill's 1850* collection, which is in the `music21` corpus.

In [4]:
tune = corpus.parse("oneills1850/0001-0050.abc", number=1)

In [5]:
tune.metadata.title

'The Enchanted Valley'

### Step 3: extract features

We'll use `music21` to extract the necessary features from this piece.

The features we need are:

```
'Krumhansl-Shmuckler', 'simple weights', 'Aarden Essen',
'Bellman Budge', 'Temperly Kostka Payne',
'final_note', 'freq note', 'freq weighted acc'
```

Notice we do not include the `as_transcribed` feature: most datasets where root-note detection is required will not have a partially-reliable `as_transcribed` feature.

We'll define a function to extract the features in the expected order.

In [6]:
def feature_extractor(tune):
    """Return the eight features, in the order the model expects."""
    # tonic pitch classes predicted by Krumhansl-Schmuckler and its variants
    histogram_predictions = [
        method().getSolution(tune).getTonic().pitchClass
        for method in
            (KrumhanslSchmuckler, SimpleWeights, AardenEssen, BellmanBudge, TemperleyKostkaPayne)
    ]

    # heuristic features: first, get the raw pitch-class sequence as integers
    # (chords are skipped because they have no single .pitch)
    tunef = tune.flatten()
    pitch_classes = [n.pitch.pitchClass for n in tunef.notes if n.isNote]

    final_note = pitch_classes[-1]
    freq_note = Counter(pitch_classes).most_common(1)[0][0]

    freq_weighted_acc = freq_note # TODO: replace this with the real algorithm

    heuristics = [
        final_note,
        freq_note,
        freq_weighted_acc
    ]
    return histogram_predictions + heuristics
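
The two simplest heuristics in `feature_extractor` can be tried in isolation on a plain list of pitch classes, without `music21`. This is a minimal sketch: the helper name `simple_heuristics` and the example sequence are made up for illustration and are not taken from the corpus.

```python
from collections import Counter

def simple_heuristics(pitch_classes):
    """Return (final_note, freq_note) for a non-empty list of pitch classes (0-11)."""
    if not pitch_classes:
        raise ValueError("need at least one pitch class")
    final_note = pitch_classes[-1]
    # most_common(1) returns a one-element list: [(value, count)]
    freq_note = Counter(pitch_classes).most_common(1)[0][0]
    return final_note, freq_note

# Made-up sequence G A B G D G, written as pitch classes.
example = [7, 9, 11, 7, 2, 7]
print(simple_heuristics(example))  # (7, 7)
```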

Now let's run our function on the tune we have extracted.

In [7]:
features = feature_extractor(tune)

In [8]:
features

[7, 7, 7, 7, 7, 7, 7, 7]

Well, it looks like this was an "easy" tune: all the key-detection algorithms (Krumhansl-Schmuckler and variants) and all of our heuristics gave the same result. They predicted pitch class 7, i.e. G.
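
To read a feature vector more easily, pitch classes can be mapped to note names. This is a small sketch: the lookup table and the helper `pitch_class_name` are introduced here for illustration, and the table uses sharps only, so enharmonic spellings such as B-flat for A-sharp are ignored.

```python
# Pitch classes 0-11 named with sharps only (an assumption of this sketch).
PITCH_CLASS_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def pitch_class_name(pc):
    """Return the note name for an integer pitch class, wrapping modulo 12."""
    return PITCH_CLASS_NAMES[pc % 12]

features = [7, 7, 7, 7, 7, 7, 7, 7]
print([pitch_class_name(pc) for pc in features])
```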

Anyway, let's pass this feature vector into our model.

### Step 4: run the model with these features to obtain a prediction



In [9]:
rf.predict([features]) # TODO

array([7], dtype=int64)

As shown, the random forest predicts pitch class 7, the same answer as every input feature, so in this case it doesn't add any value! For other tunes it does: as an ensemble algorithm, it's a bit better than the sum of its parts.
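
One way to see where a trained model could differ from its inputs is to compare it with a plain majority vote over the feature vector. This is a hypothetical baseline for illustration only, not part of our model; `majority_vote` and the split-vote example are names and data introduced here.

```python
from collections import Counter

def majority_vote(features):
    """Return the most common pitch class in a feature vector."""
    return Counter(features).most_common(1)[0][0]

# Unanimous vote, as for the tune in this notebook.
print(majority_vote([7, 7, 7, 7, 7, 7, 7, 7]))  # 7

# Made-up split vote: four features say D (2), three say G (7), one says A (9).
print(majority_vote([2, 2, 7, 2, 7, 2, 7, 9]))  # 2
```

When the features are unanimous, any sensible combination rule returns the same answer. A learned model can only differ from this baseline on split votes like the second example.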

### Conclusion

To summarise our workflow:

1. Load the pretrained RF model using `pickle`
2. Load a piece of music using `music21`
3. Extract features using `feature_extractor`
4. Run the RF model with these features to obtain a prediction
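
Steps 3 and 4 can be wrapped in one helper. This is a minimal sketch, not code from our repo: `predict_root`, `StubModel` and `stub_extractor` are hypothetical names, and the stubs stand in for the trained model and for `feature_extractor` so that the example runs without `music21` or scikit-learn.

```python
def predict_root(tune, model, extractor):
    """Extract features from `tune` and return the model's predicted pitch class."""
    features = extractor(tune)
    # scikit-learn style models expect a 2D input: one row per sample.
    return int(model.predict([features])[0])

class StubModel:
    """Stand-in with a scikit-learn style predict method (not our trained model)."""
    def predict(self, rows):
        # Echo the first feature of each row.
        return [row[0] for row in rows]

def stub_extractor(tune):
    """Stand-in extractor that ignores the tune and returns eight features."""
    return [7] * 8

print(predict_root(None, StubModel(), stub_extractor))  # 7
```

With the objects from this notebook, the call would be `predict_root(tune, rf, feature_extractor)`.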