# Demonstration Notebook
## Train model to recognize notes from input sounds

By Ben Walsh \
For Elly

&copy; 2021 Ben Walsh <ben@elly.io>

## Contents

1. [Import Libraries](#lib_import)
1. [Import Data](#data_import)
1. [Evaluate Model](#model_eval)
1. [Demo Model](#model_demo)


TO DO
- note_len_time in melody_record can be float, not just int
- melody_transcribe include DEBUG outputs with feature extract
- Better feature extraction with librosa MFCCs (see NLP course), then retrain
- Additional data augmentation with shifted onset and scaled length
- Decouple load_training_data so X isn't needed for hum_len/feat_extract
- Submodule repo into simple_gui
- Feature importance with xgboost
- Define data_folder and import so hum_wav_file in test model can use it

In [1]:
%load_ext autoreload
%autoreload 2

## <a id = "lib_import"></a>1. Import Libraries

In [2]:
import os
import sys
import time

import ipywidgets as widgets

import pickle

import xgboost as xgb
from sklearn.metrics import accuracy_score

import pandas as pd
import numpy as np

from scipy.io import wavfile as wav
from IPython.display import Audio

# Add custom modules to path
module_path = os.path.abspath(os.path.join('..'))
if module_path not in sys.path:
    sys.path.append(module_path)
    
from util.music_util import note_to_freq, Note, Melody, melody_transcribe, melody_record
from util.ml_util import feat_extract, load_training_data


pygame 1.9.6
Hello from the pygame community. https://www.pygame.org/contribute.html
C:\Users\benja\OneDrive\Documents\Python\liloquy-git\note-recognition\sound_files\Hum_Db4.wav does not exist
C:\Users\benja\OneDrive\Documents\Python\liloquy-git\note-recognition\sound_files\Hum_Eb4.wav does not exist
C:\Users\benja\OneDrive\Documents\Python\liloquy-git\note-recognition\sound_files\Hum_Gb4.wav does not exist
C:\Users\benja\OneDrive\Documents\Python\liloquy-git\note-recognition\sound_files\Hum_Ab4.wav does not exist
C:\Users\benja\OneDrive\Documents\Python\liloquy-git\note-recognition\sound_files\Hum_Bb4.wav does not exist
C:\Users\benja\OneDrive\Documents\Python\liloquy-git\note-recognition\sound_files\Hum_B4.wav does not exist
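
The messages above report that hum recordings for several notes were not found. A check along the following lines could list such gaps up front. It is a minimal sketch: the helper name `missing_hum_files` is hypothetical, and the only thing taken from the notebook is the `Hum_{note}.wav` naming pattern.

```python
import os
import tempfile


def missing_hum_files(folder, notes):
    """Return the notes that have no Hum_<note>.wav file in folder."""
    return [
        note for note in notes
        if not os.path.isfile(os.path.join(folder, f"Hum_{note}.wav"))
    ]


# Demonstrate on a temporary folder that holds a single (empty) recording
with tempfile.TemporaryDirectory() as tmp:
    open(os.path.join(tmp, "Hum_C4.wav"), "wb").close()
    missing = missing_hum_files(tmp, ["C4", "Db4"])

print(missing)
```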


## <a id = "data_import"></a>2. Import Data

### 2.1 Import Training Data (for reference)

In [3]:
SCALE = ('C4', 'D4', 'E4', 'F4', 'G4', 'A4')
X, y, fs = load_training_data(SCALE)



Sanity check the dimensions of the augmented training set
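
One way to run this check is sketched below. It assumes `X` behaves like a 2-D array with one row per hum and that `y` holds one note label per row. The helper name `check_dims` and the toy data are hypothetical and are not part of the notebook's modules.

```python
from collections import Counter


def check_dims(X, y, scale):
    """Return (n_samples, n_points, per-note counts) after basic shape checks."""
    n_samples = len(X)
    if n_samples != len(y):
        raise ValueError(f"X has {n_samples} rows but y has {len(y)} labels")
    n_points = len(X[0])
    if any(len(row) != n_points for row in X):
        raise ValueError("Rows of X have inconsistent lengths")
    counts = Counter(y)
    unknown = set(counts) - set(scale)
    if unknown:
        raise ValueError(f"Labels outside the scale: {sorted(unknown)}")
    return n_samples, n_points, dict(counts)


# Toy stand-in for the augmented training set: 4 hums of 3 samples each
X_toy = [[0.0, 0.1, 0.2], [0.3, 0.4, 0.5], [0.6, 0.7, 0.8], [0.9, 1.0, 1.1]]
y_toy = ["C4", "C4", "E4", "A4"]
print(check_dims(X_toy, y_toy, scale=("C4", "D4", "E4", "F4", "G4", "A4")))
```

With the real data, the same call would be `check_dims(X, y, SCALE)`, provided `y` is a flat sequence of note names.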

### <a id = "feat_save"></a>2.2 Import Evaluation Data

In [4]:
DATA_FOLDER = r"../data"

In [5]:
X_test = pd.read_csv(os.path.join(DATA_FOLDER, "X_test.csv"), index_col=0)
y_test = pd.read_csv(os.path.join(DATA_FOLDER, "y_test.csv"), index_col=0)

## <a id = "model_eval"></a>3. Evaluate Model

### 3.1 Load Model

In [6]:
# Load latest model
MODEL_FOLDER = os.path.join("..", "model", "trained_models")

In [7]:
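# os.listdir returns entries in arbitrary order, so sort by name before
# taking the last one (assumes file names sort chronologically)
latest_model = sorted(os.listdir(MODEL_FOLDER))[-1]
with open(os.path.join(MODEL_FOLDER, latest_model), 'rb') as f:
    model_to_demo = pickle.load(f)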
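
Taking the last directory entry only finds the newest model if the file names happen to sort chronologically. A possible alternative, sketched here under that caveat, is to select by modification time. The helper name `latest_file` and the throwaway pickles are hypothetical stand-ins for the trained models.

```python
import os
import pickle
import tempfile


def latest_file(folder):
    """Return the path of the most recently modified file in folder."""
    paths = [os.path.join(folder, name) for name in os.listdir(folder)]
    files = [p for p in paths if os.path.isfile(p)]
    if not files:
        raise FileNotFoundError(f"No files found in {folder}")
    return max(files, key=os.path.getmtime)


# Demonstrate with two throwaway pickles in a temporary folder
with tempfile.TemporaryDirectory() as tmp:
    for i, name in enumerate(["model_b.pkl", "model_a.pkl"]):
        path = os.path.join(tmp, name)
        with open(path, "wb") as f:
            pickle.dump({"name": name}, f)
        # Set explicit modification times so the ordering is deterministic
        os.utime(path, (1000 + i, 1000 + i))
    with open(latest_file(tmp), "rb") as f:
        loaded = pickle.load(f)

print(loaded["name"])
```

Here `model_a.pkl` is returned because it was modified last, even though it sorts first by name.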

### 3.2 Evaluate Model on Test Set

In [8]:
# Generate predictions
y_predict = model_to_demo.predict(X_test)

# Evaluate predictions
print(f"Accuracy on test set: {100*accuracy_score(y_test, y_predict)}")

Accuracy on test set: 100.0
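
A single accuracy figure does not show which notes, if any, are being confused. A per-note breakdown could be computed as sketched below. The function name `per_note_accuracy` and the toy labels are hypothetical and are not results from this notebook.

```python
from collections import defaultdict


def per_note_accuracy(y_true, y_pred):
    """Return {note: fraction of that note's samples predicted correctly}."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        totals[truth] += 1
        hits[truth] += int(truth == pred)
    return {note: hits[note] / totals[note] for note in totals}


# Toy labels for illustration only
y_true_toy = ["C4", "C4", "E4", "A4"]
y_pred_toy = ["C4", "E4", "E4", "A4"]
print(per_note_accuracy(y_true_toy, y_pred_toy))
```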


## <a id = "model_demo"></a>4. Demo Model

### 4.1 Demo Model on Pre-Recorded File

In [9]:
#hummed_note = 'C4' 
hummed_note = 'E4' 
#hummed_note = 'A4' 
hum_wav_file = fr"C:\Users\benja\OneDrive\Documents\Python\liloquy-git\note-recognition\sound_files\Hum_{hummed_note}.wav"
fs_in, wav_sig_in = wav.read(hum_wav_file)

Audio(hum_wav_file)


### Predict note with trained model

In [10]:
# Extract features from test data

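# Use the same signal length as the training hums
hum_len = X.shape[1]
hums = np.empty((1, hum_len))
# Keep one channel of the stereo recording (column index 1)
hums[0, :] = wav_sig_in[:hum_len, 1]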

X_feat = feat_extract(hums, fs_in, note_to_freq, SCALE)

# Use loaded model to predict note
predictions = model_to_demo.predict(X_feat)
for prediction in predictions:
    # An XGBRegressor returns numeric class indices, so round them and map back to
    # note names with label_encoder (not defined in this notebook; load it first)
    if isinstance(model_to_demo, xgb.sklearn.XGBRegressor):
        print(f"Predicted note: {label_encoder.inverse_transform([round(prediction)])}")
    else:
        print('Predicted note: {}'.format(prediction))


Predicted note: E4
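
`feat_extract` is passed `note_to_freq` and `SCALE`, which suggests the features are tied to each note's fundamental frequency. The definition of `note_to_freq` lives in `util.music_util` and is not shown here. The sketch below is only a generic twelve-tone equal temperament mapping with A4 = 440 Hz, under the hypothetical name `note_name_to_hz`, and may differ from what the module does.

```python
# Semitone offsets of the natural notes within one octave, counted from C
SEMITONES = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}


def note_name_to_hz(note, a4_hz=440.0):
    """Convert a name such as 'C4', 'Db4' or 'F#4' to a frequency in Hz."""
    letter, rest = note[0].upper(), note[1:]
    semitone = SEMITONES[letter]
    # Apply any sharps (#) or flats (b) that follow the letter
    while rest and rest[0] in "#b":
        semitone += 1 if rest[0] == "#" else -1
        rest = rest[1:]
    octave = int(rest)
    midi = 12 * (octave + 1) + semitone  # MIDI note number, A4 = 69
    return a4_hz * 2 ** ((midi - 69) / 12)


for name in ("C4", "E4", "A4"):
    print(name, round(note_name_to_hz(name), 2))
```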


### Play back predicted note on piano

In [11]:
# An XGBRegressor returns numeric class indices, so round them and map back to
# note names with label_encoder (not defined in this notebook; load it first)
if isinstance(model_to_demo, xgb.sklearn.XGBRegressor):
    note_predict = Note(note=label_encoder.inverse_transform([round(prediction)])[0])
else:
    note_predict = Note(note=prediction)
note_predict.sound.play(0)

<Channel at 0x19322dc4828>

## Predict on a two-note melody

### Concatenate WAV files

In [12]:
note1_select = widgets.RadioButtons(
    options=['C4', 'E4', 'A4'],
    description='Note 1',
    disabled=False
)

note2_select = widgets.RadioButtons(
    options=['C4', 'E4', 'A4'],
    description='Note 2',
    disabled=False
)

In [13]:
widgets.HBox(children=[note1_select, note2_select])

HBox(children=(RadioButtons(description='Note 1', options=('C4', 'E4', 'A4'), value='C4'), RadioButtons(descri…

In [14]:
MEL_NOTE_LIST = (note1_select.get_interact_value(), note2_select.get_interact_value()) # ('E4','C4')
MEL_FNAME = './melody_test.wav'
mel_sound = Melody(MEL_NOTE_LIST, instr='hum', fname=MEL_FNAME)
Audio(MEL_FNAME)

### Extract signal and generate note predictions

In [23]:
fs, wav_signal = wav.read(MEL_FNAME)
DEBUG = False
predictions = melody_transcribe(wav_signal, fs, model_to_demo, hum_len, SCALE, debug=DEBUG) 
print("Predicted notes: {}".format(predictions))

Predicted notes: ['E4' 'C4']
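
`melody_transcribe` receives the per-note length (`hum_len`) alongside the signal, so it presumably cuts the recording into equal-length windows and classifies each one. Its implementation is not shown in this notebook. The following is only a sketch of such a fixed-length split, with the hypothetical name `split_fixed`.

```python
def split_fixed(signal, seg_len):
    """Split a 1-D sequence into consecutive segments of exactly seg_len items.

    Any trailing samples that do not fill a whole segment are dropped.
    """
    if seg_len <= 0:
        raise ValueError("seg_len must be positive")
    n_full = len(signal) // seg_len
    return [signal[i * seg_len:(i + 1) * seg_len] for i in range(n_full)]


# Seven samples with a segment length of three give two full segments
print(split_fixed(list(range(7)), 3))
```

The same slicing works on a 1-D NumPy array, in which case each segment is an array view rather than a list.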


### Generate Melody object 

In [24]:
# Create Melody object from predictions
melody_predict = Melody(notes=predictions)
melody_predict.sound.play(0)

<Channel at 0x19322dc4a80>

### Record in real time

In [57]:
REC_FILE_NAME = "./record_sound.wav"
NOTE_TOTAL = 3
NOTE_LEN_TIME = 2

In [63]:
rec_sound = melody_record(note_total=NOTE_TOTAL, note_len_time=NOTE_LEN_TIME, file_name=REC_FILE_NAME)

In [64]:
fs, wav_signal = wav.read(REC_FILE_NAME)
DEBUG = False
predictions = melody_transcribe(wav_signal, fs, model_to_demo, NOTE_LEN_TIME*fs, SCALE, debug=DEBUG)
print(predictions)

['C4' 'A4' 'C4']
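
The segment length passed above is `NOTE_LEN_TIME*fs`, which is a whole number of samples only while `NOTE_LEN_TIME` is an integer. If fractional note lengths are supported later, as the TO DO list suggests, the length would need to be converted to an integer sample count first. A minimal sketch, with the hypothetical helper name `samples_per_note`:

```python
def samples_per_note(note_len_time, fs):
    """Return the number of samples in one note of note_len_time seconds."""
    n_samples = int(round(note_len_time * fs))
    if n_samples <= 0:
        raise ValueError("note length must cover at least one sample")
    return n_samples


print(samples_per_note(2, 44100))    # integer note length
print(samples_per_note(1.5, 44100))  # fractional note length
```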


In [65]:
# Create Melody object from predictions
melody_predict = Melody(notes=predictions)
melody_predict.sound.play(0)

<Channel at 0x19322e57b70>