## Python Imports

#### TODO: Write docstrings for vinyl functions

In [1]:
import librosa
import spotipy
import os, requests, time, random

import pandas as pd
import numpy as np

from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

import keras
from keras.models import Sequential
from keras.layers import Dense, LSTM, Conv3D, ConvLSTM2D, BatchNormalization
from keras.optimizers import Adam

%matplotlib inline
import matplotlib.pyplot as plt
import librosa.display
import IPython.display as ipd

from src.obtain.spotify_metadata import generate_token, download_playlist_metadata
from src.vinyl.audio_downloader import download_preview_mp3
from src.vinyl.build_datasets import sample_non_zouk_songs
from src.vinyl.build_datasets import extract_features
from src.vinyl.build_datasets import build_dataset

Using TensorFlow backend.


## [Globals](https://www.geeksforgeeks.org/global-local-variables-python/)

In [2]:
# globals
spotify_username = 'djconxn'
user_id = "spotify:user:djconxn"
zoukables_uri = "spotify:playlist:79QPn32wwghlJfTImywNgV"

sample_mp3_dir = 'data/raw/mp3s'
metadata_dir = "data/interim/genre_metadata"
zoukables_metadata_path = os.path.join(metadata_dir, 'zoukables_metadata.tsv')

zouk_features_path = "data/processed/zoukable_spectral.npy"
non_zouk_features_path = "data/processed/non_zoukable_spectral.npy"

# When True, re-download metadata and rebuild feature arrays even if saved copies exist
REFRESH = True

## Model Config

### Features Set

In [3]:
features_dict = {
    librosa.feature.mfcc : {'n_mfcc':13},
    librosa.feature.spectral_centroid : {},
    librosa.feature.chroma_stft : {'n_chroma':12},
    librosa.feature.spectral_contrast : {'n_bands':6},
    #librosa.feature.tempogram : {'win_length':192}
}
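
One way a dictionary like `features_dict` might be consumed: call each feature function with its keyword arguments, stack the resulting `(n_features, n_frames)` arrays, and transpose to `(n_frames, n_features)`. The sketch below uses small NumPy stand-ins (`fake_mfcc`, `fake_centroid`) in place of the librosa functions so it runs without audio. `stack_features` is a hypothetical helper, not the actual `extract_features` from `src.vinyl.build_datasets`.

```python
import numpy as np

def fake_mfcc(y, sr, n_mfcc=13):
    # Stand-in for a librosa-style feature: returns (n_mfcc, n_frames).
    n_frames = 1 + len(y) // 512
    return np.zeros((n_mfcc, n_frames))

def fake_centroid(y, sr):
    # Stand-in for a single-row feature: returns (1, n_frames).
    n_frames = 1 + len(y) // 512
    return np.ones((1, n_frames))

def stack_features(y, sr, features):
    """Call each feature function with its kwargs and stack the rows."""
    blocks = [func(y=y, sr=sr, **kwargs) for func, kwargs in features.items()]
    # Rows are features, columns are frames; transpose to (frames, features).
    return np.vstack(blocks).T

demo_features = {fake_mfcc: {'n_mfcc': 13}, fake_centroid: {}}
signal = np.zeros(22050)  # one second of silence at 22,050 Hz
matrix = stack_features(signal, 22050, demo_features)
print(matrix.shape)
```

With the four real features configured above, the column count would be 13 + 1 + 12 + 7 = 33, since spectral contrast with `n_bands=6` yields 7 rows.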

### Model Architecture
#### TODO: Design a schema for configuring Keras models to build

# Obtain Data

Set up the Spotify client, then download metadata from a Zouk playlist and a non-Zouk playlist.

Download song mp3 samples.

## Authenticate Spotify Client

In [4]:
token=generate_token(username=spotify_username)
sp = spotipy.Spotify(auth=token)

## Download Zouk Playlist Metadata

In [5]:
if os.path.isfile(zoukables_metadata_path) and not REFRESH:
    zouk = pd.read_csv(zoukables_metadata_path, sep='\t')
else:
    zouk = download_playlist_metadata(sp, user_id, zoukables_uri, 'zoukables')
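    # index=False keeps the cached file from gaining an extra unnamed index column on reload
    zouk.to_csv(zoukables_metadata_path, sep='\t', index=False)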
    print(zouk.shape)

zouk_songs = zouk['id'].tolist()

(595, 28)


## Download Zouk Playlist Sample MP3s

In [6]:
for i in zouk.index:
    song_id = zouk['id'][i]
    mp3_url = zouk['preview_mp3'][i]
    mp3_filepath = os.path.join(sample_mp3_dir, song_id + '.mp3')
    if not os.path.isfile(mp3_filepath):
        download_preview_mp3(mp3_url, mp3_filepath)

## Sample Non-Zouk Songs

#### TODO: `sample_non_zouk_songs` should return a DataFrame with id, song title, artist, preview url
#### TODO: `sample_non_zouk_songs` sometimes throws errors on `metadata = pd.read_csv(metadata_path, sep='\t').dropna()`
#### TODO: `sample_non_zouk_songs` should check if songs are in Zoukables list... these aren't mutually exclusive playlists!


In [7]:
genres = os.listdir(metadata_dir)
genres.remove("zoukables_metadata.tsv")

n = zouk.shape[0]
non_zouk_songs, sample_urls = sample_non_zouk_songs(n, genres, metadata_dir)

# Calculate Audio Features for Songs



Sample 10 other genres and pool the songs from their playlists into one list. Sample `n` songs from that list, where `n` is the number of Zouk songs, and use them as negative cases for training the Zouk classifier. Train to convergence, then repeat with another sample of non-Zouk songs.

This process should train a decent classifier for songs from this playlist, but I really need to find a much larger list of positive cases.
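
The sampling steps described above can be sketched with the standard library alone. This is an illustration on toy data, not the real `sample_non_zouk_songs`: the function name `sample_negatives`, its arguments, and the playlist dictionary are all hypothetical. It also drops any song that appears in the positive list, which is the overlap check one of the TODOs asks for.

```python
import random

def sample_negatives(genre_playlists, positive_ids, n, n_genres=10, seed=None):
    """Pool songs from randomly chosen genres and draw n negative examples."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(genre_playlists), min(n_genres, len(genre_playlists)))
    # Pool the chosen playlists, dropping duplicates and any known positives.
    pool = {song for genre in chosen for song in genre_playlists[genre]}
    pool -= set(positive_ids)
    return rng.sample(sorted(pool), min(n, len(pool)))

# Toy data: three genres, with one song ("z1") that is also a positive case.
playlists = {
    "salsa": ["s1", "s2", "z1"],
    "metal": ["m1", "m2"],
    "jazz": ["j1", "j2", "j3"],
}
negatives = sample_negatives(playlists, positive_ids=["z1", "z2"], n=4, n_genres=2, seed=0)
print(negatives)
```

Repeating the call with a different `seed` gives the fresh batch of negatives that the retraining loop needs.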

## Build Zouk Features Dataset
#### TODO: Save Features with mp3s, not in a Zouk/Non-Zouk npy file

In [8]:
zouk_urls = dict(zip(zouk['id'], zouk['preview_mp3']))

if os.path.isfile(zouk_features_path) and not REFRESH:
    zouk_data = np.load(zouk_features_path)
else:
    zouk_data = build_dataset(zouk_songs, zouk_urls, sample_mp3_dir, features_dict)
    np.save(zouk_features_path, zouk_data)

## Build Non-Zouk Features Dataset

In [9]:
if os.path.isfile(non_zouk_features_path) and not REFRESH:
    non_zouk_data = np.load(non_zouk_features_path)
else:
    non_zouk_data = build_dataset(non_zouk_songs, sample_urls, sample_mp3_dir, features_dict)
    np.save(non_zouk_features_path, non_zouk_data)

## Build Targets

In [10]:
target = np.array([1] * len(zouk_songs) + [0] * len(non_zouk_songs))

## Train Test Split

In [11]:
print(zouk_data.shape)
print(non_zouk_data.shape)

(595, 1294, 33)
(595, 1294, 33)


In [12]:
X = np.concatenate((zouk_data, non_zouk_data))

train_idx, test_idx, y_train, y_test = train_test_split(
    range(X.shape[0]), target, test_size=0.33, random_state=42, stratify=target)

X_train = X[train_idx,:,:]
X_test = X[test_idx,:,:]

In [13]:
# demonstrate data normalization with sklearn
#from sklearn.preprocessing import MinMaxScaler

# create scaler
#scaler = MinMaxScaler()
# fit and transform in one step
#X_train_norm = scaler.fit_transform(X_train)
#X_test_norm = scaler.transform(X_test)
# inverse transform
# inverse = scaler.inverse_transform(normalized)

ValueError: Found array with dim 3. MinMaxScaler expected <= 2.
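
The error above arises because `MinMaxScaler` accepts at most 2-D input, while `X_train` has shape `(songs, frames, features)`. One possible workaround, sketched here in plain NumPy, is to compute each feature's minimum and maximum over the song and frame axes of the training set and apply them to both splits. The helpers `minmax_fit` and `minmax_apply` are hypothetical names, and the random arrays stand in for the real feature data. Reshaping to `(songs * frames, features)` before calling the scaler, then reshaping back, should be an equivalent route.

```python
import numpy as np

def minmax_fit(X):
    """Per-feature min and max over the song and frame axes of a 3-D array."""
    return X.min(axis=(0, 1)), X.max(axis=(0, 1))

def minmax_apply(X, lo, hi):
    """Scale each feature to [0, 1] using statistics from the training set."""
    span = np.where(hi > lo, hi - lo, 1.0)  # avoid dividing by zero
    return (X - lo) / span

rng = np.random.default_rng(0)
X_train_demo = rng.normal(size=(8, 5, 3))  # (songs, frames, features)
X_test_demo = rng.normal(size=(4, 5, 3))

lo, hi = minmax_fit(X_train_demo)
X_train_norm = minmax_apply(X_train_demo, lo, hi)
X_test_norm = minmax_apply(X_test_demo, lo, hi)  # may fall outside [0, 1]
print(X_train_norm.min(), X_train_norm.max())
```

Fitting on the training split only keeps test-set statistics out of the model's inputs.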

# Generating Sequences for an LSTM Classifier

## Build Model

#### TODO: Build models externally, load them in here
(See task 0.3.2.1)

In [16]:
input_shape = (X_train.shape[1], X_train.shape[2])
print("Build LSTM model ...")
model = Sequential()

model.add(LSTM(units=128, dropout=0.05, recurrent_dropout=0.35, return_sequences=True, input_shape=input_shape))
model.add(LSTM(units=64, dropout=0.05, recurrent_dropout=0.35, return_sequences=True))
model.add(LSTM(units=32,  dropout=0.05, recurrent_dropout=0.35, return_sequences=False))
model.add(Dense(units=1, activation="sigmoid"))

# seq.add(ConvLSTM2D(filters=40, kernel_size=(3, 3),
#                    input_shape=input_shape,
#                    padding='same', return_sequences=True))
# seq.add(BatchNormalization())

# seq.add(ConvLSTM2D(filters=40, kernel_size=(3, 3),
#                    padding='same', return_sequences=True))
# seq.add(BatchNormalization())

# seq.add(ConvLSTM2D(filters=40, kernel_size=(3, 3),
#                    padding='same', return_sequences=True))
# seq.add(BatchNormalization())

# seq.add(ConvLSTM2D(filters=40, kernel_size=(3, 3),
#                    padding='same', return_sequences=True))
# seq.add(BatchNormalization())

# seq.add(Conv3D(filters=1, kernel_size=(3, 3, 3),
#                activation='sigmoid',
#                padding='same', data_format='channels_last'))

# seq.compile(loss='binary_crossentropy', optimizer='adadelta')
print("Compiling ...")
# Keras optimizer defaults:
# Adam   : lr=0.001, beta_1=0.9,  beta_2=0.999, epsilon=1e-8, decay=0.
# RMSprop: lr=0.001, rho=0.9,                   epsilon=1e-8, decay=0.
# SGD    : lr=0.01,  momentum=0.,                             decay=0.
opt = Adam()
model.compile(loss="binary_crossentropy", optimizer=opt, metrics=["accuracy"])
model.summary()

Build LSTM model ...
Compiling ...
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
lstm_1 (LSTM)                (None, 1294, 128)         82944     
_________________________________________________________________
lstm_2 (LSTM)                (None, 1294, 64)          49408     
_________________________________________________________________
lstm_3 (LSTM)                (None, 32)                12416     
_________________________________________________________________
dense_1 (Dense)              (None, 1)                 33        
=================================================================
Total params: 144,801
Trainable params: 144,801
Non-trainable params: 0
_________________________________________________________________


## Train Model
#### TODO: log the training reports to keep track of learning rates and training times.

In [17]:
print("Training ...")
batch_size = 35  # num of training examples per minibatch
num_epochs = 400
model.fit(
    X_train,
    y_train,
    batch_size=batch_size,
    epochs=num_epochs, 
    validation_split=.25, 
    verbose=1,
    callbacks=[
        keras.callbacks.EarlyStopping(patience=8, verbose=1, restore_best_weights=True),
        keras.callbacks.ReduceLROnPlateau(factor=.5, patience=3, verbose=1),
    ]
)

Training ...
Train on 597 samples, validate on 200 samples
Epoch 1/400
Epoch 2/400
Epoch 3/400
Epoch 4/400
Epoch 5/400
Epoch 6/400
Epoch 7/400
Epoch 8/400
Epoch 9/400
Epoch 10/400
Epoch 11/400
Epoch 12/400
Epoch 13/400
Epoch 14/400
Epoch 15/400
Epoch 16/400
Epoch 17/400
Epoch 18/400

Epoch 00018: ReduceLROnPlateau reducing learning rate to 0.0005000000237487257.
Epoch 19/400
Epoch 20/400
Epoch 21/400

Epoch 00021: ReduceLROnPlateau reducing learning rate to 0.0002500000118743628.
Epoch 22/400
Epoch 23/400
Restoring model weights from the end of the best epoch
Epoch 00023: early stopping


<keras.callbacks.History at 0xb15706400>
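
For the TODO about logging training reports, a minimal sketch might write the per-epoch metrics and the wall-clock time to a JSON file. It assumes the object returned by `model.fit` exposes a `history` dict of lists, and that a learning-rate entry appears in it when `ReduceLROnPlateau` is active; I believe both hold for this Keras version, but treat that as an assumption to verify. The function `save_training_report` and the toy values below are hypothetical.

```python
import json
import os
import tempfile
import time

def save_training_report(history, seconds, path):
    """Write per-epoch metrics and total training time to a JSON file."""
    report = {
        "epochs_run": len(history.get("loss", [])),
        "training_seconds": round(seconds, 2),
        # Cast to float so NumPy scalars serialize cleanly.
        "history": {k: [float(v) for v in vals] for k, vals in history.items()},
    }
    with open(path, "w") as f:
        json.dump(report, f, indent=2)
    return report

# Toy dict shaped like the assumed `History.history` (values are made up).
start = time.time()
toy_history = {"loss": [0.69, 0.61], "val_loss": [0.68, 0.63], "lr": [1e-3, 5e-4]}
report_path = os.path.join(tempfile.gettempdir(), "training_report.json")
report = save_training_report(toy_history, time.time() - start, report_path)
print(report["epochs_run"])
```

In the notebook, `start` would be taken just before `model.fit` and the returned `History` object's `history` attribute passed in place of `toy_history`.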

## Evaluate Model

In [18]:
print("\nTesting ...")
score, accuracy = model.evaluate(
    X_test, y_test, batch_size=batch_size, verbose=1
)
print("Test loss:  ", score)
print("Test accuracy:  ", accuracy)


Testing ...
Test loss:   0.4752581867249564
Test accuracy:   0.8091603164151121


## Save Model

In [19]:
model.save("models/zouk_classifier_spectral_LSTM3.h5")

# Is It Any Good?

Do some exploratory analysis to see which songs are being misclassified. I know that the "labels" are sketchy, so I'll need to do some data cleaning and re-training. How bad is it?

In [20]:
all_songs = pd.DataFrame({'song_id':zouk_songs + non_zouk_songs,
                          'target':target})

trainers = all_songs.iloc[train_idx,:].reset_index()

sample0 = trainers[trainers.target==0].sample(10).index
sample1 = trainers[trainers.target==1].sample(10).index
sample_idx = sample0.append(sample1)
samples = trainers.loc[sample_idx]

In [21]:
y_pred = model.predict(X_train[sample_idx,:])
# Flatten the (n, 1) predictions so they align with the 1-D target column
y_pred_bool = y_pred.ravel() > 0.75
samples['prediction'] = y_pred_bool.astype(int)
print(classification_report(samples.target, samples['prediction']))

              precision    recall  f1-score   support

           0       0.75      0.90      0.82        10
           1       0.88      0.70      0.78        10

    accuracy                           0.80        20
   macro avg       0.81      0.80      0.80        20
weighted avg       0.81      0.80      0.80        20
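
As a cross-check on the report above, the confusion counts can be recovered from the printed recall and support figures. The snippet is plain arithmetic on those numbers; the variable names are illustrative.

```python
# Supports and recalls as printed in the classification report above.
support_neg, support_pos = 10, 10
recall_neg, recall_pos = 0.90, 0.70

tn = round(recall_neg * support_neg)  # non-Zouk songs predicted non-Zouk
fp = support_neg - tn                 # non-Zouk songs predicted Zouk
tp = round(recall_pos * support_pos)  # Zouk songs predicted Zouk
fn = support_pos - tp                 # Zouk songs predicted non-Zouk

precision_pos = tp / (tp + fp)
precision_neg = tn / (tn + fn)
accuracy = (tp + tn) / (support_neg + support_pos)
print(tn, fp, fn, tp)
print(precision_neg, precision_pos, accuracy)
```

This works out to 1 false positive and 3 false negatives among the 20 sampled training songs, and it reproduces the reported precisions (0.75, and 0.875 before rounding to 0.88) and the 0.80 accuracy.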



In [22]:
fp_index = samples[(samples.target==0) & (samples.prediction==1)].index
fn_index = samples[(samples.target==1) & (samples.prediction==0)].index

print("False Positives:")
for i in fp_index:
    filepath = os.path.join(sample_mp3_dir, (samples['song_id'][i] + '.mp3'))
    ipd.display(ipd.Audio(filepath))

print("~" * 32)

print("False Negatives:")
for i in fn_index:
    filepath = os.path.join(sample_mp3_dir, (samples['song_id'][i] + '.mp3'))
    ipd.display(ipd.Audio(filepath))

False Positives:


~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
False Negatives:


# Ship It!

Create a new notebook and copy over the code it needs to run the app from scratch.

Copy over the functions that return the output, and then iterate running the function and copying over the imports and function definitions that are needed to get it to execute without crashing.

(MVP for this should probably run on a single song, not all the songs on a playlist... downloading and extracting the features for many songs is going to take a long time.)

# References

- [Every Noise At Once](http://everynoise.com/)
- [Keras docs](https://keras.io/)
- [Librosa docs](https://librosa.github.io/librosa/index.html)
- [Spotipy docs](https://spotipy.readthedocs.io)
- [ruohoruotsi: LSTM Music Genre Classification on GitHub](https://github.com/ruohoruotsi/LSTM-Music-Genre-Classification)
- [Music Genre classification using a hierarchical Long Short Term Memory (LSTM) Model](http://www.cs.cuhk.hk/~khwong/p186_acm_00_main_lstm_music_rev5.pdf)
- [Using CNNs and RNNs for Music Genre Recognition](https://towardsdatascience.com/using-cnns-and-rnns-for-music-genre-recognition-2435fb2ed6af) [(GitHub)](https://github.com/priya-dwivedi/Music_Genre_Classification)
- [The dummy’s guide to MFCC](https://medium.com/prathena/the-dummys-guide-to-mfcc-aceab2450fd)
- [Convolutional LSTM Network: A Machine Learning Approach for Precipitation Nowcasting](https://arxiv.org/abs/1506.04214v1)

# Storage Space Requirements

- `.model` files: 1-6 MB
- Features: 250 MB (spectral), 1.7 GB (tempo)
- MP3 previews: 365 kB each

# Action Plan

First, update GitHub repo.

## Clean Up

- Reorganize directories
- Move unnecessary files into a scrap folder
- Update GitHub

## MongoDB: Songs Database

- Song IDs
- Spotify metadata
- Librosa Features
- Genre Labels
- Python API (1.4.0.1/2/3, 2.1.0.1)
- Update GitHub

## MongoDB: Models Database

- Keras schema (0.3.2.1)
- Feature sets
- Training reports
- Python API (3.1.0.1, 3.2.0.1)
- Update GitHub

## Spotify Connection

- Refresh Zoukables list when training models
- Update FP/FN screening playlists on Spotify
- Update GitHub

## Python Package

- Modules
- Docstrings (0.1.0.1)
- Conda environment
- Update GitHub

## Deployment
- Reproduce pipeline on other machines
- Reproduce pipeline for other genres
- Deploy to AWS