<a href="https://colab.research.google.com/github/ads-ayaz/ml-playbook/blob/master/20200611_Music_classification_SCRATCH.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

<h1><strong>Machine Learning Playbook</strong> | Aluance Digital</h1>

Machine learning projects involve many steps and stages. A framework that organizes the overall approach makes it much easier to execute a project successfully.

That's why Aluance developed this "playbook": it acts as both a machine learning project template and a collection of best practices. The playbook is updated continually and can be forked whenever a new project is initiated.

# How to use this playbook

# Problem statement

# Libraries

In [112]:
import os

import numpy as np
import pandas as pd

import keras

from sklearn.model_selection import train_test_split
from sklearn.utils import shuffle

Using TensorFlow backend.


# Data preparation

## Acquisition

After uploading the source data file, use the following command to unzip its contents into the `data` folder.


In [0]:
# Uncomment the lines below to execute.
#!unzip GTZAN.zip -d data

## Unpacking the data

The GTZAN data set includes a folder called `genres_original` that organizes 30-second music samples into folders labeled by genre. Each sample is a `.wav` file whose filename follows the format `<genre_label>.<num_id>.wav`. The collection comprises 10 genre folders, each containing 100 audio files.

The `images_original` folder contains a mel spectrogram image in `.png` format for each music sample. Each spectrogram provides a visual representation of one of the audio files. The files are organized into folders by genre, and the file naming convention is `<genre_label><num_id>.png`, where the `num_id` values match those of the corresponding audio files.
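
As a small illustration of the two naming conventions above, the sketch below maps an audio filename to the spectrogram filename it should correspond to. The helper names `parse_wav_name` and `png_name_for` are hypothetical, and `blues.00000.wav` is an assumed instance of the `<genre_label>.<num_id>.wav` pattern rather than a filename taken from this notebook.

```python
def parse_wav_name(filename):
    """Split '<genre_label>.<num_id>.wav' into (genre_label, num_id)."""
    # Unpacking raises ValueError if the name does not have exactly three parts
    genre, num_id, ext = filename.split(".")
    if ext != "wav":
        raise ValueError(f"not a .wav sample: {filename}")
    return genre, num_id


def png_name_for(wav_filename):
    """Build the '<genre_label><num_id>.png' name for a given audio file."""
    genre, num_id = parse_wav_name(wav_filename)
    return f"{genre}{num_id}.png"


# Assumed example filename following the documented pattern
example = png_name_for("blues.00000.wav")
print(example)
```

A helper like this could be used to check that every audio file has a matching image before training on the spectrograms.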

The data set also comes with two `.csv` files that list features for the music samples, either in their full 30-second form (`features_30_sec.csv`) or in 3-second segments (`features_3_sec.csv`). Splitting each 30-second sample into 3-second segments yields up to ten segments per sample, which increases the available data roughly tenfold, to about 10,000 samples.

Here are the fields that each of the `.csv` files contains:

| Field | Format | Description | Notes |
| --- | --- | --- | --- |
| filename | string | Name of the music sample `.wav` file. | Segments in the `3_sec` file are denoted<br/> as `<fname>.<segment>.wav`. |
| length | int | Size of the file in bytes | [CHECK] |
| chroma_stft_mean<br/>chroma_stft_var | float | The mean and variance of the _chroma_ of the audio sample. | See: [Wikipedia](https://en.wikipedia.org/wiki/Chroma_feature)|
| rms_mean<br/>rms_var | float |||
| spectral_centroid_mean<br/>spectral_centroid_var | float |||
| spectral_bandwidth_mean<br/>spectral_bandwidth_var | float |||
| rolloff_mean<br/>rolloff_var | float |||
| zero_crossing_rate_mean<br/>zero_crossing_rate_var | float |||
| harmony_mean<br/>harmony_var | float |||
| perceptr_mean<br/>perceptr_var | float |||
| tempo | float | The _tempo_ or speed of the beat in BPM. ||
| mfcc00_var<br/>mfcc00_mean | float | Mean and variance of **mel-frequency cepstral coefficients**. | See: [Wikipedia](https://en.wikipedia.org/wiki/Mel-frequency_cepstrum) |
| label | string | Musical genre of the sample. ||


## Import

In [0]:
FOLDER_pwd = os.getcwd()
FOLDER_data_root = "data/"
FOLDER_csv = FOLDER_data_root + "Data/"

FILE_features30 = "features_30_sec.csv"
FILE_features3 = "features_3_sec.csv"

In [0]:
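# Import using NumPy
# Text columns (filename, label) cannot be parsed as floats and load as NaN.
raw_data = np.genfromtxt(FOLDER_csv + FILE_features30, delimiter=',', skip_header=1)
# Drop the first two columns (filename, length) and the trailing label column
raw_data = raw_data[:, 2:-1]
np.shape(raw_data)
#raw_data[0]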

In [0]:
# Import using Pandas
df_raw = pd.read_csv(FOLDER_csv + FILE_features30)

# Create X as a DataFrame of shape [m_examples, n_features], dropping non-feature columns
df_X_raw = df_raw.drop(['filename', 'length', 'label'], axis=1)

# Create Y as a single-column DataFrame of shape [m_examples, 1] holding the labels
df_Y_raw = pd.DataFrame(df_raw['label'])
#Y_raw = np.reshape(np.array(df_raw['label']), (np.shape(X_raw)[0], 1))

# Shuffle X and Y together so that rows stay aligned
df_X, df_Y = shuffle(df_X_raw, df_Y_raw)

In [138]:
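# Construct a dictionary mapping each possible label in Y to an integer index.
# Sorting makes the mapping reproducible across runs (set ordering is not).
dictLabels = {label: idx for idx, label in enumerate(sorted(set(df_Y_raw['label'])))}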

6

In [0]:
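# Split the dataset into train (60%), dev (20%) and test (20%) sets.
# shuffle=False because the rows were already shuffled above.
X_train, X_test, y_train, y_test = train_test_split(df_X, df_Y, test_size=0.4, shuffle=False)
X_dev, X_test, y_dev, y_test = train_test_split(X_test, y_test, test_size=0.5, shuffle=False)

# One-hot encode the y sets.
# num_classes keeps the encoding width fixed even if a split is missing a class.
s = pd.Series(dictLabels)
n_classes = len(dictLabels)
y_train = keras.utils.to_categorical(s[y_train['label']], num_classes=n_classes)
y_dev = keras.utils.to_categorical(s[y_dev['label']], num_classes=n_classes)
y_test = keras.utils.to_categorical(s[y_test['label']], num_classes=n_classes)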

[TODO]
* Normalize inputs
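
One common way to address the normalization item above is standardization: subtract each feature's mean and divide by its standard deviation, using statistics computed on the training set only so that no information leaks from the dev and test sets. The sketch below is a minimal NumPy version under that assumption. The function names `fit_standardizer` and `standardize` and the toy arrays are illustrative and not part of the playbook.

```python
import numpy as np


def fit_standardizer(X_train, eps=1e-8):
    """Return per-feature mean and std from a [m_examples, n_features] array."""
    mean = X_train.mean(axis=0)
    std = X_train.std(axis=0)
    # Guard against division by zero for constant features
    return mean, np.maximum(std, eps)


def standardize(X, mean, std):
    """Apply previously fitted statistics to any split."""
    return (X - mean) / std


# Toy data standing in for the real feature sets
X_train_toy = np.array([[1.0, 200.0], [2.0, 400.0], [3.0, 600.0]])
X_dev_toy = np.array([[2.0, 500.0]])

mean, std = fit_standardizer(X_train_toy)
X_train_norm = standardize(X_train_toy, mean, std)
X_dev_norm = standardize(X_dev_toy, mean, std)
```

With the DataFrames produced by the split, the same functions could be applied to `X_train.to_numpy()`, `X_dev.to_numpy()` and `X_test.to_numpy()`, always reusing the statistics fitted on the training set.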

# Model

In [50]:
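# Get the shape of the data with features as rows: n features, m examples
X = df_X.to_numpy().T  # assumption: X is the transposed feature matrix
n, m = np.shape(X)
n, m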

(57, 1000)

# Training

# Tuning

[TODO]
* Batch normalization
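
For the batch normalization item, Keras provides the `keras.layers.BatchNormalization` layer, which can be placed between the layers of a model. To show what such a layer computes in training mode, here is a minimal NumPy sketch of the forward pass. The function name `batch_norm_forward`, the parameter names and the toy batch are assumptions for illustration, and the running statistics used at inference time are omitted.

```python
import numpy as np


def batch_norm_forward(z, gamma, beta, eps=1e-3):
    """Normalize a [batch, units] array per unit, then scale and shift."""
    mu = z.mean(axis=0)    # per-unit mean over the batch
    var = z.var(axis=0)    # per-unit variance over the batch
    z_hat = (z - mu) / np.sqrt(var + eps)
    return gamma * z_hat + beta


# Toy batch of 3 examples with 2 units on very different scales
z = np.array([[1.0, 10.0], [3.0, 30.0], [5.0, 50.0]])
gamma = np.ones(2)   # learnable scale, initialized to 1
beta = np.zeros(2)   # learnable shift, initialized to 0
out = batch_norm_forward(z, gamma, beta)
```

After normalization both units have approximately zero mean and unit variance across the batch, regardless of their original scale.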

# Testing

# Error Analysis

# Insights