# 🎵 Music Genre Classification

This notebook presents a machine learning pipeline for classifying music genres using Support Vector Machines (SVM) with One-vs-Rest and One-vs-One strategies.

We use the GTZAN dataset with Mel-frequency cepstral coefficient (MFCC) audio features. The base model is a binary kernel SVM with an RBF kernel, combined with each of the two multiclass strategies: One-vs-Rest (OvR) and One-vs-One (OvO).
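
For intuition about the two strategies, the number of binary SVMs each one trains can be counted directly. The sketch below assumes the 10 GTZAN genre labels; the helper names `ovr_tasks` and `ovo_tasks` are hypothetical and not part of `courselib`.

```python
from itertools import combinations

# The 10 genre labels of GTZAN
genres = ["blues", "classical", "country", "disco", "hiphop",
          "jazz", "metal", "pop", "reggae", "rock"]

def ovr_tasks(classes):
    """One binary task per class: that class against all the others."""
    return [(c, "rest") for c in classes]

def ovo_tasks(classes):
    """One binary task per unordered pair of classes."""
    return list(combinations(classes, 2))

n_ovr = len(ovr_tasks(genres))
n_ovo = len(ovo_tasks(genres))
print(n_ovr, n_ovo)  # 10 45
```

OvO trains more models, but each one only sees the samples of its two classes, whereas every OvR model sees the full training set.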


In [1]:
%load_ext autoreload
%autoreload 2

import os
import random
import sys

import numpy as np
import matplotlib.pyplot as plt
from sklearn.preprocessing import StandardScaler

# Add the repo root (two levels up from this notebook) to sys.path
# before importing courselib, so that the package can be found
sys.path.insert(0, os.path.abspath("../../"))

import courselib.utils.loaders as loaders
from courselib.utils.splits import train_test_split
from courselib.models.multiclass_svm import KernelMulticlassOvR
from courselib.models.multiclass_svm import KernelMulticlassOvO

config = {
    "C": 1.0,
    "kernel": "rbf",
    "random_seed": 42,
    "training_data_fraction": 0.8
}

np.random.seed(config["random_seed"])
random.seed(config["random_seed"])

ModuleNotFoundError: No module named 'pandas'

### Loading
 
The GTZAN dataset provides data in two formats:
1. One 30-second fragment of each song
2. The same fragments, each split into 10 segments of 3 seconds

We work with the 30-second file.

In [None]:
df = loaders.load_music_30_sec()
#df = loaders.load_music_3_sec()

### Train-Test Split

We train the models on all MFCC features, using an 80/20 train-test split.

In [None]:
# Extract only MFCC columns from df
mfcc_columns = [col for col in df.columns if col.startswith('mfcc')]
features = mfcc_columns + ['label']

# Do train test split
X, Y, X_train, Y_train, X_test, Y_test = train_test_split(
    df[features],
    training_data_fraction=config["training_data_fraction"],
    class_column_name='label',
    shuffle=True,
    return_numpy=True
)

print('Training data split as follows:')
print(f'  Training data samples: {len(X_train)}')
print(f'      Test data samples: {len(X_test)}')
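
The internals of `courselib`'s `train_test_split` are not shown in this notebook. As a rough illustration of what a shuffled 80/20 split does, here is a minimal NumPy sketch; the function `shuffled_split` and the demo arrays are hypothetical, and the real helper may differ (for example, it may split per class).

```python
import numpy as np

def shuffled_split(X, y, train_fraction=0.8, seed=42):
    """Shuffle sample indices, then cut them into a train part and a test part."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    n_train = int(round(train_fraction * len(X)))
    train_idx, test_idx = idx[:n_train], idx[n_train:]
    return X[train_idx], y[train_idx], X[test_idx], y[test_idx]

# Toy data: 20 samples, 2 features, 4 classes
X_demo = np.arange(40, dtype=float).reshape(20, 2)
y_demo = np.repeat(np.arange(4), 5)

Xtr, ytr, Xte, yte = shuffled_split(X_demo, y_demo)
print(len(Xtr), len(Xte))  # 16 4
```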

### Preprocessing

We standardize our features using z-score normalization. This step rescales all features to have zero mean and unit variance.

We also compute a reasonable value for the kernel width parameter $\sigma$ using the *median heuristic*:
$$
\sigma = \sqrt{ \frac{ \text{median}( \| x_i - x_j \|^2 ) }{2} }
$$

This tuning was added because accuracy was low on data that had not been preprocessed.

In [None]:
# Standardize features (z-score)
scaler = StandardScaler().fit(X_train)
X_train_z = scaler.transform(X_train)
X_test_z  = scaler.transform(X_test)

# Compute kernel width using median heuristic (for RBF kernel)
n_subset = min(500, len(X_train_z))  # guard against fewer than 500 training samples
subset = X_train_z[np.random.choice(len(X_train_z), n_subset, replace=False)]
d2 = np.sum((subset[:, None, :] - subset[None, :, :])**2, axis=-1)
sigma = np.sqrt(0.5 * np.median(d2[d2 > 0]))  # avoid zero distances

X_train = X_train_z
X_test = X_test_z
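
To see what the median heuristic buys, the sketch below checks its effect on kernel values. It assumes the RBF kernel is parametrized as $k(x, y) = \exp(-\|x - y\|^2 / (2\sigma^2))$, which is an assumption about `courselib` and not confirmed by this notebook, and it uses synthetic Gaussian data in place of the MFCC features. Under that parametrization, a pair of points at the median distance gets a kernel value of about $e^{-1} \approx 0.37$.

```python
import numpy as np

rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 20))  # synthetic stand-in for standardized features

# Squared Euclidean distances between all pairs of points
diff = X_demo[:, None, :] - X_demo[None, :, :]
d2 = np.sum(diff**2, axis=-1)
d2_pairs = d2[np.triu_indices_from(d2, k=1)]  # each unordered pair once

# Median heuristic, as in the formula above
sigma_demo = np.sqrt(0.5 * np.median(d2_pairs))

# Kernel values under the assumed parametrization exp(-d^2 / (2 sigma^2))
k_pairs = np.exp(-d2_pairs / (2.0 * sigma_demo**2))
median_k = np.median(k_pairs)
print(median_k, np.exp(-1.0))  # the two values are approximately equal
```

With this choice, roughly half of all pairs have kernel values above $e^{-1}$ and half below, so the kernel matrix is neither close to the identity nor close to a constant.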

### OvR model

First we train the OvR strategy on top of the binary RBF-kernel SVM.

Then we evaluate it on the test data, computing performance statistics for each binary model as well as for the overall multiclass prediction.

In [None]:
svmOvR = KernelMulticlassOvR(kernel=config["kernel"], sigma=sigma, C=config["C"])
svmOvR.fit(X_train, Y_train)

In [None]:
svmOvR.evaluate_models(X_test, Y_test)
svmOvR.evaluate_accuracy(X_test, Y_test)
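
The way `KernelMulticlassOvR` combines its binary models is not shown here. A common OvR rule, sketched below under that caveat, is to pick the class whose binary model returns the largest decision score. The scores, class names, and labels are made up for illustration.

```python
import numpy as np

# Hypothetical decision scores: one row per sample, one column per class.
# Column k holds the score of the "class k vs. rest" binary model.
scores = np.array([
    [ 1.2, -0.3, -0.8],
    [-0.5, -0.1, -0.9],
    [-1.0,  0.4,  0.7],
])
classes = np.array(["blues", "jazz", "rock"])

# OvR prediction: the class with the highest score wins
pred = classes[np.argmax(scores, axis=1)]

y_true = np.array(["blues", "jazz", "jazz"])
accuracy = np.mean(pred == y_true)
print(pred, accuracy)
```

Note that the second sample is assigned to a class even though all three binary models score it negatively; argmax resolves such cases by taking the least negative score.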

### OvO model

First we train the OvO strategy on top of the binary RBF-kernel SVM.

Then we evaluate it on the test data, computing performance statistics for each binary model as well as for the overall multiclass prediction.

In [None]:
svmOvO = KernelMulticlassOvO(kernel=config["kernel"], sigma=sigma, C=config["C"])
svmOvO.fit(X_train, Y_train)

In [None]:
svmOvO.evaluate_models(X_test, Y_test)
svmOvO.evaluate_accuracy(X_test, Y_test)
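
As with OvR, the aggregation rule inside `KernelMulticlassOvO` is not shown in this notebook. A common OvO rule is majority voting over all pairwise models, sketched below with made-up scores; the sign convention (positive favors the first class of the pair) is an assumption.

```python
from itertools import combinations
import numpy as np

classes = ["blues", "jazz", "rock"]
pairs = list(combinations(range(len(classes)), 2))  # (0, 1), (0, 2), (1, 2)

# Hypothetical pairwise decision scores: one row per sample, one column per pair.
# A positive score favors the first class of the pair, otherwise the second.
pair_scores = np.array([
    [ 0.9,  0.4, -0.2],
    [-0.6, -0.3,  0.5],
])

# Each pairwise model casts one vote per sample
votes = np.zeros((pair_scores.shape[0], len(classes)), dtype=int)
for col, (a, b) in enumerate(pairs):
    first_wins = pair_scores[:, col] > 0
    votes[first_wins, a] += 1
    votes[~first_wins, b] += 1

# OvO prediction: the class with the most votes
pred = [classes[i] for i in votes.argmax(axis=1)]
print(votes.tolist(), pred)
```

Ties are broken here by `argmax`, which returns the first class with the maximal vote count; other tie-breaking rules are possible.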