# Music Clustering

### Table of Contents

1. Read in the pre-processed audio as a Pandas DataFrame.
2. Distance computation and low-dimensional embedding of multidimensional features.
3. Dataset normalisation
4. PCA to project data onto 2D plane
5. Analysis of PCA components
6. Perform GaussianMixture clustering on projected data
7. Analysis of number of clusters with BIC and AIC

In [1]:
import math

import numpy as np
import pandas as pd

from sklearn.cluster import KMeans
from sklearn.mixture import GaussianMixture
from sklearn.decomposition import PCA
from sklearn.manifold import MDS, TSNE
from scipy.spatial.distance import pdist, squareform

import plotly.express as px
from plotly.subplots import make_subplots
import plotly.graph_objects as go
import seaborn as sns

### Storage Variables

These point the program to the directory containing the pre-processed audio data and to the temporary files where the current DataFrames are saved, so that consecutive runs can pick up where the previous run left off.

In [2]:
import config

### Universal Variables

These are reused for computations that are repeated throughout the notebook, e.g. normalisation.

In [3]:
from sklearn.preprocessing import StandardScaler, MinMaxScaler
scaler = StandardScaler()
minmax_scaler = MinMaxScaler()
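
As a rough illustration of what the two scalers do: `StandardScaler` rescales each column to zero mean and unit variance, while `MinMaxScaler` maps each column onto the range [0, 1]. The toy array below is invented for this example and is not taken from the dataset.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler, MinMaxScaler

# Toy feature matrix (made up): two columns on very different scales.
toy = np.array([[100.0, 0.01],
                [150.0, 0.03],
                [200.0, 0.05]])

standardised = StandardScaler().fit_transform(toy)
rescaled = MinMaxScaler().fit_transform(toy)

print(standardised.mean(axis=0))  # approximately 0 for every column
print(standardised.std(axis=0))   # approximately 1 for every column
print(rescaled.min(axis=0), rescaled.max(axis=0))  # 0 and 1 for every column
```

Both scalers work column-wise, which matters later when a per-song (row-wise) scaling is needed.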

## Loading in the Dataset

Data is loaded from the `./features.pkl.pbz2` files located in their respective `data/extracted/playlist-name` directories. All data is concatenated into a single DataFrame, with the `playlist` column indicating which folder each row came from.

Setting `read_temp = True` makes the program read from the `data/temp` folder and recover previous progress.

In [4]:
from src.helpers import PandasAudioRepository

read_temp = False
if read_temp:
    # The file is written with bz2 compression below, so read it back the same way.
    dataset = pd.read_pickle(config.fresh_load_dataset_dir, compression='bz2')
else:
    dataset = PandasAudioRepository.load_all_feature_datasets(config.extracted_dir)

dataset.to_pickle(config.fresh_load_dataset_dir, compression='bz2')
dataset

Unnamed: 0,song_name,artist,playlist,zero_crossings_mean,zero_crossings_var,bpm,spectral_centroid_mean,spectral_centroid_var,spectral_rolloff_mean,spectral_rolloff_var,...,mfcc_var_7,mfcc_mean_8,mfcc_var_8,mfcc_mean_9,mfcc_var_9,mfcc_mean_10,mfcc_var_10,chord_trajectory,note_trajectory,tonnetz
0,Ivan Sings,Aram Khachaturian,kino,0.030815,0.029865,143.554688,728.505121,164591.144472,1044.706810,1.492931e+06,...,59.754135,-7.222707,53.279484,-6.638159,63.120949,-7.461281,58.827705,"[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...","[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...","[-0.013063414466682632, -0.1234544114431008, -..."
1,"Prélude in E Minor, Op. 28, No. 4",Frédéric Chopin,kino,0.028196,0.027401,103.359375,615.425486,95544.686241,892.440162,8.328338e+05,...,66.532372,-7.709404,67.271782,-8.138650,51.361748,-8.201083,51.866173,"[12.0, 0.0, 0.0, 2.0, 0.0, 0.0, 0.0, 0.0, 2.0,...","[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...","[0.04845747056013336, 0.02526867983571359, 0.0..."
2,Above the Trees,Kino,kino,0.052121,0.049405,143.554688,1053.924804,248527.612506,1937.848230,1.480721e+06,...,113.514336,-4.448510,74.239128,-2.306997,76.906097,-2.640234,74.699959,"[17.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0,...","[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...","[-0.05386779890017199, -0.027705138076340542, ..."
3,All,Kino,kino,0.044240,0.042283,161.499023,619.260455,49458.448746,981.976649,2.497879e+05,...,109.742783,3.967615,61.414219,4.160887,66.464058,-0.379875,82.376328,"[3.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 2.0, 0.0, ...","[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...","[0.007021180094766433, 0.0103778882764489, 0.2..."
4,Anew,Kino,kino,0.048969,0.046571,161.499023,677.808914,72961.813450,1098.534181,4.831254e+05,...,75.191750,-0.318088,72.683647,-0.499459,70.440079,-3.557541,71.182747,"[105.0, 0.0, 2.0, 0.0, 0.0, 0.0, 0.0, 5.0, 0.0...","[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...","[0.047920894082682464, 0.024257793432034533, 0..."
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
420,Tenderness - Woven Remix,Tony Anderson,tony-anderson,0.050498,0.047948,99.384014,1690.752815,996877.763828,3788.666509,6.661065e+06,...,85.660057,-0.473025,55.165321,1.394155,64.965172,-3.817915,56.059143,"[49.0, 0.0, 0.0, 6.0, 0.0, 3.0, 0.0, 0.0, 17.0...","[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...","[0.03940684682031146, 0.1047418567047819, 0.03..."
421,Tenderness,Tony Anderson,tony-anderson,0.019365,0.018990,151.999081,500.969118,144860.433188,764.697603,1.021311e+06,...,42.235184,1.522114,30.496746,-3.054044,27.486416,-6.413583,28.495449,"[46.0, 0.0, 0.0, 2.0, 0.0, 1.0, 0.0, 0.0, 8.0,...","[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...","[-0.0033105168498387544, 0.024618212288380806,..."
422,Cambodia - Ross Lara Remix,Tony Anderson,tony-anderson,0.053039,0.050226,129.199219,1618.810406,687875.800819,3493.059627,5.250681e+06,...,62.413822,-2.939895,52.945293,2.519716,61.408394,-2.011139,51.967083,"[246.0, 1.0, 0.0, 5.0, 0.0, 0.0, 0.0, 0.0, 2.0...","[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...","[0.0015526135778904798, 0.1095253043758808, 0...."
423,Cambodia,Tony Anderson,tony-anderson,0.042857,0.041020,129.199219,1234.479968,276949.182834,2474.291523,1.718215e+06,...,118.341034,-1.348665,86.386292,1.038753,90.882286,-4.584059,92.075935,"[295.0, 0.0, 0.0, 5.0, 0.0, 2.0, 0.0, 0.0, 0.0...","[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...","[-0.18101009841973742, -0.004248825583113958, ..."
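
The load-or-recover pattern used in the loading cell can be sketched in isolation as below. This is a minimal sketch, not the notebook's actual helper: `load_with_cache`, `toy_loader` and the temporary path are hypothetical names introduced for illustration, and the toy DataFrame stands in for the real feature dataset.

```python
import os
import tempfile

import pandas as pd


def load_with_cache(cache_path, loader, read_temp=False):
    """Return the cached DataFrame when asked to; otherwise rebuild it and refresh the cache."""
    if read_temp and os.path.exists(cache_path):
        return pd.read_pickle(cache_path, compression='bz2')
    frame = loader()
    frame.to_pickle(cache_path, compression='bz2')
    return frame


# Stand-in for the real loader, used only for this illustration.
def toy_loader():
    return pd.DataFrame({'song_name': ['a', 'b'], 'bpm': [120.0, 90.0]})


cache_path = os.path.join(tempfile.mkdtemp(), 'toy_dataset.pkl')
first = load_with_cache(cache_path, toy_loader)                   # builds and writes the cache
second = load_with_cache(cache_path, toy_loader, read_temp=True)  # reads the cache back
```

Passing `compression='bz2'` on both the write and the read keeps the two sides consistent even when the file extension does not reveal the compression.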


## Low-Dimensional Embedding of High-Dimensional Features

Features that are high-dimensional include:
1. Note Trajectory (16384 dimensions)
2. Chord Trajectory (625 dimensions)
3. Tonnetz (2048 dimensions)

The methodology for each high-dimensional feature is as follows:
1. Calculate pairwise distances
2. Embed points in 2D while preserving the distances between them
3. Introduce the 2D coordinates as additional features/columns in the dataset
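
The three steps above can be sketched end to end on toy data. Everything here is a stand-in chosen for illustration: the random 50-dimensional vectors, the `toy_songs` frame and the `feature_x`/`feature_y` column names are assumptions, not part of the real dataset.

```python
import numpy as np
import pandas as pd
from scipy.spatial.distance import pdist, squareform
from sklearn.manifold import MDS

rng = np.random.default_rng(0)

# Toy stand-ins: 6 "songs", each with a 50-dimensional feature vector.
toy_songs = pd.DataFrame({'song_name': [f'song_{i}' for i in range(6)]})
toy_feature = rng.random((6, 50))

# 1. Pairwise distances in condensed form (one entry per pair of songs).
distances = pdist(toy_feature, 'euclidean')

# 2. Embed in 2D from the square distance matrix.
mds = MDS(n_components=2, dissimilarity='precomputed', random_state=0)
coordinates = mds.fit_transform(squareform(distances))

# 3. Attach the coordinates as new columns.
toy_songs['feature_x'] = coordinates[:, 0]
toy_songs['feature_y'] = coordinates[:, 1]
```

`pdist` returns n(n-1)/2 values, and `squareform` expands them into the symmetric n-by-n matrix that `MDS` expects when `dissimilarity='precomputed'`.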

In [5]:
from src.helpers import standardize_tonnetz

note_trajectories = dataset.pop('note_trajectory').apply(pd.Series)
chord_trajectories = dataset.pop('chord_trajectory').apply(pd.Series)
tonnetz = dataset.pop('tonnetz').apply(standardize_tonnetz).apply(pd.Series)
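
For readers unfamiliar with the `pop(...).apply(pd.Series)` idiom, the sketch below shows what it does on an invented three-row frame: the list-valued column is removed from the DataFrame and expanded into one column per list element. The `pd.DataFrame(column.tolist())` variant is offered as an assumption about a faster equivalent for long lists, not as something the notebook uses.

```python
import numpy as np
import pandas as pd

# Invented data: each row holds a short list instead of a real trajectory.
toy = pd.DataFrame({
    'song_name': ['a', 'b', 'c'],
    'trajectory': [[0.0, 1.0, 2.0], [3.0, 0.0, 1.0], [0.0, 0.0, 5.0]],
})

column = toy.pop('trajectory')      # removes the column from `toy` and returns it
expanded = column.apply(pd.Series)  # one column per list element

# Same values, built directly from the lists.
expanded_fast = pd.DataFrame(column.tolist(), index=column.index)
```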

### Calculate Pairwise Distances

First, the trajectory matrices are normalised per song. This is important because longer pieces tend to stay on the same notes/chords for longer. We want pieces that have similar harmonic transitions to be close together, no matter how long or short they are.

Then, the Euclidean distance is calculated between every pair of points, or 'rows', in the dataset.

In [6]:
# MinMaxScaler scales column-wise, so transpose to scale each song (row) to [0, 1], then transpose back.
note_distances = pdist(minmax_scaler.fit_transform(note_trajectories.T).T, 'euclidean')
chord_distances = pdist(minmax_scaler.fit_transform(chord_trajectories.T).T, 'euclidean')
# Tonnetz is not normalised because it has its own scale that represents melodic movement.
tonnetz_distances = pdist(tonnetz, 'euclidean')
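
To see why the per-song normalisation matters, consider the toy example below. The counts are invented: the second row is the first row doubled, standing in for the same piece played for twice as long. Before scaling the two rows are far apart; after row-wise min-max scaling they coincide.

```python
import numpy as np
from scipy.spatial.distance import pdist
from sklearn.preprocessing import MinMaxScaler

# Invented counts: the second song is the first one "played twice as long".
counts = np.array([[0.0, 2.0, 4.0, 1.0],
                   [0.0, 4.0, 8.0, 2.0],
                   [5.0, 0.0, 1.0, 0.0]])

raw = pdist(counts, 'euclidean')

# MinMaxScaler works column-wise, so transpose to scale each song (row) to [0, 1].
scaled = MinMaxScaler().fit_transform(counts.T).T
normalised = pdist(scaled, 'euclidean')

# pdist orders the pairs as (0, 1), (0, 2), (1, 2), so index 0 is the doubled pair.
print(raw[0], normalised[0])
```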

The MDS algorithm then attempts to place each song on a 2D plane while preserving the calculated pairwise distances as closely as possible.

In [7]:
note_mds = MDS(n_components=2, dissimilarity='precomputed', normalized_stress=False)
chord_mds = MDS(n_components=2, dissimilarity='precomputed', normalized_stress=False)
tonnetz_mds = MDS(n_components=2, dissimilarity='precomputed', normalized_stress=False)

note_coordinates = note_mds.fit_transform(squareform(note_distances))
chord_coordinates = chord_mds.fit_transform(squareform(chord_distances))
tonnetz_coordinates = tonnetz_mds.fit_transform(squareform(tonnetz_distances))
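
One way to check how well an embedding preserves the original distances is to compare the two sets of pairwise distances directly. The sketch below does this on synthetic data that is truly two-dimensional but stored in five dimensions, so a faithful 2D embedding is known to exist. The data and the use of a correlation coefficient are assumptions made for illustration; this check is not part of the original notebook.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform
from sklearn.manifold import MDS

rng = np.random.default_rng(0)

# Toy data that lies on a 2D plane inside a 5-dimensional space.
plane = rng.normal(size=(30, 2))
points = plane @ rng.normal(size=(2, 5))

original = pdist(points, 'euclidean')
mds = MDS(n_components=2, dissimilarity='precomputed', random_state=0)
embedded = mds.fit_transform(squareform(original))

# Compare the distances in the embedding with the original ones.
correlation = np.corrcoef(original, pdist(embedded, 'euclidean'))[0, 1]
print(f'stress: {mds.stress_:.4f}, distance correlation: {correlation:.3f}')
```

On real high-dimensional features the correlation is expected to be lower, since a 2D plane cannot hold all of the original distances at once.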

The 2D coordinates obtained for each feature are then added to the main dataset as additional columns.

In [8]:
note_coordinates_df = pd.DataFrame(note_coordinates, columns=['x', 'y'])
chord_coordinates_df = pd.DataFrame(chord_coordinates, columns=['x', 'y'])
tonnetz_coordinates_df = pd.DataFrame(tonnetz_coordinates, columns=['x', 'y'])

dataset['note_x'] = note_coordinates_df['x']
dataset['note_y'] = note_coordinates_df['y']

dataset['chord_x'] = chord_coordinates_df['x']
dataset['chord_y'] = chord_coordinates_df['y']

dataset['tonnetz_x'] = tonnetz_coordinates_df['x']
dataset['tonnetz_y'] = tonnetz_coordinates_df['y']

dataset

Unnamed: 0,song_name,artist,playlist,zero_crossings_mean,zero_crossings_var,bpm,spectral_centroid_mean,spectral_centroid_var,spectral_rolloff_mean,spectral_rolloff_var,...,mfcc_mean_9,mfcc_var_9,mfcc_mean_10,mfcc_var_10,note_x,note_y,chord_x,chord_y,tonnetz_x,tonnetz_y
0,Ivan Sings,Aram Khachaturian,kino,0.030815,0.029865,143.554688,728.505121,164591.144472,1044.706810,1.492931e+06,...,-6.638159,63.120949,-7.461281,58.827705,-2.734850,-2.391100,0.355409,-1.385659,-5.217868,0.639658
1,"Prélude in E Minor, Op. 28, No. 4",Frédéric Chopin,kino,0.028196,0.027401,103.359375,615.425486,95544.686241,892.440162,8.328338e+05,...,-8.138650,51.361748,-8.201083,51.866173,0.645407,-2.584327,-1.300806,-0.535847,3.778130,1.075258
2,Above the Trees,Kino,kino,0.052121,0.049405,143.554688,1053.924804,248527.612506,1937.848230,1.480721e+06,...,-2.306997,76.906097,-2.640234,74.699959,0.640059,2.990306,1.334453,0.097345,-3.622262,-5.931397
3,All,Kino,kino,0.044240,0.042283,161.499023,619.260455,49458.448746,981.976649,2.497879e+05,...,4.160887,66.464058,-0.379875,82.376328,2.680270,-2.280966,-1.090596,-0.187411,7.525536,4.263240
4,Anew,Kino,kino,0.048969,0.046571,161.499023,677.808914,72961.813450,1098.534181,4.831254e+05,...,-0.499459,70.440079,-3.557541,71.182747,-3.557114,-0.148699,0.235442,1.084642,5.148219,-6.983374
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
420,Tenderness - Woven Remix,Tony Anderson,tony-anderson,0.050498,0.047948,99.384014,1690.752815,996877.763828,3788.666509,6.661065e+06,...,1.394155,64.965172,-3.817915,56.059143,-0.208783,-0.246062,-1.102564,0.012477,-0.364212,0.165032
421,Tenderness,Tony Anderson,tony-anderson,0.019365,0.018990,151.999081,500.969118,144860.433188,764.697603,1.021311e+06,...,-3.054044,27.486416,-6.413583,28.495449,0.466951,-1.327135,0.099290,-0.529944,-1.773970,4.449489
422,Cambodia - Ross Lara Remix,Tony Anderson,tony-anderson,0.053039,0.050226,129.199219,1618.810406,687875.800819,3493.059627,5.250681e+06,...,2.519716,61.408394,-2.011139,51.967083,-0.248014,-0.089996,0.001711,0.158012,1.289300,3.632961
423,Cambodia,Tony Anderson,tony-anderson,0.042857,0.041020,129.199219,1234.479968,276949.182834,2474.291523,1.718215e+06,...,1.038753,90.882286,-4.584059,92.075935,-1.467439,-0.836703,-0.040569,0.378061,4.867373,6.972623


## Save Progress

In [9]:
dataset.to_pickle(config.dim_reduction_result_dir, compression='bz2')

## Visualisation of Reduced Dimensions

### Note Trajectory

The points form a ball around the centre, which is a sign that they are all more or less equidistant from each other.

This may be because there are too many possible notes, so even similar-sounding songs might not play the same ones. It is a consequence of the curse of dimensionality.

However, small clusters do form.
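
The equidistance effect can be reproduced on synthetic data. The sketch below draws uniformly random points (an assumption; real note trajectories are not uniform) and measures how much the pairwise distances vary relative to their mean. In 16384 dimensions, the dimensionality of the note trajectory, the relative spread is far smaller than in 2 dimensions, so every point looks roughly as far from every other point.

```python
import numpy as np
from scipy.spatial.distance import pdist

rng = np.random.default_rng(0)


def relative_spread(n_dims, n_points=100):
    """Standard deviation of the pairwise distances divided by their mean."""
    distances = pdist(rng.random((n_points, n_dims)), 'euclidean')
    return distances.std() / distances.mean()


low = relative_spread(2)
high = relative_spread(16384)  # same dimensionality as the note trajectory
print(f'2 dims: {low:.3f}, 16384 dims: {high:.3f}')
```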

In [13]:
note_trajectory_fig = px.scatter(dataset, x='note_x', y='note_y', color='playlist', hover_data=config.metadata_columns)
note_trajectory_fig.write_image(config.image_dir + '/note-trajectory-mds.png', width=1000, scale=2)
note_trajectory_fig

### Chord Trajectory

Here, clustering is much stronger. There are far fewer chords to choose from in this implementation of the chord trajectory.

In [14]:
chord_trajectory = px.scatter(dataset, x='chord_x', y='chord_y', color='playlist', hover_data=config.metadata_columns)
chord_trajectory.write_image(config.image_dir + '/chord-trajectory-mds.png', width=1000, scale=2)
chord_trajectory

### Tonnetz Distances

The values of each song's tonnetz are not normalised, as the tonnetz has its own scale and each value represents a movement in melody. Songs clustered together tend to have similar key signatures and chord progressions, which is impressive.

However, even when chord progressions and key match on some level, they can occur in very different contexts, such as at completely different energy levels. One song might be classified as cafe & chill, and another as pop synth. It is therefore important to combine this feature with other spectral features.

In [15]:
tonnetz_fig = px.scatter(dataset, x='tonnetz_x', y='tonnetz_y', color='playlist', hover_data=config.metadata_columns)
tonnetz_fig.write_image(config.image_dir + '/tonnetz-mds.png', width=1000, scale=2)
tonnetz_fig