# Ricommender Music Clustering Experiment

## Description
A music clustering experiment based on the `music_frame_metadata.csv` and `music_metadata.csv` files.

Part of the IF4092 Tugas Akhir II (Final Project II) project on a Multicontext Music Recommender System.

## Author
Ferdinandus Richard

## Import Required Modules

In [31]:
from sklearn.cluster import KMeans

import numpy as np
import pandas as pd

## Music Frame Experiment

### Load Music Frame CSV

The music frame metadata is loaded from `music_frame_metadata.csv`. The features used in this experiment are:
- `title`: title of the song
- `artist`: artist who performs the song
- `mean_thirteen_first_mfcc`: mean of the first thirteen MFCCs in the frame of the song
- `zcr`: zero crossing rate in the frame of the song
- `max_chroma`: index (0 to 11) of the chroma bin with the maximum value in the frame of the song
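
The notebook does not show how these frame features were extracted. As a rough illustration only, the zero crossing rate of a frame is commonly defined as the fraction of adjacent sample pairs whose signs differ, and a `max_chroma`-style value can be read as the index of the largest entry of a 12-bin chroma vector. The function names, the example frame, and the example chroma vector below are hypothetical and are not taken from the project.

```python
import numpy as np


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs in the frame whose signs differ."""
    signs = np.signbit(np.asarray(frame, dtype=float))
    return float(np.mean(signs[1:] != signs[:-1]))


def max_chroma_index(chroma_vector):
    """Index (0-11) of the strongest bin in a 12-bin chroma vector."""
    chroma_vector = np.asarray(chroma_vector, dtype=float)
    if chroma_vector.shape != (12,):
        raise ValueError("expected a 12-bin chroma vector")
    return int(np.argmax(chroma_vector))


# Hypothetical frame: a sign-alternating signal crosses zero at every step.
alternating_frame = np.array([1.0, -1.0, 1.0, -1.0, 1.0])
frame_zcr = zero_crossing_rate(alternating_frame)

# Hypothetical chroma vector whose energy peaks in bin 5.
example_chroma = np.zeros(12)
example_chroma[5] = 0.9
frame_max_chroma = max_chroma_index(example_chroma)
```

The extraction code used for the CSV may normalize or window the signal differently, so values produced this way are not guaranteed to match the `zcr` column.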

In [32]:
music_frame_dataframe = pd.read_csv('music_frame_metadata.csv', encoding='latin')
music_frame_dataframe

Unnamed: 0,title,artist,mean_thirteen_first_mfcc,zcr,max_chroma
0,Want You Back,5 Seconds of Summer,-35.777269,0.163086,5
1,Want You Back,5 Seconds of Summer,-35.777269,0.316406,2
2,Want You Back,5 Seconds of Summer,-35.777269,0.466797,2
3,Want You Back,5 Seconds of Summer,-35.777269,0.612793,3
4,Want You Back,5 Seconds of Summer,-35.777269,0.602051,3
5,Want You Back,5 Seconds of Summer,-35.777269,0.596191,3
6,Want You Back,5 Seconds of Summer,-35.777269,0.598633,3
7,Want You Back,5 Seconds of Summer,-35.777269,0.598633,3
8,Want You Back,5 Seconds of Summer,-35.777269,0.616211,3
9,Want You Back,5 Seconds of Summer,-35.777269,0.635254,2


In [33]:
# Drop the identifying text columns; only numeric features are kept.
cleaned_music_frame_dataframe = music_frame_dataframe.drop(columns=['title', 'artist'])
cleaned_music_frame_dataframe

Unnamed: 0,mean_thirteen_first_mfcc,zcr,max_chroma
0,-35.777269,0.163086,5
1,-35.777269,0.316406,2
2,-35.777269,0.466797,2
3,-35.777269,0.612793,3
4,-35.777269,0.602051,3
5,-35.777269,0.596191,3
6,-35.777269,0.598633,3
7,-35.777269,0.598633,3
8,-35.777269,0.616211,3
9,-35.777269,0.635254,2


In [34]:
# One-hot encode max_chroma so it is treated as a category, not as an ordered number.
encoded_max_chroma_dataframe = pd.get_dummies(cleaned_music_frame_dataframe['max_chroma'], prefix='max_chroma')
cleaned_music_frame_dataframe = pd.concat([cleaned_music_frame_dataframe, encoded_max_chroma_dataframe], axis=1)
cleaned_music_frame_dataframe = cleaned_music_frame_dataframe.drop('max_chroma', axis=1)
cleaned_music_frame_dataframe

Unnamed: 0,mean_thirteen_first_mfcc,zcr,max_chroma_0,max_chroma_1,max_chroma_2,max_chroma_3,max_chroma_4,max_chroma_5,max_chroma_6,max_chroma_7,max_chroma_8,max_chroma_9,max_chroma_10,max_chroma_11
0,-35.777269,0.163086,0,0,0,0,0,1,0,0,0,0,0,0
1,-35.777269,0.316406,0,0,1,0,0,0,0,0,0,0,0,0
2,-35.777269,0.466797,0,0,1,0,0,0,0,0,0,0,0,0
3,-35.777269,0.612793,0,0,0,1,0,0,0,0,0,0,0,0
4,-35.777269,0.602051,0,0,0,1,0,0,0,0,0,0,0,0
5,-35.777269,0.596191,0,0,0,1,0,0,0,0,0,0,0,0
6,-35.777269,0.598633,0,0,0,1,0,0,0,0,0,0,0,0
7,-35.777269,0.598633,0,0,0,1,0,0,0,0,0,0,0,0
8,-35.777269,0.616211,0,0,0,1,0,0,0,0,0,0,0,0
9,-35.777269,0.635254,0,0,1,0,0,0,0,0,0,0,0,0
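
`pd.get_dummies` only creates a column for values that actually occur in the data. The output above contains all twelve `max_chroma_*` columns, so every value from 0 to 11 presumably appears somewhere in the full file, but a smaller sample could yield fewer columns. The following is a minimal sketch of one way to guarantee a fixed set of columns, assuming `max_chroma` always takes integer values from 0 to 11. The `sample_frames` data is a small illustrative stand-in using a few values from the rows shown above.

```python
import pandas as pd

# Illustrative sample in which only three of the twelve chroma bins occur.
sample_frames = pd.DataFrame({
    'zcr': [0.163086, 0.316406, 0.612793],
    'max_chroma': [5, 2, 3],
})

# Declare all twelve bins as categories so that get_dummies emits a column
# for each of them, including bins that are absent from this sample.
chroma_dtype = pd.CategoricalDtype(categories=list(range(12)))
chroma_as_category = sample_frames['max_chroma'].astype(chroma_dtype)
encoded_chroma = pd.get_dummies(chroma_as_category, prefix='max_chroma', dtype=int)

# Replace the original column with its one-hot encoded version.
encoded_frames = pd.concat(
    [sample_frames.drop(columns='max_chroma'), encoded_chroma], axis=1
)
```

A fixed column set keeps the feature layout identical across different subsets of the data, which matters if a fitted model is later applied to new frames.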


## Music Experiment

### Load Music Metadata CSV

The song-level metadata is loaded from `music_metadata.csv`. The features loaded are:
- `file`: file name of the song
- `title`: title of the song
- `artist`: artist who performs the song
- `album`: album of the song
- `mfcc`: mean of all MFCC values of the song
- `zcr`: mean zero crossing rate of the song
- `tempo`: tempo of the song
- `pitch`: pitch of the song
- `chroma`: mean chroma value of the song

In [35]:
music_dataframe = pd.read_csv('music_metadata.csv', encoding='latin')
music_dataframe

Unnamed: 0,file,title,artist,album,mfcc,zcr,tempo,pitch,chroma
0,G:\Code\TugasAkhir\ricommender\musics\5 Second...,Want You Back,5 Seconds of Summer,Youngblood,3.714279,0.113458,99.384014,0.02,0.423991
1,G:\Code\TugasAkhir\ricommender\musics\5 Second...,Youngblood,"5 Seconds Of Summer, Luke Hemmings, Calum Hood...",Youngblood,2.604396,0.105390,117.453835,0.04,0.398548
2,G:\Code\TugasAkhir\ricommender\musics\Adele - ...,Make You Feel My Love,Adele,19 (Deluxe Edition),-2.166559,0.051057,151.999081,0.02,0.299245
3,G:\Code\TugasAkhir\ricommender\musics\Alan Wal...,All Falls Down (feat. Noah Cyrus & Digital Far...,Alan Walker,All Falls Down - Single,-3.486512,0.148028,129.199219,0.01,0.356382
4,G:\Code\TugasAkhir\ricommender\musics\Alan Wal...,Alone,Alan Walker,Alone - Single,4.487372,0.109462,95.703125,-0.02,0.334470
5,G:\Code\TugasAkhir\ricommender\musics\Alan Wal...,Tired (feat. Gavin James),Alan Walker,Tired (feat. Gavin James) - Single,3.471507,0.117655,123.046875,-0.01,0.386283
6,G:\Code\TugasAkhir\ricommender\musics\Alessia ...,How Far I'll Go (Alessia Cara Version),Alessia Cara,Moana: Original Motion Picture Soundtrack,-1.703088,0.070486,123.046875,0.01,0.381406
7,G:\Code\TugasAkhir\ricommender\musics\Alessia ...,Scars To Your Beautiful,Alessia Cara,Know-It-All,6.058626,0.096843,95.703125,0.00,0.366295
8,G:\Code\TugasAkhir\ricommender\musics\Alesso_f...,Let Me Go (feat. Florida Georgia Line & Watt),Hailee Steinfeld & Alesso,Let Me Go - Single,4.722919,0.094320,103.359375,0.04,0.370252
9,G:\Code\TugasAkhir\ricommender\musics\Alicia K...,If I Ain't Got You,Alicia Keys,The Diary of Alicia Keys,-0.528489,0.071205,117.453835,0.01,0.333439


In [36]:
# Drop the identifying text columns; only numeric features are kept.
cleaned_music_dataframe = music_dataframe.drop(columns=['file', 'title', 'artist', 'album'])
cleaned_music_dataframe

Unnamed: 0,mfcc,zcr,tempo,pitch,chroma
0,3.714279,0.113458,99.384014,0.02,0.423991
1,2.604396,0.105390,117.453835,0.04,0.398548
2,-2.166559,0.051057,151.999081,0.02,0.299245
3,-3.486512,0.148028,129.199219,0.01,0.356382
4,4.487372,0.109462,95.703125,-0.02,0.334470
5,3.471507,0.117655,123.046875,-0.01,0.386283
6,-1.703088,0.070486,123.046875,0.01,0.381406
7,6.058626,0.096843,95.703125,0.00,0.366295
8,4.722919,0.094320,103.359375,0.04,0.370252
9,-0.528489,0.071205,117.453835,0.01,0.333439
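
K-means groups points by Euclidean distance, so a feature with a large numeric range can dominate the result. In the rows above, `tempo` spans roughly 95 to 152 while `zcr`, `pitch`, and `chroma` stay well below 1. The notebook clusters the raw values. The sketch below shows one possible alternative, standardizing each column before clustering; it is not what the notebook does. The `example_features` frame is a small stand-in built from the first six rows shown above, and `n_clusters=2` and `n_init=10` are chosen here only for illustration.

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Stand-in for cleaned_music_dataframe: the first six rows shown above.
example_features = pd.DataFrame({
    'mfcc': [3.714279, 2.604396, -2.166559, -3.486512, 4.487372, 3.471507],
    'zcr': [0.113458, 0.105390, 0.051057, 0.148028, 0.109462, 0.117655],
    'tempo': [99.384014, 117.453835, 151.999081, 129.199219, 95.703125, 123.046875],
    'pitch': [0.02, 0.04, 0.02, 0.01, -0.02, -0.01],
    'chroma': [0.423991, 0.398548, 0.299245, 0.356382, 0.334470, 0.386283],
})

# Standardize every column to zero mean and unit variance, then cluster.
scaled_kmeans = make_pipeline(
    StandardScaler(),
    KMeans(n_clusters=2, n_init=10, random_state=0),
)
scaled_labels = scaled_kmeans.fit_predict(example_features)
```

Whether scaling improves the clusters for this project is an open question; comparing both variants on the full data would be needed to decide.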


In [37]:
# Cluster the songs into five groups; random_state fixes the centroid initialization.
music_kmeans = KMeans(n_clusters=5, random_state=0)
music_kmeans.fit(cleaned_music_dataframe)
music_kmeans.labels_

array([2, 0, 4, 3, 2, 3, 3, 2, 2, 0, 2, 1, 3, 1, 1, 2, 3, 3, 3, 0, 1, 2, 3,
       2, 3, 1, 2, 0, 2, 0, 0, 3, 2, 1, 2, 0, 3, 3, 4, 1, 2, 0, 3, 1, 2, 0,
       0, 0, 2, 1, 1, 3, 3, 2, 2, 3, 1, 3, 1, 1, 2, 0, 1, 2, 1, 3, 3, 1, 2,
       3, 1, 2, 0, 0, 3, 3, 1, 2, 4, 2, 3, 2, 4, 2, 1, 2, 0, 2, 2, 2, 2, 2,
       4, 3, 2, 1, 0, 4, 0, 3, 2, 3, 4, 3, 3, 2, 1, 4, 3, 0, 0, 0, 0, 2, 2,
       0, 2, 2, 4, 0, 3, 1, 1, 0, 0, 0, 0, 2, 2, 4, 1, 3, 2, 2, 0, 0, 0, 0,
       3, 3, 2, 4, 0, 3, 0, 0, 3, 0, 0, 2, 0, 2, 2, 1, 1, 1, 0, 0, 1, 1, 0,
       0, 2, 1, 2, 0, 2, 3, 0, 2, 2, 0, 2, 1, 0, 3, 3, 4, 2, 2, 0, 3, 1, 3,
       3, 0, 0, 3, 0, 2, 4, 2, 3, 1, 3, 0, 1, 4, 3, 0, 0, 3, 2, 0, 3, 4, 3,
       2, 4, 2, 2, 2, 1, 3, 1, 2, 4, 3, 2, 0, 1, 3, 0, 3, 0, 0, 3, 2, 0, 3,
       4, 2])
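
The label array is ordered like the rows of the input, so it can be attached back to the song metadata to see which songs share a cluster. The following is a minimal sketch with stand-in data: `example_songs` and `example_labels` are illustrative and reuse a few titles and labels shown above. In the notebook they would correspond to `music_dataframe` and `music_kmeans.labels_`.

```python
import numpy as np
import pandas as pd

# Stand-ins for music_dataframe and music_kmeans.labels_.
example_songs = pd.DataFrame({
    'title': ['Want You Back', 'Youngblood', 'Make You Feel My Love', 'Alone'],
})
example_labels = np.array([2, 0, 4, 2])

# Add the cluster label as a new column; rows and labels share the same order.
clustered_songs = example_songs.assign(cluster=example_labels)

# Number of songs per cluster, and the titles grouped by cluster.
cluster_sizes = clustered_songs['cluster'].value_counts().sort_index()
titles_by_cluster = clustered_songs.groupby('cluster')['title'].apply(list)
```

Listing the titles per cluster is a quick sanity check on whether the groups make musical sense before the labels are used in a recommender.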