# Spotify Song Clustering with Unsupervised Learning

## 1. Problem Description & Data Collection / Provenance

(Describe the project goal, unsupervised learning framing, and where the data comes from.)

## 2. Exploratory Data Analysis (EDA)

2.1 Load and inspect the data  
2.2 Feature descriptions  
2.3 Univariate distributions (histograms, boxplots)  
2.4 Correlations between audio features  
2.5 Missing values, outliers, and data cleaning decisions  
2.6 Feature scaling / transformations (if needed)
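
As a rough sketch of steps 2.5 and 2.6, the cell below drops rows with missing audio features and z-scores the rest. The column list is taken from the data preview in this notebook. The helper name `prepare_features` and the synthetic stand-in data are assumptions for illustration, not part of the original analysis.

```python
import numpy as np
import pandas as pd

# Continuous audio feature columns, as named in the dataset preview.
AUDIO_FEATURES = [
    "danceability", "energy", "loudness", "speechiness", "acousticness",
    "instrumentalness", "liveness", "valence", "tempo",
]

def prepare_features(df, columns=AUDIO_FEATURES):
    """Hypothetical helper: drop incomplete rows, then z-score each column."""
    features = df[columns].dropna()
    # ddof=0 uses the population standard deviation, so each column ends up with std 1.
    return (features - features.mean()) / features.std(ddof=0)

# Synthetic stand-in data so the sketch runs without the CSV file.
rng = np.random.default_rng(0)
demo = pd.DataFrame(rng.random((200, len(AUDIO_FEATURES))), columns=AUDIO_FEATURES)
demo.loc[0, "tempo"] = np.nan  # simulate one missing value

scaled = prepare_features(demo)
print(scaled.shape)
```

Scaling matters here because distance-based methods such as K-Means would otherwise be dominated by wide-range columns like `tempo` and `loudness`.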

## 3. Unsupervised Learning Methods

3.1 Problem framing as clustering / dimensionality reduction  
3.2 Baseline clustering approach (e.g. K-Means)  
3.3 Model selection and hyperparameters (e.g. choice of k)  
3.4 Alternative method (e.g. GMM or hierarchical clustering)  
3.5 Dimensionality reduction for visualization (PCA, maybe t-SNE/UMAP)
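
One possible shape for steps 3.2, 3.3, and 3.5 is sketched below: fit K-Means for a range of k, pick the k with the highest silhouette score, then project to two dimensions with PCA for plotting. The data is synthetic (three well-separated groups) and stands in for the scaled audio features. The silhouette score is only one of several selection criteria and is an assumption here, not necessarily the one this project will use.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.metrics import silhouette_score

# Synthetic stand-in for the scaled audio features: three well-separated groups.
rng = np.random.default_rng(42)
centers = np.array([[0, 0, 0, 0], [6, 6, 0, 0], [0, 6, 6, 0]])
X = np.vstack([c + rng.normal(scale=0.5, size=(100, 4)) for c in centers])

# Fit K-Means for several candidate k and record the silhouette score of each.
scores = {}
for k in range(2, 7):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    scores[k] = silhouette_score(X, labels)

best_k = max(scores, key=scores.get)

# Refit with the chosen k and project to 2D with PCA for a scatter plot.
final_labels = KMeans(n_clusters=best_k, n_init=10, random_state=0).fit_predict(X)
coords = PCA(n_components=2).fit_transform(X)
print(best_k, coords.shape)
```

On real audio features the silhouette curve is usually much flatter than on this toy data, so it helps to inspect `scores` for every k rather than trusting the maximum alone.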

## 4. Results and Discussion

4.1 Cluster interpretations (audio feature profiles)  
4.2 Example tracks from each cluster  
4.3 Comparison between clustering methods  
4.4 Limitations and cautions

## 5. Conclusion and Future Work

(Summarize findings, insights, and what you'd do next.)

In [4]:
# Imports

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

%matplotlib inline

pd.set_option('display.max_columns', None)

In [5]:
# Read data
data_path = "../data/genres_v2.csv"
df = pd.read_csv(data_path)
df.head()


Unnamed: 0.1,danceability,energy,key,loudness,mode,speechiness,acousticness,instrumentalness,liveness,valence,tempo,type,id,uri,track_href,analysis_url,duration_ms,time_signature,genre,song_name,Unnamed: 0,title
0,0.831,0.814,2,-7.364,1,0.42,0.0598,0.0134,0.0556,0.389,156.985,audio_features,2Vc6NJ9PW9gD9q343XFRKx,spotify:track:2Vc6NJ9PW9gD9q343XFRKx,https://api.spotify.com/v1/tracks/2Vc6NJ9PW9gD...,https://api.spotify.com/v1/audio-analysis/2Vc6...,124539,4,Dark Trap,Mercury: Retrograde,,
1,0.719,0.493,8,-7.23,1,0.0794,0.401,0.0,0.118,0.124,115.08,audio_features,7pgJBLVz5VmnL7uGHmRj6p,spotify:track:7pgJBLVz5VmnL7uGHmRj6p,https://api.spotify.com/v1/tracks/7pgJBLVz5Vmn...,https://api.spotify.com/v1/audio-analysis/7pgJ...,224427,4,Dark Trap,Pathology,,
2,0.85,0.893,5,-4.783,1,0.0623,0.0138,4e-06,0.372,0.0391,218.05,audio_features,0vSWgAlfpye0WCGeNmuNhy,spotify:track:0vSWgAlfpye0WCGeNmuNhy,https://api.spotify.com/v1/tracks/0vSWgAlfpye0...,https://api.spotify.com/v1/audio-analysis/0vSW...,98821,4,Dark Trap,Symbiote,,
3,0.476,0.781,0,-4.71,1,0.103,0.0237,0.0,0.114,0.175,186.948,audio_features,0VSXnJqQkwuH2ei1nOQ1nu,spotify:track:0VSXnJqQkwuH2ei1nOQ1nu,https://api.spotify.com/v1/tracks/0VSXnJqQkwuH...,https://api.spotify.com/v1/audio-analysis/0VSX...,123661,3,Dark Trap,ProductOfDrugs (Prod. The Virus and Antidote),,
4,0.798,0.624,2,-7.668,1,0.293,0.217,0.0,0.166,0.591,147.988,audio_features,4jCeguq9rMTlbMmPHuO7S3,spotify:track:4jCeguq9rMTlbMmPHuO7S3,https://api.spotify.com/v1/tracks/4jCeguq9rMTl...,https://api.spotify.com/v1/audio-analysis/4jCe...,123298,4,Dark Trap,Venom,,
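
The preview mixes numeric audio features with identifier and label columns (`type`, `id`, `genre`, `song_name`, `title`, and the URL fields). A minimal sketch of separating the two groups before clustering follows, using two rows copied from the preview and only a few of its columns. Which columns count as metadata is an assumption made for illustration.

```python
import pandas as pd

# Two rows copied from the preview above, restricted to a few columns.
preview = pd.DataFrame({
    "danceability": [0.831, 0.719],
    "energy": [0.814, 0.493],
    "tempo": [156.985, 115.08],
    "type": ["audio_features", "audio_features"],
    "id": ["2Vc6NJ9PW9gD9q343XFRKx", "7pgJBLVz5VmnL7uGHmRj6p"],
    "genre": ["Dark Trap", "Dark Trap"],
    "song_name": ["Mercury: Retrograde", "Pathology"],
    "title": [None, None],
})

# Columns that identify or label a track rather than describe its sound (assumed split).
METADATA = ["type", "id", "genre", "song_name", "title"]

features = preview.drop(columns=METADATA)  # input to clustering
metadata = preview[METADATA]               # kept aside for interpreting clusters
print(list(features.columns))
```

Keeping `genre` out of the feature matrix preserves the unsupervised framing while still allowing clusters to be compared against genre labels afterwards.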
