A project for PRML at IIT Jodhpur that delivers personalized song recommendations based on audio features, using unsupervised learning techniques such as Kernel PCA and K-Means.
This system recommends songs across five language groups (Hindi, Tamil, Korean, English, and a miscellaneous category) based on the user's input of a song and preferred language.
By analyzing the input song's audio features, we find the most similar songs within the same language group using:
- Dimensionality Reduction (Kernel PCA)
- Clustering (K-Means with Elbow Method and KMeans++ initialization)
- Cosine Similarity for Recommendations
- Source: Spotify Tracks Dataset on Kaggle
- Size: 61,711 tracks
- Languages: Hindi, Tamil, Korean, English, Misc
- Key Features: `tempo`, `valence`, `energy`, `acousticness`, `danceability`, `year`, `popularity`, `loudness`, etc.
- Removed non-audio columns (e.g., artwork, album name)
- Dropped missing/duplicate rows
- One-hot encoded categorical features (`key`, `time_signature`)
- Standardized numerical features
- Grouped and saved songs by language
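
The preprocessing steps above can be sketched roughly as follows. This is a minimal illustration on a tiny made-up DataFrame, not the project's actual pipeline: the `language` and `album_name` column names, the use of pandas and scikit-learn, and the `preprocess` helper are all assumptions.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Tiny made-up frame standing in for the Kaggle data. Feature names follow the
# list above; "language" and "album_name" are assumed column names.
df = pd.DataFrame({
    "track_name": ["a", "b", "b", "c", "d"],
    "album_name": ["x", "y", "y", "z", "w"],
    "language": ["Hindi", "Tamil", "Tamil", "Hindi", None],
    "tempo": [120.0, 95.0, 95.0, 140.0, 100.0],
    "energy": [0.8, 0.4, 0.4, 0.9, 0.5],
    "key": [0, 5, 5, 7, 2],
    "time_signature": [4, 3, 3, 4, 4],
})

def preprocess(frame):
    # Drop a non-audio column, then rows with missing values or exact duplicates.
    out = frame.drop(columns=["album_name"]).dropna().drop_duplicates()
    # One-hot encode the categorical columns.
    out = pd.get_dummies(out, columns=["key", "time_signature"], dtype=float)
    # Standardize numerical columns to zero mean and unit variance.
    num_cols = ["tempo", "energy"]
    out[num_cols] = StandardScaler().fit_transform(out[num_cols])
    return out

clean = preprocess(df)

# Group by language; each group could then be written to its own file.
by_language = {
    lang: group.reset_index(drop=True)
    for lang, group in clean.groupby("language")
}
```

Each entry of `by_language` holds one language group, which is the unit the later dimensionality reduction and clustering steps operate on.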
- Applied Kernel PCA (RBF kernel) for capturing non-linear patterns
- Compared with PCA — Kernel PCA yielded better visual separability
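
As a rough illustration of the comparison above, the sketch below projects the same data with RBF Kernel PCA and with linear PCA using scikit-learn. The random input, `n_components=2`, and `gamma=0.1` are placeholder assumptions; the source does not state which library or hyperparameters the project used.

```python
import numpy as np
from sklearn.decomposition import PCA, KernelPCA

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))  # placeholder for standardized audio features

# RBF Kernel PCA; n_components and gamma are illustrative values only.
kpca = KernelPCA(n_components=2, kernel="rbf", gamma=0.1)
X_kpca = kpca.fit_transform(X)

# Linear PCA on the same data, for a side-by-side scatter plot.
X_pca = PCA(n_components=2).fit_transform(X)
```

Plotting `X_kpca` and `X_pca` side by side is one way to judge visual separability of the two projections.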
- Custom implementation from scratch
- KMeans++ for better centroid initialization
- Elbow Method to find the optimal `k` per language
- Dunn Index used for validation
- Final models saved for each language group
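
For readers unfamiliar with these pieces, here is a minimal NumPy sketch of K-Means with KMeans++ seeding, an inertia curve for the Elbow Method, and one common variant of the Dunn Index. It is not the project's implementation; the function names, the synthetic blobs, and the range of `k` are all assumptions made for illustration.

```python
import numpy as np

def kmeans_pp_init(X, k, rng):
    """Pick k initial centroids with the KMeans++ seeding rule."""
    centroids = [X[rng.integers(len(X))]]
    for _ in range(k - 1):
        # Squared distance from each point to its nearest chosen centroid.
        diffs = X[:, None, :] - np.array(centroids)[None, :, :]
        d2 = (diffs ** 2).sum(-1).min(axis=1)
        # Sample the next centroid with probability proportional to d2.
        centroids.append(X[rng.choice(len(X), p=d2 / d2.sum())])
    return np.array(centroids)

def kmeans(X, k, n_iter=100, seed=0):
    """Lloyd's algorithm; returns (centroids, labels, inertia)."""
    rng = np.random.default_rng(seed)
    centroids = kmeans_pp_init(X, k, rng)
    for _ in range(n_iter):
        dists = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
        labels = dists.argmin(axis=1)
        # Recompute each centroid; keep the old one if its cluster is empty.
        new = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new, centroids):
            break
        centroids = new
    dists = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
    labels = dists.argmin(axis=1)
    inertia = dists[np.arange(len(X)), labels].sum()
    return centroids, labels, inertia

def dunn_index(X, labels):
    """Smallest between-cluster distance over largest within-cluster distance."""
    D = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(-1))
    same = labels[:, None] == labels[None, :]
    return D[~same].min() / D[same].max()

# Three well-separated synthetic blobs standing in for one language group.
rng = np.random.default_rng(1)
centers = np.array([[0.0, 0.0], [20.0, 0.0], [0.0, 20.0]])
X = np.vstack([c + rng.normal(scale=0.3, size=(50, 2)) for c in centers])

# Elbow Method: compute inertia per candidate k and look for the bend.
inertias = {k: kmeans(X, k)[2] for k in range(1, 7)}
centroids, labels, _ = kmeans(X, 3)
dunn = dunn_index(X, labels)
```

A higher Dunn Index indicates compact, well-separated clusters, so it can serve as a check on the `k` suggested by the elbow in `inertias`.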
- Tried DBSCAN with multiple `epsilon` values
- Poor cluster formation, most points marked as outliers
- K-Means chosen over DBSCAN for better performance
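
One way to see the outlier problem described above is to sweep `eps` and measure the share of points DBSCAN labels as noise. The sketch below uses scikit-learn on random placeholder data; the `eps` grid, `min_samples=5`, and the `noise_fraction` helper are illustrative assumptions, not the project's settings.

```python
import numpy as np
from sklearn.cluster import DBSCAN

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 10))  # placeholder for reduced audio features

def noise_fraction(X, eps, min_samples=5):
    """Share of points that DBSCAN marks as outliers for a given eps."""
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(X)
    # DBSCAN assigns the label -1 to noise points.
    return float(np.mean(labels == -1))

# Illustrative eps grid; small values leave most points unclustered.
report = {eps: noise_fraction(X, eps) for eps in (0.5, 1.0, 2.0, 8.0)}
```

When the noise fraction stays high until `eps` is large enough to merge everything into one cluster, density-based clustering gives little usable structure, which is consistent with the choice of K-Means here.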
- For a given input song, identify its cluster
- Use cosine similarity to recommend the top-`n` most similar songs within the same cluster
- Metadata (track name, artist, URL, artwork) retrieved for each recommendation
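
The recommendation step above might look roughly like this in NumPy. The `recommend` function, its parameters, and the toy feature matrix are hypothetical and exist only to illustrate ranking by cosine similarity within a single cluster.

```python
import numpy as np

def recommend(features, labels, query_idx, n=5):
    """Return indices and scores of the n most similar songs in the query's cluster."""
    # Restrict candidates to the query's cluster, excluding the query itself.
    candidates = np.flatnonzero(labels == labels[query_idx])
    candidates = candidates[candidates != query_idx]
    q = features[query_idx]
    C = features[candidates]
    # Cosine similarity; the small constant guards against zero-length vectors.
    sims = C @ q / (np.linalg.norm(C, axis=1) * np.linalg.norm(q) + 1e-12)
    order = np.argsort(-sims)[:n]
    return candidates[order], sims[order]

# Toy data: five songs, two features, cluster labels from a prior clustering step.
features = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.5, 0.5], [1.0, 0.2]])
labels = np.array([0, 0, 1, 0, 0])
idx, sims = recommend(features, labels, query_idx=0, n=2)
```

The returned indices can then be used to look up metadata such as track name and artist for display.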
- Provided consistent and meaningful recommendations across different language groups
- Evaluated qualitatively with positive alignment to user preferences in mood, rhythm, and genre
- Clone the Repository: run `git clone <repo-url>`, then `cd MusicApp`.
- Start the Frontend: run `cd frontend`, then `npm install`, then `npm run dev`.
- Start the Backend (in a second terminal): run `python app.py`. Make sure to install the necessary Python packages first using `pip install -r requirements.txt`.
- Use the App
  - Open your browser and go to `http://localhost:3000` (localhost where the frontend runs)
  - Click the "Get Started" button
  - Select a language and search for a song
  - Click "Show Recommendations" to view similar songs
- Use lyrics-based features
- Integrate deep learning models for better embeddings
- Add user feedback loop for more personalization
- Vaibhav Garg
- Saher Dev
- Swayam
- Arnav Kataria
- Tanisha Sonkar