A content-based music recommender that surfaces sonically similar tracks using a K-Nearest Neighbors model trained on Spotify audio features. Takes a song as input and returns five recommendations based on audio feature similarity across a dataset of 114,000+ tracks. For songs outside the dataset, the system falls back to GPT-4o-mini, with results enriched via the Spotify API.
Built as a personal project to explore ML-based recommendation systems and full-stack development.
🔗 Live demo: spin-dmpk.onrender.com
Hosted on Render's free tier — may take ~30 seconds to wake up on first load.
- Type a song name into the search bar (e.g. `Blinding Lights` or `Blinding Lights - The Weeknd`)
- Select a track from the dropdown or press Enter
- The CD spins while recommendations are generated
- Click any track card to preview it via the Spotify embed player
- Heart songs to save them to your picks list
Source: Spotify Tracks Dataset (Kaggle, via maharshipandya)
Size: 114,000+ tracks after deduplication
Features used for similarity:
| Feature | Description |
|---|---|
| `danceability` | How suitable a track is for dancing |
| `energy` | Perceptual measure of intensity and activity |
| `valence` | Musical positiveness (happy vs. sad) |
| `tempo` | Estimated beats per minute |
| `acousticness` | Confidence the track is acoustic |
| `instrumentalness` | Predicts whether a track contains no vocals |
| `liveness` | Detects presence of a live audience |
| `loudness` | Overall loudness in decibels |
| `speechiness` | Presence of spoken words in a track |
Preprocessing:
- Dropped rows with missing `track_name`, `artists`, or `album_name`
- Deduplicated on `(track_name, artists)`
- Features normalized with `StandardScaler` before model fitting
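The preprocessing steps above could look roughly like the following. This is a minimal sketch rather than the project's actual script: the `preprocess` helper and the toy DataFrame are illustrative assumptions, and the column names are taken from the feature table and the preprocessing list.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Feature columns listed in the table above.
FEATURES = [
    "danceability", "energy", "valence", "tempo", "acousticness",
    "instrumentalness", "liveness", "loudness", "speechiness",
]

def preprocess(df: pd.DataFrame):
    """Hypothetical helper: clean the tracks table and scale its features."""
    # Drop rows missing any of the identifying text fields.
    df = df.dropna(subset=["track_name", "artists", "album_name"])
    # Keep the first occurrence of each (track_name, artists) pair.
    df = df.drop_duplicates(subset=["track_name", "artists"]).reset_index(drop=True)
    # Standardize every feature to zero mean and unit variance.
    scaler = StandardScaler()
    X = scaler.fit_transform(df[FEATURES])
    return df, scaler, X

# Toy data for illustration only (values are made up).
raw = pd.DataFrame({
    "track_name": ["A", "A", "B", None],
    "artists": ["X", "X", "Y", "Z"],
    "album_name": ["a1", "a2", "b1", "c1"],
    **{f: [0.1, 0.2, 0.9, 0.5] for f in FEATURES},
})
clean, scaler, X = preprocess(raw)
```

In this toy run the row with a missing `track_name` and the second `("A", "X")` row are removed, leaving two tracks. Returning the fitted scaler alongside the matrix matters because the same scaling has to be reused whenever the fitted model is queried.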
| Layer | Technologies |
|---|---|
| Backend | Python, FastAPI |
| ML Model | scikit-learn (KNN), pandas, joblib |
| Music Metadata | Spotify Web API (Spotipy) |
| Fallback Recommendations | OpenAI GPT-4o-mini |
| Frontend | Plain HTML, CSS, JavaScript |
| Deployment | Render |
The KNN model is trained on Spotify's audio feature vectors, including:
`danceability`, `energy`, `valence`, `tempo`, `acousticness`, `instrumentalness`, `loudness`, `speechiness`, `liveness`
Features are normalized using StandardScaler before fitting. At inference time, the query song's feature vector is retrieved from the dataset and its 5 nearest neighbors are returned by cosine distance.
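As a rough illustration of the lookup described above, the sketch below fits scikit-learn's `NearestNeighbors` with `metric="cosine"` on scaled features and queries it for one track. The random matrix stands in for the real dataset, and the `recommend` helper, along with the choice to request one extra neighbor and drop the query track itself, is an assumption rather than something confirmed by the project.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# Stand-in for the real feature matrix: 100 tracks x 9 audio features.
features = rng.random((100, 9))

# Scale features before fitting, as described above.
scaler = StandardScaler()
X = scaler.fit_transform(features)

# Cosine distance compares the direction of feature vectors, not their length.
knn = NearestNeighbors(metric="cosine")
knn.fit(X)

def recommend(track_index: int, k: int = 5) -> list[int]:
    """Hypothetical helper: indices of the k tracks closest to the given track."""
    # Request k + 1 neighbors because the query track is part of the index
    # and normally comes back as its own nearest neighbor (distance 0).
    _, idx = knn.kneighbors(X[track_index].reshape(1, -1), n_neighbors=k + 1)
    # Drop the query track itself, then keep k results.
    return [int(i) for i in idx[0] if i != track_index][:k]

recs = recommend(0)
```

Here `recs` holds five row indices into the dataset, which would then be mapped back to track names and artists.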
- Search: user queries are matched against the dataset by track name and, optionally, artist
- KNN lookup: if the track is found, the model returns the 5 most similar songs by audio feature distance
- Spotify enrichment: album art, preview URLs, and track IDs are fetched via the Spotify API
- GPT-4o-mini fallback: if the track is not in the dataset, GPT-4o-mini suggests similar songs, which are then enriched with Spotify metadata
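The flow in the list above can be sketched as a single routing function. Everything here is hypothetical: the `recommend` signature, the callback names, and the stub data are assumptions made so the example runs without network access. The real `app.py` may be organized quite differently.

```python
from typing import Callable, Optional

def recommend(
    query: str,
    find_in_dataset: Callable[[str], Optional[int]],
    knn_lookup: Callable[[int], list[str]],
    llm_fallback: Callable[[str], list[str]],
    enrich: Callable[[str], dict],
) -> dict:
    """Route a query through the KNN path or the LLM fallback, then enrich."""
    track_id = find_in_dataset(query)
    if track_id is not None:
        # Track is in the dataset: use audio-feature similarity.
        source, titles = "knn", knn_lookup(track_id)
    else:
        # Track is unknown: ask the language model for suggestions instead.
        source, titles = "fallback", llm_fallback(query)
    # Both paths end with the same metadata enrichment step.
    return {"source": source, "tracks": [enrich(t) for t in titles]}

# Stub implementations standing in for the dataset, model, and APIs.
catalog = {"blinding lights": 0}
stubs = dict(
    find_in_dataset=lambda q: catalog.get(q.lower()),
    knn_lookup=lambda i: ["Song A", "Song B"],
    llm_fallback=lambda q: ["Song C"],
    enrich=lambda title: {"title": title, "album_art": None},
)

result_hit = recommend("Blinding Lights", **stubs)
result_miss = recommend("Unknown Song", **stubs)
```

Passing the four steps in as callables keeps the routing logic testable on its own, independent of Spotify or OpenAI credentials.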
1. Clone the repository
git clone https://github.com/rafiamb/spin.git
cd spin
2. Install dependencies
pip install -r requirements.txt
3. Configure environment variables
Create a .env file in the project root:
SPOTIFY_CLIENT_ID=your_spotify_client_id
SPOTIFY_CLIENT_SECRET=your_spotify_client_secret
OPENAI_API_KEY=your_openai_api_key
Spotify credentials can be obtained by registering an application at developer.spotify.com.
4. Start the server
uvicorn app:app --reload
The application will be available at http://localhost:8000.
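If the server fails on startup, a missing credential is a likely cause. The snippet below is a small, optional check, not part of the project. It assumes only the three variable names listed above and inspects the process environment. It does not read the `.env` file itself, and how `app.py` loads that file is not stated here.

```python
import os

# Variable names from the .env example above.
REQUIRED_VARS = ("SPOTIFY_CLIENT_ID", "SPOTIFY_CLIENT_SECRET", "OPENAI_API_KEY")

def missing_env_vars(env=os.environ) -> list[str]:
    """Hypothetical helper: names of required variables that are unset or empty."""
    return [name for name in REQUIRED_VARS if not env.get(name)]

# Example with a plain dict standing in for the environment.
missing = missing_env_vars({"SPOTIFY_CLIENT_ID": "abc"})
```

Calling `missing_env_vars()` with no arguments checks the real environment of the current process.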
spin/
├── app.py # FastAPI application and API routes
├── requirements.txt
├── static/
│ ├── index.html
│ ├── style.css
│ ├── app.js
│ └── CD.png
└── model/
    ├── knn_model.joblib # trained KNN model
    ├── scaler.joblib # fitted StandardScaler
    ├── songs.parquet # song dataset (114k tracks)
    └── features.json # feature column names
This project is licensed under the MIT License.