Just One Hip Algorithm for Navigating Narratives
J.O.H.A.N.N is a lyric-based music recommendation system that provides personalized recommendations by analyzing the sentiment of song lyrics. It uses the Spotify API to gather a user's saved songs, crawls Genius.com to extract their lyrics, and applies GPT-based sentiment analysis to suggest songs with matching emotional tones.
Have you ever wondered whether your friends pay more attention to the lyrics or the instrumental part of a song? Many focus on the beat, leaving lyric enthusiasts like me in the dark when it comes to recommendation algorithms. This project addresses that gap by creating a sentiment-driven music recommendation system that combines data from the Spotify API, Genius.com, and ChatGPT.
Current stage of development:
- 12 & 13. Tests & Documentation
- Spotify Integration: Fetches a user's saved songs from Spotify to create a personalized music library. 🎧
- Lyrics Extraction: Web-crawls Genius.com to extract lyrics for each song in the library.
- Sentiment Analysis: Analyzes the sentiment of song lyrics to determine emotional tones. ❤️
- Playlist Generation: Generates a playlist title and description based on the cluster of songs.
- Playlist Picture: Generates a playlist picture based on the title and description.
- Recommendation Engine: Provides personalized recommendations by finding songs whose lyric sentiment is similar to that of a single chosen song. 🔍
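To make the recommendation step more concrete, here is a minimal sketch of one way to rank songs by similarity to a single seed song. It assumes each song already has a numeric embedding of its lyric sentiment and that cosine similarity is the comparison metric; the function names, the toy vectors, and the choice of metric are illustrative assumptions, not necessarily what the project does.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction, so treat it as dissimilar
    return dot / (norm_a * norm_b)


def recommend(seed_title, embeddings, top_n=3):
    """Rank every other song by similarity to the seed song (hypothetical helper)."""
    seed = embeddings[seed_title]
    scored = [
        (title, cosine_similarity(seed, vector))
        for title, vector in embeddings.items()
        if title != seed_title
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


# Toy 3-dimensional vectors standing in for real sentiment embeddings.
library = {
    "Song A": [0.9, 0.1, 0.0],
    "Song B": [0.8, 0.2, 0.1],
    "Song C": [0.0, 0.1, 0.9],
    "Song D": [0.1, 0.9, 0.2],
}

for title, score in recommend("Song A", library, top_n=2):
    print(f"{title}: {score:.3f}")
```

With real embeddings the vectors would have hundreds or thousands of dimensions, but the ranking logic stays the same.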
- Install dependencies:
  pip3 install -r requirements.txt
- Set up Spotify API credentials:
  - Visit the Spotify Developers page. Create an account and a project (Spotify calls it an 'app'), then find the client ID and client secret under 'settings'.
- Set up Postgres:
  - Use the provided schema file.
- Set up the .env file:
  - Follow the instructions in the .env file.
- Run the webcrawler:
  python3 webcrawl_lyrics.py
- Authenticate Spotify and initiate the recommendation process:
  - Follow the on-screen instructions.
- Run the summary-making program:
  python3 summary.py --generate_batch
  - Check the status at any time:
    python3 summary.py --check_status
  - Note: This re-checks the status every 10 minutes. Press Ctrl+C to stop it manually.
- Run the embeddings:
  python3 embeddings.py
- Run the mapping with HDBSCAN or K-Means:
  python3 mapping.py --dense <CLUSTER_MIN_SIZE_INT> <CLUSTER_MIN_SAMPLES_INT>
  - Example:
    python3 mapping.py --dense 10 4
  - Note: Re-run with different values until the clusters look right.
- Create the playlists:
  python3 make_cluster_playlists.py
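The two integers passed to `--dense` in the mapping step line up with the usual density-based clustering knobs: in HDBSCAN, `min_cluster_size` is the smallest group that counts as a cluster, and `min_samples` controls how conservative the clustering is (higher values label more points as noise). The sketch below shows one way such a flag could be parsed with `argparse`. The helper names are hypothetical and the real `mapping.py` may handle its arguments differently.

```python
import argparse


def build_parser():
    """Build a parser that accepts --dense followed by exactly two integers."""
    parser = argparse.ArgumentParser(
        description="Cluster song embeddings (illustrative sketch)."
    )
    parser.add_argument(
        "--dense",
        nargs=2,
        type=int,
        metavar=("CLUSTER_MIN_SIZE", "CLUSTER_MIN_SAMPLES"),
        help="settings for density-based clustering",
    )
    return parser


def dense_settings(argv):
    """Return the clustering settings as a dict, or None if --dense was not given."""
    args = build_parser().parse_args(argv)
    if args.dense is None:
        return None
    min_cluster_size, min_samples = args.dense
    if min_cluster_size < 1 or min_samples < 1:
        raise ValueError("both --dense values must be positive integers")
    # These keys mirror the parameter names HDBSCAN implementations use.
    return {"min_cluster_size": min_cluster_size, "min_samples": min_samples}


# Equivalent to: python3 mapping.py --dense 10 4
print(dense_settings(["--dense", "10", "4"]))
```

When tuning, it is usually easier to change one value at a time: raise the minimum size if there are too many tiny clusters, and lower the minimum samples if too many songs end up unassigned.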
Database: Postgres 🗄️
Program: Python 🐍
- Spotify API 🎵
- Selenium (for dynamic web crawling) 🌐
- Beautiful Soup (for static web crawling) 🍲
- Clustering algorithms 📊
- ChatGPT API 🤖
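As a rough illustration of the static crawling listed above, the sketch below pulls lyric text out of an already-downloaded HTML page. It uses Python's built-in `html.parser` rather than Beautiful Soup so that it runs without extra installs, and it assumes the lyrics sit inside `div` elements marked `data-lyrics-container="true"`. Both the markup assumption and the class and function names are illustrative and may not match the project's crawler or the site's current layout.

```python
from html.parser import HTMLParser


class LyricsExtractor(HTMLParser):
    """Collect text found inside <div data-lyrics-container="true"> elements."""

    def __init__(self):
        super().__init__()
        self.depth = 0   # greater than 0 while inside a lyrics container
        self.parts = []  # text fragments and line breaks, in document order

    def handle_starttag(self, tag, attrs):
        if self.depth:
            if tag == "br":
                self.parts.append("\n")  # line breaks separate lyric lines
            elif tag == "div":
                self.depth += 1          # track nested divs so we know when to stop
        elif tag == "div" and dict(attrs).get("data-lyrics-container") == "true":
            self.depth = 1

    def handle_endtag(self, tag):
        if self.depth and tag == "div":
            self.depth -= 1
            if self.depth == 0:
                self.parts.append("\n")  # keep separate containers on separate lines

    def handle_data(self, data):
        if self.depth:
            self.parts.append(data)


def extract_lyrics(html):
    """Return the lyric lines from an HTML string, one per line, blanks removed."""
    parser = LyricsExtractor()
    parser.feed(html)
    parser.close()
    lines = [line.strip() for line in "".join(parser.parts).splitlines()]
    return "\n".join(line for line in lines if line)


sample = (
    '<div class="banner">Advert</div>'
    '<div data-lyrics-container="true">First line<br/>'
    'second line with <a href="#"><span>a link</span></a><br>third line</div>'
    '<div>Footer</div>'
)
print(extract_lyrics(sample))
```

Pages that only fill in their lyrics through JavaScript would need a browser-driven tool such as Selenium to render the HTML first.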