NilsKulmbacher/spotify-language-classifier

Spotify Language Classifier


I love listening to music, especially the music I come across when travelling. Whatever friends or locals send me, I throw into random catch-all playlists. But at some point you want to actually find that French song from Paris, or the Spanish track someone played on the beach — and suddenly you're scrolling through 400 songs with no way to filter by language.

Spotify tracks all kinds of audio features — danceability, tempo, energy — but language isn't one of them. So I built this.

Spotify Language Classifier connects to your Spotify account, scans any playlist, detects the language of each song using lyrics analysis, and automatically creates separate playlists per language. Songs it can't figure out end up in an "unclassified" list.


Features

  • Sign in with Spotify — secure OAuth 2.0 PKCE authentication
  • Pick any playlist — browse your playlists with cover art and track counts, or paste a playlist URL/ID directly
  • Supports up to 3,000 songs — processed in batches with a real-time progress bar
  • Accurate language detection — fetches lyrics via LRCLIB (free, unlimited) with optional Genius fallback, then runs fasttext + lingua-py language detection
  • 176 languages supported — covers mainstream and niche music worldwide
  • Handles edge cases — instrumental tracks, multilingual songs, and unclassified songs all get their own lists
  • Creates playlists automatically — one new Spotify playlist per detected language, right in your account
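
The PKCE part of the sign-in can be illustrated with a short sketch. This is not taken from the repository (where the flow may well live in the React frontend); it only shows how a verifier/challenge pair is derived under RFC 7636 with the S256 method. The function name make_pkce_pair is hypothetical.

```python
from __future__ import annotations

import base64
import hashlib
import secrets


def make_pkce_pair() -> tuple[str, str]:
    """Return (code_verifier, code_challenge) for an S256 PKCE flow."""
    # 64 random bytes encode to 86 URL-safe characters, inside the
    # 43-128 character range that RFC 7636 allows for a verifier.
    verifier = secrets.token_urlsafe(64)
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    # The challenge is the unpadded base64url encoding of the SHA-256 digest.
    challenge = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return verifier, challenge
```

The challenge goes into the authorization request together with code_challenge_method=S256, and the verifier is sent later when the authorization code is exchanged for tokens.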

How It Works

1. You select a playlist
2. App fetches all track metadata from Spotify (artist, title, album, duration)
3. For each song, lyrics are fetched from LRCLIB (using duration for accurate matching)
   └── If not found: falls back to Genius API (if configured)
4. Lyrics are analyzed by fasttext (176-language model)
   └── If low confidence: re-analyzed by lingua-py for short/mixed texts
5. Songs are grouped by language
6. You choose which language playlists to create in your Spotify account
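
Step 3 can be sketched as follows. This is a minimal illustration, assuming the lookup goes through LRCLIB's public /api/get endpoint with its artist_name, track_name, album_name and duration parameters; the helper name build_lrclib_url is hypothetical and the app's actual request code may differ.

```python
from urllib.parse import urlencode

LRCLIB_GET_URL = "https://lrclib.net/api/get"


def build_lrclib_url(artist: str, title: str, album: str, duration_ms: int) -> str:
    """Build an LRCLIB lookup URL from Spotify track metadata."""
    params = {
        "artist_name": artist,
        "track_name": title,
        "album_name": album,
        # Spotify reports duration in milliseconds; LRCLIB expects seconds.
        "duration": round(duration_ms / 1000),
    }
    return f"{LRCLIB_GET_URL}?{urlencode(params)}"
```

Passing the duration lets LRCLIB tell apart different recordings that share a title, such as a live version and a studio version.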

Setup

Prerequisites

1. Clone the repo

git clone https://github.com/nilskulmbacher/spotify-language-classifier.git
cd spotify-language-classifier

2. Configure Spotify

  1. Go to the Spotify Developer Dashboard
  2. Create a new app
  3. Under Redirect URIs, add: http://127.0.0.1:8000/auth/callback
  4. Copy your Client ID and Client Secret

3. Backend setup

cd backend
python3 -m venv .venv
source .venv/bin/activate        # Windows: .venv\Scripts\activate
pip install -r requirements.txt

cp .env.example .env
# Edit .env with your Spotify credentials and a generated SECRET_KEY:
# python -c "import secrets; print(secrets.token_hex(32))"

uvicorn main:app --reload --port 8000

4. Frontend setup

cd frontend
npm install

cp .env.example .env
# Edit .env — set VITE_SPOTIFY_CLIENT_ID to your Spotify Client ID

npm run dev
# Open http://localhost:5173

Optional: Genius API (Better Lyrics Coverage)

LRCLIB covers ~3 million songs for free with no rate limits. For broader coverage (especially mainstream/English music), add a free Genius API token:

  1. Go to genius.com/api-clients and create an app
  2. Add to backend/.env:
    GENIUS_CLIENT_ID=...
    GENIUS_CLIENT_SECRET=...
    GENIUS_ACCESS_TOKEN=...
    

If Genius credentials are not set, the app shows a warning and uses LRCLIB only.
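
The fallback behaviour can be pictured with a small sketch. It assumes, for illustration only, that the presence of GENIUS_ACCESS_TOKEN decides whether Genius is used; the app may check all three Genius variables. The function name lyrics_sources is hypothetical.

```python
from __future__ import annotations

import os
from collections.abc import Mapping


def lyrics_sources(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return lyric providers in lookup order, based on configured credentials."""
    sources = ["lrclib"]  # always available, needs no credentials
    # Assumption: a non-empty access token is enough to enable Genius.
    if env.get("GENIUS_ACCESS_TOKEN"):
        sources.append("genius")
    return sources
```

With an empty environment this returns only LRCLIB, which matches the LRCLIB-only mode described above.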


Environment Variables

Backend (backend/.env)

| Variable | Required | Description |
| --- | --- | --- |
| SPOTIFY_CLIENT_ID | Yes | Spotify Developer Dashboard |
| SPOTIFY_CLIENT_SECRET | Yes | Spotify Developer Dashboard |
| SPOTIFY_REDIRECT_URI | Yes | Must match dashboard. Default: http://127.0.0.1:8000/auth/callback |
| GENIUS_CLIENT_ID | No | Genius API; improves lyrics coverage |
| GENIUS_CLIENT_SECRET | No | Genius API |
| GENIUS_ACCESS_TOKEN | No | Genius API |
| SECRET_KEY | Yes | Random hex for session signing |
| FRONTEND_URL | Yes | For CORS. Default: http://localhost:5173 |
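
One way to read these settings at startup is sketched below. It assumes that the variables with a listed default may be omitted and that the remaining required ones must be set; the names load_backend_settings, REQUIRED and DEFAULTS are hypothetical and not taken from the backend code.

```python
from __future__ import annotations

import os
from collections.abc import Mapping

# Required variables that have no default in the table above.
REQUIRED = ("SPOTIFY_CLIENT_ID", "SPOTIFY_CLIENT_SECRET", "SECRET_KEY")
# Defaults as listed in the table above.
DEFAULTS = {
    "SPOTIFY_REDIRECT_URI": "http://127.0.0.1:8000/auth/callback",
    "FRONTEND_URL": "http://localhost:5173",
}


def load_backend_settings(env: Mapping[str, str] = os.environ) -> dict[str, str]:
    """Collect backend settings, failing early if a required value is missing."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    settings = {name: env[name] for name in REQUIRED}
    for name, default in DEFAULTS.items():
        settings[name] = env.get(name) or default
    return settings
```

Failing at startup gives a clearer error than a failed Spotify login later on.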

Frontend (frontend/.env)

| Variable | Required | Description |
| --- | --- | --- |
| VITE_API_BASE_URL | Yes | Backend URL. Default: http://localhost:8000 |
| VITE_SPOTIFY_CLIENT_ID | Yes | Same as backend SPOTIFY_CLIENT_ID |
| VITE_SPOTIFY_REDIRECT_URI | Yes | Same as backend SPOTIFY_REDIRECT_URI |

Language Classification Details

| Result | Meaning |
| --- | --- |
| English, Spanish, etc. | Song lyrics detected with high confidence (>85%) |
| Multilingual | Two or more languages each above 20% confidence |
| Instrumental | No lyrics found on any source |
| Unclassified | Lyrics found but language confidence too low |
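
These rules can be written as a small decision function. The sketch below is an interpretation of the thresholds above, not the app's actual code; it assumes the detector output has already been converted to (language name, confidence) pairs (fastText itself returns labels such as __label__fr), and the name classify is hypothetical.

```python
from __future__ import annotations

HIGH_CONFIDENCE = 0.85     # single-language threshold
MULTILINGUAL_SHARE = 0.20  # per-language threshold for "Multilingual"


def classify(predictions: list[tuple[str, float]] | None) -> str:
    """Map detector output to a result label.

    `predictions` is a list of (language, confidence) pairs, or None when
    no lyrics were found on any source.
    """
    if predictions is None:
        return "Instrumental"
    ranked = sorted(predictions, key=lambda p: p[1], reverse=True)
    # A single dominant language wins outright.
    if ranked and ranked[0][1] > HIGH_CONFIDENCE:
        return ranked[0][0]
    # Otherwise look for two or more languages with a substantial share.
    strong = [lang for lang, conf in ranked if conf > MULTILINGUAL_SHARE]
    if len(strong) >= 2:
        return "Multilingual"
    return "Unclassified"
```

For example, a song scored as French 0.93 is labelled French, while one split between English 0.55 and Spanish 0.40 falls into Multilingual.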

Tech Stack

| Component | Technology |
| --- | --- |
| Backend | Python 3.11, FastAPI, uvicorn |
| Frontend | React 18, TypeScript, Vite, Tailwind CSS |
| Language Detection | fasttext (primary), lingua-py (fallback) |
| Lyrics | LRCLIB (primary), Genius (optional fallback) |
| Auth | Spotify OAuth 2.0 PKCE |
