I love listening to music, especially the music I come across when travelling. Whatever friends or locals send me, I throw into random catch-all playlists. But at some point you want to actually find that French song from Paris, or the Spanish track someone played on the beach — and suddenly you're scrolling through 400 songs with no way to filter by language.
Spotify tracks all kinds of audio features — danceability, tempo, energy — but language isn't one of them. So I built this.
Spotify Language Classifier connects to your Spotify account, scans any playlist, detects the language of each song using lyrics analysis, and automatically creates separate playlists per language. Songs it can't figure out end up in an "unclassified" list.
- Sign in with Spotify — secure OAuth 2.0 PKCE authentication
- Pick any playlist — browse your playlists with cover art and track counts, or paste a playlist URL/ID directly
- Supports up to 3,000 songs — processed in batches with a real-time progress bar
- Accurate language detection — fetches lyrics via LRCLIB (free, unlimited) with optional Genius fallback, then runs fasttext + lingua-py language detection
- 176 languages supported — covers mainstream and niche music worldwide
- Handles edge cases — instrumental tracks, multilingual songs, and unclassified songs all get their own lists
- Creates playlists automatically — one new Spotify playlist per detected language, right in your account
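The batch processing mentioned in the list above could look roughly like the sketch below. The batch size of 100 is an assumption (it matches the per-request cap Spotify applies when adding items to a playlist, as far as I know), and `batched` and `progress_steps` are hypothetical helper names, not functions from this repository.

```python
from typing import Iterator, Sequence, TypeVar

T = TypeVar("T")

MAX_TRACKS = 3000  # playlist size limit stated in the feature list
BATCH_SIZE = 100   # assumption: chosen to match Spotify's add-items request cap


def batched(items: Sequence[T], size: int = BATCH_SIZE) -> Iterator[Sequence[T]]:
    """Yield consecutive slices of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def progress_steps(total: int, size: int = BATCH_SIZE) -> list[float]:
    """Return the fraction completed after each batch, e.g. to drive a progress bar."""
    if total > MAX_TRACKS:
        raise ValueError(f"playlist has {total} tracks; the limit is {MAX_TRACKS}")
    done = 0
    steps: list[float] = []
    for batch in batched(range(total), size):
        done += len(batch)
        steps.append(done / total)
    return steps
```

A 250-track playlist would be handled in three batches (100, 100, and 50 tracks), with the progress value updated after each one.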
1. You select a playlist
2. App fetches all track metadata from Spotify (artist, title, album, duration)
3. For each song, lyrics are fetched from LRCLIB (using duration for accurate matching)
└── If not found: falls back to Genius API (if configured)
4. Lyrics are analyzed by fasttext (176-language model)
└── If low confidence: re-analyzed by lingua-py for short/mixed texts
5. Songs are grouped by language
6. You choose which language playlists to create in your Spotify account
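The steps above can be sketched as a small pipeline with two fallback chains: one for lyrics sources and one for language detectors. Everything here is illustrative. The function names, the track dictionary keys, and the callable interfaces are assumptions, and the 0.85 cutoff is borrowed from the results table rather than confirmed as the value the app uses to trigger the lingua-py fallback. The LRCLIB parameter names follow its public lookup endpoint as I understand it and should be checked against the LRCLIB documentation.

```python
from typing import Callable, Optional, Sequence

# Hypothetical interfaces for illustration; the project's real ones may differ.
LyricsSource = Callable[[dict], Optional[str]]   # track metadata -> lyrics or None
Detector = Callable[[str], tuple[str, float]]    # lyrics -> (language code, confidence)

CONFIDENCE_THRESHOLD = 0.85  # assumption: reuses the 85% cutoff from the results table


def lrclib_params(track: dict) -> dict:
    """Build lookup parameters from Spotify metadata; duration is sent in seconds."""
    return {
        "track_name": track["title"],
        "artist_name": track["artist"],
        "album_name": track["album"],
        "duration": round(track["duration_ms"] / 1000),
    }


def fetch_lyrics(track: dict, sources: Sequence[LyricsSource]) -> Optional[str]:
    """Try each source in order (e.g. LRCLIB, then Genius); return the first non-empty hit."""
    for source in sources:
        lyrics = source(track)
        if lyrics and lyrics.strip():
            return lyrics
    return None


def detect_language(lyrics: str, primary: Detector, fallback: Detector) -> tuple[str, float]:
    """Use the primary detector; consult the fallback only when confidence is low."""
    language, confidence = primary(lyrics)
    if confidence >= CONFIDENCE_THRESHOLD:
        return language, confidence
    return fallback(lyrics)


def group_by_language(
    tracks: Sequence[dict],
    sources: Sequence[LyricsSource],
    primary: Detector,
    fallback: Detector,
) -> dict[str, list[dict]]:
    """Group tracks by detected language, with buckets for the edge cases."""
    groups: dict[str, list[dict]] = {}
    for track in tracks:
        lyrics = fetch_lyrics(track, sources)
        if lyrics is None:
            key = "instrumental"
        else:
            language, confidence = detect_language(lyrics, primary, fallback)
            key = language if confidence >= CONFIDENCE_THRESHOLD else "unclassified"
        groups.setdefault(key, []).append(track)
    return groups
```

Passing the sources and detectors in as callables keeps the control flow testable without network access or model files; in a real setup they would wrap the LRCLIB and Genius clients and the fasttext and lingua-py models.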
- Python 3.11+
- Node.js 18+
- A Spotify Developer account
git clone https://github.com/nilskulmbacher/spotify-language-classifier.git
cd spotify-language-classifier

- Go to the Spotify Developer Dashboard
- Create a new app
- Under Redirect URIs, add:
http://127.0.0.1:8000/auth/callback
- Copy your Client ID and Client Secret
cd backend
python3 -m venv .venv
source .venv/bin/activate # Windows: .venv\Scripts\activate
pip install -r requirements.txt
cp .env.example .env
# Edit .env with your Spotify credentials and a generated SECRET_KEY:
# python -c "import secrets; print(secrets.token_hex(32))"
uvicorn main:app --reload --port 8000

cd frontend
npm install
cp .env.example .env
# Edit .env — set VITE_SPOTIFY_CLIENT_ID to your Spotify Client ID
npm run dev
# Open http://localhost:5173

LRCLIB covers ~3 million songs for free with no rate limits. For broader coverage (especially mainstream/English music), add a free Genius API token:
- Go to genius.com/api-clients and create an app
- Add to backend/.env:
GENIUS_CLIENT_ID=...
GENIUS_CLIENT_SECRET=...
GENIUS_ACCESS_TOKEN=...
If Genius credentials are not set, the app shows a warning and uses LRCLIB only.
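One way to implement that check is shown below, as a minimal sketch. It assumes all three Genius variables must be present for the fallback to be enabled, which the text does not state explicitly; `genius_enabled` is a hypothetical helper name.

```python
import os
import warnings
from typing import Mapping

GENIUS_VARS = ("GENIUS_CLIENT_ID", "GENIUS_CLIENT_SECRET", "GENIUS_ACCESS_TOKEN")


def genius_enabled(env: Mapping[str, str] = os.environ) -> bool:
    """Return True if Genius is configured; otherwise warn and fall back to LRCLIB only."""
    # Assumption: every variable in GENIUS_VARS is required for the fallback.
    missing = [name for name in GENIUS_VARS if not env.get(name, "").strip()]
    if missing:
        warnings.warn(
            f"Genius credentials not set ({', '.join(missing)}); using LRCLIB only",
            stacklevel=2,
        )
        return False
    return True
```

Accepting the environment as a parameter makes the function easy to exercise with a plain dictionary instead of real environment variables.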
| Variable | Required | Description |
|---|---|---|
| SPOTIFY_CLIENT_ID | Yes | Spotify Developer Dashboard |
| SPOTIFY_CLIENT_SECRET | Yes | Spotify Developer Dashboard |
| SPOTIFY_REDIRECT_URI | Yes | Must match dashboard. Default: http://127.0.0.1:8000/auth/callback |
| GENIUS_CLIENT_ID | No | Genius API (improves lyrics coverage) |
| GENIUS_CLIENT_SECRET | No | Genius API |
| GENIUS_ACCESS_TOKEN | No | Genius API |
| SECRET_KEY | Yes | Random hex for session signing |
| FRONTEND_URL | Yes | For CORS. Default: http://localhost:5173 |
| Variable | Required | Description |
|---|---|---|
| VITE_API_BASE_URL | Yes | Backend URL. Default: http://localhost:8000 |
| VITE_SPOTIFY_CLIENT_ID | Yes | Same as backend SPOTIFY_CLIENT_ID |
| VITE_SPOTIFY_REDIRECT_URI | Yes | Same as backend SPOTIFY_REDIRECT_URI |
| Result | Meaning |
|---|---|
| English, Spanish, etc. | Song lyrics detected with high confidence (>85%) |
| Multilingual | Two or more languages each above 20% confidence |
| Instrumental | No lyrics found on any source |
| Unclassified | Lyrics found but language confidence too low |
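The rules in this table can be expressed as a small decision function. This is a sketch under stated assumptions: the order in which the rules are checked is a guess, `classify` is a hypothetical name, and the function returns a language code where the app presumably shows a display name such as "English".

```python
from typing import Mapping, Optional

HIGH_CONFIDENCE = 0.85     # ">85%" from the table
MULTILINGUAL_SHARE = 0.20  # "each above 20%" from the table


def classify(lyrics: Optional[str], scores: Mapping[str, float]) -> str:
    """Map lyrics plus per-language confidence scores to a result label."""
    if not lyrics or not lyrics.strip():
        return "Instrumental"  # no lyrics found on any source
    # Highest-scoring language; the default covers an empty score mapping.
    best_lang, best_score = max(
        scores.items(), key=lambda item: item[1], default=("", 0.0)
    )
    if best_score > HIGH_CONFIDENCE:
        return best_lang
    # Assumption: the multilingual check runs only when no single language dominates.
    strong = [lang for lang, score in scores.items() if score > MULTILINGUAL_SHARE]
    if len(strong) >= 2:
        return "Multilingual"
    return "Unclassified"
```

For example, scores of 0.55 for English and 0.40 for French would give "Multilingual", while 0.60 for English and 0.10 for French would give "Unclassified".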
| Component | Technology |
|---|---|
| Backend | Python 3.11, FastAPI, uvicorn |
| Frontend | React 18, TypeScript, Vite, Tailwind CSS |
| Language Detection | fasttext (primary), lingua-py (fallback) |
| Lyrics | LRCLIB (primary), Genius (optional fallback) |
| Auth | Spotify OAuth 2.0 PKCE |