SpotifyScraper

Extract Spotify data without the official API. Access tracks, albums, artists, playlists, and podcasts - no authentication required.

Why SpotifyScraper?

🔓 No API Key Required - Start extracting data immediately
🚀 Fast & Lightweight - Optimized for speed and minimal dependencies
📊 Complete Metadata - Get all available track, album, and artist details
🎙️ Podcast Support - Extract podcast episodes and show information
💿 Media Downloads - Download cover art and preview clips
🔄 Bulk Operations - Process multiple URLs efficiently
🛡️ Robust & Reliable - Comprehensive error handling and retries

Installation

# Basic installation
pip install spotifyscraper

# With Selenium support (includes automatic driver management)
# Quote the extras so shells such as zsh do not expand the brackets
pip install "spotifyscraper[selenium]"

# All features
pip install "spotifyscraper[all]"

Quick Start

Basic Usage

from spotify_scraper import SpotifyClient

# Initialize client with rate limiting (default 0.5s between requests)
client = SpotifyClient()

# Get track info with enhanced metadata
track = client.get_track_info("https://open.spotify.com/track/4iV5W9uYEdYUVa79Axb7Rh")
print(f"{track['name']} by {track['artists'][0]['name']}")
# Output: One More Time by Daft Punk

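# Optional fields: read them with .get() and a default, because scraping does not always return them
print(f"Track #{track.get('track_number', 'N/A')} on disc {track.get('disc_number', 'N/A')}")
# popularity is listed as unavailable under "Field Availability", so expect the fallback here
print(f"Popularity: {track.get('popularity', 'Not available')}")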

# Download cover art
cover_path = client.download_cover("https://open.spotify.com/track/4iV5W9uYEdYUVa79Axb7Rh")
print(f"Cover saved to: {cover_path}")

client.close()

CLI Usage

# Get track info
spotify-scraper track https://open.spotify.com/track/4iV5W9uYEdYUVa79Axb7Rh

# Download album with covers
spotify-scraper download album https://open.spotify.com/album/0JGOiO34nwfUdDrD612dOp --with-covers

# Export playlist to JSON
spotify-scraper playlist https://open.spotify.com/playlist/37i9dQZF1DXcBWIGoYBM5M --output playlist.json

Important Notes

Field Availability

Not all fields shown in Spotify's API documentation are available via web scraping:

❌ NOT Available: popularity, followers, genres, detailed statistics
✅ Available: name, artists, album info, duration, preview URLs
⚠️ Authentication Required: lyrics (needs OAuth, not just cookies)
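
Because some fields may be absent, it helps to read nested values defensively. The following is a minimal sketch, assuming the scraped result is a plain dict as the examples in this README treat it. `get_nested` is a hypothetical helper written for illustration, not part of SpotifyScraper, and the sample `track` dict is made up.

```python
from typing import Any, Sequence, Union

Key = Union[str, int]


def get_nested(data: Any, path: Sequence[Key], default: Any = None) -> Any:
    """Walk dicts and lists along `path`; return `default` on any miss or None."""
    current = data
    for key in path:
        if isinstance(current, dict) and key in current:
            current = current[key]
        elif (
            isinstance(current, list)
            and isinstance(key, int)
            and -len(current) <= key < len(current)
        ):
            current = current[key]
        else:
            return default
    return default if current is None else current


# Hypothetical scraped result in which the unavailable fields are simply missing.
track = {"name": "One More Time", "artists": [{"name": "Daft Punk"}]}

print(get_nested(track, ["artists", 0, "name"], "Unknown"))
print(get_nested(track, ["popularity"], "Not available"))
```

The same helper covers the empty-list case (`"artists": []`), which a chained `.get(...)[0]` would not.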

Core Features

🎵 Track Information

# Get complete track metadata
track = client.get_track_info(track_url)

# Available data:
# - name, id, uri, duration_ms
# - artists (with names, IDs, and verification status)  
# - album (with name, ID, release date, images, total_tracks)
# - preview_url (30-second MP3)
# - is_explicit, is_playable
# - track_number, disc_number (when available)
# - popularity (usually missing; see Field Availability)
# - external URLs

# Example: Access enhanced metadata
if 'artists' in track:
    for artist in track['artists']:
        print(f"Artist: {artist['name']}")
        if 'verified' in artist:
            print(f"  Verified: {artist['verified']}")
        if 'url' in artist:
            print(f"  URL: {artist['url']}")

if 'album' in track:
    album = track['album']
    print(f"Album: {album['name']} ({album.get('total_tracks', 'N/A')} tracks)")

# Note: Lyrics require OAuth authentication
# SpotifyScraper cannot access lyrics as Spotify requires Bearer tokens

💿 Album Information

# Get album with all tracks
album = client.get_album_info(album_url)

print(f"Album: {album.get('name', 'Unknown')}")
print(f"Artist: {(album.get('artists', [{}])[0].get('name', 'Unknown') if album.get('artists') else 'Unknown')}")
print(f"Released: {album.get('release_date', 'N/A')}")
print(f"Tracks: {album.get('total_tracks', 0)}")

# List all tracks
for track in album.get('tracks', []):
    # track_number is only present when available, so fall back to a placeholder
    print(f"  {track.get('track_number', '?')}. {track.get('name', 'Unknown')}")

👤 Artist Information

# Get artist profile
artist = client.get_artist_info(artist_url)

print(f"Artist: {artist.get('name', 'Unknown')}")
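# followers and genres are usually missing from scraped data (see Field Availability).
# Apply the thousands separator only to a real number; formatting 'N/A' with ':,' raises ValueError.
followers = artist.get('followers', {}).get('total')
print(f"Followers: {followers:,}" if isinstance(followers, int) else "Followers: N/A")
print(f"Genres: {', '.join(artist.get('genres', [])) or 'N/A'}")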
print(f"Popularity: {artist.get('popularity', 'N/A')}/100")

# Get top tracks
for track in artist.get('top_tracks', [])[:5]:
    print(f"  - {track.get('name', 'Unknown')}")

📋 Playlist Information

# Get playlist details
playlist = client.get_playlist_info(playlist_url)

print(f"Playlist: {playlist.get('name', 'Unknown')}")
print(f"Owner: {playlist.get('owner', {}).get('display_name', playlist.get('owner', {}).get('id', 'Unknown'))}")
print(f"Tracks: {playlist.get('track_count', 0)}")
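# Apply the thousands separator only to a real number; formatting 'N/A' with ':,' raises ValueError
followers = playlist.get('followers', {}).get('total')
print(f"Followers: {followers:,}" if isinstance(followers, int) else "Followers: N/A")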

# Get all tracks
for track in playlist['tracks']:
    print(f"  - {track.get('name', 'Unknown')} by {(track.get('artists', [{}])[0].get('name', 'Unknown') if track.get('artists') else 'Unknown')}")

🎙️ Podcast Support (NEW!)

Episode Information

# Get episode details
episode = client.get_episode_info(episode_url)

print(f"Episode: {episode.get('name', 'Unknown')}")
print(f"Show: {episode.get('show', {}).get('name', 'Unknown')}")
print(f"Duration: {episode.get('duration_ms', 0) / 1000 / 60:.1f} minutes")
print(f"Release Date: {episode.get('release_date', 'N/A')}")
print(f"Has Video: {'Yes' if episode.get('has_video') else 'No'}")

# Download episode preview (1-2 minute clip)
preview_path = client.download_episode_preview(
    episode_url,
    path="podcast_previews/",
    filename="episode_preview"
)
print(f"Preview downloaded to: {preview_path}")

Show Information

# Get podcast show details
show = client.get_show_info(show_url)

print(f"Show: {show.get('name', 'Unknown')}")
print(f"Publisher: {show.get('publisher', 'Unknown')}")
print(f"Total Episodes: {show.get('total_episodes', 'N/A')}")
print(f"Categories: {', '.join(show.get('categories', []))}")

# Get recent episodes
for episode in show.get('episodes', [])[:5]:
    print(f"  - {episode.get('name', 'Unknown')} ({episode.get('duration_ms', 0) / 1000 / 60:.1f} min)")
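
The episode and show examples above convert `duration_ms` to fractional minutes. For display, a clock-style string is often easier to read. This is a small sketch of one way to do that; `format_duration` is a hypothetical helper, not a SpotifyScraper function.

```python
def format_duration(duration_ms: int) -> str:
    """Format milliseconds as M:SS, or H:MM:SS for an hour or longer."""
    total_seconds = max(int(duration_ms), 0) // 1000
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    if hours:
        return f"{hours}:{minutes:02d}:{seconds:02d}"
    return f"{minutes}:{seconds:02d}"


print(format_duration(320_357))    # 5:20
print(format_duration(3_725_000))  # 1:02:05
```

It could be used as `format_duration(episode.get('duration_ms', 0))` in place of the division shown above.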

CLI Commands for Podcasts

# Get episode info
spotify-scraper episode info https://open.spotify.com/episode/...

# Download episode preview
spotify-scraper episode download https://open.spotify.com/episode/... -o previews/

# Get show info with episodes
spotify-scraper show info https://open.spotify.com/show/...

# List show episodes
spotify-scraper show episodes https://open.spotify.com/show/... -o episodes.json

Note: Full episode downloads require Spotify Premium authentication. SpotifyScraper currently supports preview clips only.

📥 Media Downloads

# Download track preview (30-second MP3)
audio_path = client.download_preview_mp3(
    track_url,
    path="previews/",
    filename="custom_name.mp3"
)

# Download cover art
cover_path = client.download_cover(
    album_url,
    path="covers/",
    size_preference="large",  # small, medium, large
    format="jpeg"  # jpeg or png
)

# Download all playlist covers
from spotify_scraper.utils.common import SpotifyBulkOperations

bulk = SpotifyBulkOperations(client)
covers = bulk.download_playlist_covers(
    playlist_url,
    output_dir="playlist_covers/"
)

🔄 Bulk Operations

from spotify_scraper.utils.common import SpotifyBulkOperations

# Process multiple URLs
urls = [
    "https://open.spotify.com/track/...",
    "https://open.spotify.com/album/...",
    "https://open.spotify.com/artist/..."
]

bulk = SpotifyBulkOperations()
results = bulk.process_urls(urls, operation="all_info")

# Export results
bulk.export_to_json(results, "spotify_data.json")
bulk.export_to_csv(results, "spotify_data.csv")

# Batch download media
downloads = bulk.batch_download(
    urls,
    output_dir="downloads/",
    media_types=["audio", "cover"]
)

📊 Data Analysis

from spotify_scraper.utils.common import SpotifyDataAnalyzer

analyzer = SpotifyDataAnalyzer()

# Analyze playlist
stats = analyzer.analyze_playlist(playlist_data)
print(f"Total duration: {stats['basic_stats']['total_duration_formatted']}")
print(f"Most common artist: {stats['artist_stats']['top_artists'][0]}")
print(f"Average popularity: {stats['basic_stats']['average_popularity']}")

# Compare playlists
comparison = analyzer.compare_playlists(playlist1, playlist2)
print(f"Common tracks: {comparison['track_comparison']['common_tracks']}")
print(f"Similarity: {comparison['track_comparison']['similarity_percentage']:.1f}%")

Advanced Configuration

Browser Selection

# Use requests (default, fast)
client = SpotifyClient(browser_type="requests")

# Use Selenium (for JavaScript content)
client = SpotifyClient(browser_type="selenium")

# Auto-detect (falls back to Selenium if needed)
client = SpotifyClient(browser_type="auto")

Authentication

# Using cookies file (exported from browser)
client = SpotifyClient(cookie_file="spotify_cookies.txt")

# Using cookie dictionary
client = SpotifyClient(cookies={"sp_t": "your_token"})

# Using headers
client = SpotifyClient(headers={
    "User-Agent": "Custom User Agent",
    "Accept-Language": "en-US,en;q=0.9"
})

Proxy Support

client = SpotifyClient(proxy={
    "http": "http://proxy.example.com:8080",
    "https": "https://proxy.example.com:8080"
})

Logging

# Set logging level
client = SpotifyClient(log_level="DEBUG")

# Or use standard logging
import logging
logging.basicConfig(level=logging.INFO)

API Reference

SpotifyClient

The main client for interacting with Spotify.

Methods:

get_track_info(url) - Get track metadata
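get_track_lyrics(url) - Get track lyrics (requires OAuth authentication; cookies alone are not enough)
get_track_info_with_lyrics(url) - Get track with lyrics
get_album_info(url) - Get album metadata
get_artist_info(url) - Get artist metadata
get_playlist_info(url) - Get playlist metadata
get_episode_info(url) - Get podcast episode metadata
get_show_info(url) - Get podcast show metadata
download_preview_mp3(url, path, filename) - Download track preview
download_episode_preview(url, path, filename) - Download podcast episode preview clip
download_cover(url, path, size_preference, format) - Download cover art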
close() - Close the client and clean up resources

SpotifyBulkOperations

Utilities for processing multiple URLs.

Methods:

process_urls(urls, operation) - Process multiple URLs
export_to_json(data, output_file) - Export to JSON
export_to_csv(data, output_file) - Export to CSV
batch_download(urls, output_dir, media_types) - Batch download media
process_url_file(file_path, operation) - Process URLs from file
extract_urls_from_text(text) - Extract Spotify URLs from text
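
To give a sense of what URL extraction involves, here is a rough sketch using a regular expression. It is an illustration only, not the library's implementation: `find_spotify_urls` and `SPOTIFY_URL_RE` are hypothetical names, and the pattern assumes the plain `open.spotify.com/<type>/<id>` form used throughout this README (it ignores query strings and does not match locale-prefixed paths).

```python
import re
from typing import List

# Entity types taken from the URLs shown in this README.
SPOTIFY_URL_RE = re.compile(
    r"https?://open\.spotify\.com/"
    r"(?:track|album|artist|playlist|episode|show)/[A-Za-z0-9]+"
)


def find_spotify_urls(text: str) -> List[str]:
    """Return unique Spotify URLs in order of first appearance."""
    seen = set()
    urls = []
    for match in SPOTIFY_URL_RE.finditer(text):
        url = match.group(0)
        if url not in seen:
            seen.add(url)
            urls.append(url)
    return urls


sample = (
    "Listen: https://open.spotify.com/track/4iV5W9uYEdYUVa79Axb7Rh and "
    "https://open.spotify.com/playlist/37i9dQZF1DXcBWIGoYBM5M?si=abc "
    "(again: https://open.spotify.com/track/4iV5W9uYEdYUVa79Axb7Rh)"
)
print(find_spotify_urls(sample))
```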

SpotifyDataAnalyzer

Tools for analyzing Spotify data.

Methods:

analyze_playlist(playlist_data) - Get playlist statistics
compare_playlists(playlist1, playlist2) - Compare two playlists

Examples

Download All Album Tracks

# Get album info
album = client.get_album_info(album_url)

# Download all track previews
for track in album['tracks']:
    track_url = f"https://open.spotify.com/track/{track['id']}"
    client.download_preview_mp3(track_url, path=f"album_{album.get('name', 'Unknown')}/")
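
Album names can contain characters that are not valid in directory names (for example `/` or `:`), which would break the `path` built above. A minimal sketch of a sanitizer follows; `safe_dirname` is a hypothetical helper, and the character set is an assumption based on common Windows and POSIX restrictions.

```python
import re


def safe_dirname(name: str, fallback: str = "Unknown") -> str:
    """Replace characters that are unsafe in file or directory names."""
    cleaned = re.sub(r'[<>:"/\\|?*\x00-\x1f]', "_", name).strip(" .")
    return cleaned or fallback


print(safe_dirname("AC/DC: Live?"))
print(safe_dirname("Discovery"))
```

With this in place, the download path could be written as `path=f"album_{safe_dirname(album.get('name', 'Unknown'))}/"`.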

Export Artist Discography

artist = client.get_artist_info(artist_url)

# Get all albums
albums_data = []
for album in artist['albums']['items']:
    album_url = f"https://open.spotify.com/album/{album['id']}"
    album_info = client.get_album_info(album_url)
    albums_data.append(album_info)

# Export to JSON
import json
with open(f"{artist.get('name', 'Unknown')}_discography.json", "w") as f:
    json.dump(albums_data, f, indent=2)

Create Playlist Report

from spotify_scraper.utils.common import SpotifyDataFormatter

formatter = SpotifyDataFormatter()

# Get playlist
playlist = client.get_playlist_info(playlist_url)

# Create markdown report
markdown = formatter.format_playlist_markdown(playlist)
with open("playlist_report.md", "w") as f:
    f.write(markdown)

# Create M3U file
# Accept both shapes: items that wrap the track under 'track', and plain track dicts
tracks = [item.get('track', item) for item in playlist['tracks']]
formatter.export_to_m3u(tracks, "playlist.m3u")

Error Handling

from spotify_scraper.core.exceptions import (
    SpotifyScraperError,
    URLError,
    ExtractionError,
    DownloadError
)

try:
    track = client.get_track_info(url)
except URLError:
    print("Invalid Spotify URL")
except ExtractionError as e:
    print(f"Failed to extract data: {e}")
except SpotifyScraperError as e:
    print(f"General error: {e}")

Command Line Interface

# General syntax
spotify-scraper [COMMAND] [URL] [OPTIONS]

# Commands:
#   track      Get track information
#   album      Get album information
#   artist     Get artist information
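#   playlist   Get playlist information
#   episode    Get podcast episode information or download a preview
#   show       Get podcast show information or list episodes
#   download   Download media files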

# Global options:
#   --output, -o      Output file path
#   --format, -f      Output format (json, csv, txt)
#   --pretty          Pretty print output
#   --log-level       Set logging level
#   --cookies         Cookie file path

# Examples:
spotify-scraper track $URL --pretty
spotify-scraper album $URL -o album.json
spotify-scraper playlist $URL -f csv -o playlist.csv
spotify-scraper download track $URL --with-cover --path downloads/

Environment Variables

Configure SpotifyScraper using environment variables:

export SPOTIFY_SCRAPER_LOG_LEVEL=DEBUG
export SPOTIFY_SCRAPER_BROWSER_TYPE=selenium
export SPOTIFY_SCRAPER_COOKIE_FILE=/path/to/cookies.txt
export SPOTIFY_SCRAPER_PROXY_HTTP=http://proxy:8080
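
If you prefer to pass these settings explicitly rather than rely on automatic pickup, the variables can be mapped onto the constructor keywords shown under Advanced Configuration. This is a sketch under that assumption: `client_kwargs_from_env` is a hypothetical helper, and the mapping from variable names to `log_level`, `browser_type`, `cookie_file`, and `proxy` is inferred from the examples in this README.

```python
import os
from typing import Any, Dict, Mapping, Optional


def client_kwargs_from_env(env: Optional[Mapping[str, str]] = None) -> Dict[str, Any]:
    """Build SpotifyClient keyword arguments from SPOTIFY_SCRAPER_* variables."""
    env = os.environ if env is None else env
    kwargs: Dict[str, Any] = {}

    # Simple one-to-one settings; unset or empty variables are skipped.
    simple = {
        "SPOTIFY_SCRAPER_LOG_LEVEL": "log_level",
        "SPOTIFY_SCRAPER_BROWSER_TYPE": "browser_type",
        "SPOTIFY_SCRAPER_COOKIE_FILE": "cookie_file",
    }
    for variable, keyword in simple.items():
        value = env.get(variable)
        if value:
            kwargs[keyword] = value

    # The proxy setting is a dict keyed by scheme.
    http_proxy = env.get("SPOTIFY_SCRAPER_PROXY_HTTP")
    if http_proxy:
        kwargs["proxy"] = {"http": http_proxy}

    return kwargs


print(client_kwargs_from_env({
    "SPOTIFY_SCRAPER_LOG_LEVEL": "DEBUG",
    "SPOTIFY_SCRAPER_PROXY_HTTP": "http://proxy:8080",
}))
```

The result can then be unpacked into the constructor, as in `SpotifyClient(**client_kwargs_from_env())`.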

Requirements

Python 3.8 or higher
Operating System: Windows, macOS, Linux, BSD
Dependencies:
- requests (for basic operations)
- beautifulsoup4 (for HTML parsing)
- selenium (optional, for JavaScript content)

Troubleshooting

Common Issues

1. SSL Certificate Errors

client = SpotifyClient(verify_ssl=False)  # Not recommended for production

2. Rate Limiting

import time
for url in urls:
    track = client.get_track_info(url)
    time.sleep(1)  # Add delay between requests
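
A fixed delay works for small batches. For longer runs, retrying failed calls with an increasing delay is a common alternative. The sketch below shows generic exponential backoff; `call_with_backoff` is a hypothetical helper, not part of SpotifyScraper, and the retry count and delays are arbitrary starting points.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_backoff(
    func: Callable[[], T],
    retries: int = 3,
    base_delay: float = 1.0,
    exceptions: Tuple[Type[BaseException], ...] = (Exception,),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `func`, retrying on `exceptions` with delays of base_delay * 2**attempt."""
    if retries < 0:
        raise ValueError("retries must be non-negative")
    attempt = 0
    while True:
        try:
            return func()
        except exceptions:
            if attempt == retries:
                raise  # Out of retries: surface the last error to the caller.
            sleep(base_delay * 2 ** attempt)
            attempt += 1


# Demo with a stand-in function that fails twice before succeeding.
_state = {"calls": 0}


def _flaky() -> str:
    _state["calls"] += 1
    if _state["calls"] < 3:
        raise RuntimeError("temporary failure")
    return "ok"


print(call_with_backoff(_flaky, sleep=lambda seconds: None))
```

With a real client, the call might look like `call_with_backoff(lambda: client.get_track_info(url), exceptions=(ExtractionError,))`, using the exception class from the Error Handling section.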

3. Cloudflare Protection

# Use Selenium backend
client = SpotifyClient(browser_type="selenium")

4. Missing Data

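# Some fields might be missing, None, or empty
track = client.get_track_info(url)
# `or [{}]` also covers an empty artists list, which would otherwise raise IndexError
artists = track.get('artists') or [{}]
artist_name = artists[0].get('name', 'Unknown')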

Contributing

We welcome contributions! Please see our Contributing Guide for details.

# Clone the repository
git clone https://github.com/AliAkhtari78/SpotifyScraper.git
cd SpotifyScraper

# Install development dependencies
pip install -e ".[dev]"

# Run tests
pytest

# Run linting
black src/ tests/
flake8 src/ tests/
mypy src/

License

SpotifyScraper is released under the MIT License. See LICENSE for details.

Disclaimer

This library is for educational and personal use only. Always respect Spotify's Terms of Service and robots.txt. The authors are not responsible for any misuse of this library.

Support

SpotifyScraper - Extract Spotify data with ease 🎵

Made with ❤️ by Ali Akhtari
