# 🔄 Spotify Library Sync

This notebook downloads your Spotify library and saves it locally for offline analysis.

**What it does:**
- ✅ Fetches every playlist you own (followed playlists are skipped)
- ✅ Fetches your Liked Songs (❤️ master playlist)
- ✅ Downloads track and artist metadata
- ✅ Saves everything to `../data/` as parquet files
- ✅ Incremental updates (only fetches changes)

**Run this first!** Then use `02_analyze_library.ipynb` for analysis.

**💡 Tip:** For automated daily syncs, use `scripts/spotify_sync.py` instead (configured via a cron job). See `README.md` for details.

## 1️⃣ Setup

Install dependencies and configure credentials.

In [12]:
# Install dependencies (run once)
%pip install -q pandas spotipy pyarrow tqdm python-dotenv

Note: you may need to restart the kernel to use updated packages.


In [13]:
# Add project to path
import sys
from pathlib import Path

PROJECT_ROOT = Path("..").resolve()
if str(PROJECT_ROOT) not in sys.path:
    sys.path.insert(0, str(PROJECT_ROOT))

print(f"✅ Project root: {PROJECT_ROOT}")

✅ Project root: /Users/aryamaan/Desktop/Projects/spotim8


In [14]:
import os
from dotenv import load_dotenv

# Load credentials from ../.env file
env_path = PROJECT_ROOT / ".env"
if env_path.exists():
    load_dotenv(env_path)
    print(f"✅ Loaded credentials from {env_path}")
else:
    print(f"⚠️  No .env file found at {env_path}")
    print("   Create one with SPOTIPY_CLIENT_ID, SPOTIPY_CLIENT_SECRET, SPOTIPY_REDIRECT_URI")

# Verify credentials are set
client_id = os.environ.get("SPOTIPY_CLIENT_ID", "")
if client_id and client_id != "YOUR_CLIENT_ID":
    print(f"   Client ID: {client_id[:8]}...")
else:
    print("   ❌ SPOTIPY_CLIENT_ID not set!")

✅ Loaded credentials from /Users/aryamaan/Desktop/Projects/spotim8/.env
   Client ID: 8263fcc5...
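
The cell above only verifies `SPOTIPY_CLIENT_ID`, although the `.env` file is expected to hold three variables. As a minimal sketch, the check could be extended to all three. `missing_credentials` is a hypothetical helper (not part of spotim8 or spotipy), and treating any value that starts with `YOUR_` as a placeholder is an assumption based on the `"YOUR_CLIENT_ID"` comparison above.

```python
import os

# The three variables this notebook expects in ../.env
REQUIRED_VARS = (
    "SPOTIPY_CLIENT_ID",
    "SPOTIPY_CLIENT_SECRET",
    "SPOTIPY_REDIRECT_URI",
)

# Assumption: template values look like "YOUR_CLIENT_ID", "YOUR_CLIENT_SECRET", ...
PLACEHOLDER_PREFIX = "YOUR_"


def missing_credentials(env=None):
    """Return the names of required variables that are unset, empty, or placeholders."""
    env = os.environ if env is None else env
    missing = []
    for name in REQUIRED_VARS:
        value = env.get(name, "").strip()
        if not value or value.startswith(PLACEHOLDER_PREFIX):
            missing.append(name)
    return missing


problems = missing_credentials()
if problems:
    print("Missing or placeholder credentials:", ", ".join(problems))
else:
    print("All three Spotify credentials are set.")
```

Passing a plain dictionary instead of `os.environ` makes the helper easy to try without touching the real environment.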


## 2️⃣ Connect to Spotify

On first run, this opens a browser window for authentication.

In [15]:
from spotim8 import Spotim8, set_response_cache
from spotim8.catalog import CacheConfig

# Data directory (stores downloaded data)
DATA_DIR = PROJECT_ROOT / "data"
DATA_DIR.mkdir(exist_ok=True)

# Enable API response caching to avoid rate limits
# Cached responses are reused for 1 hour
API_CACHE_DIR = DATA_DIR / ".api_cache"
set_response_cache(API_CACHE_DIR, ttl=3600)

# Initialize client with caching
sf = Spotim8.from_env(
    progress=True,
    cache=CacheConfig(dir=DATA_DIR)
)

print("✅ Connected to Spotify!")
print(f"📁 Data will be saved to: {DATA_DIR}")

📦 API response cache enabled: /Users/aryamaan/Desktop/Projects/spotim8/data/.api_cache (TTL: 3600s)
✅ Connected to Spotify!
📁 Data will be saved to: /Users/aryamaan/Desktop/Projects/spotim8/data
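
To illustrate what a one-hour TTL means in practice, here is a minimal sketch of a freshness check based on a file's modification time. It illustrates the general idea only: how `set_response_cache` actually stores and expires entries is not shown in this notebook, and `is_fresh` is a hypothetical helper.

```python
import tempfile
import time
from pathlib import Path


def is_fresh(path, ttl_seconds, now=None):
    """Return True if `path` exists and was modified less than `ttl_seconds` ago."""
    path = Path(path)
    if not path.exists():
        return False
    now = time.time() if now is None else now
    age = now - path.stat().st_mtime
    return age < ttl_seconds


# Demo with a throwaway file standing in for a cached API response.
with tempfile.TemporaryDirectory() as tmp:
    cached = Path(tmp) / "response.json"
    cached.write_text("{}")

    fresh_now = is_fresh(cached, ttl_seconds=3600)
    # Pretend two hours have passed since the file was written.
    fresh_later = is_fresh(cached, ttl_seconds=3600, now=time.time() + 7200)
    missing = is_fresh(Path(tmp) / "absent.json", ttl_seconds=3600)

print(fresh_now, fresh_later, missing)
```

A response that fails the check would be fetched again from the API and rewritten, which resets its age.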


## 3️⃣ Sync Your Library

This fetches your playlists and tracks. First run may take a few minutes.

In [16]:
# Sync library (incremental - only fetches changes)
stats = sf.sync(
    owned_only=True,           # Only your playlists, not followed ones
    include_liked_songs=True   # Include Liked Songs as master playlist
)

print("\n📊 Sync complete!")

🔄 Starting library sync...
✅ All playlists up to date!
✅ Sync complete! Checked 327 playlists, updated 0, added 0 track entries

📊 Sync complete!
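
One common way to implement an incremental sync like the one above is to compare each playlist's `snapshot_id` against the value stored from the previous run. The Spotify Web API returns this field on playlist objects, and it changes whenever the playlist is modified. The sketch below illustrates that idea only: whether `sf.sync()` works this way is an assumption, and `playlists_to_refresh` is a hypothetical helper, not part of spotim8.

```python
def playlists_to_refresh(remote, cached_snapshots):
    """Return IDs of playlists that are new or whose snapshot_id has changed.

    remote: list of dicts with "id" and "snapshot_id" keys (as in Spotify playlist objects)
    cached_snapshots: dict mapping playlist ID -> snapshot_id seen on the last sync
    """
    stale = []
    for playlist in remote:
        playlist_id = playlist["id"]
        # A missing entry (new playlist) or a different snapshot both need a refetch.
        if cached_snapshots.get(playlist_id) != playlist["snapshot_id"]:
            stale.append(playlist_id)
    return stale


# Made-up IDs and snapshots, for illustration only.
cached = {"p1": "snapA", "p2": "snapB"}
remote = [
    {"id": "p1", "snapshot_id": "snapA"},  # unchanged
    {"id": "p2", "snapshot_id": "snapC"},  # modified since last sync
    {"id": "p3", "snapshot_id": "snapD"},  # new playlist
]

print(playlists_to_refresh(remote, cached))  # ['p2', 'p3']
```

When the returned list is empty, no track listings need to be downloaded, which would match the "All playlists up to date!" output above.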


## 4️⃣ Build Full Data Tables

Now let's build all the detailed tables (tracks, artists, etc.).

In [17]:
# Fetch all data tables (uses cache if available)
print("📥 Building data tables...\n")

playlists = sf.playlists()
print(f"✅ Playlists: {len(playlists):,}")

playlist_tracks = sf.playlist_tracks()
print(f"✅ Playlist-track links: {len(playlist_tracks):,}")

tracks = sf.tracks()
print(f"✅ Unique tracks: {len(tracks):,}")

track_artists = sf.track_artists()
print(f"✅ Track-artist links: {len(track_artists):,}")

artists = sf.artists()
print(f"✅ Artists: {len(artists):,}")

# Build the wide table (everything joined)
library = sf.library_wide()
print(f"✅ Library wide table: {len(library):,} rows")

📥 Building data tables...

✅ Playlists: 757
✅ Playlist-track links: 44,102
✅ Unique tracks: 5,270
✅ Track-artist links: 8,504
✅ Artists: 2,610
✅ Library wide table: 44,347 rows
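
To make "everything joined" concrete, here is a minimal pandas sketch of how a link table can be merged with its lookup tables into one wide table. The column names (`playlist_id`, `track_id`, `playlist_name`, `track_name`) and the sample rows are assumptions for illustration. The actual schema and join logic of `sf.library_wide()` are not shown in this notebook.

```python
import pandas as pd

# Tiny stand-ins for the real tables; all IDs and names are made up.
playlists = pd.DataFrame({
    "playlist_id": ["p1", "p2"],
    "playlist_name": ["Playlist A", "Playlist B"],
})
tracks = pd.DataFrame({
    "track_id": ["t1", "t2"],
    "track_name": ["Track One", "Track Two"],
})
# One row per (playlist, track) membership.
playlist_tracks = pd.DataFrame({
    "playlist_id": ["p1", "p1", "p2"],
    "track_id": ["t1", "t2", "t1"],
})

# Start from the link table and attach playlist and track details with left joins,
# so every membership row is kept even if a lookup is missing.
wide = (
    playlist_tracks
    .merge(playlists, on="playlist_id", how="left")
    .merge(tracks, on="track_id", how="left")
)

print(wide)
```

Starting from the link table keeps one row per playlist-track pair, so a track that appears in several playlists appears several times in the wide table.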


## 5️⃣ View Your Data

In [18]:
# Show status summary
sf.print_status()


        SPOTIM8 DATA STATUS
📁 Cache directory: /Users/aryamaan/Desktop/Projects/spotim8/data
👤 User: 31iol2qamank24owygxo7kpq533y
🕐 Last sync: 2025-12-24T21:58:44.934198+00:00

📊 Cached data:
   • Playlists: 757
   • Playlist tracks: 44,102
   • Unique tracks: 5,270
   • Track-artist links: 8,504
   • Artists: 2,610



In [19]:
# Preview playlists
print("📂 Your Playlists:")
playlists[["name", "track_count", "is_liked_songs", "is_owned"]].head(15)

📂 Your Playlists:

|  | name | track_count | is_liked_songs | is_owned |
| --- | --- | --- | --- | --- |
| 0 | ❤️ Liked Songs | 5115 | True | True |
| 1 | OtherDec25 | 27 | False | True |
| 2 | DanceDec25 | 6 | False | True |
| 3 | HipHopDec25 | 15 | False | True |
| 4 | Dec25 | 47 | False | True |
| 5 | AJamLatin | 26 | False | True |
| 6 | AJamIndie | 121 | False | True |
| 7 | AJamRock | 125 | False | True |
| 8 | AJamPop | 348 | False | True |
| 9 | AJamR&B/Soul | 357 | False | True |


In [20]:
# Preview tracks
print("🎵 Sample Tracks:")
tracks[["name", "album_name", "popularity", "duration_ms"]].head(10)

🎵 Sample Tracks:

|  | name | album_name | popularity | duration_ms |
| --- | --- | --- | --- | --- |
| 0 | Figaro | Madvillainy | 59 | 145706 |
| 1 | Meat Grinder | Madvillainy | 64 | 131866 |
| 2 | Rhymes Like Dimes | Operation: Doomsday (Complete) | 66 | 258613 |
| 3 | Rapp Snitch Knishes | MM..FOOD | 74 | 172893 |
| 4 | All Caps | Madvillainy | 70 | 130479 |
| 5 | Jeep (feat. Terror Reid) | I’m Not Supposed To Be Here | 58 | 120896 |
| 6 | Cannonball (feat. Don Toliver) | Euphoria | 72 | 122568 |
| 7 | Weak | Promised Land | 67 | 202560 |
| 8 | Jungle | USB | 62 | 198805 |
| 9 | Beto’s Horns - fred remix | USB | 64 | 226222 |


In [21]:
# Preview artists
print("🎤 Top Artists (by followers):")
artists.nlargest(10, "followers")[["name", "genres", "popularity", "followers"]]

🎤 Top Artists (by followers):

|  | name | genres | popularity | followers |
| --- | --- | --- | --- | --- |
| 1160 | Arijit Singh | [hindi pop, bollywood, desi, bangla pop] | 92 | 168800957 |
| 304 | Taylor Swift | [] | 100 | 148493497 |
| 786 | Ed Sheeran | [soft pop] | 90 | 123881028 |
| 161 | Billie Eilish | [] | 93 | 121012260 |
| 575 | The Weeknd | [] | 96 | 115913722 |
| 254 | Ariana Grande | [pop] | 95 | 108576516 |
| 636 | Eminem | [rap, hip hop] | 91 | 106037116 |
| 1097 | Bad Bunny | [reggaeton, trap latino, urbano latino, latin] | 98 | 105279863 |
| 77 | Drake | [rap] | 98 | 105142837 |
| 783 | Justin Bieber | [] | 94 | 85978354 |


## 6️⃣ Check Saved Files

In [22]:
# List saved files
print(f"📁 Files in {DATA_DIR}:\n")
for f in sorted(DATA_DIR.glob("*.parquet")):
    size_kb = f.stat().st_size / 1024
    print(f"   {f.name:30} {size_kb:>8.1f} KB")

📁 Files in /Users/aryamaan/Desktop/Projects/spotim8/data:

   artists.parquet                   198.0 KB
   library_wide.parquet             1969.4 KB
   playlist_tracks.parquet           624.1 KB
   playlists.parquet                 124.3 KB
   track_artists.parquet             217.0 KB
   tracks.parquet                    610.8 KB


---

## ✅ Done!

Your library is now saved locally. Next steps:

1. **Analyze**: Open `02_analyze_library.ipynb` for visualizations
2. **Playlist Analysis**: Open `03_playlist_analysis.ipynb` for genre clustering
3. **Re-sync**: Run this notebook again anytime to fetch new changes

The data is cached, so future runs are fast!