# 🎧 Top Monthly Recs — End‑to‑End Pipeline (Spotify + Local Recommender)

This notebook runs the **entire flow**:

1. **Export** your top 50 songs for the past month → `top_monthly_songs.csv` (via your `csv_out.py`)
2. **Preprocess** that CSV to build a local recommender model → `df_cleaned.pkl`, `cosine_sim.pkl` (via your `preprocess.py`)
3. **Create/Update** your recommendations playlist using the local model, skipping anything already saved in your Library (via your `top_monthly_recommendation.py`)

### Prereqs
- Files in the **same folder** as this notebook:
  - `csv_out.py` (exports `top_monthly_songs.csv`)
  - `preprocess.py` (reads `top_monthly_songs.csv`, writes `df_cleaned.pkl`, `cosine_sim.pkl`)
  - `top_monthly_recommendation.py` (uses the local model; **must include the saved‑tracks filtering**, which needs the `user-library-read` scope)
- `.env` file with your Spotify creds:
  - `SPOTIPY_CLIENT_ID`
  - `SPOTIPY_CLIENT_SECRET`
  - `SPOTIPY_REDIRECT_URI`

> If your export script has a different filename, update the **Settings** cell below.

In [7]:
# --- Settings ---
CSV_EXPORT_SCRIPT = "csv_out.py"                 # your script that writes top_monthly_songs.csv
PREPROCESS_SCRIPT = "preprocess.py"              # builds df_cleaned.pkl & cosine_sim.pkl
RECOMMENDER_SCRIPT = "top_monthly_recommendation.py"  # creates/updates the playlist
TOP_CSV = "top_monthly_songs.csv"                # output CSV name expected by preprocess.py

# Expected model outputs from preprocess.py
DF_PATH = "df_cleaned.pkl"
SIM_PATH = "cosine_sim.pkl"

import os, sys, subprocess

def abort(msg: str):
    print(msg)
    raise SystemExit(1)

print("📂 Working directory:", os.getcwd())
for f in [CSV_EXPORT_SCRIPT, PREPROCESS_SCRIPT, RECOMMENDER_SCRIPT]:
    if not os.path.exists(f):
        abort(f"❌ Required file not found: {f}\n   Make sure this notebook sits in the same folder as your scripts.")
print("✅ Found required scripts.")

📂 Working directory: /Users/joemay/Documents/spotipy_scripts/monthly recommend
❌ Required file not found: top_monthly_recommendation.py
   Make sure this notebook sits in the same folder as your scripts.


SystemExit: 1

In [None]:
import sys, subprocess

def pip_install(pkgs):
    print("📦 Installing:", " ".join(pkgs))
    subprocess.check_call([sys.executable, "-m", "pip", "install", *pkgs])

# Safe to re-run; pip skips packages that are already installed
pip_install(["spotipy", "python-dotenv", "pandas", "numpy", "scikit-learn", "joblib", "nltk"])

print("✅ Dependencies installed.")

In [None]:
from dotenv import load_dotenv
import os

if not os.path.exists(".env"):
    print("⚠️  .env not found in current directory. You'll be prompted to log in via browser, but having .env is recommended.")
load_dotenv()

required_env = ["SPOTIPY_CLIENT_ID", "SPOTIPY_CLIENT_SECRET", "SPOTIPY_REDIRECT_URI"]
missing = [k for k in required_env if not os.getenv(k)]
if missing:
    print("⚠️  Missing in .env:", ", ".join(missing))
    print("   You can still proceed if your scripts handle the auth flow themselves, but it's better to add them to .env.")
else:
    print("✅ .env looks good.")

## Step 1 — Export top 50 monthly tracks to CSV

In [None]:
import subprocess, sys, os

print("▶️ Running:", CSV_EXPORT_SCRIPT)
proc = subprocess.run([sys.executable, CSV_EXPORT_SCRIPT], capture_output=True, text=True)
print(proc.stdout)
if proc.returncode != 0:
    print(proc.stderr)
    raise RuntimeError("CSV export failed. See error above.")

if not os.path.exists(TOP_CSV):
    raise FileNotFoundError(f"Expected CSV not found: {TOP_CSV}")
print("✅ CSV created:", TOP_CSV)

In [None]:
import pandas as pd

print("🔎 Preview of", TOP_CSV)
df_top = pd.read_csv(TOP_CSV)
display(df_top.head(10))
print("Rows:", len(df_top))
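
For reference, the export step can be sketched as follows. This is a minimal sketch, not the actual `csv_out.py`: it assumes the script fetches tracks with spotipy's `current_user_top_tracks(limit=50, time_range="short_term")` (Spotify's roughly four-week window) and flattens the response into rows. The helper names `tracks_to_rows` and `write_rows` and the column names are hypothetical; your `preprocess.py` may expect different columns.

```python
import csv

FIELDS = ["id", "name", "artists", "album"]  # hypothetical column names

def tracks_to_rows(response):
    """Flatten a Spotify 'top tracks' response into one flat dict per track."""
    rows = []
    for item in response.get("items", []):
        rows.append({
            "id": item["id"],
            "name": item["name"],
            "artists": ", ".join(a["name"] for a in item["artists"]),
            "album": item["album"]["name"],
        })
    return rows

def write_rows(rows, path):
    """Write the rows to a CSV file with a header line."""
    with open(path, "w", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)

# In a real script the response would come from an authenticated client, e.g.
#   response = sp.current_user_top_tracks(limit=50, time_range="short_term")
# A hand-made sample stands in for it here so the sketch runs offline.
sample = {
    "items": [
        {
            "id": "abc123",
            "name": "Example Song",
            "artists": [{"name": "Artist A"}, {"name": "Artist B"}],
            "album": {"name": "Example Album"},
        }
    ]
}
rows = tracks_to_rows(sample)
```

Calling `write_rows(rows, TOP_CSV)` would then produce the file that the preview cell reads.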

## Step 2 — Build local recommender (TF‑IDF + Cosine)

In [None]:
import subprocess, sys, os

# Clean old artifacts to avoid confusion
for f in [DF_PATH, SIM_PATH]:
    if os.path.exists(f):
        os.remove(f)

print("▶️ Running:", PREPROCESS_SCRIPT)
proc = subprocess.run([sys.executable, PREPROCESS_SCRIPT], capture_output=True, text=True)
print(proc.stdout)
if proc.returncode != 0:
    print(proc.stderr)
    raise RuntimeError("Preprocess failed. See error above.")

missing = [p for p in [DF_PATH, SIM_PATH] if not os.path.exists(p)]
if missing:
    raise FileNotFoundError("Missing expected artifact(s): " + ", ".join(missing))

print("✅ Recommender artifacts created:", DF_PATH, SIM_PATH)
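
The TF‑IDF + cosine idea named in this step's heading can be illustrated with a small, dependency-free sketch. It is not the contents of `preprocess.py`, which more likely relies on scikit-learn's `TfidfVectorizer` and `cosine_similarity`; the standard library is used here only so the arithmetic is visible. The per-track text in `docs` is a hypothetical stand-in for whatever text features your script builds from the CSV.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Return one {term: weight} dict per document, using a smoothed IDF."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: the number of documents each term appears in.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: count * (math.log((1 + n_docs) / (1 + df[term])) + 1.0)
            for term, count in counts.items()
        })
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def similarity_matrix(docs):
    """Pairwise cosine similarity between the TF-IDF vectors of all documents."""
    vectors = tfidf_vectors(docs)
    return [[cosine(a, b) for b in vectors] for a in vectors]

# Hypothetical per-track text, e.g. genre or descriptor words joined together.
docs = ["indie rock guitar", "indie pop synth", "classical piano"]
sim = similarity_matrix(docs)
```

Row `i` of `sim` scores every track against track `i`; tracks that share rarer terms score higher, and tracks with no terms in common score zero. A matrix of this kind is presumably what ends up in `cosine_sim.pkl`.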

## Step 3 — Create/Update playlist with **new** (not already saved) recommendations

In [None]:
import subprocess, sys

print("▶️ Running:", RECOMMENDER_SCRIPT)
proc = subprocess.run([sys.executable, RECOMMENDER_SCRIPT], capture_output=True, text=True)
print(proc.stdout)
if proc.returncode != 0:
    print(proc.stderr)
    raise RuntimeError("Recommendation script failed. See error above.")

print("✅ Done. Check Spotify for your updated playlist!")
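
Steps 1 to 3 all repeat the same run-and-check pattern, so it could be factored into one helper. The sketch below is an optional refactor under that assumption; `run_script` and its `label` parameter are hypothetical names, not part of the existing scripts.

```python
import subprocess
import sys

def run_script(path, label):
    """Run a Python script with the current interpreter and raise if it fails."""
    print("Running:", path)
    proc = subprocess.run([sys.executable, path], capture_output=True, text=True)
    print(proc.stdout)
    if proc.returncode != 0:
        # stderr is only shown on failure, matching the cells above.
        print(proc.stderr)
        raise RuntimeError(f"{label} failed. See error above.")
    return proc
```

With it, each step reduces to a single call such as `run_script(PREPROCESS_SCRIPT, "Preprocess")`, followed by that step's own file-existence checks. Note that output is captured and printed only after the script exits, so a script that waits for interactive input would appear to hang.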

### 🔧 Troubleshooting

- **CSV not found**: Make sure `csv_out.py` writes `top_monthly_songs.csv` into the same folder as this notebook.
- **Authentication issues**: Ensure `.env` has `SPOTIPY_CLIENT_ID`, `SPOTIPY_CLIENT_SECRET`, `SPOTIPY_REDIRECT_URI`.
- **Missing artifacts after preprocess**: Confirm your `preprocess.py` actually writes `df_cleaned.pkl` and `cosine_sim.pkl` (and that it reads `top_monthly_songs.csv`).
- **Already-saved tracks showing up**: Verify you’re using the updated `top_monthly_recommendation.py` that filters against your Library (Liked Songs) and requests the `user-library-read` scope.
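
To check the saved-tracks filtering, it may help to compare your script against a sketch like the one below. It assumes the filtering is done with spotipy's `current_user_saved_tracks_contains`, which takes a list of track IDs and returns one boolean per ID; the batch size of 50 reflects the per-request limit as documented for that endpoint, so treat it as an assumption. `filter_unsaved` and `fake_contains` are hypothetical names, and the fake lookup only exists so the example runs offline.

```python
def filter_unsaved(track_ids, contains_fn, batch_size=50):
    """Return the IDs that contains_fn reports as not saved, preserving order."""
    unsaved = []
    for start in range(0, len(track_ids), batch_size):
        batch = track_ids[start:start + batch_size]
        flags = contains_fn(batch)  # one boolean per ID, True if already saved
        unsaved.extend(tid for tid, saved in zip(batch, flags) if not saved)
    return unsaved

# Stand-in for the Spotify lookup so the sketch runs without credentials.
saved_library = {"t2", "t4"}

def fake_contains(ids):
    return [tid in saved_library for tid in ids]

new_ids = filter_unsaved(["t1", "t2", "t3", "t4", "t5"], fake_contains, batch_size=2)
```

In the real script, `contains_fn` would be something like `sp.current_user_saved_tracks_contains` on a client authorized with the `user-library-read` scope.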