# DJ Song Mixing Recommendation System

**Team:** Ashley Wu, Bonny Koo, Nathan Suh, Leo Lee  
**Course:** CS Machine Learning - UVA Fall 2025

## Project Overview

This notebook implements and compares three recommendation systems for DJ song mixing:
1. **Rule-Based System**: Hard constraints on BPM (±6) and key compatibility
2. **Audio Similarity Baseline**: Content-based filtering using audio features
3. **Hybrid ML System**: XGBoost combining rules + learned patterns

**Hypothesis:** A hybrid system combining DJ mixing rules with ML outperforms both pure rule-based and pure similarity approaches.

## 1. Setup and Data Loading

In [1]:
# Import libraries
import warnings

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split

warnings.filterwarnings('ignore')

# seaborn is only used for plot styling; fall back to matplotlib defaults if it is missing
try:
    import seaborn as sns
    sns.set_style('whitegrid')
except ModuleNotFoundError:
    print("seaborn not installed (pip install seaborn); using default matplotlib style")

plt.rcParams['figure.figsize'] = (12, 6)

print("Libraries imported successfully!")

In [None]:
# Import our custom modules
from data_preprocessing import prepare_dataset, load_spotify_data, clean_data, add_camelot_notation
from model_rule_based import RuleBasedDJRecommender
from model_audio_similarity import AudioSimilarityRecommender
from model_hybrid_ml import HybridMLRecommender
from utils import get_camelot_notation, get_compatible_keys, calculate_bpm_distance

print("Custom modules imported!")

In [None]:
# Load and preprocess data
# TODO: Update this path to your Spotify dataset location
DATA_PATH = 'data/spotify_data.csv'

print("Loading and preprocessing dataset...")
df = prepare_dataset(DATA_PATH)

print(f"\nDataset loaded: {len(df)} songs")
print(f"Columns: {list(df.columns)}")

# Display first few rows
display_cols = ['name', 'artists', 'tempo', 'camelot', 'energy', 'valence'] if 'name' in df.columns else ['tempo', 'camelot', 'energy', 'valence']
df[display_cols].head()
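
The `camelot` column shown above comes from the project's preprocessing code, which is not reproduced in this notebook. As a rough sketch of what such a conversion involves, assuming the dataset follows Spotify's conventions (`key` as a pitch class 0-11 with C = 0, `-1` when no key was detected, and `mode` as 1 for major and 0 for minor), the lookup below maps a key/mode pair to Camelot notation. The name `to_camelot` is hypothetical and is not the project's `get_camelot_notation`.

```python
# Hypothetical helper: map a Spotify-style (key, mode) pair to a Camelot code.
# Assumes key is a pitch class (0 = C, 1 = C#/Db, ..., 11 = B) and mode is 1 (major) or 0 (minor).
CAMELOT_MAJOR = {0: '8B', 1: '3B', 2: '10B', 3: '5B', 4: '12B', 5: '7B',
                 6: '2B', 7: '9B', 8: '4B', 9: '11B', 10: '6B', 11: '1B'}
CAMELOT_MINOR = {0: '5A', 1: '12A', 2: '7A', 3: '2A', 4: '9A', 5: '4A',
                 6: '11A', 7: '6A', 8: '1A', 9: '8A', 10: '3A', 11: '10A'}


def to_camelot(key, mode):
    """Return the Camelot code for a pitch class and mode, or None if the key is unknown."""
    if key not in CAMELOT_MAJOR:
        return None  # e.g. key == -1 when no key was detected
    table = CAMELOT_MAJOR if mode == 1 else CAMELOT_MINOR
    return table[key]


print(to_camelot(0, 1))   # C major
print(to_camelot(9, 0))   # A minor, the relative minor of C major
```

Relative major and minor keys share a Camelot number, which is why C major and A minor map to `8B` and `8A`.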

## 2. Exploratory Data Analysis

In [None]:
# BPM distribution
fig, axes = plt.subplots(1, 2, figsize=(14, 5))

# Histogram
axes[0].hist(df['tempo'], bins=50, edgecolor='black', alpha=0.7)
axes[0].axvline(df['tempo'].mean(), color='red', linestyle='--', label=f'Mean: {df["tempo"].mean():.1f}')
axes[0].set_xlabel('BPM')
axes[0].set_ylabel('Frequency')
axes[0].set_title('BPM Distribution')
axes[0].legend()

# Box plot by key
df.boxplot(column='tempo', by='camelot', ax=axes[1], rot=45)
fig.suptitle('')  # remove the automatic "Boxplot grouped by camelot" title pandas adds
axes[1].set_title('BPM Distribution by Key')
axes[1].set_xlabel('Camelot Key')
axes[1].set_ylabel('BPM')

plt.tight_layout()
plt.show()

print("BPM Statistics:")
print(f"  Mean: {df['tempo'].mean():.1f}")
print(f"  Median: {df['tempo'].median():.1f}")
print(f"  Std: {df['tempo'].std():.1f}")

In [None]:
# Energy vs Valence scatter plot
plt.figure(figsize=(10, 6))
# Draw the points once, colored by tempo, instead of plotting the same data twice
points = plt.scatter(df['energy'], df['valence'], c=df['tempo'], cmap='viridis', alpha=0.5, s=20)
plt.colorbar(points, label='BPM')
plt.xlabel('Energy')
plt.ylabel('Valence (Musical Positivity)')
plt.title('Energy vs Valence Distribution (Colored by BPM)')
plt.show()

In [None]:
# Key distribution (Camelot Wheel)
key_counts = df['camelot'].value_counts().sort_index()

plt.figure(figsize=(12, 5))
key_counts.plot(kind='bar', edgecolor='black')
plt.xlabel('Camelot Key')
plt.ylabel('Number of Songs')
plt.title('Distribution of Musical Keys (Camelot Notation)')
plt.xticks(rotation=45)
plt.tight_layout()
plt.show()

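# key_counts is sorted by key label above, so re-sort by count to get the most common keys
print("\nMost common keys:")
print(key_counts.sort_values(ascending=False).head())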

## 3. Model 1: Rule-Based DJ System

Traditional DJ approach using:
- BPM matching (±6 BPM tolerance)
- Key compatibility (Camelot Wheel)
- Energy flow scoring
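
The first two rules in the list above can be sketched in a few lines. This is a minimal illustration, not the code in `model_rule_based`: it assumes the common Camelot convention that a key mixes with itself, with the adjacent numbers on the same ring, and with the same number on the other ring. The function names `camelot_neighbors` and `is_mixable` are hypothetical.

```python
def camelot_neighbors(code):
    """Return the Camelot codes conventionally treated as compatible with `code`."""
    number, letter = int(code[:-1]), code[-1]
    up = number % 12 + 1            # one step clockwise (12 wraps to 1)
    down = (number - 2) % 12 + 1    # one step counter-clockwise (1 wraps to 12)
    other = 'A' if letter == 'B' else 'B'
    return {code, f"{up}{letter}", f"{down}{letter}", f"{number}{other}"}


def is_mixable(bpm_a, key_a, bpm_b, key_b, bpm_tolerance=6):
    """Hard-constraint check: tempo within tolerance AND key on a neighboring Camelot slot."""
    bpm_ok = abs(bpm_a - bpm_b) <= bpm_tolerance
    key_ok = key_b in camelot_neighbors(key_a)
    return bpm_ok and key_ok


print(sorted(camelot_neighbors('8A')))
print(is_mixable(124, '8A', 128, '9A'))   # within 6 BPM, adjacent key
print(is_mixable(124, '8A', 132, '8A'))   # same key, but 8 BPM apart
```

Because both checks are hard constraints, every candidate that passes is compatible by construction, which is why this model scores perfectly on the BPM and key metrics it is later evaluated on.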

In [None]:
# Initialize Rule-Based Recommender
rule_based = RuleBasedDJRecommender(df, bpm_tolerance=6)

# Test on a sample song
test_idx = 0  # Change this to test different songs
print(f"\nTesting Rule-Based System on song index {test_idx}:")
rule_recommendations = rule_based.recommend(test_idx, n_recommendations=10, verbose=True)

In [None]:
# Evaluate Rule-Based System on test set
np.random.seed(42)
test_indices = np.random.choice(len(df), min(50, len(df)), replace=False)

print("Evaluating Rule-Based System...")
rule_metrics = rule_based.batch_evaluate(test_indices, n_recommendations=10)

print("\nRule-Based System Metrics:")
for key, value in rule_metrics.items():
    if 'rate' in key or 'score' in key:
        print(f"  {key}: {value:.2%}")
    else:
        print(f"  {key}: {value}")

## 4. Model 2: Audio Similarity Baseline

Content-based filtering using:
- Cosine similarity on audio features
- No BPM/key constraints
- Tests if pure audio similarity works
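
A minimal sketch of cosine-similarity ranking is shown below. It is not the `AudioSimilarityRecommender` implementation; in particular, standardizing each feature column first is an assumption made here because `loudness` (in dB) sits on a very different scale from the 0-1 features. The function name `top_k_cosine` is hypothetical.

```python
import numpy as np


def top_k_cosine(features, query_idx, k=3):
    """Rank rows of `features` by cosine similarity to the row at `query_idx`."""
    X = np.asarray(features, dtype=float)

    # Standardize columns so no single feature dominates (assumed preprocessing step)
    std = X.std(axis=0)
    std[std == 0] = 1.0
    Z = (X - X.mean(axis=0)) / std

    # Normalize rows to unit length; the dot product of unit vectors is the cosine
    norms = np.linalg.norm(Z, axis=1)
    norms[norms == 0] = 1.0
    U = Z / norms[:, None]

    sims = U @ U[query_idx]
    sims[query_idx] = -np.inf          # never recommend the query song itself
    order = np.argsort(-sims)[:k]
    return [(int(i), float(sims[i])) for i in order]


# Toy rows: [energy, valence, loudness]
toy = [[0.90, 0.80, -5.0],
       [0.85, 0.75, -5.5],
       [0.10, 0.20, -20.0],
       [0.20, 0.10, -18.0]]
print(top_k_cosine(toy, query_idx=0, k=2))
```

Nothing in this ranking looks at tempo or key, so a high similarity score says nothing about whether two tracks can be beat-matched.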

In [None]:
# Initialize Audio Similarity Recommender
audio_features = ['energy', 'valence', 'danceability', 'acousticness', 'instrumentalness', 'loudness']
audio_sim = AudioSimilarityRecommender(df, features=audio_features)

# Test on same song
print(f"\nTesting Audio Similarity System on song index {test_idx}:")
audio_recommendations = audio_sim.recommend(test_idx, n_recommendations=10, verbose=True)

In [None]:
# Evaluate Audio Similarity System
print("Evaluating Audio Similarity System...")
audio_metrics = audio_sim.batch_evaluate(test_indices, n_recommendations=10)

print("\nAudio Similarity Metrics:")
for key, value in audio_metrics.items():
    if 'rate' in key or 'score' in key:
        print(f"  {key}: {value:.2%}")
    else:
        print(f"  {key}: {value}")

## 5. Model 3: Hybrid ML System

XGBoost model combining:
- BPM features (40% weight)
- Key compatibility (30% weight)
- Energy/audio features (30% weight)
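
The notebook does not show how the 40/30/30 weights enter the model. One plausible reading, and purely an assumption here, is that they define a target compatibility score for each training pair, which XGBoost then learns to predict from pair features. The sketch below illustrates only that weighted combination; `weighted_compatibility` and its inputs are hypothetical names, and each component score is assumed to lie in [0, 1].

```python
def weighted_compatibility(bpm_score, key_score, energy_score, weights=(0.4, 0.3, 0.3)):
    """Combine three component scores in [0, 1] into one score using fixed weights."""
    w_bpm, w_key, w_energy = weights
    if abs(w_bpm + w_key + w_energy - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return w_bpm * bpm_score + w_key * key_score + w_energy * energy_score


# A pair with a perfect tempo match, compatible key, and mediocre energy match
print(weighted_compatibility(1.0, 1.0, 0.5))
```

If the weights are used this way, the learned feature importances should be read with care: the model may largely be recovering the weights that were built into its own labels.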

In [None]:
# Initialize and train Hybrid ML Recommender
hybrid_ml = HybridMLRecommender(df)

print("Training Hybrid ML System...")
print("(This may take a few minutes)")
training_metrics = hybrid_ml.train(n_pairs=10000, test_size=0.2)

In [None]:
# Visualize feature importance
importance_df = training_metrics['feature_importance']

plt.figure(figsize=(10, 6))
plt.barh(importance_df['feature'], importance_df['importance'])
plt.xlabel('Importance')
plt.title('XGBoost Feature Importance')
plt.tight_layout()
plt.show()

print("\nFeature Importance:")
print(importance_df)

In [None]:
# Test Hybrid ML System
print(f"\nTesting Hybrid ML System on song index {test_idx}:")
hybrid_recommendations = hybrid_ml.recommend(test_idx, n_recommendations=10, verbose=True)

## 6. Model Comparison

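Compare the models on the same test set. The table below covers the rule-based and audio similarity systems only; the hybrid model is not included because no batch evaluation is run for it in this notebook. The `Avg Score` column reports each model's own score (`avg_overall_score` vs. `avg_similarity_score`), so it should not be compared across rows.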
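
For reference, a compatibility rate of the kind compared here can be computed as the fraction of recommended songs that satisfy a rule. The sketch below shows this for BPM; it is an assumption about what `bpm_compatibility_rate` measures, since `batch_evaluate` is not shown in this notebook, and the function name is hypothetical.

```python
def bpm_compatibility_rate(source_bpm, recommended_bpms, bpm_tolerance=6):
    """Fraction of recommendations whose tempo is within the tolerance of the source song."""
    if not recommended_bpms:
        return 0.0
    hits = sum(abs(source_bpm - bpm) <= bpm_tolerance for bpm in recommended_bpms)
    return hits / len(recommended_bpms)


# Two of the four recommendations fall within 6 BPM of the 120 BPM source
print(bpm_compatibility_rate(120, [118, 126, 127, 140]))
```

Averaging this rate over all test songs gives one number per model that can be compared directly, unlike each model's internal score.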

In [None]:
# Comprehensive comparison
comparison_df = pd.DataFrame([
    {
        'Model': 'Rule-Based',
        'BPM Compatibility': rule_metrics['bpm_compatibility_rate'],
        'Key Compatibility': rule_metrics['key_compatibility_rate'],
        'Avg Score': rule_metrics['avg_overall_score']
    },
    {
        'Model': 'Audio Similarity',
        'BPM Compatibility': audio_metrics['bpm_compatibility_rate'],
        'Key Compatibility': audio_metrics['key_compatibility_rate'],
        'Avg Score': audio_metrics['avg_similarity_score']
    },
])

print("\nModel Comparison:")
print(comparison_df.to_string(index=False))

# Visualization
fig, axes = plt.subplots(1, 2, figsize=(14, 5))

# BPM Compatibility
comparison_df.plot(x='Model', y='BPM Compatibility', kind='bar', ax=axes[0], legend=False)
axes[0].set_ylabel('Compatibility Rate')
axes[0].set_title('BPM Compatibility by Model')
axes[0].set_ylim([0, 1])

# Key Compatibility
comparison_df.plot(x='Model', y='Key Compatibility', kind='bar', ax=axes[1], legend=False, color='orange')
axes[1].set_ylabel('Compatibility Rate')
axes[1].set_title('Key Compatibility by Model')
axes[1].set_ylim([0, 1])

plt.tight_layout()
plt.show()

## 7. Analysis & Insights

In [None]:
# Test BPM tolerance sensitivity
print("Testing BPM Tolerance Sensitivity:")
print("="*60)

tolerances = [4, 6, 8, 10]
tolerance_results = []

sample_indices = test_indices[:10]

for tol in tolerances:
    rb = RuleBasedDJRecommender(df, bpm_tolerance=tol)
    metrics = rb.batch_evaluate(sample_indices, n_recommendations=10)
    # Average over the sampled songs instead of reporting the count for song 0 only
    avg_recs = np.mean([len(rb.recommend(idx, verbose=False)) for idx in sample_indices])
    tolerance_results.append({
        'Tolerance': f'±{tol}',
        'Avg Recommendations': avg_recs,
        'BPM Compat': metrics['bpm_compatibility_rate'],
        'Key Compat': metrics['key_compatibility_rate']
    })

tolerance_df = pd.DataFrame(tolerance_results)
print(tolerance_df.to_string(index=False))

plt.figure(figsize=(10, 5))
plt.plot(tolerances, tolerance_df['Avg Recommendations'], marker='o')
plt.xlabel('BPM Tolerance')
plt.ylabel('Avg Number of Recommendations')
plt.title('Impact of BPM Tolerance on Recommendation Pool Size')
plt.grid(True)
plt.show()
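
One caveat for the sweep above: if `recommend` truncates its output to a fixed number of results (an assumption, since the method is not shown here), the length of its return value under-reports the true candidate pool. A direct count of songs within the tolerance avoids that. The sketch below uses made-up tempos and a hypothetical function name, `pool_sizes`, and considers BPM only.

```python
import numpy as np


def pool_sizes(tempos, source_idx, tolerances):
    """Count songs within each BPM tolerance of the source song, excluding the source itself."""
    tempos = np.asarray(tempos, dtype=float)
    diffs = np.abs(tempos - tempos[source_idx])
    return {tol: int((diffs <= tol).sum()) - 1 for tol in tolerances}


# Illustrative tempos only; not taken from the dataset
toy_tempos = [120, 123, 125, 127, 129, 131, 140]
print(pool_sizes(toy_tempos, source_idx=0, tolerances=[4, 6, 8, 10]))
```

Applying the key constraint on top of this count would shrink each pool further.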

## 8. Conclusions

### Key Findings:

1. **Rule-Based System**: Achieves 100% BPM and key compatibility by design, since both are hard constraints, but may be too rigid
2. **Audio Similarity**: Surfaces unexpected matches but gives poor DJ mixing compatibility (~X% BPM match)
3. **Hybrid ML**: [TO BE FILLED] Balances both approaches

### Research Questions Answered:

1. **Which feature matters most?** BPM compatibility is critical (see feature importance)
2. **Optimal BPM tolerance?** ±6 BPM provides good balance
3. **Does genre affect mixing?** [Analyze if genre data available]
4. **Can ML discover patterns?** Yes, through feature importance analysis

### Future Work:

- Implement full playlist generation
- Add real-time beat matching
- Test with professional DJs
- Deploy as web application

## 9. Demo: Interactive Recommendation

Try recommending songs for different input tracks!

In [None]:
# Interactive demo function
def demo_all_models(song_idx, n_recs=5):
    """
    Compare all three models for a given song
    """
    print("\n" + "="*70)
    print("COMPARING ALL THREE MODELS")
    print("="*70)
    
    # Show source song
    source = df.iloc[song_idx]
    print(f"\nSOURCE SONG (Index {song_idx}):")
    if 'name' in df.columns:
        print(f"  {source['name']} - {source.get('artists', 'Unknown')}")
    camelot = get_camelot_notation(source['key'], source['mode'])
    print(f"  BPM: {source['tempo']:.1f} | Key: {camelot} | Energy: {source['energy']:.2f}")
    print()
    
    # Model 1: Rule-Based
    print("\nMODEL 1: RULE-BASED")
    print("-"*70)
    rule_based.recommend(song_idx, n_recommendations=n_recs, verbose=True)
    
    # Model 2: Audio Similarity
    print("\nMODEL 2: AUDIO SIMILARITY")
    print("-"*70)
    audio_sim.recommend(song_idx, n_recommendations=n_recs, verbose=True)
    
    # Model 3: Hybrid ML (if trained)
    if hybrid_ml.model is not None:
        print("\nMODEL 3: HYBRID ML")
        print("-"*70)
        hybrid_ml.recommend(song_idx, n_recommendations=n_recs, verbose=True)

# Try it!
demo_all_models(song_idx=5, n_recs=5)

## 10. Save Results for Demo Video

In [None]:
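# Save comparison results (create the output directory first so to_csv does not fail)
import os
os.makedirs('results', exist_ok=True)

comparison_df.to_csv('results/model_comparison.csv', index=False)
print("Results saved to results/model_comparison.csv")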

# Save feature importance
if hybrid_ml.model is not None:
    training_metrics['feature_importance'].to_csv('results/feature_importance.csv', index=False)
    print("Feature importance saved to results/feature_importance.csv")