# Spotify Data - Exploratory Data Analysis

## Goal
Understand the dataset, identify patterns, and generate insights about music trends.

## What We'll Explore
1. Dataset overview
2. Distribution of audio features
3. Popularity analysis
4. Temporal trends
5. Genre/Playlist comparisons
6. Correlations


In [None]:
# Imports
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from pathlib import Path

# Configure plotting
# Note: the 'seaborn-v0_8-*' style names require matplotlib >= 3.6
plt.style.use('seaborn-v0_8-darkgrid')
sns.set_palette("husl")
%matplotlib inline

# Display settings
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', 100)

print("✅ Imports complete!")


## 1. Load Data


In [None]:
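# Load processed data: prefer parquet, fall back to a CSV with the same stem
data_path = Path('../data/processed/spotify_features.parquet')
csv_path = data_path.with_suffix('.csv')

if data_path.exists():
    df = pd.read_parquet(data_path)
    print(f"✅ Loaded {len(df):,} rows from parquet")
elif csv_path.exists():
    df = pd.read_csv(csv_path)
    print(f"✅ Loaded {len(df):,} rows from CSV")
else:
    # Fail with a message that names both paths (relative to the notebook's working directory)
    raise FileNotFoundError(f"Neither {data_path} nor {csv_path} was found")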

print(f"Shape: {df.shape}")
df.head()


## 2. Basic Statistics
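By default, `df.describe()` summarizes only the numeric columns, so a per-column overview of dtypes, missing values, and distinct values can be a useful companion. The following is a minimal sketch: `column_overview` is a hypothetical helper, and the demo columns (`track_name`, `popularity`, `energy`) are stand-ins, not confirmed columns of this dataset.

```python
import numpy as np
import pandas as pd


def column_overview(df):
    """Return one row per column: dtype, missing count, missing share (%), distinct values."""
    return pd.DataFrame({
        "dtype": df.dtypes.astype(str),
        "n_missing": df.isna().sum(),
        "pct_missing": (df.isna().mean() * 100).round(1),
        "n_unique": df.nunique(),  # NaN is not counted as a distinct value
    })


# Synthetic stand-in with hypothetical columns so the sketch runs on its own.
demo = pd.DataFrame({
    "track_name": ["a", "b", "c", "d"],
    "popularity": [50, 62, np.nan, 62],
    "energy": [0.8, 0.4, 0.6, 0.9],
})
overview = column_overview(demo)
print(overview)
```

In the notebook, `column_overview(df)` would be called on the loaded DataFrame instead of `demo`.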


In [None]:
df.describe()


## 3. Visualizations

Run the analysis scripts for detailed visualizations, or continue exploring here!
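
As a starting point for exploring here, the sketch below draws one histogram per audio feature. It assumes the dataset uses the standard Spotify audio-feature column names (`danceability`, `energy`, `valence`, `acousticness`); that is an assumption, and `plot_feature_distributions` is a hypothetical helper, not part of the analysis scripts. Columns that are missing from the DataFrame are skipped.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Assumed column names (standard Spotify audio features); adjust to the actual dataset.
AUDIO_FEATURES = ["danceability", "energy", "valence", "acousticness"]


def plot_feature_distributions(df, features=AUDIO_FEATURES, bins=30):
    """Draw one histogram per requested feature found in df; return (figure, plotted columns)."""
    present = [col for col in features if col in df.columns]
    if not present:
        raise ValueError("None of the requested feature columns are in the DataFrame")
    # squeeze=False keeps `axes` two-dimensional even when only one column is plotted
    fig, axes = plt.subplots(1, len(present), figsize=(4 * len(present), 3), squeeze=False)
    for ax, col in zip(axes[0], present):
        ax.hist(df[col].dropna(), bins=bins)
        ax.set_title(col)
        ax.set_xlabel(col)
        ax.set_ylabel("count")
    fig.tight_layout()
    return fig, present


# Synthetic stand-in data so the sketch runs on its own; in the notebook, pass the real df.
rng = np.random.default_rng(0)
demo = pd.DataFrame({
    "danceability": rng.uniform(0, 1, 200),
    "energy": rng.uniform(0, 1, 200),
    "valence": rng.uniform(0, 1, 200),
})
fig, plotted = plot_feature_distributions(demo)
```

Returning the list of plotted columns makes it easy to see which of the assumed feature names were actually found in the data.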
