# Chromatic Mood of Fashion Eras

**Goal:** Analyze how color palettes in Vogue runway images change over time using the Sanzo Wada color dataset.

This notebook provides a complete workflow for analyzing the chromatic evolution of fashion from the late 1980s to the present.

## 1. Setup and Imports

In [7]:
import sys
!{sys.executable} -m pip install opencv-python

Collecting opencv-python
  Using cached opencv_python-4.12.0.88-cp37-abi3-macosx_13_0_arm64.whl.metadata (19 kB)
Using cached opencv_python-4.12.0.88-cp37-abi3-macosx_13_0_arm64.whl (37.9 MB)
Installing collected packages: opencv-python
Successfully installed opencv-python-4.12.0.88


In [1]:
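import sys

# Install into the kernel's own interpreter; a bare `!pip` can target a
# different Python, which leaves packages unimportable from the notebook.
# pathlib is part of the standard library and needs no install.
!{sys.executable} -m pip install numpy pandas tqdm matplotlib scikit-learn pillow opencv-python plotly colormath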



In [2]:
import numpy as np
import pandas as pd
from pathlib import Path
import warnings
warnings.filterwarnings("ignore")

from tqdm.auto import tqdm
import matplotlib.pyplot as plt

# Import our custom modules
from chromatic_utils import (
    extract_dominant_colors_lab,
    calculate_color_statistics,
    load_sanzo_wada_palettes,
    find_closest_wada_palette,
    aggregate_by_year,
    aggregate_by_decade,
    get_palette_frequency_by_year,
    get_dominant_palette_per_decade,
    compute_decade_color_distance,
    analyze_by_designer,
    analyze_by_season,
    plot_temporal_trends,
    plot_palette_heatmap,
    plot_color_diversity,
    plot_decade_color_strips,
    plot_top_palettes,
    plot_lab_distribution,
    plot_seasonal_comparison,
    create_summary_visualization
)

print("✓ All libraries loaded successfully")

ModuleNotFoundError: No module named 'colormath'

## 2. Configuration

In [None]:
# Paths
DATA_CSV = Path("vogue_dataset_output/vogue_runway_merged_30k.csv")
OUTPUT_DIR = Path("chromatic_analysis_output")
OUTPUT_DIR.mkdir(exist_ok=True)

# Processing parameters
IMAGE_RESIZE = 256  # Resize for faster processing
N_CLUSTERS = 6      # Dominant colors per image
SAMPLE_SIZE = None  # Set to int (e.g. 1000) for testing

# Analysis parameters
MIN_YEAR = 1988
MAX_YEAR = 2025

print(f"Output directory: {OUTPUT_DIR.absolute()}")

## 3. Load Data

In [None]:
df = pd.read_csv(DATA_CSV)
print(f"Loaded {len(df):,} images")

# Filter and clean
df = df[(df["year"] >= MIN_YEAR) & (df["year"] <= MAX_YEAR)].copy()
df = df[df["has_image"] == True].copy()
df = df.dropna(subset=["image_path", "year"])

if SAMPLE_SIZE:
    df = df.sample(n=min(SAMPLE_SIZE, len(df)), random_state=42)

df = df.reset_index(drop=True)
print(f"Filtered: {len(df):,} images from {df['year'].min()} to {df['year'].max()}")
df.head()

## 4. Load Sanzo Wada Palettes

In [None]:
df_sanzo = load_sanzo_wada_palettes()
df_sanzo.head(10)

## 5. Extract Colors and Map to Palettes

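This step runs K-Means on each image to extract its `N_CLUSTERS` dominant colors in LAB space, then maps those colors to the closest Sanzo Wada palette.  
**Note:** This may take several minutes for large datasets; set `SAMPLE_SIZE` in the configuration above for a quick test run.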
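
The extraction itself lives in `chromatic_utils`, which is not shown here. As a rough illustration of the idea only, the sketch below converts sRGB pixels to CIELAB and clusters them with a plain NumPy k-means. The names `srgb_to_lab` and `dominant_colors_lab_sketch` are hypothetical, the farthest-point initialisation is a choice made for this example, and the real `extract_dominant_colors_lab` may load, resize, and cluster images differently (for instance with scikit-learn and OpenCV). Only the returned keys `colors_lab` and `proportions` mirror what the loop below reads.

```python
import numpy as np

# Standard sRGB (D65) to XYZ matrix and D65 reference white
RGB_TO_XYZ = np.array([
    [0.4124564, 0.3575761, 0.1804375],
    [0.2126729, 0.7151522, 0.0721750],
    [0.0193339, 0.1191920, 0.9503041],
])
D65_WHITE = np.array([0.95047, 1.0, 1.08883])


def srgb_to_lab(rgb):
    """Convert an (N, 3) array of 8-bit sRGB values to CIELAB (D65)."""
    c = np.asarray(rgb, dtype=float) / 255.0
    # Undo the sRGB transfer curve to get linear light
    linear = np.where(c <= 0.04045, c / 12.92, ((c + 0.055) / 1.055) ** 2.4)
    xyz = (linear @ RGB_TO_XYZ.T) / D65_WHITE
    delta = 6 / 29
    f = np.where(xyz > delta ** 3, np.cbrt(xyz), xyz / (3 * delta ** 2) + 4 / 29)
    return np.stack(
        [116 * f[:, 1] - 16, 500 * (f[:, 0] - f[:, 1]), 200 * (f[:, 1] - f[:, 2])],
        axis=1,
    )


def dominant_colors_lab_sketch(image_rgb, n_colors=6, n_iter=20, seed=0):
    """Cluster an RGB image's pixels in LAB space with plain k-means (hypothetical helper)."""
    points = srgb_to_lab(np.asarray(image_rgb).reshape(-1, 3))
    rng = np.random.default_rng(seed)

    # Farthest-point initialisation: start from a random pixel, then keep
    # adding the pixel farthest from every center chosen so far.
    centers = [points[rng.integers(len(points))]]
    for _ in range(1, n_colors):
        dist = np.linalg.norm(points[:, None] - np.array(centers)[None], axis=2)
        centers.append(points[dist.min(axis=1).argmax()])
    centers = np.array(centers, dtype=float)

    for _ in range(n_iter):
        dist = np.linalg.norm(points[:, None] - centers[None], axis=2)
        labels = dist.argmin(axis=1)
        for j in range(n_colors):
            members = points[labels == j]
            if len(members):  # leave empty clusters where they are
                centers[j] = members.mean(axis=0)

    # Final assignment and each cluster's share of the pixels
    labels = np.linalg.norm(points[:, None] - centers[None], axis=2).argmin(axis=1)
    proportions = np.bincount(labels, minlength=n_colors) / len(points)
    order = np.argsort(-proportions)
    return {"colors_lab": centers[order], "proportions": proportions[order]}
```

Clustering in LAB rather than RGB means Euclidean distance between pixels tracks perceived color difference more closely, which is presumably why the workflow uses it.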

In [None]:
results = []

print(f"Processing {len(df)} images...")

for idx, row in tqdm(df.iterrows(), total=len(df), desc="Extracting colors"):
    image_path = row["image_path"]
    
    # Extract colors
    color_data = extract_dominant_colors_lab(image_path, n_colors=N_CLUSTERS, target_size=IMAGE_RESIZE)
    if color_data is None:
        continue
    
    # Map to Sanzo Wada
    wada_match = find_closest_wada_palette(
        color_data["colors_lab"],
        df_sanzo,
        weights=color_data["proportions"]
    )
    
    # Calculate statistics
    stats = calculate_color_statistics(color_data["colors_lab"], color_data["proportions"])
    
    results.append({
        "key": row["key"],
        "designer": row["designer"],
        "year": row["year"],
        "season": row["season"],
        "category": row["category"],
        "image_path": image_path,
        "palette_id": wada_match["palette_id"],
        "palette_name": wada_match["palette_name"],
        "palette_distance": wada_match["mean_distance"],
        **stats
    })

df_results = pd.DataFrame(results)
print(f"✓ Processed {len(df_results):,} of {len(df):,} images ({len(df) - len(df_results):,} skipped)")

# Save results
df_results.to_csv(OUTPUT_DIR / "color_analysis_results.csv", index=False)
df_results.head()
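
How `find_closest_wada_palette` scores a match is not shown in this notebook. One plausible reading of its arguments and of the `mean_distance` field is a proportion-weighted mean of each image color's distance to its nearest palette color. The sketch below implements that reading under stated assumptions: `closest_palette_sketch` is a hypothetical name, palettes are passed as a plain dict of LAB arrays rather than the `df_sanzo` DataFrame, and distance is plain Euclidean distance in LAB (CIE76), whereas the real helper may use a different color-difference formula.

```python
import numpy as np


def closest_palette_sketch(colors_lab, palettes, weights=None):
    """Pick the palette with the smallest weighted mean LAB distance (hypothetical helper).

    colors_lab: (N, 3) array of image colors in LAB.
    palettes:   dict mapping palette_id -> (M, 3) array of LAB colors (assumed layout).
    weights:    optional per-color weights, e.g. cluster proportions.
    """
    colors_lab = np.asarray(colors_lab, dtype=float)
    if weights is None:
        weights = np.ones(len(colors_lab))
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()

    best = None
    for palette_id, palette_lab in palettes.items():
        palette_lab = np.asarray(palette_lab, dtype=float)
        # Pairwise distances: rows are image colors, columns are palette colors
        dist = np.linalg.norm(colors_lab[:, None, :] - palette_lab[None, :, :], axis=2)
        # Each image color is scored against its nearest palette color
        mean_distance = float(np.sum(weights * dist.min(axis=1)))
        if best is None or mean_distance < best["mean_distance"]:
            best = {"palette_id": palette_id, "mean_distance": mean_distance}
    return best
```

Weighting by proportion lets a large background or garment color dominate the match, while a small accent color contributes little.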

## 6. Temporal Analysis

In [None]:
# Aggregate by year
yearly_stats = aggregate_by_year(df_results)
print("Yearly Statistics:")
print(yearly_stats.head())

# Aggregate by decade
decade_stats = aggregate_by_decade(df_results)
print("\nDecade Statistics:")
print(decade_stats)

# Palette frequencies
palette_by_year = get_palette_frequency_by_year(df_results)
top_palettes = df_results["palette_id"].value_counts().head(15).index.tolist()
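
The aggregation helpers are imported from `chromatic_utils` and their internals are not shown. A minimal sketch of decade-level aggregation with pandas might look like the following; `aggregate_by_decade_sketch` and the `mean_L` column are hypothetical names chosen for illustration, since the actual statistic columns come from `calculate_color_statistics`.

```python
import pandas as pd


def aggregate_by_decade_sketch(df, value_cols):
    """Average the given columns per decade and count images (hypothetical helper)."""
    out = df.copy()
    # 1988 -> 1980, 1994 -> 1990, 2001 -> 2000, ...
    out["decade"] = (out["year"].astype(int) // 10) * 10
    grouped = out.groupby("decade")[value_cols].mean()
    grouped["n_images"] = out.groupby("decade").size()
    return grouped.reset_index()


# Toy data with an assumed lightness column
toy = pd.DataFrame({
    "year": [1988, 1989, 1994, 2001, 2009],
    "mean_L": [40.0, 60.0, 55.0, 70.0, 50.0],
})
print(aggregate_by_decade_sketch(toy, ["mean_L"]))
```

Keeping an image count next to each mean makes it easier to spot decades whose averages rest on very few images.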

## 7. Visualizations

### 7.1 Summary Dashboard

In [None]:
create_summary_visualization(yearly_stats, OUTPUT_DIR / "summary_dashboard.png")
plt.show()

### 7.2 Temporal Trends

In [None]:
plot_temporal_trends(yearly_stats, OUTPUT_DIR / "temporal_trends.png")
plt.show()

### 7.3 Palette Frequency Heatmap

In [None]:
plot_palette_heatmap(palette_by_year, top_palettes, OUTPUT_DIR / "palette_heatmap.png")
plt.show()

### 7.4 Color Diversity

In [None]:
plot_color_diversity(yearly_stats, OUTPUT_DIR / "color_diversity.png")
plt.show()

### 7.5 Decade Color Strips

In [None]:
decade_palettes = get_dominant_palette_per_decade(df_results, df_sanzo)
plot_decade_color_strips(decade_palettes, OUTPUT_DIR / "decade_strips.png")
plt.show()

### 7.6 Top Palettes

In [None]:
plot_top_palettes(df_results, n=10, output_path=OUTPUT_DIR / "top_palettes.png")
plt.show()

### 7.7 LAB Color Distribution

In [None]:
plot_lab_distribution(df_results, OUTPUT_DIR / "lab_distribution.png")
plt.show()

## 8. Additional Analysis

### 8.1 Seasonal Analysis

In [None]:
season_stats = analyze_by_season(df_results)
print(season_stats)
plot_seasonal_comparison(season_stats, OUTPUT_DIR / "seasonal_comparison.png")
plt.show()

### 8.2 Designer Analysis

In [None]:
designer_stats = analyze_by_designer(df_results, min_images=50)
print("Top designers by image count:")
print(designer_stats.head(10))
designer_stats.to_csv(OUTPUT_DIR / "designer_analysis.csv")

### 8.3 Decade-to-Decade Color Distance

In [None]:
decade_distances = compute_decade_color_distance(df_results)
print(decade_distances)
decade_distances.to_csv(OUTPUT_DIR / "decade_distances.csv", index=False)
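
The notebook does not show how `compute_decade_color_distance` defines distance. One simple interpretation, sketched here as an assumption, is the Euclidean distance in LAB between the mean color of consecutive decades. The function name `decade_distance_sketch`, the input columns `mean_L`, `mean_a`, `mean_b`, and the output columns are all hypothetical.

```python
import numpy as np
import pandas as pd


def decade_distance_sketch(df, lab_cols=("mean_L", "mean_a", "mean_b")):
    """Distance between consecutive decades' mean LAB colors (hypothetical helper)."""
    decade = (df["year"].astype(int) // 10) * 10
    centroids = df.groupby(decade)[list(lab_cols)].mean().sort_index()
    # Row-wise difference to the previous decade, then its Euclidean length
    steps = np.linalg.norm(centroids.diff().dropna().to_numpy(), axis=1)
    return pd.DataFrame({
        "from_decade": centroids.index[:-1],
        "to_decade": centroids.index[1:],
        "delta_e": steps,
    })


# Toy data: one image per decade with assumed LAB mean columns
toy_lab = pd.DataFrame({
    "year": [1990, 2000, 2010],
    "mean_L": [50.0, 53.0, 53.0],
    "mean_a": [0.0, 4.0, 4.0],
    "mean_b": [0.0, 0.0, 12.0],
})
print(decade_distance_sketch(toy_lab))
```

A larger value between two decades would suggest a bigger shift in the average runway color, though a single centroid per decade hides any change in spread.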

## 9. Summary

This notebook has analyzed the chromatic evolution of fashion imagery by:

1. **Extracting** dominant colors using K-Means in LAB space
2. **Mapping** colors to historical Sanzo Wada palettes
3. **Analyzing** temporal trends in lightness, saturation, and diversity
4. **Visualizing** the evolution across decades
5. **Comparing** seasonal and designer-specific color preferences

All results have been saved to the output directory.

In [None]:
print(f"Analysis complete! Results saved to: {OUTPUT_DIR.absolute()}")
print("\nOutput files:")
for f in sorted(OUTPUT_DIR.glob("*")):
    print(f"  - {f.name}")