Video Aesthetics Analysis

Automated aesthetic quality assessment of movie trailers. Aggregates per-frame neural aesthetic scores into a histogram feature vector and trains a linear SVM to distinguish high-rated from low-rated trailers — validated against IMDb audience ratings.


Key Idea

Image aesthetics (is this photo beautiful?) has been studied extensively. This project extends aesthetic analysis to video by treating a trailer as the aggregate of aesthetics across its perceptually salient key-frames:

Movie Trailer
     ↓
Scene Detection → Exemplar Frame Extraction
     ↓
Per-frame Aesthetic Score  [ILGnet / NIMA, trained on AVA2]
     ↓
16-bin Histogram of scores  [distribution over the trailer]
     ↓
Linear SVM  →  High-rated vs Low-rated trailer

The hypothesis: visually compelling trailers show a characteristic distribution of aesthetic scores — more high-quality frames, consistent composition, balanced exposure — that a simple linear classifier can detect.


Pipeline

flowchart TB
    subgraph Acquire["Data Acquisition (scripts/)"]
        IMDB["IMDb Top Lists\n(acquire_imdb_list.py)"] --> RATINGS["Movie IDs + Ratings"]
        RATINGS --> DL["Download Trailers\n(acquire_trailers.py)\nYouTube Data API v3 + yt-dlp"]
    end

    subgraph Extract["Feature Extraction (src/)"]
        DL --> SCENES["Scene Detection\nPySceneDetect content threshold=30"]
        SCENES --> FRAMES["Exemplar Frame Extraction\nscene.SceneExtractor"]
        FRAMES --> SCORE["Per-frame Aesthetic Score\nscorer.AestheticScorer  ← ILGnet/Caffe"]
    end

    subgraph Classify["Classification (src/)"]
        SCORE --> HIST["16-bin Aesthetic Histogram\nfeatures.build_aesthetic_histogram"]
        RATINGS --> LABEL["Binary Label\nIMDb > 8.2 = high, < 2.8 = low"]
        HIST --> SVM["Linear SVM  6-fold Stratified CV\nclassifier.TrailerClassifier"]
        LABEL --> SVM
        SVM --> ROC["ROC / AUC\nclassifier.CVResults"]
    end

Repository Layout

VideoAesthetics/
├── src/
│   └── video_aesthetics/           # Installable Python package
│       ├── __init__.py             # Public API exports
│       ├── _timeout.py             # Thread-based timeout decorator
│       ├── scorer.py               # AestheticScorer — lazy-loading Caffe/ILGnet wrapper
│       ├── scene.py                # SceneExtractor — PySceneDetect + pims frame extraction
│       ├── features.py             # build_aesthetic_histogram, build_dataset, CSV I/O
│       ├── classifier.py           # TrailerClassifier — linear SVM, CV, ROC/AUC
│       └── cli.py                  # video-aesthetics console entry point
├── scripts/
│   ├── acquire_imdb_list.py        # Scrape IMDb Top/Bottom/Genre lists (cinemagoer)
│   └── acquire_trailers.py         # YouTube search + yt-dlp download pipeline
├── tests/
│   ├── test_features.py            # 18 tests: histogram, CSV I/O, dataset construction
│   ├── test_classifier.py          # 11 tests: SVM CV, ROC, plot output
│   └── test_timeout.py             # 6 tests: timelimit decorator
├── util/                           # Legacy Python 2/3 scripts (reference only)
│   ├── _legacy_aesthetic_score.py
│   ├── _legacy_trailer_aesthetic_desc.py
│   ├── _legacy_find_frames.py
│   └── ...
├── aesthetic_data.csv              # Sample aesthetic histogram features
├── ROC.png                         # ROC curve from SVM evaluation
├── pyproject.toml                  # Build config (hatchling), uv deps, pytest settings
└── requirements.txt                # Legacy flat requirements (see pyproject.toml)
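
The layout above lists `_timeout.py` as a thread-based timeout decorator, and the test suite refers to it as `timelimit`. A minimal sketch of how such a decorator can work is shown below. The signature and the error type are assumptions for illustration, not the package's actual implementation.

```python
import functools
import threading


def timelimit(seconds):
    """Illustrative sketch: run the wrapped call in a worker thread and
    stop waiting for it after `seconds` (signature is an assumption)."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            outcome = {}

            def target():
                try:
                    outcome["value"] = func(*args, **kwargs)
                except BaseException as exc:  # re-raised in the caller below
                    outcome["error"] = exc

            worker = threading.Thread(target=target, daemon=True)
            worker.start()
            worker.join(seconds)
            if worker.is_alive():
                # A Python thread cannot be killed; we only stop waiting.
                raise TimeoutError(f"{func.__name__} exceeded {seconds}s")
            if "error" in outcome:
                raise outcome["error"]
            return outcome["value"]
        return wrapper
    return decorator
```

One consequence of the thread-based design is that a timed-out call keeps running in the background until it finishes on its own, so it suits bounded work such as decoding a single video rather than loops that never return.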

Installation

Requires Python 3.10+. Uses uv for dependency management.

git clone https://github.com/ashish-code/VideoAesthetics.git
cd VideoAesthetics

# Using uv (recommended)
uv sync --extra dev

# Or using pip
pip install -e ".[dev]"

For data acquisition (IMDb + YouTube + download):

uv sync --extra acquisition

For scene detection and frame extraction:

uv sync --extra scene

Quick Start

1 — Data Acquisition

# Set your YouTube Data API v3 key (never hard-code it)
export YOUTUBE_API_KEY="your-key-here"

# Scrape IMDb Top 250 movie IDs
python scripts/acquire_imdb_list.py --list top250 --output data/top250_ids.txt

# Download trailers (YouTube search → yt-dlp)
python scripts/acquire_trailers.py \
    --imdb-list data/top250_ids.txt \
    --output-dir data/trailer_video \
    --trailer-list data/trailer_ratings.txt

2 — Feature Extraction

# Set the dataset root
export VIDEO_AESTHETICS_DATA_ROOT=/path/to/data

# Extract exemplar frames via scene detection
video-aesthetics scenes --data-root "$VIDEO_AESTHETICS_DATA_ROOT"

# Score frames with ILGnet (requires Caffe + model weights)
export ILGNET_ROOT=~/Repos/ILGnet
video-aesthetics score --data-root "$VIDEO_AESTHETICS_DATA_ROOT"

# Build the aesthetic histogram feature CSV
video-aesthetics features --data-root "$VIDEO_AESTHETICS_DATA_ROOT"

3 — Classification

# Train SVM and plot ROC curve
video-aesthetics classify \
    --data-root "$VIDEO_AESTHETICS_DATA_ROOT" \
    --roc-output results/ROC.png

Python API

from pathlib import Path
from video_aesthetics.features import build_dataset
from video_aesthetics.classifier import TrailerClassifier

# Load pre-computed features
X, y, ids = build_dataset(
    rating_file=Path("data/trailer_ratings.txt"),
    score_dir=Path("data/scene_aesthetics"),
)

# Cross-validate and plot ROC
clf = TrailerClassifier(n_folds=6)
results = clf.cross_validate(X, y)
print(f"Mean AUC: {results.mean_auc:.3f} ± {results.std_auc:.3f}")
clf.plot_roc(results, output_path=Path("results/ROC.png"))

Running Tests

# Run all 35 tests
pytest

# With coverage
pytest --cov=video_aesthetics --cov-report=term-missing

Aesthetic Scoring Model

Frame-level aesthetic scores are computed using a pre-trained ILGnet model (Inception modules with connected Local and Global features) trained on AVA2, a subset of the AVA dataset (Murray et al., CVPR 2012). The model is loaded via the Caffe framework and outputs a continuous aesthetic quality score in [0, 1] for each frame.

Note: Caffe has no official Python 3.10+ support. The AestheticScorer class loads the model lazily, so importing the package does not fail when Caffe is absent. A NIMA implementation (e.g. the Keras-based idealo/image-quality-assessment, or a PyTorch port) can be substituted by subclassing AestheticScorer.
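
The lazy-loading idea in the note can be sketched as follows. `LazyScorer` and its `loader` argument are hypothetical names used only for illustration; the real AestheticScorer may be structured differently.

```python
class LazyScorer:
    """Hypothetical stand-in for a scorer that defers its heavy import."""

    def __init__(self, loader):
        self._loader = loader  # zero-argument callable that builds the model
        self._model = None     # nothing is loaded at construction time

    @property
    def model(self):
        # Build the model on first access only, then reuse it.
        if self._model is None:
            self._model = self._loader()
        return self._model

    def score(self, image_path):
        # Assumes the loaded model is callable on an image path.
        return float(self.model(image_path))
```

With this shape, the framework import (Caffe or a NIMA backend) would live inside the loader, so a missing backend only raises when the first frame is scored.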

Set the model path via environment variable:

export ILGNET_ROOT=~/Repos/ILGnet   # directory with deploy.prototxt + .caffemodel
export VIDEO_AESTHETICS_GPU=1        # optional: use GPU (requires GPU-enabled Caffe)
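
For readers wiring these variables into their own scripts, one plausible way to read them is sketched below. The helper name `read_scorer_settings` and the rule that only the value `1` enables the GPU are assumptions; the package may parse them differently.

```python
import os
from pathlib import Path


def read_scorer_settings(environ=None):
    """Hypothetical helper: return (ilgnet_root, use_gpu) from env vars."""
    env = os.environ if environ is None else environ
    root = env.get("ILGNET_ROOT")
    # Expand "~" in case the value was set without shell expansion.
    ilgnet_root = Path(root).expanduser() if root else None
    # Assumption: only the literal string "1" switches the GPU on.
    use_gpu = env.get("VIDEO_AESTHETICS_GPU", "0") == "1"
    return ilgnet_root, use_gpu
```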

Datasets

| Source | Content | API / Access |
|---|---|---|
| IMDb Top 250 | Movie ratings | cinemagoer (Python IMDb API) |
| YouTube Data API v3 | Trailer video URLs | YOUTUBE_API_KEY env var |
| yt-dlp | Video download | Open-source, no API key required |

Rating thresholds: Trailers from movies rated > 8.2 (high) or < 2.8 (low) are used for binary classification, focusing on the discriminative extremes of the rating distribution.
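
The thresholding rule can be written out as a small function. The thresholds come from the text above; the function name and the 1/0/None label encoding are assumptions made for this sketch.

```python
HIGH_THRESHOLD = 8.2  # ratings strictly above this count as "high"
LOW_THRESHOLD = 2.8   # ratings strictly below this count as "low"


def rating_to_label(rating):
    """Return 1 for high-rated, 0 for low-rated, None for the excluded middle."""
    if rating > HIGH_THRESHOLD:
        return 1
    if rating < LOW_THRESHOLD:
        return 0
    return None  # mid-range trailers are left out of the binary task
```

Dropping the middle of the rating range shrinks the dataset but leaves the classifier with two clearly separated groups.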


API Reference

AestheticScorer(ilgnet_root=None, use_gpu=False)

Lazy-loading wrapper for the ILGnet Caffe model.

  • score(image_path) → float — aesthetic quality in [0, 1].
  • score_trailer(trailer_id, ...) → Path — score all frames, write CSV.
  • process_trailer_list(rating_file, ...) — batch mode with skip-on-exist.
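
Skip-on-exist batch processing can be sketched as a filter over trailer IDs. The helper name and the `<trailer_id>.csv` output naming are assumptions for illustration, not the package's documented layout.

```python
from pathlib import Path


def pending_trailers(trailer_ids, output_dir, suffix=".csv"):
    """Yield only the IDs whose output file does not exist yet."""
    output_dir = Path(output_dir)
    for trailer_id in trailer_ids:
        if not (output_dir / f"{trailer_id}{suffix}").exists():
            yield trailer_id
```

Because finished trailers are skipped, an interrupted batch run can be restarted without rescoring everything.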

SceneExtractor(data_root, ...)

PySceneDetect-based scene boundary detection and frame extraction.

  • extract(trailer_id) — detect scenes, save one PNG per scene (with timeout).
  • process_trailer_list(rating_file) — batch mode with shuffle + skip.
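
The text does not say which frame of a scene is saved as its exemplar. The sketch below assumes the temporal midpoint, a common choice that avoids transition frames at scene boundaries; the function name and the `(start_frame, end_frame)` input format are hypothetical.

```python
def exemplar_frame_indices(scenes):
    """Pick one frame index per scene, assuming the midpoint is used.

    `scenes` is a list of (start_frame, end_frame) pairs with an
    exclusive end; empty scenes are skipped.
    """
    return [start + (end - start) // 2 for start, end in scenes if end > start]
```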

build_aesthetic_histogram(scores, num_bins=16) → np.ndarray

Convert per-frame scores to a normalised histogram feature vector.
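
A minimal sketch of this conversion, assuming equal-width bins over [0, 1] and normalisation so the bins sum to 1 (the package's exact binning and normalisation may differ):

```python
import numpy as np


def aesthetic_histogram(scores, num_bins=16):
    """Illustrative version: normalised histogram of scores in [0, 1]."""
    scores = np.asarray(scores, dtype=float)
    counts, _ = np.histogram(scores, bins=num_bins, range=(0.0, 1.0))
    total = counts.sum()
    if total == 0:
        # No frames scored: return an all-zero vector instead of dividing by 0.
        return np.zeros(num_bins)
    return counts / total
```

Normalising by the frame count makes trailers with different numbers of scenes comparable, since each feature vector describes a distribution rather than raw counts.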

build_dataset(rating_file, score_dir, ...) → (X, y, ids)

Build feature matrix and binary labels from scored trailers.

TrailerClassifier(n_folds=6, svm_c=1.0)

Linear SVM with stratified k-fold cross-validation.

  • cross_validate(X, y) → CVResults — per-fold and aggregate ROC/AUC.
  • evaluate(X, y) → float — accuracy on a held-out test split.
  • plot_roc(results, output_path=None) — render and optionally save the ROC curve.
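
The evaluation protocol behind TrailerClassifier (linear SVM, stratified k-fold, per-fold AUC) can be approximated directly with scikit-learn. This is a sketch under assumptions: `cross_validated_auc` is a hypothetical helper, and the fold shuffling and seed are choices made here rather than documented behaviour of the class.

```python
import numpy as np
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold
from sklearn.svm import SVC


def cross_validated_auc(X, y, n_folds=6, svm_c=1.0, seed=0):
    """Return (mean AUC, std AUC) over stratified folds of a linear SVM."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    folds = StratifiedKFold(n_splits=n_folds, shuffle=True, random_state=seed)
    aucs = []
    for train_idx, test_idx in folds.split(X, y):
        model = SVC(kernel="linear", C=svm_c)
        model.fit(X[train_idx], y[train_idx])
        # Signed distance to the hyperplane serves as the ranking score.
        margins = model.decision_function(X[test_idx])
        aucs.append(roc_auc_score(y[test_idx], margins))
    return float(np.mean(aucs)), float(np.std(aucs))
```

Stratification keeps the high/low ratio roughly constant in every fold, which matters when the two classes are small or unbalanced.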

References

  1. Murray, N., Marchesotti, L., Perronnin, F. (2012). AVA: A Large-Scale Database for Aesthetic Visual Analysis. CVPR.
  2. Talebi, H., Milanfar, P. (2018). NIMA: Neural Image Assessment. IEEE TIP.
  3. Gupta, A. (2018). Video Aesthetics Analysis via Aggregate Frame Aesthetic Descriptors.

License

MIT — see LICENSE.


Built by Ashish Gupta · Senior Data Scientist, BrightAI
