viser — Video Encoding Optimizer

Name: viser blends "vision" and "optimizer": it sees the optimal encoding for every video. It is also the French verb viser ("to aim").

Acknowledgment: viser builds on decades of research in rate-distortion theory, perceptual quality measurement, and content-adaptive streaming. Thank you to the engineers and researchers at Netflix, Beamr, Fraunhofer, Mux, and the broader video encoding community whose published work, open-source tools, and foundational science inform every part of this project.

viser analyzes video content and computes optimal encoding parameters using perceptual quality measurement (VMAF) and convex hull (Pareto frontier) analysis. Instead of applying a one-size-fits-all bitrate ladder, viser tailors encoding decisions to each video's content complexity, delivering the same quality at a lower bitrate or higher quality at the same bitrate.

Content Type	Fixed Ladder @ 3 Mbps 1080p	viser Custom Ladder
Talking head (news anchor)	Excellent (bits wasted)	Same quality, half the bitrate
Animation (Pixar-style)	Very good (some waste)	Same quality, ~30% less bitrate
Sports (football game)	Acceptable (needs more bits)	Same bitrate, higher quality
Film grain (dark thriller)	Poor (severely bit-starved)	Same bitrate, transparent quality

Optimization Methods

Method	Granularity	Best For	Description
Per-Title	Whole video	VOD catalogs	Computes a custom bitrate ladder per video using convex hull analysis across resolutions, codecs, and quality levels
Per-Shot	Shot (2-30s)	Feature films, episodic	Detects scene boundaries and allocates bits across shots using Trellis optimization: complex scenes get more bits, simple scenes get fewer
Segment-Level CRF	1-second segments	Variable complexity content	Adapts CRF per segment with closed-loop VMAF verification to maintain consistent quality
Context-Aware	Per device class	Multi-device streaming	Generates device-specific ladders (mobile/desktop/TV) with resolution caps, codecs, and VMAF models
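
The convex hull step named in the Per-Title row can be illustrated with a minimal sketch. It assumes each trial encode has been reduced to a (bitrate, VMAF) pair; `RdPoint` and `upper_hull` are hypothetical names chosen for illustration and are not taken from viser's crates.

```rust
/// One trial encode: measured bitrate and VMAF score (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct RdPoint {
    pub bitrate_kbps: f64,
    pub vmaf: f64,
}

/// Cross product of (b - a) and (c - a); positive when a -> b -> c turns left.
fn cross(a: RdPoint, b: RdPoint, c: RdPoint) -> f64 {
    (b.bitrate_kbps - a.bitrate_kbps) * (c.vmaf - a.vmaf)
        - (b.vmaf - a.vmaf) * (c.bitrate_kbps - a.bitrate_kbps)
}

/// Returns the upper convex hull of the Pareto-optimal points,
/// sorted by ascending bitrate.
pub fn upper_hull(points: &[RdPoint]) -> Vec<RdPoint> {
    let mut sorted = points.to_vec();
    // Ascending bitrate; on equal bitrate, the higher VMAF comes first.
    sorted.sort_by(|a, b| {
        a.bitrate_kbps
            .total_cmp(&b.bitrate_kbps)
            .then(b.vmaf.total_cmp(&a.vmaf))
    });

    // Pareto filter: keep a point only if it beats the VMAF of every cheaper point.
    let mut frontier: Vec<RdPoint> = Vec::new();
    for p in sorted {
        if frontier.last().map_or(true, |last| p.vmaf > last.vmaf) {
            frontier.push(p);
        }
    }

    // Monotone-chain pass: drop points on or below the chord between their neighbours.
    let mut hull: Vec<RdPoint> = Vec::new();
    for p in frontier {
        while hull.len() >= 2 && cross(hull[hull.len() - 2], hull[hull.len() - 1], p) >= 0.0 {
            hull.pop();
        }
        hull.push(p);
    }
    hull
}
```

Calling `upper_hull` on the pooled trial encodes of all resolutions and codecs leaves only the points worth considering as ladder rungs; everything else costs more bits for the same or lower quality.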

Architecture

viser/
├── crates/
│   ├── viser-ffmpeg/         FFmpeg/FFprobe wrapper (encode, probe, path, cache)
│   ├── viser-quality/        VMAF/PSNR/SSIM measurement
│   ├── viser-hull/           Convex hull (Pareto frontier) + BD-Rate
│   ├── viser-ladder/         Ladder selection with crossover enforcement
│   ├── viser-shot/           Shot/scene detection (FFmpeg scdet)
│   ├── viser-complexity/     Spatial/temporal/DCT complexity analysis
│   ├── viser-encoding/       Shared config, preset mapping, temp cleanup
│   ├── viser-pertitle/       Per-title analysis pipeline
│   ├── viser-pershot/        Per-shot + Trellis optimization
│   ├── viser-persegment/     Segment-level CRF adaptation
│   ├── viser-contextaware/   Device-specific ladder generation
│   ├── viser-checkpoint/     Resume support for long analyses
│   ├── viser-compare/        Browser-based comparison player
│   ├── viser-chart/          Chart generation (plotters)
│   └── viser-cli/            CLI binary (clap)
├── docs/                     Principles and science docs
├── Cargo.toml
├── LICENSE
└── rustfmt.toml

Installation

# Cargo (from source)
cargo install viser-cli

# With optional revelo probe engine (pure-Rust, no ffprobe needed)
cargo install viser-cli --features revelo

# Homebrew (once accepted into homebrew-core)
brew install viser

# Or build from source
git clone https://github.com/vbasky/viser.git
cd viser
cargo build --release

Pre-built binaries for Linux, macOS (ARM + Intel), and Windows are available on the releases page.

Quick Start

Prerequisites

Rust 1.88+ (edition 2024) — install via rustup
FFmpeg with libvmaf — build from source or use a package manager
FFmpeg/FFprobe must be on PATH, or set VISER_FFMPEG / VISER_FFPROBE env vars

# Build viser
cargo build --release

# Run tests (170+ tests, ~0.5s)
cargo test --workspace

# Run your first per-title analysis
./target/release/viser per-title analyze -i video.y4m \
  --resolutions 240p --codecs libx264 --preset ultrafast

Usage

Per-title encoding (whole-video ladder)

# Run a full per-title analysis with H.264 + AV1, 3 resolutions, 7 CRF values each
viser per-title analyze -i video.mp4 \
  --codecs libx264,libsvtav1 \
  --resolutions 480p,720p,1080p \
  --preset veryfast \
  --parallel 4 \
  -o analysis.json

# Deliver the selected ladder rungs as final encodes
viser per-title deliver \
  --analysis analysis.json \
  --output-dir delivery \
  --mode capped-crf \
  --parallel 4 \
  --manifest delivery/manifest.json

per-title analyze automatically detects the source's audio bitrate and reserves it in the delivery budget; the analysis JSON records it in an audio_bitrate_kbps field that delivery can use for budget planning. HDR sources are detected and rejected unless --allow-hdr is passed, and HDR analysis is currently best-effort only.

Per-shot encoding (scene-level bit allocation)

# Detect scene boundaries
viser per-shot detect -i video.mp4 --threshold 10

# Run per-shot analysis with Trellis optimization
viser per-shot analyze -i video.mp4 --target-bitrate 2000
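
As a rough illustration of the idea behind the per-shot analysis, the sketch below performs Lagrangian bit allocation: every shot picks the operating point that maximizes VMAF minus λ times bitrate, and λ is bisected until the duration-weighted average bitrate fits the target. `Shot`, `OpPoint`, and `allocate` are hypothetical names, and viser's Trellis implementation may differ in detail.

```rust
/// One candidate encode of a shot (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct OpPoint {
    pub bitrate_kbps: f64,
    pub vmaf: f64,
}

/// A shot with its duration and candidate encodes (must be non-empty).
pub struct Shot {
    pub duration_s: f64,
    pub points: Vec<OpPoint>,
}

/// Lagrangian score: quality minus lambda times cost.
fn score(p: &OpPoint, lambda: f64) -> f64 {
    p.vmaf - lambda * p.bitrate_kbps
}

/// For a given lambda, each shot independently picks its best-scoring point.
fn pick(shots: &[Shot], lambda: f64) -> Vec<OpPoint> {
    shots
        .iter()
        .map(|s| {
            *s.points
                .iter()
                .max_by(|a, b| score(a, lambda).total_cmp(&score(b, lambda)))
                .expect("each shot needs at least one operating point")
        })
        .collect()
}

/// Duration-weighted average bitrate of a set of picks.
fn avg_bitrate(shots: &[Shot], picks: &[OpPoint]) -> f64 {
    let total: f64 = shots.iter().map(|s| s.duration_s).sum();
    let bits: f64 = shots
        .iter()
        .zip(picks)
        .map(|(s, p)| s.duration_s * p.bitrate_kbps)
        .sum();
    bits / total
}

/// Finds the smallest lambda whose picks fit `target_kbps` and returns those picks.
/// If no lambda fits, the cheapest point of every shot is returned.
pub fn allocate(shots: &[Shot], target_kbps: f64) -> Vec<OpPoint> {
    if shots.is_empty() {
        return Vec::new();
    }
    // lambda = 0 ignores cost entirely; if that already fits, nothing to trade.
    let unconstrained = pick(shots, 0.0);
    if avg_bitrate(shots, &unconstrained) <= target_kbps {
        return unconstrained;
    }
    // Grow the upper bound until the budget is met (or give up growing).
    let (mut lo, mut hi) = (0.0_f64, 1e-3_f64);
    for _ in 0..64 {
        if avg_bitrate(shots, &pick(shots, hi)) <= target_kbps {
            break;
        }
        hi *= 2.0;
    }
    // Bisect: `hi` always stays on the side that fits the budget.
    for _ in 0..50 {
        let mid = 0.5 * (lo + hi);
        if avg_bitrate(shots, &pick(shots, mid)) <= target_kbps {
            hi = mid;
        } else {
            lo = mid;
        }
    }
    pick(shots, hi)
}
```

Because all shots share one λ, they all operate at the same quality-per-bit slope, which is what pushes bits toward complex shots and away from simple ones.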

Segment-level CRF adaptation

viser per-segment analyze -i video.mp4 --target-vmaf 93 --codec libx264
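
A closed-loop CRF search for one segment might look like the sketch below. It assumes VMAF never rises as CRF rises, and it hides the encode-then-measure step behind a caller-supplied closure; `search_crf` and its parameters are hypothetical names, not viser's API.

```rust
/// Returns the highest CRF in `min_crf..=max_crf` whose measured VMAF still
/// meets `target_vmaf`, or `None` if even `min_crf` falls short.
///
/// `measure` stands in for "encode the segment at this CRF, then measure VMAF".
pub fn search_crf<F>(min_crf: u32, max_crf: u32, target_vmaf: f64, mut measure: F) -> Option<u32>
where
    F: FnMut(u32) -> f64,
{
    let (mut lo, mut hi) = (min_crf, max_crf);
    let mut best = None;
    while lo <= hi {
        let mid = lo + (hi - lo) / 2;
        if measure(mid) >= target_vmaf {
            // Quality is sufficient: remember it and try a higher (cheaper) CRF.
            best = Some(mid);
            lo = mid + 1;
        } else {
            // Quality is too low: move toward lower CRF values.
            if mid == 0 {
                break; // avoid u32 underflow
            }
            hi = mid - 1;
        }
    }
    best
}
```

Each probe costs one trial encode, so a range of 31 CRF values needs at most five encodes per segment instead of 31.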

Context-aware encoding (device-specific ladders)

viser context-aware analyze -i video.mp4 --devices mobile,desktop,tv
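
One way to picture device-specific ladders is as a filter over a shared set of candidate rungs. The sketch below assumes a profile carries only a resolution cap and a codec allow-list; the `DeviceProfile` and `Rung` types, and any cap values passed to them, are hypothetical and do not reflect viser's actual profiles.

```rust
/// One ladder rung (hypothetical type).
#[derive(Debug, Clone, PartialEq)]
pub struct Rung {
    pub height: u32,
    pub codec: &'static str,
    pub bitrate_kbps: u32,
}

/// Playback constraints for a device class (hypothetical type).
pub struct DeviceProfile {
    pub name: &'static str,
    pub max_height: u32,
    pub codecs: &'static [&'static str],
}

/// Keeps the rungs a device can use, ordered from lowest to highest bitrate.
pub fn ladder_for(profile: &DeviceProfile, rungs: &[Rung]) -> Vec<Rung> {
    let mut out: Vec<Rung> = rungs
        .iter()
        .filter(|r| r.height <= profile.max_height && profile.codecs.contains(&r.codec))
        .cloned()
        .collect();
    out.sort_by_key(|r| r.bitrate_kbps);
    out
}
```

A fuller version would also select a VMAF model per device, as the method table describes; that part is left out here.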

Content type detection (screen vs natural video)

# Analyze spatial/temporal/DCT complexity and classify content type
viser complexity analyze -i video.mp4

Screen content (slides, code, UI screencasts) needs different encoding strategies than natural video. viser detects it from complexity heuristics: static frames, sharp edges, and the balance between DCT energy and temporal motion.

Visual QA with comparison player

# Measure per-frame VMAF between reference and encoded
viser quality measure --reference original.mp4 --distorted encoded.mp4 \
  --per-frame -o vmaf_data.json

# Launch side-by-side comparison player with VMAF timeline
viser compare --reference original.mp4 --encoded encoded.mp4 \
  --vmaf-data vmaf_data.json

Inspection

# Probe with ffprobe (default)
viser inspect probe video.mp4

# Probe with revelo (pure-Rust, no ffprobe needed — build with --features revelo)
viser inspect probe video.mp4 --probe-engine revelo

Direct encode

viser encode input.mp4 -o out.mp4
viser encode input.mp4 -o capped.mp4 --mode capped-crf --crf 20 --max-bitrate 3000
viser encode input.mp4 -o rung_3000k.mp4 --mode vbr --target-bitrate 3000
viser quality measure --reference a.mp4 --distorted b.mp4

per-title deliver reads a saved analysis JSON, encodes the selected ladder rungs as final delivery outputs, and writes a manifest describing the emitted files with their target and measured bitrates. Delivery supports both 2-pass VBR and capped-CRF output, plus optional local chunked encoding with automatic concatenation.

per-title analyze detects HDR sources from probe metadata. By default it refuses HDR inputs because libvmaf-based analysis is still SDR-centric; pass --allow-hdr only for best-effort workflows. viser inspect probe surfaces the detected dynamic range and color metadata to make that decision explicit.

Supported Codecs

Codec	Flag	Notes
H.264/AVC	`libx264`	Fastest encode, widest device support
H.265/HEVC	`libx265`	~30-40% better compression than H.264
AV1	`libsvtav1`	~50% better compression than H.264, royalty-free, SVT-AV1 4.0

Design

Principle	Description
Content-aware	Tailors encoding to each video's visual complexity, not one-size-fits-all
VMAF-driven	Uses perceptual quality scores that correlate with human perception, rather than relying on PSNR
Pareto-optimal	Finds the set of encoding points where quality cannot be raised without spending more bitrate, and bitrate cannot be cut without losing quality
Four granularities	Whole-video, per-scene, per-second, per-device — pick the right level
Async + parallel	tokio-based concurrent trial encodes, semaphore-controlled parallelism
Resumable	SHA-256 checkpointing means multi-hour analyses survive crashes
BSD-2-Clause	Permissive license with no explicit patent grant
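
The semaphore-controlled parallelism in the table above can be sketched with the standard library alone. This is a stand-in, not the tokio-based code the table refers to: it uses OS threads and a hand-rolled counting semaphore, and `Semaphore`, `Permit`, and `run_bounded` are hypothetical names.

```rust
use std::sync::{Arc, Condvar, Mutex};
use std::thread;

/// Minimal counting semaphore built from std primitives.
pub struct Semaphore {
    permits: Mutex<usize>,
    freed: Condvar,
}

/// Held while a job runs; gives the permit back when dropped, even on panic.
pub struct Permit<'a>(&'a Semaphore);

impl Semaphore {
    pub fn new(permits: usize) -> Self {
        Self { permits: Mutex::new(permits), freed: Condvar::new() }
    }

    /// Blocks until a permit is free, then takes it.
    pub fn acquire(&self) -> Permit<'_> {
        let mut free = self.permits.lock().unwrap();
        while *free == 0 {
            free = self.freed.wait(free).unwrap();
        }
        *free -= 1;
        Permit(self)
    }
}

impl Drop for Permit<'_> {
    fn drop(&mut self) {
        *self.0.permits.lock().unwrap() += 1;
        self.0.freed.notify_one();
    }
}

/// Runs `job` once per input, with at most `parallel` jobs active at a time.
/// Results come back in input order. One thread is spawned per input.
pub fn run_bounded<T, R, F>(inputs: Vec<T>, parallel: usize, job: F) -> Vec<R>
where
    T: Send + 'static,
    R: Send + 'static,
    F: Fn(T) -> R + Send + Sync + 'static,
{
    let sem = Arc::new(Semaphore::new(parallel.max(1)));
    let job = Arc::new(job);
    let handles: Vec<_> = inputs
        .into_iter()
        .map(|input| {
            let (sem, job) = (Arc::clone(&sem), Arc::clone(&job));
            thread::spawn(move || {
                let _permit = sem.acquire(); // wait for a free slot
                job(input)
            })
        })
        .collect();
    handles
        .into_iter()
        .map(|h| h.join().expect("job panicked"))
        .collect()
}
```

In this picture, the `--parallel 4` flag shown under Usage would correspond to `run_bounded(trials, 4, encode_and_measure)`, where both arguments are placeholders for the caller's own trial list and encode function.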

Project Scale

Metric	Value
Workspace crates	15
Optimization methods	4 (per-title, per-shot, per-segment, context-aware)
Codecs	3 (H.264, H.265, AV1)
Quality metrics	5 (VMAF, PSNR, SSIM, SSIMULACRA2, Butteraugli)
License	BSD-2-Clause
MSRV	1.85

Status

All four optimization methods have been ported from the prior Go implementation, and three additional features have shipped since 0.3.0.

Per-Title — Convex hull, BD-Rate, resolution crossover enforcement, Netflix/Apple fixed ladder comparison, CRF and QP trial modes, checkpointing, audio bitrate-aware ladder budgets.
Delivery Path — 2-pass VBR delivery from saved analysis, capped-CRF delivery, manifest export, parallel rung generation, local chunked delivery with concat assembly.
Per-Shot — Shot detection (scdet), per-shot hulls, Trellis Lagrangian bit allocation.
Segment-Level CRF — Complexity analysis (entropy + YDIF + DCT energy), binary-search CRF per 1-second segment, closed-loop VMAF verification.
Context-Aware — Device profiles (mobile/desktop/TV/4K TV) with resolution caps, codec preferences, VMAF model selection.
Screen Content Detection — Classifies video as natural or screen content (slides/code/UI) from spatial/temporal/DCT heuristics, for encoding strategy selection.
Pure-Rust Probing — Optional revelo probe engine replaces ffprobe for metadata extraction; build with --features revelo, use with --probe-engine revelo.
SSIMULACRA2 + Butteraugli — Two extra quality metrics in viser-quality, run alongside VMAF/PSNR/SSIM.
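
BD-Rate, listed under Per-Title above, summarizes how much bitrate one rate-quality curve saves over another at equal quality. The sketch below is a deliberately simplified variant: it interpolates ln(bitrate) piecewise-linearly between measured points and averages the difference over the overlapping quality range, whereas the test list for this project mentions a cubic fit. The function names are hypothetical.

```rust
/// Number of evenly spaced quality samples used for the average.
const SAMPLES: usize = 1000;

/// Linearly interpolates ln(bitrate) at quality `q`.
/// `curve` holds (bitrate_kbps, quality) pairs with strictly increasing quality.
fn ln_rate_at(curve: &[(f64, f64)], q: f64) -> f64 {
    for w in curve.windows(2) {
        let ((r0, q0), (r1, q1)) = (w[0], w[1]);
        if q >= q0 && q <= q1 {
            let t = (q - q0) / (q1 - q0);
            return r0.ln() + t * (r1.ln() - r0.ln());
        }
    }
    // Only reached through rounding at the top edge: clamp to the last point.
    curve.last().unwrap().0.ln()
}

/// Average bitrate difference of `test` relative to `anchor`, in percent.
/// Negative values mean `test` needs fewer bits for the same quality.
/// Returns `None` if a curve has fewer than two points or the quality
/// ranges do not overlap.
pub fn bd_rate_percent(anchor: &[(f64, f64)], test: &[(f64, f64)]) -> Option<f64> {
    if anchor.len() < 2 || test.len() < 2 {
        return None;
    }
    let lo = anchor[0].1.max(test[0].1);
    let hi = anchor[anchor.len() - 1].1.min(test[test.len() - 1].1);
    if hi <= lo {
        return None;
    }
    let mut sum = 0.0;
    for i in 0..=SAMPLES {
        let q = (lo + (hi - lo) * i as f64 / SAMPLES as f64).min(hi);
        sum += ln_rate_at(test, q) - ln_rate_at(anchor, q);
    }
    let mean_ln_ratio = sum / (SAMPLES as f64 + 1.0);
    Some((mean_ln_ratio.exp() - 1.0) * 100.0)
}
```

Working in the log-bitrate domain makes the result a geometric mean of bitrate ratios, so a codec that halves the bitrate at every quality level scores -50%.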

Test Suite

The workspace contains 170+ tests.

# Run all tests
cargo test --workspace

# Run tests for specific algorithm crates
cargo test -p viser-hull        # convex hull, BD-rate (24 tests)
cargo test -p viser-ladder       # ladder selection, crossover, savings (19 tests)
cargo test -p viser-pershot      # Trellis optimization (12 tests)
cargo test -p viser-complexity   # complexity analysis + screen content detection (21 tests)
cargo test -p viser-ffmpeg       # probe, encode args, revelo adapter (26 tests)
cargo test -p viser-encoding     # config validation, preset mapping (13 tests)

# Run with revelo probe engine enabled
cargo test --features revelo -p viser-ffmpeg -p viser-cli

Tests cover:

Convex hull: empty, single, interior removal, unsorted input, per-codec
BD-rate: minimum points, negative efficiency, overlap, singular matrices, cubic fit
Trellis: empty, single shot, duration weighting, identical shots, empty hull fallback, lambda search bounds
Ladder: empty, zero rungs, bitrate/VMAF filters, max VMAF cap, sorted output, Netflix/Apple reference ladders, savings
Screen content: slides 90%, natural 0%, code capture 70%, empty
revelo probe: codec mapping, color transfer, pixel format, frame rate formatting

Backlog

Chart generation (plotters integration — not yet wired into CLI)
Distributed chunked encoding — multi-machine orchestration is still out of scope; current chunking is local-only
REST API
Scene-transition smoothing — per-shot ladders switch abruptly between shots
ABR switching optimization — ladder rungs tuned for client switching behavior, not just quality-spaced
Cost-aware optimization — factor storage/CDN cost into ladder selection
HW acceleration in quality measurement (VMAF currently runs on the CPU via libvmaf; a GPU-accelerated path is not yet available)

Limitations (design scope)

viser is designed for content-adaptive VOD encoding. Its current scope has the following limits:

No ML prediction: every analysis runs 42+ trial encodes, whereas Bitmovin and Mux predict ladders from source features in minutes rather than hours. viser measures rather than predicts.
HDR analysis is best-effort only: HDR is detected and gated behind --allow-hdr, but quality scoring still depends on SDR-oriented libvmaf.
No hardware encoders: NVENC, QuickSync, and VideoToolbox are not integrated. Software encoders only (libx264, libx265, libsvtav1).
No streaming-aware optimization: delivery can emit manifest metadata for the files it wrote, but not HLS/DASH playlists or switching-aware ladder tuning.

Documentation

Document	Description
Per-Title Encoding	Convex hull, R-D optimization, ladder selection
Per-Shot Encoding	Shot detection, Trellis, constant-slope bit allocation
Content-Adaptive Encoding	Device profiles, multi-codec hulls
Segment-Level CRF	CRF tuning with complexity analysis
Quality Metrics	VMAF, PSNR, SSIM, BD-Rate
Rate Control	CRF vs QP vs VBR
Shot Detection	scdet, PySceneDetect, TransNetV2
Chunked Encoding	Parallel encoding for production
Comparison Player	Side-by-side QA with VMAF timeline
Robustness Assessment	Project assessment: strengths, weaknesses, and feature gaps

License

BSD 2-Clause License — see LICENSE for details.

H.264/HEVC encoding may require patent licenses depending on use case. AV1 is royalty-free. See NOTICE for third-party attributions.
