A 100% pure Rust media transcoding and streaming framework. No C libraries, no FFI wrappers, no *-sys crates — just Rust, all the way down.
- Pure Rust implementation. Never depend on ffmpeg, libav, x264, libvpx, libopus, or any other C library, directly or transitively. Every codec, container, and filter is implemented from the spec.
- Clean abstractions for codecs, containers, timestamps, and streaming formats.
- Composable pipelines: media input → demux → decode → transform → encode → mux → output, with pass-through mode for remuxing without re-encoding.
- Modular workspace: per-format crates for complex modern codecs/containers, a shared crate for simple standard formats, and an aggregator crate that ties them together behind Cargo features.
Non-goals:
- Wrapping existing C codec libraries.
- Perfect feature parity with FFmpeg on day one. Codec and container coverage grows incrementally.
- GPU-specific acceleration (may come later through pure-Rust compute libraries, but never C drivers).
The workspace is a set of Cargo crates under crates/, grouped by role:
- Infrastructure: oxideav-core (primitives: Packet / Frame / Rational / Timestamp / PixelFormat / ExecutionContext), oxideav-codec (Decoder / Encoder traits + registry), oxideav-container (Demuxer / Muxer traits + registry), oxideav-pipeline (source → transforms → sink composition).
- I/O: oxideav-source (generic SourceRegistry + file driver + BufferedSource), oxideav-http (HTTP/HTTPS driver, opt-in via feature).
- Effects + conversions: oxideav-audio-filter (Volume / NoiseGate / Echo / Resample / Spectrogram), oxideav-pixfmt (pixel-format conversion matrix + palette generation + dither).
- Job graph: oxideav-job (JSON transcode graph + pipelined multithreaded executor).
- Containers: one crate each for oxideav-ogg / -mkv / -mp4 / -avi / -iff. Simple containers (WAV, raw PCM, slin) live inside oxideav-basic.
- Codec crates: one crate per codec family; see the Codecs table below for the per-codec status. Tracker formats (oxideav-mod, oxideav-s3m) are decoder-only by design. Codec scaffolds that register-but-refuse (JPEG XL, JPEG 2000, AVIF) reserve their codec ids so the API surface stays forward-compatible.
- Aggregator: oxideav re-exports every enabled crate behind Cargo features. Registries::with_all_features() builds a registry covering every format compiled in.
- Binaries: oxideav-cli (the oxideav CLI: list / probe / remux / transcode / run / validate / dry-run) and oxideplay (reference SDL2 + TUI player).
Use cargo run --release -p oxideav-cli -- list to enumerate the codec
and container matrix actually compiled into the release binary.
- Packet — a chunk of compressed (encoded) data belonging to one stream, with timestamps.
- Frame — a chunk of uncompressed data (audio samples or a video picture).
- Stream — one media track inside a container (audio, video, subtitle…).
- TimeBase / Timestamp — rational time base per stream; timestamps are integers in that base.
- Demuxer — reads a container, emits Packets per stream.
- Decoder — turns Packets of a given codec into Frames.
- Encoder — turns Frames into Packets.
- Muxer — writes Packets into an output container.
- Pipeline — connects these pieces. A pipeline can pass Packets straight from Demuxer to Muxer (remux, no quality loss) or route through Decoder → [Filter] → Encoder.
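
To make the TimeBase / Timestamp idea concrete, here is a minimal, self-contained sketch of rescaling an integer timestamp from one rational time base to another, as a remux between containers with different tick rates would need. The `TimeBase` struct and `rescale` function are hypothetical stand-ins written for this example and are not the oxideav-core types; the sketch assumes positive denominators and numerators.

```rust
/// Hypothetical stand-in for a per-stream rational time base:
/// one tick lasts `num / den` seconds. Not the oxideav-core type.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct TimeBase {
    pub num: i64,
    pub den: i64,
}

/// Rescale `ts` (an integer tick count in `from`) into ticks of `to`,
/// rounding to the nearest tick (ties away from zero).
/// Assumes every numerator and denominator is positive.
pub fn rescale(ts: i64, from: TimeBase, to: TimeBase) -> i64 {
    // seconds = ts * from.num / from.den
    // ticks   = seconds * to.den / to.num
    // Use i128 for the intermediate product so large timestamps do not overflow.
    let num = ts as i128 * from.num as i128 * to.den as i128;
    let den = from.den as i128 * to.num as i128;
    let half = den / 2;
    let rounded = if num >= 0 {
        (num + half) / den
    } else {
        (num - half) / den
    };
    rounded as i64
}
```

For example, one second of 48 kHz audio (48 000 ticks in a 1/48000 base) becomes 1 000 ticks in a millisecond base, and 1 ms becomes 90 ticks in a 1/90000 base.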
Every codec crate in OxideAV is designed to be usable on its own.
Pull only oxideav-core (types), oxideav-codec (trait + registry),
and the codec itself:
[dependencies]
oxideav-core = "0.0"
oxideav-codec = "0.0"
oxideav-g711 = "0.0"   # or any other codec crate

use oxideav_codec::CodecRegistry;
use oxideav_core::{CodecId, CodecParameters, Frame, Packet, TimeBase};

let mut reg = CodecRegistry::new();
oxideav_g711::register(&mut reg);

let mut params = CodecParameters::audio(CodecId::new("pcm_mulaw"));
params.sample_rate = Some(8_000);
params.channels = Some(1);

let mut dec = reg.make_decoder(&params)?;
dec.send_packet(&Packet::new(0, TimeBase::new(1, 8_000), ulaw_bytes))?;
let Frame::Audio(a) = dec.receive_frame()? else { unreachable!() };
// `a.data[0]` is S16 PCM.

The canonical walkthrough of the send_packet / receive_frame /
flush / reset loop lives in
oxideav-codec's README.
Each codec crate's README has a concrete example tailored to its
payload shape.
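
For orientation, the per-byte work behind a pcm_mulaw decode can be sketched without any crate at all. The block below is the widely used G.711 μ-law expansion written as standalone functions; it is not taken from oxideav-g711, and the function names are invented for this example.

```rust
/// Expand one G.711 mu-law byte to a signed 16-bit linear PCM sample.
/// Standalone illustration; not the oxideav-g711 implementation.
pub fn ulaw_to_linear(byte: u8) -> i16 {
    // Mu-law bytes are stored bit-inverted on the wire.
    let u = !byte;
    let sign = u & 0x80;
    let exponent = (u >> 4) & 0x07;
    let mantissa = (u & 0x0F) as i16;
    // Rebuild the magnitude: add the bias (0x84), scale by the segment, remove the bias.
    let magnitude = (((mantissa << 3) + 0x84) << exponent) - 0x84;
    if sign != 0 {
        -magnitude
    } else {
        magnitude
    }
}

/// Decode a whole mu-law payload into S16 samples.
pub fn decode_ulaw(payload: &[u8]) -> Vec<i16> {
    payload.iter().map(|&b| ulaw_to_linear(b)).collect()
}
```

The two extreme code words map to ±32124 and 0xFF maps to silence, which is a quick sanity check for any μ-law table.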
oxideav list (via the CLI) prints the live, build-time-accurate
codec + container matrix with per-implementation capability flags;
that output is the source of truth at any point. The tables below are
a human-readable summary, grouped by media type so the page stays
scannable.
Containers
Container format detection is content-based: each container ships a
probe that scores the first 256 KB against its magic bytes. The file
extension is a tie-breaker hint, not the source of truth — a .mp4
that's actually a WAV opens correctly.
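
The idea of content-based probing with the extension as a tie-breaker can be sketched as follows. This is an illustrative toy under stated assumptions: the `ProbeHit` type, the `detect` function, the fixed score of 100 and the +1 extension bonus are all invented for the example, and only four well-known magic patterns are checked. The real probes and their scoring live in the container crates.

```rust
/// Hypothetical probe result: a higher score means a more confident match.
#[derive(Debug, PartialEq, Eq)]
pub struct ProbeHit {
    pub format: &'static str,
    pub score: u32,
}

/// Score the start of a file against a few well-known magic byte patterns.
fn score(format: &str, head: &[u8]) -> u32 {
    let matched = match format {
        "wav" => head.len() >= 12 && &head[0..4] == b"RIFF" && &head[8..12] == b"WAVE",
        "ogg" => head.starts_with(b"OggS"),
        "flac" => head.starts_with(b"fLaC"),
        "mp4" => head.len() >= 8 && &head[4..8] == b"ftyp",
        _ => false,
    };
    if matched {
        100
    } else {
        0
    }
}

/// Pick the best-scoring format. The extension only adds a small bonus to a
/// format whose content already matched, so it can break ties but never
/// override the bytes.
pub fn detect(head: &[u8], extension: Option<&str>) -> Option<ProbeHit> {
    const FORMATS: [&str; 4] = ["wav", "ogg", "flac", "mp4"];
    let mut best: Option<ProbeHit> = None;
    for &format in FORMATS.iter() {
        let mut s = score(format, head);
        if s == 0 {
            continue;
        }
        if extension == Some(format) {
            s += 1;
        }
        if best.as_ref().map_or(true, |b| s > b.score) {
            best = Some(ProbeHit { format, score: s });
        }
    }
    best
}
```

With this shape, a buffer that starts with RIFF….WAVE is reported as WAV even when the caller passes "mp4" as the extension hint.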
| Container | Demux | Mux | Seek | Notes |
|---|---|---|---|---|
| WAV | ✅ | ✅ | ✅ | LIST/INFO metadata; byte-offset seek |
| FLAC | ✅ | ✅ | ✅ | VORBIS_COMMENT, streaminfo, PICTURE block; SEEKTABLE-based seek |
| Ogg | ✅ | ✅ | ✅ | Vorbis/Opus/Theora/Speex pages + comments; page-granule bisection |
| Matroska | ✅ | ✅ | ✅ | MKV/MKA/MKS; DocType-aware probe; Cues-based seek |
| WebM | ✅ | ✅ | ✅ | First-class: separate fourcc, codec whitelist (VP8/VP9/AV1/Vorbis/Opus); inherits Matroska Cues seek |
| MP4 | ✅ | ✅ | ✅ | mp4/mov/ismv brands, faststart, iTunes ilst metadata; sample-table seek |
| AVI | ✅ | ✅ | ✅ | LIST INFO, avih duration; idx1 keyframe-index seek |
| MP3 | ✅ | ✅ | ✅ | ID3v2/v1 tags + cover art, Xing/VBRI TOC seek (+ CBR fallback), frame sync with mid-stream resync |
| IFF / 8SVX | ✅ | ✅ | — | Amiga IFF with NAME/AUTH/ANNO/CHRS |
| IVF | ✅ | — | — | VP8 elementary stream container |
| AMV | ✅ | — | — | Chinese MP4 player format (RIFF-like) |
| WebP | ✅ | — | — | RIFF/WEBP (lossy + lossless + animation) |
| PNG / APNG | ✅ | ✅ | — | 8 + 16-bit, all color types, APNG animation |
| GIF | ✅ | ✅ | — | GIF87a/GIF89a, LZW, animation + NETSCAPE2.0 loop |
| JPEG | ✅ | ✅ | — | Still-image wrapper around the MJPEG codec |
| slin | ✅ | ✅ | — | Asterisk raw-PCM: .sln/.slin/.sln8..192 |
| MOD / S3M | ✅ | — | — | Tracker modules (decode-only by design) |
Cross-container remux works for any pair whose codecs don't require rewriting (FLAC ↔ MKV, Ogg ↔ MKV, MP4 ↔ MOV, etc.).
Audio
| Codec | Decode | Encode |
|---|---|---|
| PCM (s8/16/24/32/f32/f64) | ✅ all variants | ✅ all variants |
| slin (Asterisk raw PCM) | ✅ .sln/.slin/.sln16/.sln48 etc. | ✅ same — headerless S16LE |
| FLAC | ✅ bit-exact vs reference | ✅ bit-exact vs reference |
| Vorbis | ✅ matches lewton/ffmpeg (type-0/1/2 residue) | ✅ stereo coupling + ATH floor |
| Opus | ✅ CELT mono+stereo; SILK NB/MB/WB mono 10+20+40+60 ms; SILK stereo | ✅ CELT-only full-band 20 ms (new_celt_only_full_band() constructor) |
| MP1 | ✅ all modes, RMS 2.9e-5 vs ffmpeg | ✅ CBR (greedy allocator, 89 dB PSNR on pure tone) |
| MP2 | ✅ all modes, RMS 2.9e-5 vs ffmpeg | ✅ CBR mono+stereo (greedy allocator, ~31 dB PSNR) |
| MP3 | ✅ MPEG-1 Layer III (M/S stereo) | ✅ CBR mono+stereo |
| AAC-LC | ✅ mono+stereo, M/S, IMDCT | ✅ mono+stereo, ffmpeg accepts |
| CELT | ✅ full §4.3 pipeline (energy + PVQ + IMDCT + post-filter) | ✅ mono + stereo dual-stereo (intra-only long-block; energy + PVQ + fMDCT) |
| Speex | ✅ NB modes 1-8 + WB via QMF+SB-CELP (+ formant postfilter) | ✅ NB mode-5 CELP + WB sub-mode-1 (QMF split, 16 kHz @ ~16.6 kbit/s) |
| GSM 06.10 | ✅ full RPE-LTP | ✅ full RPE-LTP (standard + WAV-49) |
| G.711 (μ-law / A-law) | ✅ ITU tables | ✅ ITU tables (pcm_mulaw / pcm_alaw + aliases) |
| G.722 | ✅ 64 kbit/s QMF + dual-band ADPCM (37 dB PSNR, self-consistent tables) | ✅ same roundtrip |
| G.723.1 | ✅ ACELP + MP-MLQ decode | ✅ 5.3k ACELP + 6.3k MP-MLQ (default 6.3k) |
| G.728 | ✅ LD-CELP 50-order backward-adaptive; placeholder codebook | ✅ exhaustive 128×8 analysis-by-synthesis |
| G.729 | ✅ CS-ACELP (non-spec tables, produces audible speech) | ✅ symmetric encoder |
| IMA-ADPCM (AMV) | ✅ | ✅ (33.8 dB PSNR roundtrip) |
| 8SVX | ✅ | ✅ via FORM/8SVX container muxer |
Video
| Codec | Decode | Encode |
|---|---|---|
| MJPEG | ✅ baseline + progressive 4:2:0/4:2:2/4:4:4/grey | ✅ baseline |
| FFV1 | ✅ v3, 4:2:0/4:4:4 | ✅ v3 |
| MPEG-1 video | ✅ I+P+B frames | ✅ I+P+B frames (half-pel ME, FWD/BWD/BI B-modes, 43 dB PSNR) |
| MPEG-4 Part 2 | ✅ I+P-VOP, half-pel MC | ✅ I+P-VOP (41-43 dB PSNR, 21% vs all-I) |
| Theora | ✅ I+P frames | ✅ I+P frames (45 dB PSNR, 3.7× vs all-I) |
| H.263 | ✅ I+P pictures, half-pel MC | ✅ I+P pictures, diamond-pattern motion search (±15 pel range), 46 dB PSNR on sliding-gradient |
| H.264 | Baseline I-slice skeleton: CAVLC + intra-pred + transforms + deblocking; 100% on solid-gray IDR | — |
| H.265 (HEVC) | NAL + VPS/SPS/PPS/slice parse | — |
| VP8 | ✅ I+P frames (6-tap sub-pel + MV decode + ref management) | ✅ I-frame only (DC_PRED, 42 dB PSNR at qindex 50) |
| VP9 | Header parse + bool decoder, DC/V/H intra-pred, 4×4/8×8 iDCT; partition syntax pending | — |
| AV1 | OBU + sequence/frame header parse, range decoder, DC/V/H intra-pred, 4×4/8×8 DCT | — |
| AMV video | ✅ (synthesised JPEG header + vertical flip) | ✅ (via MJPEG encoder, 33 dB PSNR roundtrip) |
| ProRes 422 | ✅ 4:2:2 Proxy/LT/Standard (forward/inverse 8×8 DCT + zig-zag + exp-Golomb) | ✅ same (Yuv422P in; 44 dB PSNR at quant 4). MP4 .mov FourCC wiring still TODO. |
Image
| Codec | Decode | Encode |
|---|---|---|
| PNG / APNG | ✅ 5 color types × 8/16-bit, all 5 filters, APNG animation | ✅ same matrix + APNG emit |
| GIF | ✅ GIF87a/89a, LZW, interlaced, animation | ✅ GIF89a, animation, per-frame palettes |
| WebP VP8L | ✅ full lossless (Huffman + LZ77 + transforms) | ✅ lossless (no transforms, byte-identical roundtrip) |
| WebP VP8 | ✅ lossy (via VP8 decoder) | ✅ lossy (via VP8 I-frame enc, 32 dB PSNR) |
| JPEG (still) | ✅ via MJPEG codec | ✅ via MJPEG codec |
Trackers (decode-only by design)
| Codec | Decode | Encode |
|---|---|---|
| MOD | ✅ 4-channel Paula-style mixer + main effects | — |
| S3M | ✅ stereo + SCx/SDx/SBx effects | — |
Subtitles
All text formats parse to a unified IR (SubtitleCue with rich-text
Segments: bold / italic / underline / strike / color / font / voice /
class / karaoke / timestamp / raw) so cross-format conversion preserves
as much styling as each pair can represent. Bitmap-native formats (PGS,
DVB, VobSub) decode directly to Frame::Video(Rgba).
Text formats — in oxideav-subtitle:
| Format | Decode | Encode | Notes |
|---|---|---|---|
| SRT (SubRip) | ✅ | ✅ | <b>/<i>/<u>/<s>, <font color> hex + 17 named, <font face size> |
| WebVTT | ✅ | ✅ | Header, STYLE ::cue(.class), REGION, inline b/i/u/c/v/lang/ruby/timestamp, cue settings |
| MicroDVD | ✅ | ✅ | frame-based, {y:b/i/u/s}, {c:$BBGGRR}, {f:family} |
| MPL2 | ✅ | ✅ | decisecond timing, / italic, | break |
| MPsub | ✅ | ✅ | relative-start timing, FORMAT=TIME, TITLE=/AUTHOR= |
| VPlayer | ✅ | ✅ | HH:MM:SS:text, end inferred |
| PJS | ✅ | ✅ | frame-based, quoted body |
| AQTitle | ✅ | ✅ | -->> N frame markers |
| JACOsub | ✅ | ✅ | \B/\I/\U, #TITLE/#TIMERES headers |
| RealText | ✅ | ✅ | HTML-like <time>/<b>/<i>/<u>/<font>/<br/> |
| SubViewer 1/2 | ✅ | ✅ | marker-based v1, [INFORMATION] header v2 |
| TTML | ✅ | ✅ | W3C Timed Text, <tt>/<head>/<styling>/<style>/<p>/<span>/<br/>, tts:* styling |
| SAMI | ✅ | ✅ | Microsoft, <SYNC Start=ms> + <STYLE> CSS classes |
| EBU STL | ✅ | ✅ | ISO/IEC 18041 binary GSI+TTI (text mode only; bitmap + colour variants deferred) |
Advanced text (own crate) — oxideav-ass:
| Format | Decode | Encode | Notes |
|---|---|---|---|
| ASS / SSA | ✅ | ✅ | Script Info + V4+/V4 Styles (BGR+inv-alpha) + override tags (b/i/u/s/c/fn/fs/pos/an/k/kf/ko/N/n/h). Animated tags (\t, \fad, \move, \clip, \fscx/y, \frz, \blur) preserved as opaque raw so text survives round-trip |
Bitmap-native (own crate) — oxideav-sub-image:
| Format | Decode | Encode | Notes |
|---|---|---|---|
| PGS / HDMV (.sup) | ✅ | — | Blu-ray subtitle stream; PCS/WDS/PDS/ODS + RLE + YCbCr palette → RGBA |
| DVB subtitles | ✅ | — | ETSI EN 300 743 segments + 2/4/8-bit pixel-coded objects |
| VobSub (.idx+.sub) | ✅ | — | DVD SPU with control commands + RLE + 16-colour palette |
Cross-format transforms (text side): srt_to_webvtt,
webvtt_to_srt in oxideav-subtitle; srt_to_ass, webvtt_to_ass,
ass_to_srt, ass_to_webvtt in oxideav-ass. Other pairs go through
the unified IR directly (parse → IR → write).
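
As a small illustration of what a text-side transform has to do at the timing level, the sketch below converts one SRT cue timing line to WebVTT form (SRT separates milliseconds with a comma, WebVTT with a dot). The function names are invented for this example; this is not the srt_to_webvtt implementation, which also has to carry styling through the IR.

```rust
/// True when `s` is non-empty and consists only of ASCII digits.
fn all_digits(s: &str) -> bool {
    !s.is_empty() && s.bytes().all(|b| b.is_ascii_digit())
}

/// Convert one SRT timestamp (`HH:MM:SS,mmm`) to WebVTT form (`HH:MM:SS.mmm`).
/// Returns None when the input does not have that shape.
fn srt_ts_to_vtt(ts: &str) -> Option<String> {
    let (hms, millis) = ts.trim().split_once(',')?;
    let fields: Vec<&str> = hms.split(':').collect();
    if fields.len() != 3 || !fields.iter().all(|&f| all_digits(f)) {
        return None;
    }
    if millis.len() != 3 || !all_digits(millis) {
        return None;
    }
    Some(format!("{}.{}", hms, millis))
}

/// Convert a full SRT timing line (`start --> end`) to its WebVTT equivalent.
pub fn srt_timing_to_vtt(line: &str) -> Option<String> {
    let (start, end) = line.split_once("-->")?;
    Some(format!(
        "{} --> {}",
        srt_ts_to_vtt(start)?,
        srt_ts_to_vtt(end)?
    ))
}
```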
Text → RGBA rendering — any decoder producing Frame::Subtitle can
be wrapped with RenderedSubtitleDecoder::make_rendered_decoder(inner, width, height),
which emits Frame::Video(Rgba) at the caller-specified canvas size, one new frame
per visible-state change.
Embedded 8×16 bitmap font covers ASCII + Latin-1 supplement; bold via
smear, italic via shear; 4-offset outline. No TrueType dep, no CJK.
In-container subtitles (MKV / MP4 subtitle tracks) remain a scoped follow-up.
Scaffolds: API registered, pixel/sample decode not yet implemented
| Codec | Status |
|---|---|
| JPEG XL | stub — registered, returns Error::Unsupported on decode/encode |
| JPEG 2000 | stub — ditto |
| AVIF | stub — gated on full AV1 tile decode, which is itself pending |
The oxideav-id3 crate parses ID3v2.2 / v2.3 / v2.4 tags (whole-tag
+ per-frame unsync, extended header, v2.4 data-length indicator,
encrypted/compressed frames recorded as Unknown) plus the legacy
128-byte ID3v1 trailer. Text frames (T*, TXXX), URLs (W*, WXXX),
COMM / USLT, and APIC / PIC picture frames are handled structurally;
less-common frames (SYLT, RGAD/RVA2, PRIV, GEOB, UFID, POPM, MCDI, …)
survive as Unknown with their raw bytes available.
oxideav-mp3 and oxideav-flac containers surface the extracted
fields via the standard Demuxer::metadata() (Vorbis-comment-style
keys: title, artist, album, date, genre, track,
composer, …) and cover art via a new
Demuxer::attached_pictures() method returning
&[AttachedPicture] (MIME type + one-of-21 picture-type enum +
description + raw image bytes). FLAC's own
METADATA_BLOCK_PICTURE block is handled natively; FLAC wrapped in ID3
(produced by a few oddball taggers) works via the fallback path.
oxideav probe file.mp3 prints a Metadata: section and an
Attached pictures: section with per-picture summary.
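
One detail every ID3v2 parser has to get right before it can reach any frame is the tag size, which the 10-byte header stores as a "synchsafe" integer: four bytes, each carrying only 7 bits. A minimal standalone sketch follows; the function names are invented for this example and are not the oxideav-id3 API.

```rust
/// Decode a 4-byte ID3v2 synchsafe integer (7 significant bits per byte).
/// Returns None if any byte has its top bit set, which is invalid.
pub fn synchsafe_to_u32(bytes: [u8; 4]) -> Option<u32> {
    if bytes.iter().any(|&b| b & 0x80 != 0) {
        return None;
    }
    Some(bytes.iter().fold(0u32, |acc, &b| (acc << 7) | b as u32))
}

/// Read the tag body size from a 10-byte ID3v2 header:
/// "ID3", version (2 bytes), flags (1 byte), synchsafe size (4 bytes).
/// The returned size excludes the 10-byte header itself.
pub fn id3v2_tag_size(header: &[u8]) -> Option<u32> {
    if header.len() < 10 || &header[0..3] != b"ID3" {
        return None;
    }
    synchsafe_to_u32([header[6], header[7], header[8], header[9]])
}
```

A demuxer would typically use this size to skip straight past the tag to the first audio frame when it does not need the metadata.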
The oxideav-audio-filter crate provides:
- Volume — gain adjustment with configurable scale factor
- NoiseGate — threshold-based gate with attack/hold/release
- Echo — delay line with feedback
- Resample — polyphase windowed-sinc sample rate conversion
- Spectrogram — STFT → image (Viridis/Magma colormaps, RGB + PNG output)
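
To show what "delay line with feedback" means in practice, here is a minimal mono echo on f32 samples. It is a sketch under assumptions: the `Echo` struct, its constructor parameters and the `process` method are invented for this example and are not the oxideav-audio-filter types.

```rust
/// Hypothetical single-channel echo: a circular delay line with feedback.
pub struct Echo {
    buf: Vec<f32>, // circular buffer holding the last `delay_samples` values
    pos: usize,    // next slot to read (oldest sample) and then overwrite
    feedback: f32, // fraction of the delayed signal fed back into the line
    mix: f32,      // level of the delayed signal in the output
}

impl Echo {
    pub fn new(delay_samples: usize, feedback: f32, mix: f32) -> Self {
        assert!(delay_samples > 0, "delay must be at least one sample");
        Echo {
            buf: vec![0.0; delay_samples],
            pos: 0,
            feedback,
            mix,
        }
    }

    /// Process one input sample and return one output sample.
    pub fn process(&mut self, input: f32) -> f32 {
        let delayed = self.buf[self.pos];
        // Write the input plus attenuated echo back into the line (the feedback path).
        self.buf[self.pos] = input + delayed * self.feedback;
        self.pos = (self.pos + 1) % self.buf.len();
        input + delayed * self.mix
    }
}
```

Feeding a single impulse through a 2-sample delay with feedback 0.5 produces repeats at samples 2, 4, 6, … with each repeat at half the level of the previous one.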
The oxideav-pixfmt crate is the shared conversion layer for video
codecs. The PixelFormat enum covers ~30 first-tier formats (ffmpeg
equivalent names in parentheses):
- RGB family:
Rgb24,Bgr24,Rgba,Bgra,Argb,Abgr, plus 16-bit-per-channelRgb48Le/Rgba64Le. - YUV planar:
Yuv420P/Yuv422P/Yuv444Pat 8 / 10 / 12-bit, plus JPEG-full-range variants (YuvJ420P,YuvJ422P,YuvJ444P). - YUV semi-planar:
Nv12,Nv21. YUV packed:Yuyv422,Uyvy422. - Grayscale:
Gray8,Gray10Le,Gray12Le,Gray16Le. - Alpha-bearing:
Ya8,Yuva420P. - Palette:
Pal8. 1-bit:MonoBlack,MonoWhite.
oxideav_pixfmt::convert(src, dst_format, &ConvertOptions) handles
the live conversion matrix (RGB all-to-all swizzles, YUV↔RGB under
BT.601 / BT.709 × limited / full range, NV12/NV21 ↔ Yuv420P, Gray ↔
RGB, Rgb48 ↔ Rgb24, Pal8 ↔ RGB with optional dither). Palette
generation via generate_palette() offers MedianCut and Uniform
strategies. Dither options: None, 8×8 ordered Bayer, Floyd-Steinberg.
Codecs declare accepted_pixel_formats on their CodecCapabilities;
the job graph (below) auto-inserts conversion when the upstream
format doesn't match.
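
One cell of that conversion matrix, YUV → RGB under BT.601 with full-range (JPEG-style) levels, can be written out per pixel as below. This is a standalone illustration using the standard BT.601 coefficients in floating point; the function names are invented here, and the crate's own convert() may use a different (for example fixed-point) formulation.

```rust
/// Round and clamp a floating-point channel value to 0..=255.
fn to_u8(v: f32) -> u8 {
    v.round().clamp(0.0, 255.0) as u8
}

/// Convert one full-range BT.601 Y'CbCr pixel to 8-bit RGB.
/// Illustration only; not the oxideav-pixfmt implementation.
pub fn yuv_to_rgb_bt601_full(y: u8, cb: u8, cr: u8) -> [u8; 3] {
    let y = y as f32;
    // Chroma is stored with a +128 offset.
    let cb = cb as f32 - 128.0;
    let cr = cr as f32 - 128.0;
    [
        to_u8(y + 1.402 * cr),
        to_u8(y - 0.344_136 * cb - 0.714_136 * cr),
        to_u8(y + 1.772 * cb),
    ]
}
```

Neutral chroma (Cb = Cr = 128) leaves grey levels untouched, which makes a convenient first test when wiring up any new conversion path.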
The oxideav-job crate is a declarative way to describe multi-output
transcode pipelines. A job is a JSON object: keys are output
filenames (or reserved sinks like @null / @display), values
describe tracks grouped by audio / video / subtitle / all,
and each track carries a recursive input tree of source refs and
filter / convert nodes.
{
"threads": 8,
"@in": {"all": [{"from": "movie.mp4"}]},
"out.mkv": {
"video": [{"from": "@in", "codec": "h264", "codec_params": {"crf": 23}}],
"audio": [{"from": "@in", "codec": "flac"}]
},
"out.png": {"video": [{"from": "@in", "convert": "rgba"}]}
}

The executor has two modes: serial (threads == 1) runs one
packet at a time; pipelined (threads ≥ 2, default when
available_parallelism() ≥ 2) spawns one worker thread per stage
per track connected by bounded mpsc channels. The mux/sink loop runs
on the caller's thread so JobSink implementations don't need to be
Send (the SDL2 player sink in oxideplay stays a single-threaded
object). Both modes produce byte-identical output for deterministic
jobs.
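
The pipelined shape described above (one worker per stage, bounded channels between stages, sink on the caller's thread) can be sketched with nothing but the standard library. Everything here is a toy under stated assumptions: packets and frames are plain u32 values, `decode` and `encode` are placeholder arithmetic, and `run_serial` / `run_pipelined` are names invented for this example rather than the oxideav-job API.

```rust
use std::sync::mpsc::sync_channel;
use std::thread;

// Placeholder stage functions standing in for real codec work.
fn decode(packet: u32) -> u32 {
    packet * 2
}
fn encode(frame: u32) -> u32 {
    frame + 1
}

/// Serial mode: one packet at a time on the caller's thread.
pub fn run_serial(packets: &[u32]) -> Vec<u32> {
    packets.iter().map(|&p| encode(decode(p))).collect()
}

/// Pipelined mode: one thread per stage, connected by bounded channels.
/// `depth` is the channel capacity; a full channel blocks the producer,
/// which provides back-pressure.
pub fn run_pipelined(packets: Vec<u32>, depth: usize) -> Vec<u32> {
    let (pkt_tx, pkt_rx) = sync_channel::<u32>(depth);
    let (frm_tx, frm_rx) = sync_channel::<u32>(depth);
    let (out_tx, out_rx) = sync_channel::<u32>(depth);

    let demux = thread::spawn(move || {
        for p in packets {
            if pkt_tx.send(p).is_err() {
                break; // downstream hung up
            }
        }
        // pkt_tx is dropped here, which ends the decode stage's loop.
    });
    let dec = thread::spawn(move || {
        for p in pkt_rx {
            if frm_tx.send(decode(p)).is_err() {
                break;
            }
        }
    });
    let enc = thread::spawn(move || {
        for f in frm_rx {
            if out_tx.send(encode(f)).is_err() {
                break;
            }
        }
    });

    // The sink loop runs on the caller's thread, so it does not need to be Send.
    let out: Vec<u32> = out_rx.iter().collect();
    demux.join().unwrap();
    dec.join().unwrap();
    enc.join().unwrap();
    out
}
```

Because each channel is FIFO and each stage is a single thread, ordering is preserved, so the pipelined result matches the serial one for deterministic stages.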
Decoder / Encoder trait hook: set_execution_context(&ExecutionContext)
(default no-op) lets codecs opt into slice- / GOP-parallel work later
without trait churn.
Explicit pixel-format conversion nodes ({"convert": "yuv420p", "input": ...}) fit anywhere in the input tree; the resolver also
auto-inserts a PixConvert stage between Decode and Encode when a
codec's accepted_pixel_formats list excludes the upstream format.
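
The auto-insertion rule can be pictured as a small planning step. The sketch below is hypothetical: the `PixelFormat` and `Stage` enums and the `plan` function are invented for illustration, and choosing the first accepted format as the conversion target is an assumption of this example, not a documented resolver policy.

```rust
/// Hypothetical subset of pixel formats, for illustration only.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum PixelFormat {
    Yuv420P,
    Yuv422P,
    Rgb24,
    Rgba,
}

/// Hypothetical pipeline stages between demux and mux.
#[derive(Debug, PartialEq, Eq)]
pub enum Stage {
    Decode,
    PixConvert(PixelFormat),
    Encode,
}

/// Build the stage list for one video track. A conversion stage is inserted
/// only when the encoder declares accepted formats and the upstream format
/// is not among them. An empty list is treated as "accepts anything".
pub fn plan(upstream: PixelFormat, accepted: &[PixelFormat]) -> Vec<Stage> {
    let mut stages = vec![Stage::Decode];
    if !accepted.is_empty() && !accepted.contains(&upstream) {
        // Assumption: convert to the encoder's first listed format.
        stages.push(Stage::PixConvert(accepted[0]));
    }
    stages.push(Stage::Encode);
    stages
}
```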
The source layer decouples I/O from container parsing. Container
demuxers receive an already-opened Box<dyn ReadSeek> and never touch
the filesystem directly. The SourceRegistry resolves URIs to readers:
| Scheme | Driver | Notes |
|---|---|---|
| bare path / file:// | built-in | std::fs::File |
| http:// / https:// | oxideav-http (opt-in) | ureq + rustls, Range-request seeking |
The HTTP driver is off by default in the library (http cargo feature)
and on by default in oxideplay and oxideav-cli.
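
The scheme-to-driver mapping in the table can be sketched as a pure function over the URI string. The `Driver` enum and `resolve` function are invented for this example; the real SourceRegistry returns opened readers and lets drivers be registered at runtime, which this toy does not attempt.

```rust
/// Hypothetical classification of a source URI by scheme.
#[derive(Debug, PartialEq, Eq)]
pub enum Driver<'a> {
    /// Local file; carries the filesystem path.
    File(&'a str),
    /// HTTP or HTTPS; carries the full URL.
    Http(&'a str),
    /// A scheme no driver in this sketch handles; carries the scheme name.
    Unsupported(&'a str),
}

/// Decide which driver should open `uri`.
pub fn resolve(uri: &str) -> Driver<'_> {
    if let Some(path) = uri.strip_prefix("file://") {
        return Driver::File(path);
    }
    if uri.starts_with("http://") || uri.starts_with("https://") {
        return Driver::Http(uri);
    }
    match uri.split_once("://") {
        Some((scheme, _)) => Driver::Unsupported(scheme),
        // No scheme at all: treat the string as a bare filesystem path.
        None => Driver::File(uri),
    }
}
```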
BufferedSource wraps any ReadSeek with a prefetch ring buffer
(64 MiB default in oxideplay, configurable via --buffer-mib). A
worker thread fills the ring ahead of the read cursor; seeks inside the
window are free.
$ oxideav probe https://download.blender.org/peach/bigbuckbunny_movies/BigBuckBunny_320x180.mp4
Input: https://download.blender.org/peach/bigbuckbunny_movies/BigBuckBunny_320x180.mp4
Format: mp4
Duration: 00:09:56.46
Stream #0 [Video] codec=h264 video 320x180
Stream #1 [Audio] codec=aac audio 2ch @ 48000 Hz
An opt-in binary crate oxideplay implements a reference player with
SDL2 (audio + video) and a crossterm TUI. SDL2 is loaded at runtime
via libloading — oxideplay doesn't link against SDL2 at build
time, so the binary builds and ships without requiring SDL2 dev
headers. If SDL2 isn't installed on the target machine, the player
exits cleanly with a "library not found" message instead of failing
to start. The core oxideav library remains 100% pure Rust.
cargo run -p oxideplay -- /path/to/file.mkv
cargo run -p oxideplay -- https://example.com/video.mp4
Keybinds: q quit, space pause, ← / → seek ±10 s, ↑ / ↓ seek
±1 min (up = forward, down = back), pgup / pgdn seek ±10 min, *
volume up, / volume down. Works from the SDL window (when a video
stream is present) or from the TTY.
oxideav command-line verbs: list, probe, remux, transcode,
run, validate, dry-run. Inputs can be local paths or HTTP(S)
URLs.
$ oxideav list # print registered codecs + containers
$ oxideav probe song.flac
$ oxideav transcode song.flac song.wav
$ oxideav remux input.ogg output.mkv
$ oxideav probe https://example.com/video.mp4
# JSON job graph
$ oxideav run job.json
$ oxideav run - < job.json
$ oxideav run --inline '{"out.mkv":{"audio":[{"from":"in.mp3"}]}}'
$ oxideav run --threads 4 job.json # override thread budget
$ oxideav validate job.json # check without running
$ oxideav dry-run job.json # print the resolved DAG
oxideplay --job <file> runs a job where @display / @out binds
to the SDL2 player sink; other outputs (file paths) write to disk in
the same run.
cargo build --workspace
cargo test --workspace
The oxideav binary is produced by the oxideav-cli crate:
cargo run -p oxideav-cli -- --help
A handful of fully-spec-complete codecs have been extracted into their
own repositories under the
OxideAV organization and are consumed from
crates.io. To hack on them locally alongside this repo, clone them as
siblings and run scripts/dev-patch.sh:
# layout: parent/
# ├── oxideav/ (this repo)
# └── oxideav-<name>/ (any OxideAV/oxideav-* clone)
git clone git@github.com:OxideAV/oxideav-gsm.git ../oxideav-gsm
./scripts/dev-patch.sh # generates .cargo/config.toml
cargo run -p oxideplay -- some.wav
scripts/dev-patch.sh rewrites .cargo/config.toml with a
[patch.crates-io] entry for every ../oxideav-* sibling it finds,
plus every in-workspace crates/oxideav-* crate. The file is
gitignored, so each dev owns their own layout. Re-run the script after
adding or removing a sibling.
MIT — see LICENSE. Copyright © 2026 Karpelès Lab Inc.