Diffy

Difference-based video compression that exploits what generic codecs ignore.

Raw factory video is 1.6 MB/frame. Diffy gets it to ~8 KB/frame by exploiting three priors that generic codecs (H.265, AV1) ignore:

Background stationarity: the factory floor doesn’t move. Encode it once as a JPEG keyframe (~200 KB), never again.
Cycle periodicity: workers repeat the same motion thousands of times per shift. Store one canonical cycle, then only the tiny per-cycle deltas.
Sparse foreground: only the worker’s hands and held objects change. Skip the 99% of pixels that are background.
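
The stationarity and sparsity priors can be illustrated with a few lines of NumPy. This is a minimal sketch, not Diffy's actual implementation: the function names, the threshold of 25, and the synthetic scene are all assumptions made for illustration.

```python
import numpy as np

def estimate_background(frames: np.ndarray) -> np.ndarray:
    """Per-pixel temporal median over a stack of frames shaped (N, H, W, 3)."""
    return np.median(frames, axis=0).astype(np.uint8)

def foreground_mask(frame: np.ndarray, background: np.ndarray,
                    threshold: int = 25) -> np.ndarray:
    """Boolean H×W mask of pixels that differ noticeably from the background."""
    diff = np.abs(frame.astype(np.int16) - background.astype(np.int16))
    return diff.max(axis=2) > threshold

def bounding_box(mask: np.ndarray):
    """(top, left, bottom, right) of the True pixels, or None for an empty mask."""
    rows = np.flatnonzero(mask.any(axis=1))
    cols = np.flatnonzero(mask.any(axis=0))
    if rows.size == 0:
        return None
    return int(rows[0]), int(cols[0]), int(rows[-1]) + 1, int(cols[-1]) + 1

# Synthetic scene (hypothetical data): a static floor plus a small bright
# block, standing in for a hand, that moves 10 px to the right each frame.
rng = np.random.default_rng(0)
static = rng.integers(0, 200, size=(120, 160, 3), dtype=np.uint8)
frames = np.repeat(static[None], 9, axis=0).copy()
for i in range(9):
    frames[i, 40:50, 10 + 10 * i : 20 + 10 * i] = 255

background = estimate_background(frames)
mask = foreground_mask(frames[3], background)
print("foreground fraction:", mask.mean())
print("bounding box:", bounding_box(mask))
```

Because the moving block covers any given pixel in at most one of the nine frames, the median recovers the static floor, and only the pixels inside the small bounding box need to be coded for that frame.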

Result: 8 hours of 1080p30 → under 10 MB. H.265 gets you to ~5 GB.
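
The headline numbers are easier to sanity-check with the arithmetic written out. The sketch below assumes uncompressed 8-bit RGB (3 bytes per pixel) and decimal megabytes, neither of which is stated above.

```python
BYTES_PER_PIXEL = 3  # assumption: uncompressed 8-bit RGB

def raw_frame_bytes(width: int, height: int) -> int:
    """Size of one uncompressed frame in bytes."""
    return width * height * BYTES_PER_PIXEL

frames_per_shift = 8 * 3600 * 30                 # 8 hours at 30 fps
raw_1080p = raw_frame_bytes(1920, 1080)          # one full-resolution frame
raw_540p = raw_frame_bytes(960, 540)             # one frame at 960 px wide
raw_shift_bytes = frames_per_shift * raw_1080p   # whole shift, uncompressed
budget_per_frame = 10e6 / frames_per_shift       # average bytes/frame in 10 MB

print(f"frames per shift:     {frames_per_shift:,}")
print(f"raw 1080p frame:      {raw_1080p / 1e6:.1f} MB")
print(f"raw 960x540 frame:    {raw_540p / 1e6:.1f} MB")
print(f"raw 8-hour shift:     {raw_shift_bytes / 1e12:.2f} TB")
print(f"10 MB shift average:  {budget_per_frame:.1f} bytes/frame")
```

Under these assumptions, 1.6 MB/frame corresponds to a 960×540 RGB frame, while a full 1080p RGB frame is about 6.2 MB. A 10 MB file for 864,000 frames averages roughly 12 bytes per frame, which suggests the whole-shift figure depends on most cycles collapsing to CYCLE_SKIP pointers or very small deltas rather than on every frame costing ~8 KB.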

Try it: diffy.tech

Go to diffy.tech. No install. Runs entirely in your browser (WebAssembly + Pyodide).

Wait ~20 seconds on first load (numpy/scipy/Pillow are downloaded into WebAssembly, then cached)
Drop a video file, folder of frames, or .zip of images onto the drop zone
Watch the ASCII progress bar advance through background model → frame encoding → cycle compression
Download your .dfy file

In the browser, videos are downsampled to at most 960 px wide for encoding speed. The Python library below encodes at full resolution.

No data leaves your machine. No account. No server.

For full-length videos (> 10 min) or batch processing, use the Python library below.

What is `.dfy`?

The .dfy container format:

[4B magic "DFY\x01"]
[header: total_frames, fps, width, height, flags]
--- chunks ---
BACKGROUND   0x01   JPEG background keyframe (stored once)
CYCLE_CANON  0x02   Canonical work cycle (temporal I/P coded)
CYCLE_DELTA  0x03   Per-cycle deviation vs canonical
IMU_BLOCK    0x04   IMU quaternions (f16, zlib compressed)
CYCLE_SKIP   0x05   Clone pointer (4 bytes: identical cycle)
METADATA     0x06   JSON blob
CODEBOOK     0x07   Optional VQ codebook (float16 centroids)
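
The chunk IDs above map naturally onto an enum, and the magic bytes give a cheap way to reject files that are not `.dfy`. The following is a sketch only: `ChunkType` and `check_magic` are hypothetical names, not part of the published API, and because the header field widths and chunk framing are not specified here, the sketch stops after the magic bytes.

```python
import io
from enum import IntEnum

MAGIC = b"DFY\x01"  # 4-byte magic from the container layout above

class ChunkType(IntEnum):
    """Chunk IDs as listed in the container layout."""
    BACKGROUND = 0x01
    CYCLE_CANON = 0x02
    CYCLE_DELTA = 0x03
    IMU_BLOCK = 0x04
    CYCLE_SKIP = 0x05
    METADATA = 0x06
    CODEBOOK = 0x07

def check_magic(stream) -> None:
    """Consume the first 4 bytes and raise ValueError unless they are the .dfy magic."""
    magic = stream.read(len(MAGIC))
    if magic != MAGIC:
        raise ValueError(f"not a .dfy file: bad magic {magic!r}")

# Usage on an in-memory stand-in for a file opened with open(path, "rb").
check_magic(io.BytesIO(MAGIC + b"header and chunks would follow"))
print(ChunkType(0x05).name)
```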

Every frame is uniquely reconstructable. Training mode preserves per-cycle pixel variation (tool slip, speed jitter) as temporal deltas, which is useful signal for physical AI models.
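
To make the canonical-plus-delta idea concrete, here is a minimal lossless sketch. It assumes each cycle has already been time-aligned to the same shape as the canonical cycle, and it uses plain zlib on the signed difference; the real codec is described as DCT-based with a quality setting, so this is an illustration of the principle rather than the actual format. `encode_cycle` and `decode_cycle` are hypothetical names.

```python
import zlib
import numpy as np

def encode_cycle(cycle: np.ndarray, canonical: np.ndarray):
    """Return ('skip', b'') for an identical cycle, else ('delta', payload)."""
    if np.array_equal(cycle, canonical):
        return "skip", b""
    # Signed difference needs more range than uint8, hence int16.
    delta = cycle.astype(np.int16) - canonical.astype(np.int16)
    return "delta", zlib.compress(delta.tobytes())

def decode_cycle(kind: str, payload: bytes, canonical: np.ndarray) -> np.ndarray:
    """Rebuild a cycle from the canonical cycle and an encoded chunk."""
    if kind == "skip":
        return canonical.copy()
    delta = np.frombuffer(zlib.decompress(payload), dtype=np.int16)
    delta = delta.reshape(canonical.shape)
    return (canonical.astype(np.int16) + delta).astype(np.uint8)

# Hypothetical data: a 10-frame canonical cycle and a repeat that differs
# only in a small patch of one frame (e.g. a tool slip).
rng = np.random.default_rng(1)
canonical = rng.integers(0, 256, size=(10, 32, 32, 3), dtype=np.uint8)
cycle = canonical.copy()
cycle[4, 10:14, 10:14] = rng.integers(0, 256, size=(4, 4, 3), dtype=np.uint8)

kind, payload = encode_cycle(cycle, canonical)
restored = decode_cycle(kind, payload, canonical)
print(kind, len(payload), "bytes vs", cycle.nbytes, "raw")
```

Because the delta is zero almost everywhere, it compresses to a small fraction of the raw cycle, and an exact repeat costs nothing beyond the skip marker.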

Python library

pip install -e .

Encode a video:

from diffy import DiffyEncoder

enc = DiffyEncoder('output.dfy', fps=30, width=1920, height=1080, quality=25)

for frame in your_frame_source:          # uint8 H×W×3 numpy arrays
    enc.push_frame(frame)

enc.encode()
print(f"Written {enc.bytes_written / 1e6:.1f} MB")

From a video file (requires opencv-python):

from diffy import DiffyEncoder
DiffyEncoder.from_video('factory_shift.mp4', 'factory_shift.dfy', quality=25)

Decode:

from diffy import DiffyDecoder

dec = DiffyDecoder('factory_shift.dfy')
for frame in dec.iter_frames():          # yields uint8 H×W×3 numpy arrays
    process(frame)

Parameters:

Parameter	Default	Description
`quality`	25	Residual codec quality 1–100 (lower = smaller)
`warmup_frames`	300	Frames used to build background model
`use_temporal`	True	Inter-frame prediction within cycles
`use_bbox`	True	Encode only foreground bounding box per frame
`use_vq`	False	Train VQ codebook for ~27× extra reduction

Compression calculator

diffy.tech/calc: estimate file size and upload time for your specific setup (workers, hours/day, resolution, network speed, task repetitiveness).

Dataset testing

To test on builddotai/Egocentric-10K without downloading 16 TB:

from datasets import load_dataset
import numpy as np
from diffy import DiffyEncoder

# Stream only the separate evaluation dataset (5.49 GB) instead of the full 16 TB corpus
ds = load_dataset("builddotai/Egocentric-10K-Evaluation", streaming=True, split="train")

for sample in ds.take(5):
    frames = sample['frames']  # list of uint8 H×W×3 arrays
    enc = DiffyEncoder(f"sample_{sample['id']}.dfy", fps=30,
                       width=frames[0].shape[1], height=frames[0].shape[0])
    for f in frames:
        enc.push_frame(f)
    enc.encode()

Architecture

diffy/
  __init__.py      : public API - DiffyEncoder, DiffyDecoder

egocodec/          : core implementation
  encoder.py       : encode pipeline (background model, cycle detection, compression)
  decoder.py       : reconstruct frames from .dfy
  bitstream.py     : BitstreamWriter / BitstreamReader (.dfy container)
  background.py    : BackgroundModel: running median + EMA update
  cycle_detector.py: CycleDetector: energy-valley segmentation
  temporal_codec.py: I/P frame coding within cycles (zlib)
  residual_codec.py: DCT + RLE + zlib residual codec
  vq_codec.py      : optional VQ codebook (numpy k-means++)
  imu.py           : IMU quaternion integration + frame stabilisation
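
As an illustration of what "energy-valley segmentation" could look like, the sketch below measures motion energy as the mean absolute difference between consecutive frames and places cycle boundaries at well-separated local minima of that signal. This is one reading of the term under stated assumptions, not the code in cycle_detector.py; the function names and the `min_gap` parameter are hypothetical.

```python
import numpy as np

def motion_energy(frames: np.ndarray) -> np.ndarray:
    """Mean absolute difference between consecutive frames; length N - 1."""
    f = frames.astype(np.float32)
    return np.abs(f[1:] - f[:-1]).mean(axis=tuple(range(1, f.ndim)))

def energy_valleys(energy, min_gap: int) -> list:
    """Indices of interior local minima that are at least min_gap apart.

    Candidates are visited from lowest energy upward, so when two valleys
    are closer than min_gap the deeper one is kept.
    """
    e = np.asarray(energy, dtype=np.float64)
    is_min = (e[1:-1] < e[:-2]) & (e[1:-1] <= e[2:])
    candidates = np.flatnonzero(is_min) + 1
    kept = []
    for idx in candidates[np.argsort(e[candidates])]:
        if all(abs(int(idx) - k) >= min_gap for k in kept):
            kept.append(int(idx))
    return sorted(kept)

# Synthetic energy trace with a 50-frame period: motion peaks mid-cycle
# and drops to zero where one repetition ends and the next begins.
t = np.arange(200)
energy = 1.0 - np.cos(2 * np.pi * t / 50)
boundaries = energy_valleys(energy, min_gap=25)
print(boundaries)
```

In practice the energy trace would likely need smoothing before the minima are picked, since raw frame differences are noisy.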

Run tests

pip install pytest numpy scipy Pillow
pytest tests/

License

Apache 2.0: see LICENSE.

Contributing

PRs welcome. Key areas:

Decoder web UI: drop a .dfy file at diffy.tech/decode, export to MP4/ZIP
Streaming encoder: encode while recording, don’t buffer all frames
Better cycle detection: optical flow instead of frame-diff energy
Range coder: replace zlib with arithmetic coding for ~15% extra gain
Benchmarks: compression ratio and PSNR vs H.265 on Egocentric-10K
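
For the benchmark item, the two metrics can be computed with standard definitions. The helper names below are hypothetical and not part of the repository; the sketch assumes 8-bit frames, so the peak value is 255.

```python
import numpy as np

def psnr(reference: np.ndarray, decoded: np.ndarray, peak: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB between two same-shaped frames."""
    ref = reference.astype(np.float64)
    dec = decoded.astype(np.float64)
    mse = np.mean((ref - dec) ** 2)
    if mse == 0:
        return float("inf")  # identical frames
    return float(10.0 * np.log10(peak ** 2 / mse))

def compression_ratio(raw_bytes: int, compressed_bytes: int) -> float:
    """How many times smaller the compressed stream is than the raw one."""
    return raw_bytes / compressed_bytes

# Tiny check: every sample off by exactly 1 gives MSE = 1.
a = np.zeros((4, 4, 3), dtype=np.uint8)
b = a + 1
print(f"{psnr(a, b):.2f} dB")
```

Averaging per-frame PSNR over a clip, and reporting it next to the compression ratio at matched quality settings, would make a comparison against H.265 easier to interpret.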
