`backend`/`device_id` has been split into two stages:
- decode stage: `decode_backend`, `decode_device_id`
- process stage (patch pipeline only): `process_backend`, `process_device_id`
This is a breaking API upgrade. Old fields are removed (no compatibility shim).
Auto resolution rules:
- `decode_backend=auto`: uses `gpu` if CUDA is available, else `cpu`
- `process_backend=auto`: follows the resolved `decode_backend`
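The resolution rules above can be sketched in plain Python. This is an illustrative sketch, not the library's internal code; the function name `resolve_backends` is an assumption for the example.

```python
# Illustrative sketch (not the library's internals) of how "auto" values
# resolve into concrete backends, per the rules above.
def resolve_backends(decode_backend: str, process_backend: str,
                     cuda_available: bool) -> tuple[str, str]:
    if decode_backend == "auto":
        # decode_backend=auto: gpu if CUDA is available, else cpu
        decode_backend = "gpu" if cuda_available else "cpu"
    if process_backend == "auto":
        # process_backend=auto: follows the resolved decode_backend
        process_backend = decode_backend
    return decode_backend, process_backend
```

For example, `resolve_backends("auto", "auto", cuda_available=False)` resolves both stages to `cpu`.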
Output device rule:
- `process_backend=cpu`: `patches` and `metadata_tensors` are CPU tensors
- `process_backend=gpu`: `patches` and `metadata_tensors` are on `cuda:process_device_id`
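The output device rule maps directly to a PyTorch device string. A minimal sketch, assuming the rule above (the helper `expected_output_device` is not part of the library):

```python
# Illustrative helper (not shipped with the library): the device string
# that patches and metadata_tensors should live on, per the rule above.
def expected_output_device(process_backend: str, process_device_id: int) -> str:
    if process_backend == "cpu":
        return "cpu"  # patches and metadata_tensors are CPU tensors
    return f"cuda:{process_device_id}"  # on cuda:<process_device_id>
```

This can be used in sanity checks, e.g. comparing `str(session.patches.device)` against the expected value.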
codec-patch-stream provides two native APIs:
- `decode_only(DecodeConfig)`: uniformly sampled decode output
- `patch_stream(PatchStreamConfig)`: decode + energy-based patch selection
```bash
uv venv .venv
source .venv/bin/activate
uv pip install --python .venv/bin/python torch
```

CPU-only build:

```bash
CODEC_BUILD_NATIVE=1 CODEC_ENABLE_GPU=0 \
uv pip install -e . --python ./.venv/bin/python --no-build-isolation
```

CPU+GPU build:

```bash
CODEC_BUILD_NATIVE=1 CODEC_ENABLE_GPU=1 \
uv pip install -e . --python ./.venv/bin/python --no-build-isolation
```

```python
from codec_patch_stream import PatchStreamConfig, patch_stream

session = patch_stream(
    PatchStreamConfig(
        video_path="/path/to/video.mp4",
        sequence_length=16,
        decode_backend="cpu",    # auto|cpu|gpu
        process_backend="gpu",   # auto|cpu|gpu
        decode_device_id=0,
        process_device_id=0,
        uniform_strategy="auto",
        input_size=224,
        patch_size=14,
        k_keep=2048,
        output_dtype="bf16",
    )
)
patches = session.patches
meta_fields, meta_scores = session.metadata_tensors
session.close()
```

Supported combinations (GPU build):
- `cpu -> cpu`
- `cpu -> gpu`
- `gpu -> cpu`
- `gpu -> gpu`
```python
from codec_patch_stream import DecodeConfig, decode_only

decoded = decode_only(
    DecodeConfig(
        video_path="/path/to/video.mp4",
        sequence_length=16,
        decode_backend="auto",  # auto|cpu|gpu
        decode_device_id=0,
        uniform_strategy="auto",
    )
)
frames = decoded.frames
```

- Old: `DecodeConfig(backend=..., device_id=...)`
  New: `DecodeConfig(decode_backend=..., decode_device_id=...)`
- Old: `PatchStreamConfig(backend=..., device_id=...)`
  New: `PatchStreamConfig(decode_backend=..., process_backend=..., decode_device_id=..., process_device_id=...)`
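For callers migrating many config sites, the old-to-new field mapping can be expressed as a small kwargs rewrite. A hypothetical helper, not shipped with the library; it targets `PatchStreamConfig` (for `DecodeConfig`, drop the two `process_*` keys):

```python
# Hypothetical migration helper (not part of the library): maps old-style
# kwargs (backend, device_id) onto the new split decode/process fields.
def migrate_config_kwargs(old_kwargs: dict) -> dict:
    new_kwargs = {k: v for k, v in old_kwargs.items()
                  if k not in ("backend", "device_id")}
    if "backend" in old_kwargs:
        # The old single backend applied to both stages.
        new_kwargs["decode_backend"] = old_kwargs["backend"]
        new_kwargs["process_backend"] = old_kwargs["backend"]
    if "device_id" in old_kwargs:
        new_kwargs["decode_device_id"] = old_kwargs["device_id"]
        new_kwargs["process_device_id"] = old_kwargs["device_id"]
    return new_kwargs
```

This preserves the old behavior (one backend/device for everything) while letting you split the stages later.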
| Decode -> Process | Recommended when |
|---|---|
| `cpu -> cpu` | no GPU runtime, or lightweight workloads |
| `cpu -> gpu` | CPU decode is stronger but patch compute is heavy |
| `gpu -> cpu` | GPU decode preferred but CPU-side downstream processing is expected |
| `gpu -> gpu` | high-throughput, fully-GPU pipeline |
```bash
python benchmark/benchmark_codec_apis.py /path/to/videos \
    --apis decode_only patch_stream decord \
    --decode-backends cpu gpu \
    --process-backends cpu gpu \
    --limit 10
```