Release v1.4.0

github-actions released this 01 May 19:01


29f1358

Added

SAM3-based video object tracking, per-event VLM captioning, serialized SAM3 outputs, an example
event pipeline, and a demo tool.
Sensor library support for GPS, IMU, camera intrinsics, and camera extrinsics data.
MP4 header validation utilities for video-index checks.
Qwen3.5-27B support for image captioning.
External OpenAI/Gemini endpoint support for image semantic filtering and classification stages.
Async OpenAI/Gemini request handling with batch_size-controlled concurrency for image/video
captioning and external filter/classifier stages.
exclusive_end_ns support in make_ts_grid for half-open clip spans.
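To illustrate the half-open clip spans mentioned in the last item, here is a minimal sketch of a timestamp grid over [start, end) versus [start, end]. It is not the actual make_ts_grid implementation: the function name ts_grid_sketch, its parameters, and the boolean end_exclusive flag are hypothetical, and the real signature and the semantics of exclusive_end_ns may differ.

```python
def ts_grid_sketch(
    start_ns: int,
    end_ns: int,
    step_ns: int,
    *,
    end_exclusive: bool = False,
) -> list[int]:
    """Return evenly spaced integer-nanosecond timestamps starting at start_ns.

    Hypothetical helper for illustration only (not the library's make_ts_grid).
    end_exclusive=False gives a closed span [start_ns, end_ns];
    end_exclusive=True gives a half-open span [start_ns, end_ns).
    """
    if step_ns <= 0:
        raise ValueError("step_ns must be positive")
    if end_ns < start_ns:
        raise ValueError("end_ns must not precede start_ns")
    # range() already excludes its stop value, so the closed case extends it by 1 ns.
    stop = end_ns if end_exclusive else end_ns + 1
    return list(range(start_ns, stop, step_ns))


SECOND_NS = 1_000_000_000

# Two adjacent one-second clips sampled every 0.5 s.
closed_a = ts_grid_sketch(0, SECOND_NS, SECOND_NS // 2)
closed_b = ts_grid_sketch(SECOND_NS, 2 * SECOND_NS, SECOND_NS // 2)
half_open_a = ts_grid_sketch(0, SECOND_NS, SECOND_NS // 2, end_exclusive=True)
half_open_b = ts_grid_sketch(SECOND_NS, 2 * SECOND_NS, SECOND_NS // 2, end_exclusive=True)
```

With closed spans, the boundary timestamp at 1 s appears in both clips. With half-open spans, it belongs only to the second clip, so adjacent clips do not sample the same frame twice.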

Fixed

Prevent Qwen from falling back to native-resolution inputs during resize.
Isolate vLLM async per-window payload handling.
Preserve model-variant-specific image filter errors.

Changed

Upgrade the cosmos-xenna Python package and submodule to v0.4.0.
Add a dedicated sam3 pixi environment for Segment Anything 3 dependencies.
Include runtime prompt and config data files in built wheels.

Documentation

Reorganize curator documentation into design, guide, and reference sections.
Add the interactive Slurm guide.
Add GPS and IMU sensor-library design documentation.
Update image pipeline documentation, including Qwen3.5 coverage.
