Release v1.4.0

github-actions released this 01 May 19:01
· 121 commits to main since this release

Added

  • SAM3-based video object tracking, per-event VLM captioning, serialized SAM3 outputs, an example
    event pipeline, and a demo tool.
  • Sensor library support for GPS, IMU, camera intrinsics, and camera extrinsics data.
  • MP4 header validation utilities for video-index checks.
  • Qwen3.5-27B support for image captioning.
  • External OpenAI/Gemini endpoint support for image semantic filtering and classification stages.
  • Async OpenAI/Gemini request handling with batch_size-controlled concurrency for image/video
    captioning and external filter/classifier stages.
  • exclusive_end_ns support in make_ts_grid for half-open clip spans.
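
Illustration for the async request handling entry above: these notes do not say how batch_size bounds concurrency, so the following is only a minimal sketch of one common approach, an asyncio semaphore sized by batch_size. The names run_bounded and fake_caption are hypothetical and are not part of the curator API; fake_caption stands in for an external OpenAI/Gemini request.

```python
import asyncio


async def run_bounded(items, worker, batch_size):
    """Run worker(item) for every item with at most batch_size calls in flight."""
    semaphore = asyncio.Semaphore(batch_size)

    async def guarded(item):
        # Each call waits here until one of the batch_size slots is free.
        async with semaphore:
            return await worker(item)

    # asyncio.gather returns results in the same order as the inputs.
    return await asyncio.gather(*(guarded(item) for item in items))


async def fake_caption(frame_id):
    # Hypothetical stand-in for an external captioning request.
    await asyncio.sleep(0.01)
    return f"caption for frame {frame_id}"


captions = asyncio.run(run_bounded(range(5), fake_caption, batch_size=2))
```

With batch_size=2, at most two simulated requests run at once, and the results still come back in input order.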
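
Illustration for the exclusive_end_ns entry above: a half-open span [start, end) includes its start timestamp and excludes its end, so back-to-back clips never share a grid point. The real make_ts_grid signature is not shown in these notes; the helper below, half_open_grid, and its parameters start_ns and step_ns are assumptions used only to show the idea.

```python
def half_open_grid(start_ns, exclusive_end_ns, step_ns):
    """Return timestamps in [start_ns, exclusive_end_ns) spaced step_ns apart.

    Hypothetical helper, not the actual make_ts_grid implementation.
    """
    if step_ns <= 0:
        raise ValueError("step_ns must be positive")
    # range() already excludes its stop value, which matches half-open spans.
    return list(range(start_ns, exclusive_end_ns, step_ns))


# Two adjacent one-second clips sampled every 250 ms.
first = half_open_grid(0, 1_000_000_000, 250_000_000)
second = half_open_grid(1_000_000_000, 2_000_000_000, 250_000_000)
```

Because the end is exclusive, the timestamp 1_000_000_000 belongs only to the second clip, so the boundary sample is not counted twice.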

Fixed

  • Prevent Qwen from falling back to native-resolution inputs during resize.
  • Isolate vLLM async per-window payload handling.
  • Preserve model-variant-specific image filter errors.

Changed

  • Upgrade the cosmos-xenna Python package and submodule to v0.4.0.
  • Add a dedicated sam3 pixi environment for Segment Anything 3 dependencies.
  • Include runtime prompt and config data files in built wheels.

Documentation

  • Reorganize curator documentation into design, guide, and reference sections.
  • Add the interactive Slurm guide.
  • Add GPS and IMU sensor-library design documentation.
  • Update image pipeline documentation, including Qwen3.5 coverage.