v0.1.1

leozitogs released this 16 Jun 13:45
· 3 commits to main since this release
9ada00d

First public release with corrected package metadata.

vid2llm extracts frames from video for multimodal LLM workflows and provides
a streaming Python API and a CLI. Frame extraction uses one of three decode
backends, selected automatically: OpenCV, PyAV, or ffmpeg.
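
To illustrate what automatic backend selection can look like, here is a minimal
sketch. It is not vid2llm's actual implementation: the names detect_backends,
choose_backend, and PRIORITY are hypothetical, and the priority order shown is
an assumption. It only relies on the standard library to check whether the
cv2 and av modules are importable and whether an ffmpeg executable is on PATH.

```python
import importlib.util
import shutil
from typing import Dict, Optional

# Assumed priority order, for illustration only.
PRIORITY = ("opencv", "pyav", "ffmpeg")


def detect_backends() -> Dict[str, bool]:
    """Report which decode backends appear usable on this machine."""
    return {
        "opencv": importlib.util.find_spec("cv2") is not None,
        "pyav": importlib.util.find_spec("av") is not None,
        "ffmpeg": shutil.which("ffmpeg") is not None,
    }


def choose_backend(available: Dict[str, bool], preferred: Optional[str] = None) -> str:
    """Return the preferred backend if usable, else the first usable one."""
    if preferred is not None:
        if not available.get(preferred, False):
            raise RuntimeError(f"backend {preferred!r} is not available")
        return preferred
    for name in PRIORITY:
        if available.get(name, False):
            return name
    raise RuntimeError("no decode backend available")
```

Calling choose_backend(detect_backends()) would pick the first usable backend
under the assumed order. Keeping detection separate from the choice makes the
selection logic easy to test without any of the backends installed.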

Changed

  • Corrected the published package description and project metadata on PyPI to
    reflect the shipped functionality.

Available in this release

  • Frame extraction with sampling by interval, count cap, and time window
  • Three decode backends with automatic selection
  • Streaming Python API and a typed CLI (probe and extract)
  • Output to jpg, png, or webp
  • Tested on Linux and Windows across Python 3.11, 3.12, and 3.13
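
The three sampling controls listed above (interval, count cap, time window) can
be combined in a single pass. The following is a rough sketch of that idea in
plain Python, not the library's API: the function name sample_timestamps and
its parameters are hypothetical, and treating the window as half-open
[start, end) is an assumption made for this example.

```python
from typing import Iterator, Optional


def sample_timestamps(
    duration: float,
    interval: float,
    max_frames: Optional[int] = None,
    start: float = 0.0,
    end: Optional[float] = None,
) -> Iterator[float]:
    """Yield timestamps in seconds at which a frame would be extracted."""
    if interval <= 0:
        raise ValueError("interval must be positive")
    # Clamp the window to the video duration (assumed half-open: [start, stop)).
    stop = duration if end is None else min(end, duration)
    index = 0
    while max_frames is None or index < max_frames:
        # Multiply instead of accumulating to avoid floating-point drift.
        t = start + index * interval
        if t >= stop:
            break
        yield t
        index += 1
```

Because the function is a generator, timestamps are produced lazily, which
matches the streaming style described for the Python API. For example,
list(sample_timestamps(10.0, 2.0, max_frames=3)) yields the first three
timestamps of a 10-second video sampled every 2 seconds.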

On the roadmap

  • Scene-aware and motion-based sampling
  • OCR text extraction
  • Direct adapters for multimodal provider SDKs

Full changelog: https://github.com/leozitogs/vid2llm/blob/main/CHANGELOG.md