Release v0.1.1 · leozitogs/vid2llm

First public release with corrected package metadata.

vid2llm extracts frames from video for multimodal LLM workflows and ships
with a streaming Python API and a CLI. Frames are decoded by one of three
backends (OpenCV, PyAV, or ffmpeg), chosen automatically.
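
As a rough illustration of what automatic backend selection can look like, the sketch below checks which of the three decoders are usable in the current environment and picks the first available one. It is not vid2llm's actual code: the function names, the backend labels, and the preference order are all assumptions made for this example.

```python
import importlib.util
import shutil


def available_backends():
    """Return the decode backends that appear usable in this environment."""
    found = []
    # OpenCV's Python bindings are imported as `cv2`.
    if importlib.util.find_spec("cv2") is not None:
        found.append("opencv")
    # PyAV is imported as `av`.
    if importlib.util.find_spec("av") is not None:
        found.append("pyav")
    # Assumption: an ffmpeg backend needs the `ffmpeg` executable on PATH.
    if shutil.which("ffmpeg") is not None:
        found.append("ffmpeg")
    return found


def pick_backend(preferred=("opencv", "pyav", "ffmpeg"), available=None):
    """Pick the first backend in `preferred` that is available.

    The preference order here is hypothetical, not taken from vid2llm.
    """
    if available is None:
        available = available_backends()
    for name in preferred:
        if name in available:
            return name
    raise RuntimeError("no decode backend available")


if __name__ == "__main__":
    print(available_backends())
```

Passing `available` explicitly keeps the selection logic testable without installing any of the decoders.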

Changed

Corrected the published package description and project metadata on PyPI to
reflect the shipped functionality.

Available in this release

Frame extraction with sampling by interval, count cap, and time window
Three decode backends with automatic selection
Streaming Python API and a typed CLI with probe and extract commands
Output to jpg, png, or webp
Tested on Linux and Windows across Python 3.11, 3.12, and 3.13
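
The three sampling controls listed above (interval, count cap, and time window) combine naturally into a list of timestamps to decode. The sketch below shows one way to compute them. It is a generic illustration, not vid2llm's API: `sample_timestamps` and its parameter names are hypothetical, and treating the end of the window as exclusive is an assumption.

```python
def sample_timestamps(duration, interval, max_frames=None, start=0.0, end=None):
    """Return timestamps (seconds) at which to grab frames.

    duration   -- length of the video in seconds
    interval   -- spacing between sampled frames in seconds
    max_frames -- optional cap on the number of frames
    start, end -- optional time window; `end` is treated as exclusive
    """
    if interval <= 0:
        raise ValueError("interval must be positive")
    # Clamp the window to the bounds of the video.
    start = max(start, 0.0)
    end = duration if end is None else min(end, duration)

    timestamps = []
    i = 0
    while True:
        # Multiply instead of accumulating to avoid floating-point drift.
        t = start + i * interval
        if t >= end:
            break
        if max_frames is not None and len(timestamps) >= max_frames:
            break
        timestamps.append(t)
        i += 1
    return timestamps


print(sample_timestamps(10.0, 2.0))
# [0.0, 2.0, 4.0, 6.0, 8.0]
print(sample_timestamps(10.0, 2.0, max_frames=3, start=1.0, end=9.0))
# [1.0, 3.0, 5.0]
```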

On the roadmap

Scene-aware and motion-based sampling
OCR text extraction
Direct adapters for multimodal provider SDKs

Full changelog: https://github.com/leozitogs/vid2llm/blob/main/CHANGELOG.md
