Release 0.4.0 by ankit-v2-3 · Pull Request #59 · video-db/videodb-python

ankit-v2-3 · 2026-01-06T13:06:26Z

Pull Request

Description:
Version 0.4.0 introduces real-time streaming and desktop capture capabilities to the VideoDB Python SDK. This release adds support for capturing audio/video from local devices, managing capture sessions in the cloud, and real-time transcription for live streams. WebSocket support enables real-time event handling for live workflows.

Major Features

Desktop Capture SDK

New CaptureClient class for local desktop recording with async/await support
- request_permission() - Request microphone and screen capture permissions
- list_channels() - Discover available audio/video input devices
- start_session() / stop_session() - Control recording sessions
- pause() / resume() - Pause and resume individual channels
Channel Management
- Channel, AudioChannel, VideoChannel classes
- ChannelList with .default property for easy access to default channel
- Channels container with grouped access:
  - channels.mics.default / channels.displays.default / channels.system_audio.default
  - channels.all() - Get all channels as flat list
- Configurable store property on channels
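
The grouped channel access described above can be pictured with a small stand-in model. This is only a sketch: the classes below are not the SDK's implementation, the rule that the first channel in a group is the default is an assumption, and the device names are made up for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Channel:
    """Stand-in for a capture channel; not the SDK's class."""
    name: str
    kind: str            # "mic", "display", or "system_audio"
    store: bool = False  # configurable; the commit notes say it defaults to False


class ChannelList(list):
    """List of channels with a convenience .default accessor."""

    @property
    def default(self) -> Optional[Channel]:
        # Assumption: the first discovered channel is treated as the default.
        return self[0] if self else None


@dataclass
class Channels:
    mics: ChannelList = field(default_factory=ChannelList)
    displays: ChannelList = field(default_factory=ChannelList)
    system_audio: ChannelList = field(default_factory=ChannelList)

    def all(self) -> List[Channel]:
        # Flatten the three groups into a single list.
        return [*self.mics, *self.displays, *self.system_audio]


channels = Channels(
    mics=ChannelList([Channel("Built-in Mic", "mic")]),
    displays=ChannelList([Channel("Display 1", "display")]),
    system_audio=ChannelList([Channel("System Audio", "system_audio")]),
)

mic = channels.mics.default
mic.store = True  # opt this one channel in to storage
```

Keeping `.default` on the list type rather than on the container means every group gets the same accessor without one attribute per device kind.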

Capture Session Management

New CaptureSession class for cloud-side session handling
- Track session status, RTStreams, and metadata
- get_rtstream(category) - Filter RTStreams by channel type (mic, screen, system_audio)
Connection methods:
- create_capture_session() - Initialize new capture sessions
- get_capture_session() / list_capture_sessions() - Retrieve sessions
- generate_client_token() - Generate time-limited tokens for capture operations
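
As a rough illustration of filtering RTStreams by channel type, the sketch below uses a hypothetical `StreamInfo` record and a hypothetical `filter_rtstreams` helper. The constant names come from this PR; their string values, and whether get_rtstream() returns one stream or several, are assumptions.

```python
from dataclasses import dataclass
from typing import List


class RTStreamChannelType:
    # Names as listed in this PR; the string values are assumed to match.
    mic = "mic"
    screen = "screen"
    system_audio = "system_audio"


@dataclass
class StreamInfo:
    """Hypothetical stand-in for an RTStream attached to a session."""
    id: str
    category: str


def filter_rtstreams(streams: List[StreamInfo], category: str) -> List[StreamInfo]:
    # Keep only the streams whose category matches the requested channel type.
    return [s for s in streams if s.category == category]


streams = [
    StreamInfo("rts-1", RTStreamChannelType.mic),
    StreamInfo("rts-2", RTStreamChannelType.screen),
    StreamInfo("rts-3", RTStreamChannelType.system_audio),
]
screen_streams = filter_rtstreams(streams, RTStreamChannelType.screen)
```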

WebSocket Support

New WebSocketConnection class for real-time event streaming
- connect_websocket() on Connection for establishing WebSocket connections
- Async message sending and receiving
- Context manager support (async with)
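
The `async with` pattern mentioned above can be sketched with an in-memory stand-in. `FakeWebSocketConnection` and its `send`/`receive` methods are hypothetical and perform no network I/O; the real WebSocketConnection method names and message shapes may differ.

```python
import asyncio


class FakeWebSocketConnection:
    """Hypothetical in-memory stand-in; it performs no network I/O."""

    def __init__(self):
        self.connected = False
        self._inbox = asyncio.Queue()

    async def __aenter__(self):
        self.connected = True   # a real client would open the socket here
        return self

    async def __aexit__(self, exc_type, exc, tb):
        self.connected = False  # always runs, even if the body raised
        return False            # do not swallow exceptions

    async def send(self, message: dict) -> None:
        # Echo the message back so the example has something to receive.
        await self._inbox.put({"echo": message})

    async def receive(self) -> dict:
        return await self._inbox.get()


async def main():
    ws = FakeWebSocketConnection()
    async with ws:
        await ws.send({"type": "ping"})
        reply = await ws.receive()
    return reply, ws.connected


reply, still_connected = asyncio.run(main())
```

The point of the context manager is that cleanup in `__aexit__` happens on every exit path, so callers do not need their own `try`/`finally`.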

Authentication

New session_token parameter for videodb.connect()
- Alternative authentication method alongside api_key
- Enables temporary/scoped access for client applications

RTStream Enhancements

Transcript support:
- start_transcript() / stop_transcript() - Control real-time transcription
- get_transcript() - Retrieve transcription with pagination
Specialized indexing methods:
- index_audio() - Index audio content
- index_visuals() - Index visual content
Export functionality:
- export() - Export RTStream recording to video/audio asset
- RTStreamExportResult - Result class with video_id, stream_url, player_url
WebSocket integration:
- ws_connection_id parameter support across RTStream methods for real-time updates
connect_rtstream() improvements:
- New media_types parameter (replaces audio/video boolean flags)
- New store parameter to enable recording storage for export

Video Enhancements

New clip() method - Generate AI-powered clips from video using prompts
New index_visuals() method - Index visual scenes with configurable batch extraction
New index_audio() method - Index audio content (alias for spoken word indexing)

Improvements

Added RTStreamChannelType constants: mic, screen, system_audio
Enhanced Shot objects with scene_index_id, scene_index_name, and metadata fields
Added segmentation_type parameter to index_spoken_words() with LLM segmentation support
Fixed upload from collection objects defaulting to wrong collection
Corrected zero-score threshold behavior in search defaults
Timeline editor handles large payloads via file upload
Windows console encoding support (non-UTF-8 handling)
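
The PR does not say what the zero-score threshold bug was. One common way an explicit zero gets lost in Python is a truthiness-based fallback, sketched below purely as an assumption; the function names and the default value are illustrative and not taken from the SDK.

```python
DEFAULT_SCORE_THRESHOLD = 0.2  # illustrative value, not the SDK's default


def threshold_buggy(score_threshold=None):
    # `or` treats 0 and 0.0 as "missing", so an explicit zero is lost.
    return score_threshold or DEFAULT_SCORE_THRESHOLD


def threshold_fixed(score_threshold=None):
    # Only fall back when the caller passed nothing at all.
    return DEFAULT_SCORE_THRESHOLD if score_threshold is None else score_threshold
```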

Dependencies

Added: websockets>=11.0.3 as mandatory dependency
New optional extra: videodb[capture] for desktop capture (videodb-capture-bin>=0.2.7)
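
For orientation, the two dependency changes would typically be declared along these lines in a setuptools-based setup.py. This is a sketch of only the relevant keyword arguments, assuming the usual `install_requires` / `extras_require` layout; the package's other dependencies are omitted.

```python
# Sketch of the relevant setup() keyword arguments; other arguments are omitted.
install_requires = [
    "websockets>=11.0.3",  # now a mandatory dependency
]
extras_require = {
    # Installed with: pip install "videodb[capture]"
    "capture": ["videodb-capture-bin>=0.2.7"],
}
```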

Feat/sdk sync workflows

Rename the CaptureClient parameter from upload_token to client_token for consistency with the generate_client_token() method. This improves API naming consistency and matches user expectations.
- Rename __init__ parameter: upload_token -> client_token
- Update docstring to reflect new parameter name
- Keep "uploadToken" in binary protocol payload (required by recorder)
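
A minimal sketch of the naming boundary this commit describes: the Python-facing parameter is client_token, while the key sent to the recorder stays "uploadToken". The class name, the `_init_payload` helper, and the use of JSON for the payload are assumptions made for illustration.

```python
import json


class CaptureClientSketch:
    """Hypothetical stand-in; only the naming boundary is illustrated."""

    def __init__(self, client_token: str):
        # Public, Python-facing name after the rename.
        self.client_token = client_token

    def _init_payload(self) -> str:
        # The wire key stays "uploadToken" because the recorder binary expects it.
        return json.dumps({"uploadToken": self.client_token})


client = CaptureClientSketch(client_token="token-from-generate_client_token")
payload = json.loads(client._init_payload())
```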

- Add session_token parameter to connect() function as alternative to api_key
- Update Connection class to accept and handle session_token authentication
- Update .gitignore to exclude videodb-recorder binary
- Update capture_bin package manifest

This enables frontend clients to create WebSocket connections using time-bound session tokens for capture operations.

Changed store from hardcoded True to a configurable property that defaults to False.

- Reorganize capture binaries into platform folders (darwin_arm64, darwin_x86_64, win_amd64)
- Update package_data glob pattern to include subfolders (bin/**/*)
- Add platform detection logic to select correct binary at runtime
- Export RTStreamChannelType from videodb package
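
Runtime platform detection of this kind is often built on the standard `platform` module. The sketch below maps the current system and machine onto the three folder names listed in this commit; the function name, the lookup table, and the treatment of "x86_64" as an alias for "amd64" on Windows are assumptions.

```python
import platform
from typing import Optional

# Folder names are the ones listed in this PR; the machine aliases are assumptions.
_FOLDERS = {
    ("darwin", "arm64"): "darwin_arm64",
    ("darwin", "x86_64"): "darwin_x86_64",
    ("windows", "amd64"): "win_amd64",
    ("windows", "x86_64"): "win_amd64",
}


def select_binary_folder(system: Optional[str] = None,
                         machine: Optional[str] = None) -> str:
    """Return the platform folder for the given (or current) system/machine."""
    system = (system or platform.system()).lower()
    machine = (machine or platform.machine()).lower()
    try:
        return _FOLDERS[(system, machine)]
    except KeyError:
        raise RuntimeError(f"No capture binary for {system}/{machine}") from None
```

Accepting `system` and `machine` as optional arguments keeps the lookup testable on any host, for example `select_binary_folder("Windows", "AMD64")`.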

Use errors="replace" when decoding stdout/stderr from recorder binary to handle Windows console encoding (CP1252) gracefully.
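
The effect of errors="replace" can be seen with plain `bytes.decode`. The sample text below is made up; it uses CP1252 curly quotes (bytes 0x93 and 0x94), which are not valid UTF-8 on their own.

```python
# Bytes as a CP1252 console might emit them: curly quotes become 0x93 and 0x94,
# which are not valid UTF-8 start bytes.
raw = "Recorder \u201cready\u201d".encode("cp1252")

# Strict decoding raises on the first invalid byte.
try:
    raw.decode("utf-8")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# errors="replace" substitutes U+FFFD for each undecodable byte instead.
text = raw.decode("utf-8", errors="replace")
```

The replaced characters are lost, but the rest of the line survives and the reader loop does not crash, which is usually the right trade-off for log output.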

The second __all__ was overwriting the first, causing capture classes (CaptureClient, Channel, etc.) to be missing from exports.
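
The overwrite is ordinary Python assignment semantics and can be reproduced in isolation. The module sources below are illustrative strings, not the SDK's actual __init__.py, and the name lists are shortened.

```python
import types

# Broken layout: the second assignment replaces the first.
broken_src = '''
__all__ = ["CaptureClient", "Channel"]
__all__ = ["connect"]
'''

# Fixed layout: one merged list keeps every name.
fixed_src = '''
__all__ = ["connect", "CaptureClient", "Channel"]
'''


def exported_names(source: str) -> list:
    # Execute the source in a throwaway module and read back its __all__.
    module = types.ModuleType("demo")
    exec(source, module.__dict__)
    return module.__all__


broken = exported_names(broken_src)
fixed = exported_names(fixed_src)
```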

- Remove capture_bin/ from main SDK repo (will live in separate repo)
- Add ChannelList class with .default property for cleaner API
- Change API: channels.default_mic -> channels.mics.default
- Export ChannelList from package

Breaking change: channels.default_mic, channels.default_display, channels.default_system_audio replaced with channels.mics.default, channels.displays.default, channels.system_audio.default

- Bump videodb-capture-bin from >=0.2.4 to >=0.2.5
- Remove capture_bin from .gitignore (moved to separate repo)

add index methods

…_rtstream
- Convert MediaType to str Enum for strict type enforcement
- Replace video: bool and audio: bool params with media_types: List[str]
- Default media_types to [MediaType.video] when not specified
- Add validation to reject invalid media type values
- Bump videodb-capture-bin dependency to >=0.2.7
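
The media_types behaviour described in this commit (str-based enum, default of video only, rejection of unknown values) can be sketched as follows. The enum here lists only the two members implied by the old video/audio flags, and `normalize_media_types` and `from_legacy_flags` are hypothetical helper names, not SDK functions.

```python
from enum import Enum
from typing import List, Optional


class MediaType(str, Enum):
    # Only the members implied by this PR's text; values are assumed.
    video = "video"
    audio = "audio"


def normalize_media_types(media_types: Optional[List[str]] = None) -> List[str]:
    """Return validated media type strings, defaulting to video only."""
    if media_types is None:
        return [MediaType.video.value]
    allowed = {m.value for m in MediaType}
    normalized = []
    for item in media_types:
        value = item.value if isinstance(item, MediaType) else item
        if value not in allowed:
            raise ValueError(f"Invalid media type: {item!r}")
        normalized.append(value)
    return normalized


def from_legacy_flags(video: bool = True, audio: bool = False) -> List[str]:
    """Map the old boolean flags onto the new list form (hypothetical helper)."""
    flags = [(MediaType.video, video), (MediaType.audio, audio)]
    return [m.value for m, enabled in flags if enabled]
```

Callers migrating from the old flags would replace something like `video=True, audio=True` with `media_types=["video", "audio"]`.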

- Add export() method to RTStream for exporting recordings as video/audio assets
- Add RTStreamExportResult class to hold export metadata
- Add store parameter to create_rtstream() for enabling recording storage
- Rename capture methods for consistency:
  - start_capture_session -> start_session
  - stop_capture -> stop_session


ankit-v2-3 and others added 30 commits June 23, 2025 12:54

feat: add timelinev2

8e79b15

fix: fit

a8fdf4e

build: update v

1e4a5f6

fix: image asset

2b51bfc

feat: add audio asset

2940995

fix: asset enum

2672b72

feat: add text asset

3537e39

fix: volume range

ac78d55

Add audio param support in connect rtstream

b28928e

Remove timeline v2 to avoid conflict

ac1fa60

Add get trasscript for rtstream

35a0240

Add joined status for meeting bot

2e7a564

feat: add caption asset

fcf4796

fix: caption animation

9d5a6b0

fix: border style

92dc137

feat: add timeline download

339e3a0

fix: position enum

5027f9a

fix: border style enum

b28968a

docs: add docstrings

b15ab4f

feat: increase api gateway timeout

33a271f

feat: add editor

c379a86

fix: asset start

2c3d152

docs: add docstrings

092100d

Merge branch 'main' into ankit/add-videodb-editor

7dee62d

fix: ci/cd

8282460

fix: ci/cd

d340301

fix: ci/cd

f3af108

fix: ci/cd

cc7e2b9

docs: add docstings

9bb1f56

fix: ci/cd

c1518c0

0xrohitgarg and others added 26 commits January 22, 2026 11:43

rtstream: ws_connection_id support for rtstream

33581ef

Add options in list rtstream

af57791

feat: github workflows for SDK sync

edacc97

feat: openai.yaml spec

4df7b41

fix: correct prompts file URL

ebf0172

feat: updated openapi.yaml

7af1f03

Merge pull request #61 from omgate234/feat/sdk-sync-workflows

4cde80a

Feat/sdk sync workflows

refactor: move websockets to mandatory dependencies in setup.py

10ac7e7

fix: include capture binaries in repository for git-based installation

8aaa1e1

refactor: make channel store property configurable

1ae28c7

Changed store from hardcoded True to a configurable property that defaults to False.

fix: handle non-UTF-8 binary output on Windows

b22a0ef

Use errors="replace" when decoding stdout/stderr from recorder binary to handle Windows console encoding (CP1252) gracefully.

fix: merge duplicate __all__ exports in __init__.py

235136b

The second __all__ was overwriting the first, causing capture classes (CaptureClient, Channel, etc.) to be missing from exports.

Update capture binary dependency to 0.2.5

3e4d854

- Bump videodb-capture-bin from >=0.2.4 to >=0.2.5 - Remove capture_bin from .gitignore (moved to separate repo)

add index methods

cd467dc

Merge pull request #64 from video-db/video-index-methods

b80be79

add index methods

Update audio index method

342fefa

remove default values for indexing functions

c4e1b46

Update openapi.yaml

4d93eff

Default engine: None for Rtstream.start_transript & Rtstream.stop_tra…

9186e39

…nscript

Merge branch 'release-0-4-0' of github.com:video-db/videodb-python in…

9ffdce3

…to release-0-4-0

ashish-spext approved these changes Feb 12, 2026

ashish-spext requested a review from 0xrohitgarg February 12, 2026 12:28

0xrohitgarg approved these changes Feb 12, 2026

ashish-spext merged commit 4800657 into main Feb 12, 2026
4 checks passed

ashish-spext merged 98 commits into main from release-0-4-0

5 participants
