Merged
Feat/sdk sync workflows
Rename the CaptureClient parameter from upload_token to client_token for consistency with the generate_client_token() method. This improves API naming consistency and matches user expectations.
- Rename __init__ parameter: upload_token -> client_token
- Update docstring to reflect new parameter name
- Keep "uploadToken" in binary protocol payload (required by recorder)
- Add session_token parameter to connect() function as alternative to api_key
- Update Connection class to accept and handle session_token authentication
- Update .gitignore to exclude videodb-recorder binary
- Update capture_bin package manifest
This enables frontend clients to create WebSocket connections using time-bound session tokens for capture operations.
Changed store from hardcoded True to a configurable property that defaults to False.
- Reorganize capture binaries into platform folders (darwin_arm64, darwin_x86_64, win_amd64)
- Update package_data glob pattern to include subfolders (bin/**/*)
- Add platform detection logic to select correct binary at runtime
- Export RTStreamChannelType from videodb package
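The platform detection described in the commit above could look roughly like the sketch below. The folder names come from the commit message; the function name `select_binary_dir`, the lookup table, and the `(system, machine)` keys are assumptions made for illustration, not the SDK's actual code.

```python
import platform
from pathlib import Path
from typing import Optional

# Folder names are taken from the commit message. The (system, machine) keys
# are assumed values of platform.system() / platform.machine() on each target.
_PLATFORM_FOLDERS = {
    ("Darwin", "arm64"): "darwin_arm64",
    ("Darwin", "x86_64"): "darwin_x86_64",
    ("Windows", "AMD64"): "win_amd64",
}


def select_binary_dir(
    bin_root: str,
    system: Optional[str] = None,
    machine: Optional[str] = None,
) -> Path:
    """Return the platform-specific subfolder under bin_root (hypothetical helper)."""
    # Fall back to the running interpreter's platform when not overridden.
    system = system or platform.system()
    machine = machine or platform.machine()
    try:
        folder = _PLATFORM_FOLDERS[(system, machine)]
    except KeyError:
        raise RuntimeError(f"Unsupported platform: {system} {machine}") from None
    return Path(bin_root) / folder
```

Passing `system` and `machine` explicitly keeps the lookup testable on any host; unsupported combinations fail loudly instead of falling back to a wrong binary.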
Use errors="replace" when decoding stdout/stderr from recorder binary to handle Windows console encoding (CP1252) gracefully.
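The effect of the decoding change above can be shown with a short sketch. The helper name `decode_recorder_output` and the sample bytes are hypothetical, and the sketch assumes the output is decoded as UTF-8, which the commit message does not state.

```python
def decode_recorder_output(raw: bytes, encoding: str = "utf-8") -> str:
    """Decode subprocess output without raising on undecodable bytes."""
    # errors="replace" substitutes U+FFFD for each invalid byte sequence
    # instead of raising UnicodeDecodeError.
    return raw.decode(encoding, errors="replace")


# 0x92 is a right single quotation mark in CP1252, but a lone 0x92 byte
# is not valid UTF-8.
cp1252_bytes = b"can\x92t open device"
decoded = decode_recorder_output(cp1252_bytes)
```

With the default strict error handling, the same call would raise `UnicodeDecodeError` and could take down the code that reads the recorder's output.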
The second __all__ was overwriting the first, causing capture classes (CaptureClient, Channel, etc.) to be missing from exports.
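A minimal reproduction of the export bug described above, assuming a module body with two `__all__` assignments. The exact contents of the SDK's `__all__` lists are not shown in this PR, so the names below are illustrative, and extending the list is only one possible fix.

```python
# Hypothetical module body: the second assignment rebinds __all__,
# so the capture names listed first are silently dropped.
buggy_source = '''
__all__ = ["CaptureClient", "Channel"]
__all__ = ["connect", "Connection"]
'''

# One possible fix: extend the existing list instead of rebinding it.
fixed_source = '''
__all__ = ["CaptureClient", "Channel"]
__all__ += ["connect", "Connection"]
'''


def exported_names(source: str) -> list:
    """Execute a module body and return the __all__ it ends up with."""
    namespace = {}
    exec(source, namespace)
    return namespace["__all__"]


buggy = exported_names(buggy_source)
fixed = exported_names(fixed_source)
```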
- Remove capture_bin/ from main SDK repo (will live in separate repo)
- Add ChannelList class with .default property for cleaner API
- Change API: channels.default_mic -> channels.mics.default
- Export ChannelList from package
Breaking change: channels.default_mic, channels.default_display, channels.default_system_audio replaced with channels.mics.default, channels.displays.default, channels.system_audio.default
- Bump videodb-capture-bin from >=0.2.4 to >=0.2.5
- Remove capture_bin from .gitignore (moved to separate repo)
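The grouped access pattern introduced by this breaking change can be sketched as follows. These classes are simplified stand-ins, not the SDK's implementation, and the sketch assumes that "default" means the first channel in a group, which the commit message does not specify.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Channel:
    """Stand-in for the SDK's channel type (hypothetical fields)."""
    name: str


class ChannelList(list):
    """A list of channels with a convenience accessor for the default one."""

    @property
    def default(self) -> Optional[Channel]:
        # Assumption: the default channel is the first entry, or None if empty.
        return self[0] if self else None


@dataclass
class Channels:
    """Container grouping channels by kind, mirroring channels.mics.default."""
    mics: ChannelList
    displays: ChannelList
    system_audio: ChannelList

    def all(self) -> List[Channel]:
        # Flatten the three groups into a single list.
        return [*self.mics, *self.displays, *self.system_audio]


channels = Channels(
    mics=ChannelList([Channel("mic-1"), Channel("mic-2")]),
    displays=ChannelList(),
    system_audio=ChannelList([Channel("system")]),
)
```

Code that previously read `channels.default_mic` would read `channels.mics.default` under the new layout.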
add index methods
…_rtstream
- Convert MediaType to str Enum for strict type enforcement
- Replace video: bool and audio: bool params with media_types: List[str]
- Default media_types to [MediaType.video] when not specified
- Add validation to reject invalid media type values
- Bump videodb-capture-bin dependency to >=0.2.7
- Add export() method to RTStream for exporting recordings as video/audio assets
- Add RTStreamExportResult class to hold export metadata
- Add store parameter to create_rtstream() for enabling recording storage
- Rename capture methods for consistency:
  - start_capture_session -> start_session
  - stop_capture -> stop_session
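The str Enum conversion and `media_types` validation described above might be sketched like this. The `MediaType` class here is a stand-in that lists only `video` and `audio`, the two values implied by the replaced boolean flags; the helper name `normalize_media_types` is hypothetical, and treating `None` as "not specified" is an assumption.

```python
from enum import Enum
from typing import List, Optional


class MediaType(str, Enum):
    """Stand-in enum; the SDK's own MediaType may define other members."""
    video = "video"
    audio = "audio"


def normalize_media_types(media_types: Optional[List[str]] = None) -> List[str]:
    """Apply the default and reject values that are not valid media types."""
    if media_types is None:
        # Default described in the commit message: [MediaType.video].
        return [MediaType.video.value]
    normalized = []
    for item in media_types:
        try:
            # MediaType(...) accepts both plain strings and enum members.
            normalized.append(MediaType(item).value)
        except ValueError:
            allowed = [m.value for m in MediaType]
            raise ValueError(
                f"Invalid media type {item!r}; expected one of {allowed}"
            ) from None
    return normalized
```

Because the enum subclasses `str`, members compare equal to their plain string values, so callers can pass either form.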
ashish-spext approved these changes on Feb 12, 2026
0xrohitgarg approved these changes on Feb 12, 2026
Pull Request
Description:
Version 0.4.0 introduces real-time streaming and desktop capture capabilities to the VideoDB Python SDK. This release adds support for capturing audio/video from local devices, managing capture sessions in the cloud, and real-time transcription for live streams. WebSocket support enables real-time event handling for live workflows.
Major Features
Desktop Capture SDK
New `CaptureClient` class for local desktop recording with async/await support
- `request_permission()` - Request microphone and screen capture permissions
- `list_channels()` - Discover available audio/video input devices
- `start_session()` / `stop_session()` - Control recording sessions
- `pause()` / `resume()` - Pause and resume individual channels
Channel Management
- `Channel`, `AudioChannel`, `VideoChannel` classes
- `ChannelList` with `.default` property for easy access to the default channel
- `Channels` container with grouped access: `channels.mics.default` / `channels.displays.default` / `channels.system_audio.default`
- `channels.all()` - Get all channels as a flat list
- `store` property on channels
Capture Session Management
New `CaptureSession` class for cloud-side session handling
- `get_rtstream(category)` - Filter RTStreams by channel type (mic, screen, system_audio)
Connection methods:
- `create_capture_session()` - Initialize new capture sessions
- `get_capture_session()` / `list_capture_sessions()` - Retrieve sessions
- `generate_client_token()` - Generate time-limited tokens for capture operations
WebSocket Support
- `WebSocketConnection` class for real-time event streaming
- `connect_websocket()` on Connection for establishing WebSocket connections
- Async context manager usage (`async with`)
Authentication
- `session_token` parameter for `videodb.connect()` as an alternative to `api_key`
RTStream Enhancements
Transcript support:
- `start_transcript()` / `stop_transcript()` - Control real-time transcription
- `get_transcript()` - Retrieve transcription with pagination
Specialized indexing methods:
- `index_audio()` - Index audio content
- `index_visuals()` - Index visual content
Export functionality:
- `export()` - Export RTStream recording to video/audio asset
- `RTStreamExportResult` - Result class with video_id, stream_url, player_url
WebSocket integration:
- `ws_connection_id` parameter support across RTStream methods for real-time updates
connect_rtstream() improvements:
- `media_types` parameter (replaces `audio`/`video` boolean flags)
- `store` parameter to enable recording storage for export
Video Enhancements
- `clip()` method - Generate AI-powered clips from video using prompts
- `index_visuals()` method - Index visual scenes with configurable batch extraction
- `index_audio()` method - Index audio content (alias for spoken word indexing)
Improvements
- `RTStreamChannelType` constants: `mic`, `screen`, `system_audio`
- `Shot` objects with `scene_index_id`, `scene_index_name`, and `metadata` fields
- `segmentation_type` parameter to `index_spoken_words()` with LLM segmentation support
Dependencies
- `websockets>=11.0.3` as mandatory dependency
- `videodb[capture]` for desktop capture (`videodb-capture-bin>=0.2.7`)