feat: Multi-Provider Media Generation (Video, Audio, Image) #475
Open
santoshkumarradha wants to merge 14 commits into main
Performance
⚠ Regression detected:
📊 Coverage gate
Thresholds from
✅ Gate passed
No surface regressed past the allowed threshold and the aggregate stayed above the floor.
📐 Patch coverage gate
Threshold: 80% on lines this PR touches vs
✅ Patch gate passed
Every surface whose lines were touched by this PR has patch coverage at or above the threshold.
…tion (#468)
* feat(go-sdk): add MediaProvider interface and OpenRouter media generation (#468)
  Adds MediaProvider interface, MediaRouter for model-prefix-based dispatch, and OpenRouterMediaProvider supporting image, audio, and video generation.
* fix(02): CR-01 validate job ID to prevent SSRF via path traversal
* fix(02): CR-02+WR-02 use context.WithTimeout for poll loop, add transient error retry
* fix(02): CR-03 increase SSE scanner buffer to 1MB for large audio chunks
* fix(02): WR-01 cap io.ReadAll with 10MB LimitReader on all HTTP responses
* fix(02): WR-03 validate API key non-empty, return error from constructor
* fix(02): WR-05+WR-06 validate non-empty prompt/text before API calls
* fix(02): WR-07 return error on base64 decode failure instead of silent skip
* fix(02): IN-05 set VideoData.Filename to generated_video.mp4
* fix(02): WR-08 add full video poll lifecycle test and input validation tests
…a generation (#467)
Ports MediaProvider abstraction to TS SDK with VideoRequest/ImageRequest/AudioRequest types, MediaRouter prefix-based dispatch, and OpenRouterMediaProvider supporting video (async job polling), image, and audio (SSE stream) generation.
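The prefix-based dispatch that MediaRouter provides across the SDKs can be sketched roughly as follows. This is a minimal illustration written in Python, not the SDK's actual code: the class name `PrefixRouter`, its methods, and the longest-prefix-wins rule are all assumptions made for the example.

```python
from typing import Dict


class PrefixRouter:
    """Hypothetical stand-in for MediaRouter: maps model-name prefixes to providers."""

    def __init__(self) -> None:
        self._providers: Dict[str, object] = {}

    def register(self, prefix: str, provider: object) -> None:
        # An empty prefix would match every model, so reject it up front.
        if not prefix:
            raise ValueError("prefix must be non-empty")
        self._providers[prefix] = provider

    def resolve(self, model: str) -> object:
        # Collect every registered prefix that the model name starts with.
        matches = [p for p in self._providers if model.startswith(p)]
        if not matches:
            raise LookupError(f"no media provider registered for model {model!r}")
        # Assumption: the most specific (longest) prefix wins.
        return self._providers[max(matches, key=len)]
```

A table of prefixes like this replaces a chain of if/elif checks on the model name: adding a provider becomes one `register` call instead of another branch.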
…464)
* feat(python-sdk): add OpenRouter video generation via async polling (#464)
* fix(python-sdk): address code review findings for OpenRouter video (#464)
  CR-01: Add image_url to request body (was silently dropped)
  CR-02: Validate job_id format + enforce HTTPS-only video download URL
  HI-01: Add MAX_VIDEO_BYTES (500MB) size limit on video downloads
  HI-02: Add comment clarifying download uses no auth headers
  HI-03: Add transient poll error retry (max 3 consecutive 502/503/504)
  MD-01: Fix duration type to Optional[float], remove int() cast in agent_ai
  MD-03: Move poll sleep to end of loop (poll immediately on first iteration)
  LO-01: Truncate error response bodies to 500 chars
  LO-02: Move _error_messages to class constant _VIDEO_ERROR_MESSAGES
  IN-02: Add test for image_url passthrough in request body
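Several of these review fixes (CR-02, HI-03, MD-03, LO-01) interact inside one poll loop, so a compact sketch may help. Everything named here is an assumption for illustration: the job-ID pattern, the `fetch_status` callable, and the response fields `status`, `video_url`, and `error` are hypothetical and do not come from the SDK or the OpenRouter API. "Max 3 consecutive" is read here as "give up on the fourth consecutive transient error".

```python
import asyncio
import re
from typing import Awaitable, Callable, Tuple
from urllib.parse import urlparse

JOB_ID_RE = re.compile(r"^[A-Za-z0-9_-]{1,128}$")  # assumed job-ID format
TRANSIENT_STATUSES = {502, 503, 504}
MAX_CONSECUTIVE_TRANSIENT = 3


def validate_job_id(job_id: str) -> str:
    # Rejects slashes and dots so the ID cannot alter the request path.
    if not JOB_ID_RE.fullmatch(job_id):
        raise ValueError(f"invalid job id: {job_id!r}")
    return job_id


def require_https(url: str) -> str:
    parsed = urlparse(url)
    if parsed.scheme != "https" or not parsed.hostname:
        raise ValueError("video download URL must use https")
    return url


async def poll_video_job(
    job_id: str,
    fetch_status: Callable[[str], Awaitable[Tuple[int, dict]]],
    *,
    interval: float = 5.0,
    timeout: float = 600.0,
    sleep: Callable[[float], Awaitable[None]] = asyncio.sleep,
) -> str:
    """Poll until the job completes and return its HTTPS download URL."""
    validate_job_id(job_id)
    loop = asyncio.get_running_loop()
    deadline = loop.time() + timeout
    consecutive_transient = 0
    while True:
        # Poll first, sleep last: the first status check happens immediately.
        status_code, body = await fetch_status(job_id)
        if status_code in TRANSIENT_STATUSES:
            consecutive_transient += 1
            if consecutive_transient > MAX_CONSECUTIVE_TRANSIENT:
                raise RuntimeError(f"giving up after repeated HTTP {status_code}")
        elif status_code != 200:
            raise RuntimeError(f"poll failed with HTTP {status_code}")
        else:
            consecutive_transient = 0
            state = body.get("status")
            if state == "completed":
                return require_https(body["video_url"])
            if state == "failed":
                # Truncate provider error text to 500 characters.
                raise RuntimeError(str(body.get("error", "video generation failed"))[:500])
        if loop.time() >= deadline:
            raise TimeoutError(f"video job {job_id} did not finish within {timeout}s")
        await sleep(interval)
```

Passing `fetch_status` and `sleep` in as parameters keeps the loop testable without a network or real delays.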
Apply fixes from REVIEW-465.md:
- CR-01: Add aiohttp.ClientTimeout(total=300s) to SSE streaming
- CR-02: Add MAX_AUDIO_B64_BYTES (500MB) size guard
- HI-01: Extract _stream_openrouter_audio() shared helper (dedup ~90 lines)
- HI-02: Cache _openrouter_provider as lazy property (like _fal_provider)
- HI-03: Rename format -> audio_format internally to avoid builtin shadow
- ME-02: Use resp.content.readline() for proper SSE line parsing
- ME-03: Truncate error response body to 500 chars
- ME-04: Validate duration > 0 and <= 600
- LO-02: Replace deprecated get_event_loop with @pytest.mark.asyncio
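The size guard (CR-02) and duration check (ME-04) are both small input bounds. A minimal sketch of how they might look is below. The constant values come from the commit message; the helper names `validate_duration` and `BoundedAccumulator`, and the choice to let `None` pass through as "no duration given", are assumptions for illustration.

```python
from typing import List, Optional

MAX_AUDIO_B64_BYTES = 500 * 1024 * 1024  # 500MB cap named in the commit message
MAX_DURATION_SECONDS = 600


def validate_duration(duration: Optional[float]) -> Optional[float]:
    """Accept None (assumed to mean 'provider default') or 0 < duration <= 600."""
    if duration is None:
        return None
    # The chained comparison is also False for NaN, so NaN is rejected.
    if not (0 < duration <= MAX_DURATION_SECONDS):
        raise ValueError(f"duration must be > 0 and <= {MAX_DURATION_SECONDS}, got {duration}")
    return float(duration)


class BoundedAccumulator:
    """Collects base64 text chunks and fails once a size limit is exceeded."""

    def __init__(self, limit: int = MAX_AUDIO_B64_BYTES) -> None:
        self._limit = limit
        self._size = 0
        self._parts: List[str] = []

    def append(self, chunk: str) -> None:
        # Base64 text is ASCII, so character count equals byte count.
        new_size = self._size + len(chunk)
        if new_size > self._limit:
            raise ValueError(f"audio payload exceeds {self._limit} bytes")
        self._size = new_size
        self._parts.append(chunk)

    def value(self) -> str:
        return "".join(self._parts)
```

Checking the running total before storing each chunk means an oversized stream is rejected as soon as it crosses the limit, rather than after the whole response has been buffered.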
Apply fixes from REVIEW-ts-sdk-media.md:
- CR-01: Add AbortSignal.timeout() to all fetch calls (30s API, 120s download)
- CR-02: SSRF validation: assertSafeUrl() blocks non-HTTPS, localhost, private IPs
- CR-03: API key stored in WeakMap, toJSON() excludes key
- WR-01: Poll loop checks deadline after sleep, uses Math.min for sleep duration
- WR-02: Process remaining SSE buffer after stream ends
- WR-04: Track parse errors, throw MediaProviderError after 50 consecutive
- WR-05: Include model + endpoint in all error messages
- WR-06: MediaProviderError typed error class for programmatic handling
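The SSRF check in CR-02 lives in the TypeScript SDK, but the idea carries over to any language. The sketch below is a Python analogue written for illustration only. The function `assert_safe_url` is hypothetical, is not a port of the actual `assertSafeUrl()`, and covers only the three rules the commit names (non-HTTPS, localhost, private IPs) plus closely related address classes. It inspects literal hosts only and does not resolve DNS, so a hostname that resolves to a private address would still pass.

```python
import ipaddress
from urllib.parse import urlparse


def assert_safe_url(url: str) -> str:
    """Raise ValueError unless url is HTTPS and not an obviously internal host."""
    parsed = urlparse(url)
    if parsed.scheme != "https":
        raise ValueError("only https URLs are allowed")
    host = parsed.hostname  # lowercased, with IPv6 brackets removed
    if not host:
        raise ValueError("URL has no host")
    if host == "localhost" or host.endswith(".localhost"):
        raise ValueError("localhost is not allowed")
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal: treated as a public hostname (no DNS lookup here).
        return url
    if ip.is_private or ip.is_loopback or ip.is_link_local or ip.is_unspecified:
        raise ValueError(f"address {ip} is not allowed")
    return url
```

Running a check like this on every provider-supplied download URL before fetching it keeps a malicious or compromised response from steering the SDK toward internal services.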
- Python: 33 tests covering MediaRouter routing, OpenRouter video/audio/music lifecycle, AgentAI dispatch, MultimodalResponse consistency, error propagation, provider caching
- TypeScript: 28 tests covering MediaRouter, OpenRouter video/image/audio, SSRF protection (8 cases), MediaProviderError typing
- Go: 25 tests covering MediaRouter, OpenRouter video lifecycle with httptest, audio SSE, input validation, context cancellation
Keep dev/add-video version which includes ai_generate_music delegate and all media generation methods added during the milestone.
- Fix flaky harness test: DurationMS can be 0ms for near-instant stubs in CI; use GreaterOrEqual(0) instead of Positive assertion
- Go SDK: fix image response parsing for models returning content as string or null, handle Gemini-style message.images[], default audio format to pcm16
- Python SDK: replace readline-based SSE parsing with manual chunked parsing to handle >64KB base64 audio lines from music models
The live verification agent changed _stream_openrouter_audio() from readline() to iter_any() for handling large SSE lines. Update test fakes (_FakeContent and integration test mocks) to implement iter_any() as async generators instead of readline(). Fixes 12 test failures in CI: test_openrouter_audio.py and test_media_integration.py.
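The switch from readline() to iter_any() matters because a line-oriented reader can fail on a single SSE line larger than its buffer, while a chunk-oriented reader leaves line assembly to the caller. A minimal sketch of that manual assembly follows. The function names are hypothetical, the input is any async iterator of byte chunks (a test fake or `resp.content.iter_any()`), and skipping a `[DONE]` sentinel is an assumption about the stream format.

```python
import asyncio
from typing import AsyncIterator, Optional


def _data_payload(line: bytes) -> Optional[str]:
    """Return the payload of an SSE 'data:' line, or None for anything else."""
    text = line.decode("utf-8", errors="replace").rstrip("\r")
    if not text.startswith("data:"):
        return None  # comments, blank separators, other fields
    payload = text[len("data:"):].strip()
    if not payload or payload == "[DONE]":  # assumed end-of-stream sentinel
        return None
    return payload


async def iter_sse_data(chunks: AsyncIterator[bytes]) -> AsyncIterator[str]:
    """Yield 'data:' payloads from arbitrarily sized byte chunks."""
    buffer = b""
    async for chunk in chunks:
        buffer += chunk
        # Emit every complete line; a partial line stays buffered, however long.
        while b"\n" in buffer:
            line, buffer = buffer.split(b"\n", 1)
            payload = _data_payload(line)
            if payload is not None:
                yield payload
    # Process whatever remains if the stream ended without a trailing newline.
    payload = _data_payload(buffer)
    if payload is not None:
        yield payload
```

Because the test fakes only need to be async generators of bytes, the same parser runs unchanged against mocks and a live response body.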
Add coverage tests for branches not exercised by the existing media integration suite: optional video payload fields, submit/poll error paths, image config+inline-base64 fallback, Gemini-style images[], audio default voice, HTTP error, invalid SSE/base64 chunks, and RawStdEncoding fallback. Lifts patch coverage from 69% to 89%.
Summary
Epic PR for the Media Generation Milestone — adds multi-provider media generation (video, audio, image, music) across Python, TypeScript, and Go SDKs via OpenRouter and other providers.
Issues Closed
Closes #463, closes #464, closes #465, closes #466, closes #467, closes #468, closes #469, closes #470
What's Included
Wave 1 - Foundation (Merged ✅):
- MediaRouter: prefix-based provider dispatch replacing scattered if/elif chains
- image_config support for OpenRouter image generation
- VideoOutput type + video support in MultimodalResponse
Wave 2 - Python Generation (Merged ✅):
Wave 3 - Cross-SDK (Merged ✅):
- MediaProvider + OpenRouterMediaProvider
- MediaProvider + OpenRouterMediaProvider
Wave 4 - Quality (Complete ✅):
Code Review Fixes Applied
All implementation PRs received deep code reviews (54 findings total). Key fixes:
- aiohttp.ClientTimeout, AbortSignal.timeout(), context.WithTimeout
- io.LimitReader, bounded accumulation
Architecture
CI Status
Test Plan
- cd sdk/python && pytest (all tests pass, incl. 33 new integration tests)
- cd sdk/python && ruff check . && ruff format --check . (lint clean)
- cd sdk/typescript && bun run build && bun test (536 tests pass)
- cd sdk/go && go test ./... && go vet ./... (1008+ tests pass)
Documentation