[Python SDK] Add VideoOutput type and save helpers to MultimodalResponse #469

@santoshkumarradha

Description

Summary

Video results currently piggyback on FileOutput in MultimodalResponse.files[]. Add a dedicated VideoOutput type with video-specific metadata (duration, resolution, aspect_ratio, has_audio) and a videos property on MultimodalResponse.

Problem

Today, a generated video is stored as:

FileOutput(url="...", data=None, mime_type="video/mp4", filename="generated_video.mp4")

This drops video-specific metadata and forces users to inspect the entries in files[] to find the video:

# Current — awkward
result = await app.ai_generate_video(...)
video_file = result.files[0]  # Is this a video? A PDF? Who knows
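
As a rough illustration of the workaround this forces today, callers end up filtering files[] by MIME type. The FileOutput class below is a stand-in that carries only the four fields shown above, and find_video_files is a hypothetical helper, not part of the SDK.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class FileOutput:
    """Stand-in for the SDK's FileOutput; only the fields shown in this issue."""
    url: Optional[str] = None
    data: Optional[bytes] = None
    mime_type: Optional[str] = None
    filename: Optional[str] = None


def find_video_files(files: List[FileOutput]) -> List[FileOutput]:
    """Hypothetical helper: keep entries whose MIME type marks them as video."""
    return [f for f in files if (f.mime_type or "").startswith("video/")]


files = [
    FileOutput(url="https://example.com/report.pdf", mime_type="application/pdf"),
    FileOutput(
        url="https://example.com/generated_video.mp4",
        mime_type="video/mp4",
        filename="generated_video.mp4",
    ),
]
video_files = find_video_files(files)
```

Even with the filter in place, the result carries no duration, resolution, or audio information, which is the gap a dedicated type would close.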

Solution

# multimodal_response.py
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class VideoOutput:
    url: Optional[str] = None
    data: Optional[bytes] = None
    mime_type: str = "video/mp4"
    filename: Optional[str] = None
    duration: Optional[float] = None      # seconds
    resolution: Optional[str] = None       # "1080p"
    aspect_ratio: Optional[str] = None     # "16:9"
    has_audio: Optional[bool] = None
    cost_usd: Optional[float] = None

    async def save(self, path: str) -> None: ...
    async def get_bytes(self) -> bytes: ...

# MultimodalResponse additions
class MultimodalResponse:
    videos: List[VideoOutput]  # NEW

    @property
    def has_video(self) -> bool: ...  # NEW property
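
One possible way to wire this together is sketched below, under stated assumptions: FileOutput and MultimodalResponse are trimmed stand-ins rather than the SDK's real classes, and as_file() and add_video() are hypothetical helpers introduced only to show how a video could appear in both videos and files[].

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class FileOutput:
    """Stand-in for the SDK's FileOutput."""
    url: Optional[str] = None
    data: Optional[bytes] = None
    mime_type: Optional[str] = None
    filename: Optional[str] = None


@dataclass
class VideoOutput:
    """Trimmed copy of the proposed type (save/get_bytes omitted here)."""
    url: Optional[str] = None
    data: Optional[bytes] = None
    mime_type: str = "video/mp4"
    filename: Optional[str] = None
    duration: Optional[float] = None
    resolution: Optional[str] = None

    def as_file(self) -> FileOutput:
        # Hypothetical helper: project the video onto the generic file shape.
        return FileOutput(self.url, self.data, self.mime_type, self.filename)


@dataclass
class MultimodalResponse:
    """Stand-in holding only the fields relevant to this issue."""
    files: List[FileOutput] = field(default_factory=list)
    videos: List[VideoOutput] = field(default_factory=list)

    @property
    def has_file(self) -> bool:
        return bool(self.files)

    @property
    def has_video(self) -> bool:
        return bool(self.videos)

    def add_video(self, video: VideoOutput) -> None:
        # Hypothetical: register the video in both lists so files[] keeps working.
        self.videos.append(video)
        self.files.append(video.as_file())


response = MultimodalResponse()
response.add_video(
    VideoOutput(url="https://example.com/cat.mp4", filename="cat.mp4",
                duration=8.0, resolution="1080p")
)
```

Storing videos as a plain list keeps the sketch short; the real class may prefer a computed property, as the acceptance criteria suggest.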

Developer Experience

result = await app.ai_generate_video(
    "A cat playing",
    model="openrouter/google/veo-3.1",
    resolution="1080p",
)

# Clean access
print(result.has_video)           # True
print(result.videos[0].duration)  # 8.0
print(result.videos[0].resolution)  # "1080p"
await result.videos[0].save("cat.mp4")

# Still backward compatible via files[]
print(result.has_file)  # Also True (videos appear in files too)

Dependencies

Files

File: sdk/python/agentfield/multimodal_response.py
Change: Add VideoOutput class; add videos and has_video to MultimodalResponse

Acceptance Criteria

  • VideoOutput dataclass with url, data, mime_type, filename, duration, resolution, aspect_ratio, has_audio, cost_usd
  • VideoOutput.save() and VideoOutput.get_bytes() methods (mirror existing ImageOutput/AudioOutput pattern)
  • MultimodalResponse.videos property
  • MultimodalResponse.has_video property
  • Videos also appear in files[] for backward compatibility
  • pytest sdk/python/ passes
  • ruff check sdk/python/ passes

Notes for Contributors

Severity: LOW — Quality-of-life improvement.

Follow the existing ImageOutput / AudioOutput pattern exactly. The save() method should handle both URL download and raw bytes, same as the image/audio equivalents.
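
A minimal sketch of the two code paths, for orientation only. It assumes data holds raw bytes and uses the standard library's urllib in a worker thread for the download; the existing ImageOutput / AudioOutput helpers may use a different HTTP client or encoding, and the real implementation should match them rather than this sketch. The _download helper is hypothetical.

```python
import asyncio
import urllib.request
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass
class VideoOutput:
    """Trimmed copy of the proposed type, with only the fields save() needs."""
    url: Optional[str] = None
    data: Optional[bytes] = None
    mime_type: str = "video/mp4"
    filename: Optional[str] = None

    async def get_bytes(self) -> bytes:
        """Return the video bytes, preferring in-memory data over a download."""
        if self.data is not None:
            return self.data
        if self.url:
            # urlopen blocks, so run it in a worker thread (assumption: stdlib only).
            return await asyncio.to_thread(self._download, self.url)
        raise ValueError("VideoOutput has neither data nor url")

    @staticmethod
    def _download(url: str) -> bytes:
        # Hypothetical helper; the SDK may already provide its own downloader.
        with urllib.request.urlopen(url) as response:
            return response.read()

    async def save(self, path: str) -> None:
        """Write the video to path, creating parent directories if needed."""
        content = await self.get_bytes()
        target = Path(path)
        target.parent.mkdir(parents=True, exist_ok=True)
        await asyncio.to_thread(target.write_bytes, content)
```

Routing save() through get_bytes() keeps the URL-versus-bytes decision in one place, so both methods behave the same way when neither field is set.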

Metadata

Labels

  • ai-friendly: Well-documented task suitable for AI-assisted development
  • area:ai: AI/LLM integration
  • enhancement: New feature or request
  • good first issue: Good for newcomers
  • sdk:python: Python SDK related
