novita: fix Wan 2.7 R2V media item types to match upstream enum by duanbing · Pull Request #9 · RouterBase/tensorzero

duanbing · 2026-05-20T05:38:05Z

Summary

R2V was sending {type:"image"|"video"} in each media[] item; Novita's enum is reference_image | reference_video | first_frame. Upstream rejected every R2V request with "failed to exec task".
Repack now emits reference_image / reference_video from the legacy flat image_urls+video_urls shape.
media added to the R2V allowed-fields whitelist so direct API callers can submit the rich shape (including first_frame and per-item reference_voice) verbatim. The repack block is skipped when media is already present.
Synthesised media array truncated at 5 to match Novita's combined-items cap.
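The repack described in the summary can be sketched as follows. This is a minimal Python illustration, not the PR's actual `build_body` code: the function name `repack_legacy_media`, the images-before-videos ordering, the removal of the legacy keys after repacking, and the choice to keep the first five items are all assumptions.

```python
MAX_MEDIA_ITEMS = 5  # combined images+videos cap mentioned in the summary


def repack_legacy_media(body: dict) -> dict:
    """Return a copy of `body` with image_urls/video_urls repacked into `media`.

    Hypothetical helper for illustration only.
    """
    out = dict(body)  # shallow copy so the caller's dict is not modified

    # Guard: if the rich shape is already present, the pass-through wins
    # and the legacy fields are not repacked.
    if "media" in out:
        return out

    # Assumption: the legacy keys are dropped once they have been repacked.
    image_urls = out.pop("image_urls", None) or []
    video_urls = out.pop("video_urls", None) or []

    # Use the upstream enum values instead of the old "image" / "video".
    media = [{"type": "reference_image", "url": u} for u in image_urls]
    media += [{"type": "reference_video", "url": u} for u in video_urls]

    if media:
        # Assumption: truncation keeps the first five items.
        out["media"] = media[:MAX_MEDIA_ITEMS]
    return out
```

With four image URLs and two video URLs, this sketch keeps the four images and the first video and drops the second video.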

Test plan

Re-run a Wan 2.7 R2V generation from the playground (legacy image_urls+video_urls flow) and confirm Novita accepts the call (status moves past failed to exec task).
Curl POST with a rich media: [{type:"first_frame", url}, ...] body and confirm pass-through.
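Before sending the rich-shape request from the test plan, the body can be checked locally against the enum. The sketch below is a Python illustration under assumptions: `invalid_media_types` is a hypothetical helper, the `example.com` URLs are placeholders, and the gateway URL is left out because the PR does not state it.

```python
import json

# Enum values listed in the PR description.
ALLOWED_MEDIA_TYPES = {"reference_image", "reference_video", "first_frame"}


def invalid_media_types(body: dict) -> list:
    """Return the `type` values in body["media"] that fall outside the enum."""
    return [
        item.get("type")
        for item in body.get("media", [])
        if item.get("type") not in ALLOWED_MEDIA_TYPES
    ]


# A rich-shape body of the kind the test plan submits (placeholder URLs).
rich_body = {
    "media": [
        {"type": "first_frame", "url": "https://example.com/frame.png"},
        {"type": "reference_image", "url": "https://example.com/ref.png"},
    ]
}

# Serialised JSON that could be used as the POST body.
payload = json.dumps(rich_body)
```

A body that still uses the old `image` or `video` values would be reported by `invalid_media_types`, which makes the pre-fix failure easy to reproduce in a unit test.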

The Wan 2.7 R2V (`/v3/async/wan2.7-r2v`) endpoint requires each item in the `media` array to carry a `type` value from the enum:

- `reference_image`
- `reference_video`
- `first_frame`

We were sending `image` and `video`, which Novita rejects with the generic "failed to exec task" 500. Every R2V submission via the playground / legacy `image_urls`+`video_urls` shape was failing silently for that reason.

Two changes in `build_body`:

1. Repack each `image_urls[]` URL as `{type: "reference_image", url}` and each `video_urls[]` URL as `{type: "reference_video", url}`. There is no way to express `first_frame` or per-item `reference_voice` from the legacy flat shape; callers who want those use the new pass-through path below.
2. Pass `media` through the allowed-fields whitelist for the R2V shape so direct API callers / a future media-editor UI can submit the rich shape (`[{type, url, reference_voice?}, ...]`) verbatim. The `!body.contains_key("media")` guard in the repack block ensures the pass-through wins when both shapes are present.

Also cap the synthesised `media` array at 5 items to match Novita's documented ceiling (combined images+videos ≤ 5), so users who upload more get a deterministic truncate-from-front rather than a 422.
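The effect of adding `media` to the allowed-fields whitelist can be illustrated with a small filter. This is a Python sketch under stated assumptions, not the repository's code: `filter_allowed` is a hypothetical name, and `prompt` is a stand-in for whatever other fields the real R2V whitelist contains. Only `media` is confirmed by the commit message.

```python
# Hypothetical whitelist: "media" comes from the commit message,
# "prompt" is an assumed placeholder for the other allowed fields.
R2V_ALLOWED_FIELDS = {"prompt", "media"}


def filter_allowed(body: dict, allowed=R2V_ALLOWED_FIELDS) -> dict:
    """Keep only whitelisted top-level fields; values are passed through verbatim."""
    return {key: value for key, value in body.items() if key in allowed}


request = {
    "prompt": "a short clip",
    "media": [
        {
            "type": "reference_video",
            "url": "https://example.com/clip.mp4",        # placeholder URL
            "reference_voice": "https://example.com/voice.wav",  # placeholder value
        }
    ],
    "unexpected_field": True,  # not whitelisted, so it is dropped
}

filtered = filter_allowed(request)
```

Because the filter only looks at top-level keys, per-item fields such as `reference_voice` reach the upstream request unchanged.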

github-actions · 2026-05-20T05:38:16Z

Thank you for your submission, we really appreciate it. Like many open-source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution. You can sign the CLA by just posting a Pull Request Comment same as the below format.

I have read the Contributor License Agreement (CLA) and hereby sign the CLA.

You can retrigger this bot by commenting recheck in this Pull Request. Posted by the CLA Assistant Lite bot.

duanbing merged commit 048539e into main May 20, 2026
6 of 7 checks passed

duanbing deleted the novita/wan-r2v-media-types branch May 20, 2026 05:47

duanbing merged 1 commit into main from novita/wan-r2v-media-types
