fix: support i2v resize modes with filled latent shape #1124

Merged
Musisoul merged 12 commits into main from downstream/i2v-resize-modes-fill-latent-shape on Jun 8, 2026
@GACLove GACLove commented Jun 5, 2026

Summary

  • Support I2V resize modes with filled latent shape handling.
  • Improve compatibility of resize mode processing in runners.

Changes

  • Update lightx2v/models/runners/default_runner.py
  • Update lightx2v/models/runners/seedvr/seedvr_runner.py

GACLove added 4 commits June 5, 2026 17:01
`default_runner.read_image_input` only populated `input_info.latent_shape`
when `resize_mode == "adaptive"`, while `wan_runner.run_vae_encoder`'s
else-branch reads `input_info.latent_shape` unconditionally whenever
`resize_mode` is set. Picking any non-adaptive mode
(keep_ratio_fixed_area / fixed_min_area / fixed_max_area / fixed_shape /
fixed_min_side) therefore crashed with:

    File ".../wan_runner.py", line 524, in run_vae_encoder
        latent_h, latent_w = self.input_info.latent_shape[-2], ...[-1]
    IndexError: list index out of range

`default_runner.resize_image` was also adaptive-only; rebuilt it to
handle all six modes the same way `wan_audio_runner.resize_image`
already does, then dropped the `== "adaptive"` gate in
`read_image_input` so any populated `resize_mode` runs the resize +
latent_shape / target_shape population path. Empty-string / missing
`resize_mode` still falls through to `run_vae_encoder`'s from-scratch
branch (back-compat with old configs).

Verified all 6 modes produce sensible h/w on portrait / landscape /
square inputs in isolation.
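
The failure mode and the ungated population path described above can be sketched in isolation. This is a minimal illustration, not the runner's actual code: `InputInfo`, `populate_latent_shape`, `read_latent_hw`, and the `stride`, `channels`, and `frames` defaults are all hypothetical stand-ins chosen for the example.

```python
from dataclasses import dataclass, field


@dataclass
class InputInfo:
    # Hypothetical stand-in for the runner's input_info object.
    resize_mode: str = ""
    latent_shape: list = field(default_factory=list)


def populate_latent_shape(info, height, width, stride=8, channels=16, frames=21):
    """Fill latent_shape for any non-empty resize_mode, not only "adaptive".

    stride, channels and frames are assumed values for illustration only.
    """
    if not info.resize_mode:
        # Empty or missing mode: leave latent_shape untouched so the
        # encoder can take its from-scratch branch (back-compat).
        return info
    info.latent_shape = [channels, frames, height // stride, width // stride]
    return info


def read_latent_hw(info):
    """Mimics the unconditional read that raised IndexError on an empty list."""
    return info.latent_shape[-2], info.latent_shape[-1]


if __name__ == "__main__":
    ok = populate_latent_shape(InputInfo(resize_mode="fixed_shape"), 480, 832)
    print(read_latent_hw(ok))
    try:
        read_latent_hw(InputInfo(resize_mode="fixed_shape"))  # never populated
    except IndexError as exc:
        print("IndexError:", exc)
```

With the gate removed, every populated mode reaches `populate_latent_shape`, so the later read no longer sees an empty list.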
torchvision moved `read_video` from `torchvision.io` to `torchvision.io.video`
in some versions. Add a try/except fallback at both import sites.

Extract a module-level `_get_read_video()` with a `torchvision.io` →
`torchvision.io.video` → PyAV fallback chain so seedvr works even
when torchvision is absent or has a different API surface.
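
A fallback chain of this kind might look roughly like the sketch below. It is an assumption-laden illustration rather than the merged `_get_read_video()`: the helper names `resolve_callable`, `get_read_video`, and `_read_with_pyav` are hypothetical, and the PyAV branch returns a plain list of RGB arrays, which differs from the tuple that torchvision's `read_video` returns.

```python
import importlib

# Candidate (module, attribute) pairs, tried in order.
READ_VIDEO_CANDIDATES = [
    ("torchvision.io", "read_video"),
    ("torchvision.io.video", "read_video"),
]


def resolve_callable(candidates):
    """Return the first callable attribute that can be imported, else None."""
    for module_name, attr in candidates:
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            continue  # module missing in this environment; try the next one
        fn = getattr(module, attr, None)
        if callable(fn):
            return fn
    return None


def _read_with_pyav(path):
    """Hypothetical PyAV reader: decodes all video frames as RGB arrays."""
    import av  # imported lazily so PyAV stays an optional dependency

    frames = []
    # The context manager closes the container even if decoding fails.
    with av.open(path) as container:
        for frame in container.decode(video=0):
            frames.append(frame.to_ndarray(format="rgb24"))
    return frames


def get_read_video():
    """Prefer torchvision's read_video; fall back to the PyAV reader."""
    fn = resolve_callable(READ_VIDEO_CANDIDATES)
    return fn if fn is not None else _read_with_pyav
```

Resolving the reader once at module level keeps the import probing out of the per-video path; callers that depend on torchvision's return layout would still need to adapt the PyAV result.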
@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request expands the image resizing capabilities in default_runner.py to support multiple resize modes and adds a fallback mechanism for reading videos in seedvr_runner.py when torchvision is missing. The review feedback highlights opportunities to improve robustness, specifically by preventing dimensions from rounding to zero during resizing and latent shape calculation, and by adding proper error handling and resource cleanup to the PyAV video reading fallback.
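
One way to guard against dimensions rounding to zero is to clamp after snapping to the alignment grid. The sketch below is illustrative only; `scaled_dim`, `latent_dim`, and the default `multiple` and `stride` values are assumptions, not names or constants taken from `default_runner.py`.

```python
def scaled_dim(size, scale, multiple=16):
    """Scale a pixel dimension, snap down to `multiple`, never below one multiple."""
    snapped = int(round(size * scale)) // multiple * multiple
    return max(multiple, snapped)


def latent_dim(pixels, stride=8):
    """Convert a pixel dimension to a latent dimension of at least 1."""
    return max(1, pixels // stride)


if __name__ == "__main__":
    # A tiny input that would otherwise snap to 0 is clamped to one multiple.
    print(scaled_dim(10, 0.5), latent_dim(4))
```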

Comment thread lightx2v/models/runners/default_runner.py Outdated
Comment thread lightx2v/models/runners/default_runner.py Outdated
Comment thread lightx2v/models/runners/seedvr/seedvr_runner.py Outdated
GACLove and others added 8 commits June 6, 2026 23:46
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
@Musisoul Musisoul merged commit 0b5263e into main Jun 8, 2026
2 checks passed
@Musisoul Musisoul deleted the downstream/i2v-resize-modes-fill-latent-shape branch June 8, 2026 10:09