
Stream vae #594

Merged

helloyongyang merged 3 commits into ModelTC:main from huochaitiantang:stream_vae on Dec 10, 2025

Conversation

@huochaitiantang (Collaborator)

No description provided.

@gemini-code-assist (Contributor)

Summary of Changes

Hello @huochaitiantang, I'm Gemini Code Assist[1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a significant enhancement by implementing a streaming Variational Autoencoder (VAE) decoding mechanism. This new approach allows for more efficient and potentially real-time processing of video and audio segments, which can be toggled via a configuration flag. The changes also include minor dependency adjustments to streamline the codebase and improve module loading.

Highlights

  • Streaming VAE Implementation: A new end_run_segment_stream method has been introduced in wan_audio_runner.py to enable streaming VAE decoding. This method processes video and audio segments incrementally, allowing for more efficient and potentially real-time output.
  • Configurable VAE Decoding: The run_main method in wan_audio_runner.py now includes a conditional check for a use_stream_vae configuration flag. This allows users to switch between the traditional VAE decoding and the newly implemented streaming VAE approach.
  • Dependency Optimization: The torch.distributed import and its associated get_rank_and_world_size method have been removed from wan_audio_runner.py. Additionally, OmniVAReader is now imported lazily inside the next_control and before_control methods in va_controller.py, which can avoid circular dependencies and improve module loading performance.
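The toggle described above can be sketched as follows. This is a hypothetical, minimal illustration based only on the PR description: the class, its constructor, and the segment payloads are stand-ins, and the real end_run_segment, end_run_segment_stream, and run_main methods in wan_audio_runner.py almost certainly have different signatures and bodies.

```python
class AudioRunner:
    """Minimal stand-in for the runner described in the PR (hypothetical)."""

    def __init__(self, config):
        # config is assumed to be a plain dict carrying the use_stream_vae flag.
        self.config = config

    def end_run_segment(self, latents):
        # Traditional path: decode all latent segments in one pass.
        return [f"decoded:{x}" for x in latents]

    def end_run_segment_stream(self, latents):
        # Streaming path: yield each decoded segment incrementally,
        # so downstream consumers can start before decoding finishes.
        for x in latents:
            yield f"decoded:{x}"

    def run_main(self, latents):
        # The PR adds a conditional on a use_stream_vae config flag to
        # select between the two decoding paths.
        if self.config.get("use_stream_vae", False):
            return list(self.end_run_segment_stream(latents))
        return self.end_run_segment(latents)
```

The design point is that both paths produce the same decoded segments; the streaming variant only changes when each segment becomes available to the caller.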
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

  • Code Review (/gemini review): Performs a code review for the current pull request in its current state.
  • Pull Request Summary (/gemini summary): Provides a summary of the current pull request in its current state.
  • Comment (@gemini-code-assist): Responds in comments when explicitly tagged, both in pull request comments and review comments.
  • Help (/gemini help): Displays a list of available commands.

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.


@gemini-code-assist (bot) left a comment


Code Review

This pull request introduces streaming VAE functionality, which is a significant feature for improving real-time performance. The overall implementation is good, but I've identified a few areas for improvement. My feedback focuses on moving local imports to the module level for better code structure and performance, adding a missing stop check in the new streaming loop for graceful termination, and correcting a potential memory leak and logic error in how video segments are aggregated. Addressing these points will enhance the robustness and efficiency of the new streaming feature.
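The "missing stop check" concern above can be illustrated with a short sketch. This is not the code from the PR: the function name, the stop_event parameter, and the segment payloads are hypothetical, and it only shows the general pattern of checking a termination signal at the top of each iteration of a streaming loop so the generator exits promptly instead of decoding the remaining segments.

```python
import threading


def stream_decode(segments, stop_event):
    """Hypothetical streaming decode loop with a graceful stop check."""
    for seg in segments:
        # Check the stop signal before doing work on each segment,
        # so a shutdown request terminates the stream promptly.
        if stop_event.is_set():
            break
        yield f"decoded:{seg}"
```

Without the check, a consumer that wants to shut down mid-stream has no way to stop the loop short of abandoning the generator, which is the kind of leak-prone termination path the review flags.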

@helloyongyang helloyongyang merged commit 5b902af into ModelTC:main Dec 10, 2025
1 check passed
@huochaitiantang huochaitiantang deleted the stream_vae branch December 16, 2025 02:24
helloyongyang pushed a commit that referenced this pull request Mar 6, 2026