feature: video audio support, WIP LTX-2 model integration, and ComfyUI v0.9.2 infrastructure updates #68
Pull request overview
This pull request implements audio generation support for video models, updates ComfyUI error handling, removes video-to-video support from the wan-2 model, adds a new LTX video 2 model implementation, and updates various dependencies and configurations.
Changes:
- Added audio generation support to video models (seedance-1, kling-2, veo-3, sora-2) with a new generate_audio parameter
- Improved ComfyUI client error handling with better error messages and binary WebSocket message handling
- Removed video-to-video functionality from wan-2 model and updated model descriptions
- Added new ltx_video_2.py model implementation (marked as WIP)
- Updated ComfyUI to v0.9.2 and PyTorch base image version
- Updated hunyuan-video-1 with explicit text encoder quantization and improved attention backend
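
As a rough illustration of the generate_audio change, the sketch below shows one way a request flag could be forwarded to an external provider only for the models listed above. It is not the PR's code: `VideoRequest` here is a simplified stand-in for the schema, and the `build_external_payload` helper, the `prompt` field, the payload shape, and the `False` default are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Any


# Hypothetical stand-in for the PR's VideoRequest schema. Only the
# generate_audio field is named in the overview; the rest is illustrative.
@dataclass
class VideoRequest:
    model: str
    prompt: str
    generate_audio: bool = False  # default value is an assumption


# Models the overview lists as gaining audio support.
AUDIO_CAPABLE_MODELS = {"seedance-1", "kling-2", "veo-3", "sora-2"}


def build_external_payload(request: VideoRequest) -> dict[str, Any]:
    """Build a provider payload, forwarding generate_audio only where supported."""
    payload: dict[str, Any] = {"model": request.model, "prompt": request.prompt}
    if request.model in AUDIO_CAPABLE_MODELS:
        # Pass the flag through unchanged for audio-capable models.
        payload["generate_audio"] = request.generate_audio
    return payload
```

Omitting the key for other models, rather than sending `False`, is a design choice of this sketch; the actual workers may handle unsupported models differently.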
Reviewed changes
Copilot reviewed 18 out of 18 changed files in this pull request and generated 11 comments.
| File | Description |
|---|---|
| api/videos/schemas.py | Added audio field to VideosModelInfo and generate_audio parameter to VideoRequest; updated wan-2 description |
| workers/videos/schemas.py | Mirror of API schema changes for worker-side processing |
| workers/videos/external/veo_3.py | Passes generate_audio parameter to the external API |
| workers/videos/external/seedance_1.py | Passes generate_audio parameter to the external API |
| workers/videos/external/kling_2.py | Passes generate_audio parameter to the external API |
| workers/videos/local/wan_2.py | Removed video-to-video fallback to wan_vace |
| workers/videos/local/ltx_video_2.py | New LTX-2 video model implementation with audio support (marked WIP) |
| workers/videos/local/hunyuan_video_1.py | Updated text encoder quantization and attention backend |
| workers/workflows/comfy/comfy_client.py | Improved error handling and binary WebSocket message support |
| workers/tests/workflows/test_video_gen.py | Added test for LTX-2 text-to-video workflow |
| workers/tests/videos/local/test_wan_vace.py | Removed video-to-video test for wan-2 |
| workers/tests/videos/local/test_ltx_video_2.py | New test file for LTX-2 model |
| docker-compose.yml | Added default values for admin key and storage address |
| Dockerfile.comfy | Updated ComfyUI to v0.9.2 and PyTorch to 2.9.1; added --disable-smart-memory flag |
| clients/openapi.json | Regenerated with audio field and generate_audio parameter (but contains stale wan-2 description) |
| clients/nuke/python/dd_workflow.py | Added handling for list values in PrimitiveInt knobs |
| assets/workflows/video_ltx2_t2v_v001.json | New ComfyUI workflow for LTX-2 text-to-video generation |
| assets/workflows/image_to_image_v001.json | Updated dummy image reference and added resolution_steps parameter |
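
The binary WebSocket handling noted for workers/workflows/comfy/comfy_client.py could look roughly like the sketch below. It assumes the client receives each frame as either `str` (JSON status events) or `bytes` (binary data such as previews) and that binary frames can be skipped. The function name `parse_ws_message` and the error wording are hypothetical and not taken from the PR.

```python
import json
from typing import Any, Optional, Union


def parse_ws_message(message: Union[str, bytes, bytearray]) -> Optional[dict[str, Any]]:
    """Return the decoded JSON event for a text frame, or None for a binary frame."""
    if isinstance(message, (bytes, bytearray)):
        # Assumption: binary frames carry no JSON status, so they are skipped
        # instead of being passed to json.loads.
        return None
    try:
        event = json.loads(message)
    except json.JSONDecodeError as exc:
        # Surface a readable error that includes the start of the bad frame.
        raise ValueError(f"Unparseable WebSocket text frame: {message[:80]!r}") from exc
    if not isinstance(event, dict):
        raise ValueError(f"Unexpected WebSocket event type: {type(event).__name__}")
    return event
```

A receive loop would call this on every frame and simply continue when it returns `None`.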
Pull request overview
Copilot reviewed 18 out of 18 changed files in this pull request and generated 5 comments.