bghira commented Feb 1, 2026

This pull request adds support for video models that use audio conditioning. It introduces two model capabilities, audio input support and an S2V (sound-to-video) dataset requirement, detects them in the backend, and exposes them to the frontend. The dataset wizard UI now allows audio settings for video datasets, and new frontend tests cover the capability getters and their combinations.

Backend: Model capability detection

  • Added logic in models_service.py to detect whether a model supports audio inputs (supports_audio_inputs) and whether it requires S2V datasets (requires_s2v_datasets). Both flags are now included in the model's reported capabilities.

Frontend: Capability exposure and UI logic

  • Updated dataloader-section-component.js and dataset-wizard.js to expose supportsAudioInputs and requiresS2VDatasets getters, making these capabilities available to Alpine.js components and UI logic.
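
For readers unfamiliar with the pattern, the getters could look roughly like the sketch below. This is an illustration only, not the code from the PR: the factory name `createDataloaderSection`, the `modelCapabilities` property, and the assumption that the backend flags arrive in snake_case are all hypothetical. Only the getter names and the two flag names come from the description above.

```javascript
// Hypothetical sketch: the real components may read capabilities from a
// different store or property path than the one assumed here.
function createDataloaderSection(modelCapabilities) {
  return {
    // Assumed shape: capability flags as reported by the backend.
    modelCapabilities: modelCapabilities || {},

    // True when the selected model accepts audio conditioning.
    get supportsAudioInputs() {
      return Boolean(this.modelCapabilities.supports_audio_inputs);
    },

    // True when the selected model needs S2V (sound-to-video) datasets.
    get requiresS2VDatasets() {
      return Boolean(this.modelCapabilities.requires_s2v_datasets);
    },
  };
}

const section = createDataloaderSection({ supports_audio_inputs: true });
console.log(section.supportsAudioInputs); // true
console.log(section.requiresS2VDatasets); // false
```

Coercing with `Boolean(...)` means a model that reports neither flag is treated as having neither capability, so templates can use the getters without null checks.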

UI: Audio settings for video datasets

  • Modified the dataset modal in dataset_modal.html so that the Audio tab is shown for video datasets if the selected model supports audio inputs.
  • Enhanced the audio settings section in audio_body.html to display auto-split and audio format options for video datasets, including toggles and advanced configuration for audio extraction from videos.
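
The visibility rule for the Audio tab can be summarized as a small predicate. The sketch below is an assumption-laden illustration: in the PR the condition lives in the dataset_modal.html template, the helper name `shouldShowAudioTab` is hypothetical, and so are the dataset type strings `"audio"` and `"video"` and the premise that audio datasets always show the tab.

```javascript
// Hypothetical helper mirroring the template condition described above.
function shouldShowAudioTab(datasetType, supportsAudioInputs) {
  // Assumption: audio datasets show the Audio tab regardless of the model.
  if (datasetType === "audio") {
    return true;
  }
  // Video datasets show the tab only when the model accepts audio inputs.
  return datasetType === "video" && Boolean(supportsAudioInputs);
}

console.log(shouldShowAudioTab("video", true));  // true
console.log(shouldShowAudioTab("video", false)); // false
console.log(shouldShowAudioTab("image", true));  // false
```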

Testing: Frontend logic

  • Added a new test suite, dataloader_audio_capabilities.test.js, covering the new getters and their combinations so that audio-related UI logic behaves correctly across model capabilities.

@bghira bghira merged commit a382691 into main Feb 1, 2026
2 checks passed
@bghira bghira deleted the ui/video-audio-split-options branch February 1, 2026 04:38