ui: add audio options for video datasets on models which support a+v #2545
This pull request adds support for video models that use audio conditioning by introducing new capabilities for handling audio inputs and S2V (sound-to-video) requirements. It updates both backend and frontend logic to detect and expose these capabilities, enhances the dataset wizard UI to allow audio settings for video datasets, and adds comprehensive frontend tests to ensure correct behavior.
Backend: Model capability detection
- Updates models_service.py to detect whether a model supports audio inputs (supports_audio_inputs) and whether it requires S2V datasets (requires_s2v_datasets). Both flags are now included in the model's reported capabilities.

Frontend: Capability exposure and UI logic
- Updates dataloader-section-component.js and dataset-wizard.js to expose supportsAudioInputs and requiresS2VDatasets getters, making these capabilities available to Alpine.js components and UI logic.

UI: Audio settings for video datasets
- Updates dataset_modal.html so that the Audio tab is shown for video datasets if the selected model supports audio inputs.
- Updates audio_body.html to display auto-split and audio format options for video datasets, including toggles and advanced configuration for audio extraction from videos.

Testing: Frontend logic
- Adds dataloader_audio_capabilities.test.js, covering the new getters and their combinations to ensure that audio-related UI logic behaves correctly for various model capabilities.
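
For reviewers skimming the description, the capability getters and the Audio tab condition can be sketched roughly as follows. This is an illustration only, not the code in this PR: the factory createDatasetWizardState, the helper shouldShowAudioTab, the shape of the capabilities object, and the "audio"/"video" dataset type strings are all assumptions. Only the flag names (supports_audio_inputs, requires_s2v_datasets) and getter names (supportsAudioInputs, requiresS2VDatasets) come from the description.

```javascript
// Hypothetical factory; the real components in dataloader-section-component.js
// and dataset-wizard.js may be structured quite differently.
function createDatasetWizardState(modelCapabilities) {
  return {
    // Capability payload as reported by the backend (assumed shape).
    capabilities: modelCapabilities || {},

    // True when the selected model accepts audio inputs alongside video.
    get supportsAudioInputs() {
      return Boolean(this.capabilities.supports_audio_inputs);
    },

    // True when the selected model requires S2V (sound-to-video) datasets.
    get requiresS2VDatasets() {
      return Boolean(this.capabilities.requires_s2v_datasets);
    },

    // Assumed helper: audio datasets always get the Audio tab (assumption),
    // video datasets only when the model supports audio inputs.
    shouldShowAudioTab(datasetType) {
      if (datasetType === "audio") return true;
      return datasetType === "video" && this.supportsAudioInputs;
    },
  };
}

// Usage: a model that accepts audio inputs but does not require S2V datasets.
const wizard = createDatasetWizardState({
  supports_audio_inputs: true,
  requires_s2v_datasets: false,
});
console.log(wizard.shouldShowAudioTab("video")); // true
console.log(wizard.requiresS2VDatasets); // false
```

Coercing with Boolean() means a missing or undefined flag reads as false, so models that report no audio capabilities would keep the existing behavior under this sketch.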