feat: Integrate Wan with multi-resolution DL #1475

Merged
pthombre merged 13 commits into main from pranav/text_to_video_dataloader on Mar 13, 2026
Conversation

pthombre (Contributor) commented Mar 6, 2026

  • Add text-to-video multiresolution dataloader: Introduces TextToVideoDataset and build_video_multiresolution_dataloader to support video training (Wan, HunyuanVideo) with bucket-based multiresolution sampling, replacing the legacy build_dataloader / MetaFilesDataset path.
  • Refactor diffusion datasets with shared base class: Extracts common multiresolution logic (metadata loading, bucket grouping, dynamic batch sizing) into BaseMultiresolutionDataset, with TextToImageDataset and TextToVideoDataset as concrete implementations.
  • Refactor and rename dataloader APIs: Renames collate_fn_flux → collate_fn_text_to_image, build_flux_multiresolution_dataloader → build_text_to_image_multiresolution_dataloader, and extracts a shared _build_multiresolution_dataloader_core helper. Moves collate_fn_production and dataloader builders from sampler.py to collate_fns.py.
  • Add video collate function (collate_fn_video) with support for model-specific optional fields (e.g. text_mask, image_embeds).
  • Update example configs: Migrate Wan 2.1 and HunyuanVideo YAML configs to use the new video multiresolution dataloader; update Flux configs to use renamed image dataloader API.
  • Remove obsolete CI/CD nightly scripts and configs for Wan 2.1 pretrain.
  • Fix Flux attention backend configuration (use "flash" instead of "_flash_3_hub").
  • Fix and update unit tests for renamed APIs and new video dataloader.
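
For readers unfamiliar with the approach, the bucket-based sampling and optional-field collation described above can be sketched in plain Python. This is an illustrative sketch only, not the code from this PR: the function names, the metadata keys (`num_frames`, `height`, `width`), and the required batch keys (`video_latents`, `text_embeddings`) are assumptions. Only the optional field names `text_mask` and `image_embeds` come from the description. Lists stand in for tensors, so the stacking step a real collate function would perform is omitted.

```python
import random
from collections import defaultdict


def group_into_buckets(samples):
    """Group sample indices by (num_frames, height, width) so no batch mixes shapes."""
    buckets = defaultdict(list)
    for idx, meta in enumerate(samples):
        # Metadata key names are hypothetical, chosen for this sketch.
        key = (meta["num_frames"], meta["height"], meta["width"])
        buckets[key].append(idx)
    return dict(buckets)


def dynamic_batch_size(key, base_batch_size, base_key):
    """Scale batch size inversely with the bucket's frames*height*width volume."""
    volume = key[0] * key[1] * key[2]
    base_volume = base_key[0] * base_key[1] * base_key[2]
    return max(1, (base_batch_size * base_volume) // volume)


def make_bucket_batches(buckets, base_batch_size, base_key, seed=0):
    """Return shuffled (bucket_key, indices) batches, each drawn from one bucket."""
    rng = random.Random(seed)
    batches = []
    for key, indices in buckets.items():
        indices = list(indices)
        rng.shuffle(indices)
        size = dynamic_batch_size(key, base_batch_size, base_key)
        for start in range(0, len(indices), size):
            batches.append((key, indices[start:start + size]))
    rng.shuffle(batches)  # interleave buckets across the epoch
    return batches


def collate_with_optional_fields(items, optional_keys=("text_mask", "image_embeds")):
    """Collect required fields, plus optional ones only if every item has them."""
    batch = {
        # Required key names are assumptions for this sketch.
        "video_latents": [item["video_latents"] for item in items],
        "text_embeddings": [item["text_embeddings"] for item in items],
    }
    for key in optional_keys:
        if all(key in item for item in items):
            batch[key] = [item[key] for item in items]
    return batch
```

Keeping every batch inside a single bucket means samples can be stacked without padding or resizing, and scaling the batch size by volume keeps the per-step memory footprint roughly constant across resolutions.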

Signed-off-by: Pranav Prashant Thombre <pthombre@nvidia.com>
copy-pr-bot Bot commented Mar 6, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

pthombre and others added 4 commits March 6, 2026 17:31
Signed-off-by: pthombre <pthombre@users.noreply.github.com>
Signed-off-by: Pranav Prashant Thombre <pthombre@nvidia.com>
Signed-off-by: Pranav Prashant Thombre <pthombre@nvidia.com>

pthombre commented Mar 8, 2026

/ok to test a31c392

pthombre and others added 2 commits March 8, 2026 15:39
Signed-off-by: Pranav Prashant Thombre <pthombre@nvidia.com>
Signed-off-by: pthombre <pthombre@users.noreply.github.com>

pthombre commented Mar 8, 2026

/ok to test fa92b0c

Signed-off-by: Pranav Prashant Thombre <pthombre@nvidia.com>

pthombre commented Mar 8, 2026

/ok to test 755c134

pthombre and others added 3 commits March 9, 2026 16:13
Signed-off-by: Pranav Prashant Thombre <pthombre@nvidia.com>
Signed-off-by: pthombre <pthombre@users.noreply.github.com>

pthombre commented Mar 9, 2026

/ok to test 4a891b8

Comment thread nemo_automodel/_diffusers/auto_diffusion_pipeline.py

3 participants