
@Ratish1 Ratish1 commented Nov 5, 2025

What does this PR do?

This PR fixes a bug that causes the QwenImage model to crash when using context parallelism with a prompt whose sequence length is not divisible by the world size.

The fix is implemented within the QwenImageTransformer2DModel and consists of three parts:

1. Input Padding: The text prompt embeddings (encoder_hidden_states) and their attention mask are padded at the start of the forward method so that their sequence length is divisible by the world size (a rough sketch of the padding and masking logic follows this list).
2. RoPE Correction: The model now uses the new, padded sequence length to generate the rotary positional embeddings (RoPE), preventing the tensor shape mismatch that was causing a RuntimeError.
3. Attention Masking: The QwenDoubleStreamAttnProcessor2_0 is corrected to build and use a proper additive attention mask, so the new padded tokens are ignored by the attention mechanism and the numerical output of the model is preserved.

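As a rough illustration of parts 1 and 3, the sketch below pads the prompt embeddings and their mask to a multiple of the world size and converts the padding mask into an additive attention bias. This is a minimal standalone sketch, not the actual diffusers code: the helper names (`pad_to_multiple_of_world_size`, `to_additive_attention_mask`) and the tensor shapes are assumptions for illustration only.

```python
import torch
import torch.nn.functional as F


def pad_to_multiple_of_world_size(encoder_hidden_states, encoder_attention_mask, world_size):
    # Hypothetical helper: pad the sequence dimension on the right so that
    # seq_len % world_size == 0 before the sequence is sharded across ranks.
    # encoder_hidden_states: (batch, seq_len, dim); encoder_attention_mask: (batch, seq_len)
    seq_len = encoder_hidden_states.shape[1]
    pad_len = (-seq_len) % world_size
    if pad_len > 0:
        encoder_hidden_states = F.pad(encoder_hidden_states, (0, 0, 0, pad_len))
        encoder_attention_mask = F.pad(encoder_attention_mask, (0, pad_len), value=0)
    # The padded seq_len (not the original one) must also be used when
    # generating the rotary positional embeddings (part 2 of the fix).
    return encoder_hidden_states, encoder_attention_mask


def to_additive_attention_mask(encoder_attention_mask, dtype):
    # Hypothetical helper: convert a {0, 1} key-padding mask into an additive bias:
    # 0 for real tokens, a large negative value for padded tokens, so softmax
    # assigns the padding ~0 attention weight.
    bias = torch.zeros_like(encoder_attention_mask, dtype=dtype)
    bias = bias.masked_fill(encoder_attention_mask == 0, torch.finfo(dtype).min)
    # Shape (batch, 1, 1, padded_seq_len) so it broadcasts over heads and queries.
    return bias[:, None, None, :]


if __name__ == "__main__":
    world_size = 4
    hidden = torch.randn(2, 77, 3072)                 # 77 is not divisible by 4
    mask = torch.ones(2, 77, dtype=torch.long)
    hidden, mask = pad_to_multiple_of_world_size(hidden, mask, world_size)
    print(hidden.shape, mask.shape)                   # seq dim padded from 77 to 80
    print(to_additive_attention_mask(mask, hidden.dtype).shape)
```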
A new unit test is also added to simulate a distributed environment. It verifies that the padding logic prevents the crash while ensuring the output is numerically equivalent to the baseline, non-padded run.
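The idea behind the equivalence check can be sketched as follows: with a correct additive mask, attention over zero-padded keys/values matches attention over the original, unpadded ones. This is an illustrative standalone check using scaled_dot_product_attention, not the PR's actual unit test (which exercises QwenImageTransformer2DModel in a simulated distributed setup); the shapes below are made up.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
batch, heads, q_len, kv_len, dim = 1, 4, 16, 77, 64
world_size = 4
pad_len = (-kv_len) % world_size  # 3 extra tokens to reach 80

q = torch.randn(batch, heads, q_len, dim)
k = torch.randn(batch, heads, kv_len, dim)
v = torch.randn(batch, heads, kv_len, dim)

# Baseline: attention over the original, unpadded keys/values.
baseline = F.scaled_dot_product_attention(q, k, v)

# Padded run: zero-pad keys/values and mask out the padding with an additive bias.
k_pad = F.pad(k, (0, 0, 0, pad_len))
v_pad = F.pad(v, (0, 0, 0, pad_len))
bias = torch.zeros(batch, 1, 1, kv_len + pad_len)
bias[..., kv_len:] = torch.finfo(bias.dtype).min
padded = F.scaled_dot_product_attention(q, k_pad, v_pad, attn_mask=bias)

# With a correct additive mask, the padded tokens contribute nothing.
assert torch.allclose(baseline, padded, atol=1e-5)
```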

Fixes #12568

Before submitting

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.
@sayakpaul @yiyixuxu @DN6

@Ratish1 Ratish1 changed the title from "fix(qwenimage): Correct context parallelism padding" to "fix(qwenimage): Add padding for context parallelism" on Nov 5, 2025