[REQUEST] Will ZeRO-3 support different module usage?

**Is your feature request related to a problem? Please describe.**
When using DeepSpeed ZeRO-3, it appears that each module is expected to be invoked the same number of times across all ranks.

For example, suppose I have an LLM module A and a task-specific head B. Depending on the output of A, module B may be executed a different number of times on different ranks. For instance, B may be called once on rank 0, but three times on rank 1. In this case, rank 1 hangs on the second invocation and waits indefinitely for rank 0.
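To illustrate the failure mode, here is a toy simulation using only the Python standard library. It is not DeepSpeed code: the `threading.Barrier` is an assumed stand-in for the collective that ZeRO-3 issues to gather a module's partitioned parameters, the `simulate` helper is hypothetical, and the timeout exists only so the example terminates instead of hanging.

```python
import threading


def simulate(calls_per_rank, timeout=0.5):
    """Simulate ranks that must each join a collective once per module call.

    calls_per_rank[i] is how many times "rank" i invokes module B.
    Returns {rank: (status, completed_calls)}.
    """
    world_size = len(calls_per_rank)
    # Stand-in for the parameter all-gather: every rank must arrive.
    barrier = threading.Barrier(world_size)
    results = {}

    def run_rank(rank, n_calls):
        completed = 0
        try:
            for _ in range(n_calls):
                barrier.wait(timeout=timeout)
                completed += 1
            results[rank] = ("ok", completed)
        except threading.BrokenBarrierError:
            # A real collective would block forever; here the timeout fires.
            results[rank] = ("hang", completed)

    threads = [
        threading.Thread(target=run_rank, args=(rank, n))
        for rank, n in enumerate(calls_per_rank)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    # Rank 0 calls B once, rank 1 calls B three times.
    print(simulate([1, 3]))
```

In this simulation rank 1 gets stuck on its second call because rank 0 never joins a second collective, which mirrors the hang described above.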

This makes it difficult to support dynamic control flow where different ranks may follow different execution paths.

**Describe the solution you'd like**
It would be helpful if ZeRO-3 could support different data/control flows across ranks, allowing modules to be executed a different number of times depending on the input or model behavior.
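For context, the kind of manual workaround this request would make unnecessary looks roughly like the sketch below: all ranks agree on the maximum number of calls, and ranks with fewer real calls pad with dummy invocations. The helper names `global_max_calls` and `call_padded` are hypothetical and not part of DeepSpeed; the only library calls used are standard `torch.distributed` ones, and the sketch assumes a process group is already initialized.

```python
def global_max_calls(local_calls):
    """Return the largest per-rank call count across the process group."""
    # Imported lazily so the padding helper below works without torch installed.
    import torch
    import torch.distributed as dist

    # NCCL collectives need a CUDA tensor; other backends such as gloo accept CPU.
    if dist.get_backend() == "nccl":
        device = torch.device("cuda", torch.cuda.current_device())
    else:
        device = torch.device("cpu")
    count = torch.tensor([local_calls], dtype=torch.int64, device=device)
    dist.all_reduce(count, op=dist.ReduceOp.MAX)
    return int(count.item())


def call_padded(head, inputs, dummy_input, max_calls):
    """Call `head` exactly `max_calls` times, padding with dummy inputs.

    Returns a list of (output, is_real) pairs so callers can ignore padding.
    """
    if len(inputs) > max_calls:
        raise ValueError("max_calls must be at least len(inputs)")
    results = [(head(x), True) for x in inputs]
    for _ in range(max_calls - len(inputs)):
        results.append((head(dummy_input), False))
    return results
```

Usage would be along the lines of `call_padded(B, local_inputs, dummy, global_max_calls(len(local_inputs)))`. I have not verified this against ZeRO-3 internals; for training, my assumption is that the padded outputs would also need to enter the loss with zero weight so that the backward pass runs the same number of times on every rank. This wastes compute and is easy to get wrong, which is why native support would be preferable.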


[REQUEST] Will ZeRO-3 support different module usage? #7998
