Replies: 2 comments
- Hi @vivekgoe, please see deepspeedai/Megatron-DeepSpeed#415
- @inkcherry Thanks for sharing the PR, I will check it.
- I am looking to optimize the performance of a sequence-parallel implementation that uses DistributedAttention by overlapping communication and compute. I see that there is some code for this available under the "sp_overlap_comm" flag in
DeepSpeed/deepspeed/sequence/layer.py
Line 311 in 02bbf50
Is this feature (sp_overlap_comm) functional? If yes, is there an example or blog post about it that I can use to understand the feature? I would really appreciate it if someone could help answer this. Thanks.
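For reference, this is roughly how I am wiring up DistributedAttention today; a minimal sketch only, where the process-group setup, the scatter/gather indices, and the LocalAttention stand-in are placeholders rather than my exact code:

```python
import torch
import torch.distributed as dist

from deepspeed.sequence.layer import DistributedAttention


class LocalAttention(torch.nn.Module):
    """Stand-in for the per-rank attention (core attention, FlashAttention, ...).

    After DistributedAttention's first all-to-all, each rank sees the full
    sequence but only a subset of the attention heads; a real module would
    compute attention on that shard. Returning v just keeps the shape valid
    for the second all-to-all without pretending to do the math.
    """

    def forward(self, q, k, v, *args):
        return v  # placeholder only


dist.init_process_group("nccl")

# Placeholder: in a real run this group comes from the framework's
# sequence-parallel group setup, not a group over every rank.
sp_group = dist.new_group(ranks=list(range(dist.get_world_size())))

# scatter_idx / gather_idx tell the wrapper which tensor dims its
# all-to-alls redistribute; they have to match the q/k/v layout in use.
dist_attn = DistributedAttention(
    LocalAttention(), sp_group, scatter_idx=2, gather_idx=0
)

# Each rank passes its local sequence shard of q/k/v; the wrapper runs the
# all-to-alls before and after the local attention call.
# out = dist_attn(q_shard, k_shard, v_shard)
```

My question about sp_overlap_comm is whether those all-to-alls can be overlapped with the local attention compute, instead of running serially as in this baseline.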