DualPipe is an innovative bidirectional pipeline parallelism algorithm introduced in the DeepSeek-V3 Technical Report. It fully overlaps the forward and backward computation-communication phases and also reduces pipeline bubbles. For detailed information on computation-communication overlap, please refer to the profile data.
Example DualPipe scheduling for 8 PP ranks and 20 micro-batches in two directions. The micro-batches in the reverse direction are symmetric to those in the forward direction, so their batch IDs are omitted to simplify the illustration. Two cells enclosed by a shared black border have mutually overlapped computation and communication.
DualPipeV is a concise V-shape schedule derived from DualPipe using the "cut-in-half" procedure that Sea AI Lab introduced in their blog post. Thanks to them for this efficient schedule!
Example DualPipeV scheduling for 4 PP ranks (8 PP stages) and 10 micro-batches.
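The relationship between ranks and stages in the caption above (4 PP ranks, 8 PP stages) can be illustrated with a small sketch. It assumes the placement that the "V" shape suggests: each rank holds one stage on the way down and the mirrored stage on the way back up. The function name `v_shape_placement` is hypothetical and not part of this repository.

```python
def v_shape_placement(num_ranks: int) -> dict[int, tuple[int, int]]:
    """Map each rank to the two stages it would hold in a V-shaped layout.

    Assumption for illustration: rank r holds stage r and the mirrored
    stage (2 * num_ranks - 1 - r), so the stages trace a "V" across ranks.
    """
    if num_ranks < 1:
        raise ValueError("num_ranks must be positive")
    num_stages = 2 * num_ranks
    return {rank: (rank, num_stages - 1 - rank) for rank in range(num_ranks)}


if __name__ == "__main__":
    # 4 PP ranks give 8 PP stages, matching the example above.
    for rank, stages in v_shape_placement(4).items():
        print(f"rank {rank}: stages {stages}")
```

Under this assumption the last rank holds the two middle stages, which is where the schedule turns around.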
Method | Bubble | Parameter Per Device | Activation Per Device | #Devices |
---|---|---|---|---|
1F1B | (PP-1)(F+B) | 1× | PP | PP |
ZB1P | (PP-1)(F+B-2W) | 1× | PP | PP |
DualPipe | (PP/2-1)(F&B+B-3W) | 2× | PP+1 | PP |
DualPipeV | (PP/2-1)(F&B+B-3W) | 2× | PP+1 | PP/2 |
PP denotes the number of PP stages (which must be even). F denotes the execution time of a forward chunk, B denotes the execution time of a full backward chunk, W denotes the execution time of a "backward for weights" chunk, and F&B denotes the execution time of two mutually overlapped forward and backward chunks.
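To make the bubble column concrete, the following sketch evaluates the formulas from the table for given chunk times. The function name `bubble_time` and the sample timings are illustrative assumptions, not measurements. In particular, the overlapped forward-and-backward time defaults to the sum of the forward and backward times, which is a pessimistic no-overlap bound.

```python
from typing import Optional


def bubble_time(method: str, pp: int, f: float, b: float, w: float,
                f_and_b: Optional[float] = None) -> float:
    """Evaluate the bubble formula from the comparison table.

    pp      -- number of PP stages
    f, b    -- time of a forward chunk and of a full backward chunk
    w       -- time of a "backward for weights" chunk
    f_and_b -- time of two mutually overlapped forward and backward chunks
               (assumed to be f + b when not supplied)
    """
    if method == "1F1B":
        return (pp - 1) * (f + b)
    if method == "ZB1P":
        return (pp - 1) * (f + b - 2 * w)
    if method in ("DualPipe", "DualPipeV"):
        if pp % 2 != 0:
            raise ValueError("PP must be even for DualPipe and DualPipeV")
        if f_and_b is None:
            f_and_b = f + b  # assumption: no time saved by overlapping
        return (pp // 2 - 1) * (f_and_b + b - 3 * w)
    raise ValueError(f"unknown method: {method}")


if __name__ == "__main__":
    # Hypothetical unit timings: F = 1, B = 2, W = 1, with 8 PP stages.
    for name in ("1F1B", "ZB1P", "DualPipe", "DualPipeV"):
        print(name, bubble_time(name, pp=8, f=1, b=2, w=1))
```

With these sample timings the script reports 21 for 1F1B, 7 for ZB1P, and 6 for both DualPipe variants. Real values depend on measured chunk times and on how much the overlapped chunks actually save.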
The usage is shown in the following examples:
python examples/example_dualpipe.py
python examples/example_dualpipev.py
Note: For real-world applications, you will need to implement a custom `overlapped_forward_backward` method tailored to your specific module.
- PyTorch 2.0 and above
DualPipe was created and developed by Jiashi Li, Chengqi Deng, and Wenfeng Liang.
@misc{deepseekai2024deepseekv3technicalreport,
title={DeepSeek-V3 Technical Report},
author={DeepSeek-AI},
year={2024},
eprint={2412.19437},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2412.19437},
}