
[pipeline] supported more flexible dataflow control for pipeline parallel training #1108

Merged

Conversation

FrankLeeeee
Contributor

Summary

Fixed #120 to support more data formats in the engine schedules and implemented a more flexible tensor-argument mapping mechanism in pipeline schedules.

Details

  • Model inputs and outputs may now be passed as a list or a tuple, in addition to the formats supported before.
  • The data_process_func now also receives the output of the previous stage, so users can write their own logic to map tensors (the outputs from the previous stage and the data produced by the dataloader on the current stage) to the arguments of the model's forward function on the current stage.
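The mapping described above can be sketched as follows. This is a minimal illustration of the idea, not the exact Colossal-AI API: the function name matches the PR, but the signature, the `input_ids`/`attention_mask` keys, and the `(args, kwargs)` return convention are assumptions for the example.

```python
# Hypothetical data_process_func: maps the previous stage's output and the
# current micro-batch to the arguments of this stage's forward function.
# Names and signature are illustrative assumptions, not the library's API.

def data_process_func(stage_output, micro_batch):
    """Return (args, kwargs) for the current stage's forward call."""
    if stage_output is None:
        # First pipeline stage: only dataloader data is available.
        hidden = micro_batch["input_ids"]
    else:
        # Later stages: reuse the activation produced by the upstream stage.
        hidden = stage_output
    # Dataloader tensors needed on this stage (e.g. an attention mask)
    # can be combined with the upstream activation.
    return (hidden,), {"attention_mask": micro_batch["attention_mask"]}
```

A user would register such a function with the pipeline schedule so that each stage decides for itself how incoming tensors are wired into its forward signature, instead of relying on a fixed positional convention.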

@YuliangLiu0306 YuliangLiu0306 merged commit 6f82ac9 into hpcaitech:main Jun 15, 2022
@FrankLeeeee FrankLeeeee deleted the hotfix/pipeline-schedule-cache branch January 26, 2023 07:08
Successfully merging this pull request may close these issues.

Compatabilities to various batch formats [FEATURE]