
[pipeline] supported more flexible dataflow control for pipeline parallel training #1108

Merged

Conversation

FrankLeeeee
Contributor

Summary

Fixed #120 to support more data formats in the engine schedules and implemented a more flexible tensor-argument mapping mechanism in pipeline schedules.

Details

  • Model inputs and outputs may now be passed as a list or a tuple, in addition to the formats supported before.
  • The data_process_func now also receives the output of the previous stage, so users can write their own logic to map tensors (the outputs from the previous stage and the data produced by the dataloader on the current stage) to the arguments of the model's forward function on the current stage.
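The mapping described above can be sketched as follows. This is a minimal illustration of the idea, not the exact Colossal-AI API: the function name matches the PR, but the signature, the `input_ids`/`attention_mask` keys, and the `(args, kwargs)` return convention are assumptions for the example.

```python
# Hypothetical data_process_func: maps the previous stage's output and the
# current micro-batch to the arguments of this stage's forward function.
# Names and signature are illustrative assumptions, not the library's API.

def data_process_func(stage_output, micro_batch):
    """Return (args, kwargs) for the current stage's forward call."""
    if stage_output is None:
        # First pipeline stage: only dataloader data is available.
        hidden = micro_batch["input_ids"]
    else:
        # Later stages: reuse the activation produced by the upstream stage.
        hidden = stage_output
    # Dataloader tensors needed on this stage (e.g. an attention mask)
    # can be combined with the upstream activation.
    return (hidden,), {"attention_mask": micro_batch["attention_mask"]}
```

A user would register such a function with the pipeline schedule so that each stage decides for itself how incoming tensors are wired into its forward signature, instead of relying on a fixed positional convention.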

@YuliangLiu0306 YuliangLiu0306 merged commit 6f82ac9 into hpcaitech:main Jun 15, 2022
@FrankLeeeee FrankLeeeee deleted the hotfix/pipeline-schedule-cache branch January 26, 2023 07:08
Successfully merging this pull request may close these issues.

Compatabilities to various batch formats [FEATURE]