[autoparallel]add backward cost info into strategies #1524

YuliangLiu0306

This PR adds backward information to the conv strategies. The information includes:

  1. memory cost: a pair stores the memory cost. The first element is the forward memory cost and the second is the backward memory cost, which is composed of the activation grad cost and the parameter grad cost.
  2. computing cost: the backward compute cost has two parts, the activation computing cost and the parameter computing cost.
  3. communication cost: previously, only the communication cost during forward was counted. This PR also counts the communication cost of backward. In addition, in some cases with no forward communication cost, communication may still happen during the backward phase.
  4. resharding cost: resharding cost also occurs during backward. For example, if the input sharding spec is converted from [R, S] into [R, R] in forward, it has to be recovered from [R, R] back to [R, S] during the backward phase.
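
The four items above can be illustrated with a minimal sketch. It is not the code in this PR: `StrategyCost`, `make_memory_pair`, and `backward_resharding` are hypothetical names chosen for illustration, and the numbers are arbitrary placeholders in unspecified units. It only shows how a forward/backward memory pair, a two-part backward compute cost, and a reversed resharding step could be represented.

```python
from dataclasses import dataclass
from typing import List, Tuple


def make_memory_pair(forward: int, activation_grad: int, parameter_grad: int) -> Tuple[int, int]:
    """Return (forward memory cost, backward memory cost).

    The backward element is the sum of the activation grad cost and the
    parameter grad cost, as described in item 1.
    """
    return (forward, activation_grad + parameter_grad)


@dataclass
class StrategyCost:
    """Hypothetical container for the costs of one strategy (not the ColossalAI API)."""

    memory: Tuple[int, int]          # (forward, backward)
    compute_fwd: float
    compute_bwd_activation: float    # backward cost for the activation part
    compute_bwd_parameter: float     # backward cost for the parameter part
    comm_fwd: float = 0.0            # may be zero for some strategies
    comm_bwd: float = 0.0            # may be non-zero even when comm_fwd is zero

    @property
    def compute_bwd(self) -> float:
        # Item 2: backward compute cost has two parts.
        return self.compute_bwd_activation + self.compute_bwd_parameter

    @property
    def comm_total(self) -> float:
        # Item 3: forward and backward communication are both counted.
        return self.comm_fwd + self.comm_bwd


def backward_resharding(src: List[str], dst: List[str]) -> Tuple[List[str], List[str]]:
    """Item 4: a forward conversion src -> dst implies a backward conversion dst -> src."""
    return (dst, src)


cost = StrategyCost(
    memory=make_memory_pair(forward=100, activation_grad=60, parameter_grad=40),
    compute_fwd=10.0,
    compute_bwd_activation=10.0,
    compute_bwd_parameter=10.0,
    comm_fwd=0.0,
    comm_bwd=5.0,
)
bwd_src, bwd_dst = backward_resharding(["R", "S"], ["R", "R"])
```

Here `cost` models a strategy with no forward communication that still communicates in backward, and `bwd_src`, `bwd_dst` describe the conversion from [R, R] back to [R, S].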

@FrankLeeeee FrankLeeeee merged commit 0908d0f into hpcaitech:main Sep 7, 2022