[FIX] explicit stage num with uniform stage divided by flops #878

Merged
merged 3 commits into from
Feb 11, 2023

ZYHowell
Collaborator

The previous UniformStageOption divides stages by cutting where "flops_if_cut_here is the closest to average_flops_per_stage". However, this tends to make each stage slightly smaller than the average. The shift accumulates, so the last stage may end some layers before the last layer rather than at it. This PR fixes the bug by cutting where "flops_till_now is the closest to average_flops_per_stage * stage_idx".
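A minimal sketch of the two cutting rules, for illustration only. This is not the code from this PR: the function names `cut_by_stage_average` and `cut_by_cumulative_target` and the toy FLOP counts are hypothetical, and only the quantities `flops_if_cut_here`, `flops_till_now`, `average_flops_per_stage` and `stage_idx` are taken from the description above.

```python
from itertools import accumulate


def cut_by_stage_average(layer_flops, num_stages):
    """Old rule (sketch): end each stage where the stage's own FLOPs
    are closest to average_flops_per_stage."""
    average_flops_per_stage = sum(layer_flops) / num_stages
    cuts, start = [], 0
    for _ in range(num_stages):
        flops_if_cut_here, best_end, best_gap = 0.0, start, float("inf")
        for i in range(start, len(layer_flops)):
            flops_if_cut_here += layer_flops[i]
            gap = abs(flops_if_cut_here - average_flops_per_stage)
            if gap < best_gap:
                best_end, best_gap = i + 1, gap
        cuts.append(best_end)  # exclusive end index of this stage
        start = best_end
    return cuts


def cut_by_cumulative_target(layer_flops, num_stages):
    """New rule (sketch): end stage k where the cumulative FLOPs are
    closest to average_flops_per_stage * k."""
    average_flops_per_stage = sum(layer_flops) / num_stages
    flops_till_now = list(accumulate(layer_flops))  # prefix sums
    cuts, start = [], 0
    for stage_idx in range(1, num_stages + 1):
        target = average_flops_per_stage * stage_idx
        best_end, best_gap = start, float("inf")
        for i in range(start, len(layer_flops)):
            gap = abs(flops_till_now[i] - target)
            if gap < best_gap:
                best_end, best_gap = i + 1, gap
        cuts.append(best_end)
        start = best_end
    return cuts


# Toy input: 10 identical layers, 3 stages, so the average is 40 / 3.
layer_flops = [4] * 10
old_cuts = cut_by_stage_average(layer_flops, 3)
new_cuts = cut_by_cumulative_target(layer_flops, 3)
print(old_cuts)  # [3, 6, 9]: the last stage stops one layer early
print(new_cuts)  # [3, 7, 10]: the last stage ends at the last layer
```

In this toy case each stage under the old rule rounds down to 3 layers (12 FLOPs versus an average of about 13.3), so the shortfall accumulates and layer 10 is never assigned. Under the new rule the target for the final stage is the total FLOP count, so the last cut lands on the last layer.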

@ZYHowell ZYHowell merged commit c164f3e into main Feb 11, 2023
@ZYHowell ZYHowell deleted the pr-fix-apply-grad-rewrite branch February 11, 2023 05:31