feat: support gradient accumulation in spark torch estimator #3681
Conversation
Please rebase with latest master, which includes a fix for the CI.
Li Jiang force-pushed from 3591557 to ac55b99 (Signed-off-by: Li Jiang <bnujli@gmail.com>)
Thanks @EnricoMi, done!
OOO this week. I will take a look when I'm back.
Unit Test Results (with flaky tests)
1 169 files −35   1 169 suites −35   11h 33m 24s ⏱️ −9m 14s
Results for commit 6d33ba4. ± Comparison against base commit 25ed803.
♻️ This comment has been updated with latest results.
Li Jiang force-pushed from bd25468 to e6133e3 (Signed-off-by: Li Jiang <bnujli@gmail.com>)
do `loss.div_` only when `backward_passes_per_step` > 1 (Signed-off-by: Li Jiang <bnujli@gmail.com>)
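A minimal sketch of the training-loop behavior this commit describes, not the actual Horovod estimator source: the loss is scaled down by `backward_passes_per_step` so the gradient accumulated over several backward passes matches a single large batch, and per the commit above the division is skipped in the common single-pass case. The helper name and its arguments are illustrative placeholders.

```python
import torch

def train_minibatch(model, optimizer, loss_fn, inputs, labels,
                    backward_passes_per_step=1):
    # Hypothetical helper: one forward/backward pass of the accumulation cycle.
    outputs = model(inputs)
    loss = loss_fn(outputs, labels)
    if backward_passes_per_step > 1:
        # Scale the loss so gradients summed over N backward passes average
        # out; only divide when actually accumulating, as the commit says.
        loss.div_(backward_passes_per_step)
    loss.backward()
    # With hvd.DistributedOptimizer(backward_passes_per_step=N), the allreduce
    # and optimizer.step() take effect once every N backward passes.
    return loss
```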
Checklist before submitting
Description
We can use `backward_passes_per_step` in Horovod's torch API, although we need to adjust the training code to make it work as expected. In the Spark torch estimator, however, there was no implementation supporting it. This PR adds that support, so gradient accumulation can be applied simply by setting the `backward_passes_per_step` parameter on `TorchEstimator`. The example and test are modified accordingly; a hedged usage sketch is shown below.
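A usage sketch of what setting the parameter might look like after this PR; the model, optimizer, loss, column names, and `train_df` below are placeholders rather than code taken from the PR itself.

```python
import torch
import horovod.spark.torch as hvd_spark

# Placeholder model, optimizer, and loss for illustration.
model = torch.nn.Linear(10, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss = torch.nn.MSELoss()

estimator = hvd_spark.TorchEstimator(
    num_proc=2,                   # number of Horovod workers
    model=model,
    optimizer=optimizer,
    loss=lambda input, target: loss(input, target),
    input_shapes=[[-1, 10]],
    feature_cols=['features'],
    label_cols=['label'],
    batch_size=32,
    epochs=4,
    backward_passes_per_step=4,   # accumulate gradients over 4 mini-batches
)
# torch_model = estimator.fit(train_df)  # train_df: a prepared Spark DataFrame
```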
Review process to land