Modularize zero step function by karakusc · Pull Request #7265 · pytorch/xla

karakusc · 2024-06-13T00:17:09Z

(Re-opening earlier PR after fixes)

Refactor and (slightly) generalize the step function of the zero redundancy optimizer by breaking it into the following three high-level steps:

1. _reduce_gradients
2. _clip_grad_norm
3. _update_parameters

This makes it easier to subclass the zero optimizer and override specific behaviors in different contexts.
In addition, we introduce a new optional parameter, sharding_scheme, which allows customizing steps 1 and 3 above if needed.
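
The three-step structure described above can be sketched roughly as follows. This is a minimal pure-Python illustration, not the code from this PR: the class names ToyZeroOptimizer and NoClipZeroOptimizer, the constructor arguments, and the list-of-floats gradients are all hypothetical stand-ins, the "ranks" are simulated in a single process, and sharding_scheme is not modeled because its form is not described here. Only the three method names are taken from the description.

```python
import math


class ToyZeroOptimizer:
    """Hypothetical stand-in for illustration; not the torch_xla optimizer."""

    def __init__(self, params, grads_per_rank, lr=0.1, max_norm=1.0):
        self.params = list(params)
        # One gradient list per simulated rank (assumption for this sketch).
        self.grads_per_rank = grads_per_rank
        self.lr = lr
        self.max_norm = max_norm

    def _reduce_gradients(self):
        # Step 1: combine gradients across ranks (here: a plain average).
        world_size = len(self.grads_per_rank)
        return [sum(g) / world_size for g in zip(*self.grads_per_rank)]

    def _clip_grad_norm(self, grads):
        # Step 2: rescale gradients if their L2 norm exceeds max_norm.
        norm = math.sqrt(sum(g * g for g in grads))
        if norm > self.max_norm:
            scale = self.max_norm / norm
            grads = [g * scale for g in grads]
        return grads

    def _update_parameters(self, grads):
        # Step 3: apply the update (here: plain SGD).
        self.params = [p - self.lr * g for p, g in zip(self.params, grads)]

    def step(self):
        # The step function is just the three hooks called in order.
        grads = self._reduce_gradients()
        grads = self._clip_grad_norm(grads)
        self._update_parameters(grads)


class NoClipZeroOptimizer(ToyZeroOptimizer):
    """Overrides a single step and inherits the other two unchanged."""

    def _clip_grad_norm(self, grads):
        return grads


ranks = [[3.0, 4.0], [3.0, 4.0]]
clipped = ToyZeroOptimizer([1.0, 2.0], ranks)
clipped.step()
unclipped = NoClipZeroOptimizer([1.0, 2.0], ranks)
unclipped.step()
```

With step() reduced to three calls, a subclass changes one behavior by overriding one method, which is the kind of customization the refactor is meant to allow.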

Can Karakus added 5 commits June 10, 2024 15:12

Modularize zero step function and make it customizable

d6318dc

Remove formatting characters

b13b534

Bugfix for recent zero1 changes

f97f2c0

fix

d0aaea8

Merge branch 'master' into step_refactor

d1be4b6

karakusc wants to merge 5 commits into pytorch:master from karakusc:step_refactor
