fix for torch 2.1 inits #290
Conversation
Note that this does mean the model will be initialized twice for earlier versions, leading to slightly different inits for the same seed than we had before.
This isn't true given that we have the version check, right?
But for later versions, some params might be initialized twice, right? If so, is there a way to avoid that? Like forcing FSDP to materialize params but not apply the init fn.
LGTM, just a suggestion to add some comments explaining what's going on
In torch 2.1, FSDP only calls reset_parameters() on modules that directly manage parameters (i.e. not the top-level Olmo module).
Very simple fix: call olmo_model.reset_parameters() after the FSDP wrap. This recovers the loss curve from torch 2.0.1.
No changes if using torch version < 2.1.0.
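For reference, here is a minimal sketch of the workaround as described above. The `Olmo` class below is an illustrative stand-in (the real class lives in this repo), the version-gate pattern is one plausible way to write the check rather than the PR's exact code, and a torch.distributed process group is assumed to be initialized before wrapping:

```python
import torch
import torch.nn as nn
from packaging import version
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP


class Olmo(nn.Module):
    """Illustrative stand-in for the real top-level model class."""

    def __init__(self):
        super().__init__()
        self.proj = nn.Linear(1024, 1024)

    def reset_parameters(self):
        # Custom init owned by the top-level module. In torch >= 2.1,
        # FSDP no longer calls this during wrapping, because this module
        # does not directly manage the parameters it initializes here
        # (the nn.Linear submodule does).
        nn.init.normal_(self.proj.weight, std=0.02)
        nn.init.zeros_(self.proj.bias)


# Assumes torch.distributed has already been initialized.
olmo_model = Olmo()
fsdp_model = FSDP(olmo_model)

# Re-run the custom init after wrapping to recover the torch 2.0.1
# loss curve. The version gate keeps behavior unchanged for < 2.1.0,
# where FSDP already called reset_parameters() on the top-level module.
if version.parse(torch.__version__) >= version.parse("2.1.0"):
    olmo_model.reset_parameters()
```

Gating the extra reset_parameters() call on the torch version also addresses the double-initialization concern raised above: on older versions nothing extra runs, so inits for a given seed stay identical to before. How the post-wrap re-init interacts with sharded parameters depends on the FSDP configuration (e.g. use_orig_params); the PR simply applies it immediately after wrapping.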