
Issues loading optimizer state, training further under pytorch 1.12.0 #6

Closed
mdmarti opened this issue Aug 28, 2023 · 3 comments

Comments

@mdmarti (Contributor) commented Aug 28, 2023

Howdy! When using torch==1.12.0, this issue (pytorch/pytorch#80809) comes up. To avoid it, I think we could

  1. require torch versions other than 1.12.0,
  2. only load the optimizer state if the torch version is not 1.12.0, or
  3. set capturable=True, which looks like it slows down training (rough sketches of 2 and 3 below).

Any preferences among these? Any other solutions that would be better?
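For concreteness, here is a sketch of what options 2 and 3 could look like (load_optimizer_state is a hypothetical helper, not something already in the codebase; the capturable workaround follows the discussion in the linked PyTorch thread):

    import torch

    def load_optimizer_state(optimizer, state_dict, workaround="skip"):
        """Hypothetical helper illustrating options 2 and 3."""
        if not torch.__version__.startswith("1.12.0"):
            optimizer.load_state_dict(state_dict)
            return
        if workaround == "skip":
            # Option 2: don't restore optimizer state under 1.12.0;
            # momentum and step counts are lost, but training can proceed.
            return
        # Option 3: load, then mark each param group capturable so Adam's
        # CUDA step-tensor assertion passes (reported to slow training).
        optimizer.load_state_dict(state_dict)
        for group in optimizer.param_groups:
            group["capturable"] = True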

@NickleDave (Contributor)

Just curious: is there a reason you can't use 1.12.1, which fixes the regression?
pytorch/pytorch#80809 (comment)

You could change the requirement in setup.py from

"torch>=1.1",

to

"torch>=1.1, !=1.12.0",

so that users don't face the same issue.
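In context, the install_requires entry in setup.py would look something like this (the package name and the neighboring requirement are placeholders, not the project's actual list; only the torch pin is the point):

    # setup.py (excerpt)
    from setuptools import setup

    setup(
        name="ava",  # placeholder name for illustration
        install_requires=[
            "numpy",  # placeholder entry
            "torch>=1.1, !=1.12.0",  # exclude the 1.12.0 optimizer regression
        ],
    )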

@mdmarti (Contributor, Author) commented Aug 28, 2023

We can. I'm asking whether that would be the best fix, or whether there are other solutions that would be preferable. Thank you for the suggestion!

@jackgoffinet (Collaborator)

Hmmm, thanks @mdmarti! If I understand the thread (pytorch/pytorch#80809) correctly, this is a PyTorch regression that was fixed in 1.12.1, so it only affects torch 1.12.0. If that's the case, I like @NickleDave's suggestion of just changing the requirements in setup.py, i.e. option 1. Option 2 seems a bit hacky to me, and we probably don't want to slow down training with the capturable flag if it can be helped.

mdmarti closed this as completed Sep 8, 2023