Resumable trainer? #349

Closed
StephenArk30 opened this issue Apr 21, 2021 · 2 comments · Fixed by #350
Labels: enhancement (Feature that is not a new algorithm or an algorithm enhancement)

Comments

@StephenArk30 (Contributor) commented Apr 21, 2021

I wonder if it is a common demand to save training progress and resume it. In my case, I was bothered by an unstable server that often shut down and wasted my training runs. I made some changes to the trainer and, if this is a common demand, I will create a PR. But a discussion is needed to decide what params should be added to the trainer, like begin_step, last_gradient_step, etc. Should we read them from the log, or from a slot function that users implement (like load_fn)?

Also, what data should be saved as the progress? Buffer, policy, and what else?

@Trinkle23897 (Collaborator) commented Apr 21, 2021

> I wonder if it is a common demand to save training progress and resume it.

Definitely worth supporting it!

> an unstable server that often shut down and wasted my training runs.

Because of a kill event?

> Also, what data should be saved as the progress? Buffer, policy, and what else?

  • optim status (if we call torch.save(policy), is the optim also saved into the .pth file? not sure)
  • policy parameters
  • logger
  • buffer
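
A minimal sketch of what such a checkpoint could look like, assuming the usual state_dict pattern for the policy and optimizer and plain pickling for the buffer (the function name, file layout, and key names are illustrative, not an existing Tianshou API):

```python
import pickle
import torch

# Illustrative checkpoint writer: everything the list above mentions in one place.
def save_checkpoint(path, policy, optim, buffer, env_step, gradient_step):
    torch.save({
        "policy": policy.state_dict(),   # policy parameters
        "optim": optim.state_dict(),     # optimizer status
        "env_step": env_step,            # logger counters needed to resume
        "gradient_step": gradient_step,
    }, path)
    # The replay buffer can be large, so keep it in a separate file.
    with open(path + ".buffer.pkl", "wb") as f:
        pickle.dump(buffer, f)
```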

> But a discussion is needed to decide what params should be added to the trainer, like begin_step, last_gradient_step, etc. Should we read them from the log, or from a slot function that users implement (like load_fn)?

I don't think we need a large amount of extra params -- loading everything before trainer init would be fine, like the current approach for policy load/save. And we can get env_step/gradient_step from the logger.
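
As a sketch of that "load everything before trainer init" flow, with illustrative names matching the checkpoint writer above:

```python
import pickle
import torch

# Illustrative restore: run this before constructing the trainer, mirroring
# the current policy load/save approach; the counters go back into the logger.
def restore_checkpoint(path, policy, optim):
    ckpt = torch.load(path, map_location="cpu")
    policy.load_state_dict(ckpt["policy"])
    optim.load_state_dict(ckpt["optim"])
    with open(path + ".buffer.pkl", "rb") as f:
        buffer = pickle.load(f)
    return buffer, ckpt["env_step"], ckpt["gradient_step"]
```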

My current thought is that we can modify the trainer logic in test_episode: save (logger/buffer/policy) first and then go on testing. And sure, adding something like save_all_every_epoch: bool = False would be fine.
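
Hypothetical usage if the trainer grew such a flag (save_all_every_epoch is only the name proposed above, not an existing trainer parameter, and the other arguments are abbreviated boilerplate):

```python
from tianshou.trainer import offpolicy_trainer

# save_all_every_epoch is the flag proposed in this thread, not a finalized API.
result = offpolicy_trainer(
    policy,
    train_collector,
    test_collector,
    max_epoch=100,
    step_per_epoch=10000,
    step_per_collect=10,
    episode_per_test=10,
    batch_size=64,
    save_all_every_epoch=True,  # checkpoint logger/buffer/policy every epoch
)
```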

Trinkle23897 added the enhancement label on Apr 21, 2021
@StephenArk30 (Contributor, Author) commented:

PR here @Trinkle23897, please take a look.

Trinkle23897 linked a pull request on Apr 23, 2021 that will close this issue.