I wonder if it is a common demand to save training progress and resume it. In my case, I was plagued by an unstable server that often shut down and wasted my training runs. I made some changes to the trainer and, if this is a common demand, I will create a PR. But a discussion is needed to decide which params should be added to the trainer, like `begin_step`, `last_gradient_step`, etc. Should we read them from the log? Or from a slot function that users should implement (like `load_fn`)?

Also, what data should be saved in the progress? The buffer, the policy, and what else?
> I wonder if it is a common demand to save training progress and resume it.
Definitely worth supporting it!
> an unstable server that often shut down and wasted my training runs.
Because of a kill event?
> Also, what data should be saved in the progress? The buffer, the policy, and what else?
- optimizer state (if we call `torch.save(policy)`, is the optimizer also saved into the .pth file? not sure)
- policy parameters
- logger
- buffer
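To make the list concrete, here is a minimal sketch of one possible checkpoint layout; `save_progress`/`load_progress` and the dict keys are hypothetical names for illustration, not an existing Tianshou API. Saving the optimizer's `state_dict()` explicitly also sidesteps the question of whether `torch.save(policy)` pickles it implicitly:

```python
import torch

def save_progress(path, policy, optim, buffer, epoch, env_step, gradient_step):
    # Save state_dicts rather than whole objects, so the optimizer state
    # is stored explicitly instead of relying on pickling the policy.
    torch.save({
        "policy": policy.state_dict(),  # policy parameters
        "optim": optim.state_dict(),    # optimizer status
        "buffer": buffer,               # replay buffer (a picklable object)
        "epoch": epoch,                 # progress counters for the logger
        "env_step": env_step,
        "gradient_step": gradient_step,
    }, path)

def load_progress(path, policy, optim):
    ckpt = torch.load(path)
    policy.load_state_dict(ckpt["policy"])
    optim.load_state_dict(ckpt["optim"])
    return ckpt["buffer"], ckpt["epoch"], ckpt["env_step"], ckpt["gradient_step"]
```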
> But a discussion is needed to decide which params should be added to the trainer, like `begin_step`, `last_gradient_step`, etc. Should we read them from the log? Or from a slot function that users should implement (like `load_fn`)?
I don't think we need a lot of extra params -- loading everything before trainer init would be fine, like the current approach for policy load/save. And we can get the `env_step`/`gradient_step` from the logger.
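For the "get them from the logger" option, here is a rough sketch of recovering the counters from a TensorBoard event file with `EventAccumulator` (a real TensorBoard API); the scalar tags `train/env_step` and `train/gradient_step` are assumptions about what the logger writes, not confirmed tag names:

```python
from tensorboard.backend.event_processing import event_accumulator

def restore_counters(log_dir):
    ea = event_accumulator.EventAccumulator(log_dir)
    ea.Reload()  # parse the event files on disk
    # Each ScalarEvent carries (wall_time, step, value); the step of the
    # last event under a tag is the most recent counter the logger wrote.
    env_step = ea.Scalars("train/env_step")[-1].step            # assumed tag
    gradient_step = ea.Scalars("train/gradient_step")[-1].step  # assumed tag
    return env_step, gradient_step
```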
My current thought is that we can modify the trainer logic in `test_episode`: save (logger/buffer/policy) first, then go on testing. And sure, adding something like `save_all_every_epoch: bool = False` would be fine.
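Something like this runnable sketch of the control flow: checkpoint before the test phase, so a crash during testing never loses a finished epoch. Every name here is an illustrative stand-in for the trainer internals, not the actual trainer code:

```python
def run_trainer(max_epoch, save_all_every_epoch=False):
    env_step = gradient_step = 0
    for epoch in range(1, max_epoch + 1):
        # stand-in for one epoch of collecting transitions and updating
        env_step += 100
        gradient_step += 10
        if save_all_every_epoch:
            # save (logger/buffer/policy) first, as proposed above;
            # save_progress would be the hypothetical helper sketched earlier
            print(f"checkpoint at epoch {epoch}: "
                  f"env_step={env_step}, gradient_step={gradient_step}")
        # ...then go on with test_episode as usual

run_trainer(max_epoch=3, save_all_every_epoch=True)
```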