Because of GPU walltime limits or other interruptions, a job may die before completing all the requested epochs. We should have a function for restarting a job from a previous checkpoint file. We need a config option such as restart_mode=True/False; when it is enabled, training would search for checkpoint*.pt and load the latest one around https://github.com/usnistgov/alignn/blob/main/alignn/train.py#L140
I think train_dgl should take an optional checkpoint to resume from; it can then load the model and optimizer state with ignite's Checkpoint.load_objects.
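To make the proposal concrete, here is a rough sketch of how the two ideas could fit together. It is not the ALIGNN implementation: the helper names `find_latest_checkpoint` and `maybe_resume` are hypothetical, and it assumes checkpoints are written as `checkpoint_<step>.pt` in the output directory and that the keys in `to_load` (for example `"model"`, `"optimizer"`, `"trainer"`) match the keys used when the checkpoint was saved.

```python
import re
from pathlib import Path
from typing import Any, Mapping, Optional


def find_latest_checkpoint(
    output_dir: str, pattern: str = "checkpoint*.pt"
) -> Optional[Path]:
    """Return the most recent file matching `pattern`, or None if there is none.

    Hypothetical helper: ranks files by the trailing integer in the file name
    (assumed to be the step or epoch), falling back to modification time.
    """

    def sort_key(path: Path):
        match = re.search(r"(\d+)$", path.stem)
        step = int(match.group(1)) if match else -1
        return (step, path.stat().st_mtime)

    candidates = sorted(Path(output_dir).glob(pattern), key=sort_key)
    return candidates[-1] if candidates else None


def maybe_resume(
    to_load: Mapping[str, Any], output_dir: str, restart_mode: bool = False
) -> Optional[Path]:
    """Load the latest checkpoint into `to_load` in place when restart_mode is on.

    Returns the checkpoint path that was loaded, or None if nothing was loaded.
    """
    if not restart_mode:
        return None
    path = find_latest_checkpoint(output_dir)
    if path is None:
        return None  # nothing to resume from; start a fresh run

    # Imported lazily so the file search above works without torch/ignite.
    import torch
    from ignite.handlers import Checkpoint

    state = torch.load(path, map_location="cpu")
    # Keys in `to_load` must match the keys used when saving (assumption).
    Checkpoint.load_objects(to_load=to_load, checkpoint=state)
    return path
```

Inside train_dgl this could be called once after the model, optimizer, and trainer are built, e.g. `maybe_resume({"model": net, "optimizer": optimizer, "trainer": trainer}, config.output_dir, config.restart_mode)`, where `config.restart_mode` is the proposed new option. Including the trainer in `to_load` is what would let the epoch counter continue from where the job died, provided the trainer was also part of the saved checkpoint.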