Epoch is not saved to the Torch checkpoint #55

Closed
martinlyra opened this issue May 17, 2023 · 2 comments

martinlyra commented May 17, 2023

Whenever predict_ros.py is run with a checkpoint from our own training run, it fails with:

Loading ckpt from  /home/marcusmartin/artifacts/0/model_best_val.pth.tar
Traceback (most recent call last):
  File "predict_ros.py", line 109, in <module>
    tracker = Tracker(dataset_info, images_mean, images_std,ckpt_dir,trans_normalizer=dataset_info['max_translation'],rot_normalizer=dataset_info['max_rotation'])
  File "/home/marcusmartin/iros20-tracking/predict.py", line 152, in __init__
    print('pose track ckpt epoch={}'.format(checkpoint['epoch']))
KeyError: 'epoch'

The culprit seems to be the following assignment, which appears in two places and stores only the model weights, with no 'epoch' key:

checkpoint_data = {'state_dict': self.model.state_dict()}
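
For anyone hitting the same KeyError, here is a minimal sketch of the kind of change that would avoid it. It assumes the checkpoint is a plain dict, as in the line above. It is not the actual fix in the repository: build_checkpoint and checkpoint_epoch are hypothetical helpers, and the epoch value is made up for illustration.

```python
def build_checkpoint(state_dict, epoch):
    """Bundle the model weights with the epoch they were saved at."""
    return {'state_dict': state_dict, 'epoch': epoch}


def checkpoint_epoch(checkpoint, default='unknown'):
    """Return the stored epoch, or a default for checkpoints saved without one."""
    return checkpoint.get('epoch', default)


# Hypothetical use in a training loop (torch import omitted in this sketch):
#   torch.save(build_checkpoint(self.model.state_dict(), epoch), path)

old_ckpt = {'state_dict': {}}              # same shape as the checkpoint in this issue
new_ckpt = build_checkpoint({}, epoch=12)  # illustrative epoch value

# Mirrors the print in predict.py, but tolerates a missing 'epoch' key.
print('pose track ckpt epoch={}'.format(checkpoint_epoch(old_ckpt)))
print('pose track ckpt epoch={}'.format(checkpoint_epoch(new_ckpt)))
```

Saving the epoch at write time fixes new checkpoints. Reading it with a default keeps already-trained checkpoints loadable without retraining.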

We trained with torch 2.0.1. If you were using an older version, may I suggest adding a requirements.txt too?

wenbowen123 (Owner) commented

Can you pull the latest code and retry? This issue should not be related to the PyTorch version. Are you running inside the Docker container?


martinlyra commented May 19, 2023

Nope, we run natively on a computer that has access to the robot's ROS. The fix seems to work fine!
