Performance on Hopper-v2 #41
Thank you for the speedy reply! I ran v0.2 with the default hyper-parameters, and I'll double-check that they match what you posted. To confirm, the performance metric is logged to "evaluation/Returns Mean", right? Also, would you be so kind as to share your plotting code? Did you have to apply smoothing to get the solid blue line in your graph? If I plot "evaluation/Return Mean" averaged over 5 seeds directly, I get a very jagged zigzag pattern in my graph.
I'll run it again just to check. Yes, that's the correct metric. I used viskit for plotting and applied temporal smoothing. I think the important thing is that the thick, shaded region is about the same width as yours.
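For anyone else trying to reproduce the plots: viskit does the smoothing internally, but the idea is just a moving average applied per seed before averaging. Below is a minimal NumPy sketch of that, not viskit's actual implementation; the function names and the edge-padding choice are my own.

```python
import numpy as np

def smooth(curve, window=10):
    """Moving average over a 1-D return curve.

    Pads the front with the first value so the output
    has the same length as the input.
    """
    curve = np.asarray(curve, dtype=float)
    kernel = np.ones(window) / window
    padded = np.concatenate([np.full(window - 1, curve[0]), curve])
    return np.convolve(padded, kernel, mode="valid")

def aggregate_seeds(runs, window=10):
    """Smooth each seed's curve, then return the mean and std across seeds.

    The mean gives the solid line; mean +/- std gives the shaded region.
    """
    smoothed = np.stack([smooth(r, window) for r in runs])
    return smoothed.mean(axis=0), smoothed.std(axis=0)
```

With matplotlib you would then plot the mean and use `fill_between` for the shaded band.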
Ah, I think the issue is that the paths are sometimes not exactly ... So, this seems like a bug in the logging/eval code, but not in the training (phew!). I'll push a fix soon.
Thanks! I'll rerun the code and let you know how it goes.
Yeah, it's a bit different... It's not a big difference, but I'll look into it. The only difference I can think of is that I switched to batch training rather than online training; I'm expecting to add support for online mode soon.
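For readers unfamiliar with the distinction being discussed: my reading (an assumption, not the repo's actual training loop) is that "online" means one gradient update interleaved after every environment step, while "batch" means collecting a whole epoch of transitions first and then running the same number of updates in one block. A schematic sketch, with `do_env_step` and `do_train_step` as hypothetical callbacks:

```python
def online_mode(n_env_steps, do_env_step, do_train_step):
    """One gradient update interleaved after every environment step."""
    for _ in range(n_env_steps):
        do_env_step()    # e.g. env.step(...) -> replay buffer
        do_train_step()  # one SGD update on a sampled minibatch

def batch_mode(n_env_steps, steps_per_epoch, do_env_step, do_train_step):
    """Collect a full epoch of transitions, then run the same total
    number of gradient updates in one block."""
    for _ in range(n_env_steps // steps_per_epoch):
        for _ in range(steps_per_epoch):
            do_env_step()
        for _ in range(steps_per_epoch):
            do_train_step()
```

Both modes perform the same total number of environment steps and updates; they differ only in how stale the replay buffer is when each update happens, which can plausibly change the learning curve.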
Okie, thanks so much!
(originally posted under another issue, but re-posting for visibility)
Sorry to open this up again, but I am unable to obtain comparable results to the TensorFlow implementation using the master branch. I've posted the training graphs for the PyTorch and TensorFlow implementations below for comparison. Both results were averaged over 5 seeds.
The TF implementation's final performance is higher, and it also learns faster. Its curve closely matches the shape of the graph in the paper, i.e. a quick increase followed by a plateau at around 400 epochs.
Does the pytorch graph look similar to what you obtained too?
I just want to mention that your repo is awesome. Answering my pestering questions is not your responsibility :) and I really appreciate any help here.