Performance of release v1.0 on Space Invaders #26
Thanks for checking! I've dug through my original results for 1.0 and I've got the models/plots for Beam Rider, Enduro, Ms. Pac-Man and Seaquest. Performance for Seaquest has also dropped on 1.1; I don't have 1.1 results for any other games. I got rid of my suboptimal 1.0 Frostbite results after getting the expected results on 1.1. But for some reason I've lost my Space Invaders results, so I'm now confused as to where those numbers came from. Currently I'm running with pytorch 0.4.1, atari_py 0.1.1 and opencv-python 3.4.2.16, but I've not been tracking library versions. I always run a single experiment with the same seed. Let's leave this issue open until the results can be replicated. I'm currently seeing if removing gradient clipping from 1.1 will get Space Invaders to work, but that'll take a week, assuming I can even hang on to my current GPU. If you still have the trained model from this experiment, a quick sanity check would be to load it and test with epsilon=0.001.
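For reference, a minimal sketch of that sanity check, assuming a saved network that maps a state tensor to Q-values and a Gym-style Atari wrapper; the model path, `evaluate` helper and the `reset`/`step` interface are placeholders, not the repository's actual API:

```python
# Hypothetical sketch: load a trained model and evaluate it epsilon-greedily
# with a small epsilon, rather than acting fully greedily.
import random

import torch


def evaluate(env, net, n_episodes=10, epsilon=0.001):
    net.eval()
    returns = []
    for _ in range(n_episodes):
        state, done, episode_return = env.reset(), False, 0.0
        while not done:
            if random.random() < epsilon:
                # A rare random action breaks the deterministic loops a
                # purely greedy policy can fall into on Atari.
                action = env.action_space.sample()
            else:
                with torch.no_grad():
                    action = net(state.unsqueeze(0)).argmax(1).item()
            state, reward, done = env.step(action)  # placeholder interface
            episode_return += reward
        returns.append(episode_return)
    return sum(returns) / len(returns)


# net = torch.load('results/model.pth')  # hypothetical path
# print(evaluate(env, net, epsilon=0.001))
```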
So I just did the sanity check you suggested, testing with epsilon=0.001, and it doesn't really change anything (I didn't get exactly the same result, but almost). I don't expect those minor differences to be a problem, but I may be wrong... I actually have a pretty good CPU/GPU available right now and could run some other tests (I'd really like to get a working version on Space Invaders! It's the sanity check for my own multi-agent version ^^)
I would have been using torch 0.4.0 at the time, and I don't expect opencv-python to change much between minor versions (I may have been on the same version as you anyway). If you have compute to test, then the next thing to check would be taking master and running Space Invaders with the priority weights not included in the new priorities. epsilon=0.001 was needed for Pong to report the right scores, and using log softmax in training prevents numerical problems (I had these in Q*bert), so I'm pretty sure both are needed.
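To illustrate the two points above, a self-contained sketch (not the repository's actual training code; `errors` and `weights` below are hypothetical stand-in tensors):

```python
import torch
import torch.nn.functional as F

# With large logits, softmax can underflow to exactly 0 in float32, so taking
# log afterwards produces -inf; log_softmax computes the same quantity stably
# via the log-sum-exp trick.
logits = torch.tensor([[100.0, 0.0, -100.0]])
print(torch.log(F.softmax(logits, dim=1)))  # roughly [0., -100., -inf]
print(F.log_softmax(logits, dim=1))         # roughly [0., -100., -200.]

# For the prioritised-replay point: compute new priorities from the raw
# per-sample errors, and use the importance-sampling weights only to scale
# the loss.
errors = torch.randn(32)   # stand-in for per-sample errors
weights = torch.rand(32)   # stand-in for importance-sampling weights
new_priorities = errors.detach().abs()   # weights *not* folded in
loss = (weights * errors.pow(2)).mean()  # weights used only here
```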
Not sure I fully understand what you mean there.
Reverting d6538df and running Space Invaders would be a good test. |
Hum, I reverted this commit manually because I had some conflicts (it was only 5 changed lines). I don't really know how to share the new branch I just made for the sanity check? Edit: I also had to upgrade my plotly version from 2.5.1 to 3.1.0 to make it work.
Fork this repo, make a branch with the changes and just point to it in this issue. |
Here is the change I made to revert the gradient clipping. |
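For anyone following along, the clipping being removed looks roughly like this in PyTorch (an illustrative snippet with a toy network and a hypothetical `max_norm` value, not the actual diff):

```python
import torch
from torch import nn, optim
from torch.nn.utils import clip_grad_norm_

net = nn.Linear(4, 2)  # toy stand-in for the Q-network
optimiser = optim.Adam(net.parameters(), lr=1e-4)

loss = net(torch.randn(8, 4)).pow(2).mean()  # toy loss
optimiser.zero_grad()
loss.backward()
clip_grad_norm_(net.parameters(), max_norm=10)  # the line the revert removes
optimiser.step()
```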
I would say wait until about 18M steps just in case, but it doesn't look promising. Of all the hyperparameters I might have changed, I might have used
Closing because an agent trained from |
I just launched the release v1.0 (commit 952fcb4) on Space Invaders for the whole weekend (around 25M steps). I used the exact same code with the exact same random seed.
I got much lower performance than the results you are showing. Here are the plots of Q-values and rewards:
![q_values_v1 0](https://user-images.githubusercontent.com/10373813/43708463-c98232c0-996a-11e8-9c38-5e06d4cfe84d.png)
![reward_v1 0](https://user-images.githubusercontent.com/10373813/43708471-cbc867fc-996a-11e8-8340-0b2f3a3f94b7.png)
Could you explain exactly how you got your results for this release? Did you run multiple experiments with different random seeds and average them, or did you just take the best one?
Or maybe it's a pytorch, atari_py or other library issue? Could you list all your library versions?
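As a side note on reproducibility, a minimal sketch of pinning seeds and recording library versions (the seed value is hypothetical, and even identical seeds don't guarantee identical GPU runs unless cuDNN is forced into deterministic mode):

```python
import random

import cv2  # opencv-python
import numpy as np
import torch

SEED = 123  # hypothetical; reuse whatever seed the original run used
random.seed(SEED)
np.random.seed(SEED)
torch.manual_seed(SEED)
if torch.cuda.is_available():
    torch.cuda.manual_seed_all(SEED)
torch.backends.cudnn.deterministic = True  # trade speed for reproducibility
torch.backends.cudnn.benchmark = False

# Record the library versions alongside the results for later comparison.
print('torch', torch.__version__)
print('numpy', np.__version__)
print('opencv', cv2.__version__)
```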