
Results and args used for space invaders? #1

Closed
gtoubassi opened this issue May 1, 2016 · 4 comments

@gtoubassi

First off, thanks for sharing this awesome repo. I am attempting to reproduce the DeepMind results with TF and am slowly getting better results as I buff out the many subtle issues. Your repo has really helped me figure things out!

I wanted to know what results you get for Space Invaders (in a public post I saw you mentioned 1500), and exactly what args you are using. I haven't run deep_rl_ale for the full 200 epochs, but at about epoch 125 (with all default args) I was seeing average scores of about 1000-1100. Maybe I just need to let it run, but I wanted to make sure.

Thanks much!

@Jabberwockyll (Owner)

> First off, thanks for sharing this awesome repo. I am attempting to reproduce the DeepMind results with TF and am slowly getting better results as I buff out the many subtle issues. Your repo has really helped me figure things out!

You're welcome! I'm excited that it's actually being used by others!

> I wanted to know what results you get for Space Invaders (in a public post I saw you mentioned 1500), and exactly what args you are using. I haven't run deep_rl_ale for the full 200 epochs, but at about epoch 125 (with all default args) I was seeing average scores of about 1000-1100. Maybe I just need to let it run, but I wanted to make sure.

I got 1514 using --double_dqn and --gradient_clip=10 (the double-DQN target is sketched below). All other args were the defaults. DeepMind reports 1975 in the Nature paper and 3154 with double DQN. Those are significantly higher than my results, but I'm unaware of any differences between my implementation and theirs that might cause this. However, it looks like training hadn't yet converged and was still improving when the experiment ended:

[Figure: space_invaders_scores — average score per epoch over the training run]

It seems like I was getting scores in the 1000-1300 range around epoch 125.
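For anyone unfamiliar with the --double_dqn flag: Double DQN (van Hasselt et al.) uses the online network to choose the greedy next action and the target network to evaluate it, which curbs Q-value overestimation. Here is a minimal NumPy sketch of that target computation; the function and argument names are illustrative, not this repo's actual code:

```python
import numpy as np

def double_dqn_targets(rewards, next_q_online, next_q_target, terminals, gamma=0.99):
    """Double DQN targets for a batch of transitions.

    next_q_online / next_q_target are (batch, n_actions) Q-value arrays
    from the online and target networks, evaluated at the next states.
    """
    # The online network selects the greedy next action...
    best_actions = np.argmax(next_q_online, axis=1)
    # ...and the target network supplies the value estimate for it.
    next_values = next_q_target[np.arange(len(rewards)), best_actions]
    # Terminal transitions get no bootstrapped value.
    return rewards + gamma * next_values * (1.0 - terminals)
```

Plain DQN would instead take np.max(next_q_target, axis=1), letting the same network both select and evaluate the action.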

I don't know if it would have kept improving, but it seems to be progressing much more slowly than theirs. I have since changed the initialization of the moving averages in my RMSProp implementation, which might make training progress a little faster initially, but I don't think it would make a major difference.
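On the RMSProp point: the moving average in question is the running mean of squared gradients that the update divides by. A toy sketch of why its initialization matters (the constants here are illustrative, not the repo's actual hyperparameters):

```python
import numpy as np

def rmsprop_step(param, grad, g2, lr=0.00025, decay=0.95, eps=1e-8):
    """One RMSProp update; g2 is the moving average of squared gradients."""
    g2 = decay * g2 + (1.0 - decay) * grad ** 2
    param = param - lr * grad / np.sqrt(g2 + eps)
    return param, g2

# Starting g2 at 0 leaves the denominator tiny at first, so the earliest
# updates come out several times larger than the steady-state step size.
# Seeding g2 with grad**2 (or another nonzero value) removes that initial
# transient but changes little once the average has warmed up.
w, g2 = 1.0, 0.0
for step in range(3):
    w, g2 = rmsprop_step(w, 0.1, g2)
```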

Let me know what results you get with what settings! It takes a while to run these experiments, so the more data the better. You could try running it for more than 200 epochs if you have the time and see if it still improves. Let me know if you have any hypotheses about the performance differences as well. I plan on putting all of my results in the wiki when I have finished testing on a couple more games.

@gtoubassi (Author)

That's very helpful. I have a few other random questions if you are willing to contact me over email (my GitHub username on gee male). I would also be interested in a more systematic hunt to reproduce the DeepMind results with TF if you are game. Perhaps we could mount an offensive with help from others in the deep-q-learning Google group.

@nishithbsk

This question is off-topic for this discussion, but how did you use TensorBoard to obtain the graph of score_per_game vs. epochs? I tried tensorboard --logdir <path/to/records/dir_containing_tf_events> but was not able to see any graphs. Maybe this broke when I upgraded my TensorFlow to 0.8?

@Jabberwockyll (Owner)

Try tensorboard --logdir=${PWD}. You can launch tensorboard from any parent directory too. I'm using 0.8 and it works fine.
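For completeness: TensorBoard only shows scalar plots if the training run wrote summary events somewhere under the --logdir you point it at. I'm not certain exactly how this repo logs its scores, but the general recipe with the TF 0.8-era API looks like this (the tag and directory names are made up for illustration):

```python
import tensorflow as tf

# TF 0.8-era summary API; later releases renamed these to
# tf.summary.scalar and tf.summary.FileWriter.
score = tf.placeholder(tf.float32, [])
summary_op = tf.scalar_summary('score_per_game', score)
writer = tf.train.SummaryWriter('records/space_invaders')

with tf.Session() as sess:
    # Dummy per-epoch averages, just to demonstrate the logging calls.
    for epoch, avg in enumerate([310.0, 540.0, 820.0]):
        writer.add_summary(sess.run(summary_op, feed_dict={score: avg}), epoch)
    writer.flush()
```

If an events file ends up under records/, then tensorboard --logdir=records (or --logdir=${PWD} from the directory above it) should show the score_per_game curve.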
