Replicate Prioritized Experience Replay's reported performance improvements #278

muupan · 2018-06-14T10:18:55Z

Missing details

"all weights w_i were scaled so that max_i w_i = 1". Is max_i w_i computed over a minibatch or the whole buffer?
What is the value of epsilon that is added to absolute TD errors?

muupan · 2018-06-15T13:46:29Z

I asked the author via email and confirmed the following:

max_i w_i is computed over a minibatch
epsilon=0.01
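
For reference, the two confirmed details can be sketched as follows. This is a minimal illustration, not the code used in this repository: the function names, and the `alpha` and `beta` defaults, are assumptions made for the example. Only `eps=0.01` and the per-minibatch max normalization come from the answers above.

```python
import numpy as np


def proportional_priorities(td_errors, eps=0.01, alpha=0.6):
    """Proportional priority p_i = (|delta_i| + eps) ** alpha.

    eps=0.01 is the value confirmed above; alpha is an illustrative default.
    """
    return (np.abs(np.asarray(td_errors, dtype=np.float64)) + eps) ** alpha


def minibatch_is_weights(sample_probs, buffer_size, beta=0.4):
    """Importance-sampling weights w_i = (N * P(i)) ** -beta for one minibatch.

    The weights are divided by their maximum over the minibatch (not over the
    whole buffer), so that max_i w_i = 1. beta is an illustrative default.
    """
    probs = np.asarray(sample_probs, dtype=np.float64)
    weights = (buffer_size * probs) ** (-beta)
    return weights / weights.max()


# Toy buffer of three transitions; the minibatch holds transitions 0 and 2.
td_errors = np.array([0.0, 1.0, -2.0])
priorities = proportional_priorities(td_errors)
probs = priorities / priorities.sum()
weights = minibatch_is_weights(probs[[0, 2]], buffer_size=len(td_errors))
```

In the toy example, the transition with the smallest sampling probability in the minibatch gets weight 1 and the other gets a weight below 1.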

muupan · 2018-10-03T13:20:13Z

breakout: 🆗

muupan · 2018-10-03T13:21:45Z

space invaders: 🆗

muupan · 2018-10-03T13:24:21Z

seaquest: a bit worse

muupan · 2018-10-03T13:25:42Z

beam rider: a bit worse

muupan · 2018-10-03T13:27:32Z

asterix: 🆗

muupan · 2018-10-03T13:28:46Z

qbert: 🆗

muupan · 2018-10-03T13:32:15Z

I compared "Double DQN tuned prioritized lr/4" vs "proportional" in the paper.
similar results: breakout, space invaders, asterix, qbert
a bit worse than the paper: seaquest, beam rider

They seem to use 500,000 frames instead of 108,000 frames for evaluation (B.2.3 of http://arxiv.org/abs/1511.05952), so evaluating with 500,000 frames may close the gap.
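
To give a sense of scale for the two frame counts, here is a rough conversion to agent steps and game time. The 60 frames-per-second rate and the frame skip of 4 are assumptions (typical Atari/DQN settings), not values stated in this thread, and `describe_frame_cap` is a hypothetical helper.

```python
EMULATOR_FPS = 60  # assumption: Atari emulator frame rate
FRAME_SKIP = 4     # assumption: typical DQN action repeat, not stated in this thread


def describe_frame_cap(max_frames):
    """Convert a frame budget into agent steps and minutes of game time."""
    return {
        "agent_steps": max_frames // FRAME_SKIP,
        "game_minutes": max_frames / EMULATOR_FPS / 60,
    }


short_cap = describe_frame_cap(108_000)  # 27,000 agent steps, 30 minutes
long_cap = describe_frame_cap(500_000)   # 125,000 agent steps, about 139 minutes
```

Under these assumptions the 500,000-frame budget is roughly 4.6 times the 108,000-frame one, which matters most in games where a strong agent can keep playing for a long time.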

prabhatnagarajan · 2018-10-04T08:59:10Z

Tested 6 games: Breakout, Space Invaders, Seaquest, Asterix, Beam Rider, Qbert.

Results:

Breakout: 🆗
Space Invaders: 🆗
Seaquest: worse
Beam Rider: slightly worse
Asterix: 🆗
Qbert: 🆗

@muupan suggested that the evaluations in the paper appear to have permitted longer episodes, which could explain our slightly worse performance in two games (Seaquest and Beam Rider).

prabhatnagarajan closed this as completed Oct 4, 2018
