
Unit test Prioritised Experience Replay Memory #16

Closed
Kaixhin opened this issue Apr 18, 2018 · 5 comments
Kaixhin (Owner) commented Apr 18, 2018

PER was reported to cause issues (decreasing the performance of a DQN) when ported to another codebase. Although PER can legitimately cause performance to decrease, it is still likely that there is a bug in this implementation, so the memory should be unit tested.

Kaixhin added the bug label on Apr 18, 2018
Kaixhin changed the title from "Unit Test Prioritised Experience Replay Memory" to "Unit test Prioritised Experience Replay Memory" on Apr 18, 2018
Ashutosh-Adhikari commented Apr 23, 2018

I am not sure whether this is the correct logic behind PER.

What the current code does: in the training loop, when we call mem.append(), the new transition is stored with a default priority, transitions.max().

Shouldn't we instead calculate the priority before appending, and append with that priority? That would keep the complexity the same and attach the right priority to the sample straight away (see the sketch below).

To the best of my knowledge, this level of detail is not specified in the paper.
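
For clarity, a minimal sketch of the two append strategies being contrasted. This is a toy list-based memory with hypothetical names, not the repository's actual memory.py; a real implementation would use a sum tree for O(log n) sampling.

```python
import random


class ToyPrioritisedMemory:
    """Toy proportional-prioritisation buffer; names are illustrative only."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.data, self.priorities = [], []
        self.max_priority = 1.0  # initial priority

    # Current behaviour: a new transition gets the maximum priority seen so far,
    # so it is guaranteed to be sampled (and then re-prioritised) at least once.
    def append(self, transition):
        if len(self.data) == self.capacity:
            self.data.pop(0)
            self.priorities.pop(0)
        self.data.append(transition)
        self.priorities.append(self.max_priority)

    # Proposed alternative: attach a TD-error-based priority at append time.
    # This already requires the target value, i.e. future states and a network pass.
    def append_with_priority(self, transition, td_error, omega=0.5):
        priority = abs(td_error) ** omega
        self.max_priority = max(self.max_priority, priority)
        if len(self.data) == self.capacity:
            self.data.pop(0)
            self.priorities.pop(0)
        self.data.append(transition)
        self.priorities.append(priority)

    # Proportional sampling: O(n) here, O(log n) with a sum tree.
    def sample_index(self):
        return random.choices(range(len(self.data)), weights=self.priorities, k=1)[0]
```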

Kaixhin (Owner) commented Apr 23, 2018

Adding new transitions with the max priority is in line 6 of the algorithm in the PER paper; the initial value, 1, is given in line 2. Also, calculating the priority means having access to the future states (even more states when calculating multi-step returns) and doing the whole target calculation on a single sample, so it's not that cheap.
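
To illustrate the cost, a rough sketch of what would have to run for every single transition if priorities were assigned at append time. This is an assumption on my part, written as a plain multi-step double-DQN TD error rather than the distributional loss Rainbow actually uses; the function and argument names are hypothetical, and terminal-state handling is omitted.

```python
import torch


def initial_priority(online_net, target_net, trajectory, n=3, gamma=0.99, omega=0.5):
    # trajectory holds (state, action, reward) for steps t .. t+n, so the n
    # future transitions must already be available before step t can be appended.
    state, action, _ = trajectory[0]
    n_step_return = sum(gamma ** k * trajectory[k][2] for k in range(n))
    bootstrap_state = trajectory[n][0]

    with torch.no_grad():
        # Double-DQN target: online net selects the action, target net evaluates it.
        best_action = online_net(bootstrap_state.unsqueeze(0)).argmax(dim=1)
        bootstrap_value = target_net(bootstrap_state.unsqueeze(0))[0, best_action]
        target = n_step_return + (gamma ** n) * bootstrap_value
        current = online_net(state.unsqueeze(0))[0, action]

    td_error = (target - current).abs().item()
    return td_error ** omega  # proportional prioritisation: p = |delta|^omega
```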

marintoro commented:
I just read this in the paper "Distributed Prioritized Experience Replay" by D. Horgan et al.:
"In Prioritized DQN (Schaul et al., 2016) priorities for new transitions were initialized to the maximum priority seen so far, and only updated once they were sampled."

But it's interesting to note that they changed this because it did not scale well (that paper is all about learning from a large number of distributed actors, which compute the initial priorities themselves).
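
For context, a rough sketch of the scheme described in that quote, where each actor computes initial priorities for its local batch before sending it to the shared replay. Everything here, including compute_priority and remote_memory.extend, is a hypothetical stand-in rather than code from either repository.

```python
def actor_loop(env, local_net, remote_memory, compute_priority, batch_size=50):
    """Hypothetical Ape-X-style actor: priorities are computed locally, in
    batches, with the actor's (possibly stale) network copy, instead of
    initialising every new transition to the maximum priority seen so far."""
    local_buffer = []
    state = env.reset()
    while True:
        action = local_net.act(state)
        next_state, reward, done, _ = env.step(action)
        local_buffer.append((state, action, reward, next_state, done))
        if len(local_buffer) >= batch_size:
            priorities = [compute_priority(local_net, t) for t in local_buffer]
            remote_memory.extend(local_buffer, priorities)  # ship batch + priorities
            local_buffer.clear()
        state = env.reset() if done else next_state
```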

Ashutosh-Adhikari commented:
@Kaixhin Yep, I understand it now, given your point about n-step TD targets.

Kaixhin (Owner) commented May 6, 2018

Results on 3 games so far look promising, so closing unless a specific problem is identified.
