Data Efficient Rainbow with Skiing does not work #86

Somjit77 · 2023-02-16T04:51:24Z

Sampling from the replay buffer goes into an infinite loop when running 'skiing' with the data-efficient Rainbow hyperparameters.

Kaixhin · 2023-02-16T06:50:19Z

Checking the paper, it seems that the algorithm was only tested on 26 games, which doesn't include "skiing", so there's no guarantee in the original work that those hyperparameters are valid for the entire Atari suite.

Somjit77 · 2023-02-16T15:16:23Z

Hi, thanks for pointing this out. However, I still do not understand why it would affect prioritized experience replay sampling. If you run the algorithm with skiing for seed=123, it simply goes into an infinite loop while sampling from the buffer. It works perfectly for uniform sampling, so I am unable to figure out why this would be the case.

Kaixhin · 2023-02-16T15:20:27Z

Looks like a duplicate of #41. PER can get stuck depending on the way its sampling works - as you can see from how uniform sampling works fine.

Somjit77 · 2023-02-16T16:25:48Z

Ah I see, I think the fact that we use 20-step TD compounds this problem as well. Thank you so much. It makes sense now.
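For readers landing here, a minimal sketch of the failure mode discussed above. It is not the repository's code. It assumes a sampler that draws one index per priority segment and rejects the whole batch if any index sits too close to the write pointer for a valid n-step return. The function `sample_batch`, the `max_attempts` guard, and the buffer state are all hypothetical, chosen so that the priority mass lies on recent transitions (new transitions typically enter PER with maximal priority).

```python
import numpy as np


def sample_batch(priorities, write_index, n_step, batch_size, rng, max_attempts=1000):
    """Stratified proportional sampling with whole-batch rejection.

    Hypothetical helper: returns (indices, attempts) or raises RuntimeError
    once max_attempts is exhausted. Without that guard, the retry loop
    would never terminate when no valid batch exists.
    """
    capacity = len(priorities)
    cumulative = np.cumsum(priorities)
    segment_length = cumulative[-1] / batch_size
    segment_starts = np.arange(batch_size) * segment_length
    for attempt in range(1, max_attempts + 1):
        # One uniform draw inside each equal-mass priority segment.
        targets = rng.uniform(0.0, segment_length, size=batch_size) + segment_starts
        idxs = np.minimum(np.searchsorted(cumulative, targets, side="right"), capacity - 1)
        # Slots within n_step positions behind the write pointer cannot
        # form a complete n-step return, so they are treated as invalid.
        distance = (write_index - idxs) % capacity
        if np.all(distance > n_step) and np.all(priorities[idxs] > 0):
            return idxs, attempt
    raise RuntimeError(f"no valid batch after {max_attempts} attempts")


# Assumed buffer state: only the 20 slots just behind the write pointer
# carry any priority.
capacity = 100
write_index = 50
priorities = np.zeros(capacity)
priorities[30:50] = 1.0

rng = np.random.default_rng(123)

# With a short n-step window, a valid batch is found after a few retries.
idxs, attempts = sample_batch(priorities, write_index, n_step=3, batch_size=4, rng=rng)
print("n_step=3 succeeded after", attempts, "attempt(s):", idxs)

# With n_step=20, every slot that has priority is invalid, so the
# rejection loop can never succeed.
try:
    sample_batch(priorities, write_index, n_step=20, batch_size=4, rng=rng)
except RuntimeError as err:
    print("n_step=20:", err)
```

In this toy setup, uniform sampling would not hit the problem because most slots in the buffer are valid; only sampling proportional to priority concentrates every draw inside the invalid window, and a larger n-step widens that window.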

Kaixhin added invalid wontfix labels Feb 16, 2023

Kaixhin closed this as completed Feb 16, 2023
