Data Efficient Rainbow with Skiing does not work #86

Closed
Somjit77 opened this issue Feb 16, 2023 · 4 comments

Comments

@Somjit77

Sampling from the replay buffer fails when using the data-efficient Rainbow hyperparameters on 'skiing': it goes into an infinite loop.

@Kaixhin
Owner

Kaixhin commented Feb 16, 2023

Checking the paper, it seems that the algorithm was only tested on 26 games, which doesn't include "skiing", so there's no guarantee in the original work that those hyperparameters are valid for the entire Atari suite.

@Kaixhin Kaixhin closed this as completed Feb 16, 2023
@Somjit77
Author

Hi, thanks for pointing this out. However, I still do not understand why it would affect prioritized experience replay sampling. If you run the algorithm with skiing for seed=123, it simply goes into an infinite loop while sampling from the buffer. It works perfectly for uniform sampling, so I am unable to figure out why this would be the case.

@Kaixhin
Owner

Kaixhin commented Feb 16, 2023

Looks like a duplicate of #41. PER can get stuck depending on the way its sampling works - as you can see from how uniform sampling works fine.

@Somjit77
Author

Ah, I see. I think the fact that we use 20-step TD compounds this problem as well. Thank you so much. It makes sense now.
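
For anyone landing here later, below is a minimal sketch of one way a prioritized sampler can get stuck. It is not this repository's code: `sample_stratified`, `is_valid`, and `max_tries` are hypothetical names, and the validity rule (a transition must sit more than `n_step` slots behind the write pointer) is an assumption made for illustration. The sketch splits the total priority mass into equal segments, draws one index per segment, and retries the whole batch if any index is unusable. If one segment's mass lies entirely on recent transitions that are still inside the n-step window, no retry can succeed, so an unbounded `while` loop would never exit. A larger `n_step` widens that window.

```python
import bisect
import random
from itertools import accumulate


def is_valid(idx, priorities, write_idx, n_step):
    """Assumed rule: nonzero priority and more than n_step slots behind the write pointer."""
    behind = (write_idx - idx) % len(priorities)
    return priorities[idx] > 0 and behind > n_step


def sample_stratified(priorities, write_idx, n_step, batch_size, rng, max_tries=100):
    """Draw one index per equal-mass priority segment; retry the batch a bounded number of times."""
    cumsum = list(accumulate(priorities))
    segment = cumsum[-1] / batch_size
    last = len(priorities) - 1
    for _ in range(max_tries):
        idxs = [
            # Map a uniform draw inside segment k to the index that owns that priority mass.
            min(bisect.bisect_right(cumsum, (k + rng.random()) * segment), last)
            for k in range(batch_size)
        ]
        if all(is_valid(i, priorities, write_idx, n_step) for i in idxs):
            return idxs
    # Bounded retries turn a silent hang into a visible error.
    raise RuntimeError("no valid batch found: a segment may contain only invalid transitions")


# Illustrative data: the three newest transitions (slots 5-7) hold almost all priority mass,
# but they are within n_step of the write pointer, so the second segment can never be valid.
priorities = [0.01] * 16
for i in (5, 6, 7):
    priorities[i] = 10.0

try:
    sample_stratified(priorities, write_idx=8, n_step=3, batch_size=4, rng=random.Random(123))
except RuntimeError as err:
    print(err)
```

Uniform sampling is not tied to priority mass, so under the same assumptions it would only need to avoid the invalid slots, which is consistent with it working fine here.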
