Prioritized Double IQN #518
/test
Successfully created a job for commit 6faf052.
if args.prioritized:
    betasteps = args.steps / args.update_interval
    rbuf = replay_buffer.PrioritizedReplayBuffer(
        10 ** 6, alpha=0.5, beta0=0.4, betasteps=betasteps,
I didn't find alpha for prioritized replay in the paper https://arxiv.org/abs/1710.10044. Could you confirm where the hyperparameters came from?
The hyperparameters don't come from anywhere. Prioritized Double IQN is a new algorithm.
However, I could use alpha=0.2 and beta0=0.4, which appear to be the values used in https://github.com/valeoai/rainbow-iqn-apex/blob/master/rainbowiqn/args.py, an implementation of Rainbow IQN.
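For context on what is being tuned, here is a minimal sketch of how alpha and beta0 typically act in proportional prioritized replay. It assumes the standard formulation from the prioritized experience replay paper (sampling probability proportional to priority^alpha, importance weights (N * P(i))^(-beta), and beta annealed linearly from beta0 to 1 over betasteps updates). The function names and example priorities are hypothetical, and this is not the PrioritizedReplayBuffer code from this repository, whose normalization details may differ.

```python
import numpy as np


def sampling_probabilities(priorities, alpha):
    """P(i) = p_i^alpha / sum_k p_k^alpha; alpha=0 recovers uniform sampling."""
    scaled = np.asarray(priorities, dtype=np.float64) ** alpha
    return scaled / scaled.sum()


def importance_weights(probs, beta):
    """w_i = (N * P(i))^(-beta), divided by the largest weight (an assumption)."""
    n = len(probs)
    weights = (n * probs) ** (-beta)
    return weights / weights.max()


def annealed_beta(update_count, beta0, betasteps):
    """Linearly anneal beta from beta0 to 1 over betasteps updates."""
    fraction = min(update_count / betasteps, 1.0)
    return beta0 + (1.0 - beta0) * fraction


# Hypothetical priorities (e.g. absolute TD errors), for illustration only.
priorities = [0.1, 1.0, 4.0]
for alpha in (0.2, 0.5):
    probs = sampling_probabilities(priorities, alpha)
    weights = importance_weights(probs, beta=0.4)
    print(alpha, probs.round(3), weights.round(3))
```

Under these assumptions, a smaller alpha such as 0.2 keeps sampling closer to uniform, while 0.5 concentrates more on high-priority transitions; beta0=0.4 only sets where the importance-sampling correction starts before it is annealed toward 1.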
LGTM
/test
Successfully created a job for commit a54f5d7.
/test
Successfully created a job for commit a54f5d7.
/test
Successfully created a job for commit 1a6abb7.
/test
Successfully created a job for commit 1a6abb7.
We should first merge Double IQN.
Here is a comparison against Double IQN
Prioritized IQN wins on 5/7 domains and loses on 2/7!