Prioritized Double IQN #518

Merged: 17 commits merged on Oct 31, 2019

@prabhatnagarajan (Contributor) commented Jul 28, 2019

We should first merge Double IQN.

Here is a comparison against Double IQN:

| Game | Double IQN | Prioritized Double IQN |
| --- | --- | --- |
| Asterix | 507353.8 | 738166.66 |
| Bowling | 80.33 | 72.72 |
| Hero | 28564.58 | 35293.26 |
| MontezumaRevenge | 5.55 | 3.79 |
| Qbert | 29531.1 | 25763.95 |
| Seaquest | 30870.0 | 31905.0 |
| Venture | 719.51 | 1369.84 |
| VideoPinball | 731942.25 | 717376.0 |

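By the scores listed above, Prioritized Double IQN scores higher on 4 of the 8 games (Asterix, Hero, Seaquest, Venture) and lower on the other 4 (Bowling, MontezumaRevenge, Qbert, VideoPinball).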
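
For anyone who wants to re-check the per-game tally, here is a minimal sketch that recomputes it from the scores listed in this comment. The dictionary layout and variable names are only illustrative and are not part of the PR.

```python
# Scores copied from the comparison in this comment:
# game -> (Double IQN, Prioritized Double IQN)
scores = {
    "Asterix": (507353.8, 738166.66),
    "Bowling": (80.33, 72.72),
    "Hero": (28564.58, 35293.26),
    "MontezumaRevenge": (5.55, 3.79),
    "Qbert": (29531.1, 25763.95),
    "Seaquest": (30870.0, 31905.0),
    "Venture": (719.51, 1369.84),
    "VideoPinball": (731942.25, 717376.0),
}

# A "win" here simply means the prioritized variant has the higher score.
wins = [game for game, (double, prio) in scores.items() if prio > double]
losses = [game for game, (double, prio) in scores.items() if prio < double]

print(f"Prioritized Double IQN higher on {len(wins)}/{len(scores)}: {wins}")
print(f"Prioritized Double IQN lower on {len(losses)}/{len(scores)}: {losses}")
```

This compares raw scores only. It says nothing about variance across seeds or whether the differences are significant.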

@prabhatnagarajan
Contributor Author

/test

@pfn-ci-bot
Collaborator

Successfully created a job for commit 6faf052:

@prabhatnagarajan changed the title from "[WIP] Prioritized Double IQN" to "Prioritized Double IQN" on Sep 3, 2019
@toslunar toslunar self-assigned this Sep 17, 2019
chainerrl/agents/iqn.py (outdated; resolved)
if args.prioritized:
    betasteps = args.steps / args.update_interval
    rbuf = replay_buffer.PrioritizedReplayBuffer(
        10 ** 6, alpha=0.5, beta0=0.4, betasteps=betasteps,
Member

I didn't find alpha for prioritized replay in the paper https://arxiv.org/abs/1710.10044. Could you confirm where the hyperparameters came from?

Contributor Author

The hyperparameters don't come from anywhere. Prioritized Double IQN is a new algorithm.

Contributor Author

However, I could use alpha=0.2 and beta0=0.4, which appears to be the case here: https://github.com/valeoai/rainbow-iqn-apex/blob/master/rainbowiqn/args.py, which implements Rainbow IQN.
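
To make the roles of `alpha`, `beta0`, and `betasteps` in this thread concrete, here is a rough sketch of proportional prioritized replay in plain Python. It assumes the usual formulation (sampling probability proportional to priority raised to `alpha`, importance weights raised to `-beta`, and `beta` annealed linearly from `beta0` to 1). The function names and the `steps`/`update_interval` values are hypothetical and are not ChainerRL's API or this PR's defaults, and the library's actual buffer may differ in details such as weight normalization.

```python
def sampling_probabilities(priorities, alpha):
    """P(i) = p_i**alpha / sum_k p_k**alpha; alpha=0 gives uniform sampling."""
    scaled = [p ** alpha for p in priorities]
    total = sum(scaled)
    return [s / total for s in scaled]


def annealed_beta(step, beta0, betasteps):
    """Linearly anneal beta from beta0 to 1.0 over betasteps updates (assumed schedule)."""
    fraction = min(1.0, step / betasteps)
    return beta0 + (1.0 - beta0) * fraction


def importance_weights(probs, beta):
    """w_i = (N * P(i))**(-beta), divided by the largest weight."""
    n = len(probs)
    raw = [(n * p) ** (-beta) for p in probs]
    max_w = max(raw)
    return [w / max_w for w in raw]


# Hypothetical run length, only to show how betasteps is derived in the snippet above.
steps = 1000
update_interval = 4
betasteps = steps / update_interval  # number of updates over which beta is annealed

priorities = [1.0, 4.0]  # made-up priorities for two transitions
for alpha in (0.0, 0.2, 0.5):
    print("alpha", alpha, "->", sampling_probabilities(priorities, alpha))

for update in (0, betasteps / 2, betasteps):
    print("update", update, "-> beta", annealed_beta(update, 0.4, betasteps))
```

Under these assumptions, a smaller `alpha` such as 0.2 pulls sampling closer to uniform than 0.5 does, so the choice between the two trades off how aggressively high-priority transitions are replayed.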

chainerrl/agents/iqn.py (resolved)
chainerrl/agents/iqn.py (resolved)
@toslunar (Member) left a comment

LGTM

@toslunar
Member

/test

@pfn-ci-bot
Collaborator

Successfully created a job for commit a54f5d7:

@prabhatnagarajan
Contributor Author

/test

@pfn-ci-bot
Collaborator

Successfully created a job for commit a54f5d7:

chainerrl/agents/iqn.py (resolved)
@toslunar
Member

/test

@pfn-ci-bot
Collaborator

Successfully created a job for commit 1a6abb7:

@prabhatnagarajan
Contributor Author

/test

@pfn-ci-bot
Collaborator

Successfully created a job for commit 1a6abb7:

@prabhatnagarajan prabhatnagarajan merged commit c7452e9 into chainer:master Oct 31, 2019
@muupan muupan added this to the v0.8 milestone Feb 6, 2020
4 participants