Human-expert normalized scores #52

Open
ThisIsIsaac opened this issue Aug 11, 2019 · 4 comments

Comments

@ThisIsIsaac
Contributor

The Rainbow DQN paper uses human-expert normalized scores, so I am not sure how to evaluate the training results against the original paper. Do you know what values were used for human expert scores?

I found some of the values in various papers, but I'm not sure whether we can use the same numbers, or how to compute a single normalized value across all Atari games:
[image]

@Kaixhin
Owner

Kaixhin commented Aug 11, 2019

Looks like I came up with a script in my Atari repo that can do this, but I can't remember where I got the details (must have scoured through lots of DQN papers). I'm not going to do it myself, but if you want to submit a PR that adds the computation and plotting of this score to test.py then I'd be happy to accept it.
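For reference, a minimal sketch of the kind of computation being discussed, assuming the usual convention that 0% corresponds to a random agent and 100% to the human baseline, with the median taken across games as the single summary number. The function names and the `baselines` dictionary layout are hypothetical, not taken from the linked script or from test.py, and the per-game baseline values have to be supplied by the caller.

```python
from statistics import median


def human_normalized_score(agent, random, human):
    """Express an agent score as a percentage of the human-random gap.

    0 means random-level play, 100 means human-level play.
    """
    if human == random:
        raise ValueError("human and random baselines must differ")
    return 100.0 * (agent - random) / (human - random)


def median_normalized_score(agent_scores, baselines):
    """Summarise several games with the median of per-game normalized scores.

    agent_scores: dict mapping game name -> raw agent score
    baselines:    dict mapping game name -> (random_score, human_score)
                  (hypothetical layout, chosen for this sketch)
    """
    per_game = [
        human_normalized_score(score, *baselines[game])
        for game, score in agent_scores.items()
    ]
    return median(per_game)
```

The median is used here rather than the mean so that one game with a very small human-random gap cannot dominate the summary.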

@ThisIsIsaac
Contributor Author

The scores for some games are different from the ones from the DQN paper:

beam_rider:

  • DQN paper's human score: 7456
  • your code's human score: 5774.70

Enduro:

  • DQN paper's human score: 368
  • your code's human score: 309.60

Qbert:

  • DQN paper's human score: 18900
  • your code's human score: 13455.00

Pong:

  • DQN paper's human score: -3
  • your code's human score: 9.30

Space Invaders:

  • DQN paper's human score: 3690
  • your code's human score: 1652.30

Do you remember which papers you got these numbers from?
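To show why the choice of baseline matters, here is a rough sketch that normalizes one agent score against both Space Invaders human values listed above. The agent score and the random score are hypothetical placeholders picked only for illustration, and the formula assumes the usual 0% = random, 100% = human convention.

```python
def human_normalized_score(agent, random, human):
    """Agent score as a percentage of the human-random gap."""
    return 100.0 * (agent - random) / (human - random)


# Human baselines for Space Invaders quoted in this thread.
HUMAN_DQN_PAPER = 3690.0
HUMAN_SCRIPT = 1652.30

# Hypothetical placeholders, NOT values from any paper or from the script.
AGENT_SCORE = 2000.0
RANDOM_SCORE = 0.0

score_vs_paper = human_normalized_score(AGENT_SCORE, RANDOM_SCORE, HUMAN_DQN_PAPER)
score_vs_script = human_normalized_score(AGENT_SCORE, RANDOM_SCORE, HUMAN_SCRIPT)

print(f"normalized against DQN paper value: {score_vs_paper:.1f}%")
print(f"normalized against script value:    {score_vs_script:.1f}%")
```

With these placeholder inputs the same raw score lands below human level under one baseline and above it under the other, so results are only comparable when the same baseline table is used.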

@Kaixhin
Owner

Kaixhin commented Aug 12, 2019

Unfortunately not. Maybe you can email one of the authors of Rainbow to see if they can give you a list of the human rewards and also confirm the score calculation?

@Kaixhin
Owner

Kaixhin commented Aug 17, 2019

Although they apparently got the human rewards from the original paper, you can check this paper for human rewards and evaluation.
