
Clarification about Training and Evaluation Task Split #9

Closed
rasoolfa opened this issue Aug 19, 2020 · 1 comment
rasoolfa commented Aug 19, 2020

Hi,

Thanks for sharing this repository. It is great!
I'd like to ask about the "Training and Evaluation Task Split" in Appendix D and how the results in Tables 1 and 3 are reported; I am a bit confused about how this was done.
For simplicity, let's assume BCQ and Maze2D are being used. Which of the following correctly describes what was done in this paper:

  1. BCQ is trained on "maze2d-umaze-v1", and the learned model is then used to report results on "maze2d-eval-umaze-v1"? In other words, maze2d-eval-umaze-v1 is not used for training, only for reporting results?

  2. BCQ's hyperparameters are tuned on "maze2d-umaze-v1". BCQ is then trained with those hyperparameters and evaluated on "maze2d-eval-umaze-v1"? In other words, maze2d-eval-umaze-v1 is used for both training and evaluation?

  3. Or any other scenario?
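To make the distinction between the two scenarios concrete, here is a minimal sketch. The `train_bcq` and `evaluate` functions are hypothetical stand-ins, not the repository's actual API; only the dataset names come from the question above.

```python
# Illustrative sketch of the two scenarios in the question.
# train_bcq and evaluate are hypothetical placeholders, NOT the repo's real API.

def train_bcq(dataset_name, hyperparams=None):
    """Stand-in for training BCQ on the named offline dataset."""
    return {"trained_on": dataset_name, "hyperparams": hyperparams}

def evaluate(policy, env_name):
    """Stand-in for rolling out a trained policy on the named environment."""
    return {"policy": policy, "evaluated_on": env_name}

# Scenario 1: train on maze2d-umaze-v1; the eval task is held out and
# used only for reporting results.
policy_1 = train_bcq("maze2d-umaze-v1")
result_1 = evaluate(policy_1, "maze2d-eval-umaze-v1")

# Scenario 2: tune hyperparameters on maze2d-umaze-v1, then train AND
# evaluate on maze2d-eval-umaze-v1.
tuned = {"lr": 1e-3}  # hypothetical tuned values
policy_2 = train_bcq("maze2d-eval-umaze-v1", hyperparams=tuned)
result_2 = evaluate(policy_2, "maze2d-eval-umaze-v1")

print(result_1["policy"]["trained_on"])  # maze2d-umaze-v1
print(result_2["policy"]["trained_on"])  # maze2d-eval-umaze-v1
```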

Thanks for your help.

rasoolfa (Author) commented:
I asked this in the d4rl repo, which I believe is more relevant.
