
Clarification about Training and Evaluation Task Split #9

Closed
rasoolfa opened this issue Aug 19, 2020 · 1 comment
rasoolfa commented Aug 19, 2020

Hi,

Thanks for sharing this repository. It is great!
I'd like to ask about the "Training and Evaluation Task Split" in Appendix D and how the results in Tables 1 and 3 are reported; I am a bit confused about how this was done.
For simplicity, let's assume BCQ and Maze2D are being used. Which of the following correctly describes what was done in this paper:

  1. BCQ is trained on "maze2d-umaze-v1", and the learned model is then used to report results on "maze2d-eval-umaze-v1"? In other words, maze2d-eval-umaze-v1 is not used for training, only for reporting results?

  2. BCQ's hyperparameters are tuned on "maze2d-umaze-v1". BCQ is then trained with those hyperparameters and evaluated on "maze2d-eval-umaze-v1"? In other words, maze2d-eval-umaze-v1 is used for both training and evaluation?

  3. Or any other scenario?
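To make the distinction between the two scenarios concrete, here is a minimal sketch. The `train_bcq` and `evaluate` functions are hypothetical stand-ins, not the repository's actual API; only the dataset names come from the question above.

```python
# Illustrative sketch of the two scenarios in the question.
# train_bcq and evaluate are hypothetical placeholders, NOT the repo's real API.

def train_bcq(dataset_name, hyperparams=None):
    """Stand-in for training BCQ on the named offline dataset."""
    return {"trained_on": dataset_name, "hyperparams": hyperparams}

def evaluate(policy, env_name):
    """Stand-in for rolling out a trained policy on the named environment."""
    return {"policy": policy, "evaluated_on": env_name}

# Scenario 1: train on maze2d-umaze-v1; the eval task is held out and
# used only for reporting results.
policy_1 = train_bcq("maze2d-umaze-v1")
result_1 = evaluate(policy_1, "maze2d-eval-umaze-v1")

# Scenario 2: tune hyperparameters on maze2d-umaze-v1, then train AND
# evaluate on maze2d-eval-umaze-v1.
tuned = {"lr": 1e-3}  # hypothetical tuned values
policy_2 = train_bcq("maze2d-eval-umaze-v1", hyperparams=tuned)
result_2 = evaluate(policy_2, "maze2d-eval-umaze-v1")

print(result_1["policy"]["trained_on"])  # maze2d-umaze-v1
print(result_2["policy"]["trained_on"])  # maze2d-eval-umaze-v1
```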

Thanks for your help.

rasoolfa (Author) commented:
I asked this in the d4rl repo, which I believe is more relevant.
