Question about the normalization of results and the benchmarked CQL performance in Table 2 (ICLR submission) #72

zhaoyi11 · 2020-12-27T15:42:27Z

Hi,

Thanks so much for your work. I have a question about how the results are normalized. For example, in the Gym domain, each result is normalized against the expert policy (SAC) and the random policy. Which reference numbers should we use? The Wiki page "Off policy evaluation" has a table that lists expert-policy and random-policy scores; should we refer to these? Also, the expert-policy results there differ from the SAC results in Table 3 (ICLR), so which one should we use?

I also noticed that in Tables 2 and 3 (ICLR), the CQL results for 'hopper-medium' do not seem to match. Could you please confirm this (and perhaps CQL on 'walker2d-medium' as well)?

Thanks.

zhihanyang2022 · 2021-01-05T02:05:29Z

We can find the random and expert scores here: https://github.com/rail-berkeley/d4rl/blob/master/d4rl/infos.py
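For anyone landing here later, this is a minimal sketch of how a raw return is typically mapped to a normalized score from those two reference values, assuming the usual convention that the random policy maps to 0 and the expert policy maps to 100. The function name `normalized_score` and the reference numbers below are placeholders for illustration only; they are not copied from `infos.py`, so substitute the per-environment values from that file.

```python
def normalized_score(raw_score, random_score, expert_score):
    """Rescale a raw return so that random -> 0 and expert -> 100."""
    if expert_score == random_score:
        raise ValueError("expert and random reference scores must differ")
    return 100.0 * (raw_score - random_score) / (expert_score - random_score)


# Placeholder reference values (hypothetical, NOT taken from infos.py).
RANDOM_SCORE = 0.0
EXPERT_SCORE = 2000.0

# A raw return halfway between the two references normalizes to 50.
print(normalized_score(1000.0, RANDOM_SCORE, EXPERT_SCORE))
```

Note that a policy scoring below the random reference gets a negative normalized score, and one scoring above the expert reference exceeds 100.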

zhaoyi11 · 2021-01-05T09:25:14Z

Thanks a lot! I will close this issue.

zhaoyi11 closed this as completed Jan 5, 2021
