
Do both BC and DT fit the training data well? #2

Closed
w-hc opened this issue Jun 4, 2021 · 1 comment

Comments


w-hc commented Jun 4, 2021

Hi, thanks for the interesting work!
A question: how well do Behavior Cloning (BC) and Decision Transformer (DT) fit the training data, especially when the data comes from a mixture of policies, as in the replay or medium + expert datasets? This doesn't seem to be reported in the paper. Do they fit the data (roughly) equally well?

Owner

kzl commented Jun 5, 2021

Thanks for the question! I've attached some of the L2 losses for both. In short, Decision Transformer fits the training data better across all datasets (owing to a combination of return conditioning and longer context length).

[image: attached L2 losses for Behavior Cloning and Decision Transformer]
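
For readers who want to run a similar comparison themselves, here is a minimal sketch of how an L2 (mean squared error) fit between predicted and dataset actions could be computed. It is not the repository's evaluation code, which is not shown in this thread. The function names `mean_l2_loss` and `compare_fit`, the list-of-vectors data layout, and the optional padding mask are all assumptions made for illustration.

```python
from typing import Dict, List, Optional, Sequence

Action = Sequence[float]


def mean_l2_loss(
    pred_actions: Sequence[Action],
    target_actions: Sequence[Action],
    mask: Optional[Sequence[bool]] = None,
) -> float:
    """Mean squared error between predicted and dataset actions.

    Each timestep's squared error is averaged over action dimensions,
    then averaged over the timesteps kept by `mask` (hypothetical
    padding mask; all timesteps are kept when it is None).
    """
    if len(pred_actions) != len(target_actions):
        raise ValueError("prediction and target lengths differ")
    if mask is None:
        mask = [True] * len(pred_actions)
    if len(mask) != len(pred_actions):
        raise ValueError("mask length does not match the number of timesteps")

    total, count = 0.0, 0
    for pred, target, keep in zip(pred_actions, target_actions, mask):
        if not keep:
            continue  # skip padded timesteps
        if len(pred) != len(target) or len(pred) == 0:
            raise ValueError("action vectors must be non-empty and equal in size")
        total += sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)
        count += 1

    if count == 0:
        raise ValueError("no valid timesteps to average over")
    return total / count


def compare_fit(
    predictions_by_model: Dict[str, Sequence[Action]],
    target_actions: Sequence[Action],
    mask: Optional[Sequence[bool]] = None,
) -> Dict[str, float]:
    """Return the training-set L2 loss for each named model."""
    return {
        name: mean_l2_loss(pred, target_actions, mask)
        for name, pred in predictions_by_model.items()
    }
```

Calling `compare_fit({"BC": bc_preds, "DT": dt_preds}, dataset_actions)` on the same held-in trajectories would give one number per model; a lower value means a closer fit to the training actions. The variable names in that call are placeholders, not identifiers from the codebase.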
