
Clarification about Training and Evaluation Steps #13

Closed
rfali opened this issue Nov 9, 2022 · 1 comment
rfali commented Nov 9, 2022

Hi. I am a little confused about how many steps the agent should be trained on and then evaluated on. To add a little context, the paper mentions on page 2:

Crafter evaluates many different abilities of an agent by training only on a single environment for 5M steps

and this can be seen in the crafter_baselines code as well (e.g., PPO, Rainbow).

But Sec. 3.3 of the paper says:

An agent is granted a budget of 1M environment steps to interact with the environment.

and elsewhere (page 6, Sec. 4.1) as well as in multiple figures:

budget of 1M environment steps

And in Table A.1:

It is computed across all training episodes within the budget of 1M environment steps

I also see that you commented out the evaluation code in the Rainbow baseline (here).

What I can make of this is that I need to run the crafter agent for 1M steps (I saw the PPO example) and then use the saved stats (JSON file?) and the analysis code to calculate the success rates and score. Precisely: using the existing crafter code, how can I go from training to plotting meaningful results? Can you please clarify? Thanks.


danijar commented Nov 10, 2022

Thanks for spotting the typo in the paper. It's always 1M steps.

I ran the methods for 5M steps but only used the first 1M steps, because afterwards the training curves started to flatten out and would require many more steps for relatively small improvements.

If you use the Recorder class to wrap the environment, it will write a JSONL file from which you can plot results and generate tables using the scripts in the analysis directory. If you have more concrete questions about those, just let me know (in a new ticket).
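To illustrate the workflow described above, here is a minimal sketch of how one could compute per-achievement success rates and an aggregate score from the Recorder's stats file. It assumes the stats file is JSONL with one object per episode whose achievement counts live under `achievement_*` keys, and it uses the score formula from the paper, S = exp(mean(ln(1 + s_i))) − 1 with success rates s_i in percent; the sample episode records below are made up for demonstration, and the official analysis scripts in the repo remain the authoritative implementation.

```python
import json
import math
import tempfile

def crafter_score(success_rates):
    """Aggregate score from the Crafter paper:
    S = exp(mean(ln(1 + s_i))) - 1, with s_i in percent."""
    logs = [math.log(1 + s) for s in success_rates.values()]
    return math.exp(sum(logs) / len(logs)) - 1

def load_success_rates(path):
    """Read a stats JSONL file (one JSON object per episode) and return
    the percentage of episodes that unlocked each achievement."""
    with open(path) as f:
        episodes = [json.loads(line) for line in f if line.strip()]
    keys = sorted(k for k in episodes[0] if k.startswith('achievement_'))
    return {
        k: 100.0 * sum(1 for ep in episodes if ep.get(k, 0) > 0) / len(episodes)
        for k in keys
    }

# Hypothetical sample records, mimicking what the Recorder might write.
sample = [
    {'length': 120, 'reward': 2.1,
     'achievement_collect_wood': 3, 'achievement_eat_cow': 0},
    {'length': 300, 'reward': 5.0,
     'achievement_collect_wood': 1, 'achievement_eat_cow': 0},
]
with tempfile.NamedTemporaryFile('w', suffix='.jsonl', delete=False) as f:
    for ep in sample:
        f.write(json.dumps(ep) + '\n')
    path = f.name

rates = load_success_rates(path)
score = crafter_score(rates)
```

With the sample above, `collect_wood` succeeds in both episodes (100%) and `eat_cow` in none (0%), so the score reduces to sqrt(101) − 1.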

danijar closed this as completed Nov 10, 2022