
Inclusion of baseline results #48

Closed
sytelus opened this issue Nov 1, 2019 · 3 comments · Fixed by #65

Comments


sytelus commented Nov 1, 2019

There should be a way to see the results one should expect when running the training from scratch. At a minimum, there should be information on the number of training steps and the eventual 100-episode average reward one might expect from the baseline, but it would be much better to show the entire training curve. Without this, a baseline is not very meaningful, as one can never know whether they have actually replicated the expected result.

A few good RL baseline frameworks do this; for example, here is how other frameworks display their results: Garage, RLlib, Coach. I love the UX that Garage provides, as well as Coach's approach of making the results part of the repo itself.

Currently, there is a benchmark.zip file in the repo, but it seems monitor.csv and progress.csv are not helpful (for example, for DQN, progress.csv is empty and monitor.csv only has the last few rows). Furthermore, these files are not produced at all if you run the experiment yourself.

araffin (Owner) commented Nov 1, 2019

Hello,

> At a minimum, there should be information on a number of training steps

You have that in the hyperparameters file and the config file associated with each trained agent (at least starting with release 1.0 of the zoo). The final performance can be found in benchmark.md; note that the results correspond to only one seed (it is not meant for quantitative comparison).

Yes, a training curve would be a good addition; even better would be a learning curve from evaluating on a test env periodically (this is planned to be supported with the callback collection), but you would need at least 10 runs per algorithm per environment.
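For illustration, here is a rough sketch of that kind of periodic evaluation using the stable-baselines 2.x function-style callback; the env, evaluation frequency, and episode count below are placeholders, not the zoo's settings:

```python
# Sketch of periodic evaluation on a separate test env (not the zoo's code).
import numpy as np
import gym
from stable_baselines import DQN

EVAL_FREQ = 10_000      # evaluate every N training steps (assumed value)
N_EVAL_EPISODES = 5     # episodes per evaluation (assumed value)

# Separate test env, never used for training (CartPole as a lightweight stand-in).
eval_env = gym.make("CartPole-v1")

def evaluate(model, env, n_episodes):
    """Run n_episodes with the current policy and return the mean episode return."""
    returns = []
    for _ in range(n_episodes):
        obs, done, total = env.reset(), False, 0.0
        while not done:
            action, _ = model.predict(obs, deterministic=True)
            obs, reward, done, _ = env.step(action)
            total += reward
        returns.append(total)
    return np.mean(returns)

def eval_callback(locals_, globals_):
    """Called by model.learn(); returning True keeps training going."""
    self_ = locals_["self"]
    if self_.num_timesteps % EVAL_FREQ == 0:
        mean_return = evaluate(self_, eval_env, N_EVAL_EPISODES)
        print("step={} eval_mean_return={:.1f}".format(self_.num_timesteps, mean_return))
    return True

model = DQN("MlpPolicy", "CartPole-v1", verbose=0)
model.learn(total_timesteps=100_000, callback=eval_callback)
```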

> it seems monitor.csv

monitor.csv can give you the training learning curve, which is only a proxy for the real performance.
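For example, a rough sketch of turning the monitor.csv files into the 100-episode rolling average discussed above, using the helpers in stable_baselines.results_plotter (the log folder path is a placeholder):

```python
# Sketch: compute the rolling 100-episode mean return from *.monitor.csv files.
import numpy as np
import matplotlib.pyplot as plt
from stable_baselines.results_plotter import load_results, ts2xy

log_dir = "logs/dqn/BreakoutNoFrameskip-v4_1"  # hypothetical log folder

# load_results() reads all *.monitor.csv files in the folder; ts2xy() gives
# (timesteps, per-episode returns).
timesteps, episode_returns = ts2xy(load_results(log_dir), "timesteps")

# Rolling mean over the last 100 episodes (the usual Atari reporting metric).
window = 100
rolling = np.convolve(episode_returns, np.ones(window) / window, mode="valid")

plt.plot(timesteps[window - 1:], rolling)
plt.xlabel("Timesteps")
plt.ylabel("Mean return (last 100 episodes)")
plt.title("Training curve from monitor.csv")
plt.show()
```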

> Furthermore, these files are not produced at all currently if you run the experiment.

If you don't specify a log folder, nothing is produced, yes.

sytelus (Author) commented Nov 2, 2019

I think my comment was probably misunderstood. I'm currently trying to train a model for Breakout and reproduce the results. There is nothing in this repo that tells me what I should expect or how I would know that the training was successful. As it happens, something is possibly broken in OpenAI Baselines as well as stable-baselines, so the training for Breakout isn't producing graphs that are convincingly converging.

sytelus (Author) commented Nov 2, 2019

Also, it looks like in the current codebase there is no call to logger.configure() at all when running training.py. This possibly explains why no monitor.csv and progress.csv are generated even when a log directory is specified.
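For reference, a minimal sketch of the two pieces that would produce those files in stable-baselines 2.x: logger.configure() for progress.csv and the Monitor wrapper for monitor.csv. The folder name and env are placeholders, and this is not the zoo's actual train script:

```python
# Sketch of how progress.csv and monitor.csv get written in stable-baselines 2.x.
import os
import gym
from stable_baselines import DQN, logger
from stable_baselines.bench import Monitor

log_dir = "logs/dqn/BreakoutNoFrameskip-v4_1"  # hypothetical folder
os.makedirs(log_dir, exist_ok=True)

# progress.csv (per-update training statistics) comes from the logger.
logger.configure(folder=log_dir, format_strs=["stdout", "csv"])

# monitor.csv (per-episode reward/length) comes from the Monitor wrapper.
# CartPole is a lightweight stand-in for an Atari env here.
env = Monitor(gym.make("CartPole-v1"), os.path.join(log_dir, "0"))

model = DQN("MlpPolicy", env, verbose=1)
model.learn(total_timesteps=50_000)
```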
