
Is this a bug in runner.py? #14

Closed
Ericonaldo opened this issue Aug 14, 2018 · 9 comments

Comments

@Ericonaldo

Thank you for the great code. When I tried new maps, I found a possible problem in runner.py. When there is more than one env, one env may finish (done) before the others and then restart its game. By the time all envs are done, the accumulated rewards contain many episodes, which gives a much bigger number. If you understand what I am talking about, please tell me whether this is a problem.
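To illustrate what I mean, here is a minimal sketch (the names and numbers are made up for illustration, not the actual runner.py code): with two envs where one finishes first and auto-resets, averaging the per-env cumulative rewards only once all envs are done inflates the "episode" score.

```python
import numpy as np

# Hypothetical sketch: env 0 plays 3-step episodes, env 1 plays 6-step
# episodes, both earning reward 1 per step. Cumulative rewards are only
# averaged once *all* envs have reported done at least once, so env 0's
# total already contains two episodes by then.
def simulate(episode_lengths, total_steps, step_reward=1.0):
    n_envs = len(episode_lengths)
    cum_reward = np.zeros(n_envs)            # accumulated since the start
    steps_in_episode = np.zeros(n_envs, dtype=int)
    done_once = np.zeros(n_envs, dtype=bool)
    for _ in range(total_steps):
        cum_reward += step_reward
        steps_in_episode += 1
        for i in range(n_envs):
            if steps_in_episode[i] == episode_lengths[i]:
                done_once[i] = True
                steps_in_episode[i] = 0      # env auto-resets; reward keeps accumulating
        if done_once.all():
            return float(cum_reward.mean())  # averaged only now -> inflated
    return None

# By step 6 (when env 1 finishes), env 0 has played two full episodes,
# so the reported average is (6 + 6) / 2 = 6 instead of (3 + 6) / 2 = 4.5.
print(simulate([3, 6], total_steps=10))  # -> 6.0
```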

@Ericonaldo
Author

This leads to another problem. Following the code, the model is trained every n_steps, say 12, so a single training batch may contain two episodes; then some of the values calculated by the Bellman equation will be wrong.

@inoryy
Owner

inoryy commented Aug 14, 2018

No bug; episode end is accounted for inside the agent during the returns calculation.
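For readers landing here: a hedged sketch of what "episode end is accounted for during the returns calculation" typically means in an A2C-style agent (illustrative, not the repository's exact code) — the bootstrapped return is masked to zero wherever a done flag marks an episode boundary, so the Bellman backup never crosses episodes even if a batch contains two of them.

```python
import numpy as np

# Compute n-step discounted returns backwards through a batch, zeroing the
# carried-over return wherever dones[t] == 1 so no value leaks across an
# episode boundary. Names are illustrative.
def n_step_returns(rewards, dones, bootstrap_value, gamma=0.99):
    """rewards, dones: arrays of shape [n_steps]; bootstrap_value: V(s_{t+n})."""
    returns = np.zeros_like(rewards, dtype=float)
    running = bootstrap_value
    for t in reversed(range(len(rewards))):
        # (1 - dones[t]) cuts the recursion at episode end
        running = rewards[t] + gamma * running * (1.0 - dones[t])
        returns[t] = running
    return returns

# Episode ends at t=1: the returns at t=0 and t=1 include nothing from the
# next episode or from the bootstrap value.
r = np.array([1.0, 1.0, 1.0])
d = np.array([0.0, 1.0, 0.0])
print(n_step_returns(r, d, bootstrap_value=10.0, gamma=0.9))  # -> [1.9, 1.0, 10.0]
```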

@inoryy inoryy closed this as completed Aug 14, 2018
@Ericonaldo
Author

Oh, thanks a lot, but I still think there is an error in the reward calculation...

@inoryy
Owner

inoryy commented Aug 15, 2018

If you're looking at the console logs, then they're calculated here. Notice that rewards are averaged and only displayed after all envs report back the done flag.

@inoryy inoryy reopened this Aug 15, 2018
@Ericonaldo
Author

Yeah, I have seen that code. :) What I mean is that the average score may include more than one episode from a single env, because that env has to wait for the others to report done. So the result is unlikely to represent the average reward of one episode, which is what I thought you'd like to record.
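One possible fix, sketched with made-up names (not a patch against the actual runner.py): record each episode's score at the moment its own env reports done and reset only that env's accumulator, then average over completed episodes.

```python
import numpy as np

# Per-episode score tracking for a vectorized env loop (illustrative sketch).
class EpisodeTracker:
    def __init__(self, n_envs):
        self.cum_reward = np.zeros(n_envs)
        self.completed = []                 # one entry per finished episode

    def step(self, rewards, dones):
        self.cum_reward += rewards
        for i, done in enumerate(dones):
            if done:
                self.completed.append(self.cum_reward[i])
                self.cum_reward[i] = 0.0    # reset only this env's accumulator

    def mean_episode_reward(self):
        return float(np.mean(self.completed)) if self.completed else None

tracker = EpisodeTracker(n_envs=2)
# Env 0 finishes a 3-reward episode at step 3; env 1 a 6-reward episode at step 6.
for t in range(6):
    tracker.step(rewards=np.array([1.0, 1.0]),
                 dones=np.array([t == 2, t == 5]))
print(tracker.mean_episode_reward())  # -> 4.5, i.e. (3 + 6) / 2
```

This way the reported number is a true per-episode average, regardless of how out of sync the envs are.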

@inoryy
Owner

inoryy commented Aug 16, 2018

Okay, I can see it now. I can confirm it's a bug, good catch! This most likely doesn't affect non-adversarial minigames, but it might well explain the high variance on others like DefeatZerglingsAndBanelings.

As I mentioned in #7, I'm currently re-writing the project essentially from scratch, so I don't think I'll have time to fix it in the legacy codebase, but I'll be sure to keep this bug in mind. I plan to publish the rewrite by the end of August.

@Ericonaldo
Author

Great! Thanks a lot. Looking forward to your next release!

@inoryy
Owner

inoryy commented Aug 16, 2018

Let's keep the ticket open until the next release so that others are informed as well.

@inoryy inoryy reopened this Aug 16, 2018
@inoryy inoryy closed this as completed Nov 25, 2018
@inoryy
Owner

inoryy commented Nov 25, 2018

Fixed!
