Remove necessity to keep explicit default policy for MCCFR solvers #154
Thanks for your pull request. It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). 📝 Please visit https://cla.developers.google.com/ to sign. Once you've signed (or fixed any issues), please reply here.
@googlebot I signed it!
CLAs look good, thanks!
This is great, thanks! I'm making a few (mostly) cosmetic changes. Turns out we don't have to store a tabular policy after all (we have a UniformPolicy object, and the best response code already does state-based queries). It should get merged in Monday's update. (Please don't close the PR.)
Never mind, I spoke too soon. They do indeed need to be tabular policies due to assumptions in the best response code (which we should probably tweak eventually...)
I see... I suppose this isn't a blocker for the proposed changes? One minor thing: we should probably rename
Correct! It's already under internal review, which should be easy to get done today, so will almost surely get merged in Monday's update.
Yep, that was one of the changes I made :) (renaming to default_policy_). Please don't add commits at this point: I've already imported the PR, and further changes would clash with mine.
These changes are based on discussions in #149. The idea is to remove the need to keep an explicit default policy within MCCFR solvers and thus make them feasible to run on large games where memory constraints are tighter.
The proposed changes allow the default policy to be any of the following:
TabularPolicy, which is the default behavior and the current state.
UniformPolicy, for cases where the inferred average CFR policy will only ever be queried with a State instance.
nullptr, which also makes it possible to query the inferred policy with an info state string, but moves the handling of failed info state lookups to the external caller (an empty policy is returned in that case).
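To make the fallback options concrete, here is a minimal, self-contained sketch of how an average policy could behave with a uniform default versus no default at all. It is not the code in this PR: every type and method name below (SketchState, SketchDefaultPolicy, SketchUniformPolicy, SketchAveragePolicy, PolicyForState, PolicyForInfoState) is a hypothetical stand-in rather than an OpenSpiel class, and the tabular option is left out because it is just an explicit table with one entry per info state.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-ins for illustration only; these are NOT OpenSpiel types.
using Action = int;
using ActionsAndProbs = std::vector<std::pair<Action, double>>;

struct SketchState {
  std::string info_state;             // key used by the learned table
  std::vector<Action> legal_actions;  // only known when a full state is given
};

// A default policy needs a full state, because it must know the legal actions.
class SketchDefaultPolicy {
 public:
  virtual ~SketchDefaultPolicy() = default;
  virtual ActionsAndProbs PolicyForState(const SketchState& state) const = 0;
};

// Uniform fallback: no per-info-state storage, so memory use stays constant.
class SketchUniformPolicy : public SketchDefaultPolicy {
 public:
  ActionsAndProbs PolicyForState(const SketchState& state) const override {
    // Assumption: a decision node always has at least one legal action.
    assert(!state.legal_actions.empty());
    const double prob = 1.0 / static_cast<double>(state.legal_actions.size());
    ActionsAndProbs result;
    for (Action action : state.legal_actions) result.emplace_back(action, prob);
    return result;
  }
};

// Average policy backed by the solver's learned table plus an optional default.
class SketchAveragePolicy {
 public:
  SketchAveragePolicy(std::unordered_map<std::string, ActionsAndProbs> learned,
                      std::shared_ptr<SketchDefaultPolicy> default_policy)
      : learned_(std::move(learned)),
        default_policy_(std::move(default_policy)) {}

  // State-based query: learned entry, else the default policy if one is set,
  // else an empty policy.
  ActionsAndProbs PolicyForState(const SketchState& state) const {
    auto it = learned_.find(state.info_state);
    if (it != learned_.end()) return it->second;
    if (default_policy_ == nullptr) return {};
    return default_policy_->PolicyForState(state);
  }

  // String-based query: there are no legal actions to fall back on, so a
  // failed lookup returns an empty policy and the caller must handle it.
  ActionsAndProbs PolicyForInfoState(const std::string& info_state) const {
    auto it = learned_.find(info_state);
    if (it != learned_.end()) return it->second;
    return {};
  }

 private:
  std::unordered_map<std::string, ActionsAndProbs> learned_;
  std::shared_ptr<SketchDefaultPolicy> default_policy_;
};
```

In this sketch, passing std::make_shared<SketchUniformPolicy>() as the default gives every unseen state a uniform distribution over its legal actions, while passing nullptr makes both query styles return an empty policy for unseen info states, which the caller then has to check for.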