Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove necessity to keep explicit default policy for MCCFR solvers #154

Merged

Conversation

inejc
Copy link
Contributor

@inejc inejc commented Feb 27, 2020

These changes are based on discussions in #149. The idea is to remove the need to keep an explicit default policy within MCCFR solvers and thus make them feasible to run on large games where memory constraints are tighter.

The proposed changes allow for usage of:

  • TabularPolicy which is the default behavior and the current state
  • UniformPolicy for cases where the inferred average CFR policy will only ever be queried with a State instance
  • nullptr which will enable to also query the inferred policy with an info state string but move the handling of info state lookup fails to the external caller (empty policy would be returned in that case)

@googlebot
Copy link

Thanks for your pull request. It looks like this may be your first contribution to a Google open source project (if not, look below for help). Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

📝 Please visit https://cla.developers.google.com/ to sign.

Once you've signed (or fixed any issues), please reply here with @googlebot I signed it! and we'll verify it.


What to do if you already signed the CLA

Individual signers
Corporate signers

ℹ️ Googlers: Go here for more info.

@inejc
Copy link
Contributor Author

inejc commented Feb 27, 2020

@googlebot I signed it!

@googlebot
Copy link

CLAs look good, thanks!

ℹ️ Googlers: Go here for more info.

@lanctot lanctot self-assigned this Feb 27, 2020
@lanctot
Copy link
Collaborator

lanctot commented Feb 27, 2020

This is great, thanks!

I'm making a few (mostly) cosmetic changes. Turns out we don't have to store a tabular policy after all (we have a UniformPolicy object and the best response code does state-based queries already.

Should get merged in Monday's update. (Please don't close the PR.)

@lanctot
Copy link
Collaborator

lanctot commented Feb 27, 2020

Nevermind, I spoke too soon. They do indeed need to be tabular policies due to assumptions in the best response code (which we should probably tweak eventually...)

@inejc
Copy link
Contributor Author

inejc commented Feb 27, 2020

Nevermind, I spoke too soon. They do indeed need to be tabular policies due to assumptions in the best response code (which we should probably tweak eventually...)

I see... I suppose this isn't a blocker for the proposed changes?

One minor thing: we should probably rename uniform_policy_ in solvers to default_policy_ since this could, in theory, be something different to uniform now. Please LMK whether I can add commits to this PR or you want to potentially change that later?

@lanctot
Copy link
Collaborator

lanctot commented Feb 27, 2020

I see... I suppose this isn't a blocker for the proposed changes?

Correct! It's already under internal review, which should be easy to get done today, so will almost surely get merged in Monday's update.

One minor thing: we should probably rename uniform_policy_ in solvers to default_policy_ since this could, in theory, be something different to uniform now. Please LMK whether I can add commits to this PR or you want to potentially change that later?

Yep, that was one of the changes I made :) (renaming to default_policy_) Please don't add commits at this point because I've already imported the PR and the changes would clash with mine.

@OpenSpiel OpenSpiel merged commit 36ba1b1 into google-deepmind:master Mar 2, 2020
@inejc inejc deleted the inejc/fix-explicit-cfr-policy branch March 2, 2020 15:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants