This repository has been archived by the owner on Jul 16, 2024. It is now read-only.

WIP: RT Search #86

Open
wants to merge 42 commits into
base: develop

Conversation

big-c-note
Collaborator

@big-c-note big-c-note commented May 11, 2020

@fedden, I would love your help on this branch if you want to jump on at any time. Feel free to restructure as you please. I got the ball rolling for real time search (RTS) here. I'll give a short overview of the immediate and long-term purpose, as well as my current list of TODOs.

Overview

  • Immediate: Needed for testing CFR/core algorithm changes

    • Used for better approximating an equilibrium to validate CFR
    • RTS can extend to the end of the game (no major time constraints, more accurate EV)
    • Only needed at test nodes, so no need to worry about long computation time
      • This is why I put the updates in one method, as opposed to updating throughout gameplay
  • Longterm: Needed for real time play

    • Used for better approximating a (roughly approximated by CFR) equilibrium
    • RTS likely needs to be truncated at leaf nodes for performance
    • Might benefit from updating probabilities as we move through game states due to potentially long computations
      • For this, we will need to store and accumulate regret rather than only build the precomputed strategy (as I have been doing with my average_strategy.py file). That way, we can add to regret in real time, computing the strategy from regret as we currently do in CFR. This does not matter in the immediate term, as the "Nash bot" will not be updating regret.
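
As a rough illustration of why keeping regret around matters for real time updates, here is a minimal regret matching sketch. It is not code from this branch: the function names and the action labels are hypothetical, and it assumes regret is stored per action at a single information set.

```python
from typing import Dict


def regret_matching(regret: Dict[str, float]) -> Dict[str, float]:
    """Derive a strategy from cumulative regret (standard regret matching)."""
    # Only positive regret contributes to the strategy.
    positive = {action: max(r, 0.0) for action, r in regret.items()}
    total = sum(positive.values())
    if total > 0:
        return {action: r / total for action, r in positive.items()}
    # No positive regret anywhere: fall back to a uniform strategy.
    return {action: 1.0 / len(regret) for action in regret}


def add_regret(regret: Dict[str, float], update: Dict[str, float]) -> None:
    """Accumulate new regret in place, e.g. from one more search iteration."""
    for action, delta in update.items():
        regret[action] = regret.get(action, 0.0) + delta


# Hypothetical regret values for one information set.
regret = {"fold": -2.0, "call": 1.0, "raise": 3.0}
strategy = regret_matching(regret)

# Because regret is kept, it can be extended later and the strategy recomputed.
add_regret(regret, {"call": 4.0})
updated = regret_matching(regret)
```

If only the normalized strategy were stored, there would be nothing to add new regret to, which is the limitation the bullet above describes.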

TODOs

  • Gather list of all action sequences DONE
  • Add function for loading game state DONE
  • Add method for Bayesian updating of starting hand probabilities DONE
  • Add method for getting probability of reach for each player DONE
  • Update testing methodology notebook (some math errors in there)
  • Debug/validate the two methods above
  • Add method for "loading" a predetermined flop DONE
  • Add method for dealing starting hands according to starting hand probabilities DONE
  • Modularize the state class?? It's getting large with these changes DONE
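
To make the Bayesian updating, reach probability, and weighted dealing items above concrete, here is a small sketch. It is an assumption-laden toy, not the implementation in this branch: the function names, the three-hand range, and the action probabilities are all made up for illustration, and it ignores card removal effects between players and the board.

```python
import math
import random
from typing import Dict, Sequence


def bayes_update(prior: Dict[str, float], likelihood: Dict[str, float]) -> Dict[str, float]:
    """Posterior over hands: P(hand | action) is proportional to P(hand) * P(action | hand)."""
    unnormalized = {hand: p * likelihood.get(hand, 0.0) for hand, p in prior.items()}
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("observed action has zero probability under every hand")
    return {hand: weight / total for hand, weight in unnormalized.items()}


def reach_probability(action_probs: Sequence[float]) -> float:
    """Probability that a player takes every action in an observed sequence."""
    return math.prod(action_probs)


def deal_hand(hand_probs: Dict[str, float], rng: random.Random) -> str:
    """Sample one starting hand according to the given probabilities."""
    hands = list(hand_probs)
    weights = [hand_probs[hand] for hand in hands]
    return rng.choices(hands, weights=weights, k=1)[0]


# Hypothetical prior over a tiny range and P(raise | hand) from a strategy.
prior = {"AA": 0.25, "KK": 0.25, "72o": 0.5}
raise_prob = {"AA": 0.8, "KK": 0.6, "72o": 0.1}

posterior = bayes_update(prior, raise_prob)   # weight shifts toward AA and KK
reach = reach_probability([0.8, 0.5])         # e.g. raise, then call
hand = deal_hand(posterior, random.Random(0))
```

In this toy range, observing a raise moves probability away from the weak hand, and hands are then dealt in proportion to the updated probabilities.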

@CLAassistant

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.

@fedden
Owner

fedden commented May 11, 2020

Thanks, I'll get stuck into this when I can. I'd advise merging my next PR in first, so that we can make use of that code in this PR once this PR is rebased onto the new develop.

@fedden
Owner

fedden commented May 11, 2020

#87

@big-c-note
Collaborator Author

big-c-note commented May 11, 2020

> Thanks, I'll get stuck into this when I can. I'd advise merging my next PR in first, so that we can make use of that code in this PR once this PR is rebased onto the new develop.

Agreed, will do

self._poker_engine = PokerEngine(
    table=self._table, small_blind=small_blind, big_blind=big_blind
)
# Reset the pot, assign betting order to players (might need to remove
# this), assign blinds to the players.
self._poker_engine.round_setup()
# Deal private cards to players.
self._table.dealer.deal_private_cards(self._table.players)
if not self.real_time_test:
Collaborator Author

During real time play, we'd actually want this deal to happen when the new game state is instantiated. For the test, I'm just keeping things simple, since we won't need to deal hole cards until the test node is reached.

There is a better way to abstract this, but for now I'm just focusing on getting something working.

@big-c-note
Collaborator Author

It looks like the regret output is reasonable at this point. I'm sure there are still a couple of subtle bugs to work out, but it might be working. It is really slow at the moment, but I want to validate the math before optimizing. Updating TODOs with:

  • Continue validation of bayes updating/dealing
  • Create strategy update function that will update strategy counts for the test node only (as we think regret is volatile)

@big-c-note big-c-note force-pushed the feature/realtime-search-for-tests branch 2 times, most recently from ac11f41 to 4f03233 Compare May 18, 2020 01:40
@big-c-note big-c-note force-pushed the feature/realtime-search-for-tests branch from a5dfa50 to 2bbb7e9 Compare May 20, 2020 01:38
@big-c-note big-c-note requested a review from fedden May 24, 2020 14:50
@big-c-note
Collaborator Author

big-c-note commented May 24, 2020

@fedden, go ahead and take a look. This is basically finished. A few things remain, including two sanity tests that I still need to do:

  • Btw, do you know how to fix the failing pytest run? It fails because I am referencing strategies I have not pushed.
  • Run RTS on a better trained strategy and use the test method to validate
  • Run regular CFR against develop to make sure I get the same debug output (i.e., no bad changes to the state class)
  • Update my notebook, it's got some incorrect math in it

Other than that, it might be good to go! I'll train a decent strategy tonight to try this on.

@fedden
Owner

fedden commented May 24, 2020

Provided the binary size is small, you could force-add this to the repo as part of the test. Or you could compute it in the test (or compute a smaller strategy), provided it takes < 1 min.

git add -f test_strategy/unnormalized_output/offline_strategy_100.gz
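
The "compute it in the test" option could look roughly like the sketch below. Everything here is hypothetical: `load_or_build_strategy`, `build_tiny_strategy`, and the gzip-compressed JSON format are stand-ins chosen so the example is self-contained, and they are not the repo's actual strategy format or training code.

```python
import gzip
import json
import tempfile
from pathlib import Path
from typing import Callable, Dict

Strategy = Dict[str, Dict[str, float]]


def load_or_build_strategy(path: Path, build: Callable[[], Strategy]) -> Strategy:
    """Load a cached strategy if present, otherwise build a small one and cache it."""
    if path.exists():
        with gzip.open(path, "rt", encoding="utf-8") as f:
            return json.load(f)
    strategy = build()
    path.parent.mkdir(parents=True, exist_ok=True)
    with gzip.open(path, "wt", encoding="utf-8") as f:
        json.dump(strategy, f)
    return strategy


def build_tiny_strategy() -> Strategy:
    """Stand-in for a short training run that finishes well under a minute."""
    return {"preflop:AA": {"raise": 1.0}}


# A test could point this at a temporary directory instead of a committed file.
with tempfile.TemporaryDirectory() as tmp:
    strategy_path = Path(tmp) / "tiny_strategy.gz"
    first = load_or_build_strategy(strategy_path, build_tiny_strategy)   # builds
    second = load_or_build_strategy(strategy_path, build_tiny_strategy)  # loads
```

With this shape, the test no longer depends on a strategy file that was never pushed, at the cost of a short build step on the first run.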

@big-c-note
Collaborator Author

> Provided the binary size is small, you could force-add this to the repo as part of the test. Or you could compute it in the test (or compute a smaller strategy), provided it takes < 1 min.
>
> git add -f test_strategy/unnormalized_output/offline_strategy_100.gz

Oh, is pytest just picking this up because it has the word "test" in it? I can just change the name! I didn't mean for it to be an actual pytest test.

3 participants