WIP: RT Search #86

big-c-note · 2020-05-11T03:12:24Z

@fedden, I'd love your help on this branch if you want to jump in at any time. Feel free to restructure as you please. I got the ball rolling on real-time search (RTS) here. Below is a short overview of the immediate and long-term purpose, along with my current list of TODOs.

Overview

Immediate: Needed for testing CFR/core algorithm changes
- Used to better approximate an equilibrium in order to validate CFR
- RTS can extend to the end of the game (no tight time constraints, more accurate EV)
- Only needed at test nodes, so long computation times are not a concern
  - This is why I put the updates in one method, as opposed to updating throughout gameplay
Long term: Needed for real-time play
- Used to refine the equilibrium that CFR only roughly approximates
- RTS will likely need to be truncated at leaf nodes for performance
- Might benefit from updating probabilities as we move through game states, since the computations can be long
  - For this, we will need to average regret instead of building the precomputed strategy (as I have been doing with my average_strategy.py file). That way, we can add to regret in real time. We will then have to calculate the strategy from regret, as we currently do in CFR. This does not matter in the immediate term, as the "Nash bot" will not be updating regret.
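To make the last point concrete, here is a minimal sketch of deriving a strategy from accumulated regret via regret matching, the standard CFR rule. The function names `strategy_from_regret` and `add_regret` and the dict-based layout are assumptions for illustration only, not the code on this branch.

```python
from typing import Dict


def strategy_from_regret(regret: Dict[str, float]) -> Dict[str, float]:
    """Regret matching: play each action in proportion to its positive regret.

    Hypothetical helper for illustration; not the function used in this repo.
    """
    positive = {action: max(r, 0.0) for action, r in regret.items()}
    total = sum(positive.values())
    if total > 0:
        return {action: r / total for action, r in positive.items()}
    # No action has positive regret, so fall back to a uniform strategy.
    n_actions = len(regret)
    return {action: 1.0 / n_actions for action in regret}


def add_regret(regret: Dict[str, float], delta: Dict[str, float]) -> None:
    """Accumulate new regret in place, as a real-time update might do."""
    for action, d in delta.items():
        regret[action] = regret.get(action, 0.0) + d


# Usage: regret stored per info set can keep growing during play,
# and the current strategy is recomputed from it on demand.
regret = {"fold": -1.0, "call": 1.0, "raise": 3.0}
sigma = strategy_from_regret(regret)
```

Storing regret rather than a finished strategy is what allows more iterations to be added later without retraining from scratch.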

TODOs

Gather list of all action sequences DONE
Add function for loading game state DONE
Add method for Bayesian updating of starting hand probabilities DONE
Add method for getting probability of reach for each player DONE
Update testing methodology notebook (some math errors in there)
Debug/validate the two methods above
Add method for "loading" a predetermined flop DONE
Add method for dealing starting hands according to starting hand probabilities DONE
Modularize the state class?? It's getting large with these changes DONE
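For reference, a rough sketch of the idea behind the Bayesian update and the weighted deal listed above. It assumes the likelihood of a hand is the probability that the player's strategy would have taken the observed actions while holding it (the reach probability). The names `bayes_update` and `deal_hand` and the tuple-of-card-strings representation are hypothetical and do not mirror the methods in this PR.

```python
import random
from typing import Dict, Set, Tuple

Hand = Tuple[str, str]  # assumed representation, e.g. ("Ah", "Kh")


def bayes_update(prior: Dict[Hand, float], reach: Dict[Hand, float]) -> Dict[Hand, float]:
    """Posterior over hands: P(hand | actions) is proportional to P(hand) * P(actions | hand)."""
    unnormalized = {hand: p * reach.get(hand, 0.0) for hand, p in prior.items()}
    total = sum(unnormalized.values())
    if total == 0:
        # The observed actions have zero probability under every hand;
        # this sketch simply keeps the prior in that case.
        return dict(prior)
    return {hand: p / total for hand, p in unnormalized.items()}


def deal_hand(posterior: Dict[Hand, float], used_cards: Set[str], rng: random.Random) -> Hand:
    """Sample a hand from the posterior, skipping hands that reuse a dealt card."""
    candidates = [
        (hand, p)
        for hand, p in posterior.items()
        if p > 0 and not (set(hand) & used_cards)
    ]
    if not candidates:
        raise ValueError("no hand is compatible with the cards already dealt")
    hands, weights = zip(*candidates)
    return rng.choices(hands, weights=weights, k=1)[0]


# Usage: update one player's hand distribution, then deal from it.
prior = {("Ah", "Kh"): 0.5, ("Qs", "Js"): 0.5}
posterior = bayes_update(prior, {("Ah", "Kh"): 0.9, ("Qs", "Js"): 0.3})
hand = deal_hand(posterior, used_cards={"Td"}, rng=random.Random(0))
```

Filtering out hands that collide with already dealt cards matters when several players and the board are dealt in sequence from the same deck.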

CLAassistant · 2020-05-11T03:12:29Z

Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

fedden · 2020-05-11T09:18:15Z

Thanks, I'll get stuck into this when I can. I'd advise merging my next PR first so that we can make use of that code in this PR, once this PR is rebased onto the new develop.

fedden · 2020-05-11T09:19:09Z

#87

big-c-note · 2020-05-11T18:28:59Z

> Thanks, I'll get stuck into this when I can. I'd advise merging my next PR first so that we can make use of that code in this PR, once this PR is rebased onto the new develop.

Agreed, will do

big-c-note · 2020-05-11T19:40:58Z

pluribus/games/short_deck/state.py

        self._poker_engine = PokerEngine(
            table=self._table, small_blind=small_blind, big_blind=big_blind
        )
        # Reset the pot, assign betting order to players (might need to remove
        # this), assign blinds to the players.
        self._poker_engine.round_setup()
        # Deal private cards to players.
-        self._table.dealer.deal_private_cards(self._table.players)
+        if not self.real_time_test:


During real-time play, we'd actually want this deal to happen when the new game state is instantiated. For the test, I'm keeping things simple, since we won't need to deal hole cards until the test node is reached.

There is a better way to abstract this, but for now I'm focusing on getting something working.

big-c-note · 2020-05-13T02:56:16Z

The regret output looks reasonable at this point. I'm sure there are still a couple of subtle bugs to work out, but it may already be working. It's really slow at the moment, but I want to validate the math before optimizing. Updating the TODOs with:

Continue validation of Bayes updating/dealing
Create strategy update function that will update strategy counts for the test node only (as we think regret is volatile)
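As a rough sketch of the second item, assuming the strategy counts simply accumulate the current strategy's probability mass at one info set and are normalized at the end. The names `update_strategy_counts` and `average_strategy` and the info set key are hypothetical, and the real update may instead sample an action and increment its count by one.

```python
from collections import defaultdict
from typing import DefaultDict, Dict


def update_strategy_counts(
    counts: DefaultDict[str, float],
    info_set: str,
    test_info_set: str,
    sigma: Dict[str, float],
) -> None:
    """Add the current strategy to the running counts, but only at the test node."""
    if info_set != test_info_set:
        return  # every other info set is left untouched
    for action, prob in sigma.items():
        counts[action] += prob


def average_strategy(counts: Dict[str, float]) -> Dict[str, float]:
    """Normalize the accumulated counts into an average strategy."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no strategy counts have been recorded")
    return {action: c / total for action, c in counts.items()}


# Usage: two iterations reach the test node, one reaches a different node.
counts: DefaultDict[str, float] = defaultdict(float)
update_strategy_counts(counts, "test_node", "test_node", {"call": 0.5, "raise": 0.5})
update_strategy_counts(counts, "other_node", "test_node", {"call": 0.0, "raise": 1.0})
update_strategy_counts(counts, "test_node", "test_node", {"call": 1.0, "raise": 0.0})
averaged = average_strategy(counts)
```

Averaging over iterations smooths out the iteration-to-iteration swings that make the raw regret-derived strategy volatile.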

big-c-note · 2020-05-24T14:54:06Z

@fedden, go ahead and take a look. This is basically finished. There are still two sanity tests that I need to do, plus a notebook update:

Run RTS on a better-trained strategy and use the test method to validate
Run regular CFR against develop to make sure I get the same debug output (i.e., no bad changes to the state class)
Update my notebook; it has some incorrect math in it

By the way, do you know how to fix the failing pytest run? It fails because I am referencing strategies I have not pushed.

Other than that, it might be good to go! I'll train a decent strategy tonight to try this on.

fedden · 2020-05-24T15:02:23Z

Provided the binary size is small, you could force-add this to the repo as part of the test. Or you could compute it in the test (or use a smaller strategy), provided it takes < 1 min.

git add -f test_strategy/unnormalized_output/offline_strategy_100.gz

big-c-note · 2020-05-24T15:04:12Z

> Provided the binary size is small, you could force-add this to the repo as part of the test. Or you could compute it in the test (or use a smaller strategy), provided it takes < 1 min.
>
> git add -f test_strategy/unnormalized_output/offline_strategy_100.gz

Oh, is pytest just picking this up because it has the word "test" in it? I can just change the name! I didn't mean for it to be an actual pytest test.

big-c-note force-pushed the feature/realtime-search-for-tests branch 2 times, most recently from ac11f41 to 4f03233 Compare May 18, 2020 01:40

big-c-note added 22 commits May 19, 2020 15:36

adding probability of reach for each player, probably buggy

4b156dc

adding method for dealing cards, changing real time flag since these are test-specific changes atm

3b2cc02

adding TODOs

e15c405

fix infoset lookup issue

cb91e0f

simple way of dealing with predetermined public cards

9355e2b

fixing broken tests

28e3f58

getting closer to using realtime, but I broke something in the bayes update method

21da4bb

fix sytax error

e326629

work around for awkward deck class behavior

e362924

fixing broken test, still a bug in deal_bayes method

a56ccca

it's doing something

ae6968d

adding normal deck back and testing, seems like the regret is reasonable at least

9fe1775

adding somewhat hacky update strategy

09815cf

trying with better ph estimation

8750ceb

working out a few bugs

1398134

rebasing, confirming RT is running before refactor

e341ead

moving get_game_state to ShortDeckPokerState class

682298d

removing leftover code

bf3cfad

moving agent class to its own file

ebf0a9e

forgot to add in last commit, removing agent class

da97637

removing agent strategy from the state class

a982fff

refactoring useages of the deck class and card class

cd48a73

cleaning up some errors

2bbb7e9

big-c-note force-pushed the feature/realtime-search-for-tests branch from a5dfa50 to 2bbb7e9 Compare May 20, 2020 01:38

big-c-note added 4 commits May 19, 2020 23:58

reorganizing methods, info_set_builder takes args, remove unused attributes

965139a

making unnormalized strategy default, calculate strategy based on temporary regret, only weight temp regret

4f237a9

cleaning up code and testing a few configs

e68efeb

beginnings of a test script

0cd1bba

big-c-note requested a review from fedden May 24, 2020 14:50

big-c-note added 15 commits May 25, 2020 10:11

updating offline_strategy on each dump int, wrapping test method into function, fix broken test

f285442

sample out put of state class producing same results as develop branch after changes

203fb22

removing sample files

715c4f8

testing many different RTS configs

90ae9c7

bug fix for dealing bayes hole cards before next community card

5cdd94f

fixing random strategy bug, adding entry script for rng game nodes and testing RTS

a097810

fixing pytest build fail

e82eb41

updating to decent defaults

7b4b777

regression-style test for loading game history

4d70d20

using smaller lookup table for pytest

8d48cf3

regresion-style test for loading public info, won't work until I add a file for this, it is also lengthy at the moment

8e05533

reorganizing some

cc73933

removing old files

1eb9a85

shortening tests and test data, will make private data for fuller regressions

8aa9c09

refactor and found a new bug in the tests

c303e70
