## The basics
We will use OpenSpiel to try out the CFRSolver algorithm on Kuhn poker, a greatly simplified version of Texas hold'em poker.
Kuhn poker is useful for testing models because its decision tree is small and simple.
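
To make "small" concrete, here is a rough sketch in plain Python that counts the pieces of the Kuhn poker tree without OpenSpiel. The `p`/`b` (pass/bet) history notation and the helper name `betting_histories` are my own illustration, not part of the OpenSpiel API.

```python
from itertools import permutations

CARDS = ["JACK", "QUEEN", "KING"]
# Betting sequences that end the hand ('p' = pass, 'b' = bet).
TERMINAL_HISTORIES = {"pp", "bp", "bb", "pbp", "pbb"}


def betting_histories(history=""):
    """Yield every betting history reachable from `history`, terminal or not."""
    yield history
    if history not in TERMINAL_HISTORIES:
        for action in "pb":
            yield from betting_histories(history + action)


all_histories = list(betting_histories())
# Histories where a player still has to act.
decision_points = [h for h in all_histories if h not in TERMINAL_HISTORIES]
# Ordered deals: one card to player 0, a different card to player 1.
deals = list(permutations(range(len(CARDS)), 2))

num_terminal_states = len(deals) * len(TERMINAL_HISTORIES)
# A player only sees their own card plus the betting so far.
num_info_sets = len(CARDS) * len(decision_points)

print("deals:", len(deals))
print("terminal states:", num_terminal_states)
print("information sets:", num_info_sets)
```

A policy for the whole game is therefore just a probability distribution over two actions at each of a dozen information sets.
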
Here we will go through the states of a single game.

In [None]:
import random
import pyspiel
import numpy as np

cards = {
    0: 'JACK',
    1: 'QUEEN',
    2: 'KING',
}

DECISIONS = {
    'first_to_act': {
        0: 'CHECK',
        1: 'BET 1',
    },
    'last_to_act': {
        0: 'FOLD',
        1: 'CALL 1',
    }
}

game = pyspiel.load_game("kuhn_poker")

print("number of players:", game.num_players())
print("game info:", game.get_type())

state = game.new_initial_state()
while not state.is_terminal():  # loop through game states
    legal_actions = state.legal_actions()

    # chance node: a card is being dealt to a player
    if state.is_chance_node():
        print('player being dealt')

        outcomes_with_probs = state.chance_outcomes()
        for outcome in outcomes_with_probs:
            card_name = cards[outcome[0]]
            prob_pct = round(outcome[1] * 100, 2)
            print(f"  Deal {card_name} with probability {prob_pct}%")


        action_list, prob_list = zip(*outcomes_with_probs)
        action = np.random.choice(action_list, p=prob_list)
        state.apply_action(action)
    else:
        current_player = state.current_player()
        
        print('player to move: ', current_player)

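        # Kuhn poker always offers two legal actions (0 and 1), so their count
        # cannot distinguish contexts; check whether a bet has been made instead.
        betting_history = state.history()[game.num_players():]
        if 1 not in betting_history:
            decision_context = DECISIONS['first_to_act']
        else:
            decision_context = DECISIONS['last_to_act']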


        print('Legal actions:')
        for action_idx in legal_actions:
            action_str = state.action_to_string(current_player, action_idx)
            print(f"  {action_idx}: {action_str}")


        # select a legal action uniformly at random
        action = np.random.choice(legal_actions)
        # look up the action name before the state advances
        action_str = state.action_to_string(current_player, action)
        state.apply_action(action)
        print(f'player {current_player} chose: {action_str}')
print('game finished')
    

number of players: 2
game info: <GameType 'kuhn_poker'>
player being dealt
  Deal JACK with probability 33.33%
  Deal QUEEN with probability 33.33%
  Deal KING with probability 33.33%
player being dealt
  Deal JACK with probability 50.0%
  Deal QUEEN with probability 50.0%
player to move:  0
Legal actions:
  0: Pass
  1: Bet
player 0 chose: Bet
player to move:  1
Legal actions:
  0: Pass
  1: Bet
player 1 chose: Bet
game finished


Great! This looks pretty straightforward once each step is printed. Now we have a basic understanding of the game structure and the OpenSpiel environment.

Let's apply one of OpenSpiel's solvers to this game.

(Note that the cell below actually runs the EFR solver; the CFR solver line is left commented out at the top.)
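
For intuition about what the solver iterates on: vanilla CFR is built on regret matching, which plays each action in proportion to its accumulated positive regret. Below is a minimal sketch of that update for a single decision, in plain Python. It is not OpenSpiel's implementation; `ACTION_VALUES` is a made-up toy payoff vector, and a real CFR solver applies this rule at every information set using counterfactual values from a tree traversal.

```python
def regret_matching(cumulative_regrets):
    """Turn cumulative regrets into a strategy proportional to positive regret."""
    positive = [max(r, 0.0) for r in cumulative_regrets]
    total = sum(positive)
    if total > 0:
        return [p / total for p in positive]
    # No action has positive regret yet: fall back to the uniform strategy.
    return [1.0 / len(cumulative_regrets)] * len(cumulative_regrets)


# Hypothetical toy problem: expected payoff of each of our three actions
# against a fixed opponent (values chosen only for illustration).
ACTION_VALUES = [0.0, 0.25, -0.25]


def run_regret_matching(action_values, num_iterations):
    """Repeat the regret-matching update and return the average strategy."""
    n = len(action_values)
    cumulative_regrets = [0.0] * n
    strategy_sum = [0.0] * n
    for _ in range(num_iterations):
        strategy = regret_matching(cumulative_regrets)
        expected = sum(p * v for p, v in zip(strategy, action_values))
        for a in range(n):
            # Regret: how much better action a would have done than our mix.
            cumulative_regrets[a] += action_values[a] - expected
            strategy_sum[a] += strategy[a]
    return [s / num_iterations for s in strategy_sum]


average_strategy = run_regret_matching(ACTION_VALUES, 1000)
print([round(p, 3) for p in average_strategy])
```

The average strategy piles up on the action with the highest payoff. As in the solver cell, it is the average over iterations that gets reported, not the last iterate.
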

In [8]:
#cfr_solver = pyspiel.CFRSolver(game)

from open_spiel.python.algorithms.efr import EFRSolver
solver = EFRSolver(game, "blind action")

num_iterations = 10000
for _ in range(num_iterations):
    solver.evaluate_and_update_policy()

print(f'completed {num_iterations} iterations')



#policy source code: https://github.com/google-deepmind/open_spiel/blob/ce66de9bfc12e81788c70bc0e8df0291bd93b36b/open_spiel/python/policy.py#L97
avg_policy = solver.average_policy()
print('average policy: ', avg_policy)

completed 10000 iterations
average policy:  <open_spiel.python.policy.TabularPolicy object at 0x7fbd36bc24a0>


This is not readable, so let's write a quick traversal function that prints our policy at each possible state.

In [9]:

#LLM GENERATED CODE

def print_policy_as_poker_range(avg_policy, game, cards):
    """
    Prints the average policy in a more readable, poker-style range format.
    
    Parameters:
        avg_policy: OpenSpiel Policy object (e.g. from solver.average_policy())
        game: pyspiel game object (Kuhn Poker)
        cards: dict mapping card indices to card names
    """
    state = game.new_initial_state()
    policy_table = {}

    def traverse(state):
        if state.is_terminal():
            return
        elif state.is_chance_node():
            for action, _ in state.chance_outcomes():
                next_state = state.child(action)
                traverse(next_state)
        else:
            infoset = state.information_state_string()
            legal_actions = state.legal_actions()
            policy_probs = avg_policy.action_probabilities(state)

            # Extract player's card: the infoset string looks like '0', '2p' or '1pb',
            # and its first character is the card index
            player_card = infoset[0]
            card_idx = int(player_card)
            card_name = cards[card_idx]

            # Extract betting history (if any)
            history = infoset[1:] if len(infoset) > 1 else "None"

            # Store policy entry
            policy_table[(card_name, history)] = {}
            for action in legal_actions:
                action_str = state.action_to_string(state.current_player(), action)
                prob = round(policy_probs[action], 2)
                policy_table[(card_name, history)][action_str] = prob

            # Recurse
            for action in legal_actions:
                next_state = state.child(action)
                traverse(next_state)

    traverse(state)

    ### Print in readable tabular format ###
    print("=== Poker Range Policy ===")
    print("{:<6} {:<8} {:<10} {:<10}".format("Card", "History", "Action", "Prob"))
    print("-" * 40)
    for (card_name, history), actions in sorted(policy_table.items()):
        for action_str, prob in actions.items():
            print("{:<6} {:<8} {:<10} {:<10}".format(card_name, history, action_str, prob))
    print("=" * 40)





print_policy_as_poker_range(avg_policy, game, cards)

=== Poker Range Policy ===
Card   History  Action     Prob      
----------------------------------------
JACK   None     Pass       0.64      
JACK   None     Bet        0.36      
JACK   b        Pass       1.0       
JACK   b        Bet        0.0       
JACK   p        Pass       0.0       
JACK   p        Bet        1.0       
JACK   pb       Pass       1.0       
JACK   pb       Bet        0.0       
KING   None     Pass       0.0       
KING   None     Bet        1.0       
KING   b        Pass       0.0       
KING   b        Bet        1.0       
KING   p        Pass       0.0       
KING   p        Bet        1.0       
KING   pb       Pass       0.5       
KING   pb       Bet        0.5       
QUEEN  None     Pass       0.0       
QUEEN  None     Bet        1.0       
QUEEN  b        Pass       0.65      
QUEEN  b        Bet        0.35      
QUEEN  p        Pass       0.0       
QUEEN  p        Bet        1.0       
QUEEN  pb       Pass       0.5       
QUEEN  pb       Bet 

Great, this looks like it more or less converged!
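
One way to sanity-check a Kuhn poker policy is to compare it against the game's known analytical solution, in which the first player's expected value is -1/18 per hand. The sketch below evaluates one textbook equilibrium (the member of the equilibrium family where player 0 never opens with a bet) in plain Python. The `BET_PROB` table and helper names are my own illustration, not solver output, and because Kuhn poker has many equilibria, the policy printed above need not match these exact numbers.

```python
from fractions import Fraction
from itertools import permutations

THIRD = Fraction(1, 3)

# Probability of choosing 'b' (bet/call) at each information set, keyed by
# (card, betting history). Cards: 0 = JACK, 1 = QUEEN, 2 = KING.
# Illustrative textbook equilibrium, NOT the output of the solver run above.
BET_PROB = {
    (0, ""): 0, (0, "pb"): 0,     (0, "p"): THIRD, (0, "b"): 0,
    (1, ""): 0, (1, "pb"): THIRD, (1, "p"): 0,     (1, "b"): THIRD,
    (2, ""): 0, (2, "pb"): 1,     (2, "p"): 1,     (2, "b"): 1,
}


def terminal_payoff(deal, history):
    """Player 0's net winnings at a terminal history, or None if play continues."""
    sign = 1 if deal[0] > deal[1] else -1
    if history == "pp":
        return sign          # showdown for the antes
    if history in ("bb", "pbb"):
        return 2 * sign      # showdown after a bet and a call
    if history == "bp":
        return 1             # player 1 folds
    if history == "pbp":
        return -1            # player 0 folds
    return None


def expected_value(deal, history=""):
    """Player 0's expected winnings from `history` when both follow BET_PROB."""
    payoff = terminal_payoff(deal, history)
    if payoff is not None:
        return Fraction(payoff)
    player = len(history) % 2  # players alternate, player 0 acts first
    p_bet = Fraction(BET_PROB[(deal[player], history)])
    return (p_bet * expected_value(deal, history + "b")
            + (1 - p_bet) * expected_value(deal, history + "p"))


deals = list(permutations(range(3), 2))  # six equally likely deals
game_value = sum(expected_value(d) for d in deals) / len(deals)
print("player 0 expected value:", game_value)
```

Swapping the solver's rounded probabilities into a table like `BET_PROB` gives a quick read on how player 0 fares under the learned policy.
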

For my own reference, at this point I read the source of OpenSpiel's CFRSolver (in cfr.py), located here:
https://github.com/google-deepmind/open_spiel/blob/ce66de9bfc12e81788c70bc0e8df0291bd93b36b/open_spiel/python/algorithms/cfr.py#L488

And now we've gotten a handle on using OpenSpiel's environment and algorithms!