# Transition probabilities 

$c = $ board configuration. This is a 3x8 matrix of tuples (the examples in this notebook use a reduced 3x4 board). Ex. one element could be $(1,0)$, which means that 1 white token and 0 black tokens occupy that space. 
<br> 
$t = $ player turn. This determines whose turn it is, i.e. black or white.
<br>
$d = $ dice roll. In the Game of Ur as modeled here, the outcome space is $\{0, 1, 2\}$ and the rolls are i.i.d. The code in this notebook gives each outcome probability $1/3$.
<br> 
$s = $ state. This is determined by the board configuration $c$ and the player turn $t$.
<br> 
$s' = $ transition state, i.e. a state that can be reached from $s$. Before the dice roll, this is simply denoted as $s'$. Given a die roll, we can denote it in the following form. 
<br> 
$s'^{d}_{n} = $ specific transition state. After the die roll, we can express the transition state $s'$ more specifically. $d$ denotes the dice roll and $n$ is an arbitrary index that helps keep track of the states that can be reached. $n$ is at most the number of states reachable from $s$. Ex. $s'^{2}_{4}$ can be read as transition state 4 (again, arbitrary numbering) reached from $s$ with a die roll of 2. 
<br>
$a = $ action. This is the action to take given a dice roll $d$ and a state $s$. It causes the transition $s \rightarrow s'$. 
<br>
$p = $ probability of transition. This is the probability of $s \rightarrow s'$ under a policy $\pi$.
<br> 
$\pi = $ policy. Given a game state $s$ and a dice roll $d$, the policy chooses one action $a$ from the set of legal actions. For now, the policy picks one of the legal moves at random. We will try to find a better policy with reinforcement learning.
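
As a worked illustration of the notation above, the probability of reaching a specific transition state factors into the probability of the roll times the probability that the policy picks the corresponding move. This is a sketch under two assumptions: each roll has probability $1/3$ (the distribution used in the code of this notebook), and different legal moves lead to different states. The symbol $N_d$ (number of legal moves for roll $d$) is introduced here only for illustration.

$$p(s'^{d}_{n} \mid s) = P(d)\,\pi(a \mid s, d) = \frac{1}{3}\cdot\frac{1}{N_d}$$

For example, if a roll of 1 allows one legal move and a roll of 2 allows two, then $p(s'^{1}_{1} \mid s) = \frac{1}{3}$ and $p(s'^{2}_{1} \mid s) = p(s'^{2}_{2} \mid s) = \frac{1}{3}\cdot\frac{1}{2} = \frac{1}{6}$. Together with the $\frac{1}{3}$ carried by a roll of 0, the probabilities sum to 1.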


In [3]:
from ipynb.fs.full.Next_Game_state import possible_moves #This returns a list of possible moves for a dice roll. 
from ipynb.fs.full.Playing_the_Game import turn_random_move #This takes a turn with policy of picking a random move
import numpy as np
import time

Game state:
[[(0, 0) (2, 0) (0, 0) (0, 0)]
 [(0, 0) (0, 0) (0, 0) (1, 0)]
 [(0, 0) (0, 2) (0, 0) (0, 0)]]
Roll: 1
It's black's turn
--------------------------------------
It's black's turn
--------------------------------------
Current game state
--------------------------------------
[[(0, 0) (2, 0) (0, 0) (0, 0)]
 [(0, 0) (0, 0) (0, 0) (1, 0)]
 [(0, 0) (0, 2) (0, 0) (0, 0)]]
--------------------------------------
Black rolls a  1
--------------------------------------
Possible black moves
--------------------------------------
One possible move: B0 -> B1
[[(0, 0) (2, 0) (0, 0) (0, 0)]
 [(0, 0) (0, 0) (0, 0) (1, 0)]
 [(0, 1) (0, 1) (0, 0) (0, 0)]]
--------------------------------------
Number of possible moves:  1
--------------------------------------
[array([[(0, 0), (2, 0), (0, 0), (0, 0)],
       [(0, 0), (0, 0), (0, 0), (1, 0)],
       [(0, 1), (0, 1), (0, 0), (0, 0)]],
      dtype=[('whites', '<i4'), ('blacks', '<i4')])]
||||||||||||||||||||||||||||||||||||||

Next player's turn

In [186]:
# initialize the game board
current_game_state = np.array([[(0,0), (2,0), (0,0), (0,0)], 
                           [(0,0), (0,0), (0,0), (0,0)], 
                           [(0,0), (0,2), (0,0), (0,0)]], dtype = [('whites', 'i4'), ('blacks', 'i4')])
# Player turn 
player = 'b'

# Is this testing mode? 
test = False

In [12]:
def potential_moves_before_roll(game_state, player, test=False):
    '''
    This function determines the potential moves before a die roll
    Input: state s (board configuration game_state and the player to move),
    test (True enables the print statements inside possible_moves)
    Output: list with one entry per die roll (0, 1, 2); each entry is the
    list of states possible from the current state s with that roll
    '''
    # initialize potential moves 
    potential_moves = []
    
    # call possible_moves for every possible dice roll, using the state that
    # was passed in rather than the notebook-level current_game_state
    for roll in (0, 1, 2):
        potential_moves.append(possible_moves(player, roll, game_state, test))
    return potential_moves
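
The same collection step can be sketched with a dictionary keyed by the roll, which keeps the roll attached to its successor states. This is only an illustration: `successors_by_roll` and `toy_moves` are hypothetical names, and `toy_moves` is a stand-in for `possible_moves` that returns string labels instead of boards.

```python
def successors_by_roll(move_fn, player, game_state, rolls=(0, 1, 2)):
    """Map each dice roll to the list of successor states it allows."""
    return {roll: move_fn(player, roll, game_state) for roll in rolls}


def toy_moves(player, roll, game_state):
    """Hypothetical stand-in for possible_moves; states are plain labels."""
    table = {0: [], 1: ["s11"], 2: ["s21", "s22"]}
    return table[roll]


by_roll = successors_by_roll(toy_moves, "b", "s")
print(by_roll)
```

Keying by roll avoids relying on list position to recover which roll produced which states.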

In [13]:
moves_before_dice_rolls = potential_moves_before_roll(current_game_state, player)
print(moves_before_dice_rolls)

[[array([[(0, 0), (2, 0), (0, 0), (0, 0)],
       [(0, 0), (0, 0), (0, 0), (0, 0)],
       [(0, 1), (0, 1), (0, 0), (0, 0)]],
      dtype=[('whites', '<i4'), ('blacks', '<i4')])], [array([[(0, 0), (2, 0), (0, 0), (0, 0)],
       [(0, 1), (0, 0), (0, 0), (0, 0)],
       [(0, 0), (0, 1), (0, 0), (0, 0)]],
      dtype=[('whites', '<i4'), ('blacks', '<i4')])], [array([[(0, 0), (2, 0), (0, 0), (0, 0)],
       [(0, 1), (0, 0), (0, 0), (0, 0)],
       [(0, 1), (0, 0), (0, 0), (0, 0)]],
      dtype=[('whites', '<i4'), ('blacks', '<i4')]), array([[(0, 0), (2, 0), (0, 0), (0, 0)],
       [(0, 0), (0, 1), (0, 0), (0, 0)],
       [(0, 0), (0, 1), (0, 0), (0, 0)]],
      dtype=[('whites', '<i4'), ('blacks', '<i4')])]]


In [28]:
testing = False

In [50]:
def move_prob_before_roll(roll_0, roll_1, roll_2, test):
    '''
    Before we roll, we assign probabilities of getting to 
    a certain state. 
    Input: possible states given a particular dice roll n i.e. roll_0 = list of states
    Output: a dictionary mapping the name of each state to its probability
    '''
    # We know this is the distribution of dice rolls 
    die_roll_0 = 1/3 
    die_roll_1 = 1/3
    die_roll_2 = 1/3 
    
    # create a dictionary 
    d = {}
    # count is a running index over all states, so the keys come out as
    # prob_s01, prob_s12, prob_s23, ... (move_prob_after_roll instead
    # restarts the numbering at 1 for each roll)
    count = 1
    
    # Roll of 0: treated as a single outcome that carries the whole
    # probability of the roll
    d['prob_s01'] = die_roll_0
    if test:
        print("Given a roll of 0, you have", d['prob_s01']*100, "% chance of choosing prob_s0" + str(count))
    count += 1

    # Rolls of 1 and 2: the random policy splits the probability of the
    # roll evenly over the legal moves
    for roll, states, die_prob in ((1, roll_1, die_roll_1), (2, roll_2, die_roll_2)):
        if len(states) != 0:
            prob = die_prob * (1/len(states))
            for _ in range(len(states)):
                key = 'prob_s' + str(roll) + str(count)
                d[key] = prob
                if test:
                    print("Given a roll of " + str(roll) + ", you have", d[key]*100, "% chance of choosing " + key)
                count += 1
    return d
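
A compact way to check a table like this is to build it with exact fractions and confirm that it sums to 1. The sketch below is not the notebook's function: `joint_probs_before_roll` is a hypothetical name, the dice are assumed uniform, states are plain labels, and a roll with no legal move is assumed to keep its whole probability on a single "no move" outcome.

```python
from fractions import Fraction


def joint_probs_before_roll(successors, rolls=(0, 1, 2)):
    """Return {(roll, n): probability}, with n numbered 1, 2, ... within each roll."""
    p_roll = Fraction(1, len(rolls))  # assumption: uniform dice
    probs = {}
    for roll in rolls:
        states = successors.get(roll, [])
        if not states:
            # assumption: no legal move means one "no move" outcome
            probs[(roll, 1)] = p_roll
            continue
        for n in range(1, len(states) + 1):
            # the random policy splits the roll's probability evenly
            probs[(roll, n)] = p_roll / len(states)
    return probs


successors = {0: [], 1: ["s11"], 2: ["s21", "s22"]}
probs = joint_probs_before_roll(successors)
print(probs)
print(sum(probs.values()))
```

Using `Fraction` instead of floats makes the sum-to-one check exact.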

In [51]:
roll_0 = possible_moves(player, 0, current_game_state, test)
roll_1 = possible_moves(player, 1, current_game_state, test)
roll_2 = possible_moves(player, 2, current_game_state, test)
answer = move_prob_before_roll(roll_0, roll_1, roll_2, testing)
print(answer)
print(type(answer))

--------------------------------------
It's black's turn
--------------------------------------
Current game state
--------------------------------------
[[(0, 0) (2, 0) (0, 0) (0, 0)]
 [(0, 0) (0, 0) (0, 0) (0, 0)]
 [(0, 0) (0, 2) (0, 0) (0, 0)]]
--------------------------------------
Black rolls a  0
--------------------------------------
Possible black moves
--------------------------------------
None
--------------------------------------
Number of possible moves:  0
--------------------------------------
--------------------------------------
It's black's turn
--------------------------------------
Current game state
--------------------------------------
[[(0, 0) (2, 0) (0, 0) (0, 0)]
 [(0, 0) (0, 0) (0, 0) (0, 0)]
 [(0, 0) (0, 2) (0, 0) (0, 0)]]
--------------------------------------
Black rolls a  1
--------------------------------------
Possible black moves
--------------------------------------
One possible move: B0 -> B1
[[(0, 0) (2, 0) (0, 0) (0, 0)]
 [(0, 0) (0, 0) (0, 0) 

In [52]:
testing = True 
states_roll_0 = len(roll_0)
states_roll_1 = len(roll_1)
states_roll_2 = len(roll_2)
states_per_roll = [states_roll_0, states_roll_1, states_roll_2]
print(states_per_roll)

[1, 1, 1]


In [53]:
def move_prob_after_roll(roll, potential_moves_dict, states_per_roll, testing):
    '''
    Given a die roll, we calculate the probability of choosing a certain legal move.
    I.e. Given a roll of 2, what is the likelihood that we pick s21 or s22
    Input: roll, dictionary of possible moves (currently unused),
    number of possible states per roll, testing flag for print statements
    Output: dictionary of conditional probabilities
    '''
    if testing: 
        print()
        print("After rolling a", roll, ":")
        print("-------------------")
    d = {}
    if roll == 0: 
        # a roll of 0 has a single outcome
        d['prob_s01'] = 1
        if testing:
            print("Given a roll of 0, you have a", d['prob_s01']*100, "% chance of getting state prob_s01")
    elif roll in (1, 2):
        # the random policy picks each legal move with equal probability
        num_states = states_per_roll[roll]
        for count in range(1, num_states + 1):
            key = 'prob_s' + str(roll) + str(count)
            d[key] = 1 / num_states
            if testing:
                print("Given a roll of " + str(roll) + ", you have a", d[key] * 100, "% chance of choosing " + key)
            
    return d 
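
The conditional probabilities after a roll can also be derived from a joint table by dividing by the probability of that roll, $p(s' \mid d) = p(s', d) / P(d)$. The sketch below illustrates this under stated assumptions: `condition_on_roll` is a hypothetical helper, and the joint table is keyed by `(roll, n)` with made-up values for a position with one legal move on a roll of 1 and two on a roll of 2.

```python
from fractions import Fraction


def condition_on_roll(joint, roll):
    """Return {n: p(state n | roll)} from a joint table {(roll, n): p(roll, state n)}."""
    # total probability of the observed roll
    mass = sum(p for (r, _), p in joint.items() if r == roll)
    if mass == 0:
        raise ValueError(f"roll {roll} has zero probability")
    # renormalize the entries that belong to this roll
    return {n: p / mass for (r, n), p in joint.items() if r == roll}


joint = {
    (0, 1): Fraction(1, 3),
    (1, 1): Fraction(1, 3),
    (2, 1): Fraction(1, 6),
    (2, 2): Fraction(1, 6),
}
after_two = condition_on_roll(joint, 2)
print(after_two)
```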

In [54]:
potential_moves_dict = move_prob_before_roll(roll_0, roll_1, roll_2, testing)
roll = 2
testing = True 
trans_prob_given_roll = move_prob_after_roll(roll, potential_moves_dict, states_per_roll, testing)
print(trans_prob_given_roll)

Given a roll of 0, you have 33.33333333333333 % chance of choosing prob_s01
Given a roll of 1, you have 33.33333333333333 % chance of choosing prob_s12
Given a roll of 2, you have 33.33333333333333 % chance of choosing prob_s23

After rolling a 2 :
-------------------
Given a roll of 2, you have a 100.0 % chance of choosing prob_s21
{'prob_s21': 1.0}


In [14]:
def future_state_probs_after_roll(game_state, before_roll_dict, die_roll, num_turns_ahead, turn, test):
    '''
    This function determines the probability of going from the state after a die roll to the state before the next die roll 
    In other words, [After -> Before] is considered one iteration; [After -> Before -> After -> Before] is considered two.
    Input:
    - the current gamestate, game_state (3x8 matrix)
    - dictionary of possible transition states and probabilities before the die roll, before_roll_dict (dictionary)
    - the roll of the dice, die_roll (integer)
    - number of iterations to look ahead, num_turns_ahead (integer)
    - player turn, turn (string either 'w' or 'b')
    - enable print statements, test (True or False boolean)
    Output: 
    - intended: dictionary of possible states and their probabilities starting after the die roll and ending right
    before the next die roll after n iterations. 
    - currently: the game state reached and the player to move, since the dictionary update is not implemented yet
    '''
    
    # Initialize prob_after_roll dictionary 
    # (moves_per_state is a notebook-level variable set in the next cell)
    d_after_roll = move_prob_after_roll(die_roll, before_roll_dict, moves_per_state, test)
    
    # answer dictionary (not updated inside the loop yet)
    d_answer = d_after_roll
    
    current_game_state = game_state
    player = turn
    
    for i in range(num_turns_ahead):
        
        # Find possible moves given dice roll for the player whose turn it is
        moves = possible_moves(player, die_roll, current_game_state, test)
        
        # Pick a random move; with no legal move the board stays as it is
        if len(moves) != 0:
            number_of_moves = len(moves)
            move_to_pick = np.random.randint(0, number_of_moves)
            current_game_state = moves[move_to_pick]
            
        print()
        print("Player", player, "chose a move ")
        print(current_game_state)
        print()
        
        # Change players 
        player = 'b' if player == 'w' else 'w'
            
        # List of possible moves given a certain roll 
        roll_0 = possible_moves(player, 0, current_game_state, test)
        roll_1 = possible_moves(player, 1, current_game_state, test)
        roll_2 = possible_moves(player, 2, current_game_state, test)
            
        # We know this is the distribution of dice rolls 
        die_roll_0 = 1/3 
        die_roll_1 = 1/3
        die_roll_2 = 1/3 
        
        # Find possible moves for next player
        # aka update d_before (not implemented yet)
        count = len(roll_0)
            
        print(current_game_state)
        print(player)
    
    return current_game_state, player
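
The function above follows a single random path. An alternative, sketched here only as an illustration, is to carry the whole probability distribution over states forward one turn at a time. Everything in the block is an assumption for the sake of a small runnable example: the names `step_distribution`, `look_ahead` and `toy_successors` are hypothetical, the dice are uniform over 0, 1, 2, the policy is uniform over legal moves, and the "game" is a toy track of integer positions rather than the Ur board (a real state would also include whose turn it is).

```python
from collections import defaultdict
from fractions import Fraction


def step_distribution(dist, successors_fn, rolls=(0, 1, 2)):
    """Advance a {state: probability} distribution by one roll and one move."""
    p_roll = Fraction(1, len(rolls))  # assumption: uniform dice
    new_dist = defaultdict(Fraction)
    for state, p_state in dist.items():
        for roll in rolls:
            # with no legal move the state is assumed to stay unchanged
            next_states = successors_fn(state, roll) or [state]
            for s_next in next_states:
                new_dist[s_next] += p_state * p_roll / len(next_states)
    return dict(new_dist)


def look_ahead(start, successors_fn, num_turns):
    """Distribution over states after num_turns steps from a known start."""
    dist = {start: Fraction(1)}
    for _ in range(num_turns):
        dist = step_distribution(dist, successors_fn)
    return dist


def toy_successors(state, roll):
    """Toy game: a token advances by the roll along a track that ends at 4."""
    if roll == 0 or state == 4:
        return []
    return [min(state + roll, 4)]


dist = look_ahead(0, toy_successors, 2)
print(dist)
```

Because equal states are merged in the dictionary, the table grows with the number of distinct reachable states rather than with the number of paths.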

In [15]:
# initialize the game board
current_game_state = np.array([[(0,0), (2,0), (0,0), (0,0)], 
                           [(0,0), (0,0), (0,0), (0,0)], 
                           [(0,0), (0,2), (0,0), (0,0)]], dtype = [('whites', 'i4'), ('blacks', 'i4')])
# Player turn 
player = 'b'

# Is this testing mode? 
test = True

# turns to look ahead 
num_look_ahead = 2

# die roll
die_roll = 2

roll_0 = possible_moves(player, 0, current_game_state, test)
num_poss_moves_0 = len(roll_0)
roll_1 = possible_moves(player, 1, current_game_state, test)
num_poss_moves_1 = len(roll_1)
roll_2 = possible_moves(player, 2, current_game_state, test)
num_poss_moves_2 = len(roll_2)
moves_per_state = [num_poss_moves_0, num_poss_moves_1, num_poss_moves_2]

# Create prob_before_roll dictionary 
d_before_roll = move_prob_before_roll(roll_0, roll_1, roll_2, test)
print(d_before_roll)
trans_probs_after_roll = future_state_probs_after_roll(current_game_state, d_before_roll, die_roll, num_look_ahead, player, test)


--------------------------------------
It's black's turn
--------------------------------------
Current game state
--------------------------------------
[[(0, 0) (2, 0) (0, 0) (0, 0)]
 [(0, 0) (0, 0) (0, 0) (0, 0)]
 [(0, 0) (0, 2) (0, 0) (0, 0)]]
--------------------------------------
Black rolls a  0
--------------------------------------
Possible black moves
--------------------------------------
None
--------------------------------------
Number of possible moves:  0
--------------------------------------
--------------------------------------
It's black's turn
--------------------------------------
Current game state
--------------------------------------
[[(0, 0) (2, 0) (0, 0) (0, 0)]
 [(0, 0) (0, 0) (0, 0) (0, 0)]
 [(0, 0) (0, 2) (0, 0) (0, 0)]]
--------------------------------------
Black rolls a  1
--------------------------------------
Possible black moves
--------------------------------------
One possible move: B0 -> B1
[[(0, 0) (2, 0) (0, 0) (0, 0)]
 [(0, 0) (0, 0) (0, 0) 

In [183]:
def future_state_probabilities(game_state, num_turns_ahead, turn, test):
    '''
    This function determines the likelihood of transitioning from a gamestate s 
    to gamestate s' for an n number of turns in the future 
    Input: state s (3x8 matrix), num_turns_ahead (the number of turns we'd like
    to look ahead to), turn ('w' or 'b'), test (enables print statements)
    Outputs: a dictionary of possible states mapped to probabilities
    (still empty: only a single random move is played so far)
    '''
    # The notebook-level state and player are updated in place
    global current_game_state 
    global player
    
    # start from the state and player that were passed in
    current_game_state = game_state
    player = turn
    
    # initialize final dictionary
    d_answer = {}
    
    # possible moves for every die roll
    roll_0 = possible_moves(player, 0, current_game_state, test)
    num_poss_moves_0 = len(roll_0)
    roll_1 = possible_moves(player, 1, current_game_state, test)
    num_poss_moves_1 = len(roll_1)
    roll_2 = possible_moves(player, 2, current_game_state, test)
    num_poss_moves_2 = len(roll_2)
    moves_per_state = [num_poss_moves_0, num_poss_moves_1, num_poss_moves_2]

    
    # Create prob_before_roll dictionary 
    d_before_roll = move_prob_before_roll(roll_0, roll_1, roll_2, test)
    print(d_before_roll)
    
    #die roll 
    die_roll = 2
    
    # Create prob_after_roll dictionary 
    d_after_roll = move_prob_after_roll(die_roll, d_before_roll, moves_per_state, test)
    print(d_after_roll)
    
    np.random.seed(3)
    
    # Choose the move
    
    # Find possible moves 
    moves = possible_moves(player, die_roll, current_game_state, test)
    if len(moves) != 0:
        # pick a random move; with no legal move the board stays as it is
        number_of_moves = len(moves)
        move_to_pick = np.random.randint(0, number_of_moves)
        current_game_state = moves[move_to_pick]
    
    # change players (elif, so that 'w' is not flipped straight back)
    if player == "w":
        player = 'b'
    elif player == 'b':
        player = 'w'
    
    print(current_game_state)
    print(player)
    
    return d_answer

In [184]:
# initialize the game board
current_game_state = np.array([[(0,0), (2,0), (0,0), (0,0)], 
                [(0,0), (0,0), (0,0), (0,0)], 
                [(0,0), (0,2), (0,0), (0,0)]], dtype = [('whites', 'i4'), ('blacks', 'i4')])
look_ahead_n_turns = 2 
player = 'b'
test = True 
future_states_probs = future_state_probabilities(current_game_state, look_ahead_n_turns, player, test)
print(future_states_probs)

--------------------------------------
It's black's turn
--------------------------------------
Current game state
--------------------------------------
[[(0, 0) (2, 0) (0, 0) (0, 0)]
 [(0, 0) (0, 0) (0, 0) (0, 0)]
 [(0, 0) (0, 2) (0, 0) (0, 0)]]
--------------------------------------
Black rolls a  0
--------------------------------------
Possible black moves
--------------------------------------
None
--------------------------------------
Number of possible moves:  0
--------------------------------------
--------------------------------------
It's black's turn
--------------------------------------
Current game state
--------------------------------------
[[(0, 0) (2, 0) (0, 0) (0, 0)]
 [(0, 0) (0, 0) (0, 0) (0, 0)]
 [(0, 0) (0, 2) (0, 0) (0, 0)]]
--------------------------------------
Black rolls a  1
--------------------------------------
Possible black moves
--------------------------------------
One possible move: B0 -> B1
[[(0, 0) (2, 0) (0, 0) (0, 0)]
 [(0, 0) (0, 0) (0, 0) 

In [176]:
print(current_game_state)
print(player)

[[(0, 0) (2, 0) (0, 0) (0, 0)]
 [(0, 0) (0, 0) (0, 0) (0, 0)]
 [(0, 1) (0, 1) (0, 0) (0, 0)]]
w


In [None]:
def policy(gamestate, action_space):
    '''
    The policy that determines which action to take given the possible actions (determined in function next_game_state)
    Input: 
    gamestate, s
    action_space, a_space. The die roll is already encoded into the action_space, therefore we do not need to specify the roll 
    
    Output:
    action to take from action space, a (here: the resulting game state)
    '''
    if len(action_space) == 0:
        # no legal move: the game state stays as it is
        a = gamestate
    else: 
        # pick a random move
        number_of_moves = len(action_space)
        move_to_pick = np.random.randint(0, number_of_moves)
        a = action_space[move_to_pick]

    return a
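
For repeatable experiments, the random choice can draw from its own seeded generator instead of the global random state. This is a minimal sketch using only the standard library: `random_policy` is a hypothetical name, and the successor states are string labels rather than boards.

```python
import random


def random_policy(game_state, action_space, rng):
    """Pick one successor state uniformly at random; stay put if there is none."""
    if len(action_space) == 0:
        return game_state
    return rng.choice(action_space)


rng = random.Random(3)  # local generator, seeded for reproducibility
choice = random_policy("s", ["s21", "s22"], rng)
print(choice)
```

Two generators created with the same seed produce the same sequence of choices, which makes a run easy to replay.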

In [4]:
def transition_probabilites(gamestate, policy):
    '''
    This function returns a vector of transition probabilities
    Input: 
    gamestate, s
    policy, the policy used to choose among the legal moves
    
    Output: 
    the transition probability vector of s, p. 
    '''
    return 

SyntaxError: invalid syntax (<ipython-input-4-65da3c848eb3>, line 1)

In [23]:
game_state_1 = np.array([[(0,0), (2,0), (0,0), (0,0)], 
                           [(0,0), (0,0), (0,0), (0,0)], 
                           [(0,0), (0,2), (0,0), (0,0)]], dtype = [('whites', 'i4'), ('blacks', 'i4')])

game_state_2 = np.array([[(0,0), (2,0), (0,0), (0,0)], 
                           [(0,0), (0,0), (0,0), (0,0)], 
                           [(0,0), (0,2), (0,0), (0,0)]], dtype = [('whites', 'i4'), ('blacks', 'i4')])

if np.array_equal(game_state_1, game_state_2):
    print("These are equal")

These are equal
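
An equality check like this one matters when two different moves lead to the same board: their probabilities then belong to a single transition state and should be added. The sketch below shows one way to do that merge under stated assumptions: `merge_equal_states` is a hypothetical helper, and boards are nested lists of `(whites, blacks)` tuples rather than NumPy structured arrays, so that they can be turned into hashable dictionary keys.

```python
from collections import defaultdict
from fractions import Fraction


def merge_equal_states(weighted_states):
    """Sum the probabilities of successor boards that are equal."""
    merged = defaultdict(Fraction)
    for board, prob in weighted_states:
        # nested tuples are hashable, so equal boards share one key
        key = tuple(tuple(tuple(cell) for cell in row) for row in board)
        merged[key] += prob
    return dict(merged)


board_a = [[(0, 0), (2, 0)], [(0, 1), (0, 1)]]
board_b = [[(0, 0), (2, 0)], [(0, 1), (0, 1)]]  # same position as board_a
board_c = [[(0, 0), (2, 0)], [(0, 0), (0, 2)]]

merged = merge_equal_states([
    (board_a, Fraction(1, 6)),
    (board_b, Fraction(1, 6)),
    (board_c, Fraction(1, 3)),
])
print(len(merged))
```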
