### Content 

* Game Representation
* Game Examples 
    * Tic-Tac-Toe
    * Fig 5.2 Game
* Min-Max
* Alpha-Beta
* Players 

In [1]:
from games import * 
from notebook import psource, pseudocode

### Game Representation

To represent games we use the Game class, which we can subclass, overriding its methods to represent our own games.

GameState is a namedtuple which represents the current state of the game. It is defined as follows:

GameState = namedtuple('GameState', 'to_move, utility, board, moves')

* to_move: represents whose turn is next
* utility: stores the utility of the game state
* board: a dict that stores the board of the game
* moves: stores the list of legal moves possible from the current position
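As a quick illustration of how these fields are read and updated, the sketch below rebuilds the namedtuple with the standard library instead of importing it from games. The sample position is made up for the example.

```python
from collections import namedtuple

# Same definition as quoted above, rebuilt locally so the snippet is self-contained.
GameState = namedtuple('GameState', 'to_move, utility, board, moves')

# A hypothetical 3x3 position after X has played the centre square.
state = GameState(
    to_move='O',
    utility=0,
    board={(2, 2): 'X'},
    moves=[(x, y) for x in range(1, 4) for y in range(1, 4) if (x, y) != (2, 2)],
)

print(state.to_move)     # fields are read by name
print(len(state.moves))  # number of squares still free

# namedtuples are immutable, so a successor state is a new tuple.
next_state = state._replace(to_move='X')
```

Because the tuple is immutable, a state can be passed down a search tree without any risk that a recursive call modifies it.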


In [2]:
# see game class
%psource Game

We must implement the following methods when we create a new class that represents a game. 

* actions(self, state): given a game state, this method generates all the legal actions possible from this state, as a list or generator. Returning a generator rather than a list saves space, and we can still iterate over it just as we would over a list.

* result(self, state, move): given a game state and a move, this method returns the game state that you can get by making that move on this game state. 

* utility(self, state, player): given a terminal game state and a player, this method returns the utility for that player in the given terminal game state. While implementing this method assume that the game state is a terminal game state. The logic in this module is such that this method will only be called on terminal game states.

* terminal_test(self, state): given a game state, this method will return true if this game state is a terminal state, and false otherwise 

* to_move(self, state): given a game state, this method returns the player who is to play next. The information is typically stored in the game state, so all this method does is extract the information and return it.

* display(self, state): prints or otherwise displays the current state of the game.
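To make the interface concrete, here is a minimal sketch of a toy game that implements these methods. The Game class below is a local stand-in reduced to the interface listed above, and TakeAway and PileState are hypothetical names invented for this example; none of this is taken from the games module.

```python
from collections import namedtuple


class Game:
    """Stand-in base class exposing only the interface listed above."""

    def actions(self, state):
        raise NotImplementedError

    def result(self, state, move):
        raise NotImplementedError

    def utility(self, state, player):
        raise NotImplementedError

    def terminal_test(self, state):
        raise NotImplementedError

    def to_move(self, state):
        return state.to_move

    def display(self, state):
        print(state)


# Hypothetical state: whose turn it is and how many counters remain.
PileState = namedtuple('PileState', 'to_move, pile')


class TakeAway(Game):
    """Players alternately remove 1 or 2 counters; whoever takes the last one wins."""

    def __init__(self, pile=4):
        self.initial = PileState(to_move='MAX', pile=pile)

    def actions(self, state):
        # Generator of legal moves: cannot take more counters than remain.
        return (n for n in (1, 2) if n <= state.pile)

    def result(self, state, move):
        other = 'MIN' if state.to_move == 'MAX' else 'MAX'
        return PileState(to_move=other, pile=state.pile - move)

    def utility(self, state, player):
        # Only meaningful on terminal states: the player who just moved took the last counter.
        winner = 'MIN' if state.to_move == 'MAX' else 'MAX'
        return 1 if player == winner else -1

    def terminal_test(self, state):
        return state.pile == 0


game = TakeAway(pile=3)
state = game.result(game.initial, 2)   # MAX takes two counters
print(list(game.actions(state)))       # only one counter can still be taken
state = game.result(state, 1)          # MIN takes the last counter
print(game.terminal_test(state), game.utility(state, 'MIN'))
```

Note that actions returns a generator here, so the caller wraps it in list() before printing; a generator can be iterated only once.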


In [3]:
# tic tac toe example 
%psource TicTacToe

The class TicTacToe is inherited from the class Game. 

Additional methods in TicTacToe: 

* __init__(self, h = 3, v = 3, k = 3): initializes the tictactoe game. Number of rows is h, number of columns is v, and number of consecutive X's or O's required to win is k.

* compute_utility(self, board, move, player): A method to calculate the utility of the TicTacToe game. If X wins, this returns 1. If O wins, this returns -1. Otherwise it returns 0.

* k_in_row(self, board, move, player, delta_x_y): This method returns True if the latest move completes a line of k marks on the TicTacToe board in the direction delta_x_y; otherwise it returns False.
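The line check can be sketched as two walks from the latest move, one in each direction along delta_x_y. The functions below are written as standalone functions with k passed as a parameter, which is an assumption made to keep the example self-contained; they illustrate the idea and are not necessarily the code in games.

```python
def k_in_row(board, move, player, delta_x_y, k=3):
    """Return True if `player` has at least k marks in a line through `move`."""
    delta_x, delta_y = delta_x_y
    n = 0
    # Walk forward from the move while the squares belong to the player.
    x, y = move
    while board.get((x, y)) == player:
        n += 1
        x, y = x + delta_x, y + delta_y
    # Walk backward from the move in the opposite direction.
    x, y = move
    while board.get((x, y)) == player:
        n += 1
        x, y = x - delta_x, y - delta_y
    n -= 1  # the move itself was counted in both walks
    return n >= k


def compute_utility(board, move, player, k=3):
    """1 if X completed a line with this move, -1 if O did, 0 otherwise."""
    directions = [(0, 1), (1, 0), (1, -1), (1, 1)]  # two axis directions, two diagonals
    if any(k_in_row(board, move, player, d, k) for d in directions):
        return 1 if player == 'X' else -1
    return 0


board = {(1, 1): 'X', (1, 2): 'O', (1, 3): 'X',
         (2, 1): 'O', (2, 2): 'X', (2, 3): 'O',
         (3, 1): 'X'}
print(compute_utility(board, (2, 2), 'X'))  # X completes the (1,3)-(2,2)-(3,1) diagonal
```

Only lines through the latest move are checked, since any line completed by a move must pass through the square just played.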

To store the game states we will use the GameState namedtuple. 

* to_move: a string, either 'X' or 'O'
* utility: 1 for win, -1 for loss, 0 otherwise 
* board: all the positions of X and O on the board
* moves: all the possible moves from the current state

In [5]:
# initialize the game by creating an instance of the subclass 
ttt = TicTacToe()

# print state 
ttt.display(ttt.initial)

. . . 
. . . 
. . . 


In [8]:
# create a new game state by ourselves to experiment
my_state = GameState(
    to_move = 'X',
    utility = 0,
    board = {(1, 1): 'X', (1, 2): 'O', (1, 3): 'X', 
             (2, 1): 'O',              (2, 3): 'O',
             (3, 1): 'X',}, 
    moves = [(2,2), (3,2), (3,3)]
)

# print game state 
ttt.display(my_state)

X O X 
O . O 
X . . 


In [12]:
# ask the random player, which picks a legal move pseudo-randomly, for a move
print("Random Player Move:", random_player(ttt, my_state))

# ask the alphabeta player, which always picks the best move, for a move
print("AlphaBeta Player Move:", alphabeta_player(ttt, my_state))

Random Player Move: (3, 2)
AlphaBeta Player Move: (2, 2)


In [13]:
# make random player and alphabeta player play against each other 
ttt.play_game(random_player, alphabeta_player)

O X X 
O O X 
O X . 


-1

In [15]:
# The previous result is usually -1 because the alphabeta player plays perfectly.
# For the same reason, a match between two alphabeta players should always end in a draw.
for _ in range(10): 
    print(ttt.play_game(alphabeta_player, alphabeta_player))

X X O 
O O X 
X O X 
0
X X O 
O O X 
X O X 
0
X X O 
O O X 
X O X 
0
X X O 
O O X 
X O X 
0
X X O 
O O X 
X O X 
0
X X O 
O O X 
X O X 
0
X X O 
O O X 
X O X 
0
X X O 
O O X 
X O X 
0
X X O 
O O X 
X O X 
0
X X O 
O O X 
X O X 
0


In [16]:
# a random player should never win against an alphabeta player 
for _ in range(10): 
    print(ttt.play_game(random_player, alphabeta_player))

X X O 
. O X 
O . . 
-1
O O O 
. . X 
X X . 
-1
O O O 
X . X 
. X . 
-1
O . X 
. O X 
X . O 
-1
X X O 
O O X 
X X O 
0
O X X 
X O O 
X O X 
0
. . O 
X O . 
O X X 
-1
O X . 
. O X 
X . O 
-1
O O O 
X O X 
X . X 
-1
O O O 
X X O 
X X . 
-1


In [17]:
# we can also use the Canvas_TicTacToe subclass to play in the jupyter notebook
from notebook import Canvas_TicTacToe

In [18]:
# match between random_player and alphabeta player 
bot_play = Canvas_TicTacToe('bot_play', 'random', 'alphabeta')

In [19]:
# play a game ourselves against a random player 
rand_play = Canvas_TicTacToe('rand_play', 'human', 'random')

In [20]:
# play a game against the alphabeta player
ab_play = Canvas_TicTacToe('ab_play', 'human', 'alphabeta')

### Fig52 Game

![Game tree from Figure 5.2](/home/michael/Desktop/SS/Artificial_Intelligence/fig_5_2.png)

In [22]:
# model moves, utilities and initial state
moves = dict(A=dict(a1='B', a2='C', a3='D'),
             B=dict(b1='B1', b2='B2', b3='B3'),
             C=dict(c1='C1', c2='C2', c3='C3'),
             D=dict(d1='D1', d2='D2', d3='D3'))
utils = dict(B1=3, B2=12, B3=8, C1=2, C2=4, C3=6, D1=14, D2=5, D3=2)
initial = 'A'

# showcase moves
print(moves['A']['a1'])

B


In [24]:
# create an object of the Fig52Game class 
fig52 = Fig52Game()

# check actions
psource(Fig52Game.actions)
print("Actions From B", fig52.actions('B'))

# check results
psource(Fig52Game.result)
print("Results From A to a1", fig52.result('A', 'a1'))

# utility
psource(Fig52Game.utility)
print("Utility of 'B1' for MAX", fig52.utility('B1', 'MAX'))
print("Utility of 'B1' for MIN", fig52.utility('B1', 'MIN'))

# terminal test 
psource(Fig52Game.terminal_test)
print("Terminal test for 'C3'", fig52.terminal_test('C3'))

# to_move
psource(Fig52Game.to_move)
print("To move for A", fig52.to_move('A'))

# whole class 
psource(Fig52Game)

Actions From B ['b1', 'b2', 'b3']


Results From A to a1 B


Utility of 'B1' for MAX 3
Utility of 'B1' for MIN -3


Terminal test for 'C3' True


To move for A MAX


### Min-Max Algorithm 

This algorithm computes the next move for a player (MIN or MAX) at their current state. It recursively computes the minimax value of successor states until it reaches the terminal leaves of the tree. Using the utility values of the terminal states, it then backs up the values of parent states until it reaches the initial node.
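For the Fig 5.2 tree this backing-up takes only two steps, which can be worked out by hand. The sketch below uses the terminal utilities defined earlier; the variable names are chosen just for this illustration.

```python
# Terminal utilities of the Fig 5.2 tree, grouped under the MIN node that leads to them.
leaves = {
    'B': [3, 12, 8],
    'C': [2, 4, 6],
    'D': [14, 5, 2],
}

# MIN moves at B, C and D, so each of those nodes is worth its smallest leaf.
min_values = {node: min(values) for node, values in leaves.items()}
print(min_values)

# MAX moves at the root A and picks the child with the largest backed-up value.
best_child = max(min_values, key=min_values.get)
print(best_child, min_values[best_child])
```

The root's minimax value is therefore the value of B, reached by the move a1.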

In [25]:
pseudocode("Minimax-Decision")

### AIMA3e
__function__ MINIMAX-DECISION(_state_) __returns__ _an action_  
&emsp;__return__ arg max<sub> _a_ &Element; ACTIONS(_s_)</sub> MIN\-VALUE(RESULT(_state_, _a_))  

---
__function__ MAX\-VALUE(_state_) __returns__ _a utility value_  
&emsp;__if__ TERMINAL\-TEST(_state_) __then return__ UTILITY(_state_)  
&emsp;_v_ &larr; &minus;&infin;  
&emsp;__for each__ _a_ __in__ ACTIONS(_state_) __do__  
&emsp;&emsp;&emsp;_v_ &larr; MAX(_v_, MIN\-VALUE(RESULT(_state_, _a_)))  
&emsp;__return__ _v_  

---
__function__ MIN\-VALUE(_state_) __returns__ _a utility value_  
&emsp;__if__ TERMINAL\-TEST(_state_) __then return__ UTILITY(_state_)  
&emsp;_v_ &larr; &infin;  
&emsp;__for each__ _a_ __in__ ACTIONS(_state_) __do__  
&emsp;&emsp;&emsp;_v_ &larr; MIN(_v_, MAX\-VALUE(RESULT(_state_, _a_)))  
&emsp;__return__ _v_  

---
__Figure__ ?? An algorithm for calculating minimax decisions. It returns the action corresponding to the best possible move, that is, the move that leads to the outcome with the best utility, under the assumption that the opponent plays to minimize utility. The functions MAX\-VALUE and MIN\-VALUE go through the whole game tree, all the way to the leaves, to determine the backed\-up value of a state. The notation argmax <sub>_a_ &Element; _S_</sub> _f_(_a_) computes the element _a_ of set _S_ that has maximum value of _f_(_a_).

---
__function__ EXPECTIMINIMAX(_s_) =     
&emsp;UTILITY(_s_) __if__ TERMINAL\-TEST(_s_)  
&emsp;max<sub>_a_</sub> EXPECTIMINIMAX(RESULT(_s, a_)) __if__ PLAYER(_s_)= MAX  
&emsp;min<sub>_a_</sub> EXPECTIMINIMAX(RESULT(_s, a_)) __if__ PLAYER(_s_)= MIN  
&emsp;∑<sub>_r_</sub> P(_r_) EXPECTIMINIMAX(RESULT(_s, r_)) __if__ PLAYER(_s_)= CHANCE

### Implementation 

We will use two functions, max_value and min_value, which compute the backed-up value of a state where MAX or MIN is to move, respectively. They call each other in alternating recursion until a terminal state is reached. When the recursion unwinds, we are left with a score for each move available at the root, and we return the move with the highest score. Returning the maximum also works when MIN is the player at the root, because utilities are evaluated from the point of view of the player to move, and MIN's utilities are the negatives of MAX's.
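The alternating recursion can be sketched on the Fig 5.2 tree using plain dicts. This is an illustrative re-implementation under the assumption that utilities are scored from the root player's side; minimax_sketch and the helper names are hypothetical and the code is not the source of minimax_decision.

```python
# Fig 5.2 tree as plain dicts (same data as the moves and utils defined earlier).
MOVES = {
    'A': {'a1': 'B', 'a2': 'C', 'a3': 'D'},
    'B': {'b1': 'B1', 'b2': 'B2', 'b3': 'B3'},
    'C': {'c1': 'C1', 'c2': 'C2', 'c3': 'C3'},
    'D': {'d1': 'D1', 'd2': 'D2', 'd3': 'D3'},
}
UTILS = dict(B1=3, B2=12, B3=8, C1=2, C2=4, C3=6, D1=14, D2=5, D3=2)


def to_move(state):
    # MIN moves at B, C and D; MAX moves at the root.
    return 'MIN' if state in 'BCD' else 'MAX'


def utility(state, player):
    # Utilities are stored from MAX's point of view and negated for MIN.
    return UTILS[state] if player == 'MAX' else -UTILS[state]


def minimax_sketch(state):
    """Best action for whoever moves at `state`, scored from that player's side."""
    player = to_move(state)

    def max_value(s):
        if s in UTILS:  # terminal state
            return utility(s, player)
        return max(min_value(child) for child in MOVES[s].values())

    def min_value(s):
        if s in UTILS:  # terminal state
            return utility(s, player)
        return min(max_value(child) for child in MOVES[s].values())

    # The root player maximizes over its moves; the opponent's reply minimizes.
    return max(MOVES[state], key=lambda a: min_value(MOVES[state][a]))


for root in 'ABCD':
    print(root, minimax_sketch(root))
```

At B, C and D the root player is MIN, so the scores are negated utilities and taking the maximum selects the leaf that is worst for MAX.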

In [27]:
psource(minimax_decision)

In [28]:
# use minimax to play the fig52 game 
print("Move 1", minimax_decision('B', fig52))
print("Move 2", minimax_decision('C', fig52))
print("Move 3", minimax_decision('D', fig52))
print("Move 4", minimax_decision('A', fig52))

Move 1 b1
Move 2 c1
Move 3 d3
Move 4 a1


In [29]:
# visualization 
from notebook import Canvas_minimax
from random import randint

minimax_viz = Canvas_minimax('minimax_viz', [randint(1, 50) for i in range(27)])

### Alpha-Beta

While minimax is great for computing a move, it doesn't scale well. The algorithm needs to search every leaf of the tree, and the number of leaves grows exponentially with the tree's depth.

Here we examine pruning the tree, which means skipping parts of it that cannot affect the final decision. This technique is called alpha-beta pruning, and the resulting search is called alpha-beta search.

In [30]:
pseudocode("Alpha-Beta-Search")

### AIMA3e
__function__ ALPHA-BETA-SEARCH(_state_) __returns__ an action  
&emsp;_v_ &larr; MAX\-VALUE(_state_, &minus;&infin;, &plus;&infin;)  
&emsp;__return__ the _action_ in ACTIONS(_state_) with value _v_  

---
__function__ MAX\-VALUE(_state_, _&alpha;_, _&beta;_) __returns__ _a utility value_  
&emsp;__if__ TERMINAL\-TEST(_state_) __then return__ UTILITY(_state_)  
&emsp;_v_ &larr; &minus;&infin;  
&emsp;__for each__ _a_ __in__ ACTIONS(_state_) __do__  
&emsp;&emsp;&emsp;_v_ &larr; MAX(_v_, MIN\-VALUE(RESULT(_state_, _a_), _&alpha;_, _&beta;_))  
&emsp;&emsp;&emsp;__if__ _v_ &ge; _&beta;_ __then return__ _v_  
&emsp;&emsp;&emsp;_&alpha;_ &larr; MAX(_&alpha;_, _v_)  
&emsp;__return__ _v_  

---
__function__ MIN\-VALUE(_state_, _&alpha;_, _&beta;_) __returns__ _a utility value_  
&emsp;__if__ TERMINAL\-TEST(_state_) __then return__ UTILITY(_state_)  
&emsp;_v_ &larr; &plus;&infin;  
&emsp;__for each__ _a_ __in__ ACTIONS(_state_) __do__  
&emsp;&emsp;&emsp;_v_ &larr; MIN(_v_, MAX\-VALUE(RESULT(_state_, _a_), _&alpha;_, _&beta;_))  
&emsp;&emsp;&emsp;__if__ _v_ &le; _&alpha;_ __then return__ _v_  
&emsp;&emsp;&emsp;_&beta;_ &larr; MIN(_&beta;_, _v_)  
&emsp;__return__ _v_  


---
__Figure__ ?? The alpha\-beta search algorithm. Notice that these routines are the same as the MINIMAX functions in Figure ??, except for the two lines in each of MIN\-VALUE and MAX\-VALUE that maintain _&alpha;_ and _&beta;_ (and the bookkeeping to pass these parameters along).

### Implementation 

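Like minimax, we again use max_value and min_value, but this time each call also carries the bounds α and β: α is the best value found so far for MAX along the current path, and β is the best value found so far for MIN. Each function updates its bound as it goes and stops examining a node's remaining children as soon as the node's value can no longer affect the decision (v ≥ β in max_value, v ≤ α in min_value). The algorithm finds the maximum value at the root and returns the move that results in it.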
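To see the pruning at work, the sketch below runs alpha-beta on the Fig 5.2 tree and records which terminal states are actually evaluated. It is a simplified illustration that assumes MAX is the player at the root and that children are visited in dictionary order; alphabeta_sketch and visited are hypothetical names, not part of games.

```python
import math

# Fig 5.2 tree as plain dicts (same data as the moves and utils defined earlier).
MOVES = {
    'A': {'a1': 'B', 'a2': 'C', 'a3': 'D'},
    'B': {'b1': 'B1', 'b2': 'B2', 'b3': 'B3'},
    'C': {'c1': 'C1', 'c2': 'C2', 'c3': 'C3'},
    'D': {'d1': 'D1', 'd2': 'D2', 'd3': 'D3'},
}
UTILS = dict(B1=3, B2=12, B3=8, C1=2, C2=4, C3=6, D1=14, D2=5, D3=2)

visited = []  # terminal states actually evaluated, in order


def max_value(state, alpha, beta):
    if state in UTILS:
        visited.append(state)
        return UTILS[state]
    v = -math.inf
    for child in MOVES[state].values():
        v = max(v, min_value(child, alpha, beta))
        if v >= beta:
            return v  # MIN above would never allow this branch
        alpha = max(alpha, v)
    return v


def min_value(state, alpha, beta):
    if state in UTILS:
        visited.append(state)
        return UTILS[state]
    v = math.inf
    for child in MOVES[state].values():
        v = min(v, max_value(child, alpha, beta))
        if v <= alpha:
            return v  # MAX above already has something at least as good
        beta = min(beta, v)
    return v


def alphabeta_sketch(state):
    """Best action for MAX at `state`, carrying alpha across the root's children."""
    best_action, best_score = None, -math.inf
    for action, child in MOVES[state].items():
        score = min_value(child, best_score, math.inf)
        if score > best_score:
            best_action, best_score = action, score
    return best_action


best = alphabeta_sketch('A')
print(best)
print(visited)
```

Once B has been valued at 3, the first leaf under C (value 2) is already enough to rule C out, so C2 and C3 never appear in visited. How much is pruned depends on the order in which children are examined.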

In [31]:
%psource alphabeta_search

In [32]:
# play fig52 game with alphabeta search 
print("Move 1", alphabeta_search('A', fig52))
print("Move 2", alphabeta_search('B', fig52))
print("Move 3", alphabeta_search('C', fig52))
print("Move 4", alphabeta_search('D', fig52))


Move 1 a1
Move 2 b1
Move 3 c1
Move 4 d3


In [34]:
# visualization 
from notebook import Canvas_alphabeta 
from random import randint 

alphabeta_viz = Canvas_alphabeta('alphabeta_viz', [randint(1, 50) for i in range(27)])

In [35]:
# play Game52

# initialize 
game52 = Fig52Game()

In [37]:
# random player
print("Random Player Move 1:", random_player(game52, 'A'))
print("Random Player Move 2:", random_player(game52, 'A'))

Random Player Move 1: a1
Random Player Move 2: a3


In [40]:
# alphabeta player, gives best possible move 
print("AlphaBeta Player Move 1: ", alphabeta_player(game52, 'A'))
print("AlphaBeta Player Move 2: ", alphabeta_player(game52, 'B'))
print("AlphaBeta Player Move 3: ", alphabeta_player(game52, 'C'))

AlphaBeta Player Move 1:  a1
AlphaBeta Player Move 2:  b1
AlphaBeta Player Move 3:  c1


In [43]:
# minimax explores the full game tree, while alpha-beta prunes it; both should pick the same move
print("Minimax Decision Choice:",minimax_decision('A', game52))
print("Alpha Beta Choice:", alphabeta_search('A', game52))

Minimax Decision Choice: a1
Alpha Beta Choice: a1


In [44]:
# play the game, alphabeta vs. alphabeta 
game52.play_game(alphabeta_player, alphabeta_player)

B1


3

In [45]:
# play the game, human vs. alphabeta. alphabeta plays as MIN
game52.play_game(query_player, alphabeta_player)

current state:
A
available moves: ['a1', 'a2', 'a3']

Your move? a1
B1


3

In [46]:
# play the game, human vs. alphabeta. alphabeta plays as MAX
game52.play_game(alphabeta_player, query_player)

current state:
B
available moves: ['b1', 'b2', 'b3']

Your move? b2
B2


12