![alt text](https://zewailcity.edu.eg/main/images/logo3.png)

_Prepared by_  [**Muhammad Hamdy AlAref**](mailto:malaref@zewailcity.edu.eg)

# Adversarial Search

Adversarial search algorithms are algorithms that try solving problems where other agents exist and may plan against us; also referred to as *games*!

## Game Formulation

Game formulation is very similar to *problem formulation* that we discussed before. This time with multiple agents (players)!

In [0]:
class Game:
    '''
    Abstract game class for game formulation.
    It declares the expected methods to be used by an adversarial search algorithm.
    All the methods declared are just placeholders that throw errors if not overriden by child "concrete" classes!
    '''
    
    def __init__(self):
        '''Constructor that initializes the game. Typically used to setup the initial state, number of players and, if applicable, the terminal states and their utilities.'''
        self.init_state = None
    
    def player(self, state):
        '''Returns the player whose turn it is.'''
        raise NotImplementedError
    
    def actions(self, state):
        '''Returns an iterable with the applicable actions to the given state.'''
        raise NotImplementedError
    
    def result(self, state, action):
        '''Returns the resulting state from applying the given action to the given state.'''
        raise NotImplementedError
    
    def terminal_test(self, state):
        '''Returns whether or not the given state is a terminal state.'''
        raise NotImplementedError
    
    def utility(self, state, player):
        '''Returns the utility of the given state for the given player, if possible (usually, it has to be a terminal state).'''
        raise NotImplementedError

## Example: Tic-Tac-Toe

Let's try formulating the quite well-known game [tic-tac-toe](https://en.wikipedia.org/wiki/Tic-tac-toe)!

In [0]:
from enum import Enum

class TicTacToe(Game):
    '''Tic-tac-toe game formulation.'''
    
    class Players(Enum):
        '''Enum with the players in tic-tac-toe.'''
        X = 'X'
        O = 'O'
    
    def _won(self, state, player):
        '''Auxiliary function for checking if a player has won.'''
        return any(all(state[0][i][j] is player for i in range(3)) for j in range(3)) \
            or any(all(state[0][j][i] is player for i in range(3)) for j in range(3)) \
            or all(state[0][i][i] is player for i in range(3))
    
    def __init__(self):
        self.init_state = ((None,) * 3,) * 3, None
    
    def player(self, state):
        return TicTacToe.Players.O if state[1] else TicTacToe.Players.X
    
    def actions(self, state):
        return ((i, j) for i, row in enumerate(state[0]) for j, player in enumerate(row) if not player)
    
    def result(self, state, action):
        mutable_grid = list(state[0])
        mutable_row = list(mutable_grid[action[0]])
        mutable_row[action[1]] = self.player(state)
        mutable_grid[action[0]] = tuple(mutable_row)
        return tuple(mutable_grid), not state[1]
    
    def terminal_test(self, state):
        return all(state[0][i][j] is not None for i in range(3) for j in range(3)) or any(self._won(state, player) for player in TicTacToe.Players)
    
    def utility(self, state, player):
        for p in TicTacToe.Players:
            if self._won(state, p):
                return 1 if p is player else -1
        return 0

## Minimax Algorithm

Minimax algorithm is a recursive algorithm that returns the optimal move, provided the players play *optimally*, by doing a DFS search on the game tree.

In [0]:
from math import inf

def minimax(game, state):
    '''Minimax implementation.'''
    player = game.player(state)
    def max_value(state):
        if game.terminal_test(state): return game.utility(state, player)
        maxi = -inf
        for action in game.actions(state):
            maxi = max(maxi, min_value(game.result(state, action)))
        return maxi
    def min_value(state):
        if game.terminal_test(state): return game.utility(state, player)
        mini = +inf
        for action in game.actions(state):
            mini = min(mini, max_value(game.result(state, action)))
        return mini
    return max(((min_value(game.result(state, action)), action) for action in game.actions(state)), key=lambda entry: entry[0])[1]

## Alpha-Beta Pruning

Alpha-beta pruning is a modification to the minimax algorithm that prunes sub-trees that won't affect the decision making for a great speed-up!

In [0]:
from math import inf

def alpha_beta(game, state):
    '''Alpha-Beta Pruning implementation.'''
    player = game.player(state)
    def max_value(state, alpha, beta):
        if game.terminal_test(state): return game.utility(state, player)
        maxi = -inf
        for action in game.actions(state):
            maxi = max(maxi, min_value(game.result(state, action), alpha, beta))
            alpha = max(alpha, maxi)
            if alpha >= beta: return maxi
        return maxi
    def min_value(state, alpha, beta):
        if game.terminal_test(state): return game.utility(state, player)
        mini = +inf
        for action in game.actions(state):
            mini = min(mini, max_value(game.result(state, action), alpha, beta))
            beta = min(beta, mini)
            if alpha >= beta: return mini
        return mini
    return max(((min_value(game.result(state, action), -inf, +inf), action) for action in game.actions(state)), key=lambda entry: entry[0])[1]

In [0]:
game = TicTacToe()

In [0]:
state = game.init_state
while(not game.terminal_test(state)):
    action = minimax(game, state)
    assert action == alpha_beta(game, state)
    state = game.result(state, action)
state

(((<Players.X: 'X'>, <Players.X: 'X'>, <Players.O: 'O'>),
  (<Players.X: 'X'>, <Players.O: 'O'>, <Players.X: 'X'>),
  (<Players.O: 'O'>, <Players.O: 'O'>, <Players.X: 'X'>)),
 True)

## Performance Comparison

Now, let's compare the performance of alpha-beta pruning against the original minimax algorithm!

In [0]:
%prun print(minimax(game, game.init_state))

(0, 0)
 

In [0]:
%prun print(alpha_beta(game, game.init_state))

(0, 0)
 

## Requirement

Let's re-solve the tic-tac-toe game with heuristics!

You are required to write Python code that implements the H-minimax (or H-alpha-beta) algorithm and apply it to the tic-tac-toe game and compare its performance with its regular implementation!

**HINT:** You will need to edit (or inherit from) the `TicTacToe` class. You may use the heuristic evaluation function from exercise 3! $$Eval(s) = 3X_2(s)+X_1(s)-(3O_2(s) + O_1(s))$$

**Estimated time for this exercise is 30 minutes!**