# Adversarial search
A two-player, zero-sum, perfect information game is one in which both opponents have all of the information about the state of the game available to them, and any gain in advantage for one is a loss of advantage for the other. Such games include tic-tac-toe, Connect Four, checkers, and chess. In this chapter we will study how to create an artificial opponent that can play such games with great skill. 

## Basic board game components 
Let’s start by defining some simple base classes that capture all of the state our search algorithms will need. Later, we can subclass those base classes for the specific games we are implementing (tic-tac-toe and Connect Four) and feed the subclasses into the search algorithms to make them “play” the games.

In [1]:
from __future__ import annotations

class Piece:
    @property
    def opposite(self) -> Piece:
        raise NotImplementedError("Should be implemented by subclasses.")
        
    def __str__(self) -> str:
        return self.value

In [2]:
from typing import NewType, List
from abc import ABC, abstractmethod

Move = NewType('Move', int)


class Board(ABC):
    @property
    @abstractmethod
    def turn(self) -> Piece:
        ...

    @abstractmethod
    def move(self, location: Move) -> Board:
        ...

    @property
    @abstractmethod
    def legal_moves(self) -> List[Move]:
        ...

    @property
    @abstractmethod
    def is_win(self) -> bool:
        ...

    @property
    def is_draw(self) -> bool:
        return (not self.is_win) and (len(self.legal_moves) == 0)

    @abstractmethod
    def evaluate(self, player: Piece) -> float:
        ...

The Move type will represent a move in a game. It is, at heart, just an integer. In games like tic-tac-toe and Connect Four, an integer can represent a move by indicating a square or column where a piece should be placed. Piece is a base class for a piece on the board in a game. It will also double as our turn indicator. This is why the opposite property is needed. We need to know whose turn follows a given turn. 
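
To make the "just an integer" point concrete, here is a small sketch of how a NewType behaves at runtime. It assumes only the Move definition shown above; the variable name center is purely illustrative.

```python
from typing import NewType

Move = NewType('Move', int)

# NewType gives static type checkers a distinct name to track;
# at runtime, Move(4) simply hands back the int 4.
center: Move = Move(4)
print(center == 4)          # True
print(type(center) is int)  # True
```

Because a Move is an ordinary int at runtime, it can be used directly as a list index, which is how the board classes later in the chapter use it.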

The Board abstract base class is the actual maintainer of state. For any given game that our search algorithms will compute, we need to be able to answer four questions:

- Whose turn is it?
- What legal moves can be played in the current position?
- Is the game won?
- Is the game drawn?

## Managing tic-tac-toe state 
Let’s develop some structures to keep track of the state of a tic-tac-toe game as it progresses.

First, we need a way of representing each square on the tic-tac-toe board. We will use an enum called TTTPiece, a subclass of Piece. A tic-tac-toe piece can be X, O, or empty (represented by E in the enum). 

In [3]:
from __future__ import annotations
from typing import List
from enum import Enum


class TTTPiece(Piece, Enum):
    X = "X"
    O = "O"
    E = " " # stand-in for empty

    @property
    def opposite(self) -> TTTPiece:
        if self == TTTPiece.X:
            return TTTPiece.O
        elif self == TTTPiece.O:
            return TTTPiece.X
        else:
            return TTTPiece.E

The main holder of state will be the class TTTBoard. TTTBoard keeps track of two different pieces of state: the position (represented by a one-dimensional list of nine TTTPiece values, indexed 0 through 8 from left to right, top to bottom) and the player whose turn it is. 

TTTBoard is an informally immutable data structure; TTTBoards should not be modified. Instead, every time a move needs to be played, a new TTTBoard with the position changed to accommodate the move will be generated. This will later be helpful in our search algorithm. When the search branches, we will not inadvertently change the position of a board from which potential moves are still being analyzed. 
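
The reason for copying before changing a position can be seen with plain lists. The sketch below is only an illustration of aliasing versus copying; the variable names are hypothetical and strings stand in for TTTPiece values.

```python
from typing import List

original: List[str] = ["X", " ", " "]

# Aliasing: both names refer to the same list, so the "old" position changes too
alias: List[str] = original
alias[1] = "O"
print(original)  # ['X', 'O', ' ']

# Copying first leaves the source position untouched
original = ["X", " ", " "]
copied: List[str] = original.copy()
copied[1] = "O"
print(original)  # ['X', ' ', ' ']
print(copied)    # ['X', 'O', ' ']
```

A search that explores several moves from the same position relies on the second behavior: each branch gets its own list, and the parent position stays as it was.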

In [4]:
class TTTBoard(Board):
    def __init__(self, position: List[TTTPiece] = [TTTPiece.E] * 9, turn: TTTPiece = TTTPiece.X) -> None:
        self.position: List[TTTPiece] = position
        self._turn: TTTPiece = turn

    @property
    def turn(self) -> Piece:
        return self._turn
    
    def move(self, location: Move) -> Board:
        temp_position: List[TTTPiece] = self.position.copy()
        temp_position[location] = self._turn
        return TTTBoard(temp_position, self._turn.opposite)
    
    # a legal move in tic-tac-toe is any empty square
    @property
    def legal_moves(self) -> List[Move]:
        return [Move(l) for l in range(len(self.position)) if self.position[l] == TTTPiece.E]
    
    @property
    def is_win(self) -> bool:
        # three row, three column, and then two diagonal checks
        return self.position[0] == self.position[1] and self.position[0] == self.position[2] and self.position[0] != TTTPiece.E or \
            self.position[3] == self.position[4] and self.position[3] == self.position[5] and self.position[3] != TTTPiece.E or \
            self.position[6] == self.position[7] and self.position[6] == self.position[8] and self.position[6] != TTTPiece.E or \
            self.position[0] == self.position[3] and self.position[0] == self.position[6] and self.position[0] != TTTPiece.E or \
            self.position[1] == self.position[4] and self.position[1] == self.position[7] and self.position[1] != TTTPiece.E or \
            self.position[2] == self.position[5] and self.position[2] == self.position[8] and self.position[2] != TTTPiece.E or \
            self.position[0] == self.position[4] and self.position[0] == self.position[8] and self.position[0] != TTTPiece.E or \
            self.position[2] == self.position[4] and self.position[2] == self.position[6] and self.position[2] != TTTPiece.E
    
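    def evaluate(self, player: Piece) -> float:
        # move() hands the turn to the other side, so in a won position
        # the player whose turn it is must be the one who lost
        if self.is_win and self.turn == player:
            return -1
        elif self.is_win and self.turn != player:
            return 1
        else:
            return 0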

    def __repr__(self) -> str:
        return f"""
        {self.position[0]}|{self.position[1]}|{self.position[2]}
        -----
        {self.position[3]}|{self.position[4]}|{self.position[5]}
        -----
        {self.position[6]}|{self.position[7]}|{self.position[8]}"""

## Minimax
Minimax is a classic algorithm for finding the best move in a two-player, zero-sum game with perfect information, like tic-tac-toe, checkers, or chess. It has been extended and modified for other types of games as well. Minimax is typically implemented using a recursive function in which each player is designated either the maximizing player or the minimizing player.

The maximizing player aims to find the move that will lead to maximal gains. However, the maximizing player must account for moves by the minimizing player. After each attempt to maximize the gains of the maximizing player, minimax is called recursively to find the opponent’s reply that minimizes the maximizing player’s gains. This continues back and forth (maximizing, minimizing, maximizing, and so on) until a base case in the recursive function is reached. The base case is a terminal position (a win or a draw) or a maximal search depth.

Minimax will return an evaluation of the starting position for the maximizing player. For the evaluate() method of the TTTBoard class, if the best possible play by both sides will result in a win for the maximizing player, a score of 1 will be returned. If the best play will result in a loss, -1 is returned. A 0 is returned if the best play is a draw.

These numbers are returned when a base case is reached. They then bubble up through all of the recursive calls that led to the base case. For each recursive call to maximize, the best evaluations one level further down bubble up. For each recursive call to minimize, the worst evaluations one level further down bubble up. 
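
The bubbling-up can be traced by hand on a tiny tree. The sketch below is not the chapter's minimax(); it is a simplified stand-in in which leaves are evaluations and inner lists are positions with moves still to play. The name minimax_tree and the nested-list representation are assumptions made for illustration.

```python
from typing import List, Union

# A node is either a leaf evaluation or a list of child nodes
Tree = Union[float, List["Tree"]]


def minimax_tree(node: Tree, maximizing: bool) -> float:
    # Base case: a leaf already holds its evaluation
    if not isinstance(node, list):
        return node
    # Recursive case: the other player chooses at the next level down
    child_values: List[float] = [minimax_tree(child, not maximizing) for child in node]
    return max(child_values) if maximizing else min(child_values)


# Three candidate moves for the maximizer, each answered by two possible replies
tree: Tree = [[1, 0], [-1, 1], [0, 0]]
print(minimax_tree(tree, True))  # 0
```

The minimizer turns the three branches into 0, -1, and 0 (the worst outcome in each), and the maximizer then picks the best of those, so the starting position evaluates to a draw.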

In [5]:
# Find the best possible outcome for original player
def minimax(board: Board, maximizing: bool, original_player: Piece, max_depth: int = 8) -> float:
    # Base case – terminal position or maximum depth reached
    if board.is_win or board.is_draw or max_depth == 0:
        return board.evaluate(original_player)

    # Recursive case - maximize your gains or minimize the opponent's gains
    if maximizing:
        best_eval: float = float("-inf") # arbitrarily low starting point
        for move in board.legal_moves:
            result: float = minimax(board.move(move), False, original_player, max_depth - 1)
            best_eval = max(result, best_eval)
        return best_eval
    else: # minimizing
        worst_eval: float = float("inf")
        for move in board.legal_moves:
            result = minimax(board.move(move), True, original_player, max_depth - 1)
            worst_eval = min(result, worst_eval) 
        return worst_eval

Unfortunately, we cannot use our implementation of minimax() as is to find the best move for a given position. It returns an evaluation (a float value). It does not tell us which first move led to that evaluation.

Instead, we will create a helper function, find_best_move(), that loops through calls to minimax() for each legal move in a position to find the move that evaluates to the highest value. You can think of find_best_move() as the first maximizing call to minimax(), but with us keeping track of those initial moves. 

In [6]:
# Find the best possible move in the current position
# looking up to max_depth ahead
def find_best_move(board: Board, max_depth: int = 8) -> Move:
    best_eval: float = float("-inf")
    best_move: Move = Move(-1)
    for move in board.legal_moves:
        result: float = minimax(board.move(move), False, board.turn, max_depth)
        if result > best_eval:
            best_eval = result
            best_move = move
    return best_move

## Testing minimax with tic-tac-toe 

In [7]:
import unittest
from typing import List

class TTTMinimaxTestCase(unittest.TestCase):
    def test_easy_position(self):
        # win in 1 move
        to_win_easy_position: List[TTTPiece] = [TTTPiece.X, TTTPiece.O, TTTPiece.X, 
                                                TTTPiece.X, TTTPiece.E, TTTPiece.O, 
                                                TTTPiece.E, TTTPiece.E, TTTPiece.O]
        test_board1: TTTBoard = TTTBoard(to_win_easy_position, TTTPiece.X)
        answer1: Move = find_best_move(test_board1)
        self.assertEqual(answer1, 6)

    def test_block_position(self):
        # must block O's win
        to_block_position: List[TTTPiece] = [TTTPiece.X, TTTPiece.E, TTTPiece.E, 
                                             TTTPiece.E, TTTPiece.E, TTTPiece.O, 
                                             TTTPiece.E, TTTPiece.X, TTTPiece.O]
        test_board2: TTTBoard = TTTBoard(to_block_position, TTTPiece.X)
        answer2: Move = find_best_move(test_board2)
        self.assertEqual(answer2, 2)

    def test_hard_position(self):
        # find the best move to win in 2 moves
        to_win_hard_position: List[TTTPiece] = [TTTPiece.X, TTTPiece.E, TTTPiece.E, 
                                                TTTPiece.E, TTTPiece.E, TTTPiece.O, 
                                                TTTPiece.O, TTTPiece.X, TTTPiece.E]
        test_board3: TTTBoard = TTTBoard(to_win_hard_position, TTTPiece.X)
        answer3: Move = find_best_move(test_board3)
        self.assertEqual(answer3, 1)
        
unittest.main(argv=[''], verbosity=2, exit=False)

test_block_position (__main__.TTTMinimaxTestCase) ... ok
test_easy_position (__main__.TTTMinimaxTestCase) ... ok
test_hard_position (__main__.TTTMinimaxTestCase) ... ok

----------------------------------------------------------------------
Ran 3 tests in 0.007s

OK



## Developing a tic-tac-toe AI
Because the default max_depth of find_best_move() is 8, this tic-tac-toe AI will always see to the very end of the game. (The maximum number of moves in tic-tac-toe is nine, and the AI goes second.) Therefore, it should play perfectly every time. A perfect game is one in which both opponents play the best possible move every turn. The result of a perfect game of tic-tac-toe is a draw. With this in mind, you should never be able to beat the tic-tac-toe AI. If you play your best, it will be a draw. If you make a mistake, the AI will win. Try it out yourself.

In [8]:
board: Board = TTTBoard()


def get_player_move() -> Move:
    player_move: Move = Move(-1)
    while player_move not in board.legal_moves:
        play: int = int(input("Enter a legal square (0-8):"))
        player_move = Move(play)
    return player_move


# main game loop
while True:
    human_move: Move = get_player_move()
    board = board.move(human_move)
    if board.is_win:
        print("Human wins!")
        break
    elif board.is_draw:
        print("Draw!")
        break
    computer_move: Move = find_best_move(board)
    print(f"Computer move is {computer_move}")
    
    board = board.move(computer_move)
    print(board)
    if board.is_win:
        print("Computer wins!")
        break
    elif board.is_draw:
        print("Draw!")
        break

Enter a legal square (0-8):0
Computer move is 4

        X| | 
        -----
         |O| 
        -----
         | | 
Enter a legal square (0-8):7
Computer move is 3

        X| | 
        -----
        O|O| 
        -----
         |X| 
Enter a legal square (0-8):5
Computer move is 2

        X| |O
        -----
        O|O|X
        -----
         |X| 
Enter a legal square (0-8):6
Computer move is 8

        X| |O
        -----
        O|O|X
        -----
        X|X|O
Enter a legal square (0-8):1
Draw!


## Connect Four 
In Connect Four, two players alternate dropping different-colored pieces in a seven-column, six-row vertical grid. Pieces fall from the top of the grid to the bottom until they hit the bottom or another piece. In essence, the player’s only decision each turn is which of the seven columns to drop a piece into. The player may not drop it into a full column. The first player that has four pieces of their color next to one another with no breaks in a row, column, or diagonal wins. If no player achieves this, and the grid is completely filled, the game is a draw.

Some of the following code will look very familiar, but the data structures and the evaluation method are quite different from tic-tac-toe. Both games are implemented as subclasses of the same base Piece and Board classes you saw at the beginning of the chapter, making minimax() usable for both games.

In [9]:
from typing import List, Optional, Tuple

class C4Piece(Piece, Enum):
    B = "B"
    R = "R"
    E = " " # stand-in for empty

    @property
    def opposite(self) -> C4Piece:
        if self == C4Piece.B:
            return C4Piece.R
        elif self == C4Piece.R:
            return C4Piece.B
        else:
            return C4Piece.E

Next, we have a function for generating all of the potential winning segments in a certain-size Connect Four grid. This function returns a list of lists of grid locations (tuples of column/row combinations). Each list in the list contains four grid locations. We call each of these lists of four grid locations a segment. If any segment from the board is all the same color, that color has won the game. 

In [10]:
def generate_segments(num_columns: int, num_rows: int, segment_length: int) -> List[List[Tuple[int, int]]]:
    
    segments: List[List[Tuple[int, int]]] = []
    # generate the vertical segments
    for c in range(num_columns):
        for r in range(num_rows - segment_length + 1):
            segment: List[Tuple[int, int]] = []
            for t in range(segment_length):
                segment.append((c, r + t))
            segments.append(segment)

    # generate the horizontal segments
    for c in range(num_columns - segment_length + 1):
        for r in range(num_rows):
            segment = []
            for t in range(segment_length):
                segment.append((c + t, r))
            segments.append(segment)

    # generate the bottom left to top right diagonal segments
    for c in range(num_columns - segment_length + 1):
        for r in range(num_rows - segment_length + 1):
            segment = []
            for t in range(segment_length):
                segment.append((c + t, r + t))
            segments.append(segment)

    # generate the top left to bottom right diagonal segments
    for c in range(num_columns - segment_length + 1):
        for r in range(segment_length - 1, num_rows):
            segment = []
            for t in range(segment_length):
                segment.append((c + t, r - t))
            segments.append(segment)
    return segments
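
As a sanity check on the loops above, the number of segments can also be worked out in closed form. This is a sketch, not part of the chapter's code: count_segments is a hypothetical helper, and it assumes the grid is at least segment_length wide and tall.

```python
def count_segments(num_columns: int, num_rows: int, segment_length: int) -> int:
    # Number of valid starting rows and columns for a segment of this length
    row_starts: int = num_rows - segment_length + 1
    column_starts: int = num_columns - segment_length + 1
    vertical: int = num_columns * row_starts
    horizontal: int = column_starts * num_rows
    # Each diagonal direction has the same number of segments
    diagonal: int = column_starts * row_starts
    return vertical + horizontal + 2 * diagonal


print(count_segments(7, 6, 4))  # 69 segments on a standard Connect Four grid
print(count_segments(3, 3, 3))  # 8, the same eight lines TTTBoard.is_win checks
```

For the standard grid that is 21 vertical, 24 horizontal, and 12 segments in each diagonal direction.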

Thinking about the Connect Four board as a group of seven columns is conceptually powerful and makes writing the rest of the C4Board class slightly easier. The Column class is very similar to the Stack class we used in earlier chapters. This makes sense, because conceptually during play, a Connect Four column is a stack that can be pushed to but never popped. But unlike our earlier stacks, a column in Connect Four has an absolute limit of six items.

In [11]:
class Column:
    def __init__(self) -> None:
        self._container: List[C4Piece] = []

    @property
    def full(self) -> bool:
        return len(self._container) == C4Board.NUM_ROWS

    def push(self, item: C4Piece) -> None:
        if self.full:
            raise OverflowError("Trying to push piece to full column")
        self._container.append(item)

    def __getitem__(self, index: int) -> C4Piece:
        if index > len(self._container) - 1:
            return C4Piece.E
        return self._container[index]

    def __repr__(self) -> str:
        return repr(self._container)

    def copy(self) -> Column:
        temp: Column = Column()
        temp._container = self._container.copy()
        return temp

Being able to quickly search all of the segments on the board is useful for both checking whether a game is over (someone has won) and for evaluating a position. 

In [12]:
class C4Board(Board):
    NUM_ROWS: int = 6
    NUM_COLUMNS: int = 7
    SEGMENT_LENGTH: int = 4
    SEGMENTS: List[List[Tuple[int, int]]] = generate_segments(NUM_COLUMNS, NUM_ROWS, SEGMENT_LENGTH)
        
    def __init__(self, position: Optional[List[Column]] = None, turn: C4Piece = C4Piece.B) -> None:
        if position is None:
            self.position: List[Column] = [Column() for _ in range(C4Board.NUM_COLUMNS)]
        else:
            self.position = position
        self._turn: C4Piece = turn

    @property
    def turn(self) -> Piece:
        return self._turn

    def move(self, location: Move) -> Board:
        temp_position: List[Column] = self.position.copy()
        for c in range(C4Board.NUM_COLUMNS):
            temp_position[c] = self.position[c].copy()
        temp_position[location].push(self._turn)
        return C4Board(temp_position, self._turn.opposite)

    @property
    def legal_moves(self) -> List[Move]:
        return [Move(c) for c in range(C4Board.NUM_COLUMNS) if not self.position[c].full]
    
    # Returns the count of black and red pieces in a segment
    def _count_segment(self, segment: List[Tuple[int, int]]) -> Tuple[int, int]:
        black_count: int = 0
        red_count: int = 0
        for column, row in segment:
            if self.position[column][row] == C4Piece.B:
                black_count += 1
            elif self.position[column][row] == C4Piece.R:
                red_count += 1
        return black_count, red_count

    @property
    def is_win(self) -> bool:
        for segment in C4Board.SEGMENTS:
            black_count, red_count = self._count_segment(segment)
            if black_count == 4 or red_count == 4:
                return True
        return False
    
    def _evaluate_segment(self, segment: List[Tuple[int, int]], player: Piece) -> float:
        black_count, red_count = self._count_segment(segment)
        if red_count > 0 and black_count > 0:
            return 0 # mixed segments are neutral
        count: int = max(red_count, black_count)
        score: float = 0
        if count == 2:
            score = 1
        elif count == 3:
            score = 100
        elif count == 4:
            score = 1000000
        color: C4Piece = C4Piece.B
        if red_count > black_count:
            color = C4Piece.R
        if color != player:
            return -score
        return score

    def evaluate(self, player: Piece) -> float:
        total: float = 0
        for segment in C4Board.SEGMENTS:
            total += self._evaluate_segment(segment, player)
        return total

    def __repr__(self) -> str:
        display: str = ""
        for r in reversed(range(C4Board.NUM_ROWS)):
            display += "|"
            for c in range(C4Board.NUM_COLUMNS):
                display += f"{self.position[c][r]}" + "|"
            display += "\n"
        return display

## A Connect Four AI
The same minimax() and find_best_move() functions we developed for tic-tac-toe can be used unchanged with our Connect Four implementation.

In [13]:
board: Board = C4Board()

def get_player_move() -> Move:
    player_move: Move = Move(-1)
    while player_move not in board.legal_moves:
        play: int = int(input("Enter a legal column (0-6):"))
        player_move = Move(play)
    return player_move


# main game loop
while True:
    human_move: Move = get_player_move()
    board = board.move(human_move)
    if board.is_win:
        print("Human wins!")
        break
    elif board.is_draw:
        print("Draw!")
        break
    computer_move: Move = find_best_move(board, 3)
    print(f"Computer move is {computer_move}")
    board = board.move(computer_move)
    print(board)
    if board.is_win:
        print("Computer wins!")
        break
    elif board.is_draw:
        print("Draw!")
        break

Enter a legal column (0-6):3
Computer move is 1
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| |R| |B| | | |

Enter a legal column (0-6):1
Computer move is 3
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| |B| |R| | | |
| |R| |B| | | |

Enter a legal column (0-6):1
Computer move is 1
| | | | | | | |
| | | | | | | |
| |R| | | | | |
| |B| | | | | |
| |B| |R| | | |
| |R| |B| | | |

Enter a legal column (0-6):3
Computer move is 3
| | | | | | | |
| | | | | | | |
| |R| |R| | | |
| |B| |B| | | |
| |B| |R| | | |
| |R| |B| | | |

Enter a legal column (0-6):0
Computer move is 4
| | | | | | | |
| | | | | | | |
| |R| |R| | | |
| |B| |B| | | |
| |B| |R| | | |
|B|R| |B|R| | |

Enter a legal column (0-6):5
Computer move is 4
| | | | | | | |
| | | | | | | |
| |R| |R| | | |
| |B| |B| | | |
| |B| |R|R| | |
|B|R| |B|R|B| |

Enter a legal column (0-6):4
Computer move is 4
| | | | | | | |
| | | | | | | |
| |R| |R|R| | |
| |B| |B|B| | |
| |B| |R|R| | |
|B

## Improving minimax with alpha-beta pruning 
Minimax works well, but we are not getting a very deep search at present. There is a small extension to minimax, known as alpha-beta pruning, that can improve search depth by excluding positions in the search that will not result in improvements over positions already searched. This magic is accomplished by keeping track of two values between recursive minimax calls: alpha and beta. Alpha represents the evaluation of the best maximizing move found up to this point in the search tree, and beta represents the evaluation of the best minimizing move found so far for the opponent. If beta is ever less than or equal to alpha, it’s not worth further exploring this branch of the search, because a better or equivalent move has already been found than what will be found farther down this branch. Unlike a heuristic, this pruning never changes the evaluation minimax would return for the starting position; it only decreases the search space, often significantly. 

In [14]:
def alphabeta(board: Board, maximizing: bool, original_player: Piece, max_depth: int = 8, 
              alpha: float = float("-inf"), beta: float = float("inf")) -> float:
    # Base case – terminal position or maximum depth reached
    if board.is_win or board.is_draw or max_depth == 0:
        return board.evaluate(original_player)

    # Recursive case - maximize your gains or minimize the opponent's gains
    if maximizing:
        for move in board.legal_moves:
            result: float = alphabeta(board.move(move), False, original_player, max_depth - 1, alpha, beta)
            alpha = max(result, alpha)
            if beta <= alpha:
                break
        return alpha
    else:  # minimizing
        for move in board.legal_moves:
            result = alphabeta(board.move(move), True, original_player, max_depth - 1, alpha, beta)
            beta = min(result, beta)
            if beta <= alpha:
                break
        return beta
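
To see the pruning happen, it helps to run the same alpha/beta bookkeeping on a tiny hand-built tree and record which leaves are actually evaluated. The sketch below is a simplified stand-in for alphabeta(), not the chapter's function: alphabeta_tree, the nested-list tree, and the visited list are all assumptions made for illustration.

```python
from typing import List


def alphabeta_tree(node, maximizing: bool, alpha: float, beta: float,
                   visited: List[float]) -> float:
    # Base case: a leaf holds its evaluation; record that it was looked at
    if not isinstance(node, list):
        visited.append(node)
        return node
    if maximizing:
        for child in node:
            alpha = max(alpha, alphabeta_tree(child, False, alpha, beta, visited))
            if beta <= alpha:
                break  # the minimizer already has a better option elsewhere
        return alpha
    else:
        for child in node:
            beta = min(beta, alphabeta_tree(child, True, alpha, beta, visited))
            if beta <= alpha:
                break  # the maximizer already has a better option elsewhere
        return beta


# Three candidate moves for the maximizer, each answered by two possible replies
tree = [[3, 5], [2, 9], [0, 1]]
visited: List[float] = []
value: float = alphabeta_tree(tree, True, float("-inf"), float("inf"), visited)
print(value)    # 3
print(visited)  # [3, 5, 2, 0]
```

The first branch establishes alpha as 3. In the second branch the reply worth 2 is already no better than 3 for the maximizer, so the leaf 9 is never examined; the same happens with the leaf 1 in the third branch. Four of the six leaves are evaluated, and the result matches plain minimax.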

Change find_best_move() to use alphabeta() instead of minimax(), and change the search depth passed to find_best_move() in the Connect Four game loop (connectfour_ai.py) from 3 to 5. With these changes, your average Connect Four player will not be able to beat our AI. 
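
One way the change to find_best_move() might look is sketched below. So that the block runs on its own, alphabeta() is repeated from the cell above without the Board and Piece annotations, and a hypothetical toy game, PileGame, stands in for a real board; PileGame is not part of the chapter and exists only to exercise the search.

```python
from typing import List


class PileGame:
    """Hypothetical toy game: players alternately remove 1 or 2 counters,
    and whoever takes the last counter wins."""

    def __init__(self, pile: int, turn: int = 0) -> None:
        self.pile: int = pile
        self.turn: int = turn  # 0 or 1, standing in for a Piece

    def move(self, location: int) -> "PileGame":
        return PileGame(self.pile - location, 1 - self.turn)

    @property
    def legal_moves(self) -> List[int]:
        return [n for n in (1, 2) if n <= self.pile]

    @property
    def is_win(self) -> bool:
        return self.pile == 0  # the player who just moved took the last counter

    @property
    def is_draw(self) -> bool:
        return False  # this toy game cannot be drawn

    def evaluate(self, player: int) -> float:
        # Same convention as TTTBoard: in a won position, the side to move lost
        if self.is_win:
            return -1 if self.turn == player else 1
        return 0


def alphabeta(board, maximizing: bool, original_player, max_depth: int = 8,
              alpha: float = float("-inf"), beta: float = float("inf")) -> float:
    if board.is_win or board.is_draw or max_depth == 0:
        return board.evaluate(original_player)
    if maximizing:
        for move in board.legal_moves:
            result = alphabeta(board.move(move), False, original_player, max_depth - 1, alpha, beta)
            alpha = max(result, alpha)
            if beta <= alpha:
                break
        return alpha
    else:
        for move in board.legal_moves:
            result = alphabeta(board.move(move), True, original_player, max_depth - 1, alpha, beta)
            beta = min(result, beta)
            if beta <= alpha:
                break
        return beta


def find_best_move(board, max_depth: int = 8):
    best_eval: float = float("-inf")
    best_move = -1
    for move in board.legal_moves:
        # The only change from the minimax version is the function called here
        result: float = alphabeta(board.move(move), False, board.turn, max_depth)
        if result > best_eval:
            best_eval = result
            best_move = move
    return best_move


print(find_best_move(PileGame(4)))  # 1, leaving the opponent a pile of 3
print(find_best_move(PileGame(5)))  # 2, leaving the opponent a pile of 3
```

Each top-level call starts with the full alpha/beta window, so the evaluation of every first move is the same one minimax() would have produced; only the amount of searching changes.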