# Nim Policy Search

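Nim is a simple game where two players take turns removing objects from one of several rows. The player who removes the last object loses (the misère variant). The game is described in detail [here](https://en.wikipedia.org/wiki/Nim). The mathematical strategy for Nim is built on the nim-sum, the bitwise XOR of the row sizes: whenever possible, you move so that the opponent is left with a nim-sum of zero. Equivalently, if each row is split into groups of 1, 2 and 4 objects, a nim-sum of zero means every group size occurs an even number of times.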
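
As a small worked example of the nim-sum idea, the sketch below computes the nim-sum of a three-row board and looks for a move that brings it to zero. The helper names `nim_sum` and `zeroing_move` are hypothetical and exist only for this illustration; they work on a plain list of row sizes rather than on the `Nim` class used later in the notebook.

```python
from functools import reduce
from operator import xor


def nim_sum(rows):
    """Bitwise XOR of all row sizes."""
    return reduce(xor, rows, 0)


def zeroing_move(rows):
    """Return (row_index, num_objects) that leaves a nim-sum of zero, or None."""
    total = nim_sum(rows)
    for i, size in enumerate(rows):
        target = size ^ total  # size this row must shrink to
        if target < size:
            return i, size - target
    return None  # the nim-sum is already zero, no zeroing move exists


rows = [1, 3, 5]
print(nim_sum(rows))       # 1 ^ 3 ^ 5 = 7
print(zeroing_move(rows))  # (2, 3): take 3 from the last row, leaving [1, 3, 2]
```

After that move the rows are `[1, 3, 2]`, whose nim-sum is `1 ^ 3 ^ 2 = 0`.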

In this notebook, we will play Nim using the following agents:
1. An agent using fixed rules based on nim-sum
2. An agent using evolved rules
3. An agent using minimax
4. An agent using reinforcement learning

> Sidharrth Nagappan, 2022

In [1]:
import logging
import random

In [2]:
class Nim:
    def __init__(self, num_rows: int, k: int = None):
        self.num_rows = num_rows
        self._k = k
        self._rows = [i*2+1 for i in range(num_rows)]

    def nimming_remove(self, row: int, num_objects: int):
        assert self._rows[row] >= num_objects
        assert self._k is None or num_objects <= self._k
        self._rows[row] -= num_objects
    
    def nimming_add(self, row: int, num_objects: int):
        # a row can never hold more than its initial size, row*2+1 (see __init__)
        assert self._rows[row] + num_objects <= row * 2 + 1
        assert self._k is None or num_objects <= self._k
        self._rows[row] += num_objects
    
    def goal(self) -> bool:
        return sum(self._rows) == 0

In [None]:
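class ExpertFixedRuleAgent:
    '''
    Play the game of Nim using a fixed rule (always leave a nim-sum of zero at the end of the turn).
    The optional per-move limit k is not taken into account.
    '''
    def nim_sum(self, nim: Nim):
        # the nim-sum is the bitwise XOR of all row sizes
        result = 0
        for r in nim._rows:
            result ^= r
        return result
    
    def play(self, nim: Nim):
        # remove objects from a row to make the nim-sum 0
        nim_sum = self.nim_sum(nim)
        for i, row in enumerate(nim._rows):
            if row ^ nim_sum < row:
                nim.nimming_remove(i, row - (row ^ nim_sum))
                return
        # the nim-sum is already 0, so no zeroing move exists:
        # fall back to taking one object from the first non-empty row
        for i, row in enumerate(nim._rows):
            if row > 0:
                nim.nimming_remove(i, 1)
                return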
    
    # remove objects to leave nim-sum at the end of turn
    # loop over rows and split objects into 1, 2, 4
    # remove left over objects
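
Because the introduction uses the rule that the player who removes the last object loses, the plain "leave a nim-sum of zero" rule needs an adjustment once every row has at most one object left. The sketch below is not part of the original agents: it brute-forces small boards and compares the result with the standard misère rule. The function names `mover_wins` and `rule_says_mover_wins` are hypothetical, and the check assumes there is no per-move limit `k`.

```python
from functools import lru_cache, reduce
from itertools import product
from operator import xor


@lru_cache(maxsize=None)
def mover_wins(rows: tuple) -> bool:
    """Exhaustive search: True if the player to move can force a win
    when taking the last object loses."""
    if sum(rows) == 0:
        # the previous player took the last object and therefore lost
        return True
    for i, size in enumerate(rows):
        for take in range(1, size + 1):
            child = tuple(sorted(rows[:i] + (size - take,) + rows[i + 1:]))
            if not mover_wins(child):
                return True
    return False


def rule_says_mover_wins(rows: tuple) -> bool:
    """Misère rule: use the nim-sum while some row has 2 or more objects,
    otherwise count the remaining single-object rows."""
    if max(rows) <= 1:
        # only rows of size 0 or 1 remain: the mover wins with an even count
        return sum(rows) % 2 == 0
    return reduce(xor, rows, 0) != 0


# compare both answers on every 3-row board with up to 4 objects per row
mismatches = [
    rows
    for rows in product(range(5), repeat=3)
    if mover_wins(tuple(sorted(rows))) != rule_says_mover_wins(rows)
]
print(mismatches)
```

An empty list of mismatches means the rule agrees with the exhaustive search on all of these small boards. Larger boards would need a bigger search, which grows quickly.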
        