# Lab 3: Policy Search

## Task

Write agents able to play [*Nim*](https://en.wikipedia.org/wiki/Nim), with an arbitrary number of rows and an upper bound $k$ on the number of objects that can be removed in a turn (a.k.a., *subtraction game*).

The goal of the game is to **avoid** taking the last object.

* Task3.1: An agent using fixed rules based on *nim-sum* (i.e., an *expert system*)
* Task3.2: An agent using evolved rules
* Task3.3: An agent using minmax
* Task3.4: An agent using reinforcement learning

## Instructions

* Create the directory `lab3` inside the course repo 
* Put a `README.md` and your solution (all the files, code and auxiliary data if needed)

## Notes

* Working in group is not only allowed, but recommended (see: [Ubuntu](https://en.wikipedia.org/wiki/Ubuntu_philosophy) and [Cooperative Learning](https://files.eric.ed.gov/fulltext/EJ1096789.pdf)). Collaborations must be explicitly declared in the `README.md`.
* [Yanking](https://www.emacswiki.org/emacs/KillingAndYanking) from the internet is allowed, but sources must be explicitly declared in the `README.md`.

**Deadline**

T.b.d.


In [60]:
import logging
import operator
import functools
import random

In [61]:
class Nim:
    def __init__(self, num_rows: int, k: int = None) -> None:
        self._rows = [i*2 + 1 for i in range(num_rows)]
        self.max = self._rows[-1]
        self._k = k

    def nimming(self, row: int, num_objects: int) -> None:
        assert self._rows[row] >= num_objects
        assert self._k is None or num_objects <= self._k
        self._rows[row] -= num_objects
        if sum(self._rows) == 0:
            logging.info("Yeuch")

    def __str__(self) -> str:
        return "\n".join([" "*((self.max - x)//2) + "|"*x for x in self._rows])

    def nim_sum(self) -> int:
        return functools.reduce(operator.xor, self._rows, 0)

    def end_game(self) -> bool:
        return sum(self._rows) == 1

    def computer_move(self) -> tuple[int, int]:
        nim_sum = self.nim_sum()
        if nim_sum != 0:
            logging.debug("Playing optimally")
            for i in range(len(self._rows)):
                if ((self._rows[i] ^ nim_sum) < self._rows[i]):
                    return i, self._rows[i] ^ nim_sum
        else:
            # random chose a non-empty pile and remove it
            logging.debug("Playng randomly")
            non_zero_indices = [None for _ in self._rows]
            count = 0
            for i in range(len(self._rows)):
                if self._rows[i] > 0:
                    non_zero_indices[count] = i
                    count += 1
            rows = int(random.random() * count)
            k = 1 + int(random.random() * self._rows[rows])
            return rows, k


In [62]:
game = Nim(3)
TURN = 0

print("Welcome to Nim game!")
while not game.end_game():
    print(game)
    if TURN == 0:
        row, k = game.computer_move()
        game.nimming(row, k)
        print(f"PC removed {k} element{'s'*(k > 1)} from row {row}")
        TURN = 1
    else:
        row = int(input("Please insert a row: ")) - 1
        k = int(input("Please insert how many element: "))
        game.nimming(row, k)
        print(f"Removed {k} element{'s'*(k > 1)} from row {row}")
        TURN = 0
    print(f"{'-'*10}")



Welcome to Nim game!
  |
 |||
|||||
PC removed 2 elements from row 2
----------
  |
 |||
 |||
Removed 1 element from row 0
----------
  
 |||
 |||


AssertionError: 