# Intelligent Systems 2023: Practical Assignment 11

## Machine Learning Agents

Your name:

Your VUnetID:

If you do not provide your name and VUnetID we will not accept your submission.

### Preliminaries

At the end of this exercise you should be able to work with some basic machine learning concepts, and implement and evaluate a learning-based approach to playing Schnapsen. In this notebook we are going to create an adaptive bot. We will use the principle discussed in the machine learning lecture, but now in an agent setting. This comes down to using basic hill-climbing search, but learning the heuristic function rather than implementing it by hand. This will require a few basic ingredients.

Feature vectors were discussed in the lecture. Didn't get it, or working ahead? See
https://brilliant.org/wiki/feature-vector/
https://www.youtube.com/watch?v=3Vy47dbI708

### Practicalities

Follow this Notebook step-by-step. For this course it is necessary that you manipulate the Python programs we provide. You can do the exercises in any programming editor of your liking. Still, please fill in the questions in this notebook as usual. You can also run tournaments in it if you want, but running them in your editor or via the command line is much more convenient.

Please use your studentID+Assignment6.ipynb as the name of the Notebook, and fill in the missing cells.

Note: unlike the courses dedicated to programming, we will not evaluate the style of the programs. We will, however, test your programs on other data that we provide, and your program should give the correct output for the test data as well.

As mentioned, the assignment is graded as pass/fail. To pass, you need to have either fully working code or, for the tasks that you were unable to complete, an explanation of what you tried and what didn't work (you can use multi-line comments or a text cell).

## Initialising

First, we have to install the schnapsen python package.
Run the below code cell.
After running the cell, the schnapsen GitHub repository is cloned into your current directory.
You can find the new directory created with the name `schnapsen`.
The detailed installation instructions can be found in the [README.md](https://github.com/intelligent-systems-course/schnapsen) of the repo.


In [None]:
# If you are on a UNIX system (Linux or Mac OS)
!pip uninstall schnapsen -y && rm -rf schnapsen && git clone https://github.com/intelligent-systems-course/schnapsen && cd schnapsen && pip install -e . && cd ..

In [None]:
# If you are on Windows
!pip uninstall schnapsen -y && rd /s /q schnapsen && git clone https://github.com/intelligent-systems-course/schnapsen && cd schnapsen && pip install -e . && cd ..

## (Supervised) machine learning

So far, we've been trying to come up with bots that rely on heuristics and search
algorithms, e.g., Minimax, Monte-Carlo Tree Search, etc. This assignment, instead, will
focus on learning-based methods, where the agent, given a state, learns what moves 
(actions) lead to winning. You might have heard of [AlphaGo](https://en.wikipedia.org/wiki/AlphaGo),
which actually combines both a Monte Carlo tree search algorithm and machine learning.
Schnapsen is no different from the game of Go, except that its state and action spaces are
much smaller, and it also involves an imperfect-information phase.

Typically, reinforcement learning (RL), a branch of machine learning, is used for these
kinds of board games. However, RL is out of scope for this assignment. Therefore,
we will do something atypical: we'll use supervised learning (SL) instead of RL.

This will still give you a glimpse of machine learning, and hopefully in the future,
you'll understand why RL is more suitable for this kind of task, and come up with your
own RL algorithm to solve it.

SL typically includes three big steps:

1. Collect data.
2. Train an ML model with the collected data.
3. Run and evaluate the trained model.

All the functions and classes we need are included in [`ml_bot.py`](./ml_bot.py). 
Let's go through them one by one. 

### 1. Collect data

We'll collect supervised data, meaning that every $x$ has its label $y$.

For example, if $x$ is an image of a dog, then $y$ is 0, and if $x$ is an image of a cat,
then $y$ is 1.

In our case, $x$ is what the agent sees + the moves that the leader and the follower take, 
and $y$ is 0 if this move resulted in losing, and 1 if it resulted in winning.

It's easy for us to express "what the agent sees + the moves that the leader and the follower take" in human language, but this is not easy for the machine to understand. Therefore, we'll
have to convert it to numbers, e.g., 0's and 1's, so that the machine can work with it.
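
To make the idea of "converting to numbers" concrete, here is a minimal sketch of one-hot encoding a single card. The vocabularies `SUITS` and `RANKS` and the helper names `one_hot` and `encode_card` are assumptions made up for this illustration; the actual feature layout is whatever `ml_bot.py` defines and may differ.

```python
from typing import List

# Hypothetical vocabularies for illustration only; the real layout lives in ml_bot.py.
SUITS = ["HEARTS", "CLUBS", "SPADES", "DIAMONDS"]
RANKS = ["ACE", "TEN", "KING", "QUEEN", "JACK"]


def one_hot(value: str, vocabulary: List[str]) -> List[int]:
    """Return a list with a single 1 at the position of `value` in `vocabulary`."""
    return [1 if item == value else 0 for item in vocabulary]


def encode_card(rank: str, suit: str) -> List[int]:
    """Concatenate the one-hot encodings of rank and suit into one feature vector."""
    return one_hot(rank, RANKS) + one_hot(suit, SUITS)


print(encode_card("KING", "SPADES"))  # [0, 0, 1, 0, 0, 0, 0, 1, 0]
```

Each category gets its own position, so the model never has to interpret an arbitrary ordering such as "KING = 3".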

#### Task 1:

The function `def create_state_and_actions_vector_representation()` in [`ml_bot.py`](./ml_bot.py)
does this job. This function takes three arguments, `perspective`, `leader_move`, and `follower_move`, and returns them as a vector of numbers so that the machine can understand them.

The three variables `player_game_state_representation`, `leader_move_representation`, and `follower_move_representation` are marked with `???`. You have to call the necessary 
functions with the correct arguments to get the correct values (features) for them.

Write them down in the below cell.

In [None]:
MyCode1 = """
Write your code here.
"""

#### Task 2

Now that the function `def create_state_and_actions_vector_representation()` is ready to use,
we will collect data by calling the function `def create_replay_memory_dataset()`.

Simply running the below cell won't work, since this function takes two arguments:
`bot1` and `bot2`.
Try to understand the function by looking at the docstring, and run the function by
specifying the bots.
You can provide the function with any bots, but let us start with `RandBot`.
See the previous assignments for how to instantiate the `Bot` objects.

Write your code in `MyCode2`.

In [None]:
import random
from schnapsen.bots import RandBot
from ml_bot import create_replay_memory_dataset

bot1 = ???
bot2 = ???

create_replay_memory_dataset(bot1=bot1, bot2=bot2)

In [None]:
MyCode2 = """
Write your code here.
"""

The offline data is now collected and saved at `./ML_replay_memories/random_random_10k_games.txt`.

#### Task 3

Let's see what it looks like. Every row is a pair of $(x^{(i)}, y^{(i)})$, where 
$x^{(i)}$ is "what the agent sees + the moves that the leader and the follower take",
and $y^{(i)}$ is a label. We often call $x^{(i)}$ a feature and $y^{(i)}$
a target.
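
As a rough sketch of how one such row could be read back into a pair $(x^{(i)}, y^{(i)})$, the snippet below assumes that a row has the form `f1, f2, ... || label` with integer values. The helper name `parse_row` and the example row are made up for illustration; check the actual file and `ml_bot.py` for the real format.

```python
from typing import List, Tuple


def parse_row(row: str) -> Tuple[List[int], int]:
    """Split one text row into (features, label), assuming 'f1, f2, ... || label'."""
    feature_part, label_part = row.split("||")
    # Skip empty tokens so a trailing comma does not break the parsing.
    features = [int(token) for token in feature_part.split(",") if token.strip()]
    label = int(label_part.strip())
    return features, label


# A made-up row, not taken from the real dataset.
example_row = "0, 1, 0, 0, 1 || 1"
features, label = parse_row(example_row)
print(features, label)  # [0, 1, 0, 0, 1] 1
```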

As you can see, they are now all numbers that the machine can understand, but they are
perhaps not so easy for you as a human to read, since the feature extraction (converting to 
numeric values) was done by us, not by you. You can go through the code and try to
understand how the feature extraction was done. In the below cell, explain (1) what 
the numbers before and after `||` are, and (2) what these numbers mean.

In [None]:
MyReport1 = """
Write your report here.
"""

### 2. Train an ML model with the collected data.

Now we will train an ML model! Our model $f_{\boldsymbol{\theta}}(x) = \hat{y}$ is a function that
maps $x$ to $\hat{y}$, where $\boldsymbol{\theta}$ are learnable parameters. As all of our target 
values $y$ are categorical, we will treat this as a classification problem
rather than a regression problem. There are numerous machine learning models we can use
for a classification problem, but in this assignment we will only try two and compare 
them.
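
To illustrate what "learnable parameters" means, here is a small self-contained sketch of a binary classifier (logistic regression trained with stochastic gradient descent) on made-up toy data. It is not the model used in `ml_bot.py`; the function names, the learning rate, and the number of epochs are assumptions chosen for this example only.

```python
import math
from typing import List, Tuple


def sigmoid(z: float) -> float:
    """Squash a real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(theta: List[float], bias: float, x: List[float]) -> float:
    """f_theta(x): estimated probability that the label of x is 1."""
    return sigmoid(sum(w * xi for w, xi in zip(theta, x)) + bias)


def train(xs: List[List[float]], ys: List[int],
          learning_rate: float = 0.5, epochs: int = 500) -> Tuple[List[float], float]:
    """Fit theta and bias by repeatedly nudging them against the prediction error."""
    theta = [0.0] * len(xs[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            error = predict_proba(theta, bias, x) - y
            theta = [w - learning_rate * error * xi for w, xi in zip(theta, x)]
            bias -= learning_rate * error
    return theta, bias


# Toy data (made up): the label simply equals the first feature.
xs = [[0, 0], [0, 1], [1, 0], [1, 1]]
ys = [0, 0, 1, 1]
theta, bias = train(xs, ys)
predictions = [round(predict_proba(theta, bias, x)) for x in xs]
print(predictions)
```

After training, the weight on the first feature is large and positive while the weight on the second stays comparatively small, which is how the model "learns" which inputs matter for the label.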

#### Task 4. 

The below cell will run training, but you have to give it an argument first. Look at 
the code and explain (1) what arguments it takes, and (2) what the differences 
between these arguments are.

In [None]:
from ml_bot import train_model

train_model()

In [None]:
MyReport2 = """
Write your report here.
"""

### 3. Run and evaluate the trained model.

#### Task 5

Your model, depending on what argument you gave above, is now trained and saved at
`./ML_models/simple_model`. It's time for us to load this model and put it to the test.
We'll first let our ML model play against a `RandBot`. 

The below cell is not complete. The classes `MLPlayingBot` and `RandBot` take arguments
that are not given yet. Specify the arguments, and copy the code into `MyCode3`.

In [None]:
import pathlib
import random

from schnapsen.game import SchnapsenGamePlayEngine
from schnapsen.bots import RandBot

from ml_bot import MLPlayingBot

engine = SchnapsenGamePlayEngine()
model_dir = "ML_models"
model_name = "simple_model"
model_location = pathlib.Path(model_dir) / model_name

bot1 = MLPlayingBot()
bot2 = RandBot()

winner_id, game_points, score = engine.play_game(bot1, bot2, random.Random(0))
print(f"Game ended. Winner is {winner_id} with {game_points} points and {score}")

In [None]:
MyCode3 = """
import pathlib
import random

from schnapsen.game import SchnapsenGamePlayEngine
from schnapsen.bots import RandBot

from ml_bot import MLPlayingBot

engine = SchnapsenGamePlayEngine()
model_dir = "ML_models"
model_name = "simple_model"
model_location = pathlib.Path(model_dir) / model_name

bot1 = MLPlayingBot(model_location=model_location, name="MLBot")
bot2 = RandBot(random.Random(42), name="RandBot")

winner_id, game_points, score = engine.play_game(bot1, bot2, random.Random(0))
print(f"Game ended. Winner is {winner_id} with {game_points} points and {score}")
"""

#### Task 6

The above cell only runs one game of MLBot vs. RandBot. As you have learned already, since 
there is always randomness involved, we should play this game multiple times and compare
the results. In the below cell, write code for a round-robin tournament where $N$ bots
play against each other. One of them has to be an MLBot, and the other $N-1$ can be any
bots that you have learned about in this course. $N$ should be higher than 2. You can use 
the code from the previous assignments.
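
As a starting point, the pairing logic of a round-robin tournament can be sketched without any game engine. In the snippet below, `play_stub_game` is a hypothetical stand-in that picks a winner by coin flip, and the bot names are plain strings; in your own solution you would replace both with real bots and real games.

```python
import itertools
import random
from collections import Counter
from typing import Callable, Dict, List


def play_stub_game(bot_a: str, bot_b: str, rng: random.Random) -> str:
    """Hypothetical stand-in for a real game: picks the winner by coin flip."""
    return bot_a if rng.random() < 0.5 else bot_b


def round_robin(bot_names: List[str], games_per_pair: int, rng: random.Random,
                play_game: Callable[[str, str, random.Random], str] = play_stub_game,
                ) -> Dict[str, int]:
    """Let every pair of bots play `games_per_pair` games and count wins per bot."""
    wins = Counter({name: 0 for name in bot_names})
    # combinations(..., 2) yields every unordered pair exactly once.
    for bot_a, bot_b in itertools.combinations(bot_names, 2):
        for _ in range(games_per_pair):
            wins[play_game(bot_a, bot_b, rng)] += 1
    return dict(wins)


results = round_robin(["MLBot", "RandBot", "OtherBot"], games_per_pair=100,
                      rng=random.Random(0))
print(results)
```

With $N$ bots there are $N(N-1)/2$ pairs, so the total number of games is that number times `games_per_pair`.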


In [None]:
MyCode4 = """
Write your code here.
"""

In the below cell, write down the results of your experiment.

In [None]:
MyReport3 = """
Write your report here.
"""

## Final Task: Collect all the results

Uncomment and run this cell (and all the cells above) to generate the text file that you have to hand in together with the notebook on Canvas!

### Please hand in only the text file which is generated by this method!

In [None]:
def exportToText(*args):
    with open(args[0], "w") as f:
        for argument in args:
            f.write("{}\n".format(argument))

exportToText("assignment6.txt", MyCode1, MyCode2, MyReport1, MyReport2, MyCode3, MyCode4, MyReport3)

Congrats on finishing all the assignments!

Machine learning, especially deep learning, has become key to solving many difficult
problems in AI over the past 10 years, ranging from board games and self-driving cars to 
scientific discoveries.

This assignment only touched upon a tiny fraction of machine learning and what it
can do. If you are interested in machine learning, feel free to check out the numerous online 
materials and the related VU courses.

Have fun. To be continued in Project Intelligent Systems in Period 3. 