Introduction to Artificial Intelligence - Lab Session 0 - 
--
At the end of this session, you will be able to:
- Create and manage the Jupyter Notebook environment to run code, insert text and math equations.
- Perform basic matrix manipulations using Numpy.
- Create signals and perform basic scientific computing using Scipy and Numpy.
- Use the PyRat software to generate custom datasets and save them as Numpy arrays.
- Produce simple data visualisations using Matplotlib.

Part 1 - Intro to Jupyter Notebook
--
Here, we will only cover the basics. 

Jupyter Notebook is based on the .ipynb format (IPython Notebook), and is essentially a way to do rapid prototyping and demonstrations of scientific Python. The basic idea is to define *cells*.
Cells can be of several types, including python code, or rich text (using [markdown formatting](https://www.markdownguide.org/basic-syntax/)).

When a code cell is evaluated (i.e. its Python code is executed), the output of this evaluation shows up right below the cell.

When a text cell is evaluated, the text will be formatted. 

You can now do the "User Interface Tour" from the Help menu. 

Done ? 

When working with Jupyter Notebook, you will essentially switch between two modes : 
- The Edit mode in which you edit the content of the cells 
- The Command mode, that enables you to change the cell types. 

When in Command mode, you can select cells. If you select a single cell, you can edit it by simply pressing enter, or double clicking on it. 

For example, try editing THIS CELL and change its content. 

Now, edit the cell below, change the code, and when you're done, press Shift+Enter to evaluate the code.

In [None]:
### CELL TO BE EDITED

a=32
b= 2*a
print("%d + %d"%(a,b))

Text cells can contain math expressions: Markdown formatting lets you write LaTeX expressions for maths (enclosed between two dollar signs).

For example: $A(k) \triangleq \sum_{n=1}^{k}{n^2}$

Now : 
- Edit the current cell to show the code that displays the math expression,
- Create a code cell below that defines a function that calculates $A(k)$ given k, and evaluate this cell,
- Create another cell and use the function to display $A(k)$ for a few values of $k$ (e.g. 10 and 20).
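
To check your answers, note that the sum of the first $k$ squares has a well-known closed form, $A(k) = \frac{k(k+1)(2k+1)}{6}$. The sketch below only evaluates that formula (it is not the summation-based function the exercise asks for), and the name `closed_form_A` is ours, chosen for illustration.

```python
def closed_form_A(k):
    """Closed form of A(k) = 1^2 + 2^2 + ... + k^2."""
    # k(k+1)(2k+1) is always divisible by 6, so integer division is exact
    return k * (k + 1) * (2 * k + 1) // 6


for k in (10, 20):
    print("A(%d) = %d" % (k, closed_form_A(k)))
```

Your own function should return the same values for the same inputs.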

In [None]:
### CELL TO BE COMPLETED 


In [None]:
### CELL TO BE COMPLETED 


Note that using Jupyter Notebook, if you evaluate a cell with a function followed by a "?" sign, the help of the function will pop up. 

Example : 

In [None]:
import os

os.listdir?

You can also display the code of a function using the syntax "??" 

In [None]:
A??

The popup can be closed by pressing the Escape key. 
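
The "?" and "??" shortcuts are provided by IPython/Jupyter, not by the Python language itself. Outside a notebook, a rough equivalent is the built-in `help()` function or the `inspect` module of the standard library. The variable name `doc` below is only for illustration.

```python
import inspect
import os

# inspect.getdoc returns the docstring of the function as a string
doc = inspect.getdoc(os.listdir)
print(doc)

# help(os.listdir) prints a similar, formatted help page in any Python shell
```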

Use the listdir function to browse the content of some directories... 


In [None]:
### CELL TO BE COMPLETED 


Part 2 - Introduction to Numpy, Scipy and Matplotlib 
--

A code cell can contain any python code, including imports. Let's start by importing the Numpy package. 

In [None]:
import numpy as np

Numpy can be used to generate pseudo-random values from various distributions. In particular, a very useful distribution is the standard normal (zero mean and unit variance). Let's generate two vectors sampled from the normal distribution, using a length parameter that we'll be able to change if needed. 

In [None]:
length = 50

vecA = np.random.randn(length)
vecB = np.random.randn(2*length)

vecA and vecB are numpy *arrays*. Their *shape* attribute gives the size of each dimension:

In [None]:
print(vecA.shape)
print(vecB.shape)

In [None]:
print(vecA)

Numpy arrays can be vectors as well as matrices, or tensors with any number of dimensions. For example, the following code creates a tensor with 3 dimensions, sampled from the standard normal distribution:

In [None]:
arrayC = np.random.randn(3,500,4)
print(arrayC.shape)

Note that the random module of Numpy has several other interesting functions. Uncomment the two functions in the cell below one by one, look up their help pages, and try to use them.

In [None]:
### CELL TO BE EDITED

#np.random.randint
#np.random.permutation

A very important feature of arrays is that they can be used as *iterables*. For example, you can iterate over the first dimension of an array by simply looping over it with a *for* loop; each iteration yields a sub-array with one dimension fewer.

In [None]:
for curdim in arrayC:
    print(curdim.shape)

It is also possible to use *enumerate* along that dimension in order to get the index of the current "smaller" array:


In [None]:
print('Initial shape is %d %d %d' % (arrayC.shape[0],arrayC.shape[1],arrayC.shape[2]))
print('Iterating over the first dimension using an index k')
for k,curdim in enumerate(arrayC):
    print('k = %d, shape is %d %d' % (k,curdim.shape[0],curdim.shape[1]))

Use the previous principle to calculate the average of each 500x4 sub-array, using the function np.mean().

In [None]:
### CELL TO BE COMPLETED 


Check that you obtain the same result when directly computing the average over the two axes 1 and 2 (look up the arguments of np.mean).

In [None]:
### CELL TO BE COMPLETED 


These features will prove to be very useful when manipulating large arrays.

Another important operation when working with Numpy arrays is *reshaping*. Essentially, *reshaping* consists in changing the organisation of the array (in terms of dimensions), while keeping the same number of elements. For example, a 10x20 2D array can be converted into a 4x5x10 array:

In [None]:
A = np.random.randint(1,5,(10,20))
print('Initial shape of A is %d x %d' % (A.shape[0],A.shape[1]))
print(A)
B = A.reshape((4,5,10))
print('B is A reshaped to %d x %d x %d' % (B.shape[0],B.shape[1],B.shape[2]))
print(B)
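
As a complement, here is a small sketch of two properties of `reshape`: the total number of elements must be preserved, and one dimension can be given as `-1` so that Numpy infers it. The array names (`M`, `M3`, `M2`) are illustrative and chosen so as not to overwrite the variables of the cell above.

```python
import numpy as np

M = np.arange(24)            # 24 elements: 0, 1, ..., 23
M3 = M.reshape((2, 3, 4))    # 2*3*4 = 24, so this is valid
print(M3.shape)

# -1 asks Numpy to infer the missing dimension from the element count
M2 = M3.reshape((6, -1))
print(M2.shape)

# A target shape with a different number of elements raises a ValueError
try:
    M.reshape((5, 5))
except ValueError as err:
    print("reshape failed:", err)
```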

Now try implementing the same function $A(k)$ that we implemented in part 1 using numpy.

Recall that $A(k) \triangleq \sum_{n=1}^{k}{n^2}$

The following numpy auxiliary functions can help you:
   - power: np.power(base, exponent), example: np.power(2,2) = 4
   - arange: np.arange(stop), example: np.arange(5) = [0,1,2,3,4] (the stop value itself is excluded)
   - sum: np.sum(vector), example: np.sum([0,1,2,3]) = 6

In [None]:
### CELL TO BE COMPLETED 

One property of numpy that is really important is broadcasting. The goal of broadcasting is to simplify the vectorization of certain operations when the vectors do not have the same shape. For example you can easily perform element-wise multiplication.

To test this try doing an element-wise multiplication of the vector x and matrix y below

In [None]:
x = np.array([2,3])
y = np.array([[4,1],[9,10],[12,13]])
result = x*y
print("X: ",x)
print("Y: ")
print(y)
print("X shape is: ",x.shape)
print("Y shape is: ",y.shape)
print("Element-wise multiplication shape:", result.shape)
print("Element-wise multiplication result:")
print(result)
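
The rule behind this result: shapes are compared starting from the last dimension, and two dimensions are compatible when they are equal or when one of them is 1. Here y has shape (3, 2) and x has shape (2,), so x is applied to every row of y. The sketch below (with illustrative names `col` and `scaled`) shows a case that does not broadcast directly, and how adding an axis fixes it.

```python
import numpy as np

y = np.array([[4, 1], [9, 10], [12, 13]])   # shape (3, 2), same values as above
col = np.array([1, 10, 100])                # shape (3,): one factor per row

# (3, 2) * (3,) fails: the last dimensions (2 and 3) are incompatible
try:
    y * col
except ValueError as err:
    print("broadcast failed:", err)

# Adding a trailing axis gives shape (3, 1), which broadcasts against (3, 2)
scaled = y * col[:, np.newaxis]
print(scaled)
```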


Another very powerful tool in numpy is indexing. You can use either an integer vector or a boolean vector to choose which indexes you want to extract from your numpy tensor.

For example, to extract all the elements of the first row of the matrix y that are greater than 1, you would do:

In [None]:
first_row = y[0]
first_row_higher_than_one = first_row > 1
print("Result: ", first_row[first_row_higher_than_one])

You can also choose specific rows to query, for example rows 0 and 2:

In [None]:
rows = [0,2]
rows_result = y[rows]
values_higher_than_one = rows_result > 1
print("Result: ", rows_result[values_higher_than_one])

You can also save and load your numpy tensors using np.savez and np.load. This will be really important in the next sessions, as it enables you to generate your data only once instead of redoing all the calculations every time you need your data.

In [None]:
filename = "x.npz"
source_tensor = x
np.savez(filename,data=source_tensor)

In [None]:
loaded_npz = np.load(filename)
loaded_tensor = loaded_npz["data"]
print("Your tensor was loaded and contains: ", loaded_tensor)
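
np.savez accepts several keyword arguments, one per array, which is handy for storing inputs and labels in a single file. Below is a minimal sketch, using a temporary directory and made-up array names (`inputs`, `labels`) so that it does not overwrite any of your files.

```python
import os
import tempfile

import numpy as np

inputs = np.arange(6).reshape(3, 2)
labels = np.array([1, 0, -1])

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "demo.npz")
    # Each keyword becomes the key under which the array is stored
    np.savez(path, inputs=inputs, labels=labels)

    # np.load returns a dict-like object; .files lists the stored keys
    with np.load(path) as archive:
        print(archive.files)
        loaded_inputs = archive["inputs"]
        loaded_labels = archive["labels"]

print(loaded_inputs.shape, loaded_labels.shape)
```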

Part 3 - Setup of the pyrat software and generating games
--


If you have not done so already, you need the latest version of PyRat. To obtain it, clone the [official PyRat repository](https://github.com/BastienPasdeloup/PyRat-1). 

PS: You will need to have pygame installed on your machine. Open a terminal and run:

<pre>pip install pygame</pre>

In [None]:
### TO DO: open a terminal tab / window and clone the repo.

You can now launch PyRat games.

In the context of the AI course, we are going to simplify the rules of PyRat a bit.
In fact, we are going to remove all walls and mud penalties. Also, we are not going to consider symmetric mazes anymore.

As such, a default game is launched with the following parameters. Please try now (note that you may have to type python instead of python3): 

<pre>python3 pyrat.py -p 40 -md 0 -d 0 --nonsymmetric</pre>

An empty labyrinth will appear.



In [None]:
### TO DO: open a terminal tab / window and launch the command

Please check out all the options offered by the pyrat software, by running : 

<pre>python3 pyrat.py -h</pre>

Importantly, there are options to change the size of the map and the number of pieces of cheese, which will be very useful later to benchmark your own solutions.

For example, find the correct command line to launch a game with a 10 by 11 map, with only 10 pieces of cheese.

In [None]:
### TO DO: open a terminal tab / window and launch the command to generate a game with a 10 by 11 map, with only 10 cheeses. 

In the supervised and unsupervised projects, we are going to look at games played between two greedy algorithms. PyRat makes it easy to generate 1000 such games and save the corresponding data.

Open another terminal to launch the next command line. Generating 1000 games will take a few minutes.

<pre>python3 pyrat.py --width 21 --height 15 -p 40 -md 0 -d 0 --nonsymmetric --rat AIs/manh.py --python AIs/manh.py --tests 1000 --nodrawing --synchronous --save</pre>

It is possible to open a terminal window from the "Home" Interface of Jupyter Notebook.

The 1000 generated games will be in the "saves" folder. Each time you execute the command, new games are added to the saves folder. You have to manually delete the old games if you do not want to use them (for example, if you change the size of the labyrinth or if you want to train your AI on new games).

In [None]:
### TO DO: open a terminal tab / window and launch the command to generate the games.

We will now convert the result in order to open it with numpy.

To convert the games into numpy arrays, we use a few functions that we define here. Feel free to modify them later to suit your own needs.

In [None]:
import os
import tqdm
import ast

mazeHeight = 15
mazeWidth = 21


def convert_input(maze, mazeWidth, mazeHeight, piecesOfCheese):
    im_size = (mazeWidth, mazeHeight) 
    canvas = np.zeros(im_size,dtype=np.int8)
    for (x_cheese,y_cheese) in piecesOfCheese:
        canvas[x_cheese,y_cheese] = 1
    # to use it with sklearn, we flatten the matrix into a vector
    return canvas.ravel()


PHRASES = {
    "# Random seed\n": "seed",
    "# MazeMap\n": "maze",
    "# Pieces of cheese\n": "pieces"    ,
    "# Rat initial location\n": "rat"    ,
    "# Python initial location\n": "python"   , 
    "rat_location then python_location then pieces_of_cheese then rat_decision then python_decision\n": "play"
}
 
MOVE_DOWN = 'D'
MOVE_LEFT = 'L'
MOVE_RIGHT = 'R'
MOVE_UP = 'U'
 
translate_action = {
    MOVE_LEFT:0,
    MOVE_RIGHT:1,
    MOVE_UP:2,
    MOVE_DOWN:3
}

 
def process_file(filename):
    params = dict(play=list())
    # the with-block closes the file even if parsing fails
    with open(filename,"r") as f:
        info = f.readline()
        # readline() returns an empty string at end of file, never None
        while info:
            if info.startswith("{"):
                params["end"] = ast.literal_eval(info)
                break
            if "turn " in info:
                info = info[info.find('rat_location'):]
            if info in PHRASES:
                param = PHRASES[info]
                if param == "play":
                    rat = ast.literal_eval(f.readline())
                    python = ast.literal_eval(f.readline())
                    pieces = ast.literal_eval(f.readline())
                    rat_decision = f.readline().replace("\n","")
                    python_decision = f.readline().replace("\n","")
                    play_dict = dict(
                        rat=rat,python=python,piecesOfCheese=pieces,
                        rat_decision=rat_decision,python_decision=python_decision)
                    params[param].append(play_dict)
                else:
                    params[param] = ast.literal_eval(f.readline())
            else:
                print("did not understand:", info)
                break
            info = f.readline()
    return params
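
To see what convert_input produces, the sketch below repeats its definition (copied from the cell above so that the example runs on its own) and applies it to a toy 4 by 3 maze with two made-up cheese positions. The maze argument is not used by the function, so `None` is passed here; `toy_cheeses`, `vector` and `grid` are illustrative names.

```python
import numpy as np


def convert_input(maze, mazeWidth, mazeHeight, piecesOfCheese):
    # Same logic as the function defined above
    canvas = np.zeros((mazeWidth, mazeHeight), dtype=np.int8)
    for (x_cheese, y_cheese) in piecesOfCheese:
        canvas[x_cheese, y_cheese] = 1
    return canvas.ravel()


toy_cheeses = [(0, 0), (3, 2)]   # hypothetical (x, y) cheese positions
vector = convert_input(None, 4, 3, toy_cheeses)
print(vector.shape)              # one entry per cell of the maze

# Reshaping with the same (width, height) order recovers the grid
grid = vector.reshape(4, 3)
print(grid)
```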

Now, we are ready to parse the "saves" folder in order to generate the data into a numpy array. 

In [None]:
## set your path to the saves folder here
directory =  #"/home/nicofarr/git/pyrat/saves/"  ## to modify


In [None]:
games = list()

for root, dirs, files in os.walk(directory):
    for filename in tqdm.tqdm(files):
        if filename.startswith("."):
            continue
        try:
            # join with root so the path is valid with or without a trailing slash
            game_params = process_file(os.path.join(root, filename))
            games.append(game_params)
        except Exception as err:
            print("Filename {} did not work: {}".format(filename, err))

x = np.array([]).reshape(0,mazeWidth * mazeHeight)
y = np.array([]).reshape(0,1)
wins_python = 0
wins_rat = 0
for game in tqdm.tqdm(games):
    if game["end"]["win_python"] == 1: 
        wins_python += 1
    elif game["end"]["win_rat"] == 1:
        wins_rat += 1    
    canvas = convert_input(game["maze"], mazeWidth, mazeHeight, game["play"][0]["piecesOfCheese"])
    if game["end"]["win_python"] == 1:
        y = np.append(y,1)
    elif game["end"]["win_rat"] == 1:
        y = np.append(y,-1)
    else:
        y = np.append(y,0)
    x = np.concatenate([x, canvas.reshape(1,-1)], axis=0)

x and y are numpy arrays; feel free to save them to a .npz file as seen earlier in this session.

In [None]:
### CELL TO BE COMPLETED
### CHECK THE SHAPES OF X AND Y 
### SAVE X AND Y IN A NPZ FILE 




Part 4 - Visualizing PyRat Datasets
--

You now have to load the pyrat dataset that you just saved (or you can open the one we provide, "dataset.npz"). 

It contains two variables named "x" and "y". You should store them in variables named x_pyrat and y_pyrat.

In [None]:
### CELL TO BE COMPLETED 


Now with the dataset loaded we can explore it using matplotlib. Matplotlib is a very powerful python graphics display library.

We are going to show the initial state of each game and its winner. The games are represented by two variables, X and Y.

X is a matrix with 1000 examples of length 315. Each example can be reshaped to the real maze shape of 21 by 15. Each entry of an example vector has two possible values: 1 for presence of cheese and 0 for absence of cheese.

Y is a vector with one integer label per game, taking values in {-1, 0, 1}: 1 represents a win by the python, 0 a draw and -1 a win for the rat.
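
A quick way to see how the three outcomes are distributed is np.unique with `return_counts=True`. The sketch below uses a small made-up label vector (`toy_y`) rather than the real dataset.

```python
import numpy as np

toy_y = np.array([1, -1, 0, 1, 1, -1])  # made-up labels, not real game results

values, counts = np.unique(toy_y, return_counts=True)
for value, count in zip(values, counts):
    print("label %d appears %d times" % (value, count))

# A boolean mask selects the examples of one class, e.g. python wins
python_wins = toy_y == 1
print(python_wins.sum())
```

The same kind of boolean mask can be used to index the rows of a data matrix that share a given label.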

The magic command "%matplotlib inline" tells jupyter notebook to display the plot results in the document, instead of opening a separate window.

Now it is your turn. Reshape the x_pyrat matrix into a tensor of (examples,mazeWidth, mazeHeight) and put it into a variable x_labyrinth

In [None]:
### CELL TO BE COMPLETED 


In [None]:
import matplotlib.pyplot as plt
%matplotlib inline

In [None]:
afew = 5 # Number of samples
fig, axis = plt.subplots(1,afew,figsize=(20,10)) # Generate a new figure with one row of 5 plots. We also set the figure size to 20 by 10
for i in range(afew):
    ind = np.random.randint(x_labyrinth.shape[0]) # sample a game
    ax = axis[i] # get the corresponding axis to use
    img = ax.matshow(x_labyrinth[ind].T) # Show the matrix as an image (with .T, we swap the width and height dimensions to see the game as plotted by the PyRat software)
    ax.set_title('Winner : {}'.format(y_pyrat[ind])) # Set the axis title with the game winner
    # Invert the y axis so that we see the game in the same orientation as in the PyRat software
    ax.invert_yaxis()

fig.colorbar(img,ax=axis) # add a single colorbar shared by all the images

In the plots above, the cheeses (1) are pictured in yellow, while the empty squares are represented in blue.

# Exercise

Now with all the knowledge you acquired today you can start doing some analysis. 

Compute the average initial configuration of the game for each situation (rat win, draw, python win) and plot them side by side.

In [None]:
### CELL TO BE COMPLETED 

In [None]:
### CELL TO BE COMPLETED 
