# Artificial Intelligence (CM20252 / CM50263) Lab - Session 0

Lecturer: Özgür Şimşek o.simsek@bath.ac.uk

Tutors: 

* Jan Malte Lichtenberg j.m.lichtenberg@bath.ac.uk
* Andreasa Morris Martin a.l.morris.martin@bath.ac.uk
* Fahid Mohammed f.r.mohammed@bath.ac.uk
* Siriphan Wichaidit s.wichaidit@bath.ac.uk

This tutorial was built upon other tutorials and lecture series from [Robert Johansson](https://github.com/jrjohansson), [Parag K Mital](https://github.com/pkmital), and [Rick Muller](http://nbviewer.jupyter.org/gist/rpmuller/5920182); the [official python.org-tutorial](https://docs.python.org/3.6/tutorial/); as well as module-specific tutorials for [numpy](https://docs.scipy.org/doc/numpy-dev/user/quickstart.html) and [matplotlib](https://matplotlib.org/faq/usage_faq.html#usage).
This work is licensed under a [Creative Commons Attribution-ShareAlike 3.0 Unported License](http://creativecommons.org/licenses/by-sa/3.0/deed.en_US).

## Learning Goals

The goal of this notebook is for you to learn how to

* navigate a Jupyter Notebook
* "Python 101": variables, functions, control flow
* import libraries and other code files
* use the 'numpy' module for numerical computing: multi-dimensional arrays
* use the 'matplotlib' module for visualization of results
* use object-oriented programming in Python: define a class with methods and fields, create objects

## Python! 

[Python](http://www.python.org/) is a modern, general-purpose, object-oriented, high-level programming language. Also have a look at the [Zen of Python](https://en.wikipedia.org/wiki/Zen_of_Python).

General characteristics of Python:

* **clean and simple language:** Easy-to-read and intuitive code, easy-to-learn minimalistic syntax, maintainability scales well with size of projects. 
* **expressive language:** Fewer lines of code, fewer bugs, easier to maintain.

Technical details:

* **dynamically typed:** No need to define the type of variables, function arguments or return types.
* **automatic memory management:** No need to explicitly allocate and deallocate memory for variables and data arrays. No memory leak bugs. 
* **interpreted:** No need to compile the code. The Python interpreter reads and executes the python code directly.

Advantages:

* The main advantage is ease of programming, minimizing the time required to develop, debug and maintain the code.
* Well-designed language that encourages many good programming practices:
    * Modular and object-oriented programming, good system for packaging and re-use of code. This often results in more transparent, maintainable and bug-free code.
    * Documentation tightly integrated with the code.
* A large standard library, and a large collection of add-on packages.

Disadvantages:

* Since Python is an interpreted and dynamically typed programming language, the execution of python code can be slow compared to compiled statically typed programming languages, such as C and Fortran. 
* Somewhat decentralized, with different environments, packages, and documentation spread out across different places. This can make it harder to get started.

We chose Python for these labs because it will let you (everyone) focus on the conceptual side of things. The aim here is not to produce the most efficient code ever written but rather to let you quickly compare and understand different algorithms and approaches. 

<!---
We will cover the basics of the numpy library, which simplifies the manipulation of arrays, and should thus be helpful to code parts of the environment and the reinforcement learning agent.

We will then look at simple examples of visualizing results using the matplotlib.pyplot library. These should enable you to plot learning curves that show the performance of your agent.

Finally, we will show the very basics of object oriented programming (OOP) in Python. We strongly advise you to use OOP for your environments and agents. Note that this is _not_ an introduction to the principles of object-oriented programming. If you need a refresher, please look at, for example, [here](https://python.swaroopch.com/oop.html).

-->

# Jupyter
### Cells

Try to run/execute the next cell by selecting it and pressing shift-enter on it. 

In [None]:
3*7

You can go into "edit mode" by clicking on a cell. In order to exit the "edit mode" and go into "command mode", press escape or click somewhere outside of any cell. Once in command mode, you can, for example, create new cells by pressing "a" for above or "b" for below. Try it out! 

The cell you just executed above is called a "code cell". The kind of cell in which this text is written is called a "markdown cell". You can switch a cell between the two types using the dropdown menu in the toolbar or, when in command mode, by pressing "m" (markdown) or "y" (code). In order to edit a markdown cell, you have to double-click on it.

### The Kernel

The Python interpreter in which your code cells are executed is called the "kernel". You can stop or restart the kernel (that is, start a new Python session) using the toolbar at the top of the page. You will find the Python version of your kernel just beneath the "Logout" button in the top-right corner of your notebook. Please make sure that it is Python 3.

### Saving Notebooks

Jupyter notebooks are generally saved as .ipynb files. You will have to submit .ipynb files for upcoming, graded assignments. 

Remember to save your work regularly (`Save and checkpoint` in the `File` menu, the icon of a floppy disk, or `Ctrl-S`).

Notebooks are saved into the directory from which you are running Jupyter; alternatively, you can download a notebook from the menu above using `File -> Download As -> Notebook (.ipynb)`.

### More about Jupyter notebooks

If you want to learn more about Jupyter, or if you have difficulties navigating this notebook during the tutorial, have a look at this [video](https://www.youtube.com/watch?v=HW29067qVWk) or the [official documentation](http://jupyter.org/documentation).

# Python

This section introduces basic concepts of Python such as variables, functions, and control flow (loops, if-statements, ...). We focus on the concepts needed for the upcoming labs and leave out concepts that will not be needed (e.g., complex numbers). For a much more complete tutorial, take a look at the [official python.org-tutorial](https://docs.python.org/3.6/tutorial/). 

## Variables and types
### Symbol names 

Variable names in Python can contain alphanumerical characters `a-z`, `A-Z`, `0-9` and some special characters such as `_`. Variable names must start with a letter or an underscore, not a digit.

By convention, variable names start with a lower-case letter, and class names start with a capital letter.

In addition, there are a number of Python keywords that cannot be used as variable names. In recent versions of Python 3 (the exact list varies slightly between versions), these keywords are:

    False, None, True, and, as, assert, async, await, break, class, continue,
    def, del, elif, else, except, finally, for, from, global, if, import, in,
    is, lambda, nonlocal, not, or, pass, raise, return, try, while, with, yield

Note: Be aware of the keyword `lambda`, which could easily be a natural variable name in a scientific program. But being a keyword, it cannot be used as a variable name.
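Because the set of keywords depends on the Python version, it can be useful to ask the interpreter directly. The following minimal sketch uses the standard-library `keyword` module; it is an addition to the original tutorial material.

```python
import keyword

# Full list of reserved words for the interpreter running this notebook
print(keyword.kwlist)

# `lambda` is reserved, so it cannot be used as a variable name
print(keyword.iskeyword("lambda"))   # True

# In Python 3, `print` is an ordinary function, not a keyword
print(keyword.iskeyword("print"))    # False
```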

### Assignment



The assignment operator in Python is `=`. Python is a dynamically typed language, so we do not need to specify the type of a variable when we create one.

Assigning a value to a new variable creates the variable:

In [None]:
# variable assignments
x = 1.0
my_variable = 12.2

Although not explicitly specified, a variable does have a type associated with it. The type is derived from the value that was assigned to it.

In [None]:
type(x)

If we assign a new value to a variable, its type can change.

In [None]:
x = 1

In [None]:
type(x)

If we try to use a variable that has not yet been defined, we get a `NameError`:

In [None]:
print(y)

### Fundamental types

In [None]:
# integers
x = 1
type(x)

In [None]:
# float
x = 1.0
type(x)

In [None]:
# boolean
b1 = True
b2 = False

type(b1)

## Operators and comparisons

Most operators and comparisons in Python work as one would expect:

* Arithmetic operators `+`, `-`, `*`, `/`, `//` (integer division), `**` (power)


In [None]:
1 + 2, 1 - 2, 1 * 2, 1 / 2

In [None]:
1.0 + 2.0, 1.0 - 2.0, 1.0 * 2.0, 1.0 / 2.0

In [None]:
# Integer division of float numbers
3.0 // 2.0

In [None]:
# Note! The power operator in Python is **, not ^
2 ** 2

Note: The `/` operator always performs a floating point division in Python 3.x.
This is not true in Python 2.x, where the result of `/` is always an integer if the operands are integers.
To be more specific, `1/2 = 0.5` (`float`) in Python 3.x, and `1/2 = 0` (`int`) in Python 2.x (but `1.0/2 = 0.5` in Python 2.x). Please use Python 3 for these labs!
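The difference between `/` and `//` is easiest to see side by side. The short sketch below assumes Python 3; the remainder operator `%` and the built-in `divmod` are not covered elsewhere in this notebook and are shown here only as a complement to `//`.

```python
# Floor division rounds down (towards negative infinity)
print(7 // 2)     # 3
print(-7 // 2)    # -4

# The remainder operator complements floor division
print(7 % 2)      # 1

# divmod returns quotient and remainder together as a tuple
q, r = divmod(7, 2)
print(q, r)       # 3 1

# True division yields a float in Python 3, even if the result is whole
print(4 / 2)      # 2.0
```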

* The boolean operators are spelled out as the words `and`, `not`, `or`. 

In [None]:
True and False

In [None]:
not False

In [None]:
True or False

* Comparison operators `>`, `<`, `>=` (greater or equal), `<=` (less or equal), `==` (equality), `is` (identity).

In [None]:
2 > 1, 2 < 1

In [None]:
2 > 2, 2 < 2

In [None]:
2 >= 2, 2 <= 2

In [None]:
# equality
[1,2] == [1,2]

In [3]:
# objects identical?
l1 = l2 = [1,2]
l1 is l2

True
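The distinction between `==` and `is` matters most for mutable objects such as lists. The following sketch (the variable names are chosen only for illustration) contrasts two lists with equal contents against two names for the same list:

```python
l1 = [1, 2]
l2 = [1, 2]      # a second list with the same contents
l3 = l1          # another name for the same list object

print(l1 == l2)  # True: same contents
print(l1 is l2)  # False: two distinct objects
print(l1 is l3)  # True: both names refer to one object

# Because l3 and l1 are the same object, a change made via l3 is visible via l1
l3.append(3)
print(l1)        # [1, 2, 3]
```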

## Compound types: Strings, lists, tuples, and dictionaries

### Strings

Strings are the variable type that is used for storing text messages. 

In [None]:
s = "Hello world"
type(s)

In [None]:
# length of the string: the number of characters
len(s)

In [None]:
# replace a substring in a string with something else
s2 = s.replace("world", "test")
print(s2)

We can index a character in a string using `[]`:

In [None]:
s[0]
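Besides single characters, we can extract substrings with the slice syntax `[start:stop:step]`, where `stop` is exclusive. A brief sketch using the same string as above:

```python
s = "Hello world"

print(s[0:5])   # characters at indices 0 to 4: Hello
print(s[6:])    # from index 6 to the end: world
print(s[:5])    # from the start up to, but not including, index 5: Hello
print(s[-1])    # negative indices count from the end: d
print(s[::2])   # every second character: Hlowrd
```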

### List

Lists are very similar to strings, except that each element can be of any type.

The syntax for creating lists in Python is `[...]`:

In [None]:
l = [1,2,3,4]

print(type(l))
print(l)

We can use the same slicing techniques to manipulate lists as we could use on strings:

In [None]:
print(l)

print(l[1:3])

print(l[::2])

**Heads up MATLAB users:** Indexing starts at 0!

In [None]:
l[0]

Elements in a list do not all have to be of the same type:

In [None]:
l = [1, 'a', 1.0, 1-1j]

print(l)

Python lists can be inhomogeneous and arbitrarily nested:

In [None]:
nested_list = [1, [2, [3, [4, [5]]]]]

nested_list

Lists play a very important role in Python. For example they are used in loops and other flow control structures (discussed below). There are a number of convenient functions for generating lists of various types, for example the `range` function:

In [None]:
start = 10
stop = 30
step = 2

range(start, stop, step)

In [None]:
# In Python 3, range returns a lazy range object, which can be converted to a list using 'list(...)'.
# In Python 2, range already returns a list, so the conversion has no effect.
list(range(start, stop, step))

In [None]:
list(range(-10, 10))

In [None]:
s

In [None]:
# convert a string to a list by type casting:
s2 = list(s)

s2

In [None]:
# sorting lists
s2.sort()

print(s2)

#### Adding, inserting, modifying, and removing elements from lists

In [None]:
# create a new empty list
l = []

# add elements using `append`
l.append("A")
l.append("d")
l.append("d")

print(l)

We can modify lists by assigning new values to elements in the list. In technical jargon, lists are *mutable*.

In [None]:
l[1] = "p"
l[2] = "p"

print(l)

In [None]:
l[1:3] = ["d", "d"]

print(l)

Insert an element at a specific index using `insert`:

In [None]:
l.insert(0, "i")
l.insert(1, "n")
l.insert(2, "s")
l.insert(3, "e")
l.insert(4, "r")
l.insert(5, "t")

print(l)

Remove the first element with a specific value using `remove`:

In [None]:
l.remove("A")

print(l)

Let's look at the current state of the list:

In [None]:
l

Remove an element at a specific location using `del`:

In [None]:
del l[7]
del l[6]

print(l)

**Note** that lists are useful for flow control structures or when you want to store data of different types. However, we will always use the module `numpy` (discussed below in this tutorial!) for purely numerical data and arrays.  



### Tuples

Tuples are like lists, except that they cannot be modified once created, that is they are *immutable*. 

In Python, tuples are created using the syntax `(..., ..., ...)`, or even `..., ...`:

In [None]:
point = (10, 20)

print(point, type(point))

In [None]:
point = 10, 20

print(point, type(point))

We can unpack a tuple by assigning it to a comma-separated list of variables:

In [None]:
x, y = point

print("x =", x)
print("y =", y)

If we try to assign a new value to an element in a tuple we get an error:

In [None]:
point[0] = 20

### Dictionaries

Dictionaries are also like lists, except that each element is a key-value pair. The syntax for dictionaries is `{key1 : value1, ...}`:

In [None]:
params = {"parameter1" : 1.0,
          "parameter2" : 2.0,
          "parameter3" : 3.0,}

print(type(params))
print(params)

In [None]:
print("parameter1 = " + str(params["parameter1"]))
print("parameter2 = " + str(params["parameter2"]))
print("parameter3 = " + str(params["parameter3"]))

In [None]:
params["parameter1"] = "A"
params["parameter2"] = "B"

# add a new entry
params["parameter4"] = "D"

print("parameter1 = " + str(params["parameter1"]))
print("parameter2 = " + str(params["parameter2"]))
print("parameter3 = " + str(params["parameter3"]))
print("parameter4 = " + str(params["parameter4"]))
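Accessing a key that does not exist raises a `KeyError`. The sketch below shows a few common ways of handling this; the dictionary `settings` and its keys are hypothetical and only serve as an illustration.

```python
# Hypothetical dictionary used only for this example
settings = {"alpha": 0.5, "gamma": 0.9}

# Test whether a key exists before using it
print("alpha" in settings)           # True
print("epsilon" in settings)         # False

# `get` returns a fallback value instead of raising a KeyError
print(settings.get("epsilon", 0.1))  # 0.1

# Remove an entry with `del`
del settings["gamma"]
print(settings)                      # {'alpha': 0.5}
```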

## Control Flow

### Conditional statements: if, elif, else

The Python syntax for conditional execution of code uses the keywords `if`, `elif` (else if), `else`:

In [None]:
statement1 = False
statement2 = False

if statement1:
    print("statement1 is True")
    
elif statement2:
    print("statement2 is True")
    
else:
    print("statement1 and statement2 are False")

Here, for the first time, we encounter a peculiar and unusual aspect of the Python programming language: program blocks are defined by their indentation level.

Compare to the equivalent C code:

    if (statement1)
    {
        printf("statement1 is True\n");
    }
    else if (statement2)
    {
        printf("statement2 is True\n");
    }
    else
    {
        printf("statement1 and statement2 are False\n");
    }

In C, blocks are defined by the enclosing curly brackets `{` and `}`, and the level of indentation (white space before the code statements) does not matter (it is completely optional).

But in Python, the extent of a code block is defined by the indentation level (conventionally four spaces). This means that we have to be careful to indent our code consistently, or else we will get syntax errors.

#### Examples:

In [None]:
statement1 = statement2 = True

if statement1:
    if statement2:
        print("both statement1 and statement2 are True")

In [None]:
# Bad indentation!
if statement1:
    if statement2:
    print("both statement1 and statement2 are True")  # this line is not properly indented

In [None]:
statement1 = False 

if statement1:
    print("printed if statement1 is True")
    
    print("still inside the if block")

In [None]:
if statement1:
    print("printed if statement1 is True")
    
print("now outside the if block")

## Loops

In Python, loops can be programmed in a number of different ways. The most common is the `for` loop, which is used together with iterable objects, such as lists. The basic syntax is:

### **`for` loops**:

In [None]:
for x in [1,2,3]:
    print(x)

The `for` loop iterates over the elements of the supplied list and executes the contained block once for each element. Any iterable object can be used in the `for` loop, not just a list. For example:

In [None]:
for x in range(4): # by default range starts at 0
    print(x)

Note: `range(4)` does not include 4!

In [None]:
for x in range(-3,3):
    print(x)

In [None]:
for word in ["scientific", "computing", "with", "python"]:
    print(word)

To iterate over key-value pairs of a dictionary:

In [None]:
for key, value in params.items():
    print(key + " = " + str(value))

Sometimes it is useful to have access to the indices of the values when iterating over a list. We can use the `enumerate` function for this:

In [None]:
for idx, x in enumerate(range(-3,3)):
    print(idx, x)

### List comprehensions: Creating lists using `for` loops:

A convenient and compact way to initialize lists:

In [None]:
l1 = [x**2 for x in range(0,5)]

print(l1)
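A list comprehension is shorthand for a loop that appends to a list. The sketch below spells out the equivalent explicit loop and shows the optional trailing `if` clause for filtering; the variable names are chosen only for illustration.

```python
# Explicit loop equivalent of [x**2 for x in range(0, 5)]
squares = []
for x in range(0, 5):
    squares.append(x**2)
print(squares)        # [0, 1, 4, 9, 16]

# A comprehension can also filter elements with a trailing `if`
even_squares = [x**2 for x in range(0, 5) if x % 2 == 0]
print(even_squares)   # [0, 4, 16]
```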

### `while` loops:

In [None]:
i = 0

while i < 5:
    print(i)
    
    i = i + 1
    
print("done")

Note that the `print("done")` statement is not part of the `while` loop body because of the difference in indentation.

## Functions

A function in Python is defined using the keyword `def`, followed by a function name, a signature within parentheses `()`, and a colon `:`. The following code, with one additional level of indentation, is the function body.

In [None]:
def func0():   
    print("test")

In [None]:
func0()

Optionally, but highly recommended, we can define a so-called "docstring", which is a description of the function's purpose and behavior. The docstring should follow directly after the function definition, before the code in the function body.

In [None]:
def func1(s):
    """
    Print a string 's' and tell how many characters it has    
    """
    
    print(s + " has " + str(len(s)) + " characters")

In [None]:
help(func1)

In [None]:
func1("test")

Functions that return a value use the `return` keyword:

In [None]:
def square(x):
    """
    Return the square of x.
    """
    return x ** 2

In [None]:
square(4)

We can return multiple values from a function using tuples (see above):

In [None]:
def powers(x):
    """
    Return a few powers of x.
    """
    return x ** 2, x ** 3, x ** 4

In [None]:
powers(3)

In [None]:
x2, x3, x4 = powers(3)

print(x3)

### Default arguments and keyword arguments

In a definition of a function, we can give default values to the arguments the function takes:

In [None]:
def myfunc(x, p=2, debug=False):
    if debug:
        print("evaluating myfunc for x = " + str(x) + " using exponent p = " + str(p))
    return x**p

If we don't provide a value for the `debug` argument when calling the function `myfunc`, it defaults to the value provided in the function definition:

In [None]:
myfunc(5)

In [None]:
myfunc(5, debug=True)

If we explicitly list the names of the arguments in the function call, they do not need to come in the same order as in the function definition. These are called *keyword* arguments, and they are often very useful in functions that take a lot of optional arguments.

In [None]:
myfunc(p=3, debug=True, x=7)

## Import code: modules

Most of the functionality in Python is provided by *modules*. The Python Standard Library is a large collection of modules that provides *cross-platform* implementations of common facilities such as access to the operating system, file I/O, string management, network communication, and much more.

A python module is defined in a python file (with file-ending `.py`), and it can be made accessible to other Python modules and programs using the `import` statement. We will import the module `os`, which provides access to the operating system and is useful for loading data and images. Execute the following cell!

In [None]:
import os

After executing this cell, your kernel will have access to everything inside the `os` module. We will need an `import` statement for every library that we use.

Sometimes you may want to write parts of your own code (e.g., utility functions or class definitions) in a separate file. Or you may prefer coding in your favorite Integrated Development Environment (IDE) such as, for example, [Spyder](https://github.com/spyder-ide/spyder) or [PyCharm](https://www.jetbrains.com/pycharm/). If you want to import your code into Jupyter to visualize results (or hand in coursework), you can use the `import` statement to access your code.

Example: There should be a file named utils.py in the same directory as this notebook file. This file contains a definition of a function called `utils_test()` that includes a print statement. We now import everything contained in the file utils.py.

In [None]:
import utils

We can then access the function defined in the file utils.py by writing `utils.` in front of the function name:

In [None]:
utils.utils_test()

Alternatively, you can also import the function directly into the current namespace:

In [None]:
from utils import utils_test
utils_test()
# OR 
from utils import * 
# `*` means `everything`
another_function()

Note the missing `utils.` in front of the function call.

Using `from module import function` imports functions directly into the current namespace and thus overwrites existing functions. Compare in the example below.

In [None]:
def utils_test():
    print("This function was defined inside the Jupyter notebook.")
utils_test()

from utils import utils_test
utils_test()
    

We discourage using the "`from module import function`" statement, as you could accidentally overwrite an existing function. The overhead of writing `modulename.` in front of imported functions is a small cost for the increased readability and safety of the plain "`import module`" statement.

# Numpy

`numpy` is a popular numerical computing module for Python. It comes pre-installed with most Python distributions and is widely used across academia and industry. The main data type of `numpy` is the multi-dimensional array. Remember what we said about Python being slow because it is interpreted and dynamically typed? Almost all `numpy` routines that manipulate arrays (indexing, transforming, linear algebra) are written in C and are thus blazingly fast!

We can use the statement "`import long_name_of_a_module as short_name`" to make our lives easier. It is common practice to import `numpy` as `np`.

In [None]:
import numpy as np

**Creating `numpy` arrays**

In [None]:
# One-dimensional array
a = np.array([3, 4, 5, 6])
print("a =", a)

a_2 = np.arange(3, 7) 
print("a_2 =", a_2)

# Two-dimensional array
b = np.array([[1, 2],
              [3, 4]])
print("b ="); print(b)

# Variable-length array of zeros
num_rows = 3
num_cols = 5
c = np.zeros(shape=(num_rows, num_cols))
print("c ="); print(c)

# Variable length array of random integers between 0 (inclusive) and 6 (exclusive) 
d = np.random.randint(low = 0, high=6, size=(num_rows, num_cols))
print("d ="); print(d)

Note that the array `c` consists of floats instead of integers (as can be seen by the trailing '.' behind each 0). You can specify the type as follows.

In [None]:
# Variable-length array of integers
c_int = np.zeros(shape = (num_rows, num_cols), dtype = int)
print("c_int =")
print(c_int)

**Indexing arrays.** Numpy has numerous ways to access elements of your array. Notice that indices start at 0! Let's look at a few of these, more info on indexing can be found [here](http://scipy-cookbook.readthedocs.io/items/Indexing.html).

In [None]:
# Indexing one-dimensional arrays.
second_element_of_a = a[1]
print("Second element of a:  a[1]=", second_element_of_a)

# Two-dimensional arrays
print("Element in the first row, second column of b:   b[0, 1]=", b[0, 1])

**Slicing arrays.** A convenient method to access sub-arrays is to use the ':' operator. Intuitively, ':' says "choose all elements along this dimension" or, if used with a preceding or succeeding integer index, "choose all succeeding or preceding elements", respectively. For example, we can easily access rows or columns of a multi-dimensional array or "elements 5 to 10" of a one-dimensional array.

In [None]:
## Slicing one-dimensional arrays.
print("a =", a)
print("a[1:] =", a[1:])
print("a[2:] =", a[2:])
print("a[:2] =", a[:2])
print("a[1:3] =", a[1:3])

## Slicing two-dimensional arrays.
# First row of b.
print("b ="); print(b)
print("b[0, :] =", b[0, :])
# Second column of b.
print("b[:, 1] =", b[:, 1])

**"Fancy" indexing**. Numpy allows you to index numpy arrays using other numpy arrays or lists. These can either be  arrays or lists of integers or "boolean masks".

In [None]:
print("d ="); print(d)
row_indices = [0, 2]
# row_indices = np.array([0, 2])

print("Rows 0 and 2 of d:")
print(d[row_indices])

col_indices = [2, 3]
print("Elements [0, 2] and [2, 3] of d:")
print(d[row_indices, col_indices])


In [None]:
print("a =", a)
boolean_mask = np.array([True, False, True, False])
print("a[boolean_mask] =", a[boolean_mask])

# Same as
boolean_mask2 = np.array([1, 0, 1, 0], dtype=bool)
print("a[boolean_mask2] =", a[boolean_mask2])
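In practice, boolean masks are rarely written by hand; they usually come from an element-wise comparison. The following sketch assumes the same array `a` as above; `a_clipped` is a hypothetical name used only for this example.

```python
import numpy as np

a = np.array([3, 4, 5, 6])

# An element-wise comparison yields a boolean array of the same shape
mask = a > 4
print(mask)
print(a[mask])          # the elements 5 and 6

# A mask can also select the elements to overwrite in an assignment
a_clipped = a.copy()    # copy so that `a` itself stays unchanged
a_clipped[a_clipped > 4] = 4
print(a_clipped)        # the elements 3, 4, 4, 4
```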

**Information about your array**

In [None]:
print(a.shape)
print(d.shape)
print(d.dtype)
print(d.ndim)

**Updating array values.**

In [None]:
print(b)
# Assign a specific value
b[0, 1] = 9
print(b)
# Increase / decrease values.
b[1, 0] += 2
b[0, 0] -= 1
print(b)

Numpy provides a large number of functions for array manipulation, statistics, and linear algebra. Have a look at the module by checking the documentation with `help(np)`. Another option to get information is to write `np.` and then press `<tab>`. This shows a dropdown of all available functions in this module:

In [None]:
# uncomment the lines to try them
# help(np)
# np.<tab>

Selecting a function from the dropdown and adding a `?` at the end will bring up the function's documentation.
Let's look at the mean function of `np`.

In [None]:
np.mean?

We can now calculate means of arrays. If we do not specify any further arguments, np.mean calculates the mean of all elements. Note also the two different ways of executing the same function.

In [None]:
print(np.mean(a))
print(a.mean())
print(np.mean(b))
print(b.mean())

Sometimes we want to compute column-wise or row-wise means or maxima. We can do so by specifying the axis-argument of the `mean()` and `max()` functions, respectively.

In [None]:
# Column-wise means
print(np.mean(b, axis = 0))

# Row-wise maxima
print(np.max(b, axis = 1))
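A useful way to remember the `axis` argument is that it names the dimension that is collapsed by the operation. The sketch below uses a small hypothetical array `m` whose values make the result easy to check by hand.

```python
import numpy as np

m = np.array([[1, 2, 3],
              [4, 5, 6]])      # shape (2, 3): 2 rows, 3 columns

# axis=0 collapses the rows, leaving one value per column
print(m.sum(axis=0))           # column sums: 5, 7, 9
print(m.mean(axis=0))          # column means: 2.5, 3.5, 4.5

# axis=1 collapses the columns, leaving one value per row
print(m.sum(axis=1))           # row sums: 6, 15
print(m.max(axis=1))           # row maxima: 3, 6
```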

## Linear algebra with `numpy`

Vectorizing code is the key to writing efficient numerical calculation with Python/Numpy. That means that as much as possible of a program should be formulated in terms of matrix and vector operations, like matrix-matrix multiplication.
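As a small illustration of what vectorizing means, the sketch below computes a sum of squares twice: once with an explicit Python loop and once with array operations. The names `values`, `total_loop`, and `total_vec` are chosen only for this example.

```python
import numpy as np

values = np.arange(1000)

# Loop version: one Python-level operation per element
total_loop = 0
for v in values:
    total_loop += v * v

# Vectorized version: the loop runs inside numpy's compiled code
total_vec = np.sum(values * values)

print(total_loop == total_vec)   # True
```

Both versions give the same result. In a notebook, the IPython magic `%timeit` can be used to compare how long each version takes.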

### Scalar-array operations

We can use the usual arithmetic operators to multiply, add, subtract, and divide arrays with scalar numbers.

In [None]:
v1 = np.arange(0, 5)
print(v1)

In [None]:
v1 * 2

In [None]:
v1 + 2

In [None]:
# Let's create a two-dimensional array using list comprehensions
A = np.array([[n+m*10 for n in range(5)] for m in range(5)])
print(A)

In [None]:
print(A * 2)
print(A + 2)

### Element-wise array-array operations

When we add, subtract, multiply and divide arrays with each other, the default behaviour is **element-wise** operations:

In [None]:
A * A # element-wise multiplication

In [None]:
v1 * v1

If we multiply arrays with compatible shapes, we get an element-wise multiplication of each row (be careful with these...):

In [None]:
A.shape, v1.shape

In [None]:
A * v1
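To see why care is needed here, it helps to compare the result with what happens when the vector is reshaped into a column. The sketch below rebuilds `A` and `v1` as above; `row_scaled` and `col_scaled` are hypothetical names used only for this example.

```python
import numpy as np

A = np.array([[n + m * 10 for n in range(5)] for m in range(5)])
v1 = np.arange(0, 5)

# v1 has shape (5,), so it is matched against the last axis of A:
# every row of A is multiplied element-wise by v1
row_scaled = A * v1
print(row_scaled[1])               # 10*0, 11*1, 12*2, 13*3, 14*4

# As a column of shape (5, 1), v1 instead scales each whole row by one entry
col_scaled = A * v1.reshape(5, 1)
print(col_scaled[1])               # row 1 of A multiplied by v1[1] = 1
```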

### Matrix algebra

What about matrix multiplication? We can use the `dot` function, which applies a matrix-matrix, matrix-vector, or inner vector multiplication to its two arguments.

In [None]:
np.dot(A, A)

In [None]:
np.dot(A, v1)

In [None]:
np.dot(v1, v1)

Now if you want to perform many matrix multiplications in one line, this can become quite ugly (e.g., `np.dot(np.dot(np.dot(A, B), C), D)`). You may ask (especially if you are a Matlab user): "Is there no matrix class such that `A * B * C * D` does matrix multiplication?" The answer is "yes", `numpy` does have a matrix class. However, we recommend using numpy arrays for two-dimensional arrays. For a thorough comparison between numpy arrays and numpy matrices, have a look at this [Numpy for Matlab users](http://scipy.github.io/old-wiki/pages/NumPy_for_Matlab_Users) guide.
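Since Python 3.5, ordinary numpy arrays also support the `@` operator for matrix multiplication, which avoids the nested `np.dot` calls without switching to the matrix class. This is an addition to the original tutorial material; the small arrays `M`, `N`, and `w` below are hypothetical and chosen so that the results are easy to verify by hand.

```python
import numpy as np

M = np.array([[1, 2],
              [3, 4]])
N = np.array([[0, 1],
              [1, 0]])
w = np.array([1, 1])

# `@` performs matrix multiplication on ordinary arrays
print(M @ N)                                  # rows: [2, 1] and [4, 3]
print(np.array_equal(M @ N, np.dot(M, N)))    # True

# Chained products read left to right without nesting
print(M @ N @ w)                              # the elements 3 and 7
```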

# Plotting with Matplotlib

`matplotlib` is an incredibly powerful Python visualization module. In a reinforcement learning context, we can use it to plot learning curves or to visualize policy functions. We actually use the `matplotlib.pyplot` submodule, which provides a convenient, MATLAB-like plotting interface and works well in Jupyter notebooks.

In [None]:
import matplotlib.pyplot as plt

We'll now tell matplotlib to "inline" plots using an ipython magic function:

In [None]:
%matplotlib inline

This is not Python syntax, so it will not work inside Python script files; it only works inside notebooks. It tells matplotlib to put plots directly into the notebook whenever we plot something, instead of opening a separate window, which is the default behavior.

Let us now start visualizing some random data. You may use this code as a template to plot a learning curve for an agent. 

Specifically, we will create a sample of 50 random numbers drawn from a [Gaussian random variable](https://en.wikipedia.org/wiki/Normal_distribution) $X \sim \mathcal{N}(\mu, \sigma^2)$ with mean $\mu = 0$ and standard deviation $\sigma = 0.1$.

In [None]:
mu, sigma = 0, 0.1
sample = np.random.normal(mu, sigma, size=50)

We can create a simple line plot using the `plt.plot()` command. Note that pyplot automatically assumes that the given data is a function of $x = [0, \dots, 49]$ because we did not provide any x-coordinates.

In [None]:
plt.plot(sample);

**Customizing your plot.** `matplotlib` lets you change almost every detail of your plot. The `plt.plot()` command that we used in the last cell actually does many things at once: 
* It creates a **figure**-object, which keeps track of all 'axes'-objects (see below) and handles general attributes of the plot such as, for example, the figure size.
* It creates one **axes**-object, which is what you think of as 'a plot', that is, the region where the data is visualized.
* It creates some essential **artist**-objects such as, for example, both the x-axis and the y-axis and their corresponding ticks.
* It uses the data to create the line, another 'artist'.
* Finally, it actually 'shows' the plot by drawing all the artists on the canvas.

We can take control of any of the steps shown above in order to modify aspects of the plot. In the following example, we create one figure with two subplots (two axes-objects) and populate these with different data.

In [None]:
# We create more data that we want to compare with the first sample.
# Note that this adds a linear trend rising from 0 to 2 to a new random sample.
sample2 = np.random.normal(mu, sigma, size=50) + np.linspace(0, 2, num=50)

# We also specify the x-coordinates
x = np.arange(0, 50)

# Create one figure with two subplots that are positioned within one column.
fig, (ax1, ax2) = plt.subplots(nrows=2, ncols=1)

# Modify the first subplot.
ax1.plot(x, sample, color = "red")
ax1.set_title("Two random line plots")
ax1.set_xlabel('Time')
ax1.set_ylabel('y')

# Modify the second subplot.
ax2.plot(x, sample, color="red", label="XYZ")
ax2.plot(x, sample2, color="blue", label="GRLC")
ax2.set_xlabel('Time')
ax2.set_ylabel('y')

plt.legend()
# Show the whole figure including both subplots
plt.show()

# Object-oriented programming in Python: Classes

Classes are the key feature of object-oriented programming. A class is a structure for representing an object and the operations that can be performed on the object.

In Python a class can contain *attributes* (variables) and *methods* (functions).

A class is defined almost like a function, but using the `class` keyword, and the class definition usually contains a number of method definitions (a method is a function defined inside a class).

* Each method should have an argument `self` as its first argument. This argument refers to the object on which the method is called.

* Some method names have special meaning, for example:

    * `__init__`: The name of the method that is invoked when the object is first created.
    * `__str__` : A method that is invoked when a simple string representation of the class is needed, as for example when printed.
    * There are many more, see http://docs.python.org/2/reference/datamodel.html#special-method-names

In [None]:
class Point:
    """
    Simple class for representing a point in a Cartesian coordinate system.
    """
    
    def __init__(self, x, y):
        """
        Create a new Point at x, y.
        """
        self.x = x
        self.y = y
        
    def translate(self, dx, dy):
        """
        Translate the point by dx and dy in the x and y direction.
        """
        self.x += dx
        self.y += dy
        
    def __str__(self):
        return "Point at [%f, %f]" % (self.x, self.y)

To create a new instance of a class:

In [None]:
p1 = Point(0, 0) # this will invoke the __init__ method in the Point class

print(p1)         # this will invoke the __str__ method

To invoke a method on the class instance `p1`:

In [None]:
p2 = Point(1, 1)

p1.translate(0.25, 1.5)

print(p1)
print(p2)

Note that calling a method can modify the state of that particular class instance, but does not affect other class instances or any global variables.

That is one of the nice things about object-oriented design: code such as functions and related variables are grouped in separate and independent entities. 

If you want to know more about OOP in Python (for example, how to use inheritance), give this [tutorial](https://python.swaroopch.com/oop.html) a try.

# Exercise: The Missionaries and Cannibals Problem

The [missionaries and cannibals](https://en.wikipedia.org/wiki/Missionaries_and_cannibals_problem) problem is a well-known toy problem in the AI literature. The problem was used by [Saul Amarel (1968)](https://web.archive.org/web/20080308224227/http://www.cc.gatech.edu/~jimmyd/summaries/amarel1968-1.html) as an example of problem representation, but versions of the game are known to be at least [1000 years old](https://en.wikipedia.org/wiki/Missionaries_and_cannibals_problem#History). The problem is also the subject of Exercise 3.9 in Russell & Norvig (2016, 3rd ed.), where it is stated as follows (p. 115).

_"Three missionaries and three cannibals are on one side of a river, along with a boat that can hold one or two people. Find a way to get everyone to the other side without ever leaving a group of missionaries in one place outnumbered by the cannibals in that place."_

<br><br>
<figure>
<img src="img/mc-search-space.png" width="600">
<figcaption>Figure 1. The complete search space of the missionaries and cannibals problem. The initial state is shown on the left, whereas the goal state is all the way to the right. Missionaries are represented by black triangles and cannibals by red circles. Arrows represent state transitions and are labelled with actions, e.g., 2c represents the action of two cannibals crossing the river. Credit: <a href="http://www.aiai.ed.ac.uk/~gwickler/missionaries.html">Gerhard Wickler</a></figcaption>
</figure>
<br><br>
Your task is to write a Python program that solves the missionaries and cannibals problem using **breadth-first search**. You can directly follow the pseudo-code of R&N (p. 82) depicted below.

<img src="img/Breadth_first_search.png" width="600">

Furthermore, you can also use the accompanying "Infrastructure for search algorithms" described in Section 3.3.1 of the same book. Specifically, you may define a `Node` class with attributes
 * `state`: the state in the state space to which the node corresponds;
 * `parent` (optional): the node in the search tree that generated this node;
 * `action` (optional): the action that was applied to the parent to generate the node;
 
and methods

 * `is_goal_state()`: check whether the node's state is the goal state;
 * `get_child_node()`: given an action, return the resulting child node;
 * `is_valid_state()`: check that the state is legal, i.e., that no missionaries would get eaten in it.
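
The `parent` and `action` attributes are what make it possible to recover a solution once a goal node has been found: you follow the `parent` links back to the root and collect the actions along the way. The following is a minimal sketch of that idea only. The class `SketchNode`, the function `actions_from_root`, and the placeholder states and actions (`"s0"`, `"a1"`, ...) are hypothetical and are not the intended solution to the exercise.

```python
class SketchNode:
    """Hypothetical stand-in for a Node class (attributes only)."""

    def __init__(self, state, parent=None, action=None):
        self.state = state
        self.parent = parent    # node that generated this node (None for the root)
        self.action = action    # action applied to the parent (None for the root)


def actions_from_root(node):
    """Follow parent links back to the root and return the actions in order."""
    actions = []
    while node.parent is not None:
        actions.append(node.action)
        node = node.parent
    # Actions were collected from the goal backwards, so reverse them.
    return list(reversed(actions))


# Placeholder states and actions, unrelated to the river problem.
root = SketchNode(state="s0")
child = SketchNode(state="s1", parent=root, action="a1")
grandchild = SketchNode(state="s2", parent=child, action="a2")

print(actions_from_root(grandchild))  # ['a1', 'a2']
```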
 
Once you have a functioning `Node` class, you need to come up with data structures for your `frontier` and your set of `explored` nodes. 
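
One possible choice, shown here only as a sketch, is a `collections.deque` for the frontier (used as a first-in-first-out queue) and a built-in `set` for the explored states. The example below runs on a small hypothetical graph, `toy_graph`, that has nothing to do with the river problem. It has no goal test, so it illustrates the two data structures rather than the full pseudo-code. The function name `bfs_order` is likewise an assumption made for illustration.

```python
from collections import deque

# Hypothetical toy graph given as adjacency lists.
toy_graph = {
    "A": ["B", "C"],
    "B": ["D"],
    "C": ["D"],
    "D": [],
}


def bfs_order(graph, start):
    """Return the nodes reachable from start in breadth-first order."""
    frontier = deque([start])   # FIFO queue: append on the right, pop on the left
    explored = set()            # fast membership tests for visited nodes
    order = []
    while frontier:
        node = frontier.popleft()
        explored.add(node)
        order.append(node)
        for child in graph[node]:
            # Skip children that were already expanded or are already queued.
            if child not in explored and child not in frontier:
                frontier.append(child)
    return order


print(bfs_order(toy_graph, "A"))  # ['A', 'B', 'C', 'D']
```

Note that a `set` can only hold hashable values, so states stored in it would need to be, for example, tuples rather than lists. The membership test on the `deque` scans the whole queue, which is fine for a search space as small as this one.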

The next choice you have to make is how to represent both states and actions. You may follow Saul Amarel, representing the current state by a simple vector $<a,b,c>$. The vector's elements represent the number of missionaries on the wrong side, the number of cannibals on the wrong side, and the number of boats on the wrong side, respectively. Since the boat and all of the missionaries and cannibals start on the wrong side, the vector is initialized to $<3,3,1>$. Actions are represented using vector subtraction/addition to manipulate the state vector. For instance, if a lone cannibal crossed the river, the vector $<0,1,1>$ would be subtracted from the state to yield $<3,2,0>$.
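
The vector arithmetic described above can be sketched with plain tuples. The helper `apply_action` is hypothetical. It assumes that an action is subtracted when the boat is on the wrong side and added when the boat is on the other side, and it deliberately does not check whether the resulting state is valid.

```python
def apply_action(state, action):
    """Hypothetical helper: apply an action vector to a state vector.

    The boat is on the wrong side when state[2] == 1. In that case people
    leave the wrong side, so the action is subtracted; otherwise it is added.
    """
    sign = -1 if state[2] == 1 else 1
    return tuple(s + sign * a for s, a in zip(state, action))


initial_state = (3, 3, 1)   # <missionaries, cannibals, boats> on the wrong side
one_cannibal = (0, 1, 1)    # a lone cannibal crosses the river

after_crossing = apply_action(initial_state, one_cannibal)
print(after_crossing)                               # (3, 2, 0)

# The same cannibal rowing back restores the initial state.
print(apply_action(after_crossing, one_cannibal))   # (3, 3, 1)
```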

You can thus (but do not have to) use the following structure and define two classes `Node` and `Game`:

In [6]:
### YOUR CODE HERE!

# class Node:
#     def __init__(self, missionaries_wrong_side, cannibals_wrong_side, boat_wrong_side, ...):
#         self.state = ...
    
#     def is_goal_state(self):
#         ...

#     def get_child_node(self, action):
#         ...

#     ...

        
# class Game:
#     def __init__(self):
#         self.initial_node = Node(missionaries_wrong_side=3, cannibals_wrong_side=3, boat_wrong_side=1)
#         ...
    
#     def breadth_first_search(self):
#         ...

If you use the provided template, you could then try to reproduce the following printed output.

In [8]:
g = Game()
goal_node = g.breadth_first_search()
print("The goal node is", goal_node)

Exploring Node <3,3,1> ...
Exploring Node <3,2,0> ...
Exploring Node <3,1,0> ...
Exploring Node <2,2,0> ...
Exploring Node <3,2,1> ...
Exploring Node <3,0,0> ...
Exploring Node <3,1,1> ...
Exploring Node <1,1,0> ...
Exploring Node <2,2,1> ...
Exploring Node <0,2,0> ...
Exploring Node <0,3,1> ...
Exploring Node <0,1,0> ...
Exploring Node <1,1,1> ...
The goal node is Node <0,0,0>


The code used to run the previous cell will be made available next week.