# 6COM1034 Concurrency - Practical
## 1st session - week starting 1st February

Congratulations on successfully starting your notebook!

This is the first of a series of notebooks in which we will practise concurrency in Python. Finally, we will CODE a little concurrent Python!

Learning outcomes:

1. Learn how to use IPython/Jupyter notebooks.
2. Know the essential libraries for concurrent programming in Python.
3. Be aware of the problems that arise from concurrent access to shared state variables without proper synchronisation.

All tutorial exercises are __self-assessed__, which means that you control your progress. But feel free to share your progress with other students and with the tutor!

## 1. Introduction to Jupyter notebooks

Notebooks are a way of interactive computing with Python that allows you to document your work as you write the code. They originated with IPython (http://ipython.org), a beefed-up version of the Python shell, and have since grown into a project of their own, Jupyter (http://jupyter.org).

University computers still have an older version of IPython installed, which is why the notebooks are called "IPython notebooks" there.

### 1.1 Using a notebook

Please take a moment to familiarise yourself with the notebook. In particular, have a look at the keyboard shortcuts in the help menu - chances are they will make your life a lot easier ;)

The notebook consists of cells of text (like this one), and cells of code (like the one below). To execute a code cell, select it with the arrow keys and press "Shift" + "Enter". 

Try it below.

In [None]:
a = 5
b = 3

The above cell has defined the variables '`a`' and '`b`'. They are now defined in the underlying "*kernel*" - this is simply an instance of Python that is running 'behind the scenes' of the notebook.

We can access these variables in the next cell:

In [None]:
c = a + b
print("c has been assigned the value {}.".format(c))

Note that the output of the `print()` statement appears below the cell when you execute it.

<hr/>

### Exercise 1

Modify `a` and `b` in the cell above and execute it. Then execute the cell that prints the variable `c`. What happens?

<br/>

<hr/>

### Exercise 2

1. Execute the cells below. What happens when you execute the second cell?
2. Please fix the error by assigning a value to the variable `c` that won't result in an error.

In [None]:
a = 1
b = 2
c = 3

In [None]:
d = (a + b)
e = d/c
print("Finally fixed it", e)

<hr/>

## 2. Libraries for concurrent programming in Python

There are two principal libraries, called _modules_, included in Python that support concurrent programming: `threading` and `multiprocessing`.

1. [threading](https://docs.python.org/2/library/threading.html)
2. [multiprocessing](https://docs.python.org/2/library/multiprocessing.html)

(Hint: You may want to open browser tabs with these links - they point to the official documentation for the modules.)

The names say it all - '`threading`' is about using threads, '`multiprocessing`' is about using multiple processes.

Let's start with the `threading` module. 



### 2.1 Using `threading` for the "Bank Account" example

We will use the `threading` module to simulate the "Bank Account" scenario from the lecture and the tutorial.

Imagine John, Paul, and Mary each having concurrent access to a shared bank account. The bank account initially has £100 in it. They each make a transaction:

 - John makes a deposit of £10
 - Paul withdraws £20
 - Mary withdraws half of the balance she sees.

Let's implement each person as a `Thread` object. The first step is to import the `threading` library.

In [None]:
import threading as th

__Note__: Jupyter notebooks provide a quick and convenient way to access the documentation of any class or function - just add a question mark `?` after its name. Please have a look at the documentation of the `Thread` class by executing the following cell.

You can close the help viewer either __by pressing the '`q`' key__, or by clicking the small __X__ in the top right of the help window.

In [None]:
th.Thread?

The following part of the documentation is of particular relevance because it tells us how to make a `Thread` object do something:

```
There are two ways to specify the activity:
by passing a callable object to the constructor, or
by overriding the run() method in a subclass.
```

We will choose the first option here, to pass a callable object to the constructor.
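
For comparison, the two options could look roughly like this. This is only a minimal sketch: the names `greet`, `GreeterThread` and `log` are made up for illustration and are not part of the tutorial code.

```python
import threading

log = []  # hypothetical list used to collect messages from both threads


def greet(name):
    """Plain function used as the callable for option 1."""
    log.append("hello from {}".format(name))


class GreeterThread(threading.Thread):
    """Option 2: a subclass that overrides run()."""

    def run(self):
        # self.name is the thread's name, set via the constructor's 'name' argument
        log.append("hello from {}".format(self.name))


# Option 1: pass a callable (and its arguments) to the constructor.
t1 = threading.Thread(target=greet, args=("option 1",))
# Option 2: instantiate the subclass; start() calls our run() in the new thread.
t2 = GreeterThread(name="option 2")

for t in (t1, t2):
    t.start()
for t in (t1, t2):
    t.join()

print(sorted(log))
```

Both variants are started and joined in exactly the same way; they differ only in where the thread's activity is defined.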

We will be using the `threaded_withdraw` function defined below, which is the `withdraw` function from the tutorial adapted for threads, to manipulate the balance. Note that calling it with a negative amount will make a deposit.

__NOTE__: Using a shared state variable between threads (`balance` in this case) without proper synchronisation is __BAD PROGRAMMING STYLE__ and should be avoided. We use it here for educational purposes only, to show you how __things go utterly wrong__ when doing this. 



In [None]:
balance = 100

def threaded_withdraw(amount):
    global balance  # lets us read and rebind the module-level 'balance' variable
    if balance >= amount:
        # 'th' is the threading module imported above; current_thread()
        # returns the Thread object that is running this function.
        print("{} sees balance {}, withdraws {}".format(th.current_thread(),
                                                        balance,
                                                        amount))
        balance = balance - amount
        return True
    return False


The important part of the `Thread` constructor is the `target` argument. It lets us define a function that the thread's `run` method will call. We can provide parameters for that function through the `args` argument of the constructor.

Each thread should call the `threaded_withdraw` function with the appropriate amount. Let's make threads for the transactions by John and Paul:

In [None]:
transaction_john = th.Thread(target=threaded_withdraw, args=(-10,), name='John')
transaction_paul = th.Thread(target=threaded_withdraw, args=(20,), name='Paul')
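
A detail that is easy to miss: `args` must be a tuple, so a single argument needs a trailing comma, as in `(20,)`. The sketch below illustrates this with a hypothetical function `record` (not part of the tutorial code); `kwargs` is the constructor's keyword-argument counterpart to `args`.

```python
import threading

received = []  # hypothetical list that collects what each thread was given


def record(amount):
    received.append(amount)


# (20,) is a one-element tuple; (20) would just be the integer 20.
by_position = threading.Thread(target=record, args=(20,))
# The same kind of call, passing the argument by keyword instead.
by_keyword = threading.Thread(target=record, kwargs={"amount": -10})

by_position.start()
by_position.join()
by_keyword.start()
by_keyword.join()

print(received)
```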

Mary's transaction is a bit more complicated, since we need to get the balance first. We'll use a custom function for this and make it the target of Mary's transaction thread.

In [None]:
def threaded_withdraw_mary():
    global balance
    amount = balance / 2.  # half of the balance Mary sees (a float)
    if balance >= amount:
        # 'th' is the threading module imported above
        print("{} sees balance {}, withdraws {}".format(th.current_thread(),
                                                        balance,
                                                        amount))
        balance = balance - amount
        return True
    return False

transaction_mary = th.Thread(target=threaded_withdraw_mary, name='Mary')  # no args, since the function computes the amount itself

Let's carry out all transactions one after the other and check the balance after each step.

In [None]:
balance = 100
print('initial balance is {}'.format(balance))
transaction_john.start()
transaction_john.join()
print('balance after John is {}'.format(balance))
transaction_paul.start()
transaction_paul.join()
print('balance after Paul is {}'.format(balance))
transaction_mary.start()
transaction_mary.join()
print('balance after Mary is {}'.format(balance))
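
The calls to `join()` are what keep the transactions in order: `join()` blocks the caller until the thread has finished. As a rough illustration, the sketch below runs three hypothetical workers (`worker`, `run` and `events` are made-up names) once with `start()`/`join()` pairs and once by starting all threads before joining any of them.

```python
import threading
import time

events = []  # records the order in which workers begin and end


def worker(label):
    events.append("begin " + label)
    time.sleep(0.2)  # pretend to do some work
    events.append("end " + label)


def run(labels, one_at_a_time):
    """Run one worker thread per label and return the recorded events."""
    del events[:]  # clear the log from any previous run
    threads = [threading.Thread(target=worker, args=(label,)) for label in labels]
    if one_at_a_time:
        for t in threads:
            t.start()
            t.join()  # wait for this thread before starting the next
    else:
        for t in threads:
            t.start()
        for t in threads:
            t.join()  # all threads are already running while we wait
    return list(events)


sequential = run(["A", "B", "C"], one_at_a_time=True)
overlapping = run(["A", "B", "C"], one_at_a_time=False)
print(sequential)
print(overlapping)
```

In the first run each worker ends before the next one begins. In the second run the workers typically all begin before any of them ends.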

<hr/>

### Exercise 3

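Change the order in which the transactions are executed. Reproduce all 6 permutations of John, Paul, Mary that we discussed in the tutorial session. You will have to create a new cell in the notebook: From the "Insert" menu, use the command "Insert Cell below" to create a new cell. Copy the code above into the new cell using copy and paste. Note that a `Thread` object can only be started once, so re-run the cells that create `transaction_john`, `transaction_paul` and `transaction_mary` (or create the threads again in your new cell) before each run.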
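
To check your results, the outcome of each order can also be worked out without threads. The sketch below is one way to do that; the helper names (`deposit_10`, `withdraw_20`, `withdraw_half`, `final_balances`) are made up for this illustration and mirror the rules of `threaded_withdraw` and `threaded_withdraw_mary`.

```python
from itertools import permutations


def deposit_10(current):
    return current + 10  # John: deposit of 10


def withdraw_20(current):
    # Paul: withdraw 20, but only if the balance covers it
    return current - 20 if current >= 20 else current


def withdraw_half(current):
    return current - current / 2.  # Mary: half of the balance she sees


people = {"John": deposit_10, "Paul": withdraw_20, "Mary": withdraw_half}

final_balances = {}
for order in permutations(people):
    current = 100  # every order starts from the same initial balance
    for name in order:
        current = people[name](current)
    final_balances[order] = current

for order, final in final_balances.items():
    print("{}: final balance {}".format(" -> ".join(order), final))
```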

<hr/>

### 2.2 Using `multiprocessing`

The `multiprocessing` library provides classes and functions to manage concurrent programming using processes that are executed in parallel. 

To use the multiprocessing library, we first have to import it:

In [None]:
import multiprocessing as mp

Please take a minute to look at the documentation of the `Process` class in the `multiprocessing` module:

In [None]:
mp.Process?

Doesn't say much, eh? But actually it says all we need: 

```The class is analogous to 'threading.Thread'.```

Therefore we can construct `Process` objects in the same way that we construct `Thread` objects, i.e. by passing a callable object to the constructor.

Unfortunately, because of the way Windows starts new processes, we can't just define the functions in the interactive notebook, as we did for the `threading` example. On Windows a child process is a fresh Python interpreter that has to import the function it is asked to run, so we must define the functions in a file that we import.
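
The way child processes are started is called the _start method_. The sketch below only queries it, assuming Python 3.4 or later (where these two functions were added); what it prints depends on your machine. On Windows the method is `'spawn'`, which starts a fresh interpreter for each child.

```python
import multiprocessing as mp

# Start methods that this platform supports, e.g. 'spawn', 'fork', 'forkserver'.
print(mp.get_all_start_methods())

# The method that Process objects will use by default on this platform.
print(mp.get_start_method())
```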

Below are the contents of the `withdraw_processes.py` file that you downloaded at the beginning of the session. Note that this is not an executable cell, but rather a text cell with syntax highlighting. In that file we define the `withdraw` functions for use with the `multiprocessing` module. Please take a minute to look at the contents:

```python
import multiprocessing as mp

def mp_withdraw(amount, balance):
    if (balance.value >= amount):
        print("{} sees balance {}, withdraws {}".format(mp.current_process(),
                                                        balance.value,
                                                        amount))
        balance.value = balance.value - amount 
        return True
    return False

def mp_withdraw_mary(balance):
    amount = balance.value//2
    if (balance.value >= amount):
        print("{} sees balance {}, withdraws {}".format(mp.current_process(),
                                                        balance.value,
                                                        amount))
        balance.value = balance.value - amount 
        return True
    return False

def make_initial_balance():
    balance = mp.Value('i', 100) # 'i' refers to the type, it's an integer
    return balance

def make_processes():
    balance = make_initial_balance()
    transaction_john = mp.Process(target=mp_withdraw, args=(-10, balance), name='John')
    transaction_paul = mp.Process(target=mp_withdraw, args=(20, balance), name='Paul')
    transaction_mary = mp.Process(target=mp_withdraw_mary, args=(balance,), name='Mary')
    return transaction_john, transaction_paul, transaction_mary, balance
```


You will have noticed that the `mp_withdraw` and `mp_withdraw_mary` functions are very similar to the threaded functions that we used above. But there are also a few important differences. First, there is an additional argument in the function signature, `balance`.

The reason is that we can't use `global` to share data between processes. Each process has its own encapsulated data and we can't just access the `balance` variable from another process. Therefore we use the `Value` class provided by the `multiprocessing` module, which provides the functionality we need. Have a second look at the `make_initial_balance` function where we create the `balance` variable as an instance of the `Value` class with type `'i'` (for integer) and value 100.
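
The type code matters in practice. As a small sketch (using only `multiprocessing.Value` itself, without starting any child process), an `'i'` value accepts integers only, which is presumably why `mp_withdraw_mary` uses integer division (`//`) rather than `/`. The variable `fractional` is a made-up name for illustration.

```python
import multiprocessing as mp

balance = mp.Value('i', 100)   # shared C integer, initial value 100
balance.value = balance.value - 20
print(balance.value)

try:
    balance.value = balance.value / 2   # true division yields a float
except TypeError as error:
    print("cannot store a float in an 'i' value:", error)

balance.value = balance.value // 2      # integer division works
print(balance.value)

# A type code of 'd' (double-precision float) would accept fractional amounts.
fractional = mp.Value('d', 100.0)
fractional.value = fractional.value / 8
print(fractional.value)
```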

The second difference is a helper function called `make_processes` that creates the processes and the balance (as an instance of `Value`) for us. This simplifies the usage of this module. 

Next, we import the `withdraw_processes` file as a module:

In [None]:
import withdraw_processes as wp

In [None]:
wp.mp_withdraw?

All we need to do now is to call the `make_processes` function in the `withdraw_processes` module, and then call the `start()` method of each process.

In [None]:
john, paul, mary, balance = wp.make_processes()

print('initial balance is {}'.format(balance.value))
john.start()
paul.start()
mary.start()
john.join()
paul.join()
mary.join()
print('final balance is {}'.format(balance.value))

<hr/>

### Exercise 4

Run the above cell at least 10 times. 
 * Is the final balance always the same? 

__Note__: On Windows, only the initial and final balance are shown in the notebook. The outputs of the processes will appear in the console window behind the notebook.

<br/>
<hr/>

### Exercise 5

Have a look at the code in the cell below. It does essentially the same as the code for exercise 4. The only difference is that calls to `start()` and `join()` are now interleaved. Run that code several times.
 * Does the final balance ever vary?
 * Do John, Paul and Mary always see the same balance?
 * Why does the balance not vary as in exercise 4?

In [None]:
john, paul, mary, balance = wp.make_processes()

print('initial balance is {}'.format(balance.value))
john.start()
john.join()
paul.start()
paul.join()
mary.start()
mary.join()
print('final balance is {}'.format(balance.value))

## 3. Wrapping it up

In this session you have learned about the two central modules that support concurrent programming in Python, `threading` and `multiprocessing`. 

### 3.1 Lesson 1: Shared state without synchronisation is problematic.
We have studied a practical example of concurrent banking transactions in which we deliberately used a shared state variable (`balance`) without proper synchronisation. The problem this causes is that the variable's final value depends on the order in which the concurrent threads (Exercise 3) or processes (Exercise 4) are executed, and the programmer has no control over that order. Therefore, __you should never write concurrent code without ensuring proper synchronisation!__
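
One common way to add synchronisation is a lock that lets only one thread at a time check and update the balance. The following is a minimal sketch, assuming `threading.Lock` as the primitive; `safe_withdraw`, `balance_lock` and `outcomes` are hypothetical names, not part of the tutorial files.

```python
import threading

balance = 100
balance_lock = threading.Lock()  # guards every read and write of 'balance'
outcomes = []                    # True/False result of each withdrawal


def safe_withdraw(amount):
    global balance
    # Holding the lock makes the check and the update one indivisible step.
    with balance_lock:
        if balance >= amount:
            balance = balance - amount
            outcomes.append(True)
        else:
            outcomes.append(False)


# Ten threads each try to withdraw 20 from an account holding 100.
threads = [threading.Thread(target=safe_withdraw, args=(20,)) for _ in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print("successful withdrawals:", outcomes.count(True))
print("final balance:", balance)
```

Whatever order the threads run in, exactly five withdrawals succeed and the balance never goes negative.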

### 3.2 Lesson 2: There are differences between `threading` and `multiprocessing` 
The `threading` module works with threads. All threads run inside the same process and share its memory, so module-level (global) variables are visible to every thread. That's why we can use the `global` statement to share variables between threads. (Did we mention that doing so without proper synchronisation is to be avoided?)

The `multiprocessing` module uses processes, each with its own memory, which means that code executed in one process cannot by default access the data of another process. Any sharing of values must happen through specific means, such as the `Value` class that we used.

### 3.3 Outlook
The `multiprocessing` and `threading` modules provide many more means to synchronise state between threads and processes. We will learn about some of those in the next session.

