#### <center>Intermediate Python and Software Engineering</center>


## <center>Section 08 - Part 03 - Global Interpreter Lock</center>


### <center>Innovation Scholars Programme</center>
### <center>King's College London, Medical Research Council and UKRI</center>

## Global Interpreter Lock

* Python uses a global lock, the GIL, so that only one thread interprets Python code at a time
* This prevents objects from being accessed by multiple threads at once; the alternative would be expensive per-object locks
* As a result, threading in Python doesn't provide much speed-up for CPU-bound Python code
* Threads are still needed for GUIs, IO-bound work (IO calls release the GIL), and concurrent calls to non-Python code (which can also release the GIL, e.g. NumPy)
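The effect of the GIL on CPU-bound code can be checked with a small timing experiment. This is a minimal sketch: the function names and the loop size are illustrative choices, not part of the course material. On a standard CPython build with the GIL, the two timings are expected to be similar, though the exact numbers depend on the machine and Python version.

```python
import threading
import time


def count_down(n):
    """Pure-Python CPU-bound loop; the running thread holds the GIL."""
    while n > 0:
        n -= 1
    return n


def time_sequential(n, repeats=2):
    """Time `repeats` calls of count_down run one after another."""
    start = time.perf_counter()
    for _ in range(repeats):
        count_down(n)
    return time.perf_counter() - start


def time_threaded(n, repeats=2):
    """Time `repeats` calls of count_down, each in its own thread."""
    threads = [threading.Thread(target=count_down, args=(n,)) for _ in range(repeats)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    n = 2_000_000  # illustrative workload size
    print(f"sequential: {time_sequential(n):.3f} s")
    print(f"threaded:   {time_threaded(n):.3f} s")
```

If the threaded version were truly running in parallel on two cores, it would take roughly half the sequential time.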

* One solution to speeding up calculations is multiprocessing
* A process differs from a thread in having its own memory address space
* Processes need to use shared memory, pipes, sockets, or other mechanisms to communicate
* Each process is a separate program with its own interpreter and its own GIL
* Locks, events, etc. have versions for processes

* Creating a process is (almost) as easy as creating a thread:

In [5]:
import multiprocessing, os


def print_mp():
    print("I am process", os.getpid())


print("Process", os.getpid(), "spawning a new process")
p = multiprocessing.Process(target=print_mp)
p.start()

Process 19329 spawning a new process
I am process 19353



* Processes don't share memory so how did the code for `print_mp` get to the new process to be executed?
* Answer: black magic
* Real answer: depending on how the process is started, Python re-imports the script file in the subprocess, serializes data and function references and sends them down a channel, or (on Linux and macOS, where available) clones the parent process with `fork()`
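Which of these mechanisms is used depends on the "start method" `multiprocessing` picks on a given platform. As a quick check, the sketch below prints the default and the available start methods; the output differs between operating systems and Python versions, so none is shown here.

```python
import multiprocessing

if __name__ == "__main__":
    # Default start method on this platform, e.g. "fork" or "spawn"
    print("default:  ", multiprocessing.get_start_method())
    # All start methods this platform supports
    print("available:", multiprocessing.get_all_start_methods())
```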

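* Above code doesn't work on Windows, where processes are spawned rather than forked: the new process can't import `print_mp` from the notebook session
* Instead write the code to a file and run it as a script: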

In [6]:
%%writefile mptest.py

import multiprocessing, os

def print_mp():
    print('I am process',os.getpid())

if __name__=='__main__': # guard: stops this block re-running when a child process imports the file
    print('Process',os.getpid(), 'spawning a new process')
    p=multiprocessing.Process(target=print_mp)
    p.start()

Overwriting mptest.py



In [7]:
!python mptest.py

Process 19357 spawning a new process
I am process 19358



* Processes synchronize using locks, events, etc., but share large amounts of data through shared memory
* Shared memory is a special region of memory that the OS allows multiple processes to access

In [8]:
%%writefile arraytest.py
import multiprocessing

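def fill_array(arr):
    arr[:]=[5,6,7]  # runs in the child; writes into the shared array

if __name__=='__main__':
    dat=multiprocessing.Array('i',[1,2,3])  # 'i': shared array of C ints
    p=multiprocessing.Process(target=fill_array,args=(dat,))
    p.start()
    p.join()  # wait for the child to finish before reading
    print(dat[:])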

Overwriting arraytest.py



In [9]:
!python arraytest.py

[5, 6, 7]
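Shared memory is not the only option. For smaller results, the pipe-style channels mentioned earlier are often simpler. The sketch below uses `multiprocessing.Queue` to send a result back from the child. The helper names `square_all` and `worker` are hypothetical, and as with the examples above it is meant to be saved to a file and run as a script.

```python
import multiprocessing


def square_all(values):
    """The actual computation, kept separate so it is easy to check."""
    return [v * v for v in values]


def worker(values, queue):
    # Runs in the child process: compute, then send the result to the parent.
    queue.put(square_all(values))


if __name__ == "__main__":
    queue = multiprocessing.Queue()
    p = multiprocessing.Process(target=worker, args=([1, 2, 3], queue))
    p.start()
    print(queue.get())  # blocks until the child has put its result: [1, 4, 9]
    p.join()
```

The result is pickled in the child and unpickled in the parent, so nothing is shared in place. That is fine for small values but costly for large arrays.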



* Easier to use advanced facilities like `Pool` and `Manager`:

In [10]:
%%writefile pooltest.py
from multiprocessing import Pool, Manager

def fill_array(index,arr):
    arr[index]=index**2

if __name__=='__main__':
    with Manager() as manager:
        arr=manager.list(range(10))
        
        with Pool(processes=2) as pool:
            dat = [(i,arr) for i in range(10)]  # one (index, shared list) pair per task
            res=pool.starmap_async(fill_array, dat)
            res.get(timeout=1)  # wait for all tasks; raises TimeoutError after 1 s
            print(arr)

Overwriting pooltest.py



In [11]:
!python pooltest.py

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]



* `Manager` is used to manage lists and other structures shared between processes
* `Pool` manages a set of worker processes and applies operations across them, such as `map` or, in this case, `starmap_async`
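When each task simply returns a value, the shared list is not needed at all: `Pool.map` collects the return values in order. The following is a minimal sketch of that simpler pattern, with `square` as an illustrative function; like the other examples it should be run as a script.

```python
from multiprocessing import Pool


def square(x):
    """Work done in a worker process for one input value."""
    return x ** 2


if __name__ == "__main__":
    with Pool(processes=2) as pool:
        # map splits the inputs among the workers and returns results in input order
        print(pool.map(square, range(10)))  # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
```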