# Concurrency

Computers have been getting faster and faster over the decades, but processor speeds are
starting to bump into the limits of physics. This means that in order to get more
work done, you need to run multiple tasks at the same time. There are several
techniques available within Python to support concurrent execution of code.

The first technique is to use threads to break up the work. The main problem with this
method is the bottleneck caused by the GIL (Global Interpreter Lock), which in the
standard CPython interpreter lets only one thread execute Python bytecode at a time.
Threads that are waiting on I/O, or that call into certain modules, such as numpy, that
release the GIL, can get around this bottleneck. If you need to do more computational
work, you may want to use processes instead. In this chapter, you will look at several of
the available options within Python.

### 10-1. Creating a Thread

You want to create a thread to do some task.
<br>
The Python standard library contains a module named threading that contains a Thread
class.
<br>
The main class, Thread, provides for running multiple functions concurrently. Listing 10-1
shows how to create and run a basic thread.
#### Listing 10-1. Creating a Thread

In [2]:
import threading
def print_sum():
    print('The sum of 1 and 2 is 3')
my_thread = threading.Thread(target=print_sum)
my_thread.start()

The sum of 1 and 2 is 3


You should note that the thread that you created won't start executing the target
function until you call the start() method. If the function is a longer-running one, you
can check to see whether it is still running by using the is_alive() method. It will return
a Boolean value telling you whether it is still running or not. If you can't continue without
the results from a given thread, you can call the join() method to force a wait until the
thread is all done.
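The life cycle described above can be sketched as follows. The function name slow_task and the shared results list are hypothetical, chosen only for illustration; the short sleep stands in for a longer-running job.

```python
import threading
import time

def slow_task(results):
    # Simulate a longer-running job, then store a result for the caller.
    time.sleep(0.2)
    results.append(1 + 2)

results = []
worker = threading.Thread(target=slow_task, args=(results,))

alive_before_start = worker.is_alive()   # start() has not been called yet
worker.start()
alive_while_running = worker.is_alive()  # slow_task is still sleeping
worker.join()                            # block until the thread finishes
alive_after_join = worker.is_alive()     # the thread is done

print(alive_before_start, alive_while_running, alive_after_join)
print(results)
```

Because threads share memory with the code that started them, the worker can hand its result back simply by appending to a list that the main code also holds.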
### 10-2. Using Locks

You need to control thread access to a particular resource.
<br>
The threading module includes a Lock class to control thread access.
<br>
Locks are used when threads need to access global resources safely. Listing 10-2 shows
how to create and use a lock object.
#### Listing 10-2. Creating a Lock Object

In [None]:
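import threading
total = 0                      # shared resource that the lock protects
my_lock = threading.Lock()
def adder():
    global total
    my_lock.acquire()          # wait until no other thread holds the lock
    total = total + 1
    my_lock.release()          # let other threads in
my_thread = threading.Thread(target=adder)
my_thread.start()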

By default, the acquire() method of the lock object blocks if the lock was already
acquired by another thread. If, instead, you want to be able to do something else while
waiting for a lock, you can use the parameter blocking=False in the acquire() method.
It will return right away, giving you a Boolean value as to whether the acquire attempt
succeeded or not.
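A minimal sketch of the non-blocking form might look like the following. The function try_update and the log list are hypothetical names used only for this illustration; the main thread deliberately holds the lock so that the first attempt fails.

```python
import threading

my_lock = threading.Lock()

def try_update(log):
    # Try to take the lock without waiting for it.
    if my_lock.acquire(blocking=False):
        try:
            log.append('got the lock')
        finally:
            my_lock.release()
    else:
        log.append('lock busy, doing something else')

log = []
my_lock.acquire()            # the main thread holds the lock
t = threading.Thread(target=try_update, args=(log,))
t.start()
t.join()                     # the worker could not acquire the lock
my_lock.release()

try_update(log)              # the lock is free now, so this attempt succeeds
print(log)
```

When you are happy to block, a lock can also be used as a context manager (with my_lock:), which releases it automatically even if the protected code raises an exception.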
### 10-3. Setting a Barrier

You need to synchronize thread activity by setting a common stop point.
<br>
The threading module includes a Barrier class that can be used to set a common
stopping point.
<br>
In many languages, using a barrier involves a simple function call, whereas in Python,
barriers are managed with an object. Listing 10-3 shows how you can create a barrier for
five threads.
#### Listing 10-3. Creating a Barrier Object

In [4]:
import threading
b = threading.Barrier(5, timeout=10)

As you can see, you have to explicitly tell the barrier object how many threads will
be using it. You can also set a timeout for the maximum amount of time that the threads
are allowed to wait for the barrier to be satisfied. To actually use the barrier object, each
thread needs to call the wait() method.
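As a sketch of how wait() is used, the following starts five threads that each record their arrival, wait at the barrier, and then check how many threads had arrived. The names worker, arrived, and after_barrier are hypothetical and exist only for this example.

```python
import threading

NUM_THREADS = 5
barrier = threading.Barrier(NUM_THREADS, timeout=10)
arrived = []
after_barrier = []

def worker(n):
    arrived.append(n)                    # phase 1: work done before the barrier
    barrier.wait()                       # block until all five threads call wait()
    after_barrier.append(len(arrived))   # phase 2: every thread has arrived

threads = [threading.Thread(target=worker, args=(i,)) for i in range(NUM_THREADS)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sorted(arrived))
print(after_barrier)
```

If the timeout expires before all of the threads reach the barrier, wait() raises threading.BrokenBarrierError, so longer-running code would normally wrap the call in a try block.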
### 10-4. Creating a Process

You need to create multiple processes for multiprocessing.
<br>
The Python standard library includes a module named multiprocessing that contains a
Process class.
<br>
If your code is being affected by the GIL, one way around it is by using the class Process
to spawn other tasks outside the main Python process. The interface is very similar to
that for threads. For example, Listing 10-4 shows how to create a new process and start it
running.
#### Listing 10-4. Creating a New Process

In [5]:
import multiprocessing
def adder(a, b):
    return a+b
proc1 = multiprocessing.Process(target=adder, args=(2,2))
proc1.start()
proc1.join()

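As you can see, your newly created process object is executed with the start() method,
and you can force the main part of your code to wait for it to finish with the join() method.
Note that the value returned by the target function is discarded; to get results back from
a process, you need one of the communication tools described in the next recipe.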
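Since the interface mirrors that of threads, you can inspect a process in much the same way. The following sketch, which reuses the adder function from Listing 10-4, checks is_alive() and the exitcode attribute before and after the process runs. The main-module guard is an addition here; it is needed on platforms that start child processes by launching a fresh interpreter.

```python
import multiprocessing

def adder(a, b):
    # Process ignores this return value; it is kept for parity with Listing 10-4.
    return a + b

if __name__ == '__main__':
    proc = multiprocessing.Process(target=adder, args=(2, 2))
    print(proc.is_alive(), proc.exitcode)   # not started: False None
    proc.start()
    proc.join()                             # wait for the child to exit
    print(proc.is_alive(), proc.exitcode)   # finished normally: False 0
```

An exit code of 0 indicates that the target function returned without raising an exception.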
### 10-5. Communicating Between Processes

You need to send information from one process object to another.
<br>
The multiprocessing module has two tools that can be used for inter-process
communication: the Pipe() function and the Queue class.
<br>
Because process objects execute outside the main part of the Python interpreter,
communicating with them, or between them, requires a bit more work. The most basic
form of communication is the pipe. Listing 10-5 shows how to create a new pipe object.
#### Listing 10-5. Creating a Pipe

In [6]:
import multiprocessing
def func(pipe_end):
    pipe_end.send(['hello', 'world'])
    pipe_end.close()
parent_end, child_end = multiprocessing.Pipe()
proc1 = multiprocessing.Process(target=func, args=(child_end,))
proc1.start()
print(parent_end.recv())
proc1.join()

['hello', 'world']


As you can see, pipes are a simple communication channel with two ends that
processes can read from and write to. Pipes are full-duplex by default, so messages can be
sent from both ends. The major problem with pipes, however, is that each end can only be
used by one process at a time. If two processes try to read from or write to the same end
at the same time, the data could be corrupted.

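To illustrate the full-duplex behavior, here is a small sketch in which the parent sends a message down the pipe and the child replies on the same connection. The function name echo_upper is hypothetical, and the main-module guard is an addition for platforms that start child processes in a fresh interpreter.

```python
import multiprocessing

def echo_upper(conn):
    # Child side: receive one message, reply on the same connection, then close.
    message = conn.recv()
    conn.send(message.upper())
    conn.close()

if __name__ == '__main__':
    parent_end, child_end = multiprocessing.Pipe()   # duplex=True by default
    proc = multiprocessing.Process(target=echo_upper, args=(child_end,))
    proc.start()
    parent_end.send('hello')      # parent -> child
    print(parent_end.recv())      # child -> parent, prints HELLO
    proc.join()
```

If you only need one-way traffic, Pipe(duplex=False) returns a receive-only end and a send-only end instead.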
A different technique is to use a queue object to communicate with. A queue is a
FIFO (First In, First Out) object. It can accept data from multiple processes, and multiple
processes can pull data off the queue. Listing 10-6 shows how you can create and use a
queue object.
#### Listing 10-6. Creating a Queue

In [7]:
import multiprocessing
def func(queue1):
    queue1.put(['hello', 'world'])
my_queue = multiprocessing.Queue()
proc1 = multiprocessing.Process(target=func, args=(my_queue,))
proc1.start()
print(my_queue.get())
proc1.join()

['hello', 'world']
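
Listing 10-6 uses a single producer. As a sketch of the multiple-producer case, the following starts three processes that all write to the same queue. The producer function and its worker_id argument are hypothetical names for this example, and the main-module guard is an addition for platforms that start child processes in a fresh interpreter.

```python
import multiprocessing

def producer(queue1, worker_id):
    # Each worker puts one tagged message on the shared queue.
    queue1.put((worker_id, worker_id * worker_id))

if __name__ == '__main__':
    my_queue = multiprocessing.Queue()
    procs = [multiprocessing.Process(target=producer, args=(my_queue, i))
             for i in range(3)]
    for p in procs:
        p.start()
    # Drain the queue before joining. Arrival order is not guaranteed,
    # so the results are sorted for display.
    results = sorted(my_queue.get() for _ in procs)
    for p in procs:
        p.join()
    print(results)
```

Reading the items before calling join() matters: a process that has put data on a queue may not exit until that data has been consumed.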


### 10-6. Creating a Pool of Workers


You need to start up and run a pool of processes.
<br>
The Python standard library module, multiprocessing, contains a Pool class to manage
a task queue.
<br>
When you have a whole series of tasks that need to be handled, you can create a pool of
processes to work through those tasks. Listing 10-7 shows how to create a pool of four
worker processes and have them work on a stack of tasks.
#### Listing 10-7. Creating a Pool of Processes

In [9]:
import multiprocessing
def adder(a):
    return a+a
pool1 = multiprocessing.Pool(processes=4)

This newly created pool object has several different methods to divide the tasks
among the processes. There are both blocking and non-blocking versions of these
methods. For example, Listing 10-8 shows how to use the map method.
#### Listing 10-8. Mapping a Function to a Pool of Processes

In [10]:
# This method blocks until all processes return
pool1.map(adder, range(10))

[0, 2, 4, 6, 8, 10, 12, 14, 16, 18]

In [11]:
# This method returns a result object
results = pool1.map_async(adder, range(10))
results.wait()

The last line in Listing 10-8 blocks until all of the results are ready.
You can use it when you are ready to use all of the results from the farmed-out tasks,
and then call results.get() to retrieve them as a list.
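
The non-blocking methods can be sketched as follows, assuming the same adder function as in Listing 10-7. The with block and the timeout values are choices made for this example rather than requirements, and the main-module guard is an addition for platforms that start child processes in a fresh interpreter.

```python
import multiprocessing

def adder(a):
    return a + a

if __name__ == '__main__':
    # Leaving the with block terminates the pool, so fetch results inside it.
    with multiprocessing.Pool(processes=4) as pool:
        async_result = pool.map_async(adder, range(10))
        doubled = async_result.get(timeout=30)    # blocks, then returns the list
        single = pool.apply_async(adder, (21,)).get(timeout=30)
    print(doubled)
    print(single)
```

Here map_async() farms out a whole iterable, whereas apply_async() submits a single call; both return a result object straight away, and get() waits for the value.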

### 10-7. Creating a Subprocess

You need to spawn a subprocess to handle an external task.
<br>
The Python standard library contains a module named subprocess that can spawn
external processes.
<br>
The subprocess module is intended to replace the older os.system and os.spawn*
functions for running external processes. The module contains a function named run() that
is the usual way of using subprocesses. For example, Listing 10-9 shows how you can
get a list of the files in the current directory on either a Linux box or a macOS machine.
#### Listing 10-9. Spawning a Subprocess

In [12]:
import subprocess
results = subprocess.run(['ls', '-l'], stdout=subprocess.PIPE)
print(results.stdout)

b'total 216\n-rw-rw-r-- 1 sara sara 14336 Apr 29 20:02 10. Concurrency.ipynb\n-rw-rw-r-- 1 sara sara  9856 Apr 29 19:54 11. C and Other Extensions.ipynb\n-rw-rw-r-- 1 sara sara 15163 Apr 21 22:24 1. Strings and Texts.ipynb\n-rw-rw-r-- 1 sara sara 11092 Apr 21 22:55 2. Numbers, Dates, and Times.ipynb\n-rw-rw-r-- 1 sara sara 17781 Apr 22 10:01 3. Iterators and Generators.ipynb\n-rw-rw-r-- 1 sara sara 19906 Apr 22 11:11 4. Files and IO.ipynb\n-rw-rw-r-- 1 sara sara 38590 Apr 22 12:43 5. Python Data Analysis with pandas.ipynb\n-rw-rw-r-- 1 sara sara 15322 Apr 22 13:15 6. Functions.ipynb\n-rw-rw-r-- 1 sara sara 18356 Apr 29 17:56 7. Classes and Objects.ipynb\n-rw-rw-r-- 1 sara sara 11598 Apr 29 18:34 8. Metaprogramming.ipynb\n-rw-rw-r-- 1 sara sara 28714 Apr 29 19:31 9. Numerics and Numpy.ipynb\n'


The returned results object contains a lot of information about how the external
process ran, including the exit code.
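As a sketch of what the results object holds, the following runs the current Python interpreter as the external program (an assumption made so that the example does not depend on ls being installed) and inspects the exit code and the decoded output.

```python
import subprocess
import sys

completed = subprocess.run(
    [sys.executable, '-c', 'print("hello from a subprocess")'],
    capture_output=True,   # collect stdout and stderr (Python 3.7+)
    text=True,             # decode the output from bytes to str
    check=True,            # raise CalledProcessError on a nonzero exit code
)
print(completed.returncode)
print(completed.stdout.strip())
print(completed.args[1:])
```

With text=True, stdout is a regular string rather than the bytes object shown in the output of Listing 10-9.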
### 10-8. Scheduling Events

You need to schedule tasks for some later time.
<br>
The Python standard library contains a module named sched that has several objects and
methods that are useful in scheduling work at different times.
<br>
To schedule future tasks, you can create a scheduler object that can take a queue of events
and manage them. You can use the enter() method to add events to the queue, as in
Listing 10-10.
#### Listing 10-10. Creating a Scheduler

In [13]:
import sched, time
def print_time():
    print(time.time())
my_sched = sched.scheduler()
my_sched.enter(10, 1, print_time)

Event(time=12741.710048027, priority=1, action=<function print_time at 0x7f3ce8487ae8>, argument=(), kwargs={})

The enter() method takes a delay, a priority, and a function to execute when the
event triggers. If you want to have an event trigger at a specific time instead, you can use the
enterabs() method. Once you have a larger list of events, you can always check the current
queue with the scheduler's queue attribute. Scheduled events do not fire on their own: when
you call the run() method, your code blocks and executes each event as it comes due,
returning once the queue is empty.
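
Putting these methods together, a minimal sketch might look like this. The record function and the fired list are hypothetical names, and the very short delays are an assumption made only so that the example finishes quickly.

```python
import sched
import time

fired = []

def record(label):
    fired.append(label)

my_sched = sched.scheduler(time.monotonic, time.sleep)
now = time.monotonic()

# Short delays keep the sketch quick; real schedules would use longer ones.
my_sched.enter(0.02, 1, record, argument=('second',))          # relative delay
my_sched.enterabs(now + 0.01, 1, record, argument=('first',))  # absolute time
cancelled = my_sched.enter(0.03, 1, record, argument=('never',))
my_sched.cancel(cancelled)      # remove an event before it fires

pending = len(my_sched.queue)   # two events are still waiting
my_sched.run()                  # blocks until the queue is empty
print(pending, fired)
```

The time passed to enterabs() must be on the same clock as the scheduler's time function, which is why the example reads time.monotonic() rather than time.time().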