# Speed Up Your Python Program With Concurrency

The dictionary definition of concurrency is simultaneous occurrence. In Python, the things that are occurring simultaneously are called by different names (thread, task, process) but at a high level, they all refer to a sequence of instructions that run in order.

<https://realpython.com/python-concurrency/>

## 001. Synchronous Version

Let’s start by focusing on I/O-bound programs and a common problem: downloading content over the network. For our example, you will be downloading web pages from a few sites, but it really could be any network traffic. It’s just easier to visualize and set up with web pages.

In [1]:
import sys
from pathlib import Path

current_dir = Path().resolve()
while current_dir != current_dir.parent and current_dir.name != "katas":
    current_dir = current_dir.parent
if current_dir != current_dir.parent:
    sys.path.append(current_dir.as_posix())

In [2]:
from IPython.core.interactiveshell import InteractiveShell

InteractiveShell.ast_node_interactivity = "all"

### 001.001 IO intensive, Non-concurrent

We’ll start with a non-concurrent version of this task. Note that this program requires the `requests` module. You should run `pip install requests` before running it, probably using a virtualenv. This version does not use concurrency at all:


1. Create a function `download_site`
    1. it is passed a URL and a session
    1. issues a GET request to the URL
    1. prints a dot and a space without a newline at the end
1. Create a function `download_all_sites`
    1. which creates a session
    1. loops through sites
    1. passes each URL and the session to `download_site`
1. Profile it with a simple `time` call before and after the `download_all_sites` call
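
The steps above can be sketched without touching the network. This is a minimal illustration rather than the exercise solution: `FakeSession` is a hypothetical stand-in for `requests.Session`, and its `get` method only sleeps briefly to imitate network latency.

```python
import time


class FakeSession:
    """Hypothetical stand-in for requests.Session; get() only sleeps."""

    def get(self, url):
        time.sleep(0.01)  # pretend to wait for the network
        return f"<html>{url}</html>"


def download_site(url, session):
    body = session.get(url)
    print(". ", end="", flush=True)  # a dot and a space, no newline
    return len(body)


def download_all_sites(sites):
    session = FakeSession()  # one session reused for every request
    return [download_site(url, session) for url in sites]


sites = [
    "https://www.jython.org",
    "http://olympus.realpython.org/dice",
] * 5

start_time = time.time()
sizes = download_all_sites(sites)
duration = time.time() - start_time
print(f"\nDownloaded {len(sizes)} sites in {duration} seconds")
```

Because every call waits for the previous one to finish, the total time grows linearly with the number of sites.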


In [3]:
import requests
import time

sites = [
        "https://www.jython.org",
        "http://olympus.realpython.org/dice",
    ] * 80

1
def download_site(url, session):
    ...
 
2   
def download_all_sites(sites):
    ...

3
...
duration = 0
print(f"\nDownloaded {len(sites)} sites in {duration} seconds")    
# solution


1

2

3

Ellipsis


Downloaded 160 sites in 0 seconds


1

2

3

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 
Downloaded 160 sites in 12.296099185943604 seconds


### 001.002 IO intensive, Threads

When you add threading, the overall structure stays the same and you only need to make a few changes. `download_all_sites()` changes from calling the function once per site to a more complex structure.

You will use 5 threads.

1. Each thread needs to create its own `requests.Session()` object. How many thread-local storage objects are needed so that each thread can create its own Session object?
    1. Create them
    1. Create a function `get_session` that creates a `requests.Session` object specific to the current thread
1. Create a function `download_site`
    1. This time don't pass the session, but get it from a `get_session` call
    1. Print a message showing the number of bytes read
    1. Everything else as in the previous exercise
1. Create a function `download_all_sites`
    1. which uses a `ThreadPoolExecutor` with 5 workers to call `download_site`
1. Profile it the same way as the previous example
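
One way to read these steps, sketched offline: a single `threading.local()` object is enough, because each thread sees its own attributes on it. `FakeSession` is again a hypothetical stand-in for `requests.Session`, and returning `id(session)` is only there to make the per-thread sessions visible.

```python
import concurrent.futures
import threading
import time

# One object is enough: attributes set on it are private to each thread.
thread_local = threading.local()


class FakeSession:
    """Hypothetical stand-in for requests.Session; get() only sleeps."""

    def get(self, url):
        time.sleep(0.01)  # pretend to wait for the network
        return f"<html>{url}</html>"


def get_session():
    # Create the session lazily, the first time this thread asks for it.
    if not hasattr(thread_local, "session"):
        thread_local.session = FakeSession()
    return thread_local.session


def download_site(url):
    session = get_session()
    body = session.get(url)
    return id(session), len(body)


def download_all_sites(sites):
    with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:
        return list(executor.map(download_site, sites))


sites = [
    "https://www.jython.org",
    "http://olympus.realpython.org/dice",
] * 10

results = download_all_sites(sites)
session_ids = {session_id for session_id, _ in results}
print(f"{len(results)} downloads used {len(session_ids)} session(s)")
```

The number of sessions reported never exceeds the number of workers, since each worker thread creates at most one.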


In [4]:
import concurrent.futures
import requests
import threading
import time

sites = [
    "https://www.jython.org",
    "http://olympus.realpython.org/dice",
] * 80

2


def download_site(url):
    ...
    print(f"Read {??} bytes" + f" from {url}")


3


def download_all_sites(sites):
    ...


5
start_time = time.time()
download_all_sites(sites)
duration = time.time() - start_time
print(f"\nDownloaded {len(sites)} in {duration} seconds")

# solution


SyntaxError: f-string: invalid syntax (2220576202.py, line 16)

### 001.003 IO intensive, Asyncio

One of the cool advantages of asyncio is that it scales far better than threading. Each task takes far fewer resources and less time to create than a thread, so creating and running more of them works well. This example just creates a separate task for each site to download, which works out quite well.

1. `download_site` is passed an aiohttp `ClientSession` object
    1. Q: Why can the tasks share one session when the threads could not?
    1. Everything else as in the previous exercise
1. Create a function `download_all_sites`
    1. which creates a context for the `ClientSession`, and passes it to calls of `download_site`
    1. for each URL, create a task with `ensure_future`
    1. Q: When do tasks start?
    1. Conclude by gathering the tasks
1. Profile it the same way as the previous example
    1. Replace the ellipsis with asyncio code: get the event loop and run all tasks in it until completion
    1. Before Python 3.7 you needed `asyncio.get_event_loop().run_until_complete(...)`, but now you can just use...
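
The task-per-URL pattern can be sketched with the standard library alone. This is an illustration under assumptions, not the aiohttp solution: `asyncio.sleep` stands in for awaiting a network response, and the `example.com` URLs and the `delay` parameter are hypothetical. It is written as a standalone script; inside a notebook, `asyncio.run` needs the `nest_asyncio` patch applied in the cell below.

```python
import asyncio
import time


async def download_site(url, delay=0.05):
    await asyncio.sleep(delay)  # stand-in for awaiting a network response
    return f"Read from {url}"


async def download_all_sites(sites):
    # ensure_future schedules each coroutine as a task on the running loop.
    # The tasks only make progress once this coroutine yields at an await.
    tasks = [asyncio.ensure_future(download_site(url)) for url in sites]
    # gather waits for all tasks and returns results in submission order.
    return await asyncio.gather(*tasks)


sites = [f"https://example.com/{i}" for i in range(20)]

start_time = time.time()
results = asyncio.run(download_all_sites(sites))
duration = time.time() - start_time
print(f"Downloaded {len(results)} in {duration} seconds")
```

All twenty waits overlap on a single thread, so the total time stays close to one delay instead of twenty.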


In [None]:
import asyncio
import time
import aiohttp

# This is needed to make asyncio run inside notebooks without raising
# "RuntimeError: This event loop is already running".
import nest_asyncio
nest_asyncio.apply()

sites = [
        "https://www.jython.org",
        "http://olympus.realpython.org/dice",
    ] * 80

1
def download_site(url, session):
    ...

3
def download_all_sites(sites):
    tasks = []
    ...
    
5
start_time = time.time()
...
duration = time.time() - start_time
print(f"\nDownloaded {len(sites)} in {duration} seconds")

# solution


1

3

5

Ellipsis


Downloaded 160 in 0.001135110855102539 seconds


1

Read 3721 bytes from https://www.jython.org
Read 3721 bytes from https://www.jython.org
Read 3721 bytes from https://www.jython.org
Read 3721 bytes from https://www.jython.org
Read 3721 bytes from https://www.jython.org
Read 3721 bytes from https://www.jython.org
Read 3721 bytes from https://www.jython.org
Read 3721 bytes from https://www.jython.org
Read 3721 bytes from https://www.jython.org
Read 3721 bytes from https://www.jython.org
Read 3721 bytes from https://www.jython.org
Read 3721 bytes from https://www.jython.org
Read 3721 bytes from https://www.jython.org
Read 3721 bytes from https://www.jython.org
Read 3721 bytes from https://www.jython.org
Read 3721 bytes from https://www.jython.org
Read 3721 bytes from https://www.jython.org
Read 3721 bytes from https://www.jython.org
Read 3721 bytes from https://www.jython.org
Read 3721 bytes from https://www.jython.org
Read 3721 bytes from https://www.jython.org
Read 3721 bytes from https://www.jython.org
Read 3721 bytes from https://www

### 001.004 IO intensive, Multiprocessing

Unlike the previous approaches, the multiprocessing version of the code takes full advantage of the multiple CPUs that your cool, new computer has. 

1. Note that you have to run this from a Python file. I don't know how to run it inside Jupyter
1. Set up a `requests.Session` for each process by creating a `set_global_session` function
    1. Q: Will all the processes share the same `requests.Session`?
    1. A singleton
1. Create a function `download_all_sites`
    1. which creates a pool of processes and passes `set_global_session` as the pool's initializer, so it runs once when each process starts
    1. Then runs `download_site` on each member of `sites`
1. Slight difference with `download_site`: it should also print the process's name
    1. Uses the global `session`
1. Profile it the same way as the previous example
    1. This time guard the timing code with `if __name__ == "__main__":` because you are in a self-contained file


In [None]:
import requests
import multiprocessing
import time

sites = [
        "https://www.jython.org",
        "http://olympus.realpython.org/dice",
    ] # * 80


# solution


In [None]:
%%bash
python 001_python_real_python_4.py


Downloaded 160 in 0.0 seconds


### 001.005 CPU intensive, synchronous

For the purposes of our example, we’ll use a somewhat silly function to create something that takes a long time to run on the CPU. This code calls cpu_bound() 20 times with a different large number each time. It does all of this on a single thread in a single process on a single CPU. 

1. Create `find_sums` which runs cpu_bound sequentially on numbers
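
To make the workload concrete, here is a small sketch with much smaller inputs than the exercise uses. Returning the totals from `find_sums` is a choice made here so the results can be inspected; the exercise only requires that the calls run one after another.

```python
import time


def cpu_bound(number):
    # Sum of squares 0^2 + 1^2 + ... + (number - 1)^2
    return sum(i * i for i in range(number))


def find_sums(numbers):
    # Plain sequential loop: one call finishes before the next begins.
    return [cpu_bound(number) for number in numbers]


print(cpu_bound(4))  # 0 + 1 + 4 + 9 = 14

numbers = [100_000 + x for x in range(5)]
start_time = time.time()
totals = find_sums(numbers)
duration = time.time() - start_time
print(f"Computed {len(totals)} sums in {duration} seconds")
```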

In [None]:
import time

1
def cpu_bound(number):
    return sum(i * i for i in range(number))

2
def find_sums(numbers):
    ...

3
numbers = [5_000_000 + x for x in range(20)]
start_time = time.time()
find_sums(numbers)
duration = time.time() - start_time
print(f"Duration {duration} seconds")

# solution


1

2

3

Duration 2.384185791015625e-05 seconds


2

3

Duration 4.716731071472168 seconds


### 001.006 CPU intensive, threading

In your I/O-bound example above, much of the overall time was spent waiting for slow operations to finish. threading and asyncio sped this up by allowing you to overlap the times you were waiting instead of doing them sequentially. On a CPU-bound problem, however, there is no waiting. The CPU is cranking away as fast as it can to finish the problem.

1. This time `find_sums` should use a `ThreadPoolExecutor` with 5 workers
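
A minimal sketch of the threaded variant, with smaller inputs than the exercise. On a standard CPython build with the GIL, only one thread executes Python bytecode at a time, so this version returns the same results as the sequential loop without a meaningful speedup.

```python
import concurrent.futures


def cpu_bound(number):
    return sum(i * i for i in range(number))


def find_sums(numbers):
    # executor.map returns results in the same order as the inputs.
    with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:
        return list(executor.map(cpu_bound, numbers))


numbers = [200_000 + x for x in range(10)]
threaded = find_sums(numbers)
sequential = [cpu_bound(number) for number in numbers]
print(threaded == sequential)
```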

In [5]:
import concurrent.futures
import time


1
def cpu_bound(number):
    return sum(i * i for i in range(number))

2
def find_sums(numbers):
    ...


# 3
# numbers = [5_000_000 + x for x in range(20)]
# start_time = time.time()
# find_sums(numbers)
# duration = time.time() - start_time
# print(f"Duration {duration} seconds")

# solution


1

2

1

2

3

Duration 4.705935955047607 seconds


### 001.007 CPU intensive, multiprocessing

In your I/O-bound example above, much of the overall time was spent waiting for slow operations to finish. threading and asyncio sped this up by allowing you to overlap the times you were waiting instead of doing them sequentially. On a CPU-bound problem, however, there is no waiting. The CPU is cranking away as fast as it can to finish the problem.

1. This time `find_sums` should use a `multiprocessing.Pool`
    1. Also note that the timing code runs under `if __name__ == "__main__":`

In [None]:
import multiprocessing
import time


# solution 

# 1
# def cpu_bound(number):
#     return sum(i * i for i in range(number))

# 2
# def find_sums(numbers):
#     with multiprocessing.Pool() as pool:
#         pool.map(cpu_bound, numbers)

# 3
# if __name__ == "__main__":
#     numbers = [5_000_000 + x for x in range(20)]
#     start_time = time.time()
#     find_sums(numbers)
#     duration = time.time() - start_time
#     print(f"Duration {duration} seconds")

1

In [None]:
%%bash
python 001_python_real_python_7.py

Duration 1.1920928955078125e-06 seconds
