# What Is Concurrency?

The dictionary definition of concurrency is simultaneous occurrence. In Python, the things that are occurring simultaneously are called by different names (thread, task, process) but at a high level, they all refer to a sequence of instructions that run in order.

I like to think of them as different trains of thought. Each one can be stopped at certain points, and the CPU or brain that is processing them can switch to a different one. The state of each one is saved so it can be restarted right where it was interrupted.

You might wonder why Python uses different words for the same concept. It turns out that threads, tasks, and processes are only the same if you view them from a high level. Once you start digging into the details, they all represent slightly different things. You’ll see more of how they are different as you progress through the examples.

Now let’s talk about the simultaneous part of that definition. You have to be a little careful because, when you get down to the details, only multiprocessing actually runs these trains of thought at literally the same time. Threading and asyncio both run on a single processor and therefore only run one at a time. They just cleverly find ways to take turns to speed up the overall process. Even though they don’t run different trains of thought simultaneously, we still call this concurrency.

The way the threads or tasks take turns is the big difference between threading and asyncio. In threading, the operating system actually knows about each thread and can interrupt it at any time to start running a different thread. This is called pre-emptive multitasking since the operating system can pre-empt your thread to make the switch.

Pre-emptive multitasking is handy in that the code in the thread doesn’t need to do anything to make the switch. It can also be difficult because of that “at any time” phrase. This switch can happen in the middle of a single Python statement, even a trivial one like x = x + 1.
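
To see why even a one-line statement is not atomic, you can disassemble it. The following is a minimal sketch using the standard library's dis module. The counter variable and the increment() function are made up for illustration, and the exact opcode names differ between Python versions.

```python
import dis

counter = 0  # hypothetical shared state, for illustration only


def increment():
    """Add one to the shared counter: one statement, several bytecode steps."""
    global counter
    counter = counter + 1


# Collect the names of the bytecode instructions the function compiles to.
opnames = [instruction.opname for instruction in dis.get_instructions(increment)]
print(opnames)
```

Loading counter and storing the new value back are separate instructions, so a thread switch can land between them and another thread can change counter in the meantime.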

Asyncio, on the other hand, uses cooperative multitasking. The tasks must cooperate by announcing when they are ready to be switched out. That means that the code in the task has to change slightly to make this happen.

The benefit of doing this extra work up front is that you always know where your task will be swapped out. It will not be swapped out in the middle of a Python statement unless that statement is marked. You’ll see later how this can simplify parts of your design.
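
As a small illustration of cooperative switching, here is a sketch with two tasks that record what they do. The worker() coroutine and the events list are hypothetical names. The only point at which a task can be switched out is the marked await line.

```python
import asyncio

events = []  # records the order in which the tasks run


async def worker(name):
    events.append(f"{name}:start")
    # The await below is the only place this task can be switched out.
    await asyncio.sleep(0)
    events.append(f"{name}:end")


async def main():
    # Run both workers as tasks on the same event loop.
    await asyncio.gather(worker("a"), worker("b"))


# In a notebook, where a loop is already running, use `await main()` instead.
asyncio.run(main())
print(events)
```

This prints ['a:start', 'b:start', 'a:end', 'b:end']: each task runs uninterrupted until it reaches its await, and only then does the other task get a turn.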

# What Is Parallelism?

So far, you’ve looked at concurrency that happens on a single processor. What about all of those CPU cores your cool, new laptop has? How can you make use of them? multiprocessing is the answer.

With multiprocessing, Python creates new processes. A process here can be thought of as almost a completely different program, though technically it is usually defined as a collection of resources such as memory and file handles. One way to think about it is that each process runs in its own Python interpreter.

Because they are different processes, each of your trains of thought in a multiprocessing program can run on a different core. Running on a different core means that they actually can run at the same time, which is fabulous. There are some complications that arise from doing this, but Python does a pretty good job of smoothing them over most of the time.
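
One way to see that each process has its own interpreter and its own memory is to change a module-level variable inside worker processes and then look at it from the parent. This is a rough sketch: counter, bump_and_report(), and run_workers() are hypothetical names, and the choice of two workers is arbitrary.

```python
import multiprocessing
import os

counter = 0  # module-level state; every process works on its own copy


def bump_and_report(_):
    """Increment this process's copy of counter and report who did it."""
    global counter
    counter += 1
    return os.getpid(), counter


def run_workers():
    # Two worker processes share the four jobs between them.
    with multiprocessing.Pool(processes=2) as pool:
        return pool.map(bump_and_report, range(4))


if __name__ == "__main__":
    for pid, value in run_workers():
        print(f"worker {pid} sees counter = {value}")
    print(f"parent {os.getpid()} still sees counter = {counter}")
```

The workers report process IDs that differ from the parent's, and the parent's counter stays at 0 because the increments happened in the workers' own memory.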

Now that you have an idea of what concurrency and parallelism are, let’s review their differences, and then we can look at why they can be useful:


Concurrency Type | Switching Decision | Number of Processors
-----------------|--------------------|---------------------
Pre-emptive multitasking (threading) | The operating system decides when to switch tasks, external to Python. | 1
Cooperative multitasking (asyncio) | The tasks decide when to give up control. | 1
Multiprocessing (multiprocessing) | The processes all run at the same time on different processors. | Many

## When Is Concurrency Useful?

Concurrency can make a big difference for two types of problems. These are generally called CPU-bound and I/O-bound.

I/O-bound problems cause your program to slow down because it frequently must wait for input/output (I/O) from some external resource. They arise frequently when your program is working with things that are much slower than your CPU.

![IOBound](images/IOBound.webp "Title")

On the flip side, there are classes of programs that do significant computation without talking to the network or accessing a file. These are the CPU-bound programs, because the resource limiting the speed of your program is the CPU, not the network or the file system.

![CPUBound](images/CPUBound.webp "Title")

I/O-Bound Process | CPU-Bound Process
------------------|------------------
Your program spends most of its time talking to a slow device, like a network connection, a hard drive, or a printer. | Your program spends most of its time doing CPU operations.
Speeding it up involves overlapping the times spent waiting for these devices. | Speeding it up involves finding ways to do more computations in the same amount of time.
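
To make "overlapping the times spent waiting" concrete, here is a minimal sketch in which time.sleep() stands in for a slow device. The fake_io(), run_sequential(), and run_overlapped() helpers and the 0.2 second delay are assumptions made for illustration, not measurements of a real device.

```python
import concurrent.futures
import time


def fake_io(seconds):
    """Stand-in for a slow device: the calling thread just waits."""
    time.sleep(seconds)
    return seconds


def run_sequential(delays):
    # Each wait starts only after the previous one has finished.
    start = time.perf_counter()
    for delay in delays:
        fake_io(delay)
    return time.perf_counter() - start


def run_overlapped(delays):
    # One thread per wait, so all of the waits happen at the same time.
    start = time.perf_counter()
    with concurrent.futures.ThreadPoolExecutor(max_workers=len(delays)) as executor:
        list(executor.map(fake_io, delays))
    return time.perf_counter() - start


delays = [0.2] * 5
sequential = run_sequential(delays)
overlapped = run_overlapped(delays)
print(f"sequential: {sequential:.2f}s, overlapped: {overlapped:.2f}s")
```

The sequential run takes roughly the sum of the delays, while the overlapped run takes roughly as long as the single longest delay.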


## How to Speed Up an I/O-Bound Program

### Synchronous Version

```python
import requests
import time


def download_site(url, session):
    # Fetch one URL and report how many bytes came back.
    with session.get(url) as response:
        print(f"Read {len(response.content)} from {url}")


def download_all_sites(sites):
    # One Session is reused for every request, one site after another.
    with requests.Session() as session:
        for url in sites:
            download_site(url, session)


if __name__ == "__main__":
    sites = [
        "https://www.jython.org",
        "http://olympus.realpython.org/dice",
    ] * 80
    start_time = time.time()
    download_all_sites(sites)
    duration = time.time() - start_time
    print(f"Downloaded {len(sites)} in {duration} seconds")
```

```
Read 10282 from https://www.jython.org
Read 275 from http://olympus.realpython.org/dice
Read 10282 from https://www.jython.org
Read 275 from http://olympus.realpython.org/dice
Read 10282 from https://www.jython.org
Read 275 from http://olympus.realpython.org/dice
Read 10282 from https://www.jython.org
Read 275 from http://olympus.realpython.org/dice
Read 10282 from https://www.jython.org
Read 275 from http://olympus.realpython.org/dice
Read 10282 from https://www.jython.org
Read 275 from http://olympus.realpython.org/dice
Read 10282 from https://www.jython.org
Read 275 from http://olympus.realpython.org/dice
Read 10282 from https://www.jython.org
Read 275 from http://olympus.realpython.org/dice
Read 10282 from https://www.jython.org
Read 275 from http://olympus.realpython.org/dice
Read 10282 from https://www.jython.org
Read 275 from http://olympus.realpython.org/dice
Read 10282 from https://www.jython.org
Read 275 from http://olympus.realpython.org/dice
Read 10282 from https://www.jyth
```

```
Downloaded 160 in 15.743239879608154 seconds
```

### **threading** Version

```python
import concurrent.futures
import requests
import threading
import time


# Each thread stores its own Session here, so sessions are never shared.
thread_local = threading.local()


def get_session():
    # Create the Session lazily, the first time a given thread asks for it.
    if not hasattr(thread_local, "session"):
        thread_local.session = requests.Session()
    return thread_local.session


def download_site(url):
    session = get_session()
    with session.get(url) as response:
        print(f"{threading.current_thread().name}:Read {len(response.content)} from {url}")


def download_all_sites(sites):
    # The pool runs download_site on up to five threads at once. Leaving the
    # with block waits for all of the downloads to finish.
    with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:
        executor.map(download_site, sites)


if __name__ == "__main__":
    sites = [
        "https://www.jython.org",
        "http://olympus.realpython.org/dice",
    ] * 80
    start_time = time.time()
    download_all_sites(sites)
    duration = time.time() - start_time
    print(f"Downloaded {len(sites)} in {duration} seconds")
```

```
ThreadPoolExecutor-2_2:Read 10282 from https://www.jython.org
ThreadPoolExecutor-2_4:Read 10282 from https://www.jython.org
ThreadPoolExecutor-2_0:Read 10282 from https://www.jython.org
ThreadPoolExecutor-2_4:Read 10282 from https://www.jython.org
ThreadPoolExecutor-2_1:Read 275 from http://olympus.realpython.org/dice
ThreadPoolExecutor-2_3:Read 275 from http://olympus.realpython.org/dice
ThreadPoolExecutor-2_4:Read 10282 from https://www.jython.org
ThreadPoolExecutor-2_1:Read 275 from http://olympus.realpython.org/dice
ThreadPoolExecutor-2_2:Read 275 from http://olympus.realpython.org/dice
ThreadPoolExecutor-2_0:Read 275 from http://olympus.realpython.org/dice
ThreadPoolExecutor-2_3:Read 10282 from https://www.jython.org
ThreadPoolExecutor-2_0:Read 10282 from https://www.jython.org
ThreadPoolExecutor-2_0:Read 10282 from https://www.jython.org
ThreadPoolExecutor-2_4:Read 275 from http://olympus.realpython.org/dice
ThreadPoolExecutor-2_1:Read 10282 from https://www.jython.org
ThreadPool
```

```
Downloaded 160 in 3.4748218059539795 seconds
```

![Diagram](images/Threading.webp "Threading")

#### The Problems with the threading Version

Threads can interact in ways that are subtle and hard to detect. These interactions can cause race conditions that frequently result in random, intermittent bugs that can be quite difficult to find.
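
A classic example is several threads updating the same value. The sketch below is illustrative only: the Counter class, hammer(), and run() are hypothetical names, and whether the unprotected version actually loses updates on a given run depends on timing and on the Python version.

```python
import threading


class Counter:
    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def unsafe_increment(self):
        # Read-modify-write without protection: a thread switch between the
        # read and the write can lose an update.
        current = self.value
        self.value = current + 1

    def safe_increment(self):
        # Only one thread at a time can hold the lock, so the update cannot
        # be interleaved with another thread's update.
        with self._lock:
            self.value += 1


def hammer(method, times):
    for _ in range(times):
        method()


def run(method_name, threads=4, times=10_000):
    counter = Counter()
    method = getattr(counter, method_name)
    workers = [
        threading.Thread(target=hammer, args=(method, times))
        for _ in range(threads)
    ]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return counter.value


print("with lock:", run("safe_increment"))
print("without lock:", run("unsafe_increment"), "(may be lower than 40000)")
```

The locked version always ends at 40000. The unprotected version can end lower, and only on some runs, which is what makes this kind of bug hard to track down.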

### **asyncio** Version

The general concept of asyncio is that a single Python object, called the event loop, controls how and when each task gets run. The event loop is aware of each task and knows what state it’s in. In reality, there are many states that tasks could be in, but for now let’s imagine a simplified event loop in which a task is in one of just two states.

The ready state will indicate that a task has work to do and is ready to be run, and the waiting state means that the task is waiting for some external thing to finish, such as a network operation.

Your simplified event loop maintains two lists of tasks, one for each of these states. It selects one of the ready tasks and resumes it. That task is in complete control until it cooperatively hands control back to the event loop.

When the running task gives control back to the event loop, the event loop places that task into either the ready or the waiting list. It then goes through each of the tasks in the waiting list to see whether an I/O operation has completed and made that task ready. It knows that the tasks in the ready list are still ready because they haven’t run yet.

Once all of the tasks have been sorted into the right list again, the event loop picks the next task to run, and the process repeats. Your simplified event loop picks the task that has been waiting the longest and runs that. This process repeats until the event loop is finished.
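
The simplified loop described above can be sketched with plain generators. This is a toy model under stated assumptions, not how asyncio is actually implemented: toy_task() and run_event_loop() are made-up names, a yield stands in for await, and I/O is simulated by a number of loop ticks.

```python
import collections


def toy_task(name, waits, log):
    """A toy task. Each yield hands control back to the loop.

    The yielded number is how many ticks the task wants to wait, standing
    in for an I/O operation (0 means "ready again right away").
    """
    for ticks in waits:
        log.append(f"{name} waits {ticks}")
        yield ticks
    log.append(f"{name} done")


def run_event_loop(tasks):
    ready = collections.deque(tasks)  # tasks that can run right now
    waiting = []                      # (wake_tick, task) pairs
    tick = 0
    while ready or waiting:
        tick += 1
        # Move every waiting task whose simulated I/O has finished to ready.
        still_waiting = []
        for wake_tick, pending in waiting:
            if wake_tick <= tick:
                ready.append(pending)
            else:
                still_waiting.append((wake_tick, pending))
        waiting = still_waiting

        if not ready:
            continue  # nothing can run yet, so let the clock advance

        current = ready.popleft()  # the task that has been ready the longest
        try:
            ticks = next(current)  # run it until it gives control back
        except StopIteration:
            continue  # the task finished and is dropped
        if ticks:
            waiting.append((tick + ticks, current))
        else:
            ready.append(current)


log = []
run_event_loop([toy_task("a", [2], log), toy_task("b", [0], log)])
print(log)
```

Task a starts first but then waits on its simulated I/O, so task b runs to completion in the meantime and the log reads ['a waits 2', 'b waits 0', 'b done', 'a done'].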

[Details](https://stackoverflow.com/questions/49005651/how-does-asyncio-actually-work/51116910#51116910)

#### async and await

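You can write coroutines with the async and await keywords. Older code used the @asyncio.coroutine decorator together with yield from instead, but that style was deprecated in Python 3.8 and removed in Python 3.11.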


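```python
import asyncio
import time
import aiohttp


async def download_site(session, url):
    async with session.get(url) as response:
        # content_length comes from the Content-Length header and may be None.
        print("Read {0} from {1}".format(response.content_length, url))


async def download_all_sites(sites):
    async with aiohttp.ClientSession() as session:
        tasks = []
        for url in sites:
            # Schedule every download right away so that they can overlap.
            task = asyncio.ensure_future(download_site(session, url))
            tasks.append(task)
        # Wait here until all of the downloads have finished.
        await asyncio.gather(*tasks, return_exceptions=True)


if __name__ == "__main__":
    sites = [
        "https://www.jython.org",
        "http://olympus.realpython.org/dice",
    ] * 80
    start_time = time.time()
    # In a plain script, asyncio.run() starts an event loop and blocks until
    # the coroutine has finished.
    asyncio.run(download_all_sites(sites))
    # In a Jupyter notebook an event loop is already running and asyncio.run()
    # raises RuntimeError, so use `await download_all_sites(sites)` instead.
    duration = time.time() - start_time
    print(f"Downloaded {len(sites)} sites in {duration} seconds")
```

Make sure the coroutine is actually awaited. Scheduling it with asyncio.get_event_loop().create_task(download_all_sites(sites)) returns immediately, so the timer stops before any download has finished and reports only the time it took to schedule the task:

```
Downloaded 160 sites in 0.00017690658569335938 seconds
```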


```python
import asyncio
import time
import aiohttp

# Legacy style: generator-based coroutines. @asyncio.coroutine was deprecated
# in Python 3.8 and removed in 3.11, so this cell only runs on older versions.


@asyncio.coroutine
def download_site(session, url):
    response = yield from session.get(url)
    print("Read {0} from {1}".format(response.content_length, url))


@asyncio.coroutine
def sleep():
    # A trivial coroutine, only here to show chaining with yield from.
    return 2


@asyncio.coroutine
def download_all_sites(sites):
    session = aiohttp.ClientSession()

    try:
        tasks = []
        for url in sites:
            task = asyncio.ensure_future(download_site(session, url))
            tasks.append(task)

        yield from sleep()
        yield from asyncio.gather(*tasks, return_exceptions=True)
    finally:
        yield from session.close()


if __name__ == "__main__":
    sites = [
        "https://www.jython.org",
        "http://olympus.realpython.org/dice",
    ] * 80
    start_time = time.time()
    # In a plain script, block until the coroutine has finished.
    asyncio.get_event_loop().run_until_complete(download_all_sites(sites))
    duration = time.time() - start_time
    print(f"Downloaded {len(sites)} sites in {duration} seconds")
```

In a Jupyter notebook an event loop is already running, so the run_until_complete() call fails:

```
RuntimeError: This event loop is already running
```

Scheduling the coroutine with asyncio.get_event_loop().create_task(download_all_sites(sites)) avoids that error, but the call returns before any download has finished, so the reported time covers only the scheduling:

```
Downloaded 160 sites in 2.6702880859375e-05 seconds
```

![Asyncio](images/Asyncio.webp "AsyncIO")

### multiprocessing Version

Unlike threading and asyncio, which keep your code on a single CPU, multiprocessing in the standard library was designed to break down that barrier and run your code across multiple CPUs. At a high level, it does this by creating a new instance of the Python interpreter to run on each CPU and then farming out part of your program to run on it.

As you can imagine, bringing up a separate Python interpreter is not as fast as starting a new thread in the current Python interpreter. It’s a heavyweight operation and comes with some restrictions and difficulties, but for the correct problem, it can make a huge difference.


```python
import requests
import multiprocessing
import time

# Each worker process has its own copy of this global.
session = None


def set_global_session():
    # Runs once in every worker process when the pool starts it.
    global session
    if not session:
        session = requests.Session()


def download_site(url):
    with session.get(url) as response:
        name = multiprocessing.current_process().name
        print(f"{name}:Read {len(response.content)} from {url}")


def download_all_sites(sites):
    # With no size given, Pool picks the number of workers from the CPU count.
    with multiprocessing.Pool(initializer=set_global_session) as pool:
        pool.map(download_site, sites)


if __name__ == "__main__":
    sites = [
        "https://www.jython.org",
        "http://olympus.realpython.org/dice",
    ] * 80
    start_time = time.time()
    download_all_sites(sites)
    duration = time.time() - start_time
    print(f"Downloaded {len(sites)} in {duration} seconds")
```


```
ForkPoolWorker-3:Read 10282 from https://www.jython.org
ForkPoolWorker-1:Read 10282 from https://www.jython.org
ForkPoolWorker-5:Read 10282 from https://www.jython.org
ForkPoolWorker-7:Read 10282 from https://www.jython.org
ForkPoolWorker-2:Read 275 from http://olympus.realpython.org/dice
ForkPoolWorker-6:Read 275 from http://olympus.realpython.org/dice
ForkPoolWorker-4:Read 275 from http://olympus.realpython.org/dice
ForkPoolWorker-8:Read 275 from http://olympus.realpython.org/dice
ForkPoolWorker-3:Read 275 from http://olympus.realpython.org/dice
ForkPoolWorker-5:Read 275 from http://olympus.realpython.org/dice
ForkPoolWorker-1:Read 275 from http://olympus.realpython.org/dice
ForkPoolWorker-7:Read 275 from http://olympus.realpython.org/dice
ForkPoolWorker-2:Read 10282 from https://www.jython.org
ForkPoolWorker-8:Read 10282 from https://www.jython.org
ForkPoolWorker-6:Read 10282 from https://www.jython.org
ForkPoolWorker-3:Read 10282 from https://www.jython.org
ForkPoolWorker-4:Read 10
```

# Read More

[GIL](https://opensource.com/article/17/4/grok-gil)