Parallelism consists of performing multiple operations at the same time. 

Multiprocessing is a means to effect parallelism, and it entails spreading tasks over a computer’s central processing units (CPUs, or cores). Multiprocessing is well-suited for CPU-bound tasks: tightly bound for loops and mathematical computations usually fall into this category.

Concurrency is a slightly broader term than parallelism. It suggests that multiple tasks have the ability to run in an overlapping manner. (There’s a saying that concurrency does not imply parallelism.)

Threading is a concurrent execution model whereby multiple threads take turns executing tasks. One process can contain multiple threads. Python has a complicated relationship with threading thanks to its GIL, but that’s beyond the scope of this article.
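As a rough illustration of threads overlapping while they wait, the sketch below runs three simulated blocking calls on a thread pool. The helper fake_io and its 0.2-second delay are placeholders invented for this example, not part of the original notes.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_io(n: int) -> int:
    """Stand-in for a blocking IO call (hypothetical helper)."""
    time.sleep(0.2)  # time.sleep releases the GIL, so other threads can run
    return n * 2


start = time.perf_counter()
with ThreadPoolExecutor(max_workers=3) as pool:
    # The three calls wait at the same time instead of one after another
    results = list(pool.map(fake_io, [1, 2, 3]))
elapsed = time.perf_counter() - start

print(results)
print(f"elapsed: {elapsed:0.2f}s")
```

Run one after another, the three calls would need about 0.6 seconds; on a thread pool the waits overlap, so the total should stay close to a single delay.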

**Async IO** is not a newly invented concept: it already exists in, or is being built into, other languages and runtime environments, such as Go, C#, or Scala.

In fact, async IO is a single-threaded, single-process design: it uses **cooperative multitasking**, a term that you’ll flesh out by the end of this tutorial.

**What Does Asynchronous Mean?**

- Asynchronous routines are able to “pause” while waiting on their ultimate result and let other routines run in the meantime.

- Asynchronous code, through the mechanism above, facilitates concurrent execution. To put it differently, asynchronous code gives the look and feel of concurrency.
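The "pause and let others run" behavior described in the two points above can be sketched with two tiny coroutines that record what they do. The names worker and events are made up for this illustration.

```python
import asyncio

events = []


async def worker(name: str, steps: int) -> None:
    for i in range(steps):
        events.append(f"{name}{i}")
        # Pause here; the event loop is free to run another ready task
        await asyncio.sleep(0)


async def main() -> None:
    # Schedule both workers and wait for both to finish
    await asyncio.gather(worker("a", 2), worker("b", 2))


asyncio.run(main())
print(events)
```

The entries from the two workers come out interleaved rather than grouped by worker, even though everything runs in a single thread.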


![Alt text](image.png)

Cooperative multitasking is a fancy way of saying that a program’s event loop (more on that later) communicates with multiple tasks to let each take turns running at the optimal time.

The two keywords, async and await, serve different purposes but come together to help you declare, build, execute, and manage asynchronous code.

**A coroutine** is a function that can **suspend its execution** before reaching return, and it can **indirectly pass control** to another coroutine for some time.

The syntax async def introduces either a native coroutine or an asynchronous generator. The expressions async with and async for are also valid, and you’ll see them later on.

The keyword await passes function control back to the event loop. (It suspends the execution of the surrounding coroutine.) If Python encounters an await f() expression in the scope of g(), this is how await tells the event loop, “Suspend execution of g() until whatever I’m waiting on, the result of f(), is returned. In the meantime, go let something else run.”

In [None]:
async def g():
    # Pause here and come back to g() when f() is ready
    r = await f()
    return r
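The cell above leaves f() undefined. A minimal runnable sketch, assuming a stand-in f() that just sleeps briefly and returns a value (the body of f and the value 42 are assumptions made for illustration), might look like this when run as a plain script:

```python
import asyncio


async def f() -> int:
    # Simulate waiting on IO; control goes back to the event loop here
    await asyncio.sleep(0.1)
    return 42


async def g() -> int:
    # Pause here and come back to g() when f() is ready
    r = await f()
    return r


result = asyncio.run(g())
print(result)  # 42
```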

A function that you introduce with async def is a coroutine. It may use await, return, or yield, but all of these are optional. Declaring async def noop(): pass is valid.

Using await and/or return creates a coroutine function. To call a coroutine function, you must await it to get its results.

When you use await f(), it’s required that f() be an object that is **awaitable**.

Well, that’s not very helpful, is it? 

For now, just know that an awaitable object is either (1) another coroutine or (2) an object defining an `.__await__()` dunder method that returns an iterator.

If you’re writing a program, for the large majority of purposes, you should only need to worry about case #1.
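Case #2 is rarer, but a small sketch may make it less abstract. The class Delayed below is hypothetical; it shows one common way to write `.__await__()`, namely by delegating to the iterator of an ordinary coroutine.

```python
import asyncio


class Delayed:
    """Hypothetical awaitable that produces `value` after `delay` seconds."""

    def __init__(self, value, delay: float) -> None:
        self.value = value
        self.delay = delay

    async def _wait(self):
        await asyncio.sleep(self.delay)
        return self.value

    def __await__(self):
        # Must return an iterator; reuse the one a coroutine already provides
        return self._wait().__await__()


async def main():
    # A Delayed instance is not a coroutine, but it can still be awaited
    return await Delayed("done", 0.05)


result = asyncio.run(main())
print(result)  # done
```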

A **function** is all-or-nothing. Once it starts, it won’t stop until it hits a return, then pushes that value to the caller (the function that calls it).

A **generator**, on the other hand, pauses each time it hits a yield and goes no further. Not only can it push this value to the calling stack, but it can also keep hold of its local variables until you resume it by calling next() on it.
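A short sketch of that pause-and-resume behavior, using a made-up generator named countdown:

```python
def countdown(n: int):
    """Generator that remembers `n` between calls to next()."""
    while n > 0:
        yield n  # pause here and hand `n` to the caller
        n -= 1   # runs only when the generator is resumed


gen = countdown(3)
first = next(gen)   # runs until the first yield
second = next(gen)  # resumes after the yield, with n still in scope
rest = list(gen)    # drains whatever is left
print(first, second, rest)  # 3 2 [1]
```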

Coroutines are repurposed generators that take advantage of the peculiarities of generator methods.

Old generator-based coroutines use yield from to wait for a coroutine result. Modern Python syntax in native coroutines simply replaces yield from with await as the means of waiting on a coroutine result. The await is analogous to yield from, and it often helps to think of it as such.
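To make the analogy concrete, here is a sketch that puts plain-generator delegation with yield from next to the await form. All function names are invented for the example.

```python
import asyncio


def inner():
    yield "working"
    return "inner result"


def outer():
    # yield from drives inner() and evaluates to its return value
    result = yield from inner()
    yield f"got: {result}"


async def inner_coro() -> str:
    await asyncio.sleep(0)
    return "inner result"


async def outer_coro() -> str:
    # await plays the same role for native coroutines
    result = await inner_coro()
    return f"got: {result}"


values = list(outer())
awaited = asyncio.run(outer_coro())
print(values)   # ['working', 'got: inner result']
print(awaited)  # got: inner result
```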

The use of await is a signal that marks a break point. It lets a coroutine temporarily suspend execution and permits the program to come back to it later.

In other words, asynchronous iterators and asynchronous generators are not designed to concurrently map some function over a sequence or iterator. They’re merely designed to let the enclosing coroutine allow other tasks to take their turn. 

The async for and async with statements are only needed to the extent that using plain for or with would “break” the nature of await in the coroutine. This distinction between asynchronicity and concurrency is a key one to grasp.
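A small sketch of where async with and async for become necessary: the setup, teardown, and "next item" steps below each need to await something, which plain with and for cannot do. The names connection and Ticker are hypothetical, and asyncio.sleep(0) stands in for real IO.

```python
import asyncio
from contextlib import asynccontextmanager

log = []


@asynccontextmanager
async def connection(name: str):
    """Hypothetical resource whose setup and teardown both await."""
    await asyncio.sleep(0)  # e.g. open a socket
    log.append(f"open {name}")
    try:
        yield name
    finally:
        await asyncio.sleep(0)  # e.g. flush and close
        log.append(f"close {name}")


class Ticker:
    """Asynchronous iterator: __anext__ may await between items."""

    def __init__(self, stop: int) -> None:
        self.i = 0
        self.stop = stop

    def __aiter__(self):
        return self

    async def __anext__(self) -> int:
        if self.i >= self.stop:
            raise StopAsyncIteration
        await asyncio.sleep(0)  # other tasks may run while we "fetch"
        self.i += 1
        return self.i


async def main() -> None:
    async with connection("db") as conn:
        async for tick in Ticker(3):
            log.append(f"{conn}:{tick}")


asyncio.run(main())
print(log)
```

Note that the loop body still runs one item at a time; nothing here is concurrent unless other tasks are scheduled alongside main().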

The entire management of the event loop has been implicitly handled by one function call:

asyncio.run(main())
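For comparison, here is roughly what that one call takes care of, sketched for a plain script (not a notebook, where a loop is already running). This is a simplification: asyncio.run() also cancels leftover tasks and shuts down asynchronous generators. The coroutine main() here is a placeholder.

```python
import asyncio


async def main() -> str:
    await asyncio.sleep(0.01)
    return "finished"


# Manual version: create a loop, run the coroutine to completion, close the loop
loop = asyncio.new_event_loop()
try:
    manual = loop.run_until_complete(main())
finally:
    loop.close()

# High-level version: the same steps handled for you
automatic = asyncio.run(main())
print(manual, automatic)
```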

One use-case for queues (as is the case here) is for the queue to act as a transmitter for producers and consumers that aren’t otherwise directly chained or associated with each other.
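A minimal producer/consumer sketch along those lines, using asyncio.Queue. The function names, the item labels, and the number of producers and consumers are all assumptions chosen for illustration.

```python
import asyncio


async def producer(name: str, queue: asyncio.Queue, n: int) -> None:
    for i in range(n):
        # Producers only know about the queue, not about any consumer
        await queue.put(f"{name}-{i}")


async def consumer(queue: asyncio.Queue, seen: list) -> None:
    while True:
        item = await queue.get()
        seen.append(item)
        queue.task_done()  # tell the queue this item is fully processed


async def main() -> list:
    queue: asyncio.Queue = asyncio.Queue()
    seen: list = []
    consumers = [asyncio.create_task(consumer(queue, seen)) for _ in range(2)]
    await asyncio.gather(producer("p1", queue, 3), producer("p2", queue, 3))
    await queue.join()  # wait until every item has been processed
    for c in consumers:
        c.cancel()  # consumers loop forever, so stop them explicitly
    await asyncio.gather(*consumers, return_exceptions=True)
    return seen


seen = asyncio.run(main())
print(sorted(seen))
```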

In [3]:
async def main():
    # This does *not* introduce concurrent execution
    # It is meant to show syntax only
    g = [i async for i in mygen()]
    f = [j async for j in mygen() if not (j // 3 % 5)]
    return g, f
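The cell above relies on mygen(), which these notes never define. A self-contained sketch, assuming mygen() is an asynchronous generator that yields powers of 2 (that definition is an assumption, as is the name collect), could be:

```python
import asyncio


async def mygen(u: int = 10):
    """Yield powers of 2 (assumed definition, for illustration only)."""
    i = 0
    while i < u:
        yield 2 ** i
        i += 1
        await asyncio.sleep(0.01)


async def collect():
    # Still sequential: each comprehension consumes one item at a time
    g = [i async for i in mygen()]
    f = [j async for j in mygen() if not (j // 3 % 5)]
    return g, f


g, f = asyncio.run(collect())
print(g)  # [1, 2, 4, 8, 16, 32, 64, 128, 256, 512]
print(f)  # [1, 2, 16, 32, 256, 512]
```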

In [10]:
!pip install nest-asyncio



In [11]:
%autoawait asyncio

import asyncio
import nest_asyncio
nest_asyncio.apply()


In [12]:
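# Jupyter already runs an event loop; nest_asyncio lets run_until_complete()
# re-enter it. Do not call loop.close() here: it raises
# "RuntimeError: Cannot close a running event loop".
loop = asyncio.get_event_loop()
loop.run_until_complete(main())

count one
count two
count three
count four
count five
count six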

In [5]:
import asyncio
import time

async def count():
    print("count one")
    await asyncio.sleep(1)
    print("count four")

async def count_further():
    print("count two")
    await asyncio.sleep(1)
    print("count five")

async def count_even_further():
    print("count three")
    await asyncio.sleep(1)
    print("count six")

async def main():
    await asyncio.gather(count(), count_further(), count_even_further())

s = time.perf_counter()
await main()
elapsed = time.perf_counter() - s
print(f"Script executed in {elapsed:0.2f} seconds.")

count one
count two
count three
count four
count five
count six
Script executed in 1.00 seconds.


#1: Coroutines don’t do much on their own until they are tied to the event loop.

You saw this point before in the explanation on generators, but it’s worth restating. If you have a main coroutine that awaits others, simply calling it in isolation has little effect:

In [13]:
async def main():
    print("Hello ...")
    await asyncio.sleep(1)
    print("World!")

rout = main()
rout

<coroutine object main at 0x7f985cc593c0>

In [14]:
asyncio.run(rout)

Hello ...
World!


#2: By default, an async IO event loop runs in a single thread and on a single CPU core. Usually, running one single-threaded event loop in one CPU core is more than sufficient. It is also possible to run event loops across multiple cores. Check out this talk by John Reese for more, and be warned that your laptop may spontaneously combust.

#3: Event loops are pluggable. That is, you could, if you really wanted, write your own event loop implementation and have it run tasks just the same. This is wonderfully demonstrated in the uvloop package, which is an implementation of the event loop in Cython.

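The requests package is built on blocking socket operations, so a call such as requests.get(url) is not awaitable. In contrast, almost everything in aiohttp is an awaitable coroutine, such as session.request() and response.text(). requests is a great package otherwise, but you’re doing yourself a disservice by using it in asynchronous code.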

In [16]:
# areq.py

"""Asynchronously get links embedded in multiple pages' HTML."""

import asyncio
import logging
import re
import sys
from typing import IO
import urllib.error
import urllib.parse

import aiofiles
import aiohttp
from aiohttp import ClientSession

logging.basicConfig(
    format="%(asctime)s %(levelname)s:%(name)s: %(message)s",
    level=logging.DEBUG,
    datefmt="%H:%M:%S",
    stream=sys.stderr,
)
logger = logging.getLogger("areq")
logging.getLogger("chardet.charsetprober").disabled = True

HREF_RE = re.compile(r'href="(.*?)"')

async def fetch_html(url: str, session: ClientSession, **kwargs) -> str:
    """GET request wrapper to fetch page HTML.

    kwargs are passed to `session.request()`.
    """

    resp = await session.request(method="GET", url=url, **kwargs)
    resp.raise_for_status()
    logger.info("Got response [%s] for URL: %s", resp.status, url)
    html = await resp.text()
    return html

async def parse(url: str, session: ClientSession, **kwargs) -> set:
    """Find HREFs in the HTML of `url`."""
    found = set()
    try:
        html = await fetch_html(url=url, session=session, **kwargs)
    except (
        aiohttp.ClientError,
        aiohttp.http_exceptions.HttpProcessingError,
    ) as e:
        logger.error(
            "aiohttp exception for %s [%s]: %s",
            url,
            getattr(e, "status", None),
            getattr(e, "message", None),
        )
        return found
    except Exception as e:
        logger.exception(
            "Non-aiohttp exception occurred:  %s", getattr(e, "__dict__", {})
        )
        return found
    else:
        for link in HREF_RE.findall(html):
            try:
                abslink = urllib.parse.urljoin(url, link)
            except (urllib.error.URLError, ValueError):
                logger.exception("Error parsing URL: %s", link)
                pass
            else:
                found.add(abslink)
        logger.info("Found %d links for %s", len(found), url)
        return found

async def write_one(file: IO, url: str, **kwargs) -> None:
    """Write the found HREFs from `url` to `file`."""
    res = await parse(url=url, **kwargs)
    if not res:
        return None
    async with aiofiles.open(file, "a") as f:
        for p in res:
            await f.write(f"{url}\t{p}\n")
        logger.info("Wrote results for source URL: %s", url)

async def bulk_crawl_and_write(file: IO, urls: set, **kwargs) -> None:
    """Crawl & write concurrently to `file` for multiple `urls`."""
    async with ClientSession() as session:
        tasks = []
        for url in urls:
            tasks.append(
                write_one(file=file, url=url, session=session, **kwargs)
            )
        await asyncio.gather(*tasks)


In [None]:
import pathlib

with open("urls.txt") as infile:
    urls = set(map(str.strip, infile))

# aiofiles.open() expects a path, not an open file object, so write the
# header first and then pass the path (not `outfile`) to the crawler.
outpath = pathlib.Path("foundURLs.txt")
with open(outpath, "w") as outfile:
    outfile.write("source_url\tparsed_url\n")

asyncio.run(bulk_crawl_and_write(file=outpath, urls=urls))

18:30:44 INFO:areq: Got response [200] for URL: https://1.1.1.1/
18:30:44 DEBUG:charset_normalizer: Encoding detection: utf_8 is most likely the one.
18:30:44 INFO:areq: Found 36 links for https://1.1.1.1/

18:30:44 ERROR:areq: aiohttp exception for https://www.ietf.org/rfc/rfc2616.txt [None]: None
18:30:44 ERROR:areq: aiohttp exception for https://docs.python.org/3/this-url-will-404.html [None]: None
18:30:44 ERROR:areq: aiohttp exception for https://www.nytimes.com/guides/ [None]: None
18:30:44 ERROR:areq: aiohttp exception for https://www.politico.com/tipsheets/morning-money [None]: None
18:30:44 ERROR:areq: aiohttp exception for https://www.bloomberg.com/markets/economics [None]: None
18:30:44 ERROR:areq: aiohttp exception for https://regex101.com/ [None]: None
18:30:45 ERROR:areq: aiohttp exception for https://www.mediamatters.org/ [None]: None


Async IO shines when you have multiple IO-bound tasks where the tasks would otherwise be dominated by blocking IO-bound wait time, such as:

- Network IO, whether your program is the server or the client side

- Serverless designs, such as a peer-to-peer, multi-user network like a group chatroom

- Read/write operations where you want to mimic a “fire-and-forget” style but worry less about holding a lock on whatever you’re reading and writing to

The biggest reason not to use it is that await only supports a specific set of objects that define a specific set of methods. If you want to do async read operations with a certain DBMS, you’ll need to find not just a Python wrapper for that DBMS, but one that supports the async/await syntax. Coroutines that contain synchronous calls block other coroutines and tasks from running.
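The last point, that a synchronous call inside a coroutine stalls everything else, can be sketched as follows. The helper blocking_io stands in for a library without async support, and heartbeat exists only to count how often other work gets a turn; both are invented for this example. Handing the call to a thread with loop.run_in_executor() is one possible workaround, shown here as an option rather than a recommendation.

```python
import asyncio
import time


def blocking_io() -> str:
    """Stand-in for a synchronous library call (hypothetical)."""
    time.sleep(0.2)
    return "data"


async def heartbeat(ticks: list) -> None:
    while True:
        ticks.append(time.perf_counter())
        await asyncio.sleep(0.02)


async def demo(offload: bool) -> int:
    ticks: list = []
    hb = asyncio.create_task(heartbeat(ticks))
    await asyncio.sleep(0)  # let the heartbeat record its first tick
    if offload:
        loop = asyncio.get_running_loop()
        # Run the blocking call in a worker thread; the loop stays responsive
        await loop.run_in_executor(None, blocking_io)
    else:
        blocking_io()  # blocks the whole event loop for 0.2 seconds
    hb.cancel()
    return len(ticks)


blocked_ticks = asyncio.run(demo(offload=False))
offloaded_ticks = asyncio.run(demo(offload=True))
print(blocked_ticks, offloaded_ticks)
```

With the direct call, the heartbeat gets no turn after its first tick; with the executor, it keeps ticking while the blocking call runs.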

In [18]:
async def coro(seq) -> list:
    """'IO' wait time is proportional to the max element."""
    await asyncio.sleep(max(seq))
    return list(reversed(seq))

async def main():
    # This is a bit redundant in the case of one task
    # We could use `await coro([3, 2, 1])` on its own
    t = asyncio.create_task(coro([3, 2, 1]))  # Python 3.7+
    await t
    print(f't: type {type(t)}')
    print(f't done: {t.done()}')

In [19]:
asyncio.run(main())

t: type <class 'asyncio.tasks.Task'>
t done: True


In [20]:
import time
async def main():
    t = asyncio.create_task(coro([3, 2, 1]))
    t2 = asyncio.create_task(coro([10, 5, 0]))  # Python 3.7+
    print('Start:', time.strftime('%X'))
    a = await asyncio.gather(t, t2)
    print('End:', time.strftime('%X'))  # Should be 10 seconds
    print(f'Both tasks done: {all((t.done(), t2.done()))}')
    return a

In [21]:
asyncio.run(main())

Start: 18:35:17
End: 18:35:27
Both tasks done: True


[[1, 2, 3], [0, 5, 10]]

In [22]:
async def main():
    t = asyncio.create_task(coro([3, 2, 1]))
    t2 = asyncio.create_task(coro([10, 5, 0]))
    print('Start:', time.strftime('%X'))
    for res in asyncio.as_completed((t, t2)):
        compl = await res
        print(f'res: {compl} completed at {time.strftime("%X")}')
    print('End:', time.strftime('%X'))
    print(f'Both tasks done: {all((t.done(), t2.done()))}')

In [23]:
asyncio.run(main())

Start: 18:36:27
res: [1, 2, 3] completed at 18:36:30
res: [0, 5, 10] completed at 18:36:37
End: 18:36:37
Both tasks done: True


### Precedence

While they behave somewhat similarly, the await keyword has significantly higher precedence than yield. 
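A brief sketch of what that difference means in practice. The coroutine two() and the generator gen() are made up for the example.

```python
import asyncio


async def two() -> int:
    return 2


async def main():
    # await binds tighter than ** and unary minus, so no parentheses are needed
    squared = await two() ** 2  # parsed as (await two()) ** 2
    negated = -await two()      # parsed as -(await two())
    return squared, negated


def gen():
    # yield has very low precedence: the whole sum is yielded
    yield 1 + 1


pair = asyncio.run(main())
yielded = list(gen())
print(pair)     # (4, -2)
print(yielded)  # [2]
```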

You’re now equipped to use async/await and the libraries built off of it. Here’s a recap of what you’ve covered:

- Asynchronous IO as a language-agnostic model and a way to effect concurrency by letting coroutines indirectly communicate with each other

- The specifics of Python’s new async and await keywords, used to mark and define coroutines

- asyncio, the Python package that provides the API to run and manage coroutines