
The scripts and notes below were developed/written based on exercises or "totally copied" from [Teclado Code](https://github.com/tecladocode/complete-python-course/tree/master/course_contents/13_async_development)


# Asynchronous Python Development

- __Synchronous:__ actions that happen one after another. Programming as we've seen it until now is synchronous, because each line executes after the previous one.
- __Asynchronous:__ actions that don't necessary happen after one another, or that can happen in arbitrary order ("without synchrony").
- __Concurrency:__ The ability of our programs to run things in different order every time the program runs, without affecting the final outcome.
- __Parallelism:__ Running two or more things at the same time.
- __Thread:__ A line of code execution that can run in one of your computer's cores.


## Dining Philosopher

- __Case:__ 5 philosophers and 2 forks who are hungry. The are able to eat just using 2 forks.  
- __Solution:__ If there is a waiter (master, orquestrator), he can get the 2 forks and send them to the philosophers in a limited time for them to eat.
It will be possible to feed 2 philopher by time, and the others would need to wait for their turn (__time slicing__).
Even decreasing the time, it will never be possible for all of them to eat at the same time.  
- __Limited resources__ -> Forks

## Processes and threads

- __Processor__: Each processor has a number of cores.

- __Cores:__ 
Generally, each unit of core has 4 cores inside it.
Each core can work independently and communicate with each other, performing mathematical operations.
Cores are "philosophers" with resources (forks).

- __Threads:__
They are line of code execution.
Each one can run in one core by a time.
Threads are "philosophers waiting to eat".

- __Processes:__
They manage everything (resources, which can be cores + network, hard drive, etc) that is necessary to run one or more threads (which runs things).
Time slicing.
The OS is responsible to save the current status (checkpoint) of the thread to manage it.

- __GIL (Global Interpreter Lock) - Asynchronous Python:__ 
Lauching a Python app, it will get a new Python process.
Python doesn't run 2 threads in one process at same time.
Each Python process creates a key resource(GIL) and each thread acquires that resource.

- __Multiple Pythons__
It is possible to run multiple processes, which means that each one will creates its own GIL, and execute one thread.
However it is expensive to communicate between 2 processes.
It is normally used when the machine has multicore and it is desired to have complex calculations on both.

__What's the point of multiple threads in Python?__  

Reduce waiting time!

Ex: If you need to request a parameter for a user and execute some mathematical processing. The first thing is going to request the waiting time. With GIL and multiple threads the time will be reduced, because it will consume the CPU just for what the computer needs to execute.

__What should I use: Multithreading or multiprocessing?__
- __Multithreadings__ is used for process that need to wait (Ex: asking for some argument from user).
- __Multiprocessing__ is used when the machine has multicore and it is desired to have complex calculations on both.

## Using Multithreads

In [4]:
import time
from threading import Thread

####### USING SINGLE THREAD

def ask_user():
    start = time.time()
    user_input = input('Enter your name: ')
    greet = f'Hello, {user_input}'
    print(greet)
    print('ask_user: ', time.time() - start)

def complex_calculation():
    print('Started calculating...')
    start = time.time()
    [x**2 for x in range(20000)]
    print('complex_calculation: ', time.time() - start)


# With a single thread, we can do one at a time—e.g.
start = time.time()
ask_user()
complex_calculation()
print('Single thread total time: ', time.time() - start, '\n\n')


####### USING TWO THREADS

# With two threads, we can do them both at once.
thread = Thread(target=complex_calculation)
thread2 = Thread(target=ask_user)

start = time.time()

# Start both threads.
thread.start()
thread2.start()

# Make the main thread (the whole code) wait for the 2 threads to print the final total time
# They are blocking operations because their behaviour.
thread.join()
thread2.join()

print('Two thread total time: ', time.time() - start)


Enter your name: Jose
Hello, Jose
ask_user:  1.4345934391021729
Started calculating...
complex_calculation:  0.01759815216064453
Single thread total time:  1.4548828601837158 


Started calculating...
complex_calculation:  0.019173622131347656
Enter your name: Jose
Hello, Jose
ask_user:  1.0603420734405518
Two thread total time:  1.0833075046539307


A most elegant way to write the Thread code is using the concurrent.futures method, because it uses a __Context Manager__ as the following example.

__Important note:__ It is possible to write commands inside the code to kill the threads between the start and the waiting process (blocking), but it SHOULDN'T be done, because it may create a deadlock, killing the GIL, which will make the code "wait forever" for the next step.

In [4]:
import time
from concurrent.futures import ThreadPoolExecutor

####### USING SINGLE THREAD

def ask_user():
    start = time.time()
    user_input = input('Enter your name: ')
    greet = f'Hello, {user_input}'
    print(greet)
    print('ask_user: ', time.time() - start)

def complex_calculation():
    print('Started calculating...')
    start = time.time()
    [x**2 for x in range(20000)]
    print('complex_calculation: ', time.time() - start)


# With a single thread, we can do one at a time—e.g.
start = time.time()
ask_user()
complex_calculation()
print('Single thread total time: ', time.time() - start, '\n\n')


####### USING TWO THREADS
# With two threads, we can do them both at once
start = time.time()

# Create a pool of threads (in this case, 2), then submit to start, forcing the main thread to wait the 2 new ones
with ThreadPoolExecutor(max_workers=2) as pool:
    pool.submit(complex_calculation)
    pool.submit(ask_user)

# The pool.shutdown() is implicit into the Context Manager, that is the reason why we don't need to call it

print('Two thread total time: ', time.time() - start)

Enter your name: Ana
Hello, Ana
ask_user:  4.666213035583496
Started calculating...
complex_calculation:  0.01547098159790039
Single thread total time:  4.683517932891846 


Started calculating...
complex_calculation:  0.013892173767089844
Enter your name: Ana
Hello, Ana
ask_user:  1.5530469417572021
Two thread total time:  1.5554237365722656


## Using Multiprocessing

In [6]:
import time
from multiprocessing import Process

####### USING SINGLE THREAD

def ask_user():
    start = time.time()
    user_input = input('Enter your name: ')
    greet = f'Hello, {user_input}'
    print(greet)
    print('ask_user: ', time.time() - start)

def complex_calculation():
    print('Started calculating...')
    start = time.time()
    [x**2 for x in range(20000)]
    print('complex_calculation: ', time.time() - start)


# With a single thread, we can do one at a time—e.g.
start = time.time()
ask_user()
complex_calculation()
print('Single thread total time: ', time.time() - start, '\n\n')


####### USING TWO THREADS

process1 = Process(target=complex_calculation)
process2 = Process(target=complex_calculation)

process1.start()
process2.start()

start = time.time()

ask_user()

process1.join()  # this waits for the process to finish
process2.join()  # this waits for the process to finish

print('Two process total time: ', time.time() - start)

Enter your name: Ana
Hello, Ana
ask_user:  1.8233802318572998
Started calculating...
complex_calculation:  0.013601064682006836
Single thread total time:  1.8386273384094238 


Started calculating...
Started calculating...
complex_calculation:  0.015327692031860352
complex_calculation:  0.025389432907104492
Enter your name: Ana
Hello, Ana
ask_user:  0.7216448783874512
Two process total time:  0.7232041358947754


A most elegant way to write the Processes code is using the concurrent.futures method, because it uses a __Context Manager__ as the following example.


In [7]:
import time
from concurrent.futures import ProcessPoolExecutor

####### USING SINGLE THREAD

def ask_user():
    start = time.time()
    user_input = input('Enter your name: ')
    greet = f'Hello, {user_input}'
    print(greet)
    print('ask_user: ', time.time() - start)

def complex_calculation():
    print('Started calculating...')
    start = time.time()
    [x**2 for x in range(20000)]
    print('complex_calculation: ', time.time() - start)


# With a single thread, we can do one at a time—e.g.
start = time.time()
ask_user()
complex_calculation()
print('Single thread total time: ', time.time() - start, '\n\n')


####### USING TWO THREADS

start = time.time()

with ProcessPoolExecutor(max_workers=2) as pool:
    pool.submit(complex_calculation)
    pool.submit(complex_calculation)

print('Two process total time: ', time.time() - start)

Enter your name: ANa
Hello, ANa
ask_user:  1.655616044998169
Started calculating...
complex_calculation:  0.01416325569152832
Single thread total time:  1.6710083484649658 


Started calculating...
Started calculating...
complex_calculation:  0.014949560165405273
complex_calculation:  0.02122807502746582
Two process total time:  0.08966970443725586


__Atomic Operation__:   
Cannot be interrupted during a multithead execution.

__Fuzzying__:  
Input time.sleep(random.random()) among some statements.


If the idea is to run something sequentially, we shouldn't use multithreads, otherwise it is going to show the results incorrectly

In [6]:
from threading import Thread
import time
import random

counter = 0

def increment_counter():
    global counter
    time.sleep(random.randint(0, 1))
    counter += 1
    time.sleep(random.randint(0, 1))
    print(f'New counter value: {counter}')
    time.sleep(random.randint(0, 1))
    print('-----------')



for x in range(10):
    t = Thread(target=increment_counter)
    time.sleep(random.randint(0, 1))
    t.start()

New counter value: 1
-----------
New counter value: 2
New counter value: 3
-----------
New counter value: 4
-----------
New counter value: 5
-----------
-----------
New counter value: 6
New counter value: 8
New counter value: 8
-----------
-----------
-----------
New counter value: 10
-----------
New counter value: 10
-----------


To use the same resource, but in a sequential process, it can be used queues.

In [8]:
from threading import Thread
import time
import random
import queue

counter = 0
job_queue = queue.Queue()
counter_queue = queue.Queue()

def increment_manager():
    """Start the queue each execution and in the end finalize de the task for the queue"""
    global counter
    while True:
        increment = counter_queue.get()  # this waits until an item is available and locks the queue
        time.sleep(random.random())
        old_counter = counter
        time.sleep(random.random())
        counter = old_counter + increment
        time.sleep(random.random())
        job_queue.put((f'New counter value {counter}', '------------'))
        time.sleep(random.random())
        counter_queue.task_done()  # this unlocks the queue


# printer_manager and increment_manager run continuously because of the `daemon` flag.
Thread(target=increment_manager, daemon=True).start()


def printer_manager():
    while True:
        for line in job_queue.get():
            time.sleep(random.random())
            print(line)
        job_queue.task_done()

# printer_manager and increment_manager run continuously because of the `daemon` flag.
Thread(target=printer_manager, daemon=True).start()


def increment_counter():
    counter_queue.put(1) 
    time.sleep(random.random())


worker_threads = [Thread(target=increment_counter) for thread in range(10)]

for thread in worker_threads:
    time.sleep(random.random())
    thread.start()

for thread in worker_threads:
    thread.join()  # wait for it to finish

counter_queue.join()  # wait for counter_queue to be empty
job_queue.join()  # wait for job_queue to be empty

New counter value 1
------------
New counter value 2
------------
New counter value 3
------------
New counter value 4
------------
New counter value 5
------------
New counter value 6
------------
New counter value 7
------------
New counter value 8
------------
New counter value 9
------------
New counter value 10
------------


Using generators instead threads.  
It is cheaper to use yield instead queues to execute multithreads.

In [9]:
def countdown(n):
    while n > 0:
        yield n
        n -= 1


tasks = [countdown(10), countdown(5), countdown(20)]

while tasks:
    task = tasks[0]
    tasks.remove(task)
    try:
        x = next(task)
        print(x)
        tasks.append(task)
    except StopIteration:
        print('Task finished')


10
5
20
9
4
19
8
3
18
7
2
17
6
1
16
5
Task finished
15
4
14
3
13
2
12
1
11
Task finished
10
9
8
7
6
5
4
3
2
1
Task finished


Yielding from another iterator:

In [2]:
from collections import deque

friends = deque(('Rolf', 'Jose', 'Charlie', 'Jen', 'Anna'))


def get_friend():
    yield from friends


def greet(g):
    while True:
        try:
            friend = next(g)
            yield f'HELLO {friend}'
        except StopIteration:
            pass


friends_generator = get_friend()
g = greet(friends_generator)
print(next(g))
print(next(g))

HELLO Rolf
HELLO Jose


Receiving data from through yield

In [3]:
from collections import deque

friends = deque(('Rolf', 'Jose', 'Charlie', 'Jen', 'Anna'))


# As it takes(receives) data and can be suspended, it is called CO-ROUTINE
def friend_upper():
    while friends:
        friend = friends.popleft().upper()
        greeting = yield
        print(f'{greeting} {friend}')

        
def greet(g):
    g.send(None)
    while True:
        greeting = yield
        g.send(greeting)
        
#def greet(g):      -- It does the same thing that the code above does
#    yield from g


greeter = greet(friend_upper())
greeter.send(None)
greeter.send('Hello')
print('Hello, world! Multitasking...')
greeter.send('How are you,')

Hello ROLF
Hello, world! Multitasking...
How are you, JOSE


The same code can be written with the async function + __@coroutine__ decorator:  
It is importante to keep in mind that __await__ command is going to continue until fetch all generator

In [5]:
from collections import deque
from types import coroutine

friends = deque(('Rolf', 'Jose', 'Charlie', 'Jen', 'Anna'))


@coroutine
def friend_upper():
    while friends:
        friend = friends.popleft().upper()
        greeting = yield
        print(f'{greeting} {friend}')


async def greet(g):
    print('Starting...')
    await g
    print('Ending...')


greeter = greet(friend_upper())
greeter.send(None)
greeter.send('Hello')

Starting...
Hello ROLF


More about asynchronous Python:  
- https://www.youtube.com/watch?v=MCs5OvhV9S4  
- https://www.youtube.com/watch?v=ZzfHjytDceU  
- https://www.youtube.com/watch?v=9zinZmE3Ogk  
- https://www.youtube.com/watch?v=Obt-vMVdM8s  





Ver resposta da aula 210




In [6]:
import aiohttp
import asyncio

async def fetch_page(url):
    async with aiohttp.ClientSession() as session:
        async with session.get(url) as response:
            print(response.status)
            return response.status
        
loop = asyncio.get_event_loop()
loop.run_until_complete(fetch_page('http://google.com'))


RuntimeError: This event loop is already running

200


In [5]:
import aiohttp
import asyncio
import time

async def fetch_page(url):
    page_start = time.time()
    async with aiohttp.ClientSession() as session:
        async with session.get(url) as response:
            print(f"Page took {time.time() - page_start}")
            print(response.status)
            return response.status
        
loop = asyncio.get_event_loop()
tasks = [fetch_page("http://google.com") for i in range(50)]
start = time.time()
loop.run_until_complete(fetch_page('http://google.com'))
print(f"All took {time.time() - page_start}")


RuntimeError: This event loop is already running

Page took 0.21466708183288574
200


## Async request in Python

In [7]:
import asyncio
import async_timeout
import aiohttp
import time

async def fetch_page(session, url):
    async with async_timeout.timeout(10): ## It includes a security waiting process forced by run_until_complete
        start = time.time()
        async with session.get(url) as response:
            print(f'{url} took {time.time() - start}')
            return response.status


async def get_multiple_pages(loop, *urls):
    tasks = []
    async with aiohttp.ClientSession(loop=loop) as session:
        for url in urls:
            tasks.append(fetch_page(session, url))
        return await asyncio.gather(*tasks)
    
loop = asyncio.get_event_loop()

urls = ["http://google.com" for i in range(50)]
start = time.time()
loop.run_until_complete(get_multiple_pages(loop, *urls))
print(f"All took {time.time() - page_start}")


RuntimeError: This event loop is already running

http://google.com took 0.27380800247192383
http://google.com took 0.2909736633300781
http://google.com took 0.30023956298828125
http://google.com took 0.29531121253967285
http://google.com took 0.2821311950683594
http://google.com took 0.29773855209350586
http://google.com took 0.30672311782836914
http://google.com took 0.3088226318359375
http://google.com took 0.28849029541015625
http://google.com took 0.3092207908630371
http://google.com took 0.3067035675048828
http://google.com took 0.3094444274902344
http://google.com took 0.29774999618530273
http://google.com took 0.3016631603240967
http://google.com took 0.2922806739807129
http://google.com took 0.30062365531921387
http://google.com took 0.3143448829650879
http://google.com took 0.29723477363586426
http://google.com took 0.31242823600769043
http://google.com took 0.32497692108154297
http://google.com took 0.32184553146362305
http://google.com took 0.3253345489501953
http://google.com took 0.3212311267852783
http://google.com took

In [None]:
from bs4 import BeautifulSoup

import requests

page = requests.get("http://quotes.toscrape.com/")
soup = BeautifulSoup(page.content, "html.parser")

# Define the locators
quote_page_locator = "div.quote"
author_locator = "small.author"
quote_locator = "span.text"
tag_locator = "div a.tag"

print("******************************\n")

# Get data from each block of html
for content_html in soup.select(quote_page_locator):
    # print(content_html)

    author = content_html.select_one(author_locator)
    quote = content_html.select_one(quote_locator)
    tags = [tag.string for tag in content_html.select(tag_locator)]

    print("Author: ", author.string)
    print("Quote Text: ", quote.string)
    print("Tags: ", tags, "\n")
    #print("-----------\n")

print("******************************\n")

### [br.linkedin.com/in/jmilhomem](https://www.linkedin.com/in/jmilhomem/)