# Concurrency

### This script was created with the help of this StackAbuse tutorial: https://stackabuse.com/concurrency-in-python/

### Imports

In [35]:
import time
import nest_asyncio
import requests
import json
from urllib import request
import multiprocessing
import asyncio
import aiohttp
from Code.Lab10.service import save_image
from concurrent.futures import ThreadPoolExecutor

### Download images and save them into a folder
#### To see the images, check the downloads/ directory

We use the https://picsum.photos site to retrieve a list of pictures.
Then we split every link to get a file name and extension for the .jpg file we create.
Finally, we use the urlretrieve() method to save the file.
A timer around the whole run measures the execution time.

In [36]:
def download_images():
    # Fetch the picture list and return one ".jpg" download link per picture.
    response = requests.get("https://picsum.photos/v2/list")
    if response.status_code != 200:
        raise RuntimeError('GET /v2/list {}'.format(response.status_code))
    data = json.loads(response.text)

    pictures = []
    for s in data:
        pictures.append(s['download_url'] + ".jpg")
    return pictures

def saveImages(link):
    # The last URL segment looks like "<name>.jpg"; split it once into name and extension.
    name_parts = link.split('/')[6].split('.')
    filename, fileformat = name_parts[0], name_parts[1]
    request.urlretrieve(link, "downloads/{}.{}".format(filename, fileformat))

def main():
    images = download_images()
    for image in images:
        saveImages(image)

start_time = time.time()
main()
duration_synch = time.time() - start_time
print(f"Time taken to download 30 images into the downloads folder synchronously: {duration_synch}")


Time taken to download 30 images into the downloads folder synchronously: 11.287117719650269
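
The file-naming step in saveImages() can be checked without any network access. The sketch below applies the same split to a sample link. The URL is an assumed example of the shape `https://picsum.photos/id/<id>/<width>/<height>.jpg`, and `split_name` is a hypothetical helper that is not part of the notebook.

```python
def split_name(link):
    """Return (filename, fileformat) the same way saveImages() derives them."""
    # Splitting the sample on "/" yields:
    # ['https:', '', 'picsum.photos', 'id', '0', '5000', '3333.jpg']
    last_segment = link.split('/')[6]
    parts = last_segment.split('.')
    return parts[0], parts[1]


# Assumed example link; real links come from the download_url field plus ".jpg".
sample = "https://picsum.photos/id/0/5000/3333.jpg"
filename, fileformat = split_name(sample)
print("downloads/{}.{}".format(filename, fileformat))
```

With links of this shape, index 6 is the last path segment, so two pictures that share the same final segment would be written to the same file. If that matters, index 4 (the id in this assumed shape) may be a safer source for the file name.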


### Let's use multithreading!

This time we use multiple threads by creating a ThreadPoolExecutor, which handles creating and managing the threads.
We limit our program to a maximum of 5 threads.

In [37]:
def process_images_threading():
    images = download_images()
    with ThreadPoolExecutor(max_workers=5) as executor:
        # Consume the iterator so exceptions raised in worker threads surface here.
        list(executor.map(saveImages, images))

start_time = time.time()
process_images_threading()
duration_threading = time.time() - start_time
print(f"Time taken to download 30 images into the downloads folder with multithreading: {duration_threading}")

Time taken to download 30 images into the downloads folder with multithreading: 6.731784105300903
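
To see the effect of the thread pool without depending on the network, here is a minimal sketch that replaces the download with a short sleep. `fake_download` and `run_with_threads` are hypothetical names, and the 0.1 second delay is an arbitrary stand-in for I/O wait time.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_download(item):
    """Stand-in for an I/O-bound call such as a download (hypothetical)."""
    time.sleep(0.1)  # the thread waits here, letting other threads run
    return item * 2


def run_with_threads(items, max_workers=5):
    """Run fake_download over items in a pool and return (results, seconds)."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # map() yields results in input order; list() also surfaces worker errors.
        results = list(executor.map(fake_download, items))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    results, seconds = run_with_threads(range(10))
    print(results, f"{seconds:.2f}s")
```

Ten items with five workers should take about two rounds of waiting instead of ten, because the threads spend most of their time sleeping rather than computing.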


### Multiprocessing

Until now, our code ran in a single process.
With this approach, we spread the work across several processes so the program can use more than one CPU core.


#### Disclaimer
Multiprocessing did not work with a function defined inside the Jupyter notebook, so I had to move the saveImages() function into an external .py module.
Check save_image.py for the source code.
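
A likely explanation, offered here as an assumption rather than something the notebook confirms: multiprocessing hands the worker function to child processes by pickling it, and pickle stores a function by its module and name. When workers start in a fresh interpreter, a function that exists only in the notebook session cannot be looked up by the child process, while one in an importable .py module can. The sketch below only demonstrates the by-name rule with pickle itself; `can_pickle` is a hypothetical helper.

```python
import math
import pickle


def can_pickle(obj):
    """Return True if obj can be serialized with pickle, else False."""
    try:
        pickle.dumps(obj)
        return True
    except Exception:  # the exact error type depends on the object
        return False


# A function that lives in an importable module is pickled by its name.
print(can_pickle(math.sqrt))
# A lambda has no importable name, so pickling it fails.
print(can_pickle(lambda x: x))
```

The first call prints True and the second prints False, which mirrors why moving the worker into save_image.py makes it reachable from the worker processes.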

In [38]:
def process_images_multiprocessing():
    images = download_images()
    # The context manager shuts the worker pool down once map() has returned.
    with multiprocessing.Pool(multiprocessing.cpu_count()) as pool:
        pool.map(save_image.saveImages, images)


start_time = time.time()
process_images_multiprocessing()
duration_multiprocessing = time.time() - start_time
print(f"Time taken to download 30 images into the downloads folder with multiprocessing: {duration_multiprocessing}")

Time taken to download 30 images into the downloads folder with multiprocessing: 7.468779563903809


### AsyncIO

With this approach, the functions that perform I/O must be defined as coroutines with async def before we can run them with asyncio.run().
We also need to create a ClientSession when working with aiohttp.

In [39]:
async def download_images_asyncio(link, session):
    filename = link.split('/')[6].split('.')[0]
    fileformat = link.split('/')[6].split('.')[1]
    async with session.get(link) as response:
        with open("downloads/{}.{}".format(filename, fileformat), 'wb') as fd:
            async for data in response.content.iter_chunked(1024):
                fd.write(data)

async def main_asyncio():
    images = download_images()

    async with aiohttp.ClientSession() as session:
        tasks = [download_images_asyncio(image, session) for image in images]
        return await asyncio.gather(*tasks)

start_time = time.time()
nest_asyncio.apply()
# asyncio.run() is available in Python 3.7+; on an earlier version use:
# asyncio.get_event_loop().run_until_complete(main_asyncio())
asyncio.run(main_asyncio())
duration_asyncio = time.time() - start_time
print(f"Time taken to download 30 images into the downloads folder with asyncio: {duration_asyncio}")

Time taken to download 30 images into the downloads folder with asyncio: 7.343893766403198
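
The gather pattern used above can be tried in isolation with a simulated wait instead of a real HTTP request. This is a minimal sketch: `fake_download`, `gather_all`, and `run_demo` are hypothetical names, and `asyncio.sleep` stands in for the awaitable network call.

```python
import asyncio
import time


async def fake_download(item):
    """Stand-in for an awaitable I/O call such as session.get() (hypothetical)."""
    await asyncio.sleep(0.1)  # yields control to the event loop while "waiting"
    return item * 2


async def gather_all(items):
    # Create one coroutine per item and wait for all of them together.
    tasks = [fake_download(item) for item in items]
    return await asyncio.gather(*tasks)


def run_demo(items):
    """Run gather_all() in a fresh event loop and return (results, seconds)."""
    start = time.perf_counter()
    results = asyncio.run(gather_all(items))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    print(run_demo(range(5)))
```

All five waits overlap on a single thread, so the total time stays close to one wait rather than five. This sketch is meant to run as a plain script; inside a notebook, the nest_asyncio.apply() call from the cell above would still be needed before asyncio.run().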


### Let's compare the results!

In [40]:
print(f'Synchronous methods execution time: {duration_synch} seconds \n'
      f'Threading methods execution time: {duration_threading} seconds\n'
      f'Multiprocessing methods execution time: {duration_multiprocessing} seconds\n'
      f'Asyncio methods execution time: {duration_asyncio} seconds')



Synchronous methods execution time: 11.287117719650269 seconds 
Threading methods execution time: 6.731784105300903 seconds
Multiprocessing methods execution time: 7.468779563903809 seconds
Asyncio methods execution time: 7.343893766403198 seconds
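
To put the numbers on a common scale, the sketch below computes a speedup factor for each approach from the durations printed above. The values are copied from that output; the `durations` and `speedups` names are only for illustration.

```python
# Durations copied from the output above (seconds).
durations = {
    "synchronous": 11.287117719650269,
    "threading": 6.731784105300903,
    "multiprocessing": 7.468779563903809,
    "asyncio": 7.343893766403198,
}

baseline = durations["synchronous"]
# Speedup = synchronous time divided by the time of each approach.
speedups = {name: baseline / seconds for name, seconds in durations.items()}

for name, factor in speedups.items():
    print(f"{name}: {factor:.2f}x")
```

A factor above 1 means the approach finished faster than the synchronous run on this machine.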


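Now we see why concurrency is so important: each concurrent version needed roughly 34 to 40 percent less time than the synchronous one.

Remember that these results were measured on my machine. They WILL be different on other machines, depending on the number of CPUs and the CPU performance.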