##### College of Engineering, Construction and Living Sciences<br>Bachelor of Information Technology<br>IN710: Object-Oriented Systems Development<br>Level 7, Credits 15<br><br>Deadline: Tuesday, 7 April at 5pm

# Practical 17: Concurrency & Parallelism

In this **self-directed** practical, you will complete a series of tasks covering today's lecture. This practical is worth 1% of the final mark for the Object-Oriented Systems Development course.

In [None]:
%config IPCompleter.greedy=True

### Threading & Concurrency
**Task 1:** Answer the following questions:
1. What is concurrency?
2. What is the difference between a CPU-bound & an I/O-bound program?
3. What is a thread?
4. When the `threading` code cell below is executed, what happens?
5. The `Thread` constructor call in that cell passes three arguments. What are these arguments & their purpose?
6. What does the `Thread` method `start` do?
7. What does the `Thread` method `join` do?

**Resources:**
- Concurrency - https://en.wikipedia.org/wiki/Concurrency_(computer_science)
- Thread Wikipedia - https://en.wikipedia.org/wiki/Thread_(computing)
- Threading Module - https://docs.python.org/3/library/threading.html

In [None]:
# Write your answers below

# 1.
# 2.
# 3.
# 4.
# 5.
# 6.
# 7.

In [None]:
from threading import Thread
from time import perf_counter, sleep


def sleeping(secs):
    print(f'Going to sleep for {secs} second(s)')
    sleep(secs)
    print(f'Woke up after {secs} second(s)')


def main():
    start = perf_counter()

    threads = [Thread(target=sleeping, args=[5], daemon=True) for _ in range(5)]

    for t in threads:
        t.start()

    for t in threads:
        t.join()

    finish = perf_counter()
    print(f'Process finished in {round(finish - start, 2)} second(s)')


if __name__ == '__main__':
    main()

### ThreadPoolExecutor

`ThreadPoolExecutor`, from the `concurrent.futures` module, is a higher-level alternative to creating `Thread` objects directly with the `threading` module.

**Resources:**
- Concurrent Futures Module - https://docs.python.org/3/library/concurrent.futures.html
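
`executor.map` returns results in the order of its inputs, even when a later task finishes first. The following is a minimal sketch, not part of the practical's tasks, that contrasts this with `submit` and `as_completed`, which yield results in the order the tasks finish. The helper names `collect_in_input_order` and `collect_in_completion_order` are hypothetical, and the delays are shortened so the cell runs quickly.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from time import sleep


def sleeping(secs):
    # Stand-in for an I/O-bound task: the thread simply waits.
    sleep(secs)
    return f'Woke up after {secs} second(s)'


def collect_in_input_order(delays):
    """Hypothetical helper: map() yields results in the order of `delays`."""
    with ThreadPoolExecutor(max_workers=len(delays)) as executor:
        return list(executor.map(sleeping, delays))


def collect_in_completion_order(delays):
    """Hypothetical helper: as_completed() yields futures as they finish."""
    with ThreadPoolExecutor(max_workers=len(delays)) as executor:
        futures = [executor.submit(sleeping, d) for d in delays]
        return [f.result() for f in as_completed(futures)]


print(collect_in_input_order([0.4, 0.2, 0.05]))
print(collect_in_completion_order([0.4, 0.2, 0.05]))
```

With these delays, the first list should start with the 0.4 second result and the second list with the 0.05 second result.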

In [None]:
from concurrent.futures import ThreadPoolExecutor
from time import perf_counter, sleep


def sleeping(secs):
    print(f'Going to sleep for {secs} second(s)')
    sleep(secs)
    return f'Woke up after {secs} second(s)'


def main():
    start = perf_counter()

    with ThreadPoolExecutor() as executor:
        secs = [5, 4, 3, 2, 1]
        results = executor.map(sleeping, secs)
        for r in results:
            print(r)

    finish = perf_counter()
    print(f'Process finished in {round(finish - start, 2)} second(s)')


if __name__ == '__main__':
    main()

### Threading & Picsum API

**Task 2:** In this task, you will use your solution from practical 15, question 5 to download images from the **Picsum API** to your current directory. Downloading all 10 images one after another takes about eight seconds; with threading, it takes about two seconds. Refactor the `main` function so that it uses the `ThreadPoolExecutor` class & its `map` method.
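
The kind of speed-up described above can be seen without touching the network. The sketch below is an illustration only, not the task solution: `fake_download` is a hypothetical stand-in that sleeps for 0.2 seconds instead of making a request, and `timed`, `sequential` and `threaded` are made-up helper names.

```python
from concurrent.futures import ThreadPoolExecutor
from time import perf_counter, sleep


def fake_download(name):
    # Hypothetical stand-in for a network request: waits instead of downloading.
    sleep(0.2)
    return f'{name} was downloaded.'


def sequential(names):
    # One "download" at a time, so the waits add up.
    return [fake_download(n) for n in names]


def threaded(names):
    # One worker thread per name, so the waits overlap.
    with ThreadPoolExecutor(max_workers=len(names)) as executor:
        return list(executor.map(fake_download, names))


def timed(run, names):
    # Return the results together with the elapsed wall-clock time.
    start = perf_counter()
    results = run(names)
    return results, perf_counter() - start


names = [f'img{i}' for i in range(5)]
seq_results, seq_time = timed(sequential, names)
thr_results, thr_time = timed(threaded, names)
print(f'Sequential: {seq_time:.2f} s, threaded: {thr_time:.2f} s')
```

Both versions return the same results; only the elapsed time differs, because the threads spend their waiting time at the same moment rather than one after another.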

In [None]:
from concurrent.futures import ThreadPoolExecutor
from requests import get
from time import perf_counter


def download_img(url):
    img_bytes = get(url).content
    img_name = ''.join(url.split('/')[4:])
    img_name = f'{img_name}.jpg'
    with open(img_name, 'wb') as f:
        f.write(img_bytes)
        print(f'{img_name} was downloaded.')


def main():
    start = perf_counter()
    
    # Make a request to the Picsum API
    # Get the download URLs from the response and append them to a list
    
    # Use a context manager with ThreadPoolExecutor() as executor
        # Call the executor's map method - pass in a function & iterable

    finish = perf_counter()
    print(f'Process finished in {round(finish - start, 2)} second(s)')


if __name__ == '__main__':
    main()

### Multi-Processing & Parallelism
**Task 3:** Answer the following questions:
1. What is parallelism?
2. When the `multiprocessing` code cell below is executed, what happens?
3. What does the `Process` method `start` do?
4. What does the `Process` method `join` do?

**Resources:**
- Parallel Computing - https://en.wikipedia.org/wiki/Parallel_computing
- Multi-Processing Module - https://docs.python.org/3/library/multiprocessing.html

In [None]:
# Write your answers below

# 1.
# 2.
# 3.
# 4.

In [None]:
from multiprocessing import Process
from time import perf_counter, sleep


def sleeping(secs):
    print(f'Going to sleep for {secs} second(s)')
    sleep(secs)
    print(f'Woke up after {secs} second(s)')


def main():
    start = perf_counter()

    processes = [Process(target=sleeping, args=[5]) for _ in range(5)]

    for p in processes:
        p.start()

    for p in processes:
        p.join()

    finish = perf_counter()
    print(f'Process finished in {round(finish - start, 2)} second(s)')


if __name__ == '__main__':
    main()

### ProcessPoolExecutor

`ProcessPoolExecutor`, from the `concurrent.futures` module, is a higher-level alternative to creating `Process` objects directly with the `multiprocessing` module.

In [None]:
from concurrent.futures import ProcessPoolExecutor
from time import perf_counter, sleep


def sleeping(secs):
    print(f'Going to sleep for {secs} second(s)')
    sleep(secs)
    return f'Woke up after {secs} second(s)'


def main():
    start = perf_counter()

    with ProcessPoolExecutor() as executor:
        secs = [5, 4, 3, 2, 1]
        results = executor.map(sleeping, secs)
        for r in results:
            print(r)

    finish = perf_counter()
    print(f'Process finished in {round(finish - start, 2)} second(s)')


if __name__ == '__main__':
    main()

### Multi-Processing & Photo Filtering

**Task 4:** 
1. In this task, you will experiment with multi-processing by applying an image filter to each downloaded image. Use the `glob` module to get all downloaded images in the current directory with the extension `.jpg`. Append these images to a list called `imgs`. Refactor the `main` function so that it uses the `ProcessPoolExecutor` class & its `map` method.
2. Make a note of the process time. Then refactor the code again so that it uses `ThreadPoolExecutor` instead of `ProcessPoolExecutor`. Execute the code & compare the process time.
3. Is this program CPU-bound or I/O-bound?

**Resources:**
- Glob Module - https://docs.python.org/3/library/glob.html
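
As a rough illustration of how `glob` matches files by extension, the sketch below builds a temporary directory with a few placeholder files and lists only those ending in `.txt`. The helper name `list_by_extension` and the file names are hypothetical; the task itself works on the `.jpg` files in your current directory.

```python
from glob import glob
from os.path import basename, join
from tempfile import TemporaryDirectory


def list_by_extension(directory, extension):
    """Hypothetical helper: sorted file names in `directory` ending in `extension`."""
    pattern = join(directory, f'*{extension}')
    return sorted(basename(path) for path in glob(pattern))


with TemporaryDirectory() as tmp:
    # Create placeholder files so there is something to match.
    for name in ['a.txt', 'b.txt', 'c.log']:
        with open(join(tmp, name), 'w') as f:
            f.write('placeholder')
    txt_files = list_by_extension(tmp, '.txt')

print(txt_files)  # ['a.txt', 'b.txt']
```

`glob` returns a plain list of matching paths, so the result can be passed straight to an executor's `map` method as the iterable.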

In [None]:
from concurrent.futures import ProcessPoolExecutor
from glob import glob
from os import chdir
from PIL import Image, ImageFilter
from requests import get
from time import perf_counter


def filter_img(img_name):
    img = Image.open(img_name)
    img = img.filter(ImageFilter.GaussianBlur(15))
    img.save(img_name)
    print(f'{img_name} was processed.')


def main():
    start = perf_counter()

    # Get all images in the current directory with the extension .jpg and append to a list

    # Use a context manager with ProcessPoolExecutor() as executor
        # Call the executor's map method - pass in a function & iterable

    finish = perf_counter()
    print(f'Process finished in {round(finish - start, 2)} second(s)')


if __name__ == '__main__':
    main()

# Submission
1. Create a new branch named `17-checkpoint` within your practicals GitHub repository
2. Create a new pull request and assign Grayson-Orr to review your submission

**Note:** Please don't merge your own pull request.