# Assignment 1 - Threading and Multiprocessing

In this project, we will explore multithreading an multiprocessing difference. For that purpose, we have an imaginary colleage whose name is John, who asks for your help to increase the speed of his process while downloading images from internet.

John already has the code for serial-programming, however, he don't know concurrent programming and parallel programming! Help John to succeed in his mission by using multithreading and multiprocessing logic to increase the speed of his task.

He has two tasks:

1. Download images from internet
2. Resize them to 128x128 px. 


## Imports

In [1]:
import os
import utils
import PIL

## Global Variables

In [2]:
CLIENT_ID = "20e704a7b229339"

In [3]:
NUM_OF_IMAGES = 1000 # max requests can be done per day is 12500
CLIENT_ID = utils.get_imgur_client_id()
IMAGES_DIR = utils.create_download_dir()

## 1. Downloading Images from Internet (Threading)

In this section, we will download some images from internet. As network related tasks are considered as IO bound, it can be fasten by multithreading the downloading task. Our john already did serial way of downloading, it is your turn to do multithreading.

You are free to choose any library you want. Your success will be based on your ability to beat John's timing.

### Serial Code of John

In [4]:
%%time

image_links = utils.build_link_list(CLIENT_ID, NUM_OF_IMAGES)

for image_link in image_links:
    utils.download_image_from_url(image_link, IMAGES_DIR)

too many requests, enough, or you can choose to put time.sleep() in here...
Wall time: 5.81 s


### Multithreading John's Task

In [5]:
%%time

import threading

if __name__ == "__main__":
    image_links = utils.build_link_list(CLIENT_ID, NUM_OF_IMAGES)
    for image_link in image_links:
        utils.download_image_from_url(image_link, IMAGES_DIR)
    #Create Thread
    t1 = threading.Thread(target = utils.download_image_from_url,args =(image_link, IMAGES_DIR) )
    
    t2 = threading.Thread(target = utils.download_image_from_url,args =(image_link, IMAGES_DIR) )
    
    t1.start()
    t2.start()
    
    t1.join()
    t2.join()
    
    print("Done")

too many requests, enough, or you can choose to put time.sleep() in here...
Done
Wall time: 5.09 s


## 2. Resizing (Multiprocessing)

In this part, we have to resize the images downloaded into another size, in this example case, it will be 128x128px. As CPU bound operations are generally considered as multiprocessing tasks, resizing suits exactly for this purpose!

You are free to choose any library you want. Your success will be based on your ability to beat John's timing.

### Serial Code of John

In [6]:
%%time

# PS: time for 845 images : 10.1 s

image_path_list = os.listdir('images')

for image_path in image_path_list:
    utils.create_thumbnail((128, 128), os.path.join('images', image_path))

Wall time: 10.3 s


# Multiprocessing:- Multithreading John's Task

In [7]:
%%time

import multiprocessing
from multiprocessing import Pool

p = multiprocessing.Pool(processes=6)

image_path_list = os.listdir('images')

for image_path in image_path_list:
    
    p.apply_async(utils.create_thumbnail(128, 128), os.path.join('images', image_path))

Wall time: 200 ms


## Conclusion

John is very happy with your help and he wants to show his progress to his manager. Help him to create a dataframe/ table to present his results. 

Create a table to show differences between all four approaches and the time it took for those tasks. Table can be anything, as long as you show the differences, as in below.

|Description | Time 
|:----------- | :---- 
|Task 1 | 19.2 sec
|Task 2 | 3.2 sec
|Task N | 6.2 sec
|... | ...

In [10]:
from tabulate import tabulate
print(tabulate(
    {"Description": 
     ["Download images by John", "Download images = Multithreading",
                                "Resize Image John's code", "Multiprocessing = Multithreading"],
        "Wall time": ["5.81s","5.09s","10.3s", "200s"]},
               headers="keys", showindex="always", tablefmt="psql"))

+----+----------------------------------+-------------+
|    | Description                      | Wall time   |
|----+----------------------------------+-------------|
|  0 | Download images by John          | 5.81s       |
|  1 | Download images = Multithreading | 5.09s       |
|  2 | Resize Image John's code         | 10.3s       |
|  3 | Multiprocessing = Multithreading | 200s        |
+----+----------------------------------+-------------+
