# Assignment 1 - Threading and Multiprocessing

In this project, we will explore the difference between multithreading and multiprocessing. For that purpose, we have an imaginary colleague named John, who asks for your help to speed up his process for downloading images from the internet.

John already has a serial version of the code, but he doesn't know concurrent or parallel programming. Help John succeed in his mission by using multithreading and multiprocessing to speed up his task.

He has two tasks:

1. Download images from the internet.
2. Resize them to 128x128 px.


## Imports

In [9]:
import utils
import os

## Global Variables

In [10]:
NUM_OF_IMAGES = 500 # at most 12500 requests can be made per day
CLIENT_ID = utils.get_imgur_client_id()
IMAGES_DIR = utils.create_download_dir()

## 1. Downloading Images from Internet (Threading)

In this section, we will download some images from the internet. Because network-related tasks are I/O bound, they can be sped up by multithreading the download task: while one thread waits for a response, the others can keep working. John has already written the serial version; now it is your turn to write the multithreaded one.

You are free to choose any library you want. Your success will be based on your ability to beat John's timing.
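
To make the I/O-bound argument concrete, here is a minimal sketch that compares a serial loop with a thread pool. It does not touch the network: `fake_download` is a hypothetical stand-in that only sleeps, and the URLs are made up for illustration. It uses `concurrent.futures.ThreadPoolExecutor`, which is one option among several.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_download(url):
    """Hypothetical stand-in for a network request: sleeps instead of downloading."""
    time.sleep(0.05)  # a blocking wait, similar to waiting on a socket
    return url


def run_serial(urls):
    # One request after another: total time is roughly len(urls) * 0.05 s.
    return [fake_download(url) for url in urls]


def run_threaded(urls, max_workers=10):
    # executor.map gives each URL to exactly one worker and keeps input order.
    # Leaving the with-block waits for all workers to finish.
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        return list(executor.map(fake_download, urls))


def timed(func, *args):
    """Return (result, elapsed wall-clock seconds) for func(*args)."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


# Made-up URLs; nothing is actually requested.
urls = [f"https://example.invalid/{i}.jpg" for i in range(20)]
serial_result, serial_seconds = timed(run_serial, urls)
threaded_result, threaded_seconds = timed(run_threaded, urls)
print(f"serial:   {serial_seconds:.2f} s")
print(f"threaded: {threaded_seconds:.2f} s")
```

Two details matter for a fair comparison: each item should be handled by exactly one thread, and the timer should stop only after every thread has finished.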

### Serial Code of John

In [11]:
%%time

image_links = utils.build_link_list(CLIENT_ID, NUM_OF_IMAGES)

for image_link in image_links:
    utils.download_image_from_url(image_link, IMAGES_DIR)

api limit reached!
CPU times: user 20.3 ms, sys: 3.84 ms, total: 24.2 ms
Wall time: 355 ms


### Multithreading John's Task

In [12]:
%%time

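import threading

image_links = utils.build_link_list(CLIENT_ID, NUM_OF_IMAGES)
NUM_THREADS = 10

def download(links):
    # Each thread downloads only its own share of the links.
    for image_link in links:
        utils.download_image_from_url(image_link, IMAGES_DIR)

threads = []
for i in range(NUM_THREADS):
    # Slice the list so the links are split across threads without overlap.
    t = threading.Thread(target=download, args=(image_links[i::NUM_THREADS],))
    threads.append(t)
    t.start()

# Wait for every thread so that %%time covers the whole download.
for t in threads:
    t.join()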


api limit reached!
CPU times: user 18.6 ms, sys: 3.92 ms, total: 22.6 ms
Wall time: 43.6 ms


## 2. Resizing (Multiprocessing)

In this part, we resize the downloaded images to a new size, 128x128 px in this example. CPU-bound operations generally benefit from multiprocessing, which makes resizing a good fit.

You are free to choose any library you want. Your success will be based on your ability to beat John's timing.
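
As a rough illustration of the CPU-bound case, the sketch below spreads a pure-computation task over worker processes with `multiprocessing.Pool.map`. The function `count_primes` is a hypothetical stand-in for image resizing, chosen only because it needs no image library; the workload sizes are arbitrary.

```python
import multiprocessing


def count_primes(limit):
    """Hypothetical CPU-bound stand-in for resizing: count the primes below limit."""
    count = 0
    for n in range(2, limit):
        # n is prime if no d in [2, sqrt(n)] divides it.
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def run_serial(limits):
    return [count_primes(limit) for limit in limits]


def run_parallel(limits, processes=3):
    # Pool.map splits the inputs across worker processes, blocks until all
    # of them are done, and returns the results in input order.
    with multiprocessing.Pool(processes) as pool:
        return pool.map(count_primes, limits)


if __name__ == "__main__":
    limits = [20_000] * 6
    assert run_serial(limits) == run_parallel(limits)
    print(run_parallel([10, 100]))  # [4, 25]
```

The `if __name__ == "__main__":` guard is needed on platforms that start workers by spawning a fresh interpreter. On those platforms the worker function also has to live in an importable module rather than in a notebook cell.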

### Serial Code of John

In [13]:
%%time

# PS: time for 845 images : 10.1 s

image_path_list = os.listdir('images')

for image_path in image_path_list:
    utils.create_thumbnail((128, 128), os.path.join('images', image_path))

CPU times: user 27.1 ms, sys: 14.9 ms, total: 42 ms
Wall time: 54.3 ms


### Multiprocessing John's Task

In [14]:
import multiprocessing

In [15]:
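%%time
pool = multiprocessing.Pool(3)
for image_path in image_path_list:
    pool.apply_async(utils.create_thumbnail, args=((128, 128), os.path.join('images', image_path)))
# Stop accepting new work and wait for the workers, so %%time covers the actual resizing.
pool.close()
pool.join()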

CPU times: user 10.9 ms, sys: 13.2 ms, total: 24.1 ms
Wall time: 36.5 ms


## Conclusion

John is very happy with your help and wants to show his progress to his manager. Help him create a dataframe or table to present his results.

Create a table that shows how long each of the four approaches took. The table can take any form, as long as it shows the differences, as in the example below.

|Description | Time 
|:----------- | :---- 
|Task 1 | 19.2 sec
|Task 2 | 3.2 sec
|Task N | 6.2 sec
|... | ...
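
Rather than copying timings by hand, the rows could also be collected programmatically. The following is a minimal sketch using only the standard library; `measure` and `to_markdown` are hypothetical helpers, and the two `time.sleep` calls are placeholders for the real download and resize steps.

```python
import time


def measure(description, func, *args):
    """Run func(*args) once and return a (description, seconds) row."""
    start = time.perf_counter()
    func(*args)
    return description, time.perf_counter() - start


def to_markdown(rows):
    """Format (description, seconds) rows as a table in milliseconds."""
    lines = ["|Description | Time", "|:----------- | :----"]
    for description, seconds in rows:
        lines.append(f"|{description} | {seconds * 1000:.1f} ms")
    return "\n".join(lines)


# Placeholder workloads standing in for the four notebook cells.
rows = [
    measure("Short task", time.sleep, 0.01),
    measure("Long task", time.sleep, 0.03),
]
print(to_markdown(rows))
```

Measuring every approach with the same helper keeps the units consistent across rows.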

In [27]:
import pandas as pd

# DataFrame.append was removed in pandas 2.0, so build the frame from a list of rows.
rows = [
    {'Description': 'Serial Code Gather Images', 'Time': 'api limit reached'},
    {'Description': 'Multithreading Gather Images', 'Time': 'api limit reached'},
    {'Description': 'Serial Code Resize Images', 'Time': '54.3 ms'},
    {'Description': 'Multiprocessing Resize Images', 'Time': '36.5 ms'},
]
df = pd.DataFrame(rows, columns=['Description', 'Time'])
df

| |Description | Time
|:- |:----------- | :----
|0 |Serial Code Gather Images | api limit reached
|1 |Multithreading Gather Images | api limit reached
|2 |Serial Code Resize Images | 54.3 ms
|3 |Multiprocessing Resize Images | 36.5 ms
