### Homework 07: Concurrency

## Due Date: Apr 5, 2021, 04:00pm

#### Firstname Lastname: 

#### E-mail: 

#### Enter your solutions and submit this notebook


---

**Problem 1** **(60 Points)**

Let us consider the Gamma function, or the Euler integral of the second kind: 

$$\Gamma(x) = \int_{0} ^ \infty t ^{x - 1} e^{-t} dt, $$

and in this HW we consider real $x > 0$.

(For more on the Gamma function, see https://en.wikipedia.org/wiki/Gamma_function .
This background is not needed for this HW assignment.)

**1.1 (Points 15)**: 

Write a function (in the cell below) that sequentially calculates the given Gamma integral.


In [1]:
import numpy as np

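def calculate_gamma(x, bound_1, bound_2, number_of_steps):
    """Approximate the Gamma integral over [bound_1, bound_2] with the rectangle rule."""
    res, t = 0, bound_1
    # Width of one integration step
    step = (bound_2 - bound_1)/number_of_steps
    
    # Add integrand(t) * step at every grid point up to bound_2
    while t <= bound_2:
        res += t ** (x - 1) * np.exp(-t) * step
        t += step

    return res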
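
The same rectangle-rule sum can also be written without an explicit Python loop. The following is only a minimal sketch, not the required solution: the name `calculate_gamma_vectorized` is hypothetical, it assumes `number_of_steps` is an integer, it evaluates exactly `number_of_steps` left endpoints, and it holds all grid points in memory at once.

```python
import numpy as np

def calculate_gamma_vectorized(x, bound_1, bound_2, number_of_steps):
    """Left Riemann sum of t**(x - 1) * exp(-t) over [bound_1, bound_2]."""
    step = (bound_2 - bound_1) / number_of_steps
    # Left endpoint of every sub-interval: bound_1, bound_1 + step, ...
    t = bound_1 + step * np.arange(number_of_steps)
    # Evaluate the integrand on the whole grid and sum the rectangle areas
    return float(np.sum(t ** (x - 1) * np.exp(-t)) * step)

# A smaller grid than in the assignment, chosen only to keep the demo quick
approx = calculate_gamma_vectorized(x=6, bound_1=0, bound_2=100, number_of_steps=1_000_000)
print(approx)
```

Because the integrand is evaluated on a NumPy array, the per-step work happens in compiled code rather than in the interpreter loop.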

**1.2 (Points 5)** 

Evaluate $\Gamma(6)$ using `calculate_gamma(x, bound_1, bound_2, number_of_steps)` and report the error of this computation.


As arguments, use `x=6, bound_1=0, bound_2=1000, number_of_steps=10_000_000`. For a positive integer $n$, $\Gamma(n) = (n-1)!$, so $\Gamma(6) = 5! = 120$. 
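
As a quick sanity check on the reference value, the standard library's `math.gamma` can be compared against the factorial. This is a minimal sketch and not part of the required solution; it relies on the identity $\Gamma(n) = (n-1)!$ for a positive integer $n$.

```python
import math

# For a positive integer n, Gamma(n) equals (n - 1)!
n = 6
reference = math.gamma(n)          # library value of Gamma(6)
expected = math.factorial(n - 1)   # 5!

print(reference, expected)
```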


In [2]:
import time

start = time.time()
gamma_6 = calculate_gamma(x = 6, bound_1 = 0, bound_2 = 2000, number_of_steps=10000000)
end = time.time()
print("finished in {}s".format(end - start))

finished in 16.18636441230774s


In [3]:
print("result: {}, error: {}".format(gamma_6, 120 - gamma_6))

result: 119.99999999987826, error: 1.2174439234513557e-10


---

Write two functions to calculate $\Gamma(x)$ by using:



**1.3.1 (Points 15)**
**threading** with N=4 threads; 

**1.3.2 (Points 15)**
**multiprocessing** with N=4 processes. 


**1.3.3 (Points 10)** 
Compare the times of the three versions and write a short explanation of what you are observing.

How does the answer change when N=8 and why?

    

### 1.3.1

In [4]:
import concurrent.futures
import time

def calculate_gamma_threading(x, bound_1, bound_2, number_of_steps, num_threads):
    # Split [bound_1, bound_2] into num_threads equal sub-intervals
    chunk_size = (bound_2 - bound_1)/num_threads
    x_iter = [x] * num_threads
    # Each sub-interval gets an equal share of the integration steps
    num_steps = [number_of_steps/num_threads] * num_threads
    
    starts, ends = [], []
    for i in range(num_threads):
        start = bound_1 + i * chunk_size
        end = bound_1 + (i + 1) * chunk_size
        if end <= bound_2:
            starts.append(start)
            ends.append(end)
    
    time1 = time.time()
    
    # Each thread integrates one sub-interval; map returns results in input order
    with concurrent.futures.ThreadPoolExecutor(max_workers=num_threads) as executor:
        results = executor.map(calculate_gamma, x_iter, starts, ends, num_steps)
    
    time2 = time.time()
    print("finished in {}".format(time2 - time1))
    # The partial integrals add up to the integral over the full range
    return 'result: {}'.format(sum(results))
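
The chunk boundaries above are built by repeated floating-point arithmetic, and the `end <= bound_2` guard could in principle skip the last chunk if rounding pushed `end` slightly past `bound_2`. One possible alternative, sketched here under the assumption that equal-width chunks are wanted, derives the edges from `np.linspace`, whose last value is exactly the requested endpoint. The helper name `split_interval` is hypothetical.

```python
import numpy as np

def split_interval(bound_1, bound_2, num_chunks):
    """Return (start, end) pairs that cover [bound_1, bound_2] in equal chunks."""
    # num_chunks sub-intervals need num_chunks + 1 edges
    edges = np.linspace(bound_1, bound_2, num_chunks + 1)
    return [(float(a), float(b)) for a, b in zip(edges[:-1], edges[1:])]

chunks = split_interval(0, 2000, 4)
# Integer division keeps the per-chunk step count a whole number
steps_per_chunk = 10_000_000 // 4

print(chunks, steps_per_chunk)
```

The resulting pairs could be passed to the worker function in place of the `starts` and `ends` lists.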

In [5]:
# N = 4
calculate_gamma_threading(x = 6, bound_1 = 0, bound_2 = 2000, number_of_steps=10000000, num_threads = 4)

finished in 16.21132731437683


'result: 119.99999999987826'

In [6]:
# N = 8
calculate_gamma_threading(x = 6, bound_1 = 0, bound_2 = 2000, number_of_steps=10000000, num_threads = 8)

finished in 16.148525714874268


'result: 119.99999999987826'

### 1.3.2

In [8]:
from multiprocessing.pool import Pool
import time

def calculate_gamma_pool(x, bound_1, bound_2, number_of_steps, num_processes):
    # Split the range and the integration steps equally among the processes
    chunk_size = (bound_2 - bound_1)/num_processes
    chunk_steps = number_of_steps/num_processes
    
    # One argument tuple (x, start, end, steps) per sub-interval
    bounds = []
    for i in range(num_processes):
        start = bound_1 + i * chunk_size
        end = bound_1 + (i + 1) * chunk_size
        if end <= bound_2:
            bounds.append((x, start, end, chunk_steps))
    
    time1 = time.time()
    
    # starmap unpacks each tuple into the arguments of calculate_gamma
    with Pool(num_processes) as pool:
        result = pool.starmap(calculate_gamma, bounds)
        
    time2 = time.time()
    print("finished in {}".format(time2 - time1))
    # The partial integrals add up to the integral over the full range
    return 'result: {}'.format(sum(result))

In [9]:
# N = 4
calculate_gamma_pool(x = 6, bound_1 = 0, bound_2 = 2000, number_of_steps=10000000, num_processes=4)

finished in 4.351138353347778


'result: 119.99999999987826'

In [10]:
# N = 8
calculate_gamma_pool(x = 6, bound_1 = 0, bound_2 = 2000, number_of_steps=10000000, num_processes=8)

finished in 2.250614881515503


'result: 119.99999999987826'

### 1.3.3

In this task, multiprocessing is much faster than the other two approaches because the computation is CPU-bound. The sequential version took about 16.2 s, and threading took essentially the same time (about 16.2 s with N=4 and 16.1 s with N=8). Adding threads does not help because CPython's global interpreter lock (GIL) lets only one thread execute Python bytecode at a time, so the threads take turns instead of running in parallel and the work is effectively sequential. With multiprocessing, each process has its own interpreter and can run on a separate CPU core, so the chunks are computed simultaneously: about 4.4 s with N=4 and 2.3 s with N=8. Going from N=4 to N=8 roughly halves the time again, which suggests the machine has enough cores to run all eight processes at once.
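
To put the comparison in numbers, the speedup of each version relative to the sequential run can be computed from the recorded timings. This is a small sketch: the timings are copied from the cell outputs above, and the variable names are chosen only for illustration.

```python
# Wall-clock times in seconds, copied from the cell outputs above
sequential = 16.18636441230774
timings = {
    "threading, N=4": 16.21132731437683,
    "threading, N=8": 16.148525714874268,
    "multiprocessing, N=4": 4.351138353347778,
    "multiprocessing, N=8": 2.250614881515503,
}

# Speedup = sequential time / concurrent time
speedups = {label: sequential / t for label, t in timings.items()}
for label, s in speedups.items():
    print(f"{label}: {s:.2f}x")
```

A speedup near 1 means no gain over the sequential run, while a speedup near N means the work was spread almost evenly over N workers.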

---

**Problem 2 (40 points)**

__Website uptime__ is the time that a website or web service is available to the users over a given period.

The task is to build an application that checks the uptime of websites. 

- The application should go over a list of website URLs and check whether those websites are up.
- Instead of a classic HTTP GET request, it performs a HEAD request, which returns only the response headers and therefore does not affect traffic significantly.
- If the HTTP status code is in one of the error ranges (4xx or 5xx), a warning message is issued. 

Here are some useful functions:

In [11]:
#### _website uptimer_ ####

import time
import logging
import requests
 
class WebsiteDownException(Exception):
    pass
 
def ping_website(address, timeout=20):
    """
    Check if a website is down. A website is considered down 
    if either the status_code >= 400 or if the timeout expires
     
    Throw a WebsiteDownException if any of the website down conditions are met
    """
    try:
        response = requests.head(address, timeout=timeout)
        if response.status_code >= 400:
            logging.warning("Website %s returned status_code=%s" % (address, response.status_code))
            raise WebsiteDownException()
    except requests.exceptions.RequestException:
        logging.warning("Timeout expired for website %s" % address)
        raise WebsiteDownException()
         
def check_website(address):
    """
    Utility function: check if a website is down, if so, notify the user
    """
    try:
        ping_website(address)
    except WebsiteDownException:
        print('The websie ' + address + ' is down')

---

You need a website list to try our system out. Create your own list or use the following one. 

---

In [12]:
WEBSITE_LIST = [
    'http://amazon.co.uk',
    'http://amazon.com',
    'http://facebook.com',
    'http://google.com',
    'http://google.fr',
    'http://google.es',
    'http://google.co.uk',
    'http://gmail.com',
    'http://stackoverflow.com',
    'http://github.com',
    'http://heroku.com',
    'http://really-cool-available-domain.com',
    'http://djangoproject.com',
    'http://rubyonrails.org',
    'http://basecamp.com',
    'http://trello.com',
    'http://shopify.com',
    'http://another-really-interesting-domain.co',
    'http://airbnb.com',
    'http://instagram.com',
    'http://snapchat.com',
    'http://youtube.com',
    'http://baidu.com',
    'http://yahoo.com',
    'http://live.com',
    'http://linkedin.com',
    'http://netflix.com',
    'http://wordpress.com',
    'http://bing.com',
]

---

A serial version of the _website uptimer_ is given in cell `In [13]` below, after the task description.

---


You should build two versions of the **website uptimer**, by using:

**2.1 (Points 15)**
**threading** with N=4 threads; 

**2.2 (Points 15)**
**multiprocessing** with N=4 processes. 


**2.3 (Points 10)** 

Compare the times of the three versions and write a short explanation of what you are observing.

How does the answer change when N=8 and why?


In [13]:
import time
 
start_time = time.time()
 
for address in WEBSITE_LIST:
    check_website(address)
         
end_time = time.time()        
 
print("Time for Serial: %ssecs" % (end_time - start_time))



The websie http://really-cool-available-domain.com is down
The websie http://another-really-interesting-domain.co is down
Time for Serial: 2.4581668376922607secs


### 2.1

In [14]:
# N = 4
start_time = time.time()

with concurrent.futures.ThreadPoolExecutor(max_workers = 4) as executor:
    executor.map(check_website, WEBSITE_LIST)

end_time = time.time()

print(end_time - start_time)



The websie http://really-cool-available-domain.com is down
The websie http://another-really-interesting-domain.co is down
0.9698996543884277


In [15]:
# N = 8
start_time = time.time()

with concurrent.futures.ThreadPoolExecutor(max_workers = 8) as executor:
    executor.map(check_website, WEBSITE_LIST)

end_time = time.time()

print(end_time - start_time)



The websie http://really-cool-available-domain.com is down
The websie http://another-really-interesting-domain.co is down
0.5381925106048584


### 2.2

In [16]:
# N = 4
start_time = time.time()

with Pool(4) as pool:
    pool.map(check_website, WEBSITE_LIST)

end_time = time.time()

print(end_time - start_time)



The websie http://really-cool-available-domain.com is down
The websie http://another-really-interesting-domain.co is down
1.126875638961792


In [17]:
# N = 8
start_time = time.time()

with Pool(8) as pool:
    pool.map(check_website, WEBSITE_LIST)

end_time = time.time()

print(end_time - start_time)



The websie http://really-cool-available-domain.com is down
The websie http://another-really-interesting-domain.co is down
0.6441435813903809


### 2.3

In this task, threading is slightly faster than multiprocessing, and both are much faster than the serial version (about 2.46 s), because the task is I/O-bound: most of the time is spent waiting for responses from the websites rather than computing. Both approaches overlap these waits. A thread that is blocked waiting for a response releases the GIL, so another thread can send its request in the meantime. Separate processes overlap their waits as well, but starting processes and passing data between them costs more than starting threads, which likely explains why multiprocessing is a little slower here (1.13 s vs. 0.97 s with N=4, and 0.64 s vs. 0.54 s with N=8). Increasing N from 4 to 8 speeds up both versions because more requests are in flight at the same time.
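
The overlap of waiting times can be illustrated without any network access. The following is a minimal sketch in which `time.sleep` stands in for waiting on a response; `fake_request` and the delay values are assumptions made for this demo, not part of the uptimer.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_request(delay):
    """Stand-in for a network call: the thread only waits, as it does while blocked on I/O."""
    time.sleep(delay)
    return delay

delays = [0.05] * 8  # eight simulated requests of 50 ms each

# Serial: the waits add up one after another
start = time.perf_counter()
for d in delays:
    fake_request(d)
serial_time = time.perf_counter() - start

# Threaded: up to four waits run at the same time
start = time.perf_counter()
with ThreadPoolExecutor(max_workers=4) as executor:
    results = list(executor.map(fake_request, delays))
threaded_time = time.perf_counter() - start

print(f"serial: {serial_time:.2f}s, threaded: {threaded_time:.2f}s")
```

Since sleeping threads do not hold the GIL, the threaded run should finish in roughly two rounds of waiting instead of eight.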