### Solutions - Homework 07: Concurrency

## Due Date: Apr 13, 2020, 08:00am

#### Firstname Lastname: 

#### E-mail: 

#### Enter your solutions and submit this notebook


---

**Problem 1** **(60 Points)**

Let us consider the Gamma function, or the Euler integral of the second kind: 

$$\Gamma(x) = \int_{0} ^ \infty t ^{x - 1} e^{-t} dt, $$

and in this HW we consider real $x > 0$.

(More on the Gamma function: https://en.wikipedia.org/wiki/Gamma_function .
It is not needed for this HW assignment.) 

**1.1 (Points 15)**: 

Write a function (in the cell below) that sequentially calculates the given Gamma integral.


In [52]:
def calculate_gamma(x, bound_1, bound_2, number_of_steps):
    # Sequential version to calculate Gamma(x):
    # approximate the given integral by a discrete sum over
    # number_of_steps equidistant points on the interval [bound_1, bound_2].
    
    # return Gamma(x)

    pass


In [53]:
import numpy as np
from math import exp

def calculate_gamma(x, bound_1, bound_2, number_of_steps):
    # Sequential version to calculate Gamma(x):
    # approximate the given integral by a discrete sum over
    # number_of_steps equidistant points on the interval [bound_1, bound_2].
    step = (bound_2 - bound_1) / number_of_steps  # weight of each sample point
    s = 0.0
    for t in np.linspace(bound_1, bound_2, number_of_steps):
        s += t ** (x - 1) * exp(-t) * step

    return s

**1.2 (Points 5)** 

Evaluate $\Gamma(6)$ using `calculate_gamma(x, bound_1, bound_2, number_of_steps)` and report the error of this computation.


As arguments, use `x=6, bound_1=0, bound_2=1000, number_of_steps=10_000_000`. For a positive integer $n$, $\Gamma(n) = (n - 1)!$, so $\Gamma(6) = 5! = 120$. 
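
The reference value 120 can be checked by hand. A short derivation, using integration by parts (the boundary term vanishes for $x > 0$):

$$\Gamma(x + 1) = \int_{0}^{\infty} t^{x} e^{-t} dt = \left[ -t^{x} e^{-t} \right]_{0}^{\infty} + x \int_{0}^{\infty} t^{x - 1} e^{-t} dt = x \, \Gamma(x), \qquad \Gamma(1) = \int_{0}^{\infty} e^{-t} dt = 1,$$

so that $\Gamma(6) = 5 \cdot 4 \cdot 3 \cdot 2 \cdot 1 \cdot \Gamma(1) = 120$.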


In [54]:
gamma_num = calculate_gamma(x=6, bound_1=0, bound_2=1000, number_of_steps=10_000_000)
print(gamma_num - 120)

-1.2000053061456128e-05
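
The error above is close to $-120 / 10^7$. This appears consistent with the sum weighting each point by `(bound_2 - bound_1) / number_of_steps`, while `np.linspace` with `number_of_steps` points has spacing `(bound_2 - bound_1) / (number_of_steps - 1)`. The following is a minimal sketch, not part of the required solution, of a vectorized variant that uses the actual grid spacing. The name `calculate_gamma_vectorized`, the smaller bound of 100, and the point count are assumptions chosen to keep the example light; `math.gamma` serves as the reference value.

```python
import math
import numpy as np

def calculate_gamma_vectorized(x, bound_1, bound_2, number_of_points):
    """Hypothetical variant: Riemann sum weighted by the actual grid spacing."""
    # retstep=True also returns the spacing between consecutive grid points
    t, step = np.linspace(bound_1, bound_2, number_of_points, retstep=True)
    return float(np.sum(t ** (x - 1) * np.exp(-t)) * step)

approx = calculate_gamma_vectorized(x=6, bound_1=0, bound_2=100, number_of_points=100_001)
reference = math.gamma(6)  # standard-library reference value
print(approx - reference)
```

Note that this sketch evaluates the integrand at `t = 0`, so it is only suitable for `x >= 1`.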


---

Write two functions to calculate $\Gamma(x)$ by using:



**1.3.1 (Points 15)**
**threading** with N=4 threads; 

**1.3.2 (Points 15)**
**multiprocessing** with N=4 processes. 


**1.3.3 (Points 10)** 
Compare the times of the three versions and write a short explanation of what you are observing.

How does the answer change when N=8 and why?

    

In [55]:
from queue import Queue
from time import time
from threading import Thread, Lock

integral = 0.0  # shared accumulator, updated by the worker threads
lock = Lock()   # protects updates to `integral`

class GammaWorker(Thread):
    def __init__(self, queue):
        Thread.__init__(self)
        self.queue = queue
    
    def run(self):
        global integral
        while True:
            (x, bound_1, bound_2, number_of_steps) = self.queue.get()
            # Compute the partial sum outside the lock, so the lock is held
            # only while the shared accumulator is updated
            partial = calculate_gamma(x, bound_1, bound_2, number_of_steps)
            with lock:
                integral += partial
            self.queue.task_done()
    
N = 4
number_of_steps = 10_000_000
bound_1 = 0 
bound_2 = 1000
x = 6

ts = time()
# Create a queue to communicate with the worker threads
queue = Queue()
    
# Create N worker threads
for _ in range(N):
    worker = GammaWorker(queue)
    worker.daemon = True
    worker.start()

# Put the tasks into the queue as tuples, one sub-interval per task
width = (bound_2 - bound_1) / N
for i in range(N):
    queue.put((x, bound_1 + i * width, bound_1 + (i + 1) * width,
               number_of_steps // N))

# Wait until all tasks have been processed
queue.join()
print(integral, '-->', time() - ts, 's')

119.99995199994225 --> 6.642161846160889 s


In [56]:
# multiprocessing version
from time import time
from multiprocessing.pool import Pool
from math import exp
import numpy as np


def calculate_gamma_parallel(input_values):
    x, bound_1, bound_2, number_of_steps = input_values
    s = 0
    for t in np.linspace(bound_1, bound_2, number_of_steps):
        s += t ** (x - 1) * exp(-t) * ((bound_2 - bound_1) / number_of_steps)
    return s

N = 4
number_of_steps = 10_000_000
bound_1 = 0 
bound_2 = 1_000
x = 6
chunks = [[x, bound_1 + i * (bound_2 - bound_1) / N, bound_1 + (i + 1) * (bound_2 - bound_1) / N,
           number_of_steps // N] for i in range(N)]

ts = time()
with Pool(N) as p:
    results = p.map(calculate_gamma_parallel, chunks)

print(sum(results), '-->', time() - ts, 's')

119.99995199994225 --> 2.3473944664001465 s
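
Regarding 1.3.3: in the recorded runs the threaded version took about 6.6 and the process pool about 2.3 (differences of `time()` are in seconds). The integration loop is CPU-bound pure Python, and in CPython the global interpreter lock lets only one thread execute Python bytecode at a time, so threads are not expected to speed it up, while separate processes can run on separate cores. With N=8, a further gain from processes is only to be expected if at least 8 cores are available. The sketch below shows one way to write the threaded variant without a global accumulator and lock, using `concurrent.futures`. The names `partial_gamma` and `gamma_threaded`, and the smaller bound and step count, are assumptions for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from math import exp

def partial_gamma(task):
    """Left Riemann sum of t**(x-1) * exp(-t) over one sub-interval (hypothetical helper)."""
    x, lower, upper, steps = task
    step = (upper - lower) / steps
    return sum((lower + k * step) ** (x - 1) * exp(-(lower + k * step)) * step
               for k in range(steps))

def gamma_threaded(x, bound_1, bound_2, number_of_steps, n_workers=4):
    # Split [bound_1, bound_2] into n_workers sub-intervals of equal width
    width = (bound_2 - bound_1) / n_workers
    tasks = [(x, bound_1 + i * width, bound_1 + (i + 1) * width,
              number_of_steps // n_workers) for i in range(n_workers)]
    # Each worker returns its partial sum, so no shared state is needed
    with ThreadPoolExecutor(max_workers=n_workers) as executor:
        return sum(executor.map(partial_gamma, tasks))

result = gamma_threaded(6, 0, 100, 200_000)
print(result)
```

`ProcessPoolExecutor` offers the same `map` interface, so a process-based counterpart could be obtained by swapping the executor class (on platforms that spawn processes, the call should then sit under an `if __name__ == "__main__":` guard).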


---

**Problem 2 (40 points)**

__Website uptime__ is the time that a website or web service is available to the users over a given period.

The task is to build an application that checks the uptime of websites. 

- The application should go over a list of website URLs and check whether those websites are up.
- Instead of performing a classic HTTP GET request, it performs a HEAD request so that it does not affect traffic significantly.
- If the HTTP status is in the danger ranges (400+, 500+), a message is printed. 

Here are some useful functions:

In [57]:
#### _website uptimer_ ####

import time
import logging
import requests
 
class WebsiteDownException(Exception):
    pass
 
def ping_website(address, timeout=20):
    """
    Check if a website is down. A website is considered down 
    if either the status_code >= 400 or the request fails
    (for example, because the timeout expires).
     
    Raise a WebsiteDownException if any of the website-down conditions are met.
    """
    try:
        response = requests.head(address, timeout=timeout)
    except requests.exceptions.RequestException:
        # Covers timeouts as well as connection and DNS errors
        logging.warning("Request failed for website %s", address)
        raise WebsiteDownException()
    if response.status_code >= 400:
        logging.warning("Website %s returned status_code=%s", address, response.status_code)
        raise WebsiteDownException()
         
def check_website(address):
    """
    Utility function: check if a website is down and, if so, notify the user
    """
    try:
        ping_website(address)
    except WebsiteDownException:
        print('The website ' + address + ' is down')

---

You need a website list to try our system out. Create your own list or use the following one. 

---

In [60]:
WEBSITE_LIST = [
    'http://amazon.co.uk',
    'http://amazon.com',
    'http://facebook.com',
    'http://google.com',
    'http://google.fr',
    'http://google.es',
    'http://google.co.uk',
    'http://gmail.com',
    'http://stackoverflow.com',
    'http://github.com',
    'http://heroku.com',
    'http://really-cool-available-domain.com',
    'http://djangoproject.com',
    'http://rubyonrails.org',
    'http://basecamp.com',
    'http://trello.com',
    'http://shopify.com',
    'http://another-really-interesting-domain.co',
    'http://airbnb.com',
    'http://instagram.com',
    'http://snapchat.com',
    'http://youtube.com',
    'http://baidu.com',
    'http://yahoo.com',
    'http://live.com',
    'http://linkedin.com',
    'http://netflix.com',
    'http://wordpress.com',
    'http://bing.com',
]

---

A serial version of the _website uptimer_ can be written as: 

---


In [61]:
import time
 
start_time = time.time()
 
for address in WEBSITE_LIST:
    check_website(address)
         
end_time = time.time()        
 
print("Time for Serial: %s secs" % (end_time - start_time))

The website http://really-cool-available-domain.com is down
The website http://another-really-interesting-domain.co is down
The website http://netflix.com is down
Time for Serial: 13.558212757110596 secs


You should build two versions of the **website uptimer**, by using:

**2.1 (Points 15)**
**threading** with N=4 threads; 

**2.2 (Points 15)**
**multiprocessing** with N=4 processes. 


**2.3 (Points 10)** 

Compare the times of the three versions and write a short explanation of what you are observing.

How does the answer change when N=8 and why?


In [62]:
####
#### Solution with threads ###
####

import time
from queue import Queue
from threading import Thread
 
NUM_WORKERS = 4
task_queue = Queue()
 
def worker():
    # Constantly check the queue for addresses
    while True:
        address = task_queue.get()
        check_website(address)
         
        # Mark the processed task as done
        task_queue.task_done()

start_time = time.time()
         
# Create the worker threads; daemon threads do not block interpreter exit
# even though the workers loop forever
threads = [Thread(target=worker, daemon=True) for _ in range(NUM_WORKERS)]
 
# Add the websites to the task queue
for item in WEBSITE_LIST:
    task_queue.put(item)
 
# Start all the workers
for thread in threads:
    thread.start()
 
# Wait for all the tasks in the queue to be processed
task_queue.join()
 
         
end_time = time.time()        
 
print("Time for ThreadedSquirrel: %s secs" % (end_time - start_time))

The website http://really-cool-available-domain.com is down
The website http://another-really-interesting-domain.co is down
The website http://netflix.com is down
Time for ThreadedSquirrel: 3.7716572284698486 secs
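
The same queue-and-worker pattern can also be expressed with `concurrent.futures.ThreadPoolExecutor`, which manages the threads and the task distribution itself. The following is a minimal sketch under stated assumptions: `find_down_websites` and `fake_is_up` are hypothetical names, and `fake_is_up` replaces the real HEAD request so that the example runs offline.

```python
from concurrent.futures import ThreadPoolExecutor

def find_down_websites(addresses, is_up, num_workers=4):
    """Return the addresses for which is_up(address) is False, in input order."""
    with ThreadPoolExecutor(max_workers=num_workers) as executor:
        # map preserves the order of the inputs in its results
        statuses = list(executor.map(is_up, addresses))
    return [address for address, up in zip(addresses, statuses) if not up]

def fake_is_up(address):
    # Hypothetical stand-in for a real HEAD request, so the sketch runs offline
    return "really" not in address

sample = [
    'http://google.com',
    'http://really-cool-available-domain.com',
    'http://github.com',
]
down = find_down_websites(sample, fake_is_up)
print(down)
```

In a real run, `is_up` could wrap a request-based check such as the `ping_website` function from this notebook and return `False` when it raises.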


In [63]:
####
#### Solution with multiprocessing ###
####

import time
import multiprocessing
 
NUM_WORKERS = 4
 
start_time = time.time()
 
with multiprocessing.Pool(processes=NUM_WORKERS) as pool:
    results = pool.map_async(check_website, WEBSITE_LIST)
    results.wait()

end_time = time.time()        
 
print("Time for MultiProcessingSquirrel: %s secs" % (end_time - start_time))

The website http://really-cool-available-domain.com is down
The website http://another-really-interesting-domain.co is down
The website http://netflix.com is down
Time for MultiProcessingSquirrel: 2.5546627044677734 secs
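
Regarding 2.3: the recorded runs took about 13.6 s serially, 3.8 s with 4 threads, and 2.6 s with 4 processes. Unlike Problem 1, this task is I/O-bound: most of the time is spent waiting for network responses, and a thread that is blocked on I/O does not hold the interpreter lock, so threads and processes both help. With N=8 more requests can overlap, so a further reduction is plausible, bounded below by the slowest single request. The sketch below illustrates the effect offline; `simulated_check`, `timed_run`, and the 0.05 s latency are assumptions, with `time.sleep` standing in for the network wait.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def simulated_check(address, latency=0.05):
    # time.sleep stands in for waiting on a network response (assumption)
    time.sleep(latency)
    return address

def timed_run(addresses, num_workers):
    """Run all simulated checks with num_workers threads and return the elapsed time."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=num_workers) as executor:
        list(executor.map(simulated_check, addresses))
    return time.perf_counter() - start

addresses = ['site-%d' % i for i in range(8)]
serial_time = timed_run(addresses, num_workers=1)
threaded_time = timed_run(addresses, num_workers=4)
print("1 worker: %.2f s, 4 workers: %.2f s" % (serial_time, threaded_time))
```

With one worker the waits add up, while with four workers they overlap, which mirrors the difference between the serial and threaded uptime checks.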
