## Assignment 7: Concurrency

__Website uptime__ is the time that a website or web service is available to the users over a given period.

The task is to build an application that checks the uptime of websites. 

- The application should go over a list of website URLs and check whether those websites are up.
- Instead of performing a classic HTTP GET request, it performs a HEAD request, which returns only the response headers and so does not affect traffic significantly.
- If the HTTP status is in a danger range (4xx client errors or 5xx server errors), a message is displayed.
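The danger-range rule in the last bullet can be sketched as a small helper. The function name `describe_status` and its return labels are hypothetical, chosen only for illustration; they are not part of the assignment code.

```python
def describe_status(status_code):
    """Classify an HTTP status code (hypothetical helper, for illustration only)."""
    if 400 <= status_code < 500:
        return "client error"   # 4xx: e.g. 404 Not Found
    if status_code >= 500:
        return "server error"   # 5xx: e.g. 503 Service Unavailable
    return "ok"                 # anything below 400 is not in a danger range


for code in (200, 301, 404, 503):
    print(code, describe_status(code))
```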

Here are some useful functions:

In [4]:
#### _website uptimer_ ####

import time
import logging
import requests
 
class WebsiteDownException(Exception):
    pass
 
def ping_website(address, timeout=20):
    """
    Check if a website is down. A website is considered down 
    if either the status_code >= 400 or if the timeout expires
     
    Throw a WebsiteDownException if any of the website down conditions are met
    """
    try:
        response = requests.head(address, timeout=timeout)
        if response.status_code >= 400:
            logging.warning("Website %s returned status_code=%s" % (address, response.status_code))
            raise WebsiteDownException()
    except requests.exceptions.RequestException:
        logging.warning("Timeout expired for website %s" % address)
        raise WebsiteDownException()
         
def check_website(address):
    """
    Utility function: check if a website is down, if so, notify the user
    """
    try:
        ping_website(address)
    except WebsiteDownException:
        print('The website '+address+' is down')

You need a website list to try our system out. Create your own list or use the following one:

In [3]:
WEBSITE_LIST = [
    'http://envato.com',
    'http://amazon.co.uk',
    'http://amazon.com',
    'http://facebook.com',
    'http://google.com',
    'http://google.fr',
    'http://google.es',
    'http://google.co.uk',
    'http://internet.org',
    'http://gmail.com',
    'http://stackoverflow.com',
    'http://github.com',
    'http://heroku.com',
    'http://really-cool-available-domain.com',
    'http://djangoproject.com',
    'http://rubyonrails.org',
    'http://basecamp.com',
    'http://trello.com',
    'http://yiiframework.com',
    'http://shopify.com',
    'http://another-really-interesting-domain.co',
    'http://airbnb.com',
    'http://instagram.com',
    'http://snapchat.com',
    'http://youtube.com',
    'http://baidu.com',
    'http://yahoo.com',
    'http://live.com',
    'http://linkedin.com',
    'http://yndex.ru',
    'http://netflix.com',
    'http://wordpress.com',
    'http://bing.com',
]

A serial version of the _website uptimer_ can be written as: 

In [6]:
import time
 
start_time = time.time()
 
for address in WEBSITE_LIST:
    check_website(address)
         
end_time = time.time()        
 
print("Time for Serial: %ssecs" % (end_time - start_time))



The website http://live.com is down
The website http://netflix.com is down
The website http://bing.com is down
Time for Serial: 5.274498701095581secs


You should build two versions of the _website uptimer_, one using threads and another using multiprocessing. Compare the time of each version and write a short explanation of what you are observing.
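To compare the versions fairly, it may help to time them all the same way. The sketch below assumes a hypothetical helper named `timed`; it uses `time.perf_counter`, which is intended for measuring short intervals, rather than `time.time`.

```python
import time


def timed(func, *args, **kwargs):
    """Run func(*args, **kwargs) and return (result, elapsed_seconds).

    Hypothetical helper, not part of the assignment code.
    """
    start = time.perf_counter()
    result = func(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed


# Usage with a stand-in workload; replace it with the serial, threaded
# or multiprocessing version of the uptimer.
result, elapsed = timed(sum, range(1000))
print("Result %s computed in %.6f secs" % (result, elapsed))
```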

## Threading

In [9]:
#### _website uptimer_ ####

import time
import logging
import requests
 
class WebsiteDownException(Exception):
    pass
 
def ping_website(address, timeout=20):
    """
    Check if a website is down. A website is considered down 
    if either the status_code >= 400 or if the timeout expires
     
    Throw a WebsiteDownException if any of the website down conditions are met
    """
    try:
        response = requests.head(address, timeout=timeout)
        if response.status_code >= 400:
            logging.warning("Website %s returned status_code=%s" % (address, response.status_code))
            raise WebsiteDownException()
    except requests.exceptions.RequestException:
        logging.warning("Timeout expired for website %s" % address)
        raise WebsiteDownException()
         
def check_website_thread(q):
    """
    Worker loop: take addresses from the queue, check each one and
    notify the user if the website is down
    """
    while True:
        web_link = q.get()
        try:
            ping_website(web_link)
        except WebsiteDownException:
            print('The website '+web_link+' is down')
        finally:
            # always mark the item as done so that q.join() can return
            q.task_done()

In [10]:
from queue import Queue
from threading import Thread

start_time = time.time()

q = Queue()
num_threads = 12

for i in range(num_threads):
    # daemon=True lets the program exit even though the workers loop forever
    worker = Thread(target=check_website_thread, args=(q,), daemon=True)
    worker.start()         # start the thread

for address in WEBSITE_LIST:
    q.put(address)
q.join()  # block until every queued address has been processed

end_time = time.time()

print("Time for Threading: %ssecs" % (end_time - start_time))




The website http://live.com is down
The website http://bing.com is down
The website http://netflix.com is down
Time for Threading: 1.304875135421753secs
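The same queue-and-worker pattern could also be written with `concurrent.futures.ThreadPoolExecutor` from the standard library, which manages the worker threads and the queue internally. This is a sketch of an alternative, not the approach used above; `check_all` and `fake_check` are hypothetical names, and `fake_check` stands in for a real website check so the example runs without network access.

```python
from concurrent.futures import ThreadPoolExecutor


def check_all(addresses, check, max_workers=12):
    """Call check(address) for every address using a pool of threads.

    Returns the results in the same order as the input addresses.
    Hypothetical helper: `check` would be the website check in the uptimer.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # executor.map preserves input order; list() waits for all results
        return list(executor.map(check, addresses))


def fake_check(address):
    """Stand-in for a real check: report addresses containing 'down'."""
    return (address, "down" not in address)


results = check_all(["http://up.example", "http://down.example"], fake_check)
for address, is_up in results:
    print(address, "is up" if is_up else "is down")
```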


## Multi-Processing

In [5]:
WEBSITE_LIST = [
    'http://envato.com',
    'http://amazon.co.uk',
    'http://amazon.com',
    'http://facebook.com',
    'http://google.com',
    'http://google.fr',
    'http://google.es',
    'http://google.co.uk',
    'http://internet.org',
    'http://gmail.com',
    'http://stackoverflow.com',
    'http://github.com',
    'http://heroku.com',
    'http://really-cool-available-domain.com',
    'http://djangoproject.com',
    'http://rubyonrails.org',
    'http://basecamp.com',
    'http://trello.com',
    'http://yiiframework.com',
    'http://shopify.com',
    'http://another-really-interesting-domain.co',
    'http://airbnb.com',
    'http://instagram.com',
    'http://snapchat.com',
    'http://youtube.com',
    'http://baidu.com',
    'http://yahoo.com',
    'http://live.com',
    'http://linkedin.com',
    'http://yndex.ru',
    'http://netflix.com',
    'http://wordpress.com',
    'http://bing.com',
]

In [6]:
#### _website uptimer_ ####

import time
import logging
import requests
 
class WebsiteDownException(Exception):
    pass
 
def ping_website(address, timeout=20):
    """
    Check if a website is down. A website is considered down 
    if either the status_code >= 400 or if the timeout expires
     
    Throw a WebsiteDownException if any of the website down conditions are met
    """
    try:
        response = requests.head(address, timeout=timeout)
        if response.status_code >= 400:
            logging.warning("Website %s returned status_code=%s" % (address, response.status_code))
            raise WebsiteDownException()
    except requests.exceptions.RequestException:
        logging.warning("Timeout expired for website %s" % address)
        raise WebsiteDownException()
         
def check_website_mprocessing(address):
    """
    Utility function: check if a website is down, if so, notify the user
    """
    try:
        ping_website(address)
    except WebsiteDownException:
        print('The website '+address+' is down')

In [None]:

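from multiprocessing.pool import Pool

# The guard keeps worker processes from re-running this code on
# platforms that start workers by spawning a fresh interpreter.
if __name__ == "__main__":
    start_time = time.time()

    # Distribute the URLs over 8 worker processes
    with Pool(8) as p:
        p.map(check_website_mprocessing, WEBSITE_LIST)

    end_time = time.time()

    print("Time for Mprocessing: %ssecs" % (end_time - start_time))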

## Discussion

Both versions give similar timings, around 1.30 sec, compared with about 5.27 sec for the serial version. Which approach is faster depends mainly on the type of task. If the task had been CPU-intensive, then multiprocessing might have run faster, because each process runs its own Python interpreter and is not held back by the global interpreter lock (GIL). For an I/O-bound task, such as calling an API, threading is usually enough, because a thread releases the GIL while it waits for a response. This task pings websites over the internet and spends most of its time waiting on the network, so the time for both methods is quite similar.
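One way to see why threads help with network-bound work is to replace the request with a short sleep, which waits without using the CPU, much like waiting for a response. The sketch below is a simulation under that assumption, not a measurement of the uptimer; `fake_request`, `run_serial` and `run_threaded` are hypothetical names.

```python
import time
from concurrent.futures import ThreadPoolExecutor

WAIT = 0.05    # simulated network latency per request, in seconds
N_TASKS = 8


def fake_request(i):
    """Simulate an I/O-bound call: sleeping releases the GIL, as waiting on a socket does."""
    time.sleep(WAIT)
    return i


def run_serial():
    """Run the simulated requests one after another."""
    start = time.perf_counter()
    results = [fake_request(i) for i in range(N_TASKS)]
    return results, time.perf_counter() - start


def run_threaded():
    """Run the simulated requests in a pool with one thread per request."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=N_TASKS) as executor:
        results = list(executor.map(fake_request, range(N_TASKS)))
    return results, time.perf_counter() - start


serial_results, serial_time = run_serial()
threaded_results, threaded_time = run_threaded()
print("Serial:   %.3f secs" % serial_time)
print("Threaded: %.3f secs" % threaded_time)
```

The serial run has to wait for each simulated request in turn, while the threaded run overlaps the waits, so it should finish in roughly the time of a single request.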