# Using web APIs

In this lab session we will use three web APIs. All of them represent their resources as JSON and exchange information over the HTTP protocol:
- a simple brewery API
- an API with data on the coronavirus
- a trading API (for this one you will need to create an account to get your own API key)

In [121]:
!pip install plotly
!pip install numpy
!pip install pandas



In [2]:
import json
import http.client
import mimetypes

from datetime import datetime, timedelta

import plotly.offline as py
from plotly.offline import init_notebook_mode
import plotly.graph_objs as go
import plotly.express as px

import pandas as pd
import numpy as np

### A brewery example

To use web APIs based on the HTTP protocol, we first need to be able to make an HTTP connection.

Python's standard library includes the module http.client, which can do that for you.

The http.client documentation itself recommends the third-party library "requests" as a higher-level interface, but we will stick to the standard library here.

Documentation for this very simple demo brewery API can be found here: https://www.openbrewerydb.org/documentation/01-listbreweries

The base endpoint is https://api.openbrewerydb.org/breweries

By adding parameters to the URL, i.e. key-value pairs after the question mark, we can select breweries per city.

Example: https://api.openbrewerydb.org/breweries?by_city=san_diego
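
Typing the key-value pairs by hand works for simple values, but values containing spaces or special characters need to be percent-encoded. A minimal sketch using the standard library's `urllib.parse`; the helper name `build_endpoint` is our own, and whether the brewery API expects `san_diego` or an encoded space is something to check in its documentation:

```python
from urllib.parse import urlencode, quote


def build_endpoint(path, params):
    """Return the path followed by a percent-encoded query string."""
    # quote_via=quote encodes a space as %20 instead of the default "+"
    query = urlencode(params, quote_via=quote)
    return f"{path}?{query}" if query else path


endpoint = build_endpoint("/breweries", {"by_city": "san diego"})
print(endpoint)  # /breweries?by_city=san%20diego
```

The resulting string can be passed as the second argument of `conn.request("GET", ...)`.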

Let's first make an HTTPS connection to api.openbrewerydb.org

In [3]:
base_url = "api.openbrewerydb.org"

conn = http.client.HTTPSConnection(base_url)

### Now we can use the connection object to make a request

For API calls where we just want to retrieve a resource object, we typically use the HTTP GET method (but always verify with the documentation)

In [4]:
endpoint = "/breweries"
conn.request("GET", endpoint)

### Next we retrieve the response and parse its JSON body into a Python object

In [5]:
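def get_response(conn):
    """Return the parsed JSON body, or None if the status code is not 200."""
    response = conn.getresponse()
    body = response.read()  # always read the body so the connection can be reused
    if response.status == 200:  # OK status code as defined by the HTTP protocol
        return json.loads(body)
    print(f"status code: {response.status}")
    return None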
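
To see what the parsing step does without touching the network, here is a small sketch on a made-up response body (the brewery name below is invented for illustration). `json.loads` accepts the raw bytes returned by `response.read()` and gives back ordinary Python lists and dictionaries:

```python
import json

# Made-up body, shaped like the list of objects the brewery API returns
raw_body = b'[{"name": "Example Brewing", "city": "Tucson"}]'

breweries = json.loads(raw_body)  # bytes in, Python list of dicts out
print(type(breweries).__name__, breweries[0]["city"])
```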

In [6]:
response = get_response(conn)
conn.close()

In [7]:
# this is what one element in the response list looks like
print(json.dumps(response[0], indent=2))

{
  "id": 2,
  "name": "Avondale Brewing Co",
  "brewery_type": "micro",
  "street": "201 41st St S",
  "city": "Birmingham",
  "state": "Alabama",
  "postal_code": "35222-1932",
  "country": "United States",
  "longitude": "-86.774322",
  "latitude": "33.524521",
  "phone": "2057775456",
  "website_url": "http://www.avondalebrewing.com",
  "updated_at": "2018-08-23T23:19:57.825Z",
  "tag_list": []
}


### Let's print the brewery names and cities

In [8]:
for brewery in response:
    print(brewery["name"],",",brewery["city"])

Avondale Brewing Co , Birmingham
Trim Tab Brewing , Birmingham
Yellowhammer Brewery , Huntsville
Bearpaw River Brewing Co , Wasilla
King Street Brewing Co , Anchorage
1912 Brewing , Tucson
Bad Water Brewing , Scottsdale
BJs Restaurant & Brewery - Chandler , Chandler
BlackRock Brewers , Tucson
Dragoon Brewing Co , Tucson
Grand Canyon Brewing Company , Williams
Mudshark Brewing Co , Lake Havasu City
Richter Aleworks , Peoria
SanTan Brewing Co , Chandler
State 48 Brewery , Surprise
Wren House Brewing Company , Phoenix
Brick Oven Pizza Co / Brick & Forge Brewing , Paragould
Diamond Bear Brewing Co , North Little Rock
Lost Forty Brewing , Little Rock
Rapp's Barren Brewing Company , Mountain Home


### Sometimes we want specific resources, e.g. breweries in certain cities
Typically the API documentation will tell you how. <br>
To select breweries in one city, the brewery API allows us to specify a parameter "by_city" in the URL. <br>
Let's get all breweries in Tucson.

In [10]:
city = "Tucson"
endpoint = f"/breweries?by_city={city}"

conn = http.client.HTTPSConnection(base_url)
conn.request("GET", endpoint)
response = get_response(conn)
conn.close()

for brewery in response:
    print(brewery["name"], ",", brewery["city"])

1912 Brewing , Tucson
BlackRock Brewers , Tucson
Dragoon Brewing Co , Tucson
Button Brew House, LLC , Tucson
Catalina Brewing Company , Tucson
Copper Mine Brewing Co , Tucson
Barrio Brewing Co , Tucson
Corbett Brewing Company , Tucson
Crooked Tooth Brewing Co. , Tucson
Dillinger Brewing Company , Tucson
Iron John's Brewing Company , Tucson
Green Feet Brewing , Tucson
Public Brewhouse , Tucson
Pueblo Vida Brewing Co , Tucson
Ten Fifty Five Brewing , Tucson
Thunder Canyon Brewery , Tucson
Sentinel Peak Brewing Company , Tucson
Borderlands Brewing Co , Tucson
The Address Brewing / 1702 Beer & Pizza , Tucson


### You can read this response directly into a pandas DataFrame
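
A minimal sketch of that step: `pd.DataFrame` accepts a list of dictionaries and turns each key into a column. The two-element `sample_response` below is a cut-down stand-in for the real `response` list, with names and cities taken from the output above:

```python
import pandas as pd

# Stand-in for the list of dictionaries returned by the brewery API
sample_response = [
    {"name": "1912 Brewing", "city": "Tucson"},
    {"name": "Avondale Brewing Co", "city": "Birmingham"},
]

brewery_df = pd.DataFrame(sample_response)  # one row per brewery, one column per key
print(brewery_df)
```

With the real data, `pd.DataFrame(response)` should work the same way as long as `response` is a list of flat dictionaries.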

# Now it's your turn

### Let's make some plots to compare the progress of the number of corona infections in China, Belgium, the Netherlands, Italy, Spain and the US

To visualize exponential growth we will make plots like the ones demonstrated here: https://www.youtube.com/watch?v=54XLXg4fYsc&t=305s

We will use the data provided by the web API: covidapi.info
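
Why does plotting the weekly increase against the running total on log-log axes help? For a purely exponential series with daily growth rate r, the increase over the last 7 days is always the same fraction, 1 - exp(-7r), of the current total, so the points fall on a straight line with slope 1. A small sketch on synthetic data (the growth rate and starting value are arbitrary assumptions, not real figures):

```python
import numpy as np

growth_rate = 0.2                          # assumed daily growth rate, purely illustrative
days = np.arange(60)
total = 100 * np.exp(growth_rate * days)   # synthetic cumulative counts

new_last_week = total[7:] - total[:-7]     # increase over the previous 7 days
ratio = new_last_week / total[7:]          # constant for pure exponential growth

# Slope of the straight line in log-log space
slope = np.polyfit(np.log(total[7:]), np.log(new_last_week), 1)[0]
print(ratio.min(), ratio.max(), slope)
```

A country whose curve drops below that straight line is no longer growing exponentially.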

In [16]:
list_of_countries = ["CHN","BEL","NLD","ITA","ESP","USA"]

### Let's first make a class that simplifies reuse of the connection object, requests and responses, and that catches and deals with some potential errors

In [11]:
class CoronaSDK():
    def __init__(self, host):
        self.host = host
        self.conn = http.client.HTTPSConnection(self.host)
    
    #to enable the use of the with statement
    def __enter__(self):
        return self
    
    #to enable the use of the with statement: always close the connection whatever happens
    def __exit__(self, type, value, traceback):
        self.close()
    
    def close(self):
        self.conn.close()
        self.conn = None
    
    def get(self, endpoint, retries=1):
        try:
            # send the request and wait for the response
            self.conn.request("GET", endpoint)
            response = self.conn.getresponse()
            body = response.read()  # always read the body so the connection can be reused
        except (http.client.NotConnected, http.client.RemoteDisconnected):
            # the other end closed the connection: reset it and retry a limited number of times
            self.conn.close()  # the next request reopens the connection automatically
            if retries > 0:
                return self.get(endpoint, retries - 1)
            print("Error: could not reconnect")
            return None
        except http.client.HTTPException as httpe:
            print(f"Error: {httpe}")
            return None

        if response.status == 200:  # OK status code as defined by HTTP protocol
            return json.loads(body)
        print(f"status code: {response.status}")
        return None
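
The `__enter__` and `__exit__` methods exist so that the client can be used in a `with` statement, which closes the connection even if an error occurs inside the block. A minimal sketch of that mechanism, using a hypothetical `ManagedResource` stand-in class so that no network access is needed:

```python
class ManagedResource:
    """Stand-in for a client that holds something which must be closed."""

    def __init__(self):
        self.is_open = True

    def __enter__(self):
        return self  # the object bound by "with ... as name"

    def __exit__(self, exc_type, exc_value, traceback):
        self.close()   # runs on normal exit and when an exception is raised
        return False   # do not swallow exceptions

    def close(self):
        self.is_open = False


with ManagedResource() as resource:
    print(resource.is_open)  # True inside the block

print(resource.is_open)  # False once the block has ended
```

With the class above, `with CoronaSDK("covidapi.info") as client:` would follow the same pattern.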

### Let's define a function that uses an object instantiation of the CoronaSDK class to fetch the data for a specified country

In [12]:
def get_covid_data_for_country(client, country_code):
    today_str = datetime.now().strftime("%Y-%m-%d")
    endpoint = f"/api/v1/country/{country_code}/timeseries/2020-01-01/{today_str}"
    return client.get(endpoint)

### Let's use an object of the CoronaSDK class and the above defined function to retrieve the data for Belgium

In [14]:
client = CoronaSDK("covidapi.info")
response = get_covid_data_for_country(client, "BEL")
n_records = response["count"]
print(n_records)

85


In [19]:
records = []
for country_code in list_of_countries:
    country_recs = get_covid_data_for_country(client, country_code)["result"]
    for rec in country_recs:
        rec["country"] = country_code
    records.extend(country_recs)

In [20]:
covid_df = pd.DataFrame(records, columns=["confirmed","date","deaths","recovered","country"])
print(covid_df)

     confirmed        date  deaths  recovered country
0          548  2020-01-22      17         28     CHN
1          643  2020-01-23      18         30     CHN
2          920  2020-01-24      26         36     CHN
3         1406  2020-01-25      42         39     CHN
4         2075  2020-01-26      56         49     CHN
..         ...         ...     ...        ...     ...
505     526396  2020-04-11   20463      31270     USA
506     555313  2020-04-12   22020      32988     USA
507     580619  2020-04-13   23529      43482     USA
508     607670  2020-04-14   25832      47763     USA
509     636350  2020-04-15   28326      52096     USA

[510 rows x 5 columns]


In [23]:
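# one row per date, one column per country; "sum" avoids the deprecation warning newer pandas versions give for np.sum
pivot_df = pd.pivot_table(covid_df, values="deaths", index="date", columns="country", aggfunc="sum")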
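
To make the reshaping concrete, here is a sketch of the same `pivot_table` call on a tiny made-up table (the death counts are invented for illustration). The long format, with one row per date and country, becomes a wide format with one row per date and one column per country:

```python
import pandas as pd

# Made-up records in the same long format as covid_df
long_df = pd.DataFrame({
    "date": ["2020-04-14", "2020-04-14", "2020-04-15", "2020-04-15"],
    "country": ["BEL", "NLD", "BEL", "NLD"],
    "deaths": [10, 20, 15, 28],
})

wide_df = pd.pivot_table(long_df, values="deaths", index="date",
                         columns="country", aggfunc="sum")
print(wide_df)
```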

In [24]:
diff_df = pivot_df.diff(periods=7)
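
`diff(periods=7)` subtracts from each row the value seven rows earlier, which turns a cumulative series into "increase over the last week" when there is one row per day. A small sketch on a made-up cumulative series:

```python
import pandas as pd

# Made-up cumulative counts, one value per day
cumulative = pd.Series([1, 2, 4, 7, 11, 16, 22, 29, 37], name="deaths")

weekly_new = cumulative.diff(periods=7)  # value today minus value 7 days ago
print(weekly_new)
```

The first seven entries are NaN because there is no value seven days earlier to subtract.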

In [25]:
def create_country_traces(df):
    traces = []
    for col in df.columns:
        traces.append(go.Scatter(x=df.index, y=df[col].values,
                             name = col,
                             mode = 'markers+lines',
                             line=dict(shape='linear'),
                             connectgaps=True
                             )
                 )
    return traces

In [26]:
fig = go.Figure()
for trace in create_country_traces(pivot_df):
    fig.add_trace(trace)
fig.update_layout(
    width=1920,
    height=1080
)
fig.show()

In [27]:
fig = go.Figure()
for trace in create_country_traces(pivot_df):
    fig.add_trace(trace)
fig.update_layout(
    yaxis_type="log",
    width=1920,
    height=1080
)
fig.show()

In [28]:
merged_df = pivot_df.merge(diff_df, left_index=True, right_index=True, suffixes=("_total","_difference"))

In [29]:
merged_df.columns

Index(['BEL_total', 'CHN_total', 'ESP_total', 'ITA_total', 'NLD_total',
       'USA_total', 'BEL_difference', 'CHN_difference', 'ESP_difference',
       'ITA_difference', 'NLD_difference', 'USA_difference'],
      dtype='object', name='country')
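
The column names above come from the `suffixes` argument: when both frames have a column with the same name, `merge` keeps both and appends the left suffix to one and the right suffix to the other. A sketch with two made-up single-column frames:

```python
import pandas as pd

dates = ["2020-04-14", "2020-04-15"]
totals = pd.DataFrame({"BEL": [10, 15]}, index=dates)  # made-up totals
diffs = pd.DataFrame({"BEL": [3, 5]}, index=dates)     # made-up weekly increases

# Join on the index; the shared column name "BEL" gets one suffix per side
merged = totals.merge(diffs, left_index=True, right_index=True,
                      suffixes=("_total", "_difference"))
print(merged.columns.tolist())  # ['BEL_total', 'BEL_difference']
```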

In [30]:
fig = go.Figure()
for country_code in list_of_countries:
    col_total = f"{country_code}_total"
    col_difference = f"{country_code}_difference"
    fig.add_trace(go.Scatter(x=merged_df[col_total], y=merged_df[col_difference],
                             name = country_code,
                             mode = 'markers+lines',
                             line=dict(shape='linear'),
                             connectgaps=True
                )
    )
fig.update_layout(
    yaxis_title="New deaths in last week (log scale)",
    xaxis_title="Total deaths (log scale)",
    xaxis_type="log",
    yaxis_type="log",
    width=1920,
    height=1080
)
fig.show()

In [None]:
import os

# Never hard-code a secret API key in a notebook: read it from an environment variable instead
IEX_TOKEN = os.environ.get("IEX_TOKEN")

In [51]:
class IEXSDK():
    def __init__(self, host, token=None):
        self.host = host
        self.token = token
        self.conn = http.client.HTTPSConnection(self.host)
    
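    def get(self, endpoint):
        try:
            self.conn.request("GET", f"{endpoint}?token={self.token}")
            response = self.conn.getresponse()
            body = response.read()
        except http.client.HTTPException:
            return []  # empty list, so callers can still extend their records
        if response.status != 200:
            print(f"status code: {response.status}")
            return []
        return json.loads(body)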
    
    def get_stock_for_symbol_and_date(self, symbol, date):
        endpoint = f"/stable/stock/{symbol}/chart/date/{date}"
        return self.get(endpoint)
    
    def get_stock_for_symbol(self, symbol, start_date, end_date, date_format="%Y%m%d"):
        start_date = datetime.strptime(start_date, date_format)
        end_date = datetime.strptime(end_date, date_format)
        
        records = []
        while start_date <= end_date:
            start_date_str = start_date.strftime(date_format)
            records.extend(self.get_stock_for_symbol_and_date(symbol, start_date_str))            
            start_date += timedelta(days = 1)
        
        return pd.DataFrame(records)
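
The loop in `get_stock_for_symbol` walks through the range one calendar day at a time. A standalone sketch of just that iteration, with a helper name (`iter_date_strings`) of our own choosing:

```python
from datetime import datetime, timedelta


def iter_date_strings(start, end, date_format="%Y%m%d"):
    """Yield every calendar day from start to end (inclusive) as a string."""
    current = datetime.strptime(start, date_format)
    last = datetime.strptime(end, date_format)
    while current <= last:
        yield current.strftime(date_format)
        current += timedelta(days=1)  # handles month and year boundaries


dates = list(iter_date_strings("20200430", "20200502"))
print(dates)  # ['20200430', '20200501', '20200502']
```

Note that this includes weekends: one request is made per calendar day, whether or not the market was open.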

In [52]:
client = IEXSDK("cloud.iexapis.com", token=IEX_TOKEN)

In [85]:
df = client.get_stock_for_symbol("AAPL","20200401","20200415")

In [86]:
def merge_date_time(row):
    return datetime.strptime(f"{row.date} {row.minute}","%Y-%m-%d %H:%M")
df["datetime"] = df.apply(merge_date_time, axis=1)
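
Applying a Python function row by row works, but on larger frames a vectorized alternative is usually faster. A sketch using `pd.to_datetime` on the concatenated strings, shown on two made-up rows that assume the same `date` and `minute` string formats as above:

```python
import pandas as pd

# Made-up rows with the same string formats as the date and minute columns
sample = pd.DataFrame({
    "date": ["2020-04-15", "2020-04-15"],
    "minute": ["09:30", "09:31"],
})

# Concatenate the two string columns and parse them in one vectorized call
sample["datetime"] = pd.to_datetime(sample["date"] + " " + sample["minute"],
                                    format="%Y-%m-%d %H:%M")
print(sample["datetime"])
```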

In [102]:
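# per-day mean and standard deviation of the minute-level average price
average_df = df.groupby(by="date").agg({"average": ["mean", "std"]})
average_df = average_df["average"]  # drop the outer column level, keeping "mean" and "std"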

In [112]:
average_df["mean"]

date
2020-04-02    241.171801
2020-04-03    241.649964
2020-04-06    254.435754
2020-04-07    264.501301
2020-04-08    264.512857
2020-04-09    267.016254
2020-04-13    268.731098
2020-04-14    285.627885
2020-04-15    283.825076
Name: mean, dtype: float64
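
To check what the `groupby(...).agg(...)` step computes, here is the same pattern on a tiny made-up price table. Note that pandas' `std` is the sample standard deviation (it divides by n - 1):

```python
import pandas as pd

# Made-up prices: two observations per day
prices = pd.DataFrame({
    "date": ["2020-04-14", "2020-04-14", "2020-04-15", "2020-04-15"],
    "average": [10.0, 12.0, 20.0, 24.0],
})

stats = prices.groupby("date").agg({"average": ["mean", "std"]})
stats = stats["average"]  # drop the outer column level
print(stats)
```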

In [119]:
fig = go.Figure()

fig.add_trace(go.Scatter(
    name="mean",
    x=average_df.index.values,
    y=average_df["mean"],
    mode="lines",
    line={"color":"red"}
))
fig.add_trace(go.Scatter(
    name="upper std",
    x=average_df.index.values,
    y=average_df["mean"]+average_df["std"],
    mode="lines",
    line={"color":"blue", "dash":"dot"}
))
fig.add_trace(go.Scatter(
    name="lower std",
    x=average_df.index.values,
    y=average_df["mean"]-average_df["std"],
    mode="lines",
    line={"color":"blue", "dash":"dot"}
))
fig.show()

In [77]:
fig = go.Figure(data=go.Ohlc(x=df['datetime'],
                open=df['open'],
                high=df['high'],
                low=df['low'],
                close=df['close'])
)
fig.show()

               average
                  mean       std
date
2020-04-14  285.627885  1.650578
2020-04-15  283.825076  1.252668


In [83]:
px.box(df, x="date", y="average")