# Using an API to Capture COVID-19 Data

This Jupyter notebook demonstrates how to use two different application programming interfaces (APIs) that are made available through rapidapi.com:

- https://rapidapi.com/api-sports/api/covid-193 
- https://rapidapi.com/KishCom/api/covid-19-coronavirus-statistics

---

**Before continuing, you'll need to register for a free API key at https://rapidapi.com**

---


In [1]:
# Enter your rapidapi key here:
RAPIDAPIKEY = 'yourkeygoeshere'    # <-- FIXME.  Enter your key here.

# We'll use this variable in several places below.

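# NOTE:  Here's how I like to store/use passwords for shared notebooks.
# If the RAPIDAPIKEY environment variable is set, it overrides the value above;
# otherwise the value entered above is kept (instead of raising a KeyError).
import os
RAPIDAPIKEY = os.environ.get('RAPIDAPIKEY', RAPIDAPIKEY)
# Visit https://veroviz.org/documentation.html#installation for instructions.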

In [2]:
# We'll need these libraries to help us import and manage the data:
import http.client
import json
import pandas as pd
import numpy as np

## 1) "COVID-19" API

We'll first use the API described here: https://rapidapi.com/api-sports/api/covid-193

This API provides 3 types of data:
1. A list of countries, 
2. Current statistics for each country, and 
3. Recent historical statistics for each country.

More info on this API can be found at:
- https://api-sports.io/documentation/covid-19

In [3]:
# In this cell we'll establish a connection for this particular API:
conn = http.client.HTTPSConnection("covid-193.p.rapidapi.com")

headers = {
    'x-rapidapi-host': "covid-193.p.rapidapi.com",
    'x-rapidapi-key': RAPIDAPIKEY
    }

In [4]:
# Now, we'll use the "countries" endpoint to get a list of countries:
conn.request("GET", "/countries", headers=headers)

resp  = conn.getresponse()
data = resp.read().decode("utf-8")

# `data` is a string, in the JSON (JavaScript Object Notation) format
print(type(data))
print(data)
# Note how `data` looks like a dictionary (but it's a string)

# Let's create a dictionary from this JSON data:
countries = json.loads(data)

<class 'str'>
{"get":"countries","parameters":[],"errors":[],"results":205,"response":["Afghanistan","Albania","Algeria","Andorra","Angola","Anguilla","Antigua-and-Barbuda","Argentina","Armenia","Aruba","Australia","Austria","Azerbaijan","Bahamas","Bahrain","Bangladesh","Barbados","Belarus","Belgium","Belize","Benin","Bermuda","Bhutan","Bolivia","Bosnia-and-Herzegovina","Brazil","British-Virgin-Islands","Brunei-","Bulgaria","Burkina-Faso","Cabo-Verde","Cambodia","Cameroon","Canada","CAR","Cayman-Islands","Chad","Channel-Islands","Chile","China","Colombia","Congo","Costa-Rica","Croatia","Cuba","Cura&ccedil;ao","Cyprus","Czechia","Denmark","Diamond-Princess-","Djibouti","Dominica","Dominican-Republic","DRC","Ecuador","Egypt","El-Salvador","Equatorial-Guinea","Eritrea","Estonia","Eswatini","Ethiopia","Faeroe-Islands","Fiji","Finland","France","French-Guiana","French-Polynesia","Gabon","Gambia","Georgia","Germany","Ghana","Gibraltar","Greece","Greenland","Grenada","Guadeloupe","Guam","Guat

### Let's investigate the structure of the `countries` dictionary:

In [5]:
# First, we'll just display the contents of `countries`
countries

{'get': 'countries',
 'parameters': [],
 'errors': [],
 'results': 205,
 'response': ['Afghanistan',
  'Albania',
  'Algeria',
  'Andorra',
  'Angola',
  'Anguilla',
  'Antigua-and-Barbuda',
  'Argentina',
  'Armenia',
  'Aruba',
  'Australia',
  'Austria',
  'Azerbaijan',
  'Bahamas',
  'Bahrain',
  'Bangladesh',
  'Barbados',
  'Belarus',
  'Belgium',
  'Belize',
  'Benin',
  'Bermuda',
  'Bhutan',
  'Bolivia',
  'Bosnia-and-Herzegovina',
  'Brazil',
  'British-Virgin-Islands',
  'Brunei-',
  'Bulgaria',
  'Burkina-Faso',
  'Cabo-Verde',
  'Cambodia',
  'Cameroon',
  'Canada',
  'CAR',
  'Cayman-Islands',
  'Chad',
  'Channel-Islands',
  'Chile',
  'China',
  'Colombia',
  'Congo',
  'Costa-Rica',
  'Croatia',
  'Cuba',
  'Cura&ccedil;ao',
  'Cyprus',
  'Czechia',
  'Denmark',
  'Diamond-Princess-',
  'Djibouti',
  'Dominica',
  'Dominican-Republic',
  'DRC',
  'Ecuador',
  'Egypt',
  'El-Salvador',
  'Equatorial-Guinea',
  'Eritrea',
  'Estonia',
  'Eswatini',
  'Ethiopia',
  'Faero

In [6]:
# Let's find all of the keys of the `countries` dictionary:
countries.keys()

dict_keys(['get', 'parameters', 'errors', 'results', 'response'])

In [7]:
# Now, let's see what data are associated with each key:
for key in countries.keys():
    print("key =", key)
    print(countries[key])
    print("\n")

key = get
countries


key = parameters
[]


key = errors
[]


key = results
205


key = response
['Afghanistan', 'Albania', 'Algeria', 'Andorra', 'Angola', 'Anguilla', 'Antigua-and-Barbuda', 'Argentina', 'Armenia', 'Aruba', 'Australia', 'Austria', 'Azerbaijan', 'Bahamas', 'Bahrain', 'Bangladesh', 'Barbados', 'Belarus', 'Belgium', 'Belize', 'Benin', 'Bermuda', 'Bhutan', 'Bolivia', 'Bosnia-and-Herzegovina', 'Brazil', 'British-Virgin-Islands', 'Brunei-', 'Bulgaria', 'Burkina-Faso', 'Cabo-Verde', 'Cambodia', 'Cameroon', 'Canada', 'CAR', 'Cayman-Islands', 'Chad', 'Channel-Islands', 'Chile', 'China', 'Colombia', 'Congo', 'Costa-Rica', 'Croatia', 'Cuba', 'Cura&ccedil;ao', 'Cyprus', 'Czechia', 'Denmark', 'Diamond-Princess-', 'Djibouti', 'Dominica', 'Dominican-Republic', 'DRC', 'Ecuador', 'Egypt', 'El-Salvador', 'Equatorial-Guinea', 'Eritrea', 'Estonia', 'Eswatini', 'Ethiopia', 'Faeroe-Islands', 'Fiji', 'Finland', 'France', 'French-Guiana', 'French-Polynesia', 'Gabon', 'Gambia', 'Georgia', 'Ger
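The keys printed above suggest a simple sanity check before using a response from this API. The sketch below is only an illustration: `check_payload` is a hypothetical helper (not part of the API or of this notebook), and it assumes that a non-empty `errors` value signals a failed request and that `results` counts the items in `response`, as it appears to in the output above.

```python
import json

def check_payload(text):
    """Parse a JSON string and return its 'response' list after basic checks."""
    payload = json.loads(text)

    # 'errors' was an empty list in the successful call shown above.
    # Treating any non-empty value as a failure is an assumption.
    if payload.get("errors"):
        raise ValueError(f"API reported errors: {payload['errors']}")

    records = payload["response"]

    # 'results' appeared to equal the number of items in 'response'.
    if payload.get("results") != len(records):
        raise ValueError("'results' does not match the length of 'response'")

    return records

# A shortened, hand-written sample in the same shape as the real response:
sample = ('{"get":"countries","parameters":[],"errors":[],'
          '"results":2,"response":["Afghanistan","Albania"]}')
names = check_payload(sample)
print(names)
```

With a live response, the same helper could be called as `check_payload(data)`, where `data` is the decoded string read from the connection.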

In [8]:
# Our list of country names is found within the `response` key:
countries['response']

['Afghanistan',
 'Albania',
 'Algeria',
 'Andorra',
 'Angola',
 'Anguilla',
 'Antigua-and-Barbuda',
 'Argentina',
 'Armenia',
 'Aruba',
 'Australia',
 'Austria',
 'Azerbaijan',
 'Bahamas',
 'Bahrain',
 'Bangladesh',
 'Barbados',
 'Belarus',
 'Belgium',
 'Belize',
 'Benin',
 'Bermuda',
 'Bhutan',
 'Bolivia',
 'Bosnia-and-Herzegovina',
 'Brazil',
 'British-Virgin-Islands',
 'Brunei-',
 'Bulgaria',
 'Burkina-Faso',
 'Cabo-Verde',
 'Cambodia',
 'Cameroon',
 'Canada',
 'CAR',
 'Cayman-Islands',
 'Chad',
 'Channel-Islands',
 'Chile',
 'China',
 'Colombia',
 'Congo',
 'Costa-Rica',
 'Croatia',
 'Cuba',
 'Cura&ccedil;ao',
 'Cyprus',
 'Czechia',
 'Denmark',
 'Diamond-Princess-',
 'Djibouti',
 'Dominica',
 'Dominican-Republic',
 'DRC',
 'Ecuador',
 'Egypt',
 'El-Salvador',
 'Equatorial-Guinea',
 'Eritrea',
 'Estonia',
 'Eswatini',
 'Ethiopia',
 'Faeroe-Islands',
 'Fiji',
 'Finland',
 'France',
 'French-Guiana',
 'French-Polynesia',
 'Gabon',
 'Gambia',
 'Georgia',
 'Germany',
 'Ghana',
 'Gibralt
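Some of the names in this list are not display-ready: `'Cura&ccedil;ao'` contains an HTML entity, `'Brunei-'` has a trailing hyphen, and hyphens stand in for spaces. The following is a minimal sketch of one way to tidy them for display; `clean_country_name` is a hypothetical helper. It would also turn a legitimately hyphenated name into a spaced one, so treat the hyphen replacement as a rough heuristic, and keep the raw name for any request sent back to the API.

```python
import html

def clean_country_name(raw):
    """Decode HTML entities and tidy the hyphens used in place of spaces."""
    name = html.unescape(raw)       # 'Cura&ccedil;ao' -> 'Curaçao'
    name = name.strip("-")          # 'Brunei-' -> 'Brunei'
    return name.replace("-", " ")   # 'Costa-Rica' -> 'Costa Rica'

# A few raw names copied from the list above:
raw_names = ["Cura&ccedil;ao", "Brunei-", "Costa-Rica", "Diamond-Princess-"]

# Map each raw (API) name to its display name, so both remain available.
display_names = {raw: clean_country_name(raw) for raw in raw_names}
print(display_names)
```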

In [10]:
# Next, we'll use the "statistics" endpoint to get current info for each country:
conn.request("GET", "/statistics", headers=headers)

resp = conn.getresponse()
data = resp.read().decode("utf-8")

# Let's create a dictionary from this JSON data:
stats = json.loads(data)

stats

{'get': 'statistics',
 'parameters': [],
 'errors': [],
 'results': 206,
 'response': [{'country': 'China',
   'cases': {'new': '+45',
    'active': 2691,
    'critical': 742,
    'recovered': 75448,
    'total': 81439},
   'deaths': {'new': '+5', 'total': 3300},
   'day': '2020-03-30',
   'time': '2020-03-30T02:00:04+00:00'},
  {'country': 'Italy',
   'cases': {'new': '+5217',
    'active': 73880,
    'critical': 3906,
    'recovered': 13030,
    'total': 97689},
   'deaths': {'new': '+756', 'total': 10779},
   'day': '2020-03-30',
   'time': '2020-03-30T02:00:04+00:00'},
  {'country': 'Spain',
   'cases': {'new': '+6875',
    'active': 58598,
    'critical': 4165,
    'recovered': 14709,
    'total': 80110},
   'deaths': {'new': '+821', 'total': 6803},
   'day': '2020-03-30',
   'time': '2020-03-30T02:00:04+00:00'},
  {'country': 'USA',
   'cases': {'new': '+18426',
    'active': 134961,
    'critical': 2970,
    'recovered': 4559,
    'total': 142004},
   'deaths': {'new': '+264', '

In [11]:
# Again, let's see what data are associated with each key:
for key in stats.keys():
    print("key =", key)
    print(stats[key])
    print("\n")

key = get
statistics


key = parameters
[]


key = errors
[]


key = results
206


key = response
[{'country': 'China', 'cases': {'new': '+45', 'active': 2691, 'critical': 742, 'recovered': 75448, 'total': 81439}, 'deaths': {'new': '+5', 'total': 3300}, 'day': '2020-03-30', 'time': '2020-03-30T02:00:04+00:00'}, {'country': 'Italy', 'cases': {'new': '+5217', 'active': 73880, 'critical': 3906, 'recovered': 13030, 'total': 97689}, 'deaths': {'new': '+756', 'total': 10779}, 'day': '2020-03-30', 'time': '2020-03-30T02:00:04+00:00'}, {'country': 'Spain', 'cases': {'new': '+6875', 'active': 58598, 'critical': 4165, 'recovered': 14709, 'total': 80110}, 'deaths': {'new': '+821', 'total': 6803}, 'day': '2020-03-30', 'time': '2020-03-30T02:00:04+00:00'}, {'country': 'USA', 'cases': {'new': '+18426', 'active': 134961, 'critical': 2970, 'recovered': 4559, 'total': 142004}, 'deaths': {'new': '+264', 'total': 2484}, 'day': '2020-03-30', 'time': '2020-03-30T02:00:04+00:00'}, {'country': 'Germany', 'ca

In [12]:
# Our data of interest are held within the `response` key:
stats['response']

[{'country': 'China',
  'cases': {'new': '+45',
   'active': 2691,
   'critical': 742,
   'recovered': 75448,
   'total': 81439},
  'deaths': {'new': '+5', 'total': 3300},
  'day': '2020-03-30',
  'time': '2020-03-30T02:00:04+00:00'},
 {'country': 'Italy',
  'cases': {'new': '+5217',
   'active': 73880,
   'critical': 3906,
   'recovered': 13030,
   'total': 97689},
  'deaths': {'new': '+756', 'total': 10779},
  'day': '2020-03-30',
  'time': '2020-03-30T02:00:04+00:00'},
 {'country': 'Spain',
  'cases': {'new': '+6875',
   'active': 58598,
   'critical': 4165,
   'recovered': 14709,
   'total': 80110},
  'deaths': {'new': '+821', 'total': 6803},
  'day': '2020-03-30',
  'time': '2020-03-30T02:00:04+00:00'},
 {'country': 'USA',
  'cases': {'new': '+18426',
   'active': 134961,
   'critical': 2970,
   'recovered': 4559,
   'total': 142004},
  'deaths': {'new': '+264', 'total': 2484},
  'day': '2020-03-30',
  'time': '2020-03-30T02:00:04+00:00'},
 {'country': 'Germany',
  'cases': {'new'

In [13]:
# Note that `stats['response']` is a list of dictionaries.
# Each element of the list corresponds to a country.

# Let's look at the data for the first country:
stats['response'][0]

# This is a dictionary, where some keys (like 'cases' and 'deaths') are associated with sub-dictionaries.

{'country': 'China',
 'cases': {'new': '+45',
  'active': 2691,
  'critical': 742,
  'recovered': 75448,
  'total': 81439},
 'deaths': {'new': '+5', 'total': 3300},
 'day': '2020-03-30',
 'time': '2020-03-30T02:00:04+00:00'}

In [14]:
# It would be useful to create a pandas dataframe from our `stats['response']` list.
stats_df_BAD = pd.DataFrame(stats['response'])
stats_df_BAD

# Unfortunately, the nested sub-dictionaries cause a problem.
# Note how each cell of the `cases` and `deaths` columns holds an entire sub-dictionary,
# rather than one value per column.

Unnamed: 0,cases,country,day,deaths,time
0,"{'new': '+45', 'active': 2691, 'critical': 742...",China,2020-03-30,"{'new': '+5', 'total': 3300}",2020-03-30T02:00:04+00:00
1,"{'new': '+5217', 'active': 73880, 'critical': ...",Italy,2020-03-30,"{'new': '+756', 'total': 10779}",2020-03-30T02:00:04+00:00
2,"{'new': '+6875', 'active': 58598, 'critical': ...",Spain,2020-03-30,"{'new': '+821', 'total': 6803}",2020-03-30T02:00:04+00:00
3,"{'new': '+18426', 'active': 134961, 'critical'...",USA,2020-03-30,"{'new': '+264', 'total': 2484}",2020-03-30T02:00:04+00:00
4,"{'new': '+4740', 'active': 52683, 'critical': ...",Germany,2020-03-30,"{'new': '+108', 'total': 541}",2020-03-30T02:00:04+00:00
5,"{'new': '+2901', 'active': 23278, 'critical': ...",Iran,2020-03-30,"{'new': '+123', 'total': 2640}",2020-03-30T02:00:04+00:00
6,"{'new': '+2599', 'active': 30366, 'critical': ...",France,2020-03-30,"{'new': '+292', 'total': 2606}",2020-03-30T02:00:04+00:00
7,"{'new': '+105', 'active': 4398, 'critical': 59...",S.-Korea,2020-03-30,"{'new': '+8', 'total': 152}",2020-03-30T02:00:04+00:00
8,"{'new': '+753', 'active': 12934, 'critical': 3...",Switzerland,2020-03-30,"{'new': '+36', 'total': 300}",2020-03-30T02:00:04+00:00
9,"{'new': '+2433', 'active': 18159, 'critical': ...",UK,2020-03-30,"{'new': '+209', 'total': 1228}",2020-03-30T02:00:04+00:00


Before we demonstrate a method for creating a useful dataframe, let's investigate the structure of `stats['response']` a little deeper...

In [15]:
# How many records do we have?
len(stats['response'])

206

In [16]:
# What is the first record?
stats['response'][0]

# Note:  This is a dictionary.

{'country': 'China',
 'cases': {'new': '+45',
  'active': 2691,
  'critical': 742,
  'recovered': 75448,
  'total': 81439},
 'deaths': {'new': '+5', 'total': 3300},
 'day': '2020-03-30',
 'time': '2020-03-30T02:00:04+00:00'}

In [17]:
# Here, we'll "flatten" each dictionary in the `stats['response']` list so we can load the data into a dataframe.
# Our desired output will be a list, where each element of the list is a dictionary.
#    - Each element of the list (i.e., each dictionary) will become a row of the dataframe.
#    - Each dictionary key will be a column of the dataframe.

# Start by creating an empty list
rows = []

# Loop over each element of the stats['response'] list.
for i in range(0,len(stats['response'])):
    # Initialize an empty dictionary, which will be a row of the dataframe
    flatDict = {}

    # stats['response'][i] is a dictionary.
    # We'll loop over each key in this dictionary:
    for key1 in stats['response'][i].keys():
        # key1 will be 'country', 'cases', 'deaths', 'day', and 'time'

        # You may want to uncomment the next 2 lines to see what's happening:
        # print(key1)
        # print(stats['response'][i][key1])
        
        if (type(stats['response'][i][key1]) is dict):
            # If this particular key holds a sub-dictionary, we'll loop over the 
            # keys in that dictionary.  This is how we "flatten" the original (nested) dictionary.
            # This will happen for the following key1 keys:
            #    - 'cases' :  A dictionary with keys 'new', 'active', 'critical', 'recovered', and 'total';
            #    - 'deaths':  A dictionary with keys 'new' and 'total'.
            
            for key2 in stats['response'][i][key1].keys():
                # We'll need to create a new key, by concatenating the first and second keys.
                # For example, 'casesnew' or 'deathstotal'
                newKey = key1 + key2
                
                # Now we'll add this new key (and its value) to our flattened dictionary:
                flatDict[newKey] = stats['response'][i][key1][key2]
        else:
            # If this key doesn't contain a sub-dictionary, just add the key/value pair to
            # our flattened dictionary:
            flatDict[key1] = stats['response'][i][key1]

    # Now, add our flattened dictionary as an element of the `rows` list
    rows.append(flatDict)
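Since the same flattening logic is needed more than once in this notebook, it could be wrapped in a small function. This is only a sketch: `flatten_record` is a hypothetical name, and, like the loop above, it handles just one level of nesting.

```python
def flatten_record(record):
    """Return a new dictionary with one level of nested dictionaries flattened."""
    flat = {}
    for key1, value in record.items():
        if isinstance(value, dict):
            # Concatenate the outer and inner keys, e.g. 'cases' + 'new' -> 'casesnew'
            for key2, sub_value in value.items():
                flat[key1 + key2] = sub_value
        else:
            # No sub-dictionary here; copy the key/value pair unchanged
            flat[key1] = value
    return flat

# The first record from stats['response'], copied from the output above:
sample = {'country': 'China',
          'cases': {'new': '+45', 'active': 2691, 'critical': 742,
                    'recovered': 75448, 'total': 81439},
          'deaths': {'new': '+5', 'total': 3300},
          'day': '2020-03-30',
          'time': '2020-03-30T02:00:04+00:00'}

flat_sample = flatten_record(sample)
print(flat_sample)
```

Applied to the whole response, this would read `[flatten_record(r) for r in stats['response']]`.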

In [18]:
# Here's our list of dictionaries:
rows

[{'country': 'China',
  'casesnew': '+45',
  'casesactive': 2691,
  'casescritical': 742,
  'casesrecovered': 75448,
  'casestotal': 81439,
  'deathsnew': '+5',
  'deathstotal': 3300,
  'day': '2020-03-30',
  'time': '2020-03-30T02:00:04+00:00'},
 {'country': 'Italy',
  'casesnew': '+5217',
  'casesactive': 73880,
  'casescritical': 3906,
  'casesrecovered': 13030,
  'casestotal': 97689,
  'deathsnew': '+756',
  'deathstotal': 10779,
  'day': '2020-03-30',
  'time': '2020-03-30T02:00:04+00:00'},
 {'country': 'Spain',
  'casesnew': '+6875',
  'casesactive': 58598,
  'casescritical': 4165,
  'casesrecovered': 14709,
  'casestotal': 80110,
  'deathsnew': '+821',
  'deathstotal': 6803,
  'day': '2020-03-30',
  'time': '2020-03-30T02:00:04+00:00'},
 {'country': 'USA',
  'casesnew': '+18426',
  'casesactive': 134961,
  'casescritical': 2970,
  'casesrecovered': 4559,
  'casestotal': 142004,
  'deathsnew': '+264',
  'deathstotal': 2484,
  'day': '2020-03-30',
  'time': '2020-03-30T02:00:04+00

In [19]:
# How many records do we have?
print(len(rows))

# This should match the length of stats['response']
print(len(stats['response']))

206
206


In [20]:
# Now we can create our dataframe:
stats_df = pd.DataFrame(rows)
stats_df

Unnamed: 0,casesactive,casescritical,casesnew,casesrecovered,casestotal,country,day,deathsnew,deathstotal,time
0,2691,742,+45,75448,81439,China,2020-03-30,+5,3300,2020-03-30T02:00:04+00:00
1,73880,3906,+5217,13030,97689,Italy,2020-03-30,+756,10779,2020-03-30T02:00:04+00:00
2,58598,4165,+6875,14709,80110,Spain,2020-03-30,+821,6803,2020-03-30T02:00:04+00:00
3,134961,2970,+18426,4559,142004,USA,2020-03-30,+264,2484,2020-03-30T02:00:04+00:00
4,52683,1979,+4740,9211,62435,Germany,2020-03-30,+108,541,2020-03-30T02:00:04+00:00
5,23278,3206,+2901,12391,38309,Iran,2020-03-30,+123,2640,2020-03-30T02:00:04+00:00
6,30366,4632,+2599,7202,40174,France,2020-03-30,+292,2606,2020-03-30T02:00:04+00:00
7,4398,59,+105,5033,9583,S.-Korea,2020-03-30,+8,152,2020-03-30T02:00:04+00:00
8,12934,301,+753,1595,14829,Switzerland,2020-03-30,+36,300,2020-03-30T02:00:04+00:00
9,18159,163,+2433,135,19522,UK,2020-03-30,+209,1228,2020-03-30T02:00:04+00:00
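Two follow-ups may be worth sketching here. First, pandas can do this kind of flattening itself via `pd.json_normalize` (available under that name in pandas 1.0 and later); passing `sep=''` should reproduce the concatenated column names used above. Second, the `casesnew` and `deathsnew` columns hold strings such as `'+45'`, which need converting before any arithmetic. The two records below are copied from the output above; the variable names are illustrative only.

```python
import pandas as pd

# Two records in the same shape as stats['response']:
records = [
    {'country': 'China',
     'cases': {'new': '+45', 'active': 2691, 'critical': 742,
               'recovered': 75448, 'total': 81439},
     'deaths': {'new': '+5', 'total': 3300},
     'day': '2020-03-30', 'time': '2020-03-30T02:00:04+00:00'},
    {'country': 'Italy',
     'cases': {'new': '+5217', 'active': 73880, 'critical': 3906,
               'recovered': 13030, 'total': 97689},
     'deaths': {'new': '+756', 'total': 10779},
     'day': '2020-03-30', 'time': '2020-03-30T02:00:04+00:00'},
]

# sep='' joins outer and inner keys with nothing in between ('cases' + 'new').
flat_df = pd.json_normalize(records, sep='')

# Strip the leading '+' and convert to numbers.
# errors='coerce' turns anything unparseable into NaN instead of raising.
for col in ['casesnew', 'deathsnew']:
    flat_df[col] = pd.to_numeric(flat_df[col].str.lstrip('+'), errors='coerce')

print(flat_df[['country', 'casesnew', 'deathsnew']])
```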


In [22]:
# There's another "endpoint" named 'history'.  
# It will allow us to apply a filter on the request.
conn.request("GET", "/history?country=Italy", headers=headers)

resp = conn.getresponse()
data = resp.read().decode("utf-8")

# Let's create a dictionary from this JSON data:
hist = json.loads(data)
hist

{'get': 'history',
 'parameters': {'country': 'Italy'},
 'errors': [],
 'results': 14,
 'response': [{'country': 'Italy',
   'cases': {'new': '+5217',
    'active': 73880,
    'critical': 3906,
    'recovered': 13030,
    'total': 97689},
   'deaths': {'new': '+756', 'total': 10779},
   'day': '2020-03-30',
   'time': '2020-03-30T02:00:04+00:00'},
  {'country': 'Italy',
   'cases': {'new': '+5974',
    'active': 70065,
    'critical': 3856,
    'recovered': 12384,
    'total': 92472},
   'deaths': {'new': '+889', 'total': 10023},
   'day': '2020-03-29',
   'time': '2020-03-29T16:00:03+00:00'},
  {'country': 'Italy',
   'cases': {'new': '+5909',
    'active': 66414,
    'critical': 3732,
    'recovered': 10950,
    'total': 86498},
   'deaths': {'new': '+919', 'total': 9134},
   'day': '2020-03-28',
   'time': '2020-03-28T17:15:05+00:00'},
  {'country': 'Italy',
   'cases': {'new': '+6203',
    'active': 62013,
    'critical': 3612,
    'recovered': 10361,
    'total': 80589},
   'death
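The request above builds the query string by hand, which works for a simple name like `Italy`. For names containing characters such as `&` or `;`, it may be safer to let the standard library do the encoding. Below is a minimal sketch using `urllib.parse.urlencode`; `history_path` is a hypothetical helper, and whether the API accepts every encoded name is not something this notebook verifies.

```python
from urllib.parse import urlencode

def history_path(country):
    """Build the request path for the 'history' endpoint with a country filter."""
    return "/history?" + urlencode({"country": country})

print(history_path("Italy"))            # /history?country=Italy
print(history_path("S.-Korea"))         # /history?country=S.-Korea
print(history_path("Cura&ccedil;ao"))   # /history?country=Cura%26ccedil%3Bao
```

The resulting path could then be passed as the second argument of `conn.request("GET", ...)`.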

In [23]:
# Again, we'll "flatten" the dictionary so we can load it into a dataframe.

# Start by creating an empty list
histRows = []

# Loop over each element of the hist['response'] list.
for i in range(0,len(hist['response'])):
    # Initialize an empty dictionary, which will be a row of the dataframe
    flatDict = {}

    for key1 in hist['response'][i].keys():
        if (type(hist['response'][i][key1]) is dict):
            for key2 in hist['response'][i][key1].keys():
                newKey = key1 + key2
                flatDict[newKey] = hist['response'][i][key1][key2]
        else:
            flatDict[key1] = hist['response'][i][key1]

    histRows.append(flatDict)

# Now we can create our dataframe:
hist_df = pd.DataFrame(histRows)
hist_df

Unnamed: 0,casesactive,casescritical,casesnew,casesrecovered,casestotal,country,day,deathsnew,deathstotal,time
0,73880,3906,5217.0,13030,97689,Italy,2020-03-30,756.0,10779,2020-03-30T02:00:04+00:00
1,70065,3856,5974.0,12384,92472,Italy,2020-03-29,889.0,10023,2020-03-29T16:00:03+00:00
2,66414,3732,5909.0,10950,86498,Italy,2020-03-28,919.0,9134,2020-03-28T17:15:05+00:00
3,62013,3612,6203.0,10361,80589,Italy,2020-03-27,712.0,8215,2020-03-27T16:45:06+00:00
4,61963,3612,6153.0,10361,80539,Italy,2020-03-26,712.0,8215,2020-03-26T18:15:06+00:00
5,62013,3612,6153.0,10361,80539,Italy,2020-03-26,662.0,8165,2020-03-26T17:45:06+00:00
6,57521,3489,5210.0,9362,74386,Italy,2020-03-26,683.0,7503,2020-03-26T17:00:05+00:00
7,54030,3393,5249.0,8326,69176,Italy,2020-03-25,743.0,6820,2020-03-25T17:00:06+00:00
8,50418,3204,4789.0,7432,63927,Italy,2020-03-24,601.0,6077,2020-03-24T17:00:06+00:00
9,50418,3204,4790.0,7432,63928,Italy,2020-03-23,602.0,6078,2020-03-23T17:30:06+00:00
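Notice that the history can contain several snapshots for the same day (three for 2020-03-26 in the output above). If one row per day is wanted, one option is to keep only the latest snapshot of each day. The sketch below uses a few rows copied from the table above; keeping the latest snapshot is an assumption about what is most useful, not something the API prescribes.

```python
import pandas as pd

# A few rows copied from the history table above (only three columns kept):
hist_sample = pd.DataFrame({
    'day':  ['2020-03-27', '2020-03-26', '2020-03-26', '2020-03-26', '2020-03-25'],
    'time': ['2020-03-27T16:45:06+00:00', '2020-03-26T18:15:06+00:00',
             '2020-03-26T17:45:06+00:00', '2020-03-26T17:00:05+00:00',
             '2020-03-25T17:00:06+00:00'],
    'casestotal': [80589, 80539, 80539, 74386, 69176],
})

# Parse the timestamps so the rows sort chronologically.
hist_sample['time'] = pd.to_datetime(hist_sample['time'])

# Sort oldest to newest, then keep the last (latest) row for each day.
daily = (hist_sample.sort_values('time')
                    .drop_duplicates(subset='day', keep='last')
                    .reset_index(drop=True))
print(daily)
```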


--- 

## 2) "COVID-19 Coronavirus Statistics" API

We'll now use the API described here: https://rapidapi.com/KishCom/api/covid-19-coronavirus-statistics

This API has only one endpoint:
1. Current statistics, reported by city, province, and country.

In [24]:
conn = http.client.HTTPSConnection("covid-19-coronavirus-statistics.p.rapidapi.com")

headers = {
    'x-rapidapi-host': "covid-19-coronavirus-statistics.p.rapidapi.com",
    'x-rapidapi-key': RAPIDAPIKEY
    }

In [25]:
conn.request("GET", "/v1/stats", headers=headers)

resp = conn.getresponse()
data = resp.read().decode("utf-8")

# Let's create a dictionary from this JSON data:
stats2 = json.loads(data)
stats2

{'error': False,
 'statusCode': 200,
 'message': 'OK',
 'data': {'lastChecked': '2020-03-30T01:49:37.363Z',
  'covid19Stats': [{'city': 'Abbeville',
    'province': 'South Carolina',
    'country': 'US',
    'lastUpdate': '2020-03-29 23:08:25',
    'keyId': 'Abbeville, South Carolina, US',
    'confirmed': 3,
    'deaths': 0,
    'recovered': 0},
   {'city': 'Acadia',
    'province': 'Louisiana',
    'country': 'US',
    'lastUpdate': '2020-03-29 23:08:25',
    'keyId': 'Acadia, Louisiana, US',
    'confirmed': 9,
    'deaths': 1,
    'recovered': 0},
   {'city': 'Accomack',
    'province': 'Virginia',
    'country': 'US',
    'lastUpdate': '2020-03-29 23:08:25',
    'keyId': 'Accomack, Virginia, US',
    'confirmed': 3,
    'deaths': 0,
    'recovered': 0},
   {'city': 'Ada',
    'province': 'Idaho',
    'country': 'US',
    'lastUpdate': '2020-03-29 23:08:25',
    'keyId': 'Ada, Idaho, US',
    'confirmed': 92,
    'deaths': 1,
    'recovered': 0},
   {'city': 'Adair',
    'province'

In [26]:
# Notice that this API produces output with different keys!
stats2.keys()

dict_keys(['error', 'statusCode', 'message', 'data'])

In [27]:
# Here's the structure of the returned data:
for key in stats2.keys():
    print('key =', key)
    print(stats2[key])
    print('\n')

key = error
False


key = statusCode
200


key = message
OK


key = data
{'lastChecked': '2020-03-30T01:49:37.363Z', 'covid19Stats': [{'city': 'Abbeville', 'province': 'South Carolina', 'country': 'US', 'lastUpdate': '2020-03-29 23:08:25', 'keyId': 'Abbeville, South Carolina, US', 'confirmed': 3, 'deaths': 0, 'recovered': 0}, {'city': 'Acadia', 'province': 'Louisiana', 'country': 'US', 'lastUpdate': '2020-03-29 23:08:25', 'keyId': 'Acadia, Louisiana, US', 'confirmed': 9, 'deaths': 1, 'recovered': 0}, {'city': 'Accomack', 'province': 'Virginia', 'country': 'US', 'lastUpdate': '2020-03-29 23:08:25', 'keyId': 'Accomack, Virginia, US', 'confirmed': 3, 'deaths': 0, 'recovered': 0}, {'city': 'Ada', 'province': 'Idaho', 'country': 'US', 'lastUpdate': '2020-03-29 23:08:25', 'keyId': 'Ada, Idaho, US', 'confirmed': 92, 'deaths': 1, 'recovered': 0}, {'city': 'Adair', 'province': 'Iowa', 'country': 'US', 'lastUpdate': '2020-03-29 23:08:25', 'keyId': 'Adair, Iowa, US', 'confirmed': 1, 'deaths': 0, 
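Because this API wraps its data in a different envelope, a check for it looks different from one for the first API. The following is a sketch only: `extract_covid19_stats` is a hypothetical helper, and it assumes that a truthy `error` or a `statusCode` other than 200 indicates a failed request, which matches the successful output above but is not confirmed by documentation here.

```python
def extract_covid19_stats(payload):
    """Return (lastChecked, list of records) from an already-parsed response."""
    # Assumption: a successful response has error == False and statusCode == 200.
    if payload.get('error') or payload.get('statusCode') != 200:
        raise ValueError(f"Request failed: {payload.get('message')}")

    data = payload['data']
    return data['lastChecked'], data['covid19Stats']

# A shortened sample in the same shape as the output above:
sample = {
    'error': False,
    'statusCode': 200,
    'message': 'OK',
    'data': {
        'lastChecked': '2020-03-30T01:49:37.363Z',
        'covid19Stats': [
            {'city': 'Abbeville', 'province': 'South Carolina', 'country': 'US',
             'lastUpdate': '2020-03-29 23:08:25',
             'keyId': 'Abbeville, South Carolina, US',
             'confirmed': 3, 'deaths': 0, 'recovered': 0},
        ],
    },
}

last_checked, records = extract_covid19_stats(sample)
print(last_checked, len(records))
```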

In [28]:
# `stats2['data']` holds a single value ('lastChecked') and a list of dictionaries ('covid19Stats'),
# so a dataframe built directly from it isn't useful:
bad_df = pd.DataFrame(stats2['data'])
bad_df

Unnamed: 0,lastChecked,covid19Stats
0,2020-03-30T01:49:37.363Z,"{'city': 'Abbeville', 'province': 'South Carol..."
1,2020-03-30T01:49:37.363Z,"{'city': 'Acadia', 'province': 'Louisiana', 'c..."
2,2020-03-30T01:49:37.363Z,"{'city': 'Accomack', 'province': 'Virginia', '..."
3,2020-03-30T01:49:37.363Z,"{'city': 'Ada', 'province': 'Idaho', 'country'..."
4,2020-03-30T01:49:37.363Z,"{'city': 'Adair', 'province': 'Iowa', 'country..."
5,2020-03-30T01:49:37.363Z,"{'city': 'Adair', 'province': 'Kentucky', 'cou..."
6,2020-03-30T01:49:37.363Z,"{'city': 'Adair', 'province': 'Missouri', 'cou..."
7,2020-03-30T01:49:37.363Z,"{'city': 'Adair', 'province': 'Oklahoma', 'cou..."
8,2020-03-30T01:49:37.363Z,"{'city': 'Adams', 'province': 'Colorado', 'cou..."
9,2020-03-30T01:49:37.363Z,"{'city': 'Adams', 'province': 'Idaho', 'countr..."


In [29]:
stats2['data']

{'lastChecked': '2020-03-30T01:49:37.363Z',
 'covid19Stats': [{'city': 'Abbeville',
   'province': 'South Carolina',
   'country': 'US',
   'lastUpdate': '2020-03-29 23:08:25',
   'keyId': 'Abbeville, South Carolina, US',
   'confirmed': 3,
   'deaths': 0,
   'recovered': 0},
  {'city': 'Acadia',
   'province': 'Louisiana',
   'country': 'US',
   'lastUpdate': '2020-03-29 23:08:25',
   'keyId': 'Acadia, Louisiana, US',
   'confirmed': 9,
   'deaths': 1,
   'recovered': 0},
  {'city': 'Accomack',
   'province': 'Virginia',
   'country': 'US',
   'lastUpdate': '2020-03-29 23:08:25',
   'keyId': 'Accomack, Virginia, US',
   'confirmed': 3,
   'deaths': 0,
   'recovered': 0},
  {'city': 'Ada',
   'province': 'Idaho',
   'country': 'US',
   'lastUpdate': '2020-03-29 23:08:25',
   'keyId': 'Ada, Idaho, US',
   'confirmed': 92,
   'deaths': 1,
   'recovered': 0},
  {'city': 'Adair',
   'province': 'Iowa',
   'country': 'US',
   'lastUpdate': '2020-03-29 23:08:25',
   'keyId': 'Adair, Iowa, US

In [30]:
stats2['data'].keys()

dict_keys(['lastChecked', 'covid19Stats'])

In [31]:
# Let's dig a little deeper into stats2['data']:
for key in stats2['data'].keys():
    print('key =', key)
    print(stats2['data'][key])
    print('\n')

key = lastChecked
2020-03-30T01:49:37.363Z


key = covid19Stats
[{'city': 'Abbeville', 'province': 'South Carolina', 'country': 'US', 'lastUpdate': '2020-03-29 23:08:25', 'keyId': 'Abbeville, South Carolina, US', 'confirmed': 3, 'deaths': 0, 'recovered': 0}, {'city': 'Acadia', 'province': 'Louisiana', 'country': 'US', 'lastUpdate': '2020-03-29 23:08:25', 'keyId': 'Acadia, Louisiana, US', 'confirmed': 9, 'deaths': 1, 'recovered': 0}, {'city': 'Accomack', 'province': 'Virginia', 'country': 'US', 'lastUpdate': '2020-03-29 23:08:25', 'keyId': 'Accomack, Virginia, US', 'confirmed': 3, 'deaths': 0, 'recovered': 0}, {'city': 'Ada', 'province': 'Idaho', 'country': 'US', 'lastUpdate': '2020-03-29 23:08:25', 'keyId': 'Ada, Idaho, US', 'confirmed': 92, 'deaths': 1, 'recovered': 0}, {'city': 'Adair', 'province': 'Iowa', 'country': 'US', 'lastUpdate': '2020-03-29 23:08:25', 'keyId': 'Adair, Iowa, US', 'confirmed': 1, 'deaths': 0, 'recovered': 0}, {'city': 'Adair', 'province': 'Kentucky', 'country':

In [33]:
# Let's dig a little deeper into stats2['data']['covid19Stats']:
for key in stats2['data']['covid19Stats'][0].keys():
    print('key =', key)
    print(stats2['data']['covid19Stats'][0][key])
    print('\n')

key = city
Abbeville


key = province
South Carolina


key = country
US


key = lastUpdate
2020-03-29 23:08:25


key = keyId
Abbeville, South Carolina, US


key = confirmed
3


key = deaths
0


key = recovered
0




In [34]:
# Again, we'll "flatten" the dictionary so we can load it into a dataframe.
# However, the structure of our input data is different for this API.

statsRows = []

# The 'lastChecked' key appears only once.  We'll save this value for all rows.
lastChecked = stats2['data']['lastChecked']

# The 'covid19Stats' key is a list of dictionaries.
for i in range(0, len(stats2['data']['covid19Stats'])):
    # Initialize an empty dictionary, which will be a row of the dataframe
    flatDict = {}

    # Add our static value of `lastChecked`
    flatDict['lastChecked'] = lastChecked
    
    for key in stats2['data']['covid19Stats'][i].keys():
        flatDict[key] = stats2['data']['covid19Stats'][i][key]

    statsRows.append(flatDict)

# Now we can create our dataframe.
# (It's named `stats2_df` so it doesn't overwrite the `stats_df` from the first API.)
stats2_df = pd.DataFrame(statsRows)
stats2_df

Unnamed: 0,city,confirmed,country,deaths,keyId,lastChecked,lastUpdate,province,recovered
0,Abbeville,3,US,0,"Abbeville, South Carolina, US",2020-03-30T01:49:37.363Z,2020-03-29 23:08:25,South Carolina,0
1,Acadia,9,US,1,"Acadia, Louisiana, US",2020-03-30T01:49:37.363Z,2020-03-29 23:08:25,Louisiana,0
2,Accomack,3,US,0,"Accomack, Virginia, US",2020-03-30T01:49:37.363Z,2020-03-29 23:08:25,Virginia,0
3,Ada,92,US,1,"Ada, Idaho, US",2020-03-30T01:49:37.363Z,2020-03-29 23:08:25,Idaho,0
4,Adair,1,US,0,"Adair, Iowa, US",2020-03-30T01:49:37.363Z,2020-03-29 23:08:25,Iowa,0
5,Adair,0,US,0,"Adair, Kentucky, US",2020-03-30T01:49:37.363Z,2020-03-29 23:08:25,Kentucky,0
6,Adair,1,US,0,"Adair, Missouri, US",2020-03-30T01:49:37.363Z,2020-03-29 23:08:25,Missouri,0
7,Adair,4,US,0,"Adair, Oklahoma, US",2020-03-30T01:49:37.363Z,2020-03-29 23:08:25,Oklahoma,0
8,Adams,110,US,0,"Adams, Colorado, US",2020-03-30T01:49:37.363Z,2020-03-29 23:08:25,Colorado,0
9,Adams,0,US,0,"Adams, Idaho, US",2020-03-30T01:49:37.363Z,2020-03-29 23:08:25,Idaho,0
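Since the records from this API are already flat, the dataframe can also be built directly from the list, with `lastChecked` added as a constant column. From there, city-level rows can be rolled up to provinces. The sketch below uses three records from the output above (trimmed to the fields needed); the variable names are illustrative only.

```python
import pandas as pd

# Same shape as stats2['data'], with the records trimmed for brevity:
sample_data = {
    'lastChecked': '2020-03-30T01:49:37.363Z',
    'covid19Stats': [
        {'city': 'Ada', 'province': 'Idaho', 'country': 'US',
         'confirmed': 92, 'deaths': 1, 'recovered': 0},
        {'city': 'Adams', 'province': 'Idaho', 'country': 'US',
         'confirmed': 0, 'deaths': 0, 'recovered': 0},
        {'city': 'Adams', 'province': 'Colorado', 'country': 'US',
         'confirmed': 110, 'deaths': 0, 'recovered': 0},
    ],
}

# Each dictionary becomes a row; no explicit loop is needed.
city_df = pd.DataFrame(sample_data['covid19Stats'])

# Assigning a scalar repeats it on every row.
city_df['lastChecked'] = sample_data['lastChecked']

# Sum the counts over all cities within each (country, province) pair.
by_province = (city_df
               .groupby(['country', 'province'], as_index=False)
               [['confirmed', 'deaths', 'recovered']]
               .sum())
print(by_province)
```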
