<a href="https://colab.research.google.com/github/wanaflah/SL---Data-Engineering/blob/main/1.%20API/Reading_from_Public_API.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Reading Data from a Public API - REST API

In this notebook, the objective is to extract data from a publicly available API for our own use. This generally involves three steps:

1. Importing necessary libraries

> *requests* 

> *json* (to read our data)

> *pandas* (not necessary, but is used in this example)

2. Structuring the query

> World Bank Data

3. Making the requests

> Data inspection

> Status Code(s)









## Importing Libraries

In [1]:
import pandas as pd
import json
import requests

## Structuring the Query

### Using World Bank Data

[World Bank Data](https://data.worldbank.org/)

Normally, an API data provider has a developer page that documents how to retrieve the data.

[World Bank Data Developer Page](https://datahelpdesk.worldbank.org/knowledgebase/articles/898581-api-basic-call-structures)

Using the World Bank Data API as an example, there are two main ways to structure a query:
*   Argument based: http://api.worldbank.org/V2/country?incomeLevel=LIC
*   URL based: http://api.worldbank.org/V2/incomeLevel/LIC/country
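
The two styles above can be sketched in Python as follows. This is a minimal illustration that only builds the URL strings and sends no request; the helper names `argument_based_url` and `url_based_url` are hypothetical and not part of any library.

```python
from urllib.parse import urlencode

BASE_URL = "http://api.worldbank.org/V2"


def argument_based_url(resource, **filters):
    """Put the filters in the query string, e.g. country?incomeLevel=LIC."""
    url = f"{BASE_URL}/{resource}"
    if filters:
        url += "?" + urlencode(filters)
    return url


def url_based_url(filter_name, filter_value, resource):
    """Put the filter in the path, e.g. incomeLevel/LIC/country."""
    return f"{BASE_URL}/{filter_name}/{filter_value}/{resource}"


print(argument_based_url("country", incomeLevel="LIC"))
# http://api.worldbank.org/V2/country?incomeLevel=LIC
print(url_based_url("incomeLevel", "LIC", "country"))
# http://api.worldbank.org/V2/incomeLevel/LIC/country
```

Both helpers reproduce the two example URLs from the list, so either string can be passed to `requests.get`.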

#### Example 1: Reading Country Data

In this example, we want to extract general information about a country.

In [2]:
# Structure the query

query = "http://api.worldbank.org/v2/country?format=json"

**Sanity Check**

*   This step is optional but useful. To perform a sanity check, open the query URL in a browser. You should see JSON-formatted data.



## Making the requests

In [3]:
response = requests.get(query)

Once the code has run successfully, we can check the status code of our request. Some example status codes are as follows:



1. 200: Everything went okay, and the result has been returned (if any).
2. 301: The server is redirecting you to a different endpoint. This can happen when a company switches domain names, or an endpoint name is changed.
3. 400: The server thinks you made a bad request. This can happen when you don’t send along the right data, among other things.
4. 401: The server thinks you’re not authenticated. Many APIs require login credentials, so this happens when you don’t send the right credentials to access an API.
5. 403: The resource you’re trying to access is forbidden: you don’t have the right permissions to see it.
6. 404: The resource you tried to access wasn’t found on the server.
7. 503: The server is not ready to handle the request.

[Reference](https://www.restapitutorial.com/httpstatuscodes.html)
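
As a rough companion to the list above, the standard library's `http.HTTPStatus` enum can turn a numeric code into its official phrase. The sketch below groups codes by their leading digit; the helper name `describe_status` is hypothetical, and the category labels are an informal summary rather than wording from the reference.

```python
from http import HTTPStatus


def describe_status(code):
    """Return a short label such as '404 Not Found (client error)'."""
    status = HTTPStatus(code)  # raises ValueError for unknown codes
    if 200 <= code < 300:
        category = "success"
    elif 300 <= code < 400:
        category = "redirection"
    elif 400 <= code < 500:
        category = "client error"
    elif 500 <= code < 600:
        category = "server error"
    else:
        category = "informational"
    return f"{code} {status.phrase} ({category})"


for code in (200, 301, 400, 401, 403, 404, 503):
    print(describe_status(code))
```

In a notebook, `describe_status(response.status_code)` would give a readable summary of whatever code the request returned.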


In [4]:
response.status_code

200

Yay! This means that our request was successful. Next, let's inspect our data.

## Data Inspection

In [5]:
print(response.json())

[{'page': 1, 'pages': 6, 'per_page': '50', 'total': 297}, [{'id': 'ABW', 'iso2Code': 'AW', 'name': 'Aruba', 'region': {'id': 'LCN', 'iso2code': 'ZJ', 'value': 'Latin America & Caribbean '}, 'adminregion': {'id': '', 'iso2code': '', 'value': ''}, 'incomeLevel': {'id': 'HIC', 'iso2code': 'XD', 'value': 'High income'}, 'lendingType': {'id': 'LNX', 'iso2code': 'XX', 'value': 'Not classified'}, 'capitalCity': 'Oranjestad', 'longitude': '-70.0167', 'latitude': '12.5167'}, {'id': 'AFG', 'iso2Code': 'AF', 'name': 'Afghanistan', 'region': {'id': 'SAS', 'iso2code': '8S', 'value': 'South Asia'}, 'adminregion': {'id': 'SAS', 'iso2code': '8S', 'value': 'South Asia'}, 'incomeLevel': {'id': 'LIC', 'iso2code': 'XM', 'value': 'Low income'}, 'lendingType': {'id': 'IDX', 'iso2code': 'XI', 'value': 'IDA'}, 'capitalCity': 'Kabul', 'longitude': '69.1761', 'latitude': '34.5228'}, {'id': 'AFR', 'iso2Code': 'A9', 'name': 'Africa', 'region': {'id': 'NA', 'iso2code': 'NA', 'value': 'Aggregates'}, 'adminregion': 

Ouch, that is rather hard to read. To make it more readable, let's use the *json* library.

In [6]:
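# Pretty-printed JSON string, intended for display only
data_view = json.dumps(response.json(), indent=4)
# Parsed Python objects (a list), used in the cells below
data = response.json()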

In [7]:
# print(data_view)  # uncomment to display the pretty-printed JSON
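
To show what `json.dumps` with `indent=4` produces, here is a small self-contained sketch. The `sample` value is a shortened stand-in for `response.json()`, copied from the output above, so the example runs without a network call.

```python
import json

# A shortened stand-in for response.json(), copied from the output above.
sample = [{"page": 1, "pages": 6, "per_page": "50", "total": 297}]

# indent=4 puts each key on its own line, nested four spaces per level.
pretty = json.dumps(sample, indent=4)
print(pretty)
```

The string returned by `json.dumps` is only for display; the parsed list itself is what the later cells index into.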

Based on our inspection, we can see that the result has two parts:

1. Basic information
        {
        "page": 1,
        "pages": 6,
        "per_page": "50",
        "total": 297},

2.   The actual data we are interested in
            {"id": "ABW",
            "iso2Code": "AW",
            "name": "Aruba",
            "region": {
                "id": "LCN",
                "iso2code": "ZJ",
                "value": "Latin America & Caribbean "
            },
            "adminregion": {
                "id": "",
                "iso2code": "",
                "value": ""
            },
            "incomeLevel": {
                "id": "HIC",
                "iso2code": "XD",
                "value": "High income"
            },
            "lendingType": {
                "id": "LNX",
                "iso2code": "XX",
                "value": "Not classified"
            },
            "capitalCity": "Oranjestad",
            "longitude": "-70.0167",
            "latitude": "12.5167" },

In [8]:
data[1]

[{'adminregion': {'id': '', 'iso2code': '', 'value': ''},
  'capitalCity': 'Oranjestad',
  'id': 'ABW',
  'incomeLevel': {'id': 'HIC', 'iso2code': 'XD', 'value': 'High income'},
  'iso2Code': 'AW',
  'latitude': '12.5167',
  'lendingType': {'id': 'LNX', 'iso2code': 'XX', 'value': 'Not classified'},
  'longitude': '-70.0167',
  'name': 'Aruba',
  'region': {'id': 'LCN',
   'iso2code': 'ZJ',
   'value': 'Latin America & Caribbean '}},
 {'adminregion': {'id': 'SAS', 'iso2code': '8S', 'value': 'South Asia'},
  'capitalCity': 'Kabul',
  'id': 'AFG',
  'incomeLevel': {'id': 'LIC', 'iso2code': 'XM', 'value': 'Low income'},
  'iso2Code': 'AF',
  'latitude': '34.5228',
  'lendingType': {'id': 'IDX', 'iso2code': 'XI', 'value': 'IDA'},
  'longitude': '69.1761',
  'name': 'Afghanistan',
  'region': {'id': 'SAS', 'iso2code': '8S', 'value': 'South Asia'}},
 {'adminregion': {'id': '', 'iso2code': '', 'value': ''},
  'capitalCity': '',
  'id': 'AFR',
  'incomeLevel': {'id': 'NA', 'iso2code': 'NA', 'va

In [9]:
test = pd.DataFrame.from_dict(data[1])

In [10]:
test.tail()

|    | id  | iso2Code | name | region | adminregion | incomeLevel | lendingType | capitalCity | longitude | latitude |
|----|-----|----------|------|--------|-------------|-------------|-------------|-------------|-----------|----------|
| 45 | CHE | CH | Switzerland | {'id': 'ECS', 'iso2code': 'Z7', 'value': 'Euro... | {'id': '', 'iso2code': '', 'value': ''} | {'id': 'HIC', 'iso2code': 'XD', 'value': 'High... | {'id': 'LNX', 'iso2code': 'XX', 'value': 'Not ... | Bern | 7.44821 | 46.948 |
| 46 | CHI | JG | Channel Islands | {'id': 'ECS', 'iso2code': 'Z7', 'value': 'Euro... | {'id': '', 'iso2code': '', 'value': ''} | {'id': 'HIC', 'iso2code': 'XD', 'value': 'High... | {'id': 'LNX', 'iso2code': 'XX', 'value': 'Not ... |  |  |  |
| 47 | CHL | CL | Chile | {'id': 'LCN', 'iso2code': 'ZJ', 'value': 'Lati... | {'id': '', 'iso2code': '', 'value': ''} | {'id': 'HIC', 'iso2code': 'XD', 'value': 'High... | {'id': 'IBD', 'iso2code': 'XF', 'value': 'IBRD'} | Santiago | -70.6475 | -33.475 |
| 48 | CHN | CN | China | {'id': 'EAS', 'iso2code': 'Z4', 'value': 'East... | {'id': 'EAP', 'iso2code': '4E', 'value': 'East... | {'id': 'UMC', 'iso2code': 'XT', 'value': 'Uppe... | {'id': 'IBD', 'iso2code': 'XF', 'value': 'IBRD'} | Beijing | 116.286 | 40.0495 |
| 49 | CIV | CI | Cote d'Ivoire | {'id': 'SSF', 'iso2code': 'ZG', 'value': 'Sub-... | {'id': 'SSA', 'iso2code': 'ZF', 'value': 'Sub-... | {'id': 'LMC', 'iso2code': 'XN', 'value': 'Lowe... | {'id': 'IDX', 'iso2code': 'XI', 'value': 'IDA'} | Yamoussoukro | -4.0305 | 5.332 |
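
In the table, the `region`, `adminregion`, `incomeLevel` and `lendingType` columns still hold whole dictionaries. One possible way to split them into ordinary columns is to flatten each record before building the DataFrame. The sketch below uses plain Python; `flatten_record` is a hypothetical helper, and the underscore-joined column names are an arbitrary choice.

```python
def flatten_record(record, sep="_"):
    """Copy a record, expanding one level of nested dicts into flat keys."""
    flat = {}
    for key, value in record.items():
        if isinstance(value, dict):
            # e.g. region -> region_id, region_iso2code, region_value
            for sub_key, sub_value in value.items():
                flat[f"{key}{sep}{sub_key}"] = sub_value
        else:
            flat[key] = value
    return flat


# One record copied from the output above (some fields omitted for brevity).
record = {
    "id": "ABW",
    "name": "Aruba",
    "region": {"id": "LCN", "iso2code": "ZJ", "value": "Latin America & Caribbean "},
    "incomeLevel": {"id": "HIC", "iso2code": "XD", "value": "High income"},
}

flat = flatten_record(record)
print(sorted(flat))
print(flat["incomeLevel_id"])
```

A list such as `[flatten_record(r) for r in data[1]]` can then be passed to `pd.DataFrame`. pandas also provides `pd.json_normalize`, which performs a similar flattening in one call.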


In [11]:
data[1][0]

{'adminregion': {'id': '', 'iso2code': '', 'value': ''},
 'capitalCity': 'Oranjestad',
 'id': 'ABW',
 'incomeLevel': {'id': 'HIC', 'iso2code': 'XD', 'value': 'High income'},
 'iso2Code': 'AW',
 'latitude': '12.5167',
 'lendingType': {'id': 'LNX', 'iso2code': 'XX', 'value': 'Not classified'},
 'longitude': '-70.0167',
 'name': 'Aruba',
 'region': {'id': 'LCN',
  'iso2code': 'ZJ',
  'value': 'Latin America & Caribbean '}}
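
The positional indexing used above (`data[1]`, `data[1][0]`) can also be written as unpacking, which gives the two parts of the response readable names. The sketch below works on a shortened stand-in for `response.json()` copied from the output above; the variable names `metadata` and `records` are illustrative choices.

```python
# A shortened stand-in for response.json(), copied from the output above.
sample = [
    {"page": 1, "pages": 6, "per_page": "50", "total": 297},
    [
        {"id": "ABW", "name": "Aruba", "capitalCity": "Oranjestad"},
        {"id": "AFG", "name": "Afghanistan", "capitalCity": "Kabul"},
    ],
]

# The response is a two-element list, so it can be unpacked directly.
metadata, records = sample

# "per_page" arrives as a string, so convert it before doing arithmetic.
per_page = int(metadata["per_page"])

print(f"page {metadata['page']} of {metadata['pages']}, {per_page} records per page")
print([record["name"] for record in records])
```

Note that `pages` is 6 in the metadata, so a single request returns only the first 50 of the 297 entries.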

## Making requests - Query Parameters

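In this part, we will add query parameters to the request. Instead of writing them into the URL by hand, we pass them as a dictionary through the `params` argument of `requests.get`.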

In [13]:
url = "http://api.worldbank.org/V2/country"
params = {"incomeLevel" : "LIC", "format" : "json"}

In [14]:
response = requests.get(url, params=params)

In [15]:
response.status_code

200

In [18]:
data = response.json()
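
Because each response carries `page` and `pages` in its metadata, a query with many results needs one request per page. The sketch below shows one way to loop over pages. To stay runnable offline, it takes a `fetch_page` callable and is demonstrated with fake data; `fetch_all_pages`, `fetch_page` and `fake_pages` are hypothetical names, and the record ids in `fake_pages` are made up for illustration.

```python
def fetch_all_pages(fetch_page):
    """Collect records from every page.

    fetch_page(page) must return the parsed [metadata, records] list for
    that page number, in the same shape as response.json() above.
    """
    all_records = []
    page = 1
    while True:
        metadata, records = fetch_page(page)
        all_records.extend(records)
        # Stop once the last page reported by the metadata has been read.
        if page >= int(metadata["pages"]):
            break
        page += 1
    return all_records


# A fake two-page source standing in for the real API, for illustration only.
fake_pages = {
    1: [{"page": 1, "pages": 2, "per_page": "2", "total": 3},
        [{"id": "AFG"}, {"id": "BDI"}]],
    2: [{"page": 2, "pages": 2, "per_page": "2", "total": 3},
        [{"id": "BFA"}]],
}

records = fetch_all_pages(lambda page: fake_pages[page])
print([record["id"] for record in records])
```

With the real API, `fetch_page` could be something like `lambda page: requests.get(url, params={**params, "page": page}).json()`, assuming the endpoint accepts a `page` query parameter; check the developer page before relying on it.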