# Web APIs

Rather than leaving us to scrape HTML manually, some websites provide a convenient "official" way of retrieving their data via a **Web API (Application Programming Interface)**. APIs provide structured access to data, and using them is generally more reliable and more respectful of the site than web scraping.

**Understanding HTTP Status Codes:**

Every response from an API carries an HTTP status code. The codes we are most likely to encounter are:
- *200*: Success - Request completed successfully. Hopefully we will always get this response!
- *400*: Bad Request - Invalid request parameters
- *401*: Unauthorized - Authentication required
- *403*: Forbidden - Access denied
- *404*: Not Found - Resource doesn't exist
- *429*: Too Many Requests - Rate limit exceeded
- *500*: Internal Server Error - Some server-side error has occurred
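
The codes listed above do not need to be memorised, since Python's standard library already knows their names. The following minimal sketch uses the `http.HTTPStatus` enumeration to look up the standard phrase for a code. The helper name `describe_status` is hypothetical and only used for illustration.

```python
from http import HTTPStatus


def describe_status(code):
    """Return a short human-readable label for an HTTP status code."""
    try:
        status = HTTPStatus(code)
    except ValueError:
        # the code is not one of the registered HTTP status codes
        return f"{code}: (unrecognised status code)"
    return f"{code}: {status.phrase}"


# print the standard phrase for each code discussed above
for code in (200, 400, 401, 403, 404, 429, 500):
    print(describe_status(code))
```

A helper like this can make the error messages printed later in the notebook a little more informative.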

In this notebook we look at two examples of accessing public APIs with Python.

## Example - Wikipedia

As a simple example of using an online API, we will use the Wikipedia web API to perform a search for Wikipedia pages with titles which match a particular query keyword.

The endpoint for this API is given below. The complete documentation for using this endpoint is [online here](https://en.wikipedia.org/w/api.php):

In [None]:
end_point = "https://en.wikipedia.org/w/api.php"

We build a URL that includes the endpoint and the query parameters which specify what we are looking for. In this case, we are looking for Wikipedia articles matching the query keyword "Dublin":

In [None]:
# the keyword in page titles that we are searching for
keyword = "Dublin"
url = "%s?action=query&list=search&format=json&srsearch=%s" % (end_point, keyword)
print(url)
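
Plain string formatting works for a single-word keyword such as "Dublin", but a keyword containing spaces or characters like `&` would produce an invalid or ambiguous URL. One possible alternative is to let `urllib.parse.urlencode` escape the parameters. In the sketch below, the function name `build_search_url` and the keyword "Dublin Castle" are assumptions made for illustration.

```python
import urllib.parse


def build_search_url(end_point, keyword):
    """Build a search URL, escaping the query parameters safely."""
    params = {
        "action": "query",
        "list": "search",
        "format": "json",
        "srsearch": keyword,
    }
    # urlencode escapes spaces and reserved characters in each value
    return f"{end_point}?{urllib.parse.urlencode(params)}"


# a keyword with a space, which plain string formatting would not escape
print(build_search_url("https://en.wikipedia.org/w/api.php", "Dublin Castle"))
```

Keeping the parameters in a dictionary also makes it easier to add or change one later without editing a long format string.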

We send our request to the API as a standard HTTP request, using `urllib.request.urlopen`:

In [None]:
import urllib.request
import urllib.error

# default value, so that later cells can tell whether the request failed
raw_json = None

try:
    # perform the request
    # note that we set a browser-like User-Agent header, because some servers reject urllib's default one
    headers = {"User-Agent": "Mozilla/5.0"}
    req = urllib.request.Request(url, headers=headers)
    response = urllib.request.urlopen(req)
    
    # check if the request was successful
    if response.getcode() == 200:
        # read the body of the response
        raw_json = response.read().decode("utf-8")
        print(f"Successfully retrieved data (HTTP {response.getcode()})")
    else:
        print(f"Unexpected HTTP status code: {response.getcode()}")
        raw_json = None
except urllib.error.HTTPError as e:
    print(f"HTTP Error {e.code}: {e.reason}")
except urllib.error.URLError as e:
    print(f"Network Error: {e.reason}")
except Exception as e:
    print(f"Unexpected error: {e}")
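
A request like this can also fail temporarily, for example with a 429 (Too Many Requests) or a 5xx server error. A common way of dealing with this is to wait and try again, doubling the delay each time. The sketch below is one possible approach rather than anything required by the APIs used here. The name `call_with_retries` and its parameters are hypothetical, and the function that performs the request is passed in as an argument so that the retry logic stays independent of any particular URL.

```python
import time
import urllib.error


def call_with_retries(fetch, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call fetch(), retrying with exponential backoff on HTTP 429 or 5xx."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fetch()
        except urllib.error.HTTPError as e:
            retryable = e.code == 429 or 500 <= e.code < 600
            # give up on non-retryable errors, or once we run out of attempts
            if not retryable or attempt == max_attempts:
                raise
            # wait base_delay, then 2x, then 4x, and so on
            delay = base_delay * 2 ** (attempt - 1)
            print(f"HTTP {e.code}: retrying in {delay:.1f}s")
            sleep(delay)
```

With the request object from the cell above, this could be used as `call_with_retries(lambda: urllib.request.urlopen(req))`. Errors such as 404 are raised immediately, since repeating the same request would not help.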

Once we have downloaded the JSON data into a string, we parse it using the `loads()` function from the `json` module, which converts it into a Python dictionary that we can easily work with.

In [None]:
import json

response_data = json.loads(raw_json)
print("Successfully parsed API response")
# display the structure to understand the data format
if isinstance(response_data, dict):
    print("Response keys:", list(response_data.keys()))
print("Response data:", response_data)
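
Printing a parsed response directly gives one long line, which can be hard to read when the data is nested. As a small illustration, `json.dumps` with the `indent` argument produces an indented view of the same structure. The string `sample_json` below is a hypothetical, heavily trimmed stand-in for a real response, not actual API output.

```python
import json

# hypothetical, heavily trimmed stand-in for a search response
sample_json = (
    '{"query": {"searchinfo": {"totalhits": 2}, '
    '"search": [{"title": "Dublin"}, {"title": "Dublin Airport"}]}}'
)

# parse the string into nested Python dictionaries and lists
sample_data = json.loads(sample_json)

# print the same data again, indented so that the nesting is visible
print(json.dumps(sample_data, indent=2))
```

The indented output makes it easier to see which keys must be followed to reach the part of the response we are interested in.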

We can now process the results sent back by the API. For instance, we could print the top results returned for our keyword query:

In [None]:
# extract the part of the response that contains the search results
search_results = response_data["query"]["search"]
print(f"Found {len(search_results)} search results:")

# print out the results returned by the API
for i, result in enumerate(search_results, 1):
    if "title" in result:
        print(f"{i}. {result['title']}")
    else:
        print(f"{i}. (Title not available)")
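
Indexing with `response_data["query"]["search"]` raises a `KeyError` if the response does not have the expected shape. The sketch below shows a more defensive way of pulling out the titles. It assumes that the API reports problems through a top-level `"error"` object with an `"info"` message, which should be checked against the API documentation. The function name `extract_titles` and the sample dictionaries are hypothetical.

```python
def extract_titles(response_data):
    """Return the list of page titles from a parsed search response."""
    # assumption: failures are reported in a top-level "error" object
    if "error" in response_data:
        info = response_data["error"].get("info", "unknown error")
        raise ValueError(f"API reported an error: {info}")
    # fall back to an empty list if either level of nesting is missing
    results = response_data.get("query", {}).get("search", [])
    return [r.get("title", "(Title not available)") for r in results]


# hypothetical sample responses, used here instead of a live request
good = {"query": {"search": [{"title": "Dublin"}, {"pageid": 123}]}}
empty = {"batchcomplete": ""}

print(extract_titles(good))
print(extract_titles(empty))
```

Returning an empty list for a response without results lets the calling code treat "no matches" and "matches found" in the same way.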

## Example - Currency Exchange Rates

In the next example, we will use the *frankfurter.app* (formerly *Fixer.io*) API to get currency exchange rate information: https://frankfurter.app

To retrieve the latest rates with the euro (EUR) as the base currency, we request data from the API endpoint https://api.frankfurter.app/latest. We do not need to specify any parameters in the URL in this case.

In [None]:
end_point = "https://api.frankfurter.app/latest"

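# default value, so that the next cell can tell whether the request failed
raw_json = None

try:
    # perform the request
    # note that we set a browser-like User-Agent header, because some servers reject urllib's default one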
    headers = {"User-Agent": "Mozilla/5.0"}
    req = urllib.request.Request(end_point, headers=headers)
    
    # retrieve the data from the API endpoint
    print(f"Requesting data from: {end_point}")
    response = urllib.request.urlopen(req)
    
    # check if the request was successful
    if response.getcode() == 200:
        # read the body of the response
        raw_json = response.read().decode("utf-8")
        print(f"Successfully retrieved currency data (HTTP {response.getcode()})")
        print("Raw JSON response:")
        print(raw_json)
    else:
        print(f"Unexpected HTTP status code: {response.getcode()}")
        raw_json = None
except urllib.error.HTTPError as e:
    print(f"HTTP Error {e.code}: {e.reason}")
except urllib.error.URLError as e:
    print(f"Network Error: {e.reason}")
except Exception as e:
    print(f"Unexpected error: {e}")

We then parse the JSON data and display the rates that it contains:

In [None]:
data = json.loads(raw_json)
print("Successfully parsed currency data")

# display the information provided by the API
print("Available data fields:", list(data.keys()))

# show all rates if available
if "rates" in data:
    print(f"Currency rates (base: {data.get('base', 'EUR')}):")
    for currency, rate in data["rates"].items():
        print(f"  {currency}: {rate:.4f}")
else:
    print("No rates data found")

In [None]:
# get a specific rate, checking first that it is actually present
if data and "CHF" in data.get("rates", {}):
    chf_rate = data["rates"]["CHF"]
    print(f"Swiss Franc (CHF) rate: {chf_rate}")
else:
    print("No CHF rate available")
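
Each rate gives the number of units of a currency that one unit of the base currency buys, so converting an amount is a single multiplication, and a rate between two non-base currencies is the ratio of their rates. The sketch below works through this arithmetic. The values in `sample_rates` are invented for illustration and are not real exchange rates, and the function names are hypothetical.

```python
def convert(amount, rates, target):
    """Convert an amount in the base currency into the target currency."""
    if target not in rates:
        raise KeyError(f"No rate available for {target}")
    return amount * rates[target]


def cross_rate(rates, source, target):
    """Return how many units of target one unit of source buys."""
    return rates[target] / rates[source]


# invented EUR-based rates, for illustration only
sample_rates = {"CHF": 0.95, "USD": 1.10, "GBP": 0.85}

print(f"100 EUR = {convert(100, sample_rates, 'CHF'):.2f} CHF")
print(f"1 USD = {cross_rate(sample_rates, 'USD', 'CHF'):.4f} CHF")
```

With real data, the `rates` dictionary from the parsed response could be passed in place of `sample_rates`.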

We can change the URL to get rates relative to a different base currency, such as US Dollars (USD). We do this by adding an extra parameter to the URL.

In [None]:
# create a URL based on the end point, with extra parameters
url = f"{end_point}?base=USD"
print(f"Requesting USD-based rates from: {url}")

try:
    # perform the request
    # note that we need to add a special header to appear to be a proper web browser
    headers = {"User-Agent": "Mozilla/5.0"}
    req = urllib.request.Request(url, headers=headers)
    response = urllib.request.urlopen(req)
    
    # check if the request was successful
    if response.getcode() == 200:
        usd_raw_json = response.read().decode("utf-8")
        print(f"Successfully retrieved currency data (HTTP {response.getcode()})")
        
        # parse the JSON
        usd_data = json.loads(usd_raw_json)
        # display the USD-based rates
        if "rates" in usd_data:
            print(f"\nCurrency rates (base: {usd_data.get('base', 'USD')}):")
            print("Sample rates:")
            # show only the first few rates as an example
            for currency, rate in list(usd_data["rates"].items())[:5]:
                print(f"  {currency}: {rate:.4f}")
        else:
            print("No rates found in USD response")
    else:
        print(f"Unexpected HTTP status code: {response.getcode()}")
        usd_raw_json = None
        
except urllib.error.HTTPError as e:
    print(f"HTTP Error {e.code}: {e.reason}")
except urllib.error.URLError as e:
    print(f"Network Error: {e.reason}")
except Exception as e:
    print(f"Unexpected error: {e}")
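
The requests in this notebook all follow the same pattern: build a request with a header, open it, decode the body, parse the JSON, and report any errors. One way to avoid repeating this is to wrap the pattern in a function. The sketch below is a possible refactoring, not part of any library. The name `fetch_json` and the `opener` parameter are hypothetical, and `opener` exists only so that the function can be tried without a network connection.

```python
import json
import urllib.error
import urllib.request


def fetch_json(url, opener=urllib.request.urlopen, timeout=10):
    """Fetch a URL and return the parsed JSON body, or None on failure."""
    headers = {"User-Agent": "Mozilla/5.0"}
    req = urllib.request.Request(url, headers=headers)
    try:
        # the with-statement closes the connection once the body is read
        with opener(req, timeout=timeout) as response:
            return json.loads(response.read().decode("utf-8"))
    except urllib.error.HTTPError as e:
        print(f"HTTP Error {e.code}: {e.reason}")
    except urllib.error.URLError as e:
        print(f"Network Error: {e.reason}")
    except json.JSONDecodeError as e:
        print(f"Invalid JSON in response: {e}")
    return None
```

With this helper, the USD example above reduces to `usd_data = fetch_json(f"{end_point}?base=USD")`, followed by a check that the result is not `None`. The `timeout` argument stops the call from waiting indefinitely if the server does not respond.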