<a href="https://colab.research.google.com/github/e-paj/M2M_Tech/blob/main/Reading_From_API.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Reading From an API - Weather Application

Up until now, we've been following the pattern of manually downloading some data, then graphing it. However, this pattern won't work for most applications.

For example, while the 2020 US elections were happening, the developers behind the data visualizations on the news weren't manually downloading tables of data as new votes came in, then manually updating the visualizations. Instead, that data was being sent to their visualization application, and they wrote code to receive the data and visualize it.

This practice of requesting and receiving data programmatically is called reading from an API. API stands for Application Programming Interface, and APIs are sources of *live data*. So instead of downloading one CSV file and showing that same data over and over, we can write programs to read from an API and show the most up-to-date information every time our code runs.

In this notebook we will learn how to read from an API to show a visualization of the live weather forecast.

## Our First API Request

First let's see how we can make a request to an API. For this notebook, we will be using the [Geospatial Web Services API](https://eccc-msc.github.io/open-data/msc-geomet/web-services_en/#wms-getfeatureinfo) provided by the Government of Canada. This API allows us to access up-to-date weather data for many locations in the world.

> Note: an API is basically just a computer somewhere else in the world which is available through a URL, and which will perform actions (such as returning data) based on the contents of requests sent to it.

In Python, to use an API we use the `requests` module.

In [1]:
import requests

Next, we build the URL for making the request. To start, all APIs have a *base URL*. This is the URL which you use for all requests to this API.

In [2]:
api_base = 'https://geo.weather.gc.ca/geomet'

APIs frequently also have *parameters* which need to be added on to the base URL. These parameters follow the form `PARAMETER_NAME=PARAMETER_VALUE`, and each parameter is separated by a `&`.
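As a small illustration of the `NAME=VALUE` form, the standard library's `urllib.parse.urlencode` can assemble such a parameter string from a dictionary. This is only a sketch with three of the parameters used in this notebook; the notebook itself writes the string out by hand.

```python
from urllib.parse import urlencode

# Three of the parameters this notebook sends to the API
params = {
    'SERVICE': 'WMS',
    'VERSION': '1.3.0',
    'REQUEST': 'GetFeatureInfo',
}

# urlencode joins each NAME=VALUE pair with '&'
query = urlencode(params)
print(query)  # SERVICE=WMS&VERSION=1.3.0&REQUEST=GetFeatureInfo
```

Note that `urlencode` percent-encodes characters such as `:` and `,` in values, so a hand-written string and an encoded one can look different while meaning the same thing.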

Our API has many parameters which are required for it to work, and we'll look more at each parameter later, but for now just know that these parameters request the air temperature prediction for Calgary, Alberta at 15:00 UTC.

> Because the API only serves forecasts for a limited range of dates around the present, the `day` variable must be updated to the current day.

In [3]:
# Update this
day = '2024-04-04'
# Note: this is just a really long string broken into smaller pieces
api_params = (
    'SERVICE=WMS'
    '&VERSION=1.3.0'
    '&REQUEST=GetFeatureInfo'
    '&BBOX=51,-114,51.5,-113.5'
    '&CRS=EPSG:4326'
    '&WIDTH=10'
    '&HEIGHT=10'
    '&LAYERS=GDPS.ETA_TT'
    '&QUERY_LAYERS=GDPS.ETA_TT'
    '&INFO_FORMAT=application/json'
    '&I=5'
    '&J=5'
    f'&TIME={day}T15:00:00Z'
)

Finally, we just join the API base and the parameters together with a `?` as a separator and we are finished building the API URL.

In [4]:
url = api_base + '?' + api_params

Now that we have the URL, we can make a request to the API, and view the results.

In [None]:
# API is slow, so we can leave this commented to save time
# res = requests.get(url)
# res.json()

{'type': 'FeatureCollection',
 'layer': 'GDPS.ETA_TT',
 'features': [{'type': 'Feature',
   'id': 'GDPS.ETA_TT(-114.3,51.9)',
   'geometry': {'type': 'Point', 'coordinates': [-114.3, 51.9]},
   'properties': {'value': '-1.4165405',
    'class': '-5 0',
    'time': '2024-04-04T15:00:00Z',
    'dim_reference_time': '2024-04-04T12:00:00Z'}}]}

The API returned JSON data to us, which we parsed using the `json` method on the response. This JSON contains the predicted air temperature in the `value` key within the `properties` object.
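To make the nesting explicit, here is a sketch that walks the same structure one step at a time. It uses a trimmed copy of the response shown above as a plain string, so it runs without calling the API. Note that `value` arrives as a string, so it needs converting before doing any arithmetic with it.

```python
import json

# A trimmed copy of the response shown above, as raw JSON text
sample_text = '''
{"type": "FeatureCollection",
 "layer": "GDPS.ETA_TT",
 "features": [{"type": "Feature",
   "properties": {"value": "-1.4165405",
                  "time": "2024-04-04T15:00:00Z"}}]}
'''

body = json.loads(sample_text)            # roughly what res.json() does for us
first_feature = body['features'][0]       # 'features' is a list
properties = first_feature['properties']  # each feature has a 'properties' object
temperature = float(properties['value'])  # the value is a string, so convert it

print(temperature)  # -1.4165405
```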

In [None]:
# challenge answer
# Note: this needs `res` from the cell above, so uncomment that request first

res.json()["features"][0]["properties"]["value"]

'-1.4165405'

## Making Requests Easier

Now we know how to make requests and receive data, but the method above is not very easy to use. Let's write code so that we can easily ask for weather data at specific times of day and at any location.

We'll start by getting the current time so we can tell the API what time our weather predictions should be for. To do this we'll need a couple of imports from the `datetime` module. This module provides many useful features for working with dates and times, for example `datetime.now()` to get a representation of the current time, and `timedelta`, which can be used to create objects for doing math with time.

The current time might not be what you expect. This is because we are using Coordinated Universal Time (UTC), which is a time that is the same regardless of location on Earth. The API expects UTC.

In the `my_time_difference` variable, store the difference between your time and the time printed. For example, for me it shows the hour 15:12:10, but where I am it is really 8:12:10, so my time difference is $8-15=-7$, meaning I am 7 hours behind. We'll use this variable later on.
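The arithmetic above can be checked with `timedelta`. This sketch uses the fixed example times from the paragraph rather than the real clock, so the result is the same every time it runs. The date and the variable names are arbitrary and only for illustration.

```python
from datetime import datetime, timedelta

# The example from the text: the notebook prints 15:12:10 (UTC),
# while the local clock reads 8:12:10
utc_example = datetime(2024, 4, 4, 15, 12, 10)
local_example = datetime(2024, 4, 4, 8, 12, 10)

# Difference in whole hours: local minus UTC
difference_hours = int((local_example - utc_example).total_seconds() // 3600)
print(difference_hours)  # -7

# Adding the difference to the UTC time recovers the local time
print(utc_example + timedelta(hours=difference_hours))  # 2024-04-04 08:12:10
```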


In [5]:
from datetime import datetime, timedelta

print(datetime.now())
my_time_difference = -6
print(datetime.now() + timedelta(hours=my_time_difference))

time = datetime.now() + timedelta(hours=my_time_difference)

2024-04-06 00:04:06.874116
2024-04-05 18:04:06.881198


The API only has times available at three-hour intervals following midnight, so we can request temperatures for midnight, 3am, 6am, 9am, etc. Therefore we need to round to an hour which is a multiple of 3, and this function will help us do that.


In [6]:
def round_multiple(num, multiple):
    """Rounds `num` to the nearest multiple of `multiple`.

    From: https://stackoverflow.com/a/29557629/14703577
    """
    return ((num + multiple // 2) // multiple) * multiple
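A few worked values show how this behaves for hours of the day. The sketch repeats the function so it runs on its own. Notice that 23 rounds up to 24, which is not a valid hour, so that case needs special handling when working with times.

```python
def round_multiple(num, multiple):
    """Rounds `num` to the nearest multiple of `multiple`."""
    return ((num + multiple // 2) // multiple) * multiple

# Each hour maps to the nearest multiple of 3
hours = [0, 1, 2, 7, 8, 22, 23]
rounded = [round_multiple(hour, 3) for hour in hours]
print(rounded)  # [0, 0, 3, 6, 9, 21, 24]
```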

Now we get a string representation of the time in ISO8601 format with precision up to the second, the format the API expects (see the [Handling time section of the documentation](https://eccc-msc.github.io/open-data/msc-geomet/web-services_en/#handling-time)).

In [7]:
from datetime import timezone

def get_api_datetime(time: datetime) -> str:
    """Rounds `time` to the nearest hour multiple of 3, and returns `time`
    formatted as the ISO 8601 UTC time string expected by the API."""
    # Convert the time to UTC (as a naive datetime, without timezone info)
    utc_time = datetime.fromtimestamp(time.timestamp(), tz=timezone.utc).replace(tzinfo=None)
    # Round to an hour the API accepts
    rounded_hour = round_multiple(utc_time.hour, 3)

    # Check if we're going into the next day after rounding
    if rounded_hour >= 24:
        # Round to the start of the next day
        utc_time = utc_time.replace(hour=0, minute=0, second=0) + timedelta(days=1)
    else:
        # Update to the start of the rounded hour
        utc_time = utc_time.replace(hour=rounded_hour, minute=0, second=0)

    # isoformat gives the ISO 8601 string the API expects
    return utc_time.isoformat(timespec='seconds')

get_api_datetime(datetime.now())

'2024-04-06T00:00:00'
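For reference, this is how the two building blocks used above behave on fixed inputs: `isoformat(timespec='seconds')` leaves out the microseconds, and adding `timedelta(days=1)` rolls over correctly into the next day or month. The dates here are arbitrary examples.

```python
from datetime import datetime, timedelta

# isoformat with timespec='seconds' leaves out the microseconds
stamp = datetime(2024, 4, 6, 0, 4, 6, 874116)
print(stamp.isoformat(timespec='seconds'))  # 2024-04-06T00:04:06

# Moving to the start of the next day also handles month boundaries
last_of_month = datetime(2024, 4, 30, 23, 10, 0)
next_day = last_of_month.replace(hour=0, minute=0, second=0) + timedelta(days=1)
print(next_day.isoformat(timespec='seconds'))  # 2024-05-01T00:00:00

# isoformat on a naive datetime adds no 'Z'; the URLs in this notebook append it
api_time = next_day.isoformat(timespec='seconds') + 'Z'
print(api_time)  # 2024-05-01T00:00:00Z
```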

Now let's make a function to build up the URL for requesting specific data at a specific time and specific location. We do this by writing out the URL string, and putting in variables of the form `{var_name}` which Python will fill in when we call `format`.

The variables include:

- The latitude and longitude passed in, as well as `lat_end` and `lng_end` which are slightly bigger. The API needs the location of the prediction to be a box and we just want a small box, so this is why those variables are needed.

- The time string for the prediction, which we build using our function we just created.

- The layers, which are how we specify the type of data we want from the API. Our default parameter will be the layer name for the air temperature data. We'll look more at how to get different layers later.

We also return the prediction time which was actually used, because this may differ from the time requested due to the constraints of the API (i.e. the three-hour intervals).
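As a quick illustration of how `format` fills in named placeholders, here is a sketch using just a fragment of the URL with Calgary's approximate coordinates. The fragment is shortened for readability and is not the full URL.

```python
# A piece of the URL template with named placeholders
template = '&BBOX={lat},{lng},{lat_end},{lng_end}&LAYERS={layers}'

lat, lng = 51, -114

# The box extends half a degree past the starting corner in each direction
filled = template.format(lat=lat, lng=lng,
                         lat_end=lat + 0.5, lng_end=lng + 0.5,
                         layers='GDPS.ETA_TT')
print(filled)  # &BBOX=51,-114,51.5,-113.5&LAYERS=GDPS.ETA_TT
```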

In [8]:
def get_api_url(time: datetime, lat: float, lng: float, layers='GDPS.ETA_TT'):
    """Builds the API URL to request data for the given time and location.

    Returns the URL and the time the URL will predict for.

    `layers` gives the type of data to request, the default is to get the air
    temperature prediction.
    """
    pred_time = get_api_datetime(time)

    url = (
    'https://geo.weather.gc.ca/geomet?SERVICE=WMS&VERSION=1.3.0' # Base API URL, common to all requests
    '&REQUEST=GetFeatureInfo' # The request type we want to make, GetFeatureInfo returns weather data
    '&BBOX={lat},{lng},{lat_end},{lng_end}' # The location we want the prediction to be for
    '&CRS=EPSG:4326&WIDTH=10&HEIGHT=10' # Necessary parameters, not important to us
    '&LAYERS={layers}' # The specific type of data we want to be returned
    '&QUERY_LAYERS={layers}' # The specific type of data we want to be returned (necessary duplication)
    '&INFO_FORMAT=application/json' # Return data in JSON format
    '&I=5&J=5' # Necessary parameters, not important to us
    '&TIME={time}Z' # The time the prediction should be for
    )

    return url.format(lat=lat,
                      lng=lng,
                      lat_end=lat+0.5,
                      lng_end=lng+0.5,
                      time=pred_time,
                      layers=layers), pred_time

Now we can make a function to easily get the air temperature for a specific time.

> The code below uses latitude 53 and longitude -113, a point in Alberta north of Calgary, but you can use whatever coordinates you like. To find the coordinates of the city you live in, you can go to https://www.latlong.net/ and type in the name of your city.

> Uncomment the call to the function to check if it works, otherwise leave it commented as the API takes a while to finish.

In [9]:
url, pred_time = get_api_url(time, lat=53, lng=-113)
res = requests.get(url)

In [10]:
res.json()

{'type': 'FeatureCollection',
 'layer': 'GDPS.ETA_TT',
 'features': [{'type': 'Feature',
   'id': 'GDPS.ETA_TT(-112.8,53.25)',
   'geometry': {'type': 'Point', 'coordinates': [-112.8, 53.25]},
   'properties': {'value': '0.50133669',
    'class': '0 5',
    'time': '2024-04-05T18:00:00Z',
    'dim_reference_time': '2024-04-05T12:00:00Z'}}]}

In [None]:
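from typing import Tuple

# Note: the next cell redefines this function to return the full JSON instead
def get_air_temperature(time: datetime) -> Tuple[float, str]:
    """Returns the predicted air temperature as a float, as well as the time
    the prediction was made for."""
    url, pred_time = get_api_url(time, lat=53, lng=-113)
    res = requests.get(url)
    body = res.json()
    air_temperature = float(body['features'][0]['properties']['value'])
    return air_temperature, pred_time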

In [12]:
def get_air_temperature(time: datetime):
    """Returns the JSON response from an air temperature request with the given
    time, as well as the time the prediction was made for."""
    url, pred_time = get_api_url(time, lat=53, lng=-113)
    res = requests.get(url)
    return res.json(), pred_time

#get_air_temperature(datetime.now())

## Visualizing Live Weather Data

Next let's use the API to show a weather forecast with Bokeh!

We'll get the forecasted air temperatures for the next five three-hour intervals. This might take a while to execute, so let's also put some precautions in place to make sure we don't run this unnecessarily.

In [16]:
import pandas as pd

def get_air_temp_predictions(num=5) -> pd.DataFrame:
    """Returns a Pandas dataframe containing predicted air temperatures,
    with the prediction times as the row labels."""
    air_temps = []
    pred_times = []

    now = datetime.now()
    # Get the next `num` 3 hour intervals
    times = [now + timedelta(hours=3 * i) for i in range(num)]

    for time in times:
        air_temp_json, pred_time_str = get_air_temperature(time)

        # Get the actual prediction value buried in the JSON
        air_temp = air_temp_json['features'][0]['properties']['value']
        air_temps.append(float(air_temp))

        pred_times.append(pred_time_str)

    df = pd.DataFrame({'air_temp': air_temps}, index=pred_times)
    # Parse the prediction time as dates
    df.index = pd.to_datetime(df.index)
    return df

# Change False to True to retrieve fresh predictions
if False or 'air_temp_predictions_gathered' not in globals().keys():
    air_temps_df = get_air_temp_predictions()
    air_temp_predictions_gathered = True
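The guard above works because the flag only exists after the first successful run. A similar effect can be sketched with a small cache around any slow function. In this example `slow_fetch` is a hypothetical stand-in for the API call, and `get_cached` is an illustrative helper, not part of this notebook.

```python
call_count = 0

def slow_fetch():
    """Hypothetical stand-in for a slow API call."""
    global call_count
    call_count += 1
    return [2.2, 0.9, -0.1]

_cache = {}

def get_cached(refresh=False):
    """Returns the cached result, fetching only when needed or when asked."""
    if refresh or 'data' not in _cache:
        _cache['data'] = slow_fetch()
    return _cache['data']

get_cached()
get_cached()              # served from the cache, no second fetch
print(call_count)         # 1
get_cached(refresh=True)  # force a fresh fetch
print(call_count)         # 2
```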

In [17]:
air_temps_df

| | air_temp |
|---|---|
| 2024-04-06 00:00:00 | 2.235986 |
| 2024-04-06 03:00:00 | 0.880243 |
| 2024-04-06 06:00:00 | -0.108679 |
| 2024-04-06 09:00:00 | -0.394049 |
| 2024-04-06 12:00:00 | -0.482459 |


In [18]:
from bokeh.plotting import show, figure, output_notebook

output_notebook()

In [19]:
# Update the x-axis values to show based on your time, not UTC
x = air_temps_df.index + timedelta(hours=my_time_difference)
y = air_temps_df['air_temp']

# Show the temperatures in increments of 10
y_min = round_multiple(y.min(), 10)
if y_min > y.min():
    y_min -= 10

y_max = round_multiple(y.max(), 10)
if y_max < y.max():
    y_max += 10

p = figure(title='Air temperature forecast', x_axis_type='datetime',
           x_axis_label='Time of day', y_axis_label='Temperature (Celsius)',
           y_range=[y_min, y_max])

p.line(x=x, y=y)
p.scatter(x=x, y=y)

show(p)
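To see what the range calculation above produces, here is a sketch applying it to the smallest and largest temperatures from the table shown earlier. It also compares the result against `math.floor` and `math.ceil`, which is an alternative way (not the one this notebook uses) to reach the same bounds.

```python
import math

def round_multiple(num, multiple):
    """Rounds `num` to the nearest multiple of `multiple`."""
    return ((num + multiple // 2) // multiple) * multiple

lowest, highest = -0.482459, 2.235986

# Round, then step outward if the rounded value cut into the data
y_min = round_multiple(lowest, 10)
if y_min > lowest:
    y_min -= 10

y_max = round_multiple(highest, 10)
if y_max < highest:
    y_max += 10

print(y_min, y_max)  # -10.0 10.0

# Alternative: floor and ceil to the enclosing multiples of 10
alt_min = math.floor(lowest / 10) * 10
alt_max = math.ceil(highest / 10) * 10
print(alt_min, alt_max)  # -10 10
```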

**Challenge**: Add a widget button to the above plot which refreshes the weather data when pressed.

In [49]:
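# One possible answer: a sketch using an ipywidgets button and push_notebook.
# Assumes a notebook front end where push_notebook is supported.
from bokeh.io import push_notebook, show
import ipywidgets as widgets
from IPython.display import display

# Keep a handle to the displayed plot so it can be updated in place
plot_handle = show(p, notebook_handle=True)

def update_plot(_button=None):
    """Fetches fresh predictions and pushes them to the displayed plot."""
    new_df = get_air_temp_predictions()
    new_x = new_df.index + timedelta(hours=my_time_difference)
    new_data = {'x': new_x.to_numpy(), 'y': new_df['air_temp'].to_numpy()}
    # The line and scatter renderers each have their own data source
    for renderer in p.renderers:
        renderer.data_source.data = new_data
    push_notebook(handle=plot_handle)

# ipywidgets calls the function with the button as its argument
refresh_button = widgets.Button(description='Refresh', button_style='success')
refresh_button.on_click(update_plot)
display(refresh_button)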

## Extra: Getting Different Types of Data

For different kinds of data, such as precipitation, we have to update the layers in our URL. The available layers can be seen by downloading the XML document returned from a `GetCapabilities` request to the API. The function below downloads this data to a file for you (it may take a while).

In [20]:
def get_api_capabilities():
    """Writes the response of the API capabilities request to `capabilities.xml`.

    This request returns XML with data on everything the API can do.
    """
    url = 'https://geo.weather.gc.ca/geomet?SERVICE=WMS&REQUEST=GetCapabilities&VERSION=1.3.0'

    res = requests.get(url)

    with open('capabilities.xml', 'w') as f:
        f.write(res.text)

get_api_capabilities()

Within the downloaded XML, you'll find a `Layer` tag which contains all of the available layers. The `Name` tags contain the layer names which should be used in the URL, and next to those names there will be descriptions of the type of data.
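Instead of searching the file by hand, the layer names can also be listed with Python's built-in XML parser. The sketch below runs on a made-up miniature document so it works without the download; it assumes the real file uses the standard WMS 1.3.0 namespace (`http://www.opengis.net/wms`), and `EXAMPLE.LAYER` is a hypothetical name for illustration.

```python
import xml.etree.ElementTree as ET

# A made-up miniature of a capabilities document, for illustration only
sample_xml = '''<WMS_Capabilities xmlns="http://www.opengis.net/wms" version="1.3.0">
  <Capability>
    <Layer>
      <Title>Example root layer</Title>
      <Layer>
        <Name>GDPS.ETA_TT</Name>
        <Title>Air temperature</Title>
      </Layer>
      <Layer>
        <Name>EXAMPLE.LAYER</Name>
        <Title>Hypothetical second layer</Title>
      </Layer>
    </Layer>
  </Capability>
</WMS_Capabilities>'''

ns = {'wms': 'http://www.opengis.net/wms'}
root = ET.fromstring(sample_xml)

# Collect (name, title) for every Layer element that has its own Name
layers = []
for layer in root.iter('{http://www.opengis.net/wms}Layer'):
    name = layer.find('wms:Name', ns)
    title = layer.find('wms:Title', ns)
    if name is not None:
        layers.append((name.text, title.text if title is not None else ''))

print(layers)
```

To try this on the downloaded file, `ET.parse('capabilities.xml').getroot()` could replace the `ET.fromstring` call.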

**Challenge**: Find a new type of data and make a request to get that data. To do this, perform a search for `GDPS.ETA_TT` (the air temperature layer), and choose a name in the same area as that tag. The name should probably start with `GDPS.`.

If you get an error with calling `res.json()`, try a different name.
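One way to see why `res.json()` fails is to look at what came back instead of JSON. The sketch below uses the standard `json` module on two hard-coded strings, so it does not call the API. The XML error text is a made-up example of the kind of message a server might return, and `parse_json_or_none` is an illustrative helper.

```python
import json

def parse_json_or_none(text):
    """Returns the parsed JSON, or None if `text` is not valid JSON."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return None

good = '{"features": [{"properties": {"value": "4.5"}}]}'
# Made-up error body, for illustration only
bad = '<ServiceExceptionReport>Layer not queryable</ServiceExceptionReport>'

print(parse_json_or_none(good) is not None)  # True
print(parse_json_or_none(bad))               # None
```

With `requests`, printing `res.text` when parsing fails shows the server's message and usually explains what went wrong.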

In [21]:
# This layer will give dew point depression (air temperature minus dew point)
url, pred_time = get_api_url(datetime.now(), lat=51, lng=-114, layers='GDPS.ETA_ES')
res = requests.get(url)
res.json()

{'type': 'FeatureCollection',
 'layer': 'GDPS.ETA_ES',
 'features': [{'type': 'Feature',
   'id': 'GDPS.ETA_ES(-114.3,51.9)',
   'geometry': {'type': 'Point', 'coordinates': [-114.3, 51.9]},
   'properties': {'value': '4.5',
    'class': '3 - 5',
    'time': '2024-04-06T00:00:00Z',
    'dim_reference_time': '2024-04-05T12:00:00Z'}}]}