## Historical Max Temperature from Tomorrow.io API for Delhi, India

We retrieve the daily maximum temperature (i.e., `temperatureMax`) for Delhi, India, tessellated at [H3 resolution 7](https://h3geo.org), from the [Historical](https://docs.tomorrow.io/reference/historical) data layer of the [Tomorrow.io API](https://docs.tomorrow.io/reference/api-introduction).

In [1]:
import asyncio
import json
import os
from typing import Optional

import geopandas
import h3
import pandas as pd
import shapely
from aiohttp import ClientSession
from shapely.geometry import Polygon

The following are auxiliary functions. Ideally, they will be moved into a package at a later point.

In [2]:
def get_h3_tessellation(
    gdf: geopandas.GeoDataFrame, name="shapeName", resolution=10
) -> geopandas.GeoDataFrame:
    mapper = dict()
    tiles = set()

    # TODO: vectorize, if possible
    for idx, row in gdf.iterrows():
        geometry = row["geometry"]

        match geometry.geom_type:
            case "Polygon":
                hex_ids = h3.polyfill(
                    shapely.geometry.mapping(geometry),
                    resolution,
                    geo_json_conformant=True,
                )

                tiles = tiles.union(set(hex_ids))
                mapper.update([(hex_id, row[name]) for hex_id in hex_ids])

            case "MultiPolygon":
                for x in geometry.geoms:
                    hex_ids = h3.polyfill(
                        shapely.geometry.mapping(x),
                        resolution,
                        geo_json_conformant=True,
                    )

                    tiles = tiles.union(set(hex_ids))
                    mapper.update([(hex_id, row[name]) for hex_id in hex_ids])
            case _:
                raise TypeError(f"Unsupported geometry type: {geometry.geom_type}")

    tessellation = geopandas.GeoDataFrame(
        data=tiles,
        geometry=[Polygon(h3.h3_to_geo_boundary(idx, True)) for idx in tiles],
        columns=["hex_id"],
        crs="EPSG:4326",
    )

    return tessellation

In [3]:
# create the Tomorrow.io API token at https://app.tomorrow.io/development/keys
TOMORROW_API_KEY = os.getenv("TOMORROW_API_KEY")

## Area of Interest: Delhi

In [4]:
DELHI = geopandas.read_file("../data/delhi-shapefiles/Delhi-polygon.shp")

In [5]:
DELHI.explore()

## Tessellation in H3

In this step, we generate a tessellation layer for the **area of interest** using [H3 resolution 7](https://h3geo.org).

In [6]:
TESSELLATION = get_h3_tessellation(DELHI, name="Name", resolution=7)

In [7]:
TESSELLATION.explore()

In [8]:
TESSELLATION["geojson"] = TESSELLATION["geometry"].apply(
    lambda x: json.dumps(shapely.geometry.mapping(x))
)

## Retrieve `Historical` from Tomorrow.io API

> Tomorrow.io's Historical Weather API allows you to query weather conditions (limited to historical data layers) by specifying the location (GeoJSON of a Point, LineString or Polygon), fields ("temperature", "windSpeed", ...), timesteps ("1h", "1d") and the startTime and endTime, such that the response includes a historical timeline.

    Polygon and Polyline limits:

    Polygon - 10,000 square km and no more than 70km per segment.
    Polyline - 2,000 km long.
    Max number of vertices - 550.
    Timerange is limited to up to 30 days per API call.

    If the location is a Polygon or a Polyline, you can specify whether you want the min/max/avg values throughout that coverage area by adding them as a suffix to any of the available fields (temperatureMax, temperatureMaxTime) and if not specified the response will default to Max.

    The historical archive is based on a reanalysis model that blends past short-range weather forecasts with observations through advanced data assimilation techniques. The historical archive data deviates from the recent historical data -7 days since it is based on a different observation data assimilation system that incorporates a larger set of final observational records.
    Please note that our reanalysis model takes between 7 to 90 days to calculate the data fields. Please see the historical data field Availability.

    See also: https://docs.tomorrow.io/reference/historical-overview
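The vertex and time range limits quoted above can be checked locally before any request is sent. The following is a minimal sketch, assuming payloads shaped like the ones built later in this notebook; `check_payload`, `MAX_VERTICES` and `MAX_TIMERANGE` are hypothetical names, and the area and segment-length limits are not checked here.

```python
import json
from datetime import datetime, timedelta

# Limits taken from the documentation quoted above
MAX_VERTICES = 550
MAX_TIMERANGE = timedelta(days=30)


def check_payload(payload: dict) -> list[str]:
    """Return a list of problems found in a payload; an empty list means none."""
    problems = []

    location = payload["location"]
    if isinstance(location, str):
        # the notebook stores the GeoJSON geometry as a JSON string
        location = json.loads(location)

    if location["type"] == "Polygon":
        # a GeoJSON polygon is a list of rings, each ring a list of [lon, lat]
        n_vertices = sum(len(ring) for ring in location["coordinates"])
        if n_vertices > MAX_VERTICES:
            problems.append(f"{n_vertices} vertices exceed {MAX_VERTICES}")

    start = datetime.fromisoformat(payload["startTime"])
    end = datetime.fromisoformat(payload["endTime"])
    if end - start > MAX_TIMERANGE:
        problems.append(f"time range of {end - start} exceeds {MAX_TIMERANGE}")

    return problems
```

Running `check_payload` over every payload before calling the API would surface oversized requests early instead of spending quota on them.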

In [9]:
class TomorrowAPIClient:
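    """An asynchronous client for the Tomorrow.io API.

    Parameters
    ----------
    session : aiohttp.ClientSession, optional
        Session to reuse; a new one is created if omitted
    token : str, optional
        Tomorrow.io API token; read from the environment if omitted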

    Notes
    -----
    For more information, please see https://docs.tomorrow.io
    """

    BASE_URL = "https://api.tomorrow.io/v4"

    def __init__(
        self, session: Optional[ClientSession] = None, token: Optional[str] = None
    ):
        self.session = session or ClientSession()
        self.semaphore = asyncio.BoundedSemaphore(4)
        self.token = token or os.getenv("TOMORROW_API_KEY")

    async def __aenter__(self):
        return self

    async def __aexit__(self, *args):
        await self.close()

    async def close(self):
        await self.session.close()

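    async def post(self, url, json, params=None, headers=None):
        # build a new dict so that neither the caller's dict nor a shared
        # default argument is mutated between calls
        params = {**(params or {}), "apikey": self.token}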
        async with (
            self.semaphore,
            self.session.post(
                url, json=json, params=params, headers=headers
            ) as response,
        ):
            return await response.json()

### Creating `intervals`

Let's cover 2021, from January 1st to December 31st. The Tomorrow.io API limits the time range to 30 days per call. Thus, we create 13 periods of 28 days and then add 1 day to the last period.

In [10]:
date_range = pd.date_range("2021-01-01", "2021-12-31", periods=14)
intervals = list(zip(date_range, date_range[1:]))

Fixing the last period by adding one day,

In [11]:
intervals[-1] = (pd.Timestamp("2021-12-03"), pd.Timestamp("2022-01-01"))

In [12]:
intervals

[(Timestamp('2021-01-01 00:00:00'), Timestamp('2021-01-29 00:00:00')),
 (Timestamp('2021-01-29 00:00:00'), Timestamp('2021-02-26 00:00:00')),
 (Timestamp('2021-02-26 00:00:00'), Timestamp('2021-03-26 00:00:00')),
 (Timestamp('2021-03-26 00:00:00'), Timestamp('2021-04-23 00:00:00')),
 (Timestamp('2021-04-23 00:00:00'), Timestamp('2021-05-21 00:00:00')),
 (Timestamp('2021-05-21 00:00:00'), Timestamp('2021-06-18 00:00:00')),
 (Timestamp('2021-06-18 00:00:00'), Timestamp('2021-07-16 00:00:00')),
 (Timestamp('2021-07-16 00:00:00'), Timestamp('2021-08-13 00:00:00')),
 (Timestamp('2021-08-13 00:00:00'), Timestamp('2021-09-10 00:00:00')),
 (Timestamp('2021-09-10 00:00:00'), Timestamp('2021-10-08 00:00:00')),
 (Timestamp('2021-10-08 00:00:00'), Timestamp('2021-11-05 00:00:00')),
 (Timestamp('2021-11-05 00:00:00'), Timestamp('2021-12-03 00:00:00')),
 (Timestamp('2021-12-03 00:00:00'), Timestamp('2022-01-01 00:00:00'))]
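As an alternative to patching the last period by hand, the range can be cut into consecutive windows directly. This is only a sketch using the standard library; `split_range` is a hypothetical helper, and it assumes that a window of exactly 30 days is accepted by the API.

```python
from datetime import date, timedelta


def split_range(start: date, end: date, max_days: int = 30):
    """Split [start, end) into consecutive windows of at most `max_days` days."""
    if max_days < 1:
        raise ValueError("max_days must be at least 1")

    step = timedelta(days=max_days)
    windows = []
    cursor = start
    while cursor < end:
        # the last window is clipped so that it never runs past `end`
        upper = min(cursor + step, end)
        windows.append((cursor, upper))
        cursor = upper
    return windows


windows = split_range(date(2021, 1, 1), date(2022, 1, 1))
```

Unlike the equal 28-day periods above, this yields full 30-day windows followed by a shorter final one, and the end date needs no manual correction.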

### Create `payloads`

In [13]:
payloads = [
    {
        "location": location,
        "fields": ["temperatureMax"],
        "timesteps": ["1d"],
        "startTime": startTime.isoformat(),
        "endTime": endTime.isoformat(),
        "units": "metric",
    }
    for location in TESSELLATION["geojson"]
    for (startTime, endTime) in intervals
]

Just checking the cardinality,

In [14]:
len(payloads) == len(TESSELLATION) * len(intervals)

True

Now, let's call the Tomorrow.io API!

In [15]:
async with TomorrowAPIClient(token=TOMORROW_API_KEY) as client:
    url = "https://api.tomorrow.io/v4/historical"
    headers = {"Accept": "application/json", "Content-Type": "application/json"}

    futures = [client.post(url, json=payload, headers=headers) for payload in payloads]
    data = await asyncio.gather(*futures)

In [16]:
data

[{'code': 429001,
  'type': 'Too Many Calls',
  'message': 'The request limit for this resource has been reached for the current rate limit window. Wait and retry the operation, or examine your API request volume.'},
 ... (output truncated)
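Every response shown above carries the code `429001`, which means the calls were rejected by the rate limiter. One common way to cope is to retry with exponential backoff. The sketch below is an assumption about how that could look, not part of the Tomorrow.io client: `post_with_retry`, `is_rate_limited` and their parameters are hypothetical, and the delays are not tuned to any particular plan.

```python
import asyncio
import random


def is_rate_limited(body) -> bool:
    """True if the response body is the rate-limit error seen in the output above."""
    return isinstance(body, dict) and body.get("code") == 429001


async def post_with_retry(
    post, *args, retries=5, base_delay=1.0, sleep=asyncio.sleep, **kwargs
):
    """Await `post(*args, **kwargs)` and retry while it reports rate limiting.

    `post` is any coroutine function returning the decoded JSON body.
    `sleep` can be swapped out, e.g. to skip waiting in tests.
    """
    body = await post(*args, **kwargs)
    for attempt in range(retries):
        if not is_rate_limited(body):
            break
        # exponential backoff plus a little jitter to spread out the retries
        await sleep(base_delay * 2**attempt + random.uniform(0, base_delay))
        body = await post(*args, **kwargs)
    return body
```

Inside the `async with` block, each call could then be wrapped as `post_with_retry(client.post, url, json=payload, headers=headers)`. Retrying only helps with short rate limit windows; if a daily quota is exhausted, the number of requests has to be reduced instead.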

## Data Manipulation

In [17]:
dataframes = [
    pd.json_normalize(item["data"]["timelines"][0]["intervals"]) for item in data
]

KeyError: 'data'
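The `KeyError` is raised because the rate-limited responses are error objects without a `data` key. A minimal sketch for separating usable responses from failed ones before normalizing follows; `split_responses` is a hypothetical helper that relies only on the access path used in the cell above.

```python
def split_responses(responses):
    """Separate responses that contain timeline intervals from those that do not.

    Returns two lists of (index, value) pairs: the intervals of each usable
    response, and the raw body of each failed one. The index is the position
    in `responses`, which is also the position of the matching payload.
    """
    ok, failed = [], []
    for index, body in enumerate(responses):
        try:
            intervals = body["data"]["timelines"][0]["intervals"]
        except (KeyError, IndexError, TypeError):
            failed.append((index, body))
        else:
            ok.append((index, intervals))
    return ok, failed
```

The intervals in `ok` can be passed to `pd.json_normalize` one by one, while the indices in `failed` identify the payloads that need to be sent again.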

Concatenating,

In [None]:
df = pd.concat(dataframes)
df