# Get Gauge-Adjusted Radar Rainfall Data with the Teragon Rainfall API

> *This notebook is a work-in-progress*

*Get timeseries rainfall data for an area of interest within Allegheny County*

---

When radar estimates of rainfall are calibrated with actual rain gauge data, a highly accurate and valuable source of rainfall data can be calculated over large geographic areas. The result is called *Calibrated Radar Rainfall Data*, or *Gauge-Adjusted Radar Rainfall Data (GARRD)*.

3 Rivers Wet Weather (3RWW), with support from [Vieux Associates](http://www.vieuxinc.com/), calibrates data from the NEXRAD radar located in Moon Township, PA with rain gauge measurements collected during the same time period and rain event, for every square kilometer in Allegheny County. The resulting rainfall data is equivalent in accuracy to having 2,276 rain gauges placed across the County.

You can view and explore this data on 3RWW's calibrated radar rainfall data site at [www.3riverswetweather.org/municipalities/calibrated-radar-rainfall-data](http://www.3riverswetweather.org/municipalities/calibrated-radar-rainfall-data).

This notebook walks through how to programmatically access 3RWW's massive repository of high resolution spatiotemporal rainfall data for Allegheny County via the ***Teragon Rainfall Dataset API***. Complete documentation for the API is available at [3rww.github.io/api-docs](https://3rww.github.io/api-docs/?language=Python#teragon-rainfall-dataset-api-10).

## First: Notebook Setup

~~This assumes your Python environment is set up in accordance with the recommendations in the ***Getting Started*** notebook.~~

In [1]:
import json

We're going to use a few external Python packages to make our lives easier:

In [2]:
# Requests - HTTP requests for humans
import requests
# PETL - an Extract/Transform/Load toolbox
import petl as etl
# sortedcontainers provides a way to have sorted dictionaries (before Python 3.7)
from sortedcontainers import SortedDict
# Python DateUtil (parser) - a helper for reading timestamps
from dateutil.parser import parse
# ArcGIS API for Python - for accessing 3RWW's open reference datasets in ArcGIS Online
from arcgis.gis import GIS
# for displaying things from the ArcGIS Online in this Jupyter notebook
from IPython.display import display

## 1. The basics of getting gauge-adjusted radar rainfall data

Getting rainfall data programmatically is mostly straightforward: you submit an HTTP request with parameters specifying locations and a time range, and the API returns a `csv`-like plain-text response in which each row is a time interval, each location (pixel) is a column, and the values are rainfall amounts. Complete API documentation is available at [3rww.github.io/api-docs](https://3rww.github.io/api-docs/?language=Python#teragon-rainfall-dataset-api-10).

The challenge comes in formulating the location for this request. Location is specified with a "pixel ID", which translates to a location on a 1-kilometer grid set over Allegheny County, PA. The pixel (or pixels) is a required parameter; finding and entering those raw values is somewhat tedious.

To demonstrate the basics of making a call to the API, we'll first use some pre-selected pixel values; we'll demonstrate how to get pixel locations from geodata later on, and then revisit submitting the request for specific locations.
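
The `pixels` parameter is a single string: each pixel is a comma-separated pair of grid numbers, and pixels are separated by semicolons. A minimal sketch of building that string from a list of pairs follows. `format_pixels` is a hypothetical helper name (not part of the API or any library), and calling the two numbers `x` and `y` is an assumption made for readability.

```python
def format_pixels(pixel_pairs):
    """Join (x, y) grid pairs into the 'x,y;x,y' string used by the `pixels` parameter."""
    return ";".join("{0},{1}".format(x, y) for x, y in pixel_pairs)


# the four pre-selected pixels used in this notebook
pixels_param = format_pixels([(148, 134), (149, 134), (148, 133), (149, 133)])
print(pixels_param)
```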


### Assemble the request payload

We'll use the Python `requests` library to make our calls to the API.

We'll use Hurricane Ivan in 2004 as an example (2004-09-17 03:00 to 2004-09-18 00:00). The request payload for that event looks like this:

In [3]:
payload = {
    'startyear': 2004,
    'startmonth': 9,
    'startday': 17,
    'starthour': 3,
    'endyear': 2004,
    'endmonth': 9,
    'endday': 18,
    'endhour': 0,
    'pixels': '148,134;149,134;148,133;149,133',
    'interval': 'Hourly',
    'zerofill': 'yes'
}
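
Typing the date parts by hand is error-prone, so it can help to derive them from `datetime` objects. The following is a rough sketch under that assumption; `build_payload` is a hypothetical helper, and its keys simply mirror the payload shown above.

```python
from datetime import datetime


def build_payload(start, end, pixels, interval="Hourly", zerofill="yes"):
    """Build the form fields for a rainfall request from two datetimes."""
    return {
        "startyear": start.year,
        "startmonth": start.month,
        "startday": start.day,
        "starthour": start.hour,
        "endyear": end.year,
        "endmonth": end.month,
        "endday": end.day,
        "endhour": end.hour,
        "pixels": pixels,
        "interval": interval,
        "zerofill": zerofill,
    }


# same Hurricane Ivan window and pixels as the hand-written payload
ivan_payload = build_payload(
    start=datetime(2004, 9, 17, 3),
    end=datetime(2004, 9, 18, 0),
    pixels="148,134;149,134;148,133;149,133",
)
print(ivan_payload)
```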

The Teragon Rainfall Dataset API only accepts `POST` requests. Using the Python `requests` library, then, we construct our call like this:

In [4]:
response = requests.post(
    url="http://web.3riverswetweather.org/trp:API.pixel",
    data=payload
)
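
With `data=payload`, `requests` sends the dictionary as a form-encoded request body. For readers who want to see what that body looks like, or who cannot install `requests`, here is a rough standard-library sketch that prepares (but does not send) an equivalent `POST`. `prepare_rainfall_request` is a hypothetical helper name, and the shortened payload is only for illustration.

```python
from urllib.parse import urlencode
from urllib.request import Request

API_URL = "http://web.3riverswetweather.org/trp:API.pixel"


def prepare_rainfall_request(payload, url=API_URL):
    """Return an unsent POST request whose body is the form-encoded payload."""
    body = urlencode(payload).encode("ascii")
    return Request(url, data=body, method="POST")


req = prepare_rainfall_request({"startyear": 2004, "pixels": "148,134;149,134"})
print(req.get_method())
print(req.data.decode("ascii"))
# to actually send it: urllib.request.urlopen(req, timeout=30)
```

Note how the commas and semicolons in `pixels` are percent-encoded in the body; both `requests` and `urlencode` handle that for you.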

As mentioned earlier, the API returns a `csv`-like plain-text response in which each row is a time interval, each pixel is a column, and the values are rainfall amounts. You can print the response:

In [5]:
print(response.text)

Timestamp,148-134,,149-134,,148-133,,149-133,
2004/09/17 03:00:00,0.0000,-,0.0000,-,0.0000,-,0.0000,-
2004/09/17 04:00:00,0.0040,-,0.0010,-,0.0050,-,0.0020,-
2004/09/17 05:00:00,0.0030,-,0.0030,-,0.0100,-,0.0030,-
2004/09/17 06:00:00,0.0430,-,0.0400,-,0.0450,-,0.0410,-
2004/09/17 07:00:00,0.1370,-,0.1389,-,0.1300,-,0.1430,-
2004/09/17 08:00:00,0.1090,-,0.1140,-,0.1040,-,0.1080,-
2004/09/17 09:00:00,0.2160,-,0.2200,-,0.1770,-,0.2089,-
2004/09/17 10:00:00,0.4260,-,0.3669,-,0.4610,-,0.4109,-
2004/09/17 11:00:00,0.4520,-,0.4100,-,0.4700,-,0.4560,-
2004/09/17 12:00:00,0.5590,-,0.6290,-,0.4390,-,0.4760,-
2004/09/17 13:00:00,1.4609,-,1.0550,-,0.9660,-,0.6470,-
2004/09/17 14:00:00,1.0889,-,1.1510,-,1.0340,-,1.0679,-
2004/09/17 15:00:00,0.7379,-,0.8900,-,0.6890,-,0.8200,-
2004/09/17 16:00:00,0.9260,-,0.8760,-,0.9860,-,0.8720,-
2004/09/17 17:00:00,0.3690,-,0.3609,-,0.3480,-,0.3409,-
2004/09/17 18:00:00,0.1948,-,0.1950,-,0.1669,-,0.1968,-
2004/09/17 19:00:00,0.1940,-,0.1810,-,0.1700,-,0.1740,-
20

That's a little hard to read, so we'll use the wonderful Python `PETL` library to get something human-readable.

In [6]:
table = etl.fromcsv(etl.MemorySource(response.text.encode()))
etl.vis.displayall(table)

Timestamp,148-134,Unnamed: 2,149-134,Unnamed: 4,148-133,Unnamed: 6,149-133,Unnamed: 8
2004/09/17 03:00:00,0.0,-,0.0,-,0.0,-,0.0,-
2004/09/17 04:00:00,0.004,-,0.001,-,0.005,-,0.002,-
2004/09/17 05:00:00,0.003,-,0.003,-,0.01,-,0.003,-
2004/09/17 06:00:00,0.043,-,0.04,-,0.045,-,0.041,-
2004/09/17 07:00:00,0.137,-,0.1389,-,0.13,-,0.143,-
2004/09/17 08:00:00,0.109,-,0.114,-,0.104,-,0.108,-
2004/09/17 09:00:00,0.216,-,0.22,-,0.177,-,0.2089,-
2004/09/17 10:00:00,0.426,-,0.3669,-,0.461,-,0.4109,-
2004/09/17 11:00:00,0.452,-,0.41,-,0.47,-,0.456,-
2004/09/17 12:00:00,0.559,-,0.629,-,0.439,-,0.476,-


That's better. Note that each pixel column is followed by a second, unnamed column: the API response includes data quality metadata for every value, if any exists. In this example, there aren't any data quality issues noted, hence the `-` following every value.
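
The value/quality pairing can also be unpacked without any third-party package. The sketch below uses only the standard `csv` module on a small sample copied from the response above. `parse_rainfall_text` is a hypothetical helper, and treating `-` as "no note", `N/D` as missing, and a `TOTAL` row as a summary to skip are assumptions based on the cleanup code in this notebook.

```python
import csv
import io


def parse_rainfall_text(text):
    """Parse the CSV-like response into [(timestamp, {pixel: (value, note)}), ...]."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    # pixel IDs sit at positions 1, 3, 5, ...; the unnamed column after each holds its note
    pixel_ids = header[1::2]
    records = []
    for row in reader:
        if len(row) < len(header) or row[0].upper() == "TOTAL":
            continue  # skip blank or short lines and any summary row
        readings = {}
        for pixel_id, value, note in zip(pixel_ids, row[1::2], row[2::2]):
            amount = None if value in ("", "N/D") else float(value)
            readings[pixel_id] = (amount, None if note == "-" else note)
        records.append((row[0], readings))
    return records


sample = (
    "Timestamp,148-134,,149-134,\n"
    "2004/09/17 03:00:00,0.0000,-,0.0000,-\n"
    "2004/09/17 04:00:00,0.0040,-,0.0010,-\n"
)
records = parse_rainfall_text(sample)
for timestamp, readings in records:
    print(timestamp, readings)
```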

You can remove those additional columns and clean things up to make working with the data a bit simpler, as follows:

In [7]:
h = list(etl.header(table))
# pair up the columns after 'Timestamp': (pixel ID column, unnamed notes column)
xy_cols = zip(*[iter(h[1:])] * 2)

# make a new header row
new_header = ['Timestamp']
fields_to_cut = []
for id_col, _ in xy_cols:
    # name the notes column after its pixel, e.g. "148-134-n"
    notes_col = "{0}-n".format(id_col)
    # add both columns to our new header (list)
    new_header.extend([id_col, notes_col])
    # track the notes fields so we can cut them out below
    fields_to_cut.append(notes_col)

# transform the table
table_cleaned = etl \
    .setheader(table, new_header) \
    .cutout(*tuple(fields_to_cut))  \
    .select('Timestamp', lambda v: v.upper() != 'TOTAL')  \
    .convert('Timestamp', lambda t: parse(t).isoformat())  \
    .replaceall('N/D', None)

etl.vis.displayall(table_cleaned)

Timestamp,148-134,149-134,148-133,149-133
2004-09-17T03:00:00,0.0,0.0,0.0,0.0
2004-09-17T04:00:00,0.004,0.001,0.005,0.002
2004-09-17T05:00:00,0.003,0.003,0.01,0.003
2004-09-17T06:00:00,0.043,0.04,0.045,0.041
2004-09-17T07:00:00,0.137,0.1389,0.13,0.143
2004-09-17T08:00:00,0.109,0.114,0.104,0.108
2004-09-17T09:00:00,0.216,0.22,0.177,0.2089
2004-09-17T10:00:00,0.426,0.3669,0.461,0.4109
2004-09-17T11:00:00,0.452,0.41,0.47,0.456
2004-09-17T12:00:00,0.559,0.629,0.439,0.476


There it is. Export that to CSV with PETL like this:

```python
etl.tocsv(table_cleaned, "path/to/save/your/data.csv")
```

Now what if we want to work with this as a key-value store? Try this:

In [8]:
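data = SortedDict()
# transposing puts one pixel per row; the 'Timestamp' field of each row now holds the pixel ID
for row in etl.transpose(table_cleaned).dicts():
    inside = SortedDict()
    for k, v in row.items():
        if k != 'Timestamp':
            # convert readings to floats; empty or missing values are left as-is
            inside[k] = float(v) if v else v
    data[row['Timestamp']] = inside
print(json.dumps(data, indent=2))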

{
  "148-133": {
    "2004-09-17T03:00:00": 0.0,
    "2004-09-17T04:00:00": 0.005,
    "2004-09-17T05:00:00": 0.01,
    "2004-09-17T06:00:00": 0.045,
    "2004-09-17T07:00:00": 0.13,
    "2004-09-17T08:00:00": 0.104,
    "2004-09-17T09:00:00": 0.177,
    "2004-09-17T10:00:00": 0.461,
    "2004-09-17T11:00:00": 0.47,
    "2004-09-17T12:00:00": 0.439,
    "2004-09-17T13:00:00": 0.966,
    "2004-09-17T14:00:00": 1.034,
    "2004-09-17T15:00:00": 0.689,
    "2004-09-17T16:00:00": 0.986,
    "2004-09-17T17:00:00": 0.348,
    "2004-09-17T18:00:00": 0.1669,
    "2004-09-17T19:00:00": 0.17,
    "2004-09-17T20:00:00": 0.06,
    "2004-09-17T21:00:00": 0.067,
    "2004-09-17T22:00:00": 0.3959,
    "2004-09-17T23:00:00": 0.195
  },
  "148-134": {
    "2004-09-17T03:00:00": 0.0,
    "2004-09-17T04:00:00": 0.004,
    "2004-09-17T05:00:00": 0.003,
    "2004-09-17T06:00:00": 0.043,
    "2004-09-17T07:00:00": 0.137,
    "2004-09-17T08:00:00": 0.109,
    "2004-09-17T09:00:00": 0.216,
    "2004-09-17T10:

This provides a timeseries per-pixel.
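
With one timeseries per pixel, simple summaries are short to write. The sketch below assumes the nested `{pixel: {timestamp: value}}` shape printed above and uses three values copied from that output. `summarize_pixel` is a hypothetical helper name.

```python
def summarize_pixel(series):
    """Return (total, peak_timestamp) for one pixel's {timestamp: value} series."""
    # skip missing readings (None) so they don't break the arithmetic
    present = {t: v for t, v in series.items() if v is not None}
    total = round(sum(present.values()), 4)
    peak_timestamp = max(present, key=present.get) if present else None
    return total, peak_timestamp


sample_data = {
    "148-133": {
        "2004-09-17T12:00:00": 0.439,
        "2004-09-17T13:00:00": 0.966,
        "2004-09-17T14:00:00": 1.034,
    },
}
for pixel_id, series in sample_data.items():
    print(pixel_id, summarize_pixel(series))
```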

---

*Note: We've started codifying the above processes in a "wrapper API" available at http://3rww-rainfall-api.civicmapper.com/apidocs/, so you don't have to post-process the data like we just demonstrated. Check it out.*

## 2. Getting reference geodata

As we've seen above, 3RWW's Rainfall Data API is not spatial: it returns rainfall values for locations at points in time, but those locations are only represented by 'Pixel' IDs; it does not provide actual geometry or coordinates for those pixels.

To do anything that is location-specific with this data (e.g., query rainfall for a specific watershed), you'll want some geodata.

### Pixels

The pixels are available on [3RWW's Open Data Portal](http://data-3rww.opendata.arcgis.com/) and 3RWW's regular ArcGIS Online site at:

* [data-3rww.opendata.arcgis.com/datasets/228b1584b89a45308ed4256c5bedd43d_1](https://data-3rww.opendata.arcgis.com/datasets/228b1584b89a45308ed4256c5bedd43d_1), and
* [3rww.maps.arcgis.com/home/item.html?id=228b1584b89a45308ed4256c5bedd43d](https://3rww.maps.arcgis.com/home/item.html?id=228b1584b89a45308ed4256c5bedd43d)

...respectively. We can retrieve it programmatically a couple of ways:

* with the [ArcGIS API for Python](https://developers.arcgis.com/python/); or
* by using the [Python Requests library](http://docs.python-requests.org/en/master/) to make a call directly to the Portal's ArcGIS REST API.

We'll show both ways below. 

#### ...using the ArcGIS API for Python

In [9]:
# Establish a connection to 3RWW's ArcGIS Online portal.
gis = GIS('https://3rww.maps.arcgis.com')

We can search for the feature layer by name:

In [10]:
search_results = gis.content.search('Gauge Adjusted Radar Rainfall Data')
for item in search_results:
    display(item)
garrd_item = search_results[0]

Alternatively, we can use the item `id` to directly find the feature layer:

In [11]:
garrd_id = "228b1584b89a45308ed4256c5bedd43d"
garrd_item = gis.content.get(itemid=garrd_id)
garrd_item

Either way gets us `garrd_item`: a feature layer *collection* item, which contains individual feature layers. This one (we know from clicking on the item above) has both point and polygon variants of the GARRD reference geometry. We're interested in the polygons (grid). Get that as follows:

In [12]:
garrd_item.layers

[<FeatureLayer url:"https://services6.arcgis.com/dMKWX9NPCcfmaZl3/arcgis/rest/services/garrd/FeatureServer/0">,
 <FeatureLayer url:"https://services6.arcgis.com/dMKWX9NPCcfmaZl3/arcgis/rest/services/garrd/FeatureServer/1">]

In [13]:
# it's the second item, index 1
garrd_grid = garrd_item.layers[1]
garrd_grid

<FeatureLayer url:"https://services6.arcgis.com/dMKWX9NPCcfmaZl3/arcgis/rest/services/garrd/FeatureServer/1">

Since we're in a notebook now, the ArcGIS API for Python lets you put that on a map:

In [14]:
m = gis.map('Pittsburgh')
m.add_layer(garrd_grid)
m

MapView(layout=Layout(height='400px', width='100%'))

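Finally, we can export that as GeoJSON. Note that `to_geojson` gives us a GeoJSON *string*, not a Python dictionary; pass it to `json.loads` if you need a dictionary.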

In [15]:
q = garrd_grid.query(out_sr=4326)
garrd_grid_geojson = q.to_geojson

In [16]:
garrd_grid_geojson

'{"type": "FeatureCollection", "features": [{"type": "Feature", "geometry": {"type": "Polygon", "coordinates": [[[-80.1580032723683, 40.6807847789449], [-80.1583258239887, 40.689786868788], [-80.1464994361518, 40.6900318215293], [-80.1461784669015, 40.6810296994275], [-80.1580032723683, 40.6807847789449]]]}, "properties": {"OBJECTID": 1, "PIXEL": "134111", "Shape__Area": 10763910.416626, "Shape__Length": 13123.35958004}}, {"type": "Feature", "geometry": {"type": "Polygon", "coordinates": [[[-80.1461784669015, 40.6810296994275], [-80.1464994361518, 40.6900318215293], [-80.1346729623366, 40.6902755687158], [-80.1343535754907, 40.681273414514], [-80.1461784669015, 40.6810296994275]]]}, "properties": {"OBJECTID": 2, "PIXEL": "135111", "Shape__Area": 10763910.416748, "Shape__Length": 13123.3595800102}}, {"type": "Feature", "geometry": {"type": "Polygon", "coordinates": [[[-80.1343535754907, 40.681273414514], [-80.1346729623366, 40.6902755687158], [-80.1228464029661, 40.6905181103346], [-80.
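
The output above is a quoted string rather than a dictionary. A minimal sketch of turning such a string into a Python dictionary with the standard library follows. The sample string is a shortened stand-in built from the first feature shown above, not the real query result.

```python
import json

# shortened stand-in for the string held in `garrd_grid_geojson`
sample_geojson_text = (
    '{"type": "FeatureCollection", "features": [{"type": "Feature", '
    '"geometry": {"type": "Polygon", "coordinates": [[[-80.1580032723683, 40.6807847789449], '
    '[-80.1583258239887, 40.689786868788], [-80.1464994361518, 40.6900318215293], '
    '[-80.1461784669015, 40.6810296994275], [-80.1580032723683, 40.6807847789449]]]}, '
    '"properties": {"OBJECTID": 1, "PIXEL": "134111"}}]}'
)

grid = json.loads(sample_geojson_text)
print(grid["type"], len(grid["features"]))
print(grid["features"][0]["properties"]["PIXEL"])
```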

#### ...using `requests`

This approach is a little more hands-on, but works without fuss in vanilla Python environments (unlike the ArcGIS API for Python, which minimally requires an Anaconda Python distribution).

We need to get the service `url` from the item detail page on 3RWW's Open Data Portal, and then construct query parameters for the request as a Python dictionary.

In [17]:
# service URL - note how '/query' is at the end of the URL
service_url = 'https://services6.arcgis.com/dMKWX9NPCcfmaZl3/ArcGIS/rest/services/garrd/FeatureServer/1/query'
# query string parameters
params = {
    'where': '1=1', # Esri's convention for returning everything from the ArcGIS REST API
    'outFields': 'PIXEL', # only include the GARRD 'PIXEL' field
    'outSR': '4326', # project as WGS 1984
    'f': 'geojson' # return as geojson
}

In [18]:
# make the request
response = requests.get(service_url, params=params)
response

<Response [200]>

In [19]:
# this gets us the response as a geojson-like Python dictionary.
garrd_grid_geojson = response.json()

That gets us a `geojson` object of all pixels.
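
To connect this geodata back to the rainfall request, each feature's `PIXEL` property has to be turned into the `x,y` form used by the `pixels` parameter. The sketch below assumes that a six-character `PIXEL` value such as `134111` is simply the first three characters followed by the last three (giving `134,111`). That mapping is inferred from the ID patterns in this notebook and is not confirmed by the API documentation, so verify it before relying on it. `pixels_param_from_geojson` is a hypothetical helper.

```python
def pixels_param_from_geojson(feature_collection):
    """Build an 'x,y;x,y' pixels string from a GeoJSON-like dict of GARRD grid cells."""
    pairs = []
    for feature in feature_collection["features"]:
        pixel_id = str(feature["properties"]["PIXEL"])
        # ASSUMPTION: PIXEL is two 3-character halves, e.g. "134111" -> "134,111"
        pairs.append("{0},{1}".format(pixel_id[:3], pixel_id[3:]))
    return ";".join(pairs)


# tiny stand-in for `garrd_grid_geojson`, with geometry omitted for brevity
sample_grid = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "geometry": None, "properties": {"PIXEL": "134111"}},
        {"type": "Feature", "geometry": None, "properties": {"PIXEL": "135111"}},
    ],
}
print(pixels_param_from_geojson(sample_grid))
```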