# OpenAQ API Data Query

[OpenAQ](https://openaq.org/) aggregates air quality datasets from around the world into a common data object.

OpenAQ provides an [API](https://docs.openaq.org/docs/introduction) for accessing these datasets by submitting a query.

While Python packages such as [py-openaq](https://github.com/dhhagan/py-openaq) exist to use this API, this notebook queries the API directly.

## API Query Creation

This query was created with the [API Reference](https://docs.openaq.org/reference/measurements_get_v2_measurements_get).

There is also [direct file access](https://docs.openaq.org/docs/accessing-openaq-archive-data) in an S3 bucket, but the locations of the observations must be known in advance.

In [1]:
import requests

### Bounding Box

The initial attempt at accessing the API passed a [bounding box](http://bboxfinder.com/#0.000000,0.000000,0.000000,0.000000) for the Chicago region.


In [2]:
chicago_bbox = [-88.885961, 41.274839, -87.117162, 42.477288]
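Judging by its values, the list above is ordered `[min_lon, min_lat, max_lon, max_lat]`. As a minimal sketch of how such a box could be applied on the client side, the snippet below tests whether a site falls inside it. `in_bbox` is a hypothetical helper written for illustration; it is not part of the OpenAQ API.

```python
# Hypothetical helper (not part of the OpenAQ API): test whether a point
# lies inside a [min_lon, min_lat, max_lon, max_lat] bounding box.
chicago_bbox = [-88.885961, 41.274839, -87.117162, 42.477288]


def in_bbox(longitude: float, latitude: float, bbox: list[float]) -> bool:
    min_lon, min_lat, max_lon, max_lat = bbox
    return min_lon <= longitude <= max_lon and min_lat <= latitude <= max_lat


# Coordinates of two sites that appear in the API response in this notebook
print("McCook:", in_bbox(-87.83194, 41.80117, chicago_bbox))
print("CHIWAUKEE:", in_bbox(-87.8111, 42.5047, chicago_bbox))
```

McCook falls inside the box, while CHIWAUKEE sits just north of the upper latitude, so a box this size would miss some sites that the city-name query returns.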

### Location Name

Location identifiers follow EPA guidelines, so the Chicago metro area is:
`Chicago-Naperville-Joliet`

In [3]:
url = "https://api.openaq.org/v2/locations?limit=1000&page=1&offset=0&sort=desc&parameter_id=2&radius=1000&country=US&city=Chicago-Naperville-Joliet&order_by=lastUpdated&dump_raw=false"
headers = {"accept": "application/json"}
response = requests.get(url, headers=headers)
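The query string above is easier to read and edit when each parameter sits on its own line. The sketch below rebuilds the same URL from a dictionary using only the standard library; the parameter names and values are copied from the URL above, and `BASE_URL` and `params` are names introduced here for illustration.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.openaq.org/v2/locations"

# Same parameters as the hand-written URL, one per line
params = {
    "limit": 1000,
    "page": 1,
    "offset": 0,
    "sort": "desc",
    "parameter_id": 2,
    "radius": 1000,
    "country": "US",
    "city": "Chicago-Naperville-Joliet",
    "order_by": "lastUpdated",
    "dump_raw": "false",
}

url = f"{BASE_URL}?{urlencode(params)}"
print(url)
```

The same dictionary can also be handed to `requests.get(BASE_URL, params=params, headers=headers)`, which encodes the query string itself.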

In [4]:
response

<Response [200]>

In [5]:
response.text

'{"meta":{"name":"openaq-api","license":"","website":"/","page":1,"limit":1000,"found":18},"results":[{"id":2047319,"city":"Chicago-Naperville-Joliet","name":"McCook","entity":null,"country":"US","sources":null,"isMobile":false,"isAnalysis":null,"parameters":[{"id":2,"unit":"µg/m³","count":417,"average":6.8,"lastValue":22.4,"parameter":"pm25","displayName":"pm25 µg/m³","lastUpdated":"2024-01-24T02:00:00+00:00","parameterId":2,"firstUpdated":"2024-01-02T16:00:00+00:00","manufacturers":null}],"sensorType":null,"coordinates":{"latitude":41.80117,"longitude":-87.83194},"lastUpdated":"2024-01-24T02:00:00+00:00","firstUpdated":"2024-01-02T16:00:00+00:00","measurements":417,"bounds":[-87.83194,41.80117,-87.83194,41.80117],"manufacturers":[{"modelName":"N/A","manufacturerName":"OpenAQ admin"}]},{"id":223,"city":"Chicago-Naperville-Joliet","name":"Gary-IITRI","entity":null,"country":"US","sources":null,"isMobile":false,"isAnalysis":null,"parameters":[{"id":10,"unit":"ppm","count":43847,"average

In [6]:
data = response.json()

In [7]:
data

{'meta': {'name': 'openaq-api',
  'license': '',
  'website': '/',
  'page': 1,
  'limit': 1000,
  'found': 18},
 'results': [{'id': 2047319,
   'city': 'Chicago-Naperville-Joliet',
   'name': 'McCook',
   'entity': None,
   'country': 'US',
   'sources': None,
   'isMobile': False,
   'isAnalysis': None,
   'parameters': [{'id': 2,
     'unit': 'µg/m³',
     'count': 417,
     'average': 6.8,
     'lastValue': 22.4,
     'parameter': 'pm25',
     'displayName': 'pm25 µg/m³',
     'lastUpdated': '2024-01-24T02:00:00+00:00',
     'parameterId': 2,
     'firstUpdated': '2024-01-02T16:00:00+00:00',
     'manufacturers': None}],
   'sensorType': None,
   'coordinates': {'latitude': 41.80117, 'longitude': -87.83194},
   'lastUpdated': '2024-01-24T02:00:00+00:00',
   'firstUpdated': '2024-01-02T16:00:00+00:00',
   'measurements': 417,
   'bounds': [-87.83194, 41.80117, -87.83194, 41.80117],
   'manufacturers': [{'modelName': 'N/A',
     'manufacturerName': 'OpenAQ admin'}]},
  {'id': 223,
  

In [8]:
data['results'][0]

{'id': 2047319,
 'city': 'Chicago-Naperville-Joliet',
 'name': 'McCook',
 'entity': None,
 'country': 'US',
 'sources': None,
 'isMobile': False,
 'isAnalysis': None,
 'parameters': [{'id': 2,
   'unit': 'µg/m³',
   'count': 417,
   'average': 6.8,
   'lastValue': 22.4,
   'parameter': 'pm25',
   'displayName': 'pm25 µg/m³',
   'lastUpdated': '2024-01-24T02:00:00+00:00',
   'parameterId': 2,
   'firstUpdated': '2024-01-02T16:00:00+00:00',
   'manufacturers': None}],
 'sensorType': None,
 'coordinates': {'latitude': 41.80117, 'longitude': -87.83194},
 'lastUpdated': '2024-01-24T02:00:00+00:00',
 'firstUpdated': '2024-01-02T16:00:00+00:00',
 'measurements': 417,
 'bounds': [-87.83194, 41.80117, -87.83194, 41.80117],
 'manufacturers': [{'modelName': 'N/A', 'manufacturerName': 'OpenAQ admin'}]}
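The `meta` block of the response reports how many locations matched (`found`) and the page size (`limit`). A small sketch of using those two numbers to decide whether more pages need to be requested follows. It assumes `found` is an integer count, as it is in this response; other responses may differ.

```python
import math

# "meta" block copied from the response above
meta = {"name": "openaq-api", "license": "", "website": "/",
        "page": 1, "limit": 1000, "found": 18}

# Pages needed to retrieve every match at the current page size
# (assumes "found" is an integer, as in this response)
pages_needed = max(1, math.ceil(meta["found"] / meta["limit"]))
has_more = meta["page"] < pages_needed
print(pages_needed, has_more)
```

Here 18 matches fit within a single page of 1000, so no further requests are needed.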

In [10]:
for site in data['results']:
    print(site['name'], site['coordinates']['latitude'], site['coordinates']['longitude'])
    for obs in site['parameters']:
        print(obs['parameter'], obs['lastValue'], obs['lastUpdated'])
    print('\n')

McCook 41.80117 -87.83194
pm25 22.4 2024-01-24T02:00:00+00:00


Gary-IITRI 41.606563 -87.305015
o3 0.004 2024-01-24T02:00:00+00:00
no2 0.0181 2024-01-24T02:00:00+00:00
pm10 11.0 2024-01-24T02:00:00+00:00
bc 0.18 2021-01-02T00:00:00+00:00
so2 0.0 2024-01-24T02:00:00+00:00
pm25 6.9 2024-01-23T03:00:00+00:00


CHI_COM 41.7547 -87.7136
pm25 14.8 2024-01-24T02:00:00+00:00
o3 0.007 2023-11-02T12:00:00+00:00


ALSIP 41.6708 -87.7325
pm25 17.9 2024-01-24T02:00:00+00:00
o3 0.023 2023-11-01T16:00:00+00:00


BRAIDWD 41.2222 -88.1906
o3 0.02 2023-11-07T15:00:00+00:00
pm25 7.8 2024-01-24T02:00:00+00:00


CARY 42.2211 -88.2411
pm25 5.7 2024-01-24T02:00:00+00:00
o3 0.021 2023-11-07T18:00:00+00:00


CHIWAUKEE 42.5047 -87.8111
pm25 9.2 2024-01-24T02:00:00+00:00
o3 0.014 2023-11-01T11:00:00+00:00


CHI_SP 41.9136 -87.7239
pm25 21.3 2024-01-24T02:00:00+00:00


Ogden Dunes 41.617814 -87.199533
pm25 5.2 2024-01-23T03:00:00+00:00
o3 0.002 2024-01-24T02:00:00+00:00


NORTHBRK 42.1406 -87.7994
o3 0.01 2024-01
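The listing above shows that not every parameter at a site is current: at Gary-IITRI, for example, `bc` was last updated in 2021. The sketch below flags such parameters by parsing the ISO 8601 timestamps with the standard library. `stale_parameters` and the 7-day threshold are assumptions made for illustration, not something the API defines.

```python
from datetime import datetime, timedelta

# (parameter, lastUpdated) pairs for Gary-IITRI, copied from the output above
gary = [
    ("o3", "2024-01-24T02:00:00+00:00"),
    ("no2", "2024-01-24T02:00:00+00:00"),
    ("pm10", "2024-01-24T02:00:00+00:00"),
    ("bc", "2021-01-02T00:00:00+00:00"),
    ("so2", "2024-01-24T02:00:00+00:00"),
    ("pm25", "2024-01-23T03:00:00+00:00"),
]


def stale_parameters(observations, max_age=timedelta(days=7)):
    """Return parameters whose timestamp lags the newest one by more than max_age."""
    parsed = [(name, datetime.fromisoformat(stamp)) for name, stamp in observations]
    newest = max(stamp for _, stamp in parsed)
    return [name for name, stamp in parsed if newest - stamp > max_age]


print(stale_parameters(gary))
```

Only `bc` is flagged; `pm25` lags the newest reading by less than a day, which is well inside the threshold.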

In [11]:
import xyzservices.providers as xyz
from bokeh.plotting import figure, show
from bokeh.models import ColumnDataSource
from bokeh.transform import factor_cmap, factor_mark
from bokeh.transform import linear_cmap, log_cmap

import numpy as np

In [12]:
# Helper function to convert lon/lat in decimal degrees to Web Mercator
def lnglat_to_meters(longitude: float, latitude: float) -> tuple[float, float]:
    """Project the given (longitude, latitude) values into Web Mercator
    coordinates (meters East of Greenwich and meters North of the Equator).
    """
    origin_shift = np.pi * 6378137
    easting = longitude * origin_shift / 180.0
    northing = np.log(np.tan((90 + latitude) * np.pi / 360.0)) * origin_shift / np.pi
    return (easting, northing)
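One way to sanity-check the projection is to invert it and confirm that a known site survives the round trip. The sketch below repeats the formula from the helper above using only the standard library and adds an inverse. `lnglat_to_meters_py` and `meters_to_lnglat` are hypothetical names introduced for this check.

```python
import math

ORIGIN_SHIFT = math.pi * 6378137  # half the Web Mercator world width, in meters


def lnglat_to_meters_py(longitude, latitude):
    """Same formula as lnglat_to_meters, using the standard library only."""
    easting = longitude * ORIGIN_SHIFT / 180.0
    northing = math.log(math.tan((90 + latitude) * math.pi / 360.0)) * ORIGIN_SHIFT / math.pi
    return easting, northing


def meters_to_lnglat(easting, northing):
    """Hypothetical inverse: Web Mercator meters back to decimal degrees."""
    longitude = easting / ORIGIN_SHIFT * 180.0
    latitude = math.degrees(
        2 * math.atan(math.exp(northing * math.pi / ORIGIN_SHIFT)) - math.pi / 2
    )
    return longitude, latitude


# Round trip for the McCook site (longitude -87.83194, latitude 41.80117)
x, y = lnglat_to_meters_py(-87.83194, 41.80117)
lon, lat = meters_to_lnglat(x, y)
print(x, y)
print(lon, lat)
```

The inverse follows from solving the northing formula for latitude: exponentiate, take the arctangent, and undo the 45 degree offset.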

In [13]:
# Create one dictionary of columns (site name, projected coordinates, values) per pollutant
pm25 = {'name':[], 'lat':[], 'lon':[], 'avg':[], 'lastValue':[], 'lastUpdated':[], 'units':[]}
ozone = {'name':[], 'lat':[], 'lon':[], 'avg':[], 'lastValue':[], 'lastUpdated':[], 'units':[]}
so2 = {'name':[], 'lat':[], 'lon':[], 'avg':[], 'lastValue':[], 'lastUpdated':[], 'units':[]}

In [14]:
for site in data['results']:
    for obs in site['parameters']:
        if obs['parameter'] == 'pm25':
            pm25['name'].append(site['name'])
            pm25['avg'].append(obs['average'])
            pm25['lastValue'].append(obs['lastValue'])
            pm25['lastUpdated'].append(obs['lastUpdated'])
            pm25['units'].append(obs['unit'])
            coords = lnglat_to_meters(site['coordinates']['longitude'], 
                                      site['coordinates']['latitude'])
            pm25['lon'].append(coords[0])
            pm25['lat'].append(coords[1])
        elif obs['parameter'] == 'o3':
            ozone['name'].append(site['name'])
            ozone['avg'].append(obs['average'])
            ozone['lastValue'].append(obs['lastValue'])
            ozone['lastUpdated'].append(obs['lastUpdated'])
            ozone['units'].append(obs['unit'])
            coords = lnglat_to_meters(site['coordinates']['longitude'], 
                                      site['coordinates']['latitude'])
            ozone['lon'].append(coords[0])
            ozone['lat'].append(coords[1])
        elif obs['parameter'] == 'so2':
            so2['name'].append(site['name'])
            so2['avg'].append(obs['average'])
            so2['lastValue'].append(obs['lastValue'])
            so2['lastUpdated'].append(obs['lastUpdated'])
            so2['units'].append(obs['unit'])
            coords = lnglat_to_meters(site['coordinates']['longitude'], 
                                      site['coordinates']['latitude'])
            so2['lon'].append(coords[0])
            so2['lat'].append(coords[1])
        else:
            print('Not Supported Yet: ', obs)
            #print('Not Supported Yet')
    #print(site['coordinates']['latitude'], site['coordinates']['longitude'])

Not Supported Yet:  {'id': 7, 'unit': 'ppm', 'count': 45445, 'average': 0.008949742701870322, 'lastValue': 0.0181, 'parameter': 'no2', 'displayName': 'no2 ppm', 'lastUpdated': '2024-01-24T02:00:00+00:00', 'parameterId': 7, 'firstUpdated': '2016-03-06T19:00:00+00:00', 'manufacturers': None}
Not Supported Yet:  {'id': 1, 'unit': 'µg/m³', 'count': 50480, 'average': 19.168185735832825, 'lastValue': 11.0, 'parameter': 'pm10', 'displayName': 'pm10 µg/m³', 'lastUpdated': '2024-01-24T02:00:00+00:00', 'parameterId': 1, 'firstUpdated': '2016-03-06T19:00:00+00:00', 'manufacturers': None}
Not Supported Yet:  {'id': 11, 'unit': 'µg/m³', 'count': 30258, 'average': 0.6917327648886192, 'lastValue': 0.18, 'parameter': 'bc', 'displayName': 'bc µg/m³', 'lastUpdated': '2021-01-02T00:00:00+00:00', 'parameterId': 11, 'firstUpdated': '2016-03-06T19:00:00+00:00', 'manufacturers': None}
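The three branches in the loop above perform the same appends for different pollutants. One possible refactor, sketched below, keeps a table per pollutant in a single dictionary so that supporting `no2` or `pm10` only means extending a tuple. `group_by_parameter`, `to_web_mercator`, and `FIELDS` are hypothetical names, and the sample record is a trimmed copy of the first result in the response.

```python
import math

FIELDS = ("name", "lat", "lon", "avg", "lastValue", "lastUpdated", "units")


def to_web_mercator(longitude, latitude):
    """Standard-library version of the lnglat_to_meters formula."""
    shift = math.pi * 6378137
    easting = longitude * shift / 180.0
    northing = math.log(math.tan((90 + latitude) * math.pi / 360.0)) * shift / math.pi
    return easting, northing


def group_by_parameter(results, wanted=("pm25", "o3", "so2")):
    """Build one column dictionary per pollutant from the API 'results' list."""
    tables = {name: {field: [] for field in FIELDS} for name in wanted}
    for site in results:
        easting, northing = to_web_mercator(site["coordinates"]["longitude"],
                                            site["coordinates"]["latitude"])
        for obs in site["parameters"]:
            table = tables.get(obs["parameter"])
            if table is None:
                continue  # pollutant not requested
            table["name"].append(site["name"])
            table["lon"].append(easting)
            table["lat"].append(northing)
            table["avg"].append(obs["average"])
            table["lastValue"].append(obs["lastValue"])
            table["lastUpdated"].append(obs["lastUpdated"])
            table["units"].append(obs["unit"])
    return tables


# Trimmed-down copy of the first record in the response
sample = [{"name": "McCook",
           "coordinates": {"latitude": 41.80117, "longitude": -87.83194},
           "parameters": [{"parameter": "pm25", "unit": "µg/m³", "average": 6.8,
                           "lastValue": 22.4,
                           "lastUpdated": "2024-01-24T02:00:00+00:00"}]}]
tables = group_by_parameter(sample)
print(tables["pm25"]["name"], tables["pm25"]["lastValue"])
```

Each table has the same keys as the `pm25`, `ozone`, and `so2` dictionaries, so it can be passed to a `ColumnDataSource` in the same way.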


In [15]:
pm25

{'name': ['McCook',
  'Gary-IITRI',
  'CHI_COM',
  'ALSIP',
  'BRAIDWD',
  'CARY',
  'CHIWAUKEE',
  'CHI_SP',
  'Ogden Dunes',
  'NORTHBRK',
  'SCHILPRK',
  'DESPLNS',
  'Hammond-167th St',
  'Naperville',
  'Joliet',
  'Kingery Near-road #1',
  'East Chicago - Marin',
  'Cicero Liberty'],
 'lat': [5131242.049505054,
  5102225.475927335,
  5124305.2240361385,
  5111793.729165387,
  5045171.564705385,
  5194156.906659716,
  5236883.450753147,
  5148045.916607161,
  5103900.650331237,
  5182063.965388965,
  5155738.054781796,
  5170016.375723212,
  5100418.860309329,
  5126797.543824141,
  5090370.475709981,
  5098063.743738284,
  5109207.869309763,
  5140678.201429784],
 'lon': [-9777406.836185357,
  -9718749.813499112,
  -9764233.287644882,
  -9766337.226020874,
  -9817332.684753273,
  -9822954.319038333,
  -9775086.937997224,
  -9765379.878400052,
  -9707007.610971255,
  -9773784.499954944,
  -9782322.704898788,
  -9780875.551518476,
  -9739903.745015066,
  -9813058.01630681,
  -98090

In [24]:
def make_plot(ndata, mapper, palette):
    """Plot one pollutant's latest values on a Web Mercator basemap of Chicago."""
    # let's center on UIC
    chi_lat = 41.86937
    chi_lon = -87.64638

    EN = lnglat_to_meters(chi_lon, chi_lat)
    dE = 47500 # (m) easting plus-and-minus from map center
    dN = 47500 # (m) northing plus-and-minus from map center

    x_range = (EN[0] - dE, EN[0] + dE) # (m) Easting x_low, x_high
    y_range = (EN[1] - dN, EN[1] + dN) # (m) Northing y_low, y_high

    plot = figure(x_range=x_range, 
                  y_range=y_range,
                  x_axis_type="mercator",
                  y_axis_type="mercator",
                  height=750,
                  width=900,
                  toolbar_location=None,
                  active_scroll='wheel_zoom',
                  title='CROCUS U-IFL Domain'
                  )

    plot.xaxis.axis_label = 'Longitude [Degrees]'
    plot.yaxis.axis_label = 'Latitude [Degrees]'

    plot.add_tile("CartoDB Positron", retina=True)

    # Map the 'lastValue' column onto the palette, scaled to the data range
    cmap = mapper("lastValue",
                  palette=palette,
                  low=min(ndata['lastValue']),
                  high=max(ndata['lastValue']))

    # Add the data sources to plot
    source = ColumnDataSource(data=ndata)

    # Reference columns by name so the color mapper can read from the source
    r = plot.scatter(x='lon',
                     y='lat',
                     source=source,
                     alpha=0.8,
                     fill_color=cmap,
                     line_color=None
                    )

    # Color bar built from the scatter renderer's color mapper
    color_bar = r.construct_color_bar(padding=0)
    plot.add_layout(color_bar, 'below')

    return plot

In [25]:
p1 = make_plot(pm25, linear_cmap, "Viridis256")
show(p1)

In [18]:
source = ColumnDataSource(data=pm25)

In [20]:
source.data['lastValue']

[22.4,
 6.9,
 14.8,
 17.9,
 7.8,
 5.7,
 9.2,
 21.3,
 5.2,
 8.0,
 16.1,
 15.1,
 11.9,
 15.3,
 15.3,
 19.3,
 18.4,
 11.8]
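A color mapper needs `low` and `high` limits, and limits far wider than the data leave every marker at nearly the same color. As a small sketch, the limits can be taken from the column itself; the values below are copied from the output above.

```python
# lastValue column copied from the output above (µg/m³)
last_values = [22.4, 6.9, 14.8, 17.9, 7.8, 5.7, 9.2, 21.3, 5.2,
               8.0, 16.1, 15.1, 11.9, 15.3, 15.3, 19.3, 18.4, 11.8]

# Use the observed range as the color scale limits
low, high = min(last_values), max(last_values)
print(low, high)
```

These two numbers can then be passed as the `low` and `high` arguments of `linear_cmap` or `log_cmap`.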

In [21]:
ndata['lon']

NameError: name 'ndata' is not defined

In [None]:
# Define data sources to plot on the Chicago HTML Map

In [29]:
import numpy as np

from bokeh.plotting import figure, show

N = 4000
x = np.random.random(size=N) * 100
y = np.random.random(size=N) * 100
radii = np.random.random(size=N) * 1.5
colors = np.array([(r, g, 150) for r, g in zip(50+2*x, 30+2*y)], dtype="uint8")

TOOLS="hover,crosshair,pan,wheel_zoom,zoom_in,zoom_out,box_zoom,undo,redo,reset,tap,save,box_select,poly_select,lasso_select,examine,help"

p = figure(tools=TOOLS)

p.scatter(x, y, radius=radii,
          fill_color=colors, fill_alpha=0.6,
          line_color=None)

show(p)

In [28]:
colors.shape

(4000, 3)