# OpenAQ
[Explore data](https://explore.openaq.org/#1.2/20/40)

[API docs](https://docs.openaq.org/examples/examples)

This notebook fetches air quality data from the OpenAQ API.

In [48]:
from openaq import OpenAQ
import requests
import json
import pandas as pd

In [25]:
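import os

# Read the key from the environment instead of hard-coding it in the notebook.
# OPENAQ_API_KEY is an assumed variable name; set it before running this cell.
openaq_apikey = os.environ["OPENAQ_API_KEY"]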

headers = {
    "X-API-Key": openaq_apikey
}

# Daily Averages Data

To get daily averages, we need the ID of the sensor whose data we want. Every location (station) has one or more sensors, and the locations API returns the list of sensors for a location. I'm interested in the [New Delhi location (8118)](https://explore.openaq.org/locations/8118).

In [74]:
LOCATION_ID = 8118

BASE_URL = f"https://api.openaq.org/v3/locations/{LOCATION_ID}"
# No query parameters are needed here; `params` is only defined further down
response = requests.get(BASE_URL, headers=headers)
response.raise_for_status()

data = response.json()
data

{'meta': {'name': 'openaq-api',
  'website': '/',
  'page': 1,
  'limit': 100,
  'found': 1},
 'results': [{'id': 8118,
   'name': 'New Delhi',
   'locality': 'India',
   'timezone': 'Asia/Kolkata',
   'country': {'id': 9, 'code': 'IN', 'name': 'India'},
   'owner': {'id': 4, 'name': 'Unknown Governmental Organization'},
   'provider': {'id': 119, 'name': 'AirNow'},
   'isMobile': False,
   'isMonitor': True,
   'instruments': [{'id': 2, 'name': 'Government Monitor'}],
   'sensors': [{'id': 23534,
     'name': 'pm25 µg/m³',
     'parameter': {'id': 2,
      'name': 'pm25',
      'units': 'µg/m³',
      'displayName': 'PM2.5'}}],
   'coordinates': {'latitude': 28.63576, 'longitude': 77.22445},
   'licenses': [{'id': 33,
     'name': 'US Public Domain',
     'attribution': {'name': 'Unknown Governmental Organization', 'url': None},
     'dateFrom': '2016-01-30',
     'dateTo': None}],
   'bounds': [77.22445, 28.63576, 77.22445, 28.63576],
   'distance': None,
   'datetimeFirst': {'utc': 

This location has a single PM2.5 sensor, with ID 23534. We will use this ID to fetch its data.
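
Instead of reading the sensor ID off the printed output, it can be pulled out of the response programmatically. The following is a minimal sketch: `list_sensors` is a hypothetical helper name (not part of the OpenAQ client), and it assumes the response has the `results` → `sensors` → `parameter` shape shown above.

```python
def list_sensors(location_response):
    """Return (sensor_id, parameter_name) pairs for every sensor in a locations response."""
    sensors = []
    for location in location_response["results"]:
        for sensor in location["sensors"]:
            sensors.append((sensor["id"], sensor["parameter"]["name"]))
    return sensors


# Trimmed copy of the response shown above, keeping only the fields the helper reads
sample_response = {
    "results": [
        {
            "id": 8118,
            "name": "New Delhi",
            "sensors": [
                {
                    "id": 23534,
                    "name": "pm25 µg/m³",
                    "parameter": {"id": 2, "name": "pm25", "units": "µg/m³"},
                }
            ],
        }
    ]
}

print(list_sensors(sample_response))  # [(23534, 'pm25')]
```

In the notebook, `list_sensors(data)` could be called on the parsed response directly.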


In [78]:
SENSOR_ID = 23534
LIMIT = 1000

BASE_URL = f"https://api.openaq.org/v3/sensors/{SENSOR_ID}/days"

OpenAQ caps each response at 1000 rows, so a single API request returns at most 1000 results. The API is paginated: each `page` holds up to 1000 rows, and incrementing `page` returns the next batch. We'll append each page's data to `master_df`.
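
The page-by-page fetching described above can also be automated with a loop. This is a sketch under stated assumptions, not the notebook's actual method: `fetch_all_pages`, `max_pages`, and the `get` argument are hypothetical names introduced here, and the loop assumes that a page with fewer than `limit` rows is the last one.

```python
def fetch_all_pages(url, headers, limit=1000, max_pages=50, get=None):
    """Request page 1, 2, 3, ... and return the combined list of results."""
    if get is None:
        import requests  # imported lazily so a stub `get` can be passed in instead

        get = requests.get

    results = []
    for page in range(1, max_pages + 1):
        response = get(url, headers=headers, params={"limit": limit, "page": page})
        response.raise_for_status()
        batch = response.json()["results"]
        results.extend(batch)
        # Assumption: a short (or empty) page means there is nothing left to fetch
        if len(batch) < limit:
            break
    return results
```

With the notebook's variables this would be called as `fetch_all_pages(BASE_URL, headers, limit=LIMIT)`. The `max_pages` cap is only a safety stop so the loop cannot run indefinitely.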

In [79]:
master_df = pd.DataFrame(columns=['date','pollutant','avg_value'])
master_df

Unnamed: 0,date,pollutant,avg_value


In [90]:
params = {
    "limit": LIMIT,
    "page": 4,  # incremented by hand between runs to fetch each page
}

response = requests.get(BASE_URL, headers=headers, params=params)
response.raise_for_status()

data = response.json()

In [91]:
# Collect one row per daily result, then add them all to master_df in one step
# (pd.concat is the public replacement for the private DataFrame._append)
rows = []
for result in data['results']:
    rows.append({
        "date": result['period']['datetimeFrom']['local'].split('T')[0],
        "pollutant": result['parameter']['name'],
        "avg_value": result['value'],
    })

new_rows = pd.DataFrame(rows, columns=master_df.columns)
master_df = pd.concat([master_df, new_rows], ignore_index=True)

In [93]:
master_df[['date','avg_value']].to_csv('dailyaverages_pm25_8118_23534.csv',index=False)

In [83]:
import warnings
warnings.filterwarnings("ignore")

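Each result in the response carries more than the daily average: it also includes summary statistics (minimum, maximum, median, quantiles, standard deviation) and coverage information.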
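
If those extra fields are useful, each result can be flattened into a wider row. The sketch below is illustrative only: `result_to_row` and the output column names are hypothetical, and the sample is a trimmed copy of one result from this sensor's response, which is printed in full in the next cell.

```python
def result_to_row(result):
    """Flatten one daily result into a flat dict with summary and coverage fields."""
    summary = result["summary"]
    coverage = result["coverage"]
    return {
        "date": result["period"]["datetimeFrom"]["local"].split("T")[0],
        "pollutant": result["parameter"]["name"],
        "avg_value": result["value"],
        "min": summary["min"],
        "median": summary["median"],
        "max": summary["max"],
        "sd": summary["sd"],
        "percent_complete": coverage["percentComplete"],
    }


# Trimmed copy of one result, keeping only the fields the helper reads
sample_result = {
    "value": 27.5,
    "parameter": {"id": 2, "name": "pm25", "units": "µg/m³"},
    "period": {"datetimeFrom": {"local": "2025-05-13T00:00:00+05:30"}},
    "summary": {"min": 13.0, "median": 28.0, "max": 45.0, "sd": 8.118573266532609},
    "coverage": {"observedCount": 24, "percentComplete": 100.0},
}

row = result_to_row(sample_result)
print(row["date"], row["min"], row["max"])  # 2025-05-13 13.0 45.0
```

A list of such rows can be passed to `pd.DataFrame(...)` to build a table with one column per field.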

In [94]:
data

{'meta': {'name': 'openaq-api',
  'website': '/',
  'page': 4,
  'limit': 1000,
  'found': 250},
 'results': [{'value': 27.5,
   'flagInfo': {'hasFlags': False},
   'parameter': {'id': 2,
    'name': 'pm25',
    'units': 'µg/m³',
    'displayName': None},
   'period': {'label': '1day',
    'interval': '24:00:00',
    'datetimeFrom': {'utc': '2025-05-12T18:30:00Z',
     'local': '2025-05-13T00:00:00+05:30'},
    'datetimeTo': {'utc': '2025-05-13T18:30:00Z',
     'local': '2025-05-14T00:00:00+05:30'}},
   'coordinates': None,
   'summary': {'min': 13.0,
    'q02': 13.46,
    'q25': 21.75,
    'median': 28.0,
    'q75': 34.0,
    'q98': 41.779999999999994,
    'max': 45.0,
    'avg': 27.458333333333332,
    'sd': 8.118573266532609},
   'coverage': {'expectedCount': 24,
    'expectedInterval': '24:00:00',
    'observedCount': 24,
    'observedInterval': '24:00:00',
    'percentComplete': 100.0,
    'percentCoverage': 100.0,
    'datetimeFrom': {'utc': '2025-05-12T19:30:00Z',
     'local': 