# Fetching air quality data from the FMI open data timeseries API

[API documentation](https://github.com/fmidev/smartmet-plugin-timeseries/blob/master/docs/Using-the-Timeseries-API.md),
[API examples](https://github.com/fmidev/smartmet-plugin-timeseries/blob/master/docs/Examples.md),
[JSON API example call](https://opendata.fmi.fi/timeseries?format=json&groupareas=0&producer=airquality_urban&area=Helsinki&param=time,fmisid,PM10_PT1H_avg,PM25_PT1H_avg,O3_PT1H_avg,CO_PT1H_avg,SO2_PT1H_avg,NO2_PT1H_avg,TRSC_PT1H_avg),
[CSV API call for the fmisid to name mapping](https://opendata.fmi.fi/timeseries?format=ascii&groupareas=0&separator=,&producer=airquality_urban&area=Finland&param=fmisid,name,latitude,longitude&starttime=2022-08-26T08:00:00%2B00:00&endtime=2022-08-26T08:00:00%2B00:00&tz=UTC).
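
The CSV call above can be turned into a lookup from `fmisid` to station name. The following is a minimal sketch, assuming the response has no header row, one station per line, and columns in the order of the `param` list (`fmisid,name,latitude,longitude`). The helper name `parse_station_mapping` and the sample rows are hypothetical, not part of the API.

```python
import csv
import io


def parse_station_mapping(text):
    """Map fmisid -> station name from a comma-separated response body.

    Assumes no header row and the column order fmisid,name,latitude,longitude.
    """
    mapping = {}
    for row in csv.reader(io.StringIO(text)):
        if len(row) < 4:
            continue  # skip blank or malformed lines
        # Latitude and longitude are the last two fields. Everything between
        # the id and those two is treated as the name, in case it contains commas.
        fmisid = int(row[0])
        name = ','.join(row[1:-2]).strip()
        mapping[fmisid] = name
    return mapping


# Hypothetical sample body; station names and coordinates are made up.
sample = (
    "100662,Example Station A,60.17,24.94\n"
    "100691,Example Station B,60.21,25.05\n"
)
stations = parse_station_mapping(sample)
print(stations[100662])
```

With a live response, the text to pass in would be the `.text` attribute of the `requests` response for the CSV URL.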

In [4]:
import requests
import pendulum
import pandas as pd
import numpy as np

In [5]:
# Fetch the last five days of observations, up to the start of tomorrow (UTC).
start_time = pendulum.now('UTC').subtract(days=5)
end_time = pendulum.tomorrow('UTC')

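# Requested API fields, mapped to the dtype each DataFrame column is cast to.
aq_fields = {
    'fmisid': np.int32,
    'time': 'datetime64[ns]',  # explicit unit; recent pandas versions reject a unit-less np.datetime64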
    'AQINDEX_PT1H_avg': np.float64,
    'PM10_PT1H_avg': np.float64,
    'PM25_PT1H_avg': np.float64,
    'O3_PT1H_avg': np.float64,
    'CO_PT1H_avg': np.float64,
    'SO2_PT1H_avg': np.float64,
    'NO2_PT1H_avg': np.float64,
    'TRSC_PT1H_avg': np.float64,
}

url = 'https://opendata.fmi.fi/timeseries'

params = {
    'format': 'json',
    'precision': 'double',
    'groupareas': '0',
    'producer': 'airquality_urban',
    'area': 'Uusimaa',
    'param': ','.join(aq_fields.keys()),
    'starttime': start_time.isoformat(timespec="seconds"),
    'endtime': end_time.isoformat(timespec="seconds"),
    'tz': 'UTC',
}

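response = requests.get(url, params=params, timeout=30)
response.raise_for_status()  # fail early on HTTP errors instead of parsing an error body
data = response.json()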
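
To see which URL a `params` dict like the one above corresponds to, the query string can be built by hand. This is a sketch using the standard library's `urlencode`; `requests` does its own encoding, which should produce an equivalent URL. The fixed timestamps and the shortened `param` list are illustrative assumptions. Note that the `+` in the UTC offset ends up percent-encoded as `%2B`, as in the CSV link at the top; an unencoded `+` in a query string would be read as a space.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode

url = 'https://opendata.fmi.fi/timeseries'

# Fixed example timestamps (hypothetical), so the result is reproducible.
start_time = datetime(2022, 9, 8, 11, 0, tzinfo=timezone.utc)
end_time = datetime(2022, 9, 14, 0, 0, tzinfo=timezone.utc)

params = {
    'format': 'json',
    'groupareas': '0',
    'producer': 'airquality_urban',
    'area': 'Uusimaa',
    'param': 'fmisid,time,PM10_PT1H_avg',  # shortened field list for readability
    'starttime': start_time.isoformat(timespec='seconds'),
    'endtime': end_time.isoformat(timespec='seconds'),
    'tz': 'UTC',
}

# urlencode percent-encodes reserved characters such as ':', ',' and '+'.
full_url = f'{url}?{urlencode(params)}'
print(full_url)
```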

In [6]:
df = pd.DataFrame(data).astype(aq_fields)
df = df.set_index(['fmisid', 'time'])
df.head(10)

fmisid,time,AQINDEX_PT1H_avg,PM10_PT1H_avg,PM25_PT1H_avg,O3_PT1H_avg,CO_PT1H_avg,SO2_PT1H_avg,NO2_PT1H_avg,TRSC_PT1H_avg
100662,2022-09-08 11:00:00,1.0,8.2,3.7,58.3,,1.6,6.1,
100662,2022-09-08 12:00:00,2.0,6.2,4.2,61.4,,1.2,3.9,
100662,2022-09-08 13:00:00,2.0,8.6,3.1,61.5,,1.5,4.6,
100662,2022-09-08 14:00:00,2.0,8.8,3.5,61.1,,1.4,6.4,
100662,2022-09-08 15:00:00,1.0,5.7,3.7,58.6,,1.3,9.2,
100662,2022-09-08 16:00:00,2.0,5.8,1.8,60.6,,1.3,6.8,
100662,2022-09-08 17:00:00,1.0,6.2,3.6,59.0,,1.1,8.0,
100662,2022-09-08 18:00:00,1.0,5.6,4.2,48.8,,1.3,16.1,
100662,2022-09-08 19:00:00,1.0,7.2,4.7,47.5,,1.2,15.6,
100662,2022-09-08 20:00:00,1.0,7.3,4.2,50.1,,1.0,11.4,
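
In the rows above, `CO_PT1H_avg` and `TRSC_PT1H_avg` are empty for station 100662. One way to check how much of each column is missing is sketched below, on a small frame rebuilt by hand from the first three output rows. The variable names `sample`, `missing_share` and `empty_columns` are illustrative.

```python
import numpy as np
import pandas as pd

# First three rows of the output above, rebuilt by hand for illustration.
sample = pd.DataFrame(
    {
        'fmisid': [100662, 100662, 100662],
        'time': pd.to_datetime(
            ['2022-09-08 11:00:00', '2022-09-08 12:00:00', '2022-09-08 13:00:00']
        ),
        'AQINDEX_PT1H_avg': [1.0, 2.0, 2.0],
        'PM10_PT1H_avg': [8.2, 6.2, 8.6],
        'PM25_PT1H_avg': [3.7, 4.2, 3.1],
        'O3_PT1H_avg': [58.3, 61.4, 61.5],
        'CO_PT1H_avg': [np.nan, np.nan, np.nan],
        'SO2_PT1H_avg': [1.6, 1.2, 1.5],
        'NO2_PT1H_avg': [6.1, 3.9, 4.6],
        'TRSC_PT1H_avg': [np.nan, np.nan, np.nan],
    }
).set_index(['fmisid', 'time'])

# Fraction of missing values per column, computed separately for each station.
missing_share = sample.isna().groupby(level='fmisid').mean()

# Columns that contain no values at all in this sample.
empty_columns = sample.columns[sample.isna().all()].tolist()
print(empty_columns)
```

The same two expressions should work unchanged on the full `df`, since it uses the same `fmisid`/`time` index.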


In [7]:
# to_parquet does not create missing directories, so data/ must already exist.
df.to_parquet('data/airquality.parquet', compression='zstd')