# Air Quality Assessment in the Arctic
In this notebook we analyze air quality in the Arctic using data from ground stations located in the European Arctic:

* Grundartangi Groef, Iceland (EEA st. 52149)
* Kopavogur Dalsmari, Iceland (EEA st. 52109)
* Oulu Pyykoesjaervi, Finland (EEA st. 15557)
* Oulun Keskusta 2, Finland (EEA st. 15609)
* Tromso, Norway (EEA st. 62993) 
* Muonio Sammaltunturi, Finland (FMI st. 101983) 

The datasets are provided by the [Finnish Meteorological Institute (FMI)](https://en.ilmatieteenlaitos.fi/download-observations) and the [European Environment Agency (EEA)](https://discomap.eea.europa.eu/map/fme/AirQualityExport.htm). The aim of the project is to build a model that predicts the state of environmental variables such as PM10. The predictions will be compared with those from the [CAMS service](https://ads.atmosphere.copernicus.eu/cdsapp#!/dataset/cams-europe-air-quality-forecasts). The CAMS predictions are the result of an ensemble of 9 numerical air quality models. An air quality model couples a dynamical model of the atmosphere with the chemical and physical equations that govern the air components. The CAMS numerical models take as input the data assimilated from a network of ground stations and satellite sensors, and return a grid that represents the horizontal and vertical distribution of the air quality variables together with a prediction of their future state.
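Comparing a station-level prediction with the CAMS forecast comes down to scoring both against the observed values. The following is a minimal sketch, assuming the series are already aligned on the same days. The function name `forecast_errors` and the numbers are illustrative only and are not project results.

```python
import numpy as np

def forecast_errors(observed, predicted):
    """Return (MAE, RMSE) between two equally long sequences."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    diff = predicted - observed
    mae = np.mean(np.abs(diff))          # mean absolute error
    rmse = np.sqrt(np.mean(diff ** 2))   # root mean squared error
    return mae, rmse

# Made-up daily PM10 values in µg/m3, for illustration only
obs = [2.0, 4.0, 6.0]
model = [3.0, 4.0, 5.0]
mae, rmse = forecast_errors(obs, model)
print(mae, rmse)
```

The same function could be applied once to the model output and once to the CAMS forecast, so that both are scored against identical observations.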


In [2]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

## The datasets
Five of the six datasets can be downloaded from the EEA website by selecting the country, the pollutant (PM10), and the years. Each station has a code number that can be used to select the files to be downloaded. The dataset of the station managed by the FMI, Muonio Sammaltunturi, can be downloaded from the FMI website by selecting the hourly air quality observations for the Muonio Sammaltunturi station, the inhalable particles <10 $\mu m$, and the time period.
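Selecting the EEA files by station code can be sketched as follows. The file-name pattern (`<country>_<pollutant>_<station>_<year>_timeseries.csv`) and the function name `select_station_files` are assumptions made for illustration; the actual names of the downloaded files should be checked before relying on this.

```python
from pathlib import Path

# EEA station codes listed at the top of the notebook
EEA_STATIONS = {"52149", "52109", "15557", "15609", "62993"}

def select_station_files(filenames, station_codes=EEA_STATIONS):
    """Keep the files whose underscore-separated name contains a wanted station code."""
    selected = []
    for name in filenames:
        parts = Path(name).stem.split("_")
        if station_codes.intersection(parts):
            selected.append(name)
    return selected

# Hypothetical file names following the assumed pattern
files = [
    "FI_5_15557_2020_timeseries.csv",
    "FI_5_99999_2020_timeseries.csv",
    "NO_5_62993_2021_timeseries.csv",
]
print(select_station_files(files))
```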


In [3]:
path_muonio = 'data/fmi/101983/Muonio_Sammaltunturi.csv'
muonio_df = pd.read_csv(path_muonio)
muonio_df.head(3)

Unnamed: 0,Observation station,Year,Month,Day,Time [Local time],Inhalable particles <10 µm [µg/m3]
0,Muonio Sammaltunturi,2020,1,1,01:00,0
1,Muonio Sammaltunturi,2020,1,1,02:00,0
2,Muonio Sammaltunturi,2020,1,1,03:00,0


In [4]:
year = muonio_df['Year']
month = muonio_df['Month']
day = muonio_df['Day']

def ddigit(value):
    """Return value as a zero-padded, two-digit string (e.g. 7 -> '07')."""
    return str(value).zfill(2)

# Build a date string (YYYY-MM-DD) from the separate year, month and day fields
muonio_df['Date'] = year.astype(str) + '-' \
                  + month.apply(ddigit) + '-' \
                  + day.apply(ddigit)

In [5]:
muonio_df.head(3)

Unnamed: 0,Observation station,Year,Month,Day,Time [Local time],Inhalable particles <10 µm [µg/m3],Date
0,Muonio Sammaltunturi,2020,1,1,01:00,0,2020-01-01
1,Muonio Sammaltunturi,2020,1,1,02:00,0,2020-01-01
2,Muonio Sammaltunturi,2020,1,1,03:00,0,2020-01-01
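The same `Date` column can also be derived with pandas' own date handling instead of manual zero-padding. This is a minimal sketch on a small stand-in frame: the three rows are made up, and only the column names come from the dataset.

```python
import pandas as pd

# Stand-in for muonio_df with made-up rows
sample = pd.DataFrame({"Year": [2020, 2020, 2021],
                       "Month": [1, 11, 2],
                       "Day": [1, 9, 28]})

# pd.to_datetime can assemble dates from columns named year, month, day
parts = sample[["Year", "Month", "Day"]].rename(columns=str.lower)
sample["Date"] = pd.to_datetime(parts).dt.strftime("%Y-%m-%d")
print(sample["Date"].tolist())
```

Keeping the intermediate result as a datetime (without the `strftime` step) may be preferable if the dates are later used for time-based operations.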


We remove the redundant fields.

In [18]:
muonio_df_prep = muonio_df.drop(['Observation station', 'Year', 'Month', 'Day'], axis=1)

We change the order of the fields so that `Date` comes first.

In [47]:
cols = muonio_df_prep.columns.to_list()
cols = cols[-1:] + cols[:-1]
muonio_df_prep1 = muonio_df_prep[cols]
muonio_df_prep1.head(3)

Unnamed: 0,Date,Time [Local time],Inhalable particles <10 µm [µg/m3]
0,2020-01-01,01:00,0
1,2020-01-01,02:00,0
2,2020-01-01,03:00,0


We rename the fields to shorter names.

In [48]:
muonio_df_prep2 = muonio_df_prep1.rename(columns={'Time [Local time]':'Time', 'Inhalable particles <10 µm [µg/m3]':'PM10'})
muonio_df_prep2.head(3)

Unnamed: 0,Date,Time,PM10
0,2020-01-01,01:00,0
1,2020-01-01,02:00,0
2,2020-01-01,03:00,0
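The three preparation steps (drop, reorder, rename) can also be chained into a single expression, which avoids the intermediate variables. This is a sketch under the assumption that the input has the columns shown earlier. The function name `prepare_pm10` and the sample row are illustrative.

```python
import pandas as pd

PM10_COL = "Inhalable particles <10 µm [µg/m3]"

def prepare_pm10(df):
    """Drop redundant fields, rename the rest, and put Date first."""
    return (
        df.drop(columns=["Observation station", "Year", "Month", "Day"])
          .rename(columns={"Time [Local time]": "Time", PM10_COL: "PM10"})
          [["Date", "Time", "PM10"]]
    )

# Made-up single row with the same columns as muonio_df
sample = pd.DataFrame({
    "Observation station": ["Muonio Sammaltunturi"],
    "Year": [2020], "Month": [1], "Day": [1],
    "Time [Local time]": ["01:00"],
    PM10_COL: [0.0],
    "Date": ["2020-01-01"],
})
print(prepare_pm10(sample).columns.to_list())
```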


We want to aggregate the hourly values so that each day has a single value: the mean PM10 for that day.

In [None]:
# Daily mean PM10: group the hourly rows by date and average each group
mean_pm10 = muonio_df_prep2.groupby('Date')['PM10'].mean()
print(mean_pm10.head())

In [52]:
muonio_df_prep2

Unnamed: 0,Date,Time,PM10
0,2020-01-01,01:00,0
1,2020-01-01,02:00,0
2,2020-01-01,03:00,0
3,2020-01-01,04:00,0.7
4,2020-01-01,05:00,0.2
...,...,...,...
26296,2022-12-31,20:00,1.7
26297,2022-12-31,21:00,1.4
26298,2022-12-31,22:00,1.8
26299,2022-12-31,23:00,1.6
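With `Date` and `Time` combined into a single timestamp, the daily mean can also be computed by resampling. This is a minimal sketch on made-up values. It assumes the times parse as `HH:MM` and that non-numeric readings should be treated as missing. The `"-"` placeholder is hypothetical and may not match how the real file marks gaps.

```python
import pandas as pd

# Made-up hourly readings with the same columns as muonio_df_prep2
sample = pd.DataFrame({
    "Date": ["2020-01-01", "2020-01-01", "2020-01-02", "2020-01-02"],
    "Time": ["01:00", "02:00", "01:00", "02:00"],
    "PM10": [0.0, 1.0, 2.0, "-"],   # "-" is a hypothetical missing-value marker
})

# Build timestamps from the two text columns
timestamps = pd.to_datetime(sample["Date"] + " " + sample["Time"])

# Convert readings to numbers; anything non-numeric becomes NaN
pm10 = pd.to_numeric(sample["PM10"], errors="coerce")
pm10.index = pd.DatetimeIndex(timestamps)

# Daily mean; NaN values are skipped by mean()
daily_mean = pm10.resample("D").mean()
print(daily_mean)
```

Unlike grouping on the `Date` string, resampling also produces a row for any calendar day that has no observations at all, which makes gaps in the record visible.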


## References
* [Fazzini et al. - Forecasting PM10 Levels Using Machine Learning Models in the Arctic: A Comparative Study](https://www.mdpi.com/2072-4292/15/13/3348)  
* [ECMWF - CAMS Regional: European air quality analysis and forecast data documentation](https://confluence.ecmwf.int/display/CKB/CAMS+Regional%3A+European+air+quality+analysis+and+forecast+data+documentation)