# Data for Montevideo

CSV data was only available up until 2018: https://catalogodatos.gub.uy/dataset/intendencia-montevideo-red-de-monitoreo-de-la-calidad-del-aire-de-montevideo

Among the CSVs, however, there was a file named `hn.csv`, which stands for "Humo Negro" (literally "black smoke"). Would this be Black Carbon?

So I went to https://montevideo.gub.uy/areas-tematicas/ambiente/calidad-del-aire/informes-semanales-2019, which provides a weekly report on the data. Yes, I went through the PDFs and copy-pasted the data. 

The main drawback is that the PDF only provides the maximum, not the average.

In [1]:
import pandas as pd
import h3
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns

base = pd.read_csv('red_base.csv', parse_dates=['Fecha'], dayfirst=True, index_col="Fecha", na_values="ND", dtype=float)
orientada = pd.read_csv('red_orientada.csv', parse_dates=['Fecha'], dayfirst=True,  index_col="Fecha", na_values="ND", dtype=float)
df = pd.concat([base, orientada], axis=1)

In [2]:
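# Keep only the April rows
df = df[df.index.month == 4]
# Keep only stations with more than 20 non-missing values in that month
stations_with_more_than_20_days = df.columns[df.count() > 20]
df = df[stations_with_more_than_20_days]
# Station metadata, including the X/Y coordinates
station_data = pd.read_csv('aire-estaciones.csv', index_col="ESTACION")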

There seem to be many stations in the same area.
According to `aire-descripcion-campos.csv`, the X and Y columns are the "Localizacion de la estacion (UTM- Zona 21)", that is, the station location in UTM zone 21.

However, of the stations that have more than 20 days of data, only Curva de Maronas is present in `aire-estaciones.csv`. For 'La Tablada', 'Palacio Legislativo' and 'Bella Vista' I'll take the coordinates from Google.
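
The coverage check described above can be sketched with plain sets. The four station names come from this notebook, but the contents of the second set are a hypothetical stand-in for `station_data.index`, which is not shown here; with the real data one would compare `df.columns` against `station_data.index` instead.

```python
# Stations that passed the ">20 days" filter (names taken from this notebook).
stations_with_data = {"Curva de Maronas", "La Tablada", "Palacio Legislativo", "Bella Vista"}

# Hypothetical stand-in for station_data.index; the real file lists other stations.
stations_with_coords = {"Curva de Maronas", "Hypothetical Station A", "Hypothetical Station B"}

# Stations whose coordinates can be read from the metadata file.
have_coords = sorted(stations_with_data & stations_with_coords)
# Stations whose coordinates have to be looked up by hand.
need_manual_coords = sorted(stations_with_data - stations_with_coords)

print(have_coords)
print(need_manual_coords)
```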

## Converting the UTM coordinates to lat/lon
https://ocefpaf.github.io/python4oceanographers/blog/2013/12/16/utm/

In [3]:
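from pyproj import Proj
# UTM zone 21, southern hemisphere (Montevideo), WGS84 datum
myProj = Proj("+proj=utm +zone=21 +south +ellps=WGS84 +datum=WGS84 +units=m +no_defs")
# inverse=True maps projected (X, Y) in metres back to (lon, lat) in degrees
lon, lat = myProj(station_data['X'].values, station_data['Y'].values, inverse=True)
station_data['lat'] = lat
station_data['lon'] = lon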
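
Newer pyproj releases favour the `Transformer` interface over calling a `Proj` object directly. Below is a minimal sketch of the same UTM-to-lon/lat step. It assumes that EPSG:32721 (WGS 84 / UTM zone 21S) matches the station file, and it uses a made-up easting/northing pair rather than a real station.

```python
from pyproj import Transformer

# EPSG:32721 = WGS 84 / UTM zone 21S; EPSG:4326 = WGS 84 geographic coordinates.
# always_xy=True fixes the axis order: (easting, northing) in, (lon, lat) out.
utm21s_to_wgs84 = Transformer.from_crs("EPSG:32721", "EPSG:4326", always_xy=True)

# Hypothetical point, chosen only so that it falls near Montevideo.
example_easting, example_northing = 574000.0, 6138000.0
example_lon, example_lat = utm21s_to_wgs84.transform(example_easting, example_northing)

print(round(example_lat, 3), round(example_lon, 3))
```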

In [4]:
stations = pd.DataFrame([
    ["Curva de Maroñas", station_data.loc['Curva de Maronas'].lat, station_data.loc['Curva de Maronas'].lon],
    ['La Tablada',-34.8215517,-56.2426536], 
    ['Palacio Legislativo', -34.8912001,-56.1888251], 
    ['Bella Vista',-34.8773524,-56.2056377]
], columns=["Name", "Lat", "Lon"])
stations = stations.set_index("Name")
# H3 cell at resolution 9 (h3-py v3 API; v4 renamed this function to h3.latlng_to_cell)
stations['h3id'] = stations.apply(lambda row: h3.geo_to_h3(row.Lat, row.Lon, 9), axis=1)
h3dict = stations.h3id.to_dict()

In [10]:
finaldf = df.reset_index().melt(id_vars='Fecha')
finaldf['h3id'] = finaldf.variable.apply(lambda s: h3dict[s])
# Convert NO2 from ug/m3 to ppb (1 ppb of NO2 is about 1.88 ug/m3).
# The conversion assumes an ambient pressure of 1 atmosphere and a temperature of 25 degrees Celsius.
# https://www2.dmu.dk/atmosphericenvironment/expost/database/docs/ppm_conversion.pdf
finaldf['NO2'] = finaldf.value/1.88
finaldf['time'] = pd.to_datetime(finaldf.Fecha)
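
The 1.88 factor used above can be checked from the ideal gas law. This is a sketch, not part of the original pipeline: the helper name `ugm3_to_ppb` is hypothetical, and the NO2 molar mass (46.0055 g/mol) is a value supplied here rather than taken from the notebook.

```python
R = 8.314462618  # universal gas constant, J/(mol*K)

def ugm3_to_ppb(value_ugm3, molar_mass_g_mol, temp_c=25.0, pressure_pa=101325.0):
    """Convert a gas concentration from ug/m3 to ppb, assuming an ideal gas."""
    # Molar volume in L/mol at the given temperature and pressure.
    molar_volume_l = R * (temp_c + 273.15) / pressure_pa * 1000.0
    # ug/m3 divided by (g/mol / (L/mol)) gives parts per billion by volume.
    return value_ugm3 * molar_volume_l / molar_mass_g_mol

NO2_MOLAR_MASS = 46.0055  # g/mol, assumed value for NO2

# ug/m3 corresponding to 1 ppb of NO2 at 25 degrees Celsius and 1 atm.
factor = 1.0 / ugm3_to_ppb(1.0, NO2_MOLAR_MASS)
print(round(factor, 2))
```

At other temperatures or pressures the factor changes, which is why the comment in the cell above states the 25 degrees Celsius and 1 atmosphere assumption explicitly.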

In [11]:
finaldf[['h3id', 'time', 'NO2']].to_csv('y_data.csv', index=False)