SEP data workflow: Italian Air Pollution datasets #18
The Italian air pollution dataset for PM10 is available in the FTP area of the project.
French data are available from the European Environment Agency. The source data model is consistent between the Italian and French data, except for the conversion of geospatial coordinates to administrative units.
The French PM10 dataset mentioned in the previous post, whose initial version was uploaded to the FTP server, contains geographic coordinates. It has been enriched with the Municipality value by a script written in Java that calls the specific service/API.
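The enrichment step described above can be sketched as follows. This is a minimal Python illustration, not the Java script used in the project: the record keys (`latitude`, `longitude`, `municipality`) and the `lookup` callable, which stands in for the unnamed service/API, are assumptions made for the example.

```python
from typing import Callable, Dict, Iterable, List, Optional, Tuple

Coordinates = Tuple[float, float]


def enrich_with_municipality(
    records: Iterable[dict],
    lookup: Callable[[float, float], Optional[str]],
) -> List[dict]:
    """Return copies of the records with a 'municipality' key added.

    `lookup` is a hypothetical stand-in for the reverse-geocoding service/API.
    Results are cached so that records sharing coordinates cause one call only.
    """
    cache: Dict[Coordinates, Optional[str]] = {}
    enriched = []
    for record in records:
        key = (record["latitude"], record["longitude"])
        if key not in cache:
            cache[key] = lookup(*key)
        # Copy the record instead of mutating the caller's data.
        enriched.append({**record, "municipality": cache[key]})
    return enriched
```

Passing the lookup in as a parameter keeps the sketch independent of any particular geocoding service and makes it easy to exercise with a stub function.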
Data extraction
Step 1: Open the data source website.
Step 2: Select the DATA panel. Data are organized in a set of tables.
Step 3: Scroll to the requested table, named “Tabella 1 – PM10. Stazioni di monitoraggio: dati e parametri statistici per la valutazione della qualità dell'aria (2019)”.
Step 4: Use the download link at the bottom left, at the end of the table. Downloaded data are in XLS format.
The downloaded file does not comply with the required data structure.
Data transformation
The downloaded file has the following Data Structure:
"Regione","Provincia","Comune","Nome della stazione","Tipo di zona","Tipo di stazione","Giorni di superamento di 50 µg/m3","Valore medio annuo³ [µg/m³]","Rendimento [%]","Rispetta copertura minima","sufficiente distribuzione temporale nell'anno","numero_dati_validi","TIPO DI DATI 4","Codice zona","Nome zona"
Metadata are referenced as a time series; the script uses the variable for the year 2019.
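To illustrate the kind of transformation involved, here is a minimal Python sketch. It is not the attached R script. It assumes the XLS file has first been exported to CSV, and the English target field names are hypothetical, since the required data structure is not spelled out in this thread. Only a few of the source columns are mapped.

```python
import csv
import io

# Source column -> target field name. The target names are assumptions.
COLUMN_MAP = {
    "Regione": "region",
    "Provincia": "province",
    "Comune": "municipality",
    "Giorni di superamento di 50 µg/m3": "days_above_50",
    "Valore medio annuo³ [µg/m³]": "annual_mean",
}
REFERENCE_YEAR = 2019  # year of the downloaded table


def to_number(text):
    """Parse a numeric cell, accepting a decimal comma; None if empty or not numeric."""
    cleaned = text.strip().replace(",", ".")
    try:
        return float(cleaned)
    except ValueError:
        return None


def transform(csv_text):
    """Rename the mapped columns, convert measures to numbers and add the year."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = []
    for source in reader:
        row = {target: source[name] for name, target in COLUMN_MAP.items()}
        row["days_above_50"] = to_number(row["days_above_50"])
        row["annual_mean"] = to_number(row["annual_mean"])
        row["year"] = REFERENCE_YEAR
        rows.append(row)
    return rows
```

Cells that are empty or hold a non-numeric marker come back as `None` rather than raising, so incomplete station records are kept and can be filtered later.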
Data Load
The transformed file has been uploaded into the INTERSTAT GraphDB repository sep-test.
GraphDB allows direct links to resources via a permalink, but the raw data need a little reworking before they can be accessed directly.
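For reference, one possible way to script such a load is through the RDF4J-style REST interface that GraphDB exposes, where a POST to `/repositories/<id>/statements` adds statements to a repository. The sketch below only builds the request and does not send it. The base URL is a placeholder, the Turtle serialization is an assumption, and the thread does not say how the upload was actually performed.

```python
from urllib.request import Request


def build_upload_request(base_url, repository, turtle_data):
    """Build (without sending) a POST request that adds Turtle data to a repository.

    `base_url` is a placeholder for the GraphDB server address.
    """
    url = f"{base_url.rstrip('/')}/repositories/{repository}/statements"
    return Request(
        url,
        data=turtle_data.encode("utf-8"),
        headers={"Content-Type": "text/turtle"},
        method="POST",
    )
```

The request could then be sent with `urllib.request.urlopen(request)`, with authentication added if the server requires it.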
Further data files available
The same procedure can be used to import other data from the data source website:
AMBIENT AIR QUALITY: NITROGEN DIOXIDE NO2
AMBIENT AIR QUALITY: TROPOSPHERIC OZONE O3
AMBIENT AIR QUALITY: PARTICULATE PM2.5
These files have not yet been uploaded to the GraphDB repository.
Transformation script in R language
processing_ETL_AIR.R.txt