Challenge 34 - Regional to Urban Air Quality Mapper #14

Open
RubenRT7 opened this issue Feb 20, 2024 · 13 comments
Labels
ECMWF New feature or request Helmoltz Zentrum Hereon Software Development Software development for Earth Sciences applications

Comments

@RubenRT7
Contributor

RubenRT7 commented Feb 20, 2024

Challenge 34 - Regional to Urban Air Quality Mapper

Stream 3 - Software Development for Earth Sciences applications

Goal

Develop an application that downscales regional-scale, ground-level pollutant concentrations to urban-scale concentrations for various urban areas in Europe. The minimum outcome would be a downscaling application for the CAMS European Air Quality Reanalysis based, for example, on land use regression in combination with ground-based measurements for urban areas in Europe. The results should also be visualized as a set of maps comparing regional and urban concentrations, with additional information on the comparison against measurements. A more ambitious target would include the integration of satellite data products, other datasets or Machine Learning approaches into the downscaling methodology. As a necessary step, the downscaled pollutant concentrations must be evaluated against available ground-based measurements to assess the performance and quality of the urban concentrations versus the regional concentrations.

Mentors and skills

  • Mentors:
    • Helmholtz-Zentrum Hereon: Martin Ramacher, Johannes Bieser
    • ECMWF: Miha Razinger
  • Skills required:
    • Python or any other programming language suitable for spatial data processing (interpolation, projection, re/gridding, …)
    • Experience with any land use regression or other downscaling methods is an advantage but not necessary
    • Plotting/Visualization of maps and statistical indicators
    • (Optional) Experience in developing GUI or API applications

Challenge description

The problem
Regional-scale atmospheric composition products typically have a spatial resolution of many kilometres; the CAMS European Air Quality reanalysis, for example, has a resolution of 10 km x 10 km (1). While this resolution is suitable for regional analyses and forecasting of air quality, the air quality in urban areas is not well represented. This can lead to over- and underestimations of pollutant concentrations, especially in the vicinity of roads, industrial areas or at a city’s boundaries. While it is possible to simulate pollutant concentrations on urban scales with grid resolutions of 100 to 1000 m, such simulations are expensive in terms of time and computational power.

Approaches
Downscaling approaches, which derive meaningful urban-scale results from regional-scale pollutant concentrations, have proven to be an efficient and robust source of air quality information. Many methods exist for downscaling regional-scale to urban-scale concentrations. They range from interpolation in combination with measurements, through land use regression, up to data fusion approaches that combine multiple sources of spatially resolved land-use, socio-economic or measurement data and may also include Machine Learning techniques.
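As a rough illustration of the land use regression idea, the sketch below fits the difference between station measurements and the regional value against a single land-use predictor, then applies that correction to fine-grid cells. It is a minimal sketch under stated assumptions, not a recommended method: the predictor (a road density fraction), the station values and all function names are hypothetical, and a real application would use several predictors and proper validation.

```python
def fit_simple_lur(regional, observed, predictor):
    """Fit (observed - regional) = intercept + slope * predictor by least squares."""
    residuals = [o - r for o, r in zip(observed, regional)]
    n = len(residuals)
    mean_x = sum(predictor) / n
    mean_y = sum(residuals) / n
    sxx = sum((x - mean_x) ** 2 for x in predictor)
    if sxx == 0.0:
        raise ValueError("predictor has no variance; cannot fit a slope")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(predictor, residuals))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def downscale(regional_value, fine_predictor, intercept, slope):
    """Add the fitted correction to every fine-grid cell inside one regional cell."""
    return [regional_value + intercept + slope * x for x in fine_predictor]


# Hypothetical station data (concentrations in ug/m3, road density as a fraction)
regional = [20.0, 20.0, 22.0, 22.0]      # regional value at each station
observed = [23.0, 31.0, 23.0, 35.0]      # measured value at each station
road_density = [0.1, 0.5, 0.0, 0.6]      # assumed land-use predictor at each station

intercept, slope = fit_simple_lur(regional, observed, road_density)

# Three fine-grid cells inside a regional cell with value 20.0
fine = downscale(20.0, [0.0, 0.3, 0.6], intercept, slope)
print(fine)  # approximately [21.0, 27.0, 33.0]
```

Fitting the residual rather than the concentration itself keeps the regional field as the baseline, so the correction only adds sub-grid structure.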

Goals
The goal of this challenge is to create an application that is based on a suitable downscaling technique to achieve urban-scale pollutant concentrations at high resolution (e.g. 100 m x 100 m, ideally finer) for any urban area in Europe. A variety of concepts and methods exist, as well as suitable open-source datasets (CORINE, UrbanAtlas, OSM, etc.) and measurements (AIRBASE, low-cost sensor networks, satellite data), that can be applied to achieve this goal. Other datasets and methods that lead to the same goal are also welcome.

An important step in the development is the evaluation of downscaled pollutant concentrations with available measurements and the comparison with regional concentrations.

We would also be interested in how the project results compare with other CAMS model downscaling and model calibration activities, such as the CAMS "European air quality forecasts optimised at observation sites" dataset (2) and the downscaling activities in the framework of the CAMS National Collaboration Programme (NCP) (3).

A desired outcome would therefore be a GUI or command line application that produces a collection of maps or time series plots of regional and urban concentrations for a selected region or list of cities, in combination with computed evaluation indicators (BIAS, RMSE, MQI/MQO; following the evaluation methodology based on the FAIRMODE recommendations (4)).
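To make the evaluation step concrete, here is a minimal sketch of two of the named indicators, BIAS and RMSE, computed for regional and downscaled values against station measurements. The sign convention (model minus observation) and the sample numbers are assumptions for illustration. MQI/MQO is deliberately left out, because it depends on pollutant-specific measurement uncertainty parameters that should be taken from the FAIRMODE guidance (4).

```python
import math


def bias(model, obs):
    """Mean of (model - observation); negative means the model underestimates."""
    return sum(m - o for m, o in zip(model, obs)) / len(obs)


def rmse(model, obs):
    """Root mean square error between model values and observations."""
    return math.sqrt(sum((m - o) ** 2 for m, o in zip(model, obs)) / len(obs))


# Hypothetical values at three stations (ug/m3)
obs = [30.0, 40.0, 50.0]
regional = [20.0, 30.0, 40.0]   # regional values at the station locations
urban = [28.0, 42.0, 49.0]      # downscaled values at the same locations

for name, model in (("regional", regional), ("urban", urban)):
    print(f"{name}: BIAS={bias(model, obs):.2f}  RMSE={rmse(model, obs):.2f}")
```

Reporting both sets of indicators side by side shows directly whether the downscaling improves on the regional product at the stations.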

Expected outcomes:

  • Downscaling application that produces results in either textual or graphical format
  • Evaluation of the results
  • Visualization

Sources:
(1) https://ads.atmosphere.copernicus.eu/cdsapp#!/dataset/cams-europe-air-quality-reanalyses
(2) https://ads.atmosphere.copernicus.eu/cdsapp#!/dataset/cams-europe-air-quality-forecasts-optimised-at-observation-sites
(3) https://atmosphere.copernicus.eu/cams-national-collaboration-programme (see "CAMS air quality products downscaled at national level")
(4) https://gmd.copernicus.org/articles/16/6029/2023/

@EsperanzaCuartero changed the title from "Challenge 14 - Regional to Urban Air Quality Mapper" to "Challenge 17 - Regional to Urban Air Quality Mapper" on Feb 22, 2024
@EsperanzaCuartero added the Software Development (Software development for Earth Sciences applications) label on Feb 22, 2024
@EsperanzaCuartero changed the title from "Challenge 17 - Regional to Urban Air Quality Mapper" to "Challenge 34 - Regional to Urban Air Quality Mapper" on Feb 23, 2024
@iyui1223

iyui1223 commented Mar 6, 2024

I am thinking about using Google's Air Quality API to first statistically downscale the grid data to a 500 m x 500 m mesh, and then using ground-based observations (e.g. data from car-mounted sensors) to make a more detailed estimate with an algorithm similar to those used for quantitative precipitation estimation. This approach only works for cities that have detailed ground-based observations. Is this acceptable? Or would an approach based on finding a statistical correlation between land cover and pollutant emissions work better?

@martinottopaul

Hi iyui1223, thank you for your questions, this sounds like a promising approach.

Do you know if the Google Air Quality API is free to use for other applications, like the one we are envisioning for this challenge? You might need to check what license the data from the Google Air Quality API comes with.

In general, we are looking for a proposal that works in every European city (as given in the Copernicus Urban Atlas, https://land.copernicus.eu/en/products/urban-atlas), so one could come up with a combined approach between Google Air Quality API, available sensor data, land cover data and emissions or different approaches for areas with and without available measurements.

@iyui1223

iyui1223 commented Mar 6, 2024

Thank you martinottopaul for your advice. As expected, I found that Google's Air Quality API charges by data volume.

On the other hand, I also found that many of the detailed ground-based observations from Google cars are free (both in cost and in terms of use) and readily available in CSV format.
(e.g. for the city of Hamburg: https://repos.hcu-hamburg.de/handle/hcu/893)

I believe I can interpolate this very detailed data in an adequate manner to create gridded data at perhaps 100 m x 100 m for the covered part of Hamburg. It may at least give the possible maximum values per block, as all observations are taken on busy streets. (The data seems trustworthy, but I should read the documentation before deciding.)

I can then use this Hamburg data as verification to train the statistical downscaling model, which may use satellite emission monitoring/land cover data and the CAMS dataset. I'll sleep on this thought, and may try to cook up a realistic summer project out of it or try to come up with another idea.
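One simple way to turn point observations into a 100 m grid is inverse distance weighting (IDW). The sketch below is only an illustration of that technique, not the method proposed in this comment: the function names, the projected metre coordinates and the sample points are assumptions, and IDW does not correct for the fact that the observations are concentrated on busy streets.

```python
import math


def idw(points, x, y, power=2.0):
    """Inverse-distance-weighted value at (x, y).

    points: list of (px, py, value) tuples in projected metre coordinates.
    """
    num = 0.0
    den = 0.0
    for px, py, value in points:
        d = math.hypot(x - px, y - py)
        if d == 0.0:
            return value  # exactly on an observation: use it directly
        w = 1.0 / d ** power
        num += w * value
        den += w
    return num / den


def grid_idw(points, x0, y0, nx, ny, cell=100.0):
    """Evaluate IDW at the centre of each cell of an nx-by-ny grid (rows along y)."""
    return [
        [idw(points, x0 + (i + 0.5) * cell, y0 + (j + 0.5) * cell) for i in range(nx)]
        for j in range(ny)
    ]


# Two hypothetical observations (x in m, y in m, concentration in ug/m3)
points = [(0.0, 0.0, 10.0), (100.0, 0.0, 30.0)]
grid = grid_idw(points, x0=0.0, y0=0.0, nx=2, ny=1)
print(grid)
```

The first cell centre is equally far from both points, so it receives their average; the second is closer to the 30 ug/m3 point and is weighted towards it.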

@RubenRT7 added the ECMWF (New feature or request) label on Mar 7, 2024
@tauheed05

Hey there, just needed a clarification.
When the problem statement says 'application', does it mean a web application, a mobile application, or a desktop application?

@martinottopaul

Hi tauheed05,
it does not need to be a web application or a mobile application. The term "application" refers to anything that can be used as a tool, starting from a bundle of scripts. Of course it would be nice to have an easy-to-use GUI, but this can be considered optional. The challenge is packed with methodological and technical requirements, which are more important to tackle than an easy-to-use or pretty web/mobile/desktop application.
Does this answer your question? Looking forward to your proposal :)

@tauheed05

tauheed05 commented Mar 27, 2024

@martinottopaul Thanks for the clarification!

@OnurSahin20

Hello, we can statistically downscale the air quality data using a trained machine learning model. We need static and dynamic auxiliary variables that are related to the target variable. I have previously downscaled SMAP surface soil moisture, which is strongly correlated with land surface temperature. I assume maps of population, industry, agriculture and energy sources could be used as auxiliary parameters. Are there rasterized maps for Europe (population, distribution of industry, etc.) that are possibly related to air quality?

@martinottopaul

martinottopaul commented Mar 28, 2024

Hi @OnurSahin20
sounds promising and yes: there are many datasets and (urban) land use categories; some of them are gridded and some are not (but you could rasterize them?). With a little digging, you might also find gridded data on economic statistics (although I am not sure about this). I suggest you have a look at the Copernicus Urban Atlas (https://land.copernicus.eu/en/products/urban-atlas) or CORINE land cover (https://land.copernicus.eu/en/products/corine-land-cover) as well as the Global Human Settlement Layer (https://ghsl.jrc.ec.europa.eu/datasets.php). And I think this is just the tip of the iceberg.

@r-maiwald

Hi Martin,
the challenge sounds like a very exciting project!
We started to look at the measurement data from the EEA to estimate a project schedule for our proposal. Looking at single stations and their positions, we were wondering whether it would be necessary to adjust the measurements depending on the sensor height. The CAMS Reanalysis vertical coverage lists "surface" as the lowest level. But especially for cities, I would expect a strong gradient in air pollution with height. Have you dealt with this in the past? 🙂

@martinottopaul

Hi @r-maiwald,
it's a common approach to use measurement data from different heights for ground surface evaluation, as long as they are within a suitable range that can be considered surface. If I remember correctly, the surface layer in CAMS Reanalysis extends up to 50 m. So, if you want to evaluate the surface layer, you should drop sensors that are above 50 m.
And yes, you are right. There are strong vertical gradients in urban areas, depending on sources and on building structures that influence turbulence. However, such gradients are the focus of CFD (computational fluid dynamics) models and are sometimes parametrized to some extent in urban-scale modeling as well. But CFD in particular is computationally very expensive and currently not applicable to entire cities, only to building blocks.
To sum up: for this challenge it is sufficient to use all available measurements in the urban areas that can be assigned to surface concentrations, based on the max. height of the surface layer in CAMS Reanalysis. I hope this helps.
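The height filter described here could look roughly like the sketch below. The 50 m threshold is the value recalled above and should be checked against the CAMS documentation before use. The `height_m` field name and the station records are hypothetical, and dropping stations with unknown height is one possible choice, not a requirement.

```python
MAX_SURFACE_HEIGHT_M = 50.0  # assumption taken from the comment above; verify before use


def surface_stations(stations, max_height_m=MAX_SURFACE_HEIGHT_M):
    """Keep stations whose sensor height is known and within the surface layer."""
    return [
        s for s in stations
        if s["height_m"] is not None and s["height_m"] <= max_height_m
    ]


# Hypothetical station metadata; "height_m" is sensor height above ground in metres
stations = [
    {"id": "A", "height_m": 3.0},
    {"id": "B", "height_m": 62.0},   # above the assumed surface layer: dropped
    {"id": "C", "height_m": None},   # unknown height: dropped in this sketch
    {"id": "D", "height_m": 50.0},
]

kept = surface_stations(stations)
print([s["id"] for s in kept])  # ['A', 'D']
```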

@r-maiwald

Thanks @martinottopaul, then we will include in the proposal only a filter by max. height of the surface layer 👍

@iyui1223

Research Proposal for Challenge 34.docx

Seems like many are interested in this project.
I've already submitted my proposal, but unfortunately, I could not find any collaborator from the same institution (an operational weather modelling team).
If anyone who commented here gets their project accepted, what would you say to teaming up?
At least for my project, an additional member with a solid background in atmospheric transport processes and/or data analysis is more than welcome. I've written my proposal with a minimal goal, so that I can finish it in a few months within my own capacity. But it also includes a few optional data/analysis methods to explore if extra time/hands are available (e.g. including additional proxy data of human activities for statistical model training). I can do basically everything I've written in this proposal, and if you think I could be useful in your project, I am happy to join in.

If you are interested in this invitation, I would be glad if you let me know by replying to this thread, preferably before the chosen proposals are announced, so that I (or the other chosen team) can ask the ECMWF mentors whether this is OK at the first meeting :-)

@martinottopaul

Hi @iyui1223, thanks for reaching out. We are currently evaluating all the proposals and are taking your idea of joining another team into account.
