Proposed Recipes for NA-Cordex #160

Open
bonnland opened this issue Jul 21, 2022 · 0 comments

bonnland commented Jul 21, 2022

Source Dataset

This dataset has the same structure as the Euro-Cordex dataset. The domain is North America instead of Europe.
A pangeo-forge recipe for Euro-Cordex will likely be quite similar to the recipe for NA-Cordex, provided the target Zarr store structure seems appropriate.

  • Link to the website / online documentation for the data: https://na-cordex.org/
  • The file format: NetCDF
  • How are the source files organized? (e.g. one file per day):
    • one file per variable+simulator+gridresolution+bias-correction+scenario combination
  • How are the source files accessed (e.g. FTP):
    • via Globus on NCAR's Glade filesystem, Glade location: /glade/collections/cdg/data/cordex/data/
  • Any special steps required to access the data (e.g. password required):
    • According to @rabernat, Globus 5 allows anonymous access to Glade

Transformation / Alignment / Merging

The target is one Zarr store per variable+gridresolution+bias-correction combination; in other words, data are aggregated over simulators and scenarios. Each scenario has a time range and a rate of carbon emission; for example, one scenario covers the years 2006-2100 under RCP 8.5 emission levels.
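
As a rough sketch of that grouping (not the actual recipe), the snippet below buckets source-file records by variable, grid resolution, and bias correction, leaving simulator and scenario as members within each store. The `SourceFile` type, its field names, and the sample values are illustrative assumptions and are not taken from the NA-Cordex archive.

```python
from collections import defaultdict
from typing import NamedTuple


class SourceFile(NamedTuple):
    # Hypothetical record describing one NetCDF source file.
    variable: str
    simulator: str
    grid: str
    bias_correction: str
    scenario: str


def group_into_stores(files):
    """Map each (variable, grid, bias_correction) key to its member files."""
    stores = defaultdict(list)
    for f in files:
        stores[(f.variable, f.grid, f.bias_correction)].append(f)
    return dict(stores)


# Made-up sample records, for illustration only.
files = [
    SourceFile("tas", "sim-A", "grid-1", "raw", "rcp85"),
    SourceFile("tas", "sim-B", "grid-1", "raw", "rcp85"),
    SourceFile("tas", "sim-A", "grid-1", "raw", "hist"),
    SourceFile("pr", "sim-A", "grid-1", "raw", "rcp85"),
]

stores = group_into_stores(files)
for key, members in stores.items():
    print(key, len(members))
```

Each key would correspond to one target Zarr store, and the files under it would be combined along the simulator and scenario dimensions.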

The Pangeo-based Python code for creating the Zarr stores by hand is in the following GitHub repository:

A single notebook creates these Zarr stores, but it reads parameters from config.yaml to produce one particular subset of stores at a time, because handling every edge case in a single pass proved too difficult. For example, certain Zarr stores required special chunking strategies to accommodate a smaller or larger number of simulation runs. Many such edge cases arose across different dimensions of the dataset.
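
A minimal sketch of that config-driven pattern, assuming a plain dictionary stands in for the parsed config.yaml. The keys (`default_chunks`, `overrides`), the store name, the dimension names, and the chunk sizes are all hypothetical; the real config.yaml layout is not shown in this issue.

```python
# Stand-in for the parsed contents of config.yaml (hypothetical layout).
config = {
    "default_chunks": {"member_id": 4, "time": 1000},
    "overrides": {
        # Hypothetical store with fewer simulation runs than usual.
        "store-few-runs": {"member_id": 2},
    },
}


def chunks_for(store_name, config):
    """Return chunk sizes for a store: the defaults plus any per-store override."""
    chunks = dict(config["default_chunks"])  # copy so the defaults stay untouched
    chunks.update(config["overrides"].get(store_name, {}))
    return chunks


print(chunks_for("store-few-runs", config))
print(chunks_for("some-other-store", config))
```

Keeping the edge cases in configuration rather than in notebook code means a store with an unusual number of runs only needs a new override entry.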

Glade location for NetCDF data:

  • /glade/collections/cdg/data/cordex/data/

Output Dataset

Instructions for accessing the hand-created versions of the target output Zarr stores can be found here:
