Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add the MERRA2 model support #35

Closed
jlmaurer opened this issue Apr 30, 2020 · 17 comments
Closed

Add the MERRA2 model support #35

jlmaurer opened this issue Apr 30, 2020 · 17 comments
Assignees
Labels
enhancement New feature or request not urgent

Comments

@jlmaurer
Copy link
Collaborator

No description provided.

@jlmaurer jlmaurer added enhancement New feature or request help wanted Extra attention is needed labels Apr 30, 2020
@piyushrpt
Copy link
Collaborator

Here is some code from pyaps - might help get started. nc4 might be most relevant.
http://earthdef.caltech.edu/projects/pyaps/repository/entry/merra.py

@dbekaert
Copy link
Owner

dbekaert commented Apr 30, 2020

An example of an actual download is here, although looking to leverage OpenDAP for retrieval of the specific layers and some slice&dice operations, not a complete file download.

MERRA-2 data

there are pressure and model outputs. We should aim to use the model level files.

  • Preferred inst6_3d_ana_Nv: 3d,6-Hourly,Instantaneous,Model-Level,Analysis,Analyzed Meteorological Fields V5.12.4 (M2I6NVANA 5.12.4)
  • inst6_3d_ana_Np: 3d,6-Hourly,Instantaneous,Pressure-Level,Analysis,Analyzed Meteorological Fields V5.12.4 (M2I6NPANA 5.12.4)

OpenDAP for slice and dice of data

We made an initial test with OpenDAP but were in contact with GSFC to resolve some issue. The idea is to contruct a URL which has all the subsetting included. Here is relavant documentation:

Below i capture the example we worked on for MERRA-2 with Ray:

import pydap.client
import pydap.cas.urs
url = 'https://goldsmr5.gesdisc.eosdis.nasa.gov:443/opendap/MERRA2/M2I6NPANA.5.12.4/1980/01/MERRA2_100.inst6_3d_ana_Np.19800101.nc4'
session = pydap.cas.urs.setup_session('username', 'password', check_url=url)
ds = pydap.client.open_url(url, session=session)
ds['T'][:]

The problem experienced when getting the temperature:

        <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
        <html><head>
        <title>302 Found</title>
        </head><body>
        <h1>Found</h1>
        <p>The document has moved <a href="https://urs.earthdata.nasa.gov/oauth/authorize/?scope=uid&app_type=401&client_id=e2WVk8Pw6weeLUKZYOxvTQ&response_type=code&redirect_uri=http%3A%2F%2Fgoldsmr5.gesdisc.eosdis.nasa.gov%2Fdata-redirect&state=aHR0cHM6Ly9nb2xkc21yNS5nZXNkaXNjLmVvc2Rpcy5uYXNhLmdvdi9vcGVuZGFwL01FUlJBMi9NMkk2TlBBTkEuNS4xMi40LzE5ODAvMDEvTUVSUkEyXzEwMC5pbnN0Nl8zZF9hbmFfTnAuMTk4MDAxMDEubmM0LmRvZHM%2FVFswOjE6M11bMDoxOjQxXVswOjE6MzYwXVswOjE6NTc1XQ">here</a>.</p>
        </body></html>

Feedback from the GSFC colleagues was the following link:
https://wiki.earthdata.nasa.gov/display/EL/How+To+Access+Data+With+PyDAP

@dbekaert
Copy link
Owner

Also there is a lot of similarity between MERRA-2 and the GMAO models, both leverage OpenDAP and also have same filename, format and layers. So we should be able to re-use it.

That say we are working closely with GMAO team, so we could try to get that to work with openDAP first, as now we have their support in case of questions.

@dbekaert dbekaert added this to the Basic Model Support milestone Apr 30, 2020
@dbekaert dbekaert removed the help wanted Extra attention is needed label Jun 11, 2020
@dbekaert dbekaert changed the title Add the MERRA2 model and update Add the MERRA2 model support Jun 13, 2020
@leiyangleon
Copy link
Collaborator

@dbekaert could you ask the MERRA-2 team for the updated url? Jeremy and I both encountered the 302 found error, saying the "the document has moved", when using the OpenDAP to retrieve the data.

@dbekaert
Copy link
Owner

@leiyangleon might just be a typo in the documentation. See the parent folder of the URL: https://goldsmr5.gesdisc.eosdis.nasa.gov/opendap/MERRA2/M2I6NPANA.5.12.4/1980/01/

Also, the URL changes depending on the year its being accessed; so one would need to construct it based on the user request. See the above link that @piyushrpt provided or look within the code of TRAIN to get an idea on URLS.

@leiyangleon
Copy link
Collaborator

leiyangleon commented Jun 24, 2020

@dbekaert the above URL link is the same as the one @jlmaurer and I tried. It won't work. I also looked at the URLs that you and Piyush mentioned above (in Pyaps) and it is indeed as you mentioned year-month-date-hr dependent. So I need to get the right format of these. Well I am guessing for example, year=1980, month=01, date=19800101 and hr=00. However, it would be really helpful to have a documentation or updated URL database. Could you direct me to either of these resources? Otherwise I have no idea/experience what I am manipulating and expecting.

@jlmaurer
Copy link
Collaborator Author

Hey @leiyangleon I just went to the URL and copied this link an example file:
https://goldsmr5.gesdisc.eosdis.nasa.gov/opendap/MERRA2/M2I6NPANA.5.12.4/1980/01/MERRA2_100.inst6_3d_ana_Np.19800101.nc4.html. Perhaps let's try to experiment with pydap to get this to work. maybe a permission issue with opendap or something.

@dbekaert
Copy link
Owner

We reversed engineered the link to know what it its.
See here for TRAIN and merra-2: https://github.com/dbekaert/TRAIN/blob/6c93feb95ae95eaf4c8468e89ec0b8325eac946f/matlab/aps_merra_files.m#L149

The baseurl can change depending on the year. See the Pyaps or TRAIN code for it.
Then YYYY/MM/MERRA2_100.inst6_3d_ana_Np.YYYYMMDD.nc4

@dbekaert
Copy link
Owner

@leiyangleon looked a bit in more detail on the internet.
The MERRA-2 data is also on GES DISC so its behind Earthdata log-in.

  1. This means that there needs to be a .netrc file with Earthdata credentials.
  2. In addition the users needs to add the following application to their earthdata profile: GES DISC

I tried it now:

In [7]: import pydap.client 
   ...: import pydap.cas.urs 
   ...: url = 'https://goldsmr5.gesdisc.eosdis.nasa.gov:443/opendap/MERRA2/M2I6NPANA.5.12.4/1980/01/MERRA2_100.inst6_3d_ana_Np.19800101.nc4' 
   ...: session = pydap.cas.urs.setup_session('username', 'password', check_url=url) 
   ...: ds = pydap.client.open_url(url, session=session) 

In [6]: ds['T'][:]                                                                                                                                                            

Out[6]: 
<BaseType with data array([[[[9.9999999e+14, 9.9999999e+14, 9.9999999e+14, ...,
          9.9999999e+14, 9.9999999e+14, 9.9999999e+14],
         [9.9999999e+14, 9.9999999e+14, 9.9999999e+14, ...,
          9.9999999e+14, 9.9999999e+14, 9.9999999e+14],
         [9.9999999e+14, 9.9999999e+14, 9.9999999e+14, ...,
          9.9999999e+14, 9.9999999e+14, 9.9999999e+14],
         ...,

Can you add documentation to the RAIDER README.md as well?
Perhaps have a new file linked from the main page with in difffernt sections the details for each model what users would need to do in addition to just running the code.
e.g. ECMWF, how do users get an account and how to they set up there .cdsapi.
e.g. MERRA2, how to get Earthdata account, the need for making a .netrc file, the need for adding in the application to their profile.
i.e.

@dbekaert
Copy link
Owner

@jlmaurer: When a user submits a call and asks for a region of interest, do you automatically expand the model bbox with extra nodes prior to calling the download code or does @leiyangleon need to account for that himself?

@leiyangleon the example provided above does not slice and dice yet. Next step would be to slice and dice based on the bbox, the utc time etc using opendap; and then just load the complete array.

@leiyangleon
Copy link
Collaborator

@dbekaert Thanks for spotting the problem. That is probably the reason why I cannot get access to the specific data values. Sounds good. Will update the docs then.

@leiyangleon
Copy link
Collaborator

@dbekaert @jlmaurer problem solved after adding the application NASA GESDISC DATA ARCHIVE to my earth data profile as @dbekaert suggested. I am going to write a file linked from the main page to account for this.

@dbekaert thanks for the notes. Yes, we will slice the array based on the bounding box and utc time, etc.

@leiyangleon
Copy link
Collaborator

leiyangleon commented Jul 21, 2020

@dbekaert @jlmaurer I checked the above-mentioned pressure-level (inst6_3d_ana_Np, M2I6NPANA 5.12.4) and model-level (inst6_3d_ana_Nv, M2I6NVANA 5.12.4) data in the MERRA2 docs. Please see attached the two figures extracted from the docs. For the pressure level, surface pressure and Geopotential height are provided, which can be used to calculate the pressure and height; for model level, only pressure thickness is provided for each model layer without height information.

I further checked the docs, which says all of the 72 model layers come with fixed 1-D vector of top pressure of each layer, and the pressure can be calculated with this top pressure plus the layer pressure thickness (DELP, provided in the model level data) all the way from the top layer (index=0) down to the bottom one (index=71). So it seems we can compute the pressure information using this way; however, in the docs, it also says this is good for the current version of data, but may be changed later. So it is recommended to use the 3D pressure data as distributed, which is only available for pressure-level.

Also, even though we can find the pressure for model-level, we still need the 1-D or 3-D height, which is not available for model-level data but does exist in pressure-level. By referring to the docs again, it says all of these model-level data are defined on a terrain-following hybrid sigma-p coordinate or hybrid-sigma, see below
For the MERRA-2 products documented here, all model-level fields are on a hybrid-sigma coordinate, and their vertical location could be obtained from the “ak-bk” relationship as well. But this may change in future GMAO systems. We thus recommend that users rely on the reported 3D pressure distribution, and not use those computed from the “ak” and “bk”.

So I need your input about how we are gonna proceed, e.g.

  • pressure-level or model-level?

  • if model-level, how to compute height using ak and bk for the hybrid-sigma coordinate? There is no information in the docs about this calculation. BTW, I can find ak and bk from the metadata.

Screen Shot 2020-07-21 at 3 34 18 PM

Screen Shot 2020-07-21 at 3 34 23 PM

@dbekaert
Copy link
Owner

@leiyangleon can you replace the model to point to inst3_3d_asm_Nv instead of inst6_3d_ana_Nv.
That will make it the same as for GMAO. Note that this is a 3Hourly model output.

I will verify with the atmospheric scientist to see if this is the best one to use.

@leiyangleon
Copy link
Collaborator

@leiyangleon can you replace the model to point to inst3_3d_asm_Nv instead of inst6_3d_ana_Nv.
That will make it the same as for GMAO. Note that this is a 3Hourly model output.

I will verify with the atmospheric scientist to see if this is the best one to use.

@dbekaert Yes, I can use that instead. That one looks almost identical to the GMAO model.

@leiyangleon
Copy link
Collaborator

I am closing this issue because MERRA-2 mode support has be completed as here.

@jlmaurer
Copy link
Collaborator Author

Closing as addressed by #112

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request not urgent
Projects
None yet
Development

No branches or pull requests

4 participants