Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Possible clarification of GRIB 2D Datasets with regards to CF #152

Closed
lesserwhirls opened this issue Nov 20, 2019 · 4 comments
Closed

Possible clarification of GRIB 2D Datasets with regards to CF #152

lesserwhirls opened this issue Nov 20, 2019 · 4 comments
Labels
enhancement New feature or request

Comments

@lesserwhirls
Copy link
Collaborator

lesserwhirls commented Nov 20, 2019

The 2D GRIB collection datasets currently do something that is not recommended by CF. It's not a requirement, but it is recommended.

As an easy to see example (not a TDS issue, as the code lives in netCDF-Java in the grib module, but easier to see through the TDS), if we look at the cdl representation of the GFS 80km dataset on thredds-test (cdl shown here via cdmremote), we see things like the following:

netcdf grib/NCEP/GFS/CONUS_80km/TwoD {
  dimensions:
    reftime = 124;
    time2 = 41;

  variables:
    double reftime(reftime=124);
      :units = "Hour since 2019-10-20T00:00:00Z";

    double time2(reftime=124, time2=41);
      :units = "Hour since 2019-10-20T00:00:00Z";

  float Pressure_surface(reftime=124, time2=41, y=65, x=93);
      :units = "Pa";
      :coordinates = "reftime time2 y x ";

The main issue here, I think, is with the use of time1 (and other multidimensional coordinate variables), specifically that the variable time1 is two dimensional, and there exists a dimension with the same name. The CF spec says:

We recommend that the name of a multidimensional coordinate variable should not match the name of any of its dimensions because that precludes supplying a coordinate variable for the dimension. This practice also avoids potential bugs in applications that determine coordinate variables by only checking for a name match between a dimension and a variable and not checking that the variable is one dimensional.

(see http://cfconventions.org/Data/cf-conventions/cf-conventions-1.7/cf-conventions.html#coordinate-system).

Strictly speaking, what we do here is CF compliant, but it can cause confusion for clients that do not check to see if the variable of a matching variable/dimension name pair meet the requirement of being a coordinate variable (that is, the variable is 1D), and so not recommended.

One thing we could do to help clarify the situation is to simply rename the time* dimensions to something like time*_dim, or rename the time* variable to something like valid_time*.

As a side note, if we did change the name of either the dimensions or the variables, it might be nice if we could introduce a new 1D variable with the same dimension name that is something simply like "record_number". That's ugly, but what it would do would allow OPeNDAP to stop exposing a 2D Map for the variables in our "Full" GRIB collections, which is very much not allowed by the OPeNDAP spec (The Maps for a Grid must be 1D). I'll open an issue over on https://github.com/Unidata/tds to capture that bug, as that's certainly a bug related to the CDM -> DAP2 data model translation.

@lesserwhirls lesserwhirls added the enhancement New feature or request label Nov 20, 2019
@dopplershift
Copy link
Member

I think fixing this would also make xarray happier.

@ethanrd
Copy link
Member

ethanrd commented Nov 20, 2019

Should definitely change so the var/dim names don't match on a 2D variable.

I don't particularly like the idea of adding a place holder 1D coordinate variable that has no real meaning. Are they forecast hours?

I think you are right about this being a bug in the DAP2 code. The DAP2 spec in Section 3.3.3 Grids refers to "map vectors" and says they "MUST have the same number of elements and the same name as the corresponding dimension of the Array". That seems to imply 1D only. I'm hesitating because the DAP4 spec says "[e]ach map variable MUST have a rank no more than that of the array" but I think that was a DAP4 addition to try and deal with 2D auxiliary coordinate variables.

@lesserwhirls
Copy link
Collaborator Author

The new place holder 1D coordinate variable would have no real meaning, so yeah, I think we leave it out. The TDS DAP code would might need to create it as a Map though? I can get the dimension/variable name issue addressed, and then we can see if our DAP2 code will need fixed.

@JohnLCaron
Copy link
Collaborator

JohnLCaron commented Dec 3, 2019

Theres no reason for them to use the same name, they are generated independently and dont track each other. validtime would be a good name, since thats what it is.

Not sure why DAP2 has them as map vectors. Probably trying to shoehorn the more general notion of coordinates into the simpler DAP2 coordinate model. Should be removed AFAIK.

Often the time coordinate can be "orthogonalized" into forecast time x offset hour. You could imagine than most (all?) 2D coordinate should be functions, rather than arrays. Interestingly enough, they sort of have the same representations, namely "validtime(forecast, offset)", though one is an index, the other a value. (remember "Index spaces considered harmful" ;^)

lesserwhirls added a commit to lesserwhirls/netcdf-java that referenced this issue May 4, 2020
lesserwhirls added a commit to lesserwhirls/netcdf-java that referenced this issue May 5, 2020
lesserwhirls added a commit to lesserwhirls/netcdf-java that referenced this issue May 5, 2020
lesserwhirls added a commit to lesserwhirls/netcdf-java that referenced this issue May 5, 2020
lesserwhirls added a commit to lesserwhirls/netcdf-java that referenced this issue May 5, 2020
lesserwhirls added a commit to lesserwhirls/netcdf-java that referenced this issue May 6, 2020
lesserwhirls added a commit to lesserwhirls/netcdf-java that referenced this issue Jun 16, 2020
JohnLCaron added a commit to JohnLCaron/netcdf-java that referenced this issue Jul 25, 2020
Should be backported to 5, although it appears? it (un)luckily doesnt affect grid identification.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

4 participants