Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

nwdFracLut in Emon #535

Closed
wachsylon opened this issue Aug 15, 2019 · 6 comments · Fixed by #567
Closed

nwdFracLut in Emon #535

wachsylon opened this issue Aug 15, 2019 · 6 comments · Fixed by #567
Projects

Comments

@wachsylon
Copy link
Collaborator

wachsylon commented Aug 15, 2019

Hi,
when I try to produce nwdFracLut from Emon, I receive

C Traceback:
! In function: 

!!!!!!!!!!!!!!!!!!!!!!!!!
!
! Error: You are defining variable 'nwdFracLut' (table Emon)  with 4 dimensions, when it should have 5
!
!!!!!!!!!!!!!!!!!!!!!!!!!

after I call cmor_variable.
I skip the registration of axis typenwd since it is a singleton.

Are you able to produce nwdFracLut or reproduce the error?

Best regards,
Fabi

@wachsylon
Copy link
Collaborator Author

Ok, I made some progresses with that variable. There seems to be a problem with a standard_name check. If I change in CMIP6_coordinate.json

        "typenwd": {
            "standard_name": "area_type", 

into

        "typenwd": {
            "standard_name": "type", 

I am able to create a file of nwdFracLut. However, in the header, type used time as dimension:

	char type(time) ;
		type:long_name = "Non-Woody Vegetation area type" ;
		type:standard_name = "type" ;

Please try to reconstruct it, I made a test file in #539

@taylor13
Copy link
Collaborator

@wachsylon This looks like a bug in CMOR. Please let us know if somehow you've managed to fix the problem. I don't think changing the standard_name to "type" is legal.

@mauzey1
Copy link
Collaborator

mauzey1 commented Oct 22, 2019

I have identified three issues occurring with nwdFracLut.

There is an issue with the standard_name check in the section below.

cmor/Src/cmor_variables.c

Lines 1477 to 1541 in e0eac64

/* -------------------------------------------------------------------- */
/* ok it could be a dummy but we need to check if the user */
/* already defined this dummy dimension or notd */
/* -------------------------------------------------------------------- */
l = -1;
for (k = 0; k < olndims; k++) {
if (cmor_has_axis_attribute(laxes_ids[k],
VARIABLE_ATT_STANDARDNAME) ==
0) {
cmor_get_axis_attribute(laxes_ids[k],
VARIABLE_ATT_STANDARDNAME, 'c',
&msg);
} else {
strcpy(msg, "nope");
}
if (strcmp(msg,
cmor_tables[CMOR_TABLE].
axes[refvar.dimensions[i]].standard_name)
== 0) {
/* -------------------------------------------------------------------- */
/* ok user did define this one on its own */
/* -------------------------------------------------------------------- */
l = k;
break;
}
}
/* -------------------------------------------------------------------- */
/* ok it is a singleton dimension */
/* -------------------------------------------------------------------- */
if (l == -1) {
j -= 1;
/* -------------------------------------------------------------------- */
/* ok then we create a dummy axis that we will add at the end */
/* of the axes */
/* -------------------------------------------------------------------- */
cmor_axis_def_t *pAxis;
pAxis = &cmor_tables[CMOR_TABLE].axes[refvar.dimensions[i]];
if (pAxis->bounds_value[0] != 1.e20) {
cmor_axis(&k,
pAxis->id,
pAxis->units,
1,
&pAxis->value,
'd', &pAxis->bounds_value[0], 2, "");
} else {
cmor_axis(&k,
pAxis->id,
pAxis->units,
1, &pAxis->value, 'd', NULL, 0, "");
}
laxes_ids[olndims + lndims] = k;
lndims += 1;
}
}

Since both the landUse and typenwd dimension in nwdFracLut have the standard name of area_type, it will mistake one for the other and prevent the "dummy" axis from being created for the singleton. I'm not sure what was the intention with this check only looking at the standard name for the dimensions. We should change it to also check the id attribute of the dimensions.

There is also an issue with the NetCDF dimension strlen for the string values of both landUse and typenwd. The strlen dimension gets created for landUse by finding the length of the longest string in that dimension.

cmor/Src/cmor.c

Lines 3545 to 3556 in e0eac64

l = 0;
for (j = 0; j < pAxis->length; j++) {
strncpy(msg, pAxis->cvalues[j], CMOR_MAX_STRING);
k = strlen(msg);
if (k > l) {
l = k;
}
}
/* -------------------------------------------------------------------- */
/* ok so now i can create the dummy dim strlen */
/* -------------------------------------------------------------------- */
ierr = nc_def_dim(ncid, "strlen", l, &tmp_dims[1]);

However, in a section that creates a singleton dimension, another strlen dimension will be created for typenwd.

cmor/Src/cmor.c

Lines 4236 to 4245 in e0eac64

if (cmor_tables[cmor_axes[j].ref_table_id].axes
[cmor_axes[j].ref_axis_id].type == 'c') {
ierr =
nc_def_dim(ncid, "strlen",
strlen(cmor_tables[cmor_axes[j].ref_table_id].axes
[cmor_axes[j].ref_axis_id].cvalue), &k);
ierr =
nc_def_var(ncid, cmor_axes[j].id, NC_CHAR, 1, &k,
&nc_singletons[i]);
} else {

When the strlen dimension is already defined, defining another dimension of the same name doesn't work. It does not overwrite the original value of the dimension. This will cause issues if the only landUse values are "pastures", "crops", and "urban" since their string lengths are shorter than that of typenwd's "herbaceous_vegetation". I am trying to find a way to change the value of the dimension if the new value is greater than the old value.

The third issue seems to be due to how the values for landUse are passed to cmor_axis in the test.

    char *landUse[] = {"primary_and_secondary_landpastures"
                       "                  crops                     urban                     "};
    ierr |= cmor_axis(&axes_ids[3], "landUse", "", 4, landUse, 'c', 
                NULL, strlen("primary_and_secondary_land"), "");

The test stored garbled data for the landUse axis in the NetCDF file. I thought the issue was from the array, array length, and string length being mismatched. However, when I changed the array to like the section below I got the same results.

    char *landUse[] = {"primary_and_secondary_land", "pastures", "crops", "urban"};
    ierr |= cmor_axis(&axes_ids[3], "landUse", "", 4, landUse, 'c', 
                NULL, strlen("primary_and_secondary_land"), "");

What worked was creating a 2D char array and then copying the values into the array.

    char landUse[4][CMOR_MAX_STRING];
    strncpy(landUse[0], "primary_and_secondary_land", CMOR_MAX_STRING);
    strncpy(landUse[1], "pastures", CMOR_MAX_STRING);
    strncpy(landUse[2], "crops", CMOR_MAX_STRING);
    strncpy(landUse[3], "urban", CMOR_MAX_STRING);
    ierr |= cmor_axis(&axes_ids[3], "landUse", "", 4, landUse, 'c', 
                NULL, CMOR_MAX_STRING, "");

@mauzey1 mauzey1 added this to To do in 3.6.0 via automation Nov 13, 2019
@taylor13
Copy link
Collaborator

Wow! An impressive analysis of the problem. Let me know if I can help.

@mauzey1
Copy link
Collaborator

mauzey1 commented Dec 17, 2019

Getting back to this issue, I have some questions about the strlen dimension for the string axes and singletons. Currently, CMOR will make a NetCDF dimension called strlen that is set to the length of the string with the greatest length in a string value axis, such as "primary_and_secondary_land" in landUse. However, CMOR will also try to create a strlen dimension for a string singleton value.

If a NetCDF dimension already exists, then an error will occur if you try to create another dimension with the same name. A dimension can be renamed but not be resized.

Here are some solutions:

  • Use CMOR_MAX_STRING as the value for strlen when the dimension needs to be defined. This would be the easiest solution.
  • Gather all string type dimensions and find the greatest string length. That should be done prior to defining all of the dimensions for the variable. This would likely require making a new function that is called prior to calling cmor_define_dimensions and create_singleton_dimensions.
  • Create separate string length dimensions for each string value dimension. Ex. create landUse_strlen and typenwd_strlen for their respective dimensions.

@wachsylon @taylor13 @durack1 What are your thoughts on this problem?

@taylor13
Copy link
Collaborator

nwdFracLut is the only variable where the problem arises. All other CMIP6 variables are a function of at most a single dimension with associated "label" coordinates.

The first (easiest) proposed solution above is aesthetically not very satisfying because strings would have lots of extra "spaces". The second solution looks pretty difficult, but would affect only nwdFracLut and no other variables. The third option would be inconsistent with specifications in goo.gl/neswPr , which says the dimension should be named "strlen".

Another option would be: For cases (only one found in CMIP6) where two string dimensions are defined, name for the 2nd one its string length as strlen2. For nwdFracLut, the dimensions listed in the CMOR table are (in FORTRAN ordering): longitude latitude landUse time typenwd. In this case landUse would have a string length of "strlen", but typenwd would have a string length of "strlen2"

Unless this last option is easier to implement than your second option, I would go with your 2nd option. Happy to discuss further.

@mauzey1 mauzey1 moved this from To do to Done in 3.6.0 Dec 19, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
No open projects
3.6.0
  
Done
Development

Successfully merging a pull request may close this issue.

3 participants