Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Recursive directory output issue / error vs warning #540

Closed
sashakames opened this issue Sep 6, 2019 · 15 comments · Fixed by #583
Closed

Recursive directory output issue / error vs warning #540

sashakames opened this issue Sep 6, 2019 · 15 comments · Fixed by #583
Projects

Comments

@sashakames
Copy link
Collaborator

sashakames commented Sep 6, 2019

I've ran PrePARE to recurse a subdirectory. When this happens, some messages are ambiguous regarding which file they pertain to. These for the most part must be warning messages but its isn't entirely clear. One such message is the following, and I ran again after listing the files and providing each via a sh loop:

ESC[2;31;47mC Traceback:
! In function: _CV_checkParentExpID
! ESC[0m

ESC[1;31;47m!!!!!!!!!!!!!!!!!!!!!!!!!
!
! Error: Your input attribute "parent_mip_era" is not defined
! properly for historical
! Please refer to the CMIP6 documentations.
!
!
!!!!!!!!!!!!!!!!!!!!!!!!!ESC[0m

ESC[95m
Number of files scanned: 1ESC[0mESC[1;32m
Number of file with error(s): 0ESC[0m

The message indicates an error, but the summary info at the bottom says otherwise. Is the message really a warning, or should the bottom count it as an Error?

@mauzey1
Copy link
Collaborator

mauzey1 commented Sep 6, 2019

@sashakames
The error number should get incremented at this line in PrePARE.

self.errors += cmip6_cv.check_parentExpID(table)

However, the function that should be returning the error count is returning 0 for parent_mip_era.

cmor/Src/cmor_CV.c

Lines 1326 to 1351 in 829a78a

// parent_mip_era
if (cmor_has_cur_dataset_attribute(PARENT_MIP_ERA) != 0) {
snprintf(msg, CMOR_MAX_STRING,
"Your input attribute \"%s\" is not defined \n! "
"properly for %s \n! "
"Please refer to the CMIP6 documentations.\n! ",
PARENT_MIP_ERA, szExperiment_ID);
cmor_handle_error(msg, CMOR_CRITICAL);
} else {
cmor_get_cur_dataset_attribute(PARENT_MIP_ERA, szValue);
if (strcmp(CMIP6, szValue) != 0) {
snprintf(msg, CMOR_MAX_STRING,
"Your input attribute \"%s\" defined as \"%s\" "
"will be replaced with \n! "
"\"%s\" as defined in your Control Vocabulary file.\n! ",
PARENT_MIP_ERA, szValue, CMIP6);
cmor_handle_error(msg, CMOR_WARNING);
cmor_set_cur_dataset_attribute_internal(PARENT_MIP_ERA,
szValue, 1);
}
}
}
}
cmor_pop_traceback();
return (0);

@durack1 @taylor13 @doutriaux1 Should cmor_CV_checkParentExpID be returning 1 if it encounters an error with parent_mip_era?

@mauzey1 mauzey1 added this to To do in 3.6.0 via automation Oct 8, 2019
@taylor13
Copy link
Collaborator

The checks performed in function cmor_CV_checkParentExpID should be as described in section 3 of https://goo.gl/NmuENr:

  1. Check consistency of experiment_id / parent_experiment_id found in file against the pair specified in CMIP6_CV.json.

  2. If parent_experiment_id is “no parent”, then the following global attributes are not required, but if in the file, must be set as follows:

branch_method – “no parent”
branch_time_in_child – a double precision number
branch_time_in_parent –0.0, stored as a double precision number
parent_activity_id – “no parent”
parent_experiment_id – “no parent”
parent_mip_era – “no parent”
parent_source_id – “no parent”
parent_time_units – “no parent”
parent_variant_label – “no parent”

  1. If parent_experiment_id is not “no parent”, then check that in the file
  • all of the global_attributes listed in 2 above are defined
  • parent_mip_era is "CMIP6"
  • parent_activity_id is consistent with parent_experiment_id (as specified by the activities associated with each experiment_id in CMIP6_CV.json)

@mauzey1 mauzey1 moved this from To do to In progress in 3.6.0 Feb 21, 2020
@mauzey1
Copy link
Collaborator

mauzey1 commented Feb 24, 2020

@sashakames Could you provide the NetCDF file(s) that you used when you encountered this issue?

@mauzey1
Copy link
Collaborator

mauzey1 commented Feb 27, 2020

@sashakames @durack1 @taylor13
The reason why the errors are not being counted is due to the function cmor_CV_checkParentExpID not returning an non-zero error code, as mentioned in #540 (comment).

I have tried adding the return value of -1 to make the returned error code non-zero.

    cmor_handle_error(msg, CMOR_CRITICAL); 
    cmor_pop_traceback();
    return (-1);

This allows cmor_CV_checkParentExpID to return an error when PrePARE processes a file with parent attributes undefined. However, this has caused an issue in one of the Python tests. test_python_CMIP6_CV_parenttimeunits.py will fail due to an issue in cmor.write.

======================================================================
ERROR: testCMIP6 (__main__.TestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "Test/test_python_CMIP6_CV_parenttimeunits.py", line 63, in testCMIP6
    cmor.write(ivar, data[i:i])
  File "/home/mauzey1/anaconda3/envs/cmor_dev/lib/python3.7/site-packages/CMOR-3.5.0-py3.7-linux-x86_64.egg/cmor/pywrapper.py", line 850, in write
    time_vals, time_bnds, store_with)
_cmor.CMORError: Problem with 'cmor.write'. Please check the logfile (if defined).

----------------------------------------------------------------------
Ran 1 test in 0.244s

FAILED (errors=1)

log output

! In function: _CV_checkParentExpID
! �[0m

�[1;31;47m!!!!!!!!!!!!!!!!!!!!!!!!!
!
! Error: Your  "parent_time_units" set to "days since 1980-01" is invalid. 
! Please refer to the CMIP6 documentations.
! 
!
!!!!!!!!!!!!!!!!!!!!!!!!!�[0m


�[2;31;47mC Traceback:
! In function: �[0m

�[1;31;47m!!!!!!!!!!!!!!!!!!!!!!!!!
!
! Error: Cannot continue until you fix the errors listed above: -1
!
!!!!!!!!!!!!!!!!!!!!!!!!!�[0m

cmor.write is failing as when errors are detected in the global attributes, causing the function to return an error before going any further.

cmor/Src/cmor.c

Lines 4504 to 4511 in 01ad655

if (ierr != 0) {
sprintf(ctmp,
"Cannot continue until you fix the errors listed above: %d",
ierr);
cmor_handle_error_var(ctmp, CMOR_CRITICAL, var_id);
cmor_pop_traceback();
return (1);
}

All of the other tests appear to be passing. If cmor.write returning an error in above test is the expected behavior, then I will change the test to expect this behavior.

@taylor13
Copy link
Collaborator

I'm pretty sure that "days since 1980-01" should probably trigger a fatal error because it's mssing the "day". A correct unit would be "days since 1980-1-1" or "days since 1980-01-01". If I'm right, then the test traps the error as it should. I wonder, however, if there are subsequent checks performed under this test that now get skipped because it already has encountered a fatal error?

@mauzey1
Copy link
Collaborator

mauzey1 commented Feb 27, 2020

@taylor13
The output of the test before the changes shows one error and one warning after closing CMOR.

C Traceback:
! In function: _CV_checkParentExpID
! 

!!!!!!!!!!!!!!!!!!!!!!!!!
!
! Error: Your  "parent_time_units" set to "days since 1980-01" is invalid. 
! Please refer to the CMIP6 documentations.
! 
!
!!!!!!!!!!!!!!!!!!!!!!!!!


C Traceback:
In function: cmor_close_variable
! 

!!!!!!!!!!!!!!!!!!!!!!!!!
!
! Warning: while closing variable 0 (masso, table Omon)
! we noticed you wrote 0 time steps for the variable,
! but its time axis 0 (time) has 5 time steps
!
!!!!!!!!!!!!!!!!!!!!!!!!!

! ------
! CMOR is now closed.
! ------
! During execution we encountered:
!   1 Warning(s)
!   1 Error(s)
! ------
! Please review them.
! ------
! 

@taylor13
Copy link
Collaborator

I think the "Warning" is not the point of this test (since the test code's name is test_python_CMIP6_CV_parenttimeunits.py, which suggests it's meant to uncover the error it flags. So I think the new behavior is in fact what we expect, so go ahead and make any changes necessary.
Thanks!

@mauzey1
Copy link
Collaborator

mauzey1 commented Feb 28, 2020

Another issue that I am finding in the code is the use of 1 or -1 as an error code returned by functions. I used -1 for the error code returned by cmor_CV_checkParentExpID. PrePARE will increment an error count using either a 1 or the returned error from a function like cmip6_cv.check_parentExpID. This can lead to situations where the error code of the cmip6_cv functions gets clobbered by the error checking in PrePARE.

For example, a file thad had both the attribute parent_variant_label being undefined and the attribute standard_name being incorrect for the file's variable was not recognized as having errors when processed through PrePARE.

I remedied this in PrePARE by using a bitwise OR instead of an add when combining the errors in the file. Since PrePARE doesn't output the number of errors per file, we can just replace it with a boolean. However, I think we should make the error codes in CMOR be consistent.

@taylor13
Copy link
Collaborator

Before proceeding, perhaps we should get some overview from Charles and Denis on what the strategy has been for error handling. I vaguely remember some distinction between -1 and +1.

@mauzey1
Copy link
Collaborator

mauzey1 commented Mar 4, 2020

@taylor13
The changes in #583 are meant to resolve the problem that @sashakames encountered where PrePARE didn't increment its count of files with errors when processing a file with the missing attribute parent_mip_era.

These changes include:

  • Making errors detected in cmor_CV_checkParentExpID return a value of -1. There were several error checks that didn't have a return value.
  • Increasing the error count in PrePARE by checking if the error code returned by the cmip6_cv functions are non-zero rather than adding the error code to the count. This prevents different error codes from clobbering each other causing erroneous results.
  • Modify the test test_python_CMIP6_CV_parenttimeunits.py to handle the exception thrown when cmor.write stops due to an error detected in cmor_CV_checkParentExpID.

@durack1
Copy link
Contributor

durack1 commented Mar 4, 2020

@doutriaux1 @dnadeau4 can you guys chime in here RE: error codes (-1, +1)?

@doutriaux1
Copy link
Collaborator

@durack1 I have no recollection of CMOR2.0 having positive and negative value errors. I believe it was all one sign back then.

@mauzey1 mauzey1 moved this from In progress to Done in 3.6.0 Mar 4, 2020
@dnadeau4
Copy link
Collaborator

dnadeau4 commented Mar 4, 2020

I used +1 to count how many errors occurred, CMOR2 used to exit at the first error, CMOR3 will keep going like a compiler and accumulate the number of errors.

This usually happens when there is some mistake made in the user metadata and cannot be match with the rules in the CV.

@taylor13
Copy link
Collaborator

taylor13 commented Mar 4, 2020

Seems like it may be o.k. making errors positive.

I would note that some functions, as coded in FORTRAN (at least in early versions of CMOR), return a positive integer, which serves as an index (or handle) pointing to a variable or a dimension or a zfactor, but if an error is encountered, a negative integer is returned. The CMOR documentation explains this.

@mauzey1
Copy link
Collaborator

mauzey1 commented Mar 4, 2020

There are some functions in the Fortran API that return a negative error such as cmor_axis. The Fortran code will invert the error returned by the C function.

ierr = cmor_axis_cff_double(axis_id,trim(table_entry)//char(0),trim(units)//char(0),l,&
coord_vals(1),bnds(1),2,interv)
deallocate(bnds)
if (ierr.eq.0) then
ierr = axis_id
else
ierr = -ierr
endif

Looking at the C code for cmor_axis shows only one error returning 1.

cmor/Src/cmor_axes.c

Lines 1639 to 1643 in 87979a0

if (cmor_naxes == CMOR_MAX_AXES - 1) {
cmor_handle_error("Too many axes defined", CMOR_CRITICAL);
cmor_pop_traceback();
return (1);
}

All the other return values are 0 for cmor_axis. There are other CMOR_CRITICAL errors in that function that don't return any value.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
No open projects
3.6.0
  
Done
Development

Successfully merging a pull request may close this issue.

6 participants