CMOR segfaults with mip cmor tables and CMIP6Plus CV.json #718
Comments
We've worked out what is wrong in the minimal example being used here (credit to @piotr-florek-mohc for debugging this).
The CMIP6Plus_CV.json contains
Once this is corrected then CMOR behaves itself again. One option might be to prescan every JSON file read and fail if a
The other fixes to get the above to work are: correction to mip table id in the |
Just noting the associated issue PCMDI/mip-cmor-tables#27 |
@matthew-mizielinski @durack1 Since the issues occurring with the example can be fixed within the MIP tables, should we close this issue? |
@mauzey1 thanks for circling back. As it's a segfault, it would be great to catch this issue and ensure that such a poorly defined use case doesn't segfault. We also wanted to make sure that the structure of the JSON input files made sense, and that CMOR could then read this more sensible format, in the case of a list type vs a space-separated string type. So it would be good to make some tweaks to CMOR to deal with the segfault, and separately deal with the JSON type formats as well. |
How should we handle |
In a situation where poorly constructed tables/JSON inputs are provided, it would be great to note the problem and throw a traceback, rather than a segfault. If CMOR doesn't know about it, then we shouldn't have to deal with it. @matthew-mizielinski may have a subtly different take which I would like to hear |
The output from CMOR when it segfaults is pretty unhelpful, and it is pretty easy to sink a lot of time trying to work out where the issue is. For example, I think the example I attached implies that the tables are at fault, in that the failing call is
I don't think there is a valid situation where a
Note that I don't think that this is a critical thing to fix and release immediately. We'll get something into any CVs/MIP tables code to prevent nulls appearing, but given we've identified this issue it would be good to cover it off. |
I was looking at where the segfault was happening in CMOR and found it within the
Lines 74 to 89 in 047fd2c
However, I noticed in this function that there is a case where it will ignore a null value found inside a JSON object.
Lines 33 to 35 in 047fd2c
I'm guessing there have been cases where null was found in tables, but just wasn't anticipated to be found inside arrays.
Either way, I agree that there shouldn't be a valid case for having |
@mauzey1 perfect, I think such a check would make the software more robust, particularly as we're anticipating increased use over time with people text editing json files in the worst cases - if it has a problem reading the file, it should point out the issue to whatever granularity is relatively easily possible, and exit, throwing the error and alerting a user to where to look to fix it |
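The kind of pre-scan discussed above can be sketched in Python. This is only an illustration under stated assumptions, not CMOR's actual check (CMOR reads its tables in C): the function names `find_nulls` and `check_no_nulls` and the sample keys are hypothetical. It walks a parsed JSON document and reports the location of every null, so that a caller can fail with a readable message before handing the file to CMOR.

```python
import json


def find_nulls(node, path="$"):
    """Yield a path string for every null in a parsed JSON document."""
    if node is None:
        yield path
    elif isinstance(node, dict):
        for key, value in node.items():
            yield from find_nulls(value, f"{path}.{key}")
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from find_nulls(value, f"{path}[{index}]")


def check_no_nulls(text, source="<string>"):
    """Raise ValueError naming each null location; 'source' is only used in the message."""
    nulls = list(find_nulls(json.loads(text)))
    if nulls:
        raise ValueError(f"{source}: null value(s) found at: " + ", ".join(nulls))


# Hypothetical fragment: the key names are made up for illustration.
sample = '{"CV": {"example_key": ["a", null], "other": null}}'

if __name__ == "__main__":
    try:
        check_no_nulls(sample, source="example.json")
    except ValueError as err:
        print(err)
```

Reporting the path of each null (inside objects and inside arrays alike) is what gives the user somewhere to look when fixing a hand-edited file.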
The additional tweak is to the format of the
CMIP6
    "tas": {
        "frequency": "mon",
        "modeling_realm": "atmos",
        "standard_name": "air_temperature",
        "units": "K",
        "cell_methods": "area: time: mean",
        "cell_measures": "area: areacella",
        "long_name": "Near-Surface Air Temperature",
        "comment": "near-surface (usually, 2 meter) air temperature",
        "dimensions": "longitude latitude time height2m",
        "out_name": "tas",
        "type": "real",
        "positive": "",
        "valid_min": "",
        "valid_max": "",
        "ok_min_mean_abs": "",
        "ok_max_mean_abs": ""
    },
https://github.com/PCMDI/cmip6-cmor-tables/blob/main/Tables/CMIP6_Amon.json#L1151-L1168
CMIP6Plus
    "tas": {
        "cell_measures": "area: areacella",
        "cell_methods": "area: time: mean",
        "comment": "near-surface (usually, 2 meter) air temperature",
        "dimensions": [
            "longitude",
            "latitude",
            "time",
            "height2m"
        ],
        "frequency": "mon",
        "long_name": "Near-Surface Air Temperature",
        "modeling_realm": [
            "atmos"
        ],
        "ok_max_mean_abs": 295.0,
        "ok_min_mean_abs": 255.0,
        "out_name": "tas",
        "positive": "",
        "standard_name": "air_temperature",
        "type": "real",
        "units": "K",
        "valid_max": 350.0,
        "valid_min": 170.0
    },
@wolfiex ping |
I vaguely recall that a json list type doesn't care about order, but for dimensions and modeling_realm we definitely do care. Is this a problem? |
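On the ordering question: the JSON specification (RFC 8259) defines an array as an ordered sequence of values; it is object members whose order is not significant. Python's `json` module parses an array into a `list` in source order. A minimal sketch of reading either table style into the same ordered list follows. The helper names `as_ordered_list` and `as_space_separated` are hypothetical, and this is not how CMOR itself parses tables.

```python
import json


def as_ordered_list(value):
    """Accept a space-separated string or a JSON list of strings; keep source order."""
    if isinstance(value, str):
        return value.split()
    if isinstance(value, list) and all(isinstance(v, str) for v in value):
        return list(value)
    raise TypeError(f"expected a string or a list of strings, got {value!r}")


def as_space_separated(value):
    """Convert either form to the space-separated string form used in the CMIP6 tables."""
    return " ".join(as_ordered_list(value))


# The two "dimensions" styles quoted in this thread.
cmip6_style = json.loads('{"dimensions": "longitude latitude time height2m"}')
cmip6plus_style = json.loads(
    '{"dimensions": ["longitude", "latitude", "time", "height2m"]}'
)

if __name__ == "__main__":
    print(as_ordered_list(cmip6_style["dimensions"]))
    print(as_space_separated(cmip6plus_style["dimensions"]))
```

Both forms normalise to the same sequence, so the order of `dimensions` and `modeling_realm` survives the switch to the list type as long as the reader does not sort or deduplicate the entries.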
Getting back to this issue, I have so far made some changes for finding null values in JSON tables and throwing an error message when found. Are there other invalid values that we should look for in tables? When I run CMOR with the mip-cmor-tables, I get a lot of warning messages for the attributes |
Some of these might be best as their own issues, but the small tweaks we've discussed are;
|
In the attached zip there is an example where using the CMIP6Plus CVs or mip cmor tables causes a segfault in CMOR in the call to cmor.load_table (python api).
Note that I've had to amend the dimension field in the mip-cmor-tables to give something that CMOR v3.7.2 can read.
I think there must be an issue in the CVs file, but the segfault prevents me testing this further
minimal_example.zip