Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Checking out table versions (e.g. cmip6-cmor-tables) alongside CMOR install #676

Open
durack1 opened this issue Oct 6, 2022 · 11 comments
Open
Assignees
Milestone

Comments

@durack1
Copy link
Contributor

durack1 commented Oct 6, 2022

No description provided.

@durack1
Copy link
Contributor Author

durack1 commented Oct 6, 2022

Conversation with @sashakames noted that in order for PrePARE to currently validate data, we need to match the data_specs_version attribute to the cmip6-cmor-tables version that CMOR (or other software) used in to generate the file(s). We hit an issue with INM data where version 01.00.29 was used to create, but PrePARE 01.00.33 (latest) was used to validate and a single issue appeared.

It would be great if CMOR checked out the appropriate version when it first encounters this data_specs_version identifier, then keeps this available to the software if there is a next time that this version is required during publication/PrePARE checking.

We can obviously roll this forward in PCMDI/mip-cmor-tables#3

@taylor13
Copy link
Collaborator

taylor13 commented Oct 7, 2022

We might assume that the tables are backward compatible (with only new variables added, but not changes to old variables). Any info. on the single issue raised by PrePARE?

@durack1
Copy link
Contributor Author

durack1 commented Oct 25, 2022

@sashakames it seems we've solved the seats issue, so tagging you here - which will likely require @mauzey1 at some stage

@sashakames
Copy link
Collaborator

Here's the issue raised if that's what you are after:

Your file contains "standard_name":"effective_radius_of_cloud_liquid_water_parti
cle_at_liquid_water_cloud_top" and
CMIP6 tables requires "standard_name":"effective_radius_of_cloud_liquid_water_pa
rticles_at_liquid_water_cloud_top".

@mauzey1
Copy link
Collaborator

mauzey1 commented Oct 26, 2022

@durack1 So we want PrePARE to warn users that the data_specs_version of a file doesn't match the one in the tables being used by PrePARE? Should that just be a warning, or an error?

@durack1
Copy link
Contributor Author

durack1 commented Oct 26, 2022

@mauzey1 that would be a nice addition, as a warning.

Thinking about this, ideally, it would be great to always check against the latest version of the tables, so that at no point in time do we allow issues that are known (and fixed in the latest tables) to be published.

The checkout of the latest table versions by PrePARE/CMOR install/runtime was the original focus of this issue.

@taylor13
Copy link
Collaborator

thanks, @sashakames for the error message. "Your table" has a typo ("particle" instead of "particles").
Since this involves a standard_name (which must be found at http://cfconventions.org/Data/cf-standard-names/79/build/cf-standard-name-table.html , I think an "error" should be raised. The table with the error should be corrected before proceeding (even if it is the "master" table that's wrong).

@durack1
Copy link
Contributor Author

durack1 commented Oct 27, 2022

@taylor13 just circling around on this. The issue was that the tables that the original data was written with included the error, and so they had created CMOR and cmip6-cmor-table validated data. The issue was when PrePARE was used to validate the data during publishing, and an updated and corrected version of the tables was used - so that was why I was suggesting a warning rather than an error and exit

@taylor13
Copy link
Collaborator

Got it. You suggest we not burden users with fixing a problem that was only recently discovered (between the time they wrote the data with CMOR and the time it was checked by PrePARE. There might be at least 2 cases when we might not want to be that lenient:

  1. When a user has prepared output without using the CMOR tables and without CMOR, and has made an error in defining the standard_name.
  2. When a user has been lazy and not obtained the latest CMOR table before writing output. (The table used might be one made available when the data request was still evolving, long before the time that the user is writing data.) When publishing that data, the updated tables are applied and the error is discovered. Shouldn't we object for such sloppy application of CMOR and require the data be rewritten?

All that being said, I wouldn't strongly object to making this a warning (rather than an error) since it doesn't involve the DRS (i.e., the attributes that determine unique file names and directory structures).

@sashakames
Copy link
Collaborator

sashakames commented Oct 27, 2022

My thought on the warning message would be that PrePARE would include a check of the table version against the global attribute found in the file. If there is a mismatch then warn the user in addition to any errors encountered as a result of the mismatch. I agree that PrePARE shouldn't pass data where there are errors as there is no distinction between a minor error like a typo in the above example versus something where the cf name was completely garbled.

Additionally, we had the notion of a minimum data_spec_version. While this is something that would be helpful to have in the publisher (on the client side) ultimately it is better to enforce server side. I'll raise this when requirements for publication services are discussed.

Following the warning the users may have the opportunity to downgrade their table version (provided it is still valid, meets the minimum) and publish.

@durack1 durack1 added this to the 4.0/Future milestone Apr 7, 2024
@durack1
Copy link
Contributor Author

durack1 commented Apr 7, 2024

It would be useful to consider this alongside the future CMOR4 release - as we are standardizing on the mip-cmor-tables across projects, certainly makes sense to "ship" CMOR with the inputs it requires to run

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants