Clarification of rules for positioning comment and interval in cell methods strings #274

Open
martinjuckes opened this issue Jun 10, 2020 · 6 comments
Labels
defect Conventions text meaning not as intended, misleading, unclear, has typos, format or language errors

Comments

@martinjuckes
Contributor

One of the contributors to the CMIP6 archive has submitted data with a cell_methods string of the form area: time: mean (interval: 1 month) where sea_ice. CF Checker 4.0.0 reports this as an error: ERROR: (7.3): Invalid syntax for cell_methods attribute.

The relevant paragraph of the convention, the first in sub-section 7.3.2, states that "To indicate more precisely how the cell method was applied, extra information may be included in parentheses () after the identification of the method." The author of the data file is interpreting the reference to "after the identification of the method" as meaning directly after mean in the cell_methods string. This is consistent with the usage of the word "method" throughout section 7.3.

The conformance document, on the other hand, specifies that the parentheses should come at the end of the phrase specifying the method, i.e. area: time: mean where sea_ice (interval: 1 month), and the CF Checker, as we expect, follows the conformance document.

My personal preference would be to follow the interpretation of the conformance document, provided this is consistent with the original intention of the sub-section in 7.3, but this would, I believe, require some clarification of the text, e.g. "To indicate more precisely how the cell method was applied, extra information may be included in parentheses () at the end of the phrase identifying the method."
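
To make the two readings concrete, here is a minimal Python sketch that classifies where a parenthesised comment sits in a single method phrase. It is an illustration only, not the CF Checker's implementation: the function name `comment_placement` and the labels it returns are assumptions made for this example, and it assumes comments are not nested.

```python
import re


def comment_placement(phrase: str) -> str:
    """Classify the position of a parenthesised comment in one method phrase.

    Returns "none" (no comment), "end" (one comment closing the phrase, the
    reading taken by the conformance document) or "embedded" (anything else).
    The name and return labels are hypothetical, chosen for this sketch.
    """
    text = phrase.strip()
    if "(" not in text:
        return "none"
    # Exactly one (...) group, and nothing after its closing parenthesis.
    if re.fullmatch(r"[^()]*\([^()]*\)", text):
        return "end"
    return "embedded"


print(comment_placement("area: time: mean (interval: 1 month) where sea_ice"))
print(comment_placement("area: time: mean where sea_ice (interval: 1 month)"))
```

Under this sketch the string from the CMIP6 file is classified as "embedded" and the reordered string as "end".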

@martinjuckes martinjuckes added the question Further information is requested or discussion invited label Jun 10, 2020
@JonathanGregory
Contributor

I agree with @martinjuckes that the conformance document has the intended interpretation. If that is so, the text of the convention could be clarified as a defect issue.

@taylor13

I think I prefer having the optional (and sometimes non-standard) parenthetical information coming at the "end of the phrase identifying the method"; breaking up the main elements of the method seems unnecessary. I would prefer, for example "mean where sea_ice (and in the immediate vicinity of polar bears)" to "mean (in the immediate vicinity of polar bears) where sea_ice".
Perhaps we should allow the parenthetical information to appear anywhere within the method description, but recommend placing it at the end of the phrase, where it would usually fit most naturally.

@martinjuckes
Contributor Author

@taylor13 : apologies for the long silence on this issue. With your approach, would you permit multiple comments, e.g. mean (in the immediate vicinity of polar bears) where sea_ice (excluding swimming bears)?

I can see the attraction of making a mis-placed comment a warning rather than an error, but we need to be sure that it does not introduce more complexity for software parsing the strings.

@taylor13

taylor13 commented Oct 7, 2021

I hadn't thought of that being necessary. On reflection, I don't really see a need to allow the parenthetical information to appear anywhere except at the end of any phrase specifying a method (but of course each method phrase could have parenthetical information, so multiple parenthetical statements could occur in cell_methods). Thus, in your example above you might have: time: mean where sea_ice (but only sampled when polar bears are on the ice in the immediate vicinity). As others have said, I think this is what we originally had in mind. It is too bad that for some CMIP6 data the checker throws an error, but I suspect that any other software currently reading CMIP6 files will not actually try to interpret information in parentheses, so perhaps we shouldn't be concerned.

I agree we should worry about difficulty parsing, but don't think anything discussed above would be too difficult.
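
As a rough indication of the parsing effort involved, the following Python sketch splits a cell_methods string into its method phrases, so that each phrase can then be checked for a single trailing comment. It is a sketch under stated assumptions, not any existing tool's parser: the function name `split_method_phrases` is hypothetical, tokens are assumed to be separated by whitespace, and a word ending in a colon is treated as a dimension name only when it lies outside parentheses.

```python
def split_method_phrases(cell_methods: str) -> list[str]:
    """Split a cell_methods string into one string per method phrase.

    A new phrase starts at a "name:" token that follows at least one
    non-name token and lies outside parentheses, so "interval:" inside a
    comment does not start a new phrase. Hypothetical helper for this sketch.
    """
    phrases = []
    current = []
    depth = 0            # current nesting depth of parentheses
    seen_method = False  # True once the phrase holds a non-"name:" token
    for token in cell_methods.split():
        is_name = depth == 0 and "(" not in token and token.endswith(":")
        if is_name and seen_method:
            phrases.append(" ".join(current))
            current, seen_method = [], False
        current.append(token)
        if not is_name:
            seen_method = True
        depth += token.count("(") - token.count(")")
    if current:
        phrases.append(" ".join(current))
    return phrases


example = "time: mean where sea_ice (interval: 1 month) area: mean"
for phrase in split_method_phrases(example):
    print(phrase)
```

Keeping track of parenthesis depth is the only extra state needed to avoid mistaking a keyword such as interval: for a dimension name, which suggests that restricting comments to the end of each phrase keeps parsing simple.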

@JonathanGregory
Contributor

The discussion of this issue indicates that it should be addressed as a defect in the conformance document, so I am changing the label accordingly.

@JonathanGregory JonathanGregory added defect Conventions text meaning not as intended, misleading, unclear, has typos, format or language errors and removed question Further information is requested or discussion invited labels Sep 26, 2023
@JonathanGregory
Contributor

Dear @martinjuckes

Thanks for raising this issue in 2021. No objections were raised, and enough support was expressed to approve this proposal. It requires a PR to change the first sentence in Sect 7.3.2, for which you've already suggested wording. Are you able to prepare a PR?

Best wishes and thanks

Jonathan

3 participants