Clarification of the use of long_name, standard_name, cf_role and non-standard attributes #501

Closed
JonathanGregory opened this issue Jan 8, 2024 · 4 comments · Fixed by #506
Labels: change agreed (Accepted for inclusion in the next version); defect (Conventions text meaning not as intended, misleading, unclear, has typos, format or language errors)

Comments

@JonathanGregory
Contributor

JonathanGregory commented Jan 8, 2024

Dear all

This issue was prompted by Daniel @neumannd's comment on the question in #211. I am proposing the following small changes to remedy what appear to be defects in the document. If you have any concerns, please comment before 29th January.

Section 3 of the conformance document has the recommendation

All variables should use either the long_name or the standard_name attributes to describe their contents. Exceptions are boundary and climatology variables.

but that recommendation isn't made in the preamble to Section 3 of the conventions, which says only that CF supports long_name and introduces standard_name. The conformance statement should belong instead to Section 3.2, which says

it is highly recommended that either [the long_name] or the standard_name attribute ... be provided to make the file self-describing

Unfortunately this text doesn't say what kinds of variable the statement applies to. To clarify the situation, I suggest the following changes:

  • We delete the existing recommendation at the start of Section 3 of the conformance document.

  • We insert instead a recommendation for Section 3.2: "All data variables and variables containing coordinate data should use either the long_name or the standard_name attributes to describe their contents."

  • We amend the third sentence of Section 3.2 of the conventions document to read "But it is highly recommended that either this or the standard_name attribute defined in the next section be provided **for all data variables and variables containing coordinate data**, in order to make the file self-describing," where the bold is new.

This is consistent with Appendix A. The long_name and standard_name can also be used for boundary variables, so long as the attribute value is identical to the one on the parent coordinate variable. That's shown by "BI" in Appendix A, and was clarified in the latest version of CF.
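As an illustration only, the proposed recommendation and the boundary-variable rule could be checked along the following lines. This is a minimal sketch, not part of CF or of any existing checker: the in-memory dictionary layout, the `kind` labels and the function name `check_names` are all assumptions made for the example, and "coordinate" here stands for any variable containing coordinate data.

```python
# Hypothetical model of a dataset: variable name -> kind, attributes, optional parent.
# The kinds "data", "coordinate" and "boundary" are labels invented for this sketch.
DESCRIBED_KINDS = {"data", "coordinate"}
NAME_ATTRS = ("long_name", "standard_name")


def check_names(variables):
    """Return a list of messages for variables that do not follow the proposal."""
    messages = []
    for name, var in variables.items():
        attrs = var["attrs"]
        kind = var["kind"]
        if kind in DESCRIBED_KINDS:
            # Recommendation: at least one of long_name / standard_name is present.
            if not any(a in attrs for a in NAME_ATTRS):
                messages.append(f"{name}: should have long_name or standard_name")
        elif kind == "boundary":
            # If present on a boundary variable, the value must match the parent's.
            parent = var["parent"]
            parent_attrs = variables[parent]["attrs"]
            for a in NAME_ATTRS:
                if a in attrs and attrs[a] != parent_attrs.get(a):
                    messages.append(f"{name}: {a} differs from parent {parent}")
    return messages


example = {
    "tas": {"kind": "data", "attrs": {"standard_name": "air_temperature"}},
    "lat": {"kind": "coordinate", "attrs": {"units": "degrees_north"}},
    "lon": {"kind": "coordinate", "attrs": {"standard_name": "longitude"}},
    "lon_bnds": {
        "kind": "boundary",
        "parent": "lon",
        "attrs": {"standard_name": "latitude"},
    },
}

for message in check_names(example):
    print(message)
```

In this example `lat` is reported because it carries neither attribute, and `lon_bnds` because its standard_name disagrees with that of `lon`.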

While studying the relevant parts of the standard for #211, I noticed some problems with cf_role as well. The second paragraph of section 9.5 says "[The cf_role] attribute has no other function in the CF convention (despite its general-sounding name), and its only permitted values are timeseries_id, profile_id, and trajectory_id." That was true before the latest version, but now it has a function in the new Section 5.9 for UGRID as well.

I propose we change the first two sentences of the paragraph to read "Where feasible, one of the coordinate or auxiliary coordinate variables of a discrete sampling geometry should have an attribute named cf_role, whose only permitted values for this purpose are timeseries_id, profile_id, and trajectory_id. (Despite its general-sounding name, this attribute has only one other function, namely in Section 5.9.)"

In Appendix A, the entry for cf_role refers to Section 9.5 and Section 5.9. Section 5.9 contains an example which shows cf_role, but does not mention it in the text. I propose that we change this entry to refer instead to Section 9.5 only (not 5.9), and add "See also Appendix K", because Appendix K lists the functions of the attribute for UGRID, and these are not described by Appendix A.
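To make the distinction concrete, here is a small sketch of how a tool might treat cf_role under the proposed wording. It is an assumption-laden illustration: the function name and the attribute-dictionary layout are invented for the example, and only the three discrete sampling geometry values quoted above are encoded. Any other value is simply reported as "not a DSG role" and left to be checked against Appendix K.

```python
# The three values permitted for discrete sampling geometries (Section 9.5).
DSG_CF_ROLES = {"timeseries_id", "profile_id", "trajectory_id"}


def dsg_cf_role_report(coords):
    """Inspect the coordinate variables of one discrete sampling geometry.

    coords: hypothetical mapping of variable name -> attribute dictionary.
    Returns (found, not_dsg): every cf_role found, and the subset whose value
    is not one of the three DSG values (for example a UGRID role).
    """
    found = {name: attrs["cf_role"] for name, attrs in coords.items() if "cf_role" in attrs}
    not_dsg = {name: value for name, value in found.items() if value not in DSG_CF_ROLES}
    return found, not_dsg


coords = {
    "station_name": {"cf_role": "timeseries_id", "long_name": "station name"},
    "time": {"standard_name": "time"},
    "mesh": {"cf_role": "mesh_topology"},
}

found, not_dsg = dsg_cf_role_report(coords)
print(found)
print(not_dsg)
```

Here `mesh` is flagged because its cf_role value is not one of the three values used for this purpose.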

Finally, the second paragraph of Appendix A says, "if use of an attribute is restricted to certain kinds of variables this is indicated as follows". I don't think that is correct. CF does not prohibit any attribute, as Section 2.6 says. If you like, you could use standard_name as a global attribute, for example, but that isn't a CF use of it. I propose that we delete this phrase, and instead insert a new sentence, "CF does not prohibit an attribute named here to be used otherwise than shown here, but its meaning in such a case is not defined by CF."

9th Jan. Another version of this sentence, less legalistic and perhaps easier to understand: CF does not prohibit any of these attributes from being attached to variables of different kinds from those listed as their "Use" in this table, but their meanings are not defined by CF if they are used in these other ways.

Cheers

Jonathan

JonathanGregory added the defect label on Jan 8, 2024
@davidhassell
Contributor

Dear @JonathanGregory.
I am happy with all of your proposed changes. Thank you for taking the time to sort them all out.
David

@JonathanGregory
Contributor Author

Dear @davidhassell et al.

Three weeks have passed with no objection, meaning these changes are agreed. I have created PR #506 to implement them. Please could someone check and merge?

In doing this, I had to create a new section (3.2) in the conformance document. All the existing sections have two things at the start, which I believe are Asciidoctor ID attributes, e.g.

[[section-8]]
[[standard-name]]
=== 3.3 Standard Name

I believe that these IDs (section-8 and standard-name) could be used for cross-referencing within the document. Is that correct? There are a few such cross-references, using <<name ID>>. I can't find any instance of the numeric IDs of the form section-number being used, though. I haven't assigned one to my new section, because it would require renumbering all the subsequent ones. Do we need the numeric IDs? If not, I think we should delete them, as a separate issue, in order not to delay and confuse this one!
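One way to answer the question about unused IDs is to scan the source text for anchors and cross-references. The sketch below is only an assumption about how that could be done: it handles the inline forms `[[id]]` and `<<id>>` (optionally followed by a comma and label text) within a single string, and it does not follow references made from other files. The function name and the sample text are invented for the example.

```python
import re

# [[id]] or [[id,reference text]] anchors.
ANCHOR = re.compile(r"\[\[([^\],]+)(?:,[^\]]*)?\]\]")
# <<id>> or <<id,link text>> cross-references.
XREF = re.compile(r"<<([^>,]+)(?:,[^>]*)?>>")


def unused_anchors(text):
    """Return anchor IDs defined in text that no cross-reference in text uses."""
    anchors = [m.group(1).strip() for m in ANCHOR.finditer(text)]
    referenced = {m.group(1).strip().lstrip("#") for m in XREF.finditer(text)}
    return [a for a in anchors if a not in referenced]


sample = """[[section-8]]
[[standard-name]]
=== 3.3 Standard Name

See <<standard-name>> for details.
"""

print(unused_anchors(sample))
```

For the sample text, only the numeric ID is reported as unused, because the named ID is the target of a cross-reference.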

Thanks

Jonathan

@JonathanGregory
Contributor Author

Please could someone check and merge #506 to resolve this issue - @davidhassell or @larsbarring, perhaps? It has been overtaken by a couple of other PRs since I prepared it, and I have just resolved the conflict in history.adoc, so it's good to go, I believe.

@martinjuckes
Contributor

martinjuckes commented Mar 18, 2024

@JonathanGregory: apologies for the late comment and for missing your first email about this. Responding to your second email this evening, I have a concern that additional changes are needed in section 1.4. There is also a potential ambiguity about how the revised recommendation "All data variables and variables containing coordinate data should use either the long_name or the standard_name attributes to describe their contents." relates to the recommendations for coordinate variables in sections 1.4 and 4, which say it is highly recommended that coordinate variables include the standard_name.

Sorry if I have missed something in the discussion above.
