Remove restrictions on netCDF object names #237

Dave-Allured · 2020-01-23T22:25:31Z

Title: Remove restrictions on netCDF object names

Moderator:

Moderator Status Review: New issue, 2020 January 23

Requirement Summary: None.

Technical Proposal Summary: Remove CF 1.7 section 2.3 restrictions on characters in names of variables, attributes, etc. Resolve ambiguous use of such restrictions.

Benefits

Support international usage.
Allow special characters in names.
Remove ambiguity over requirement versus preference.
Simplify CF rules.
Simplify conformance checking.
Improve compliance for some existing data sets.

Caveats

Breaks compliance with COARDS name rules, but is a superset of them.
Some existing software cannot handle non-traditional characters. It would need upgrading, but only when presented with new files using the expanded character set.

Status Quo: Object names are now restricted to a traditional yet limited character set which does not accommodate many non-western languages, nor other desired naming patterns.

Detailed Proposal: Change the first paragraph of 2.3 Naming Conventions as follows. The remainder of 2.3 is left unchanged.

Current version (1.8 draft):

Variable, dimension, attribute and group names should begin with a letter and be composed of letters, digits, and underscores. Note that this is in conformance with the COARDS conventions, but is more restrictive than the netCDF interface which allows use of the hyphen character. The netCDF interface also allows leading underscores in names, but the NUG states that this is reserved for system use.

Proposed:

Variable, dimension, attribute, and group names are not generally restricted by this convention. Any names that are acceptable to the netCDF library may be used. The most notable rules from netCDF are ASCII or UTF-8 character set, forward slash "/" not allowed, and names should not begin with underscore or certain other special characters. Refer to file format specs in the NUG for more details.

(Edit: Added forward slash "/" after following comments were posted.)

JimBiardCics · 2020-01-24T14:23:06Z

While I generally approve of relaxing the character set restrictions, I think we may need to consider certain patterns that should either be reserved or restricted. As an example, the use of slashes ('/') in names wreaks havoc with group path formalisms that are already in place outside of CF. In addition to the prohibition on having leading underscores that is mentioned in the proposal, the netCDF-LD project (@marqh) is making use of doubled underscores within a name as a mechanism for marking namespaces. There may be other cases "in the wild" where certain patterns are in use, and I think we should be careful to avoid causing problems by being overly loose here.

I suggest that, at minimum, we should disallow the use of slashes ('/') or backslashes ('\') in names, and should call out two or more sequential underscores ('__') as reserved.

steingod · 2020-01-27T09:45:17Z

I support the constraint indicated above. Especially allowing slashes and backslashes in names will be confusing.

erget · 2020-01-28T08:22:41Z

Agreed, I think it would be best if the restrictions were presented in a table for readability.

marqh · 2020-01-28T11:25:12Z

We may get some benefit from considering other standardisation activity in this domain?

RFC3986 defines the generic syntax for the Uniform Resource Identifier (URI)
https://tools.ietf.org/html/rfc3986

As netCDF variables are resources that are being identified within the domain of a netCDF file, could we benefit from just adopting RFC3986?

This has a reserved character section:
https://tools.ietf.org/html/rfc3986#section-2.2

Disclaimer: I have not cross referenced this in detail with the NUG to examine consistency or problem areas (potential for contribution if useful)
First glance, these look pretty similar.

If these are consistent, then adopting the NUG definition unchanged looks sensible to me. It already mandates against the use of a '/' character, which is the most problematic one for me, given groups and variable identity within groups.

I'd like to see an explicit reference to the relevant NUG section in the text or linked, as I had to search a bit and I know what I'm looking for
I think:
https://www.unidata.ucar.edu/software/netcdf/docs/netcdf_data_set_components.html#Permitted
is stable enough for a standards document
(@ethanrd do you agree this is a stable URI for the resource please?)

mark

JimBiardCics · 2020-01-28T14:00:18Z

@marqh I like the overall suggestion of RFC3986. I think we should not adopt the "% encoding" concept of RFC3986. And, again, I think we should reserve leading "_" characters (per NUG) and multiple sequential "_" characters (per netCDF-LD). Are there any other special character sequences in the wild that anyone is aware of, in UGRID or Radial perhaps?

I notice that the NUG section you referenced implies that space characters are allowed as long as they are not at the end of a variable name. Do we want to allow internal spaces?

marqh · 2020-01-28T14:43:09Z

@marqh I like the overall suggestion of RFC3986. I think we should not adopt the "% encoding" concept of RFC3986. And, again, I think we should reserve leading "_" characters (per NUG) and multiple sequential "_" characters (per netCDF-LD). Are there any other special character sequences in the wild that anyone is aware of, in UGRID or Radial perhaps?

I agree, @JimBiardCics, that adoption of %encoding is not a path I would want to walk. it's perhaps a useful cross reference, but points like this suggest against including some specific use of RFC3986 within CF

I notice that the NUG section you referenced implies that space characters are allowed as long as they are not at the end of a variable name. Do we want to allow internal spaces?

internal spaces!?!? really

if we can stop that, then that is a good thing. Why would the NUG allow variable names with spaces in them??

my reading of

The names of dimensions, variables and attributes (and, in netCDF-4 files, groups, user-defined types, compound member names, and enumeration symbols) consist of arbitrary sequences of alphanumeric characters, underscore '_', period '.', plus '+', hyphen '-', or at sign '@', but beginning with an alphanumeric character or underscore. However names commencing with underscore are reserved for system use.

leads me to view space as not allowed. However the following:

Beginning with versions 3.6.3 and 4.0, names may also include UTF-8 encoded Unicode characters as well as other special characters, except for the character '/', which may not appear in a name.
Names that have trailing space characters are also not permitted.

Could someone from a Unidata background confirm or deny that in netCDF4, a space may be used within a variable name?

zklaus · 2020-01-28T15:21:59Z

I have zero Unidata authority, but I'd like to state the obvious: Unicode is complicated.
This may already account for the somewhat vague formulation in the NUG if one takes a look at the list of whitespace characters in unicode. Indeed, whether one wants to go with a blacklist or a whitelist approach, it may be a good idea to think and write in terms of Unicode character categories (cf here or here).
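As a rough illustration of thinking in terms of Unicode character categories, the sketch below uses Python's standard unicodedata module to print the general category of a few characters. The selection of characters and the helper name is_space_separator are illustrative assumptions, not part of any netCDF or CF rule.

```python
import unicodedata


def is_space_separator(ch: str) -> bool:
    """Return True if ch is in the Unicode 'Zs' (space separator) category.

    Hypothetical helper for illustration only.
    """
    return unicodedata.category(ch) == "Zs"


# A few sample characters: several kinds of space, a non-ASCII letter,
# and the ASCII underscore.
samples = ["\u0020", "\u00a0", "\u2000", "\u3000", "\u00e5", "_"]

for ch in samples:
    print(
        f"U+{ord(ch):04X}",
        unicodedata.category(ch),
        unicodedata.name(ch),
        "space separator" if is_space_separator(ch) else "",
    )
```

A rule written against categories such as "Zs" covers every space-like character at once, rather than having to enumerate them one by one.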

ngalbraith · 2020-01-28T17:04:40Z

I'm afraid I'm the odd man out here - I don't think the list of benefits in the original issue stacks up against the costs; in fact some of them don't seem to BE benefits. Maybe some use cases would be helpful ... Could you elaborate on how this change would support international usage?

Is improved compliance for some existing data sets really a goal? What's in these data sets that needs to be described with a name that begins with a number or contains spaces or special characters?

Maybe this is a selfish concern - we use Matlab's built-in netCDF library, and I'm not sure how that would deal with this change. If it's really needed for some specific reason, we'll deal with it, but absent that explanation, this is just a headache for a lot of CF users.

ethanrd · 2020-01-28T17:10:32Z

Is there a user asking for this extension, a particular use case that needs addressing? CF has generally tried to avoid extensions that seem like a good idea but don’t have a current use case.

Having said that, if we do move forward, I think we should be very cautious. Not only is Unicode very complicated as @zklaus points out, so are the rules around reserved character sets in URLs (and in which part of the URL) and file systems. Extending the set of characters allowed to include those reserved characters means they will need to be properly encoded when used in URLs (e.g., OPeNDAP and OGC WCS). Which, it turns out, isn’t as easy as it might seem.
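To make the encoding concern concrete, here is a minimal sketch of generic percent-encoding with Python's standard urllib.parse module. The sample names and the helper encode_name_for_url are hypothetical, and the sketch says nothing about what a particular OPeNDAP or WCS server actually expects, which is where the real difficulty lies.

```python
from urllib.parse import quote, unquote


def encode_name_for_url(name: str) -> str:
    """Percent-encode a name so it can be placed in a single URL component.

    safe="" means that '/' is encoded too (hypothetical helper).
    """
    return quote(name, safe="")


for name in ["tas", "a+b", "sea@surface", "Umeå_station"]:
    encoded = encode_name_for_url(name)
    # Decoding must give back the original name exactly.
    assert unquote(encoded) == name
    print(name, "->", encoded)
```

Names restricted to ASCII letters, digits and underscores pass through unchanged, so this step can be skipped for them.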

Also, this or similar proposals/discussions have come up before, I think several times but so far I've only found these two:

A 2014 discussion on the email list (the initial email is here) focused mainly on expanding the set of characters allowed to include ‘@’, ‘+’, ‘-’, and ‘.’ with some mention of Unicode coming fairly late in the discussion.
Trac Ticket #157 suggested moving from “should” to “must” on the current set of allowed characters.

ethanrd · 2020-01-28T17:28:09Z

@WardF and @lesserwhirls - Could you address the question of whether whitespace characters are allowed in netCDF variable names?

MTG-Formats · 2020-01-28T17:50:27Z

Having blank spaces in names would break other CF conventions like use of the ancillary variables attribute.

"The attribute ancillary_variables is used to express these types of relationships. It is a string attribute whose value is a blank separated list of variable names. "

How to parse this?
float q_error_limit(time)
q_error_limit:standard_name = "specific humidity standard error" ;
q_error_limit:units = "g/g" ;
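The parsing problem can be sketched in a few lines of Python. The variable names and the attribute value below are hypothetical; the only rule taken from CF is that ancillary_variables is a blank-separated list of variable names.

```python
# Hypothetical file contents: one variable name contains blanks.
variables_in_file = {"q error limit", "q_detection_flag"}

# Hypothetical attribute value that refers to both variables.
ancillary_variables = "q error limit q_detection_flag"

# CF says the value is a blank-separated list, so a reader splits on blanks.
parsed = ancillary_variables.split()

# Names produced by the split that do not exist in the file.
unresolved = [name for name in parsed if name not in variables_in_file]

print("parsed:    ", parsed)
print("unresolved:", unresolved)
```

The split yields four tokens, three of which match no variable, and nothing in the attribute value tells the reader how to reassemble them.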

taylor13 · 2020-01-28T18:32:09Z

I must be missing something, but if a variable is named, for example, "a-b", and one uses that in a computer code, how is it interpreted? How is that variable distinguished from the operation: subtract variable "b" from variable "a"? Don't "+", "-", "/", "*", " " all have this problem?

JimBiardCics · 2020-01-28T18:48:54Z

@taylor13 Your code would have to parse the variable name into code. Until you did something like that, it is just a string.

taylor13 · 2020-01-28T19:12:28Z

As a user of data, I usually like the names of my variables (in my codes) to be the same as their names in the netCDF file. With the current naming convention for CF, this is always possible, I think. If, however certain restrictions were removed, as suggested above, this would no longer be true.
I would echo others and ask what particular use cases are driving this?

Dave-Allured · 2020-01-28T23:10:44Z

Well, thank you for all your thoughtful responses. I see that we are rehashing the 2014 discussion, and probably others. Thanks @ethanrd for finding that. There are good arguments pro and con there, and it is worth reading.

The difference is that only 4 extra characters were proposed in 2014. I simply want to legalize all the other 137 thousand!

Is there a user asking for this extension, a particular use case that needs addressing? CF has generally tried to avoid extensions that seem like a good idea but don’t have a current use case.

No, I do not have a current use case. This is a recurring issue, so I thought this comprehensive approach would be beneficial. Past use cases were mentioned or implied in the 2014 discussion, and in trac 157.

NetCDF developers put some care into expanded name capability, 12 years ago. However, CF restrictions are copied virtually unchanged from 25 year old COARDS rules, which were probably based on ASCII only. CF is overdue to allow the full naming range for creative purposes by all scientific users.

Name quoting is generally easy and well supported in most modern programming languages. This takes care of UTF-8, math symbols, and other active characters. IMO, naming freedom should outweigh exactly matching names of program variables.

ngalbraith · 2020-01-29T17:52:34Z

@taylor13 Your code would have to parse the variable name into code. Until you did something like that, it is just a string.

Not everyone writes their own netCDF translators, and some packages no doubt take the variable and attribute names from the netCDF variable and attribute names. Those who use these packages are least likely to be in a position to accommodate this change.

When I have a minute I'll give it a try with the Matlab netCDF interface. I'd be much happier to spend the time on it if there was more than 'creative purposes' for a reason. The trac ticket has an example of isotopes with names that begin with a number, which has some weight, but the work around for that seems simple compared to what would be needed by someone using code that auto-assigns variable names.

On the other hand, most folks probably work with multiple standards; OceanSITES would no doubt maintain the variable name restriction, if CF doesn't.

zklaus · 2020-01-30T08:51:04Z

I agree that it would be good to have use cases.

@ngalbraith is also right that not everyone is writing their CF code based on naked netCDF access. Indeed, I consider such an approach foolish, since CF is far too rich by now to stand a serious chance of getting it right.

However, while using the netCDF variable name as a program variable name might be excused in small, not reused code that only ever will deal with, say tas, it is inexcusable in general-purpose library code. How would such a variable enter the namespace without the program knowing its name beforehand? Ultimately, the only way is via the equivalent of eval(var_name). Such code is prone to breakage no matter what restrictions we put on the character set since it would always leave open the possibility of having reserved words of the particular programming language as variable names. Another serious problem is that it opens the possibility to maliciously crafted variable names: How about var_name='system("rm -rf .")'?

Hence, I don't think the argument that all netCDF variable names should be permissible program variable names in all programming languages should guide the design of CF.
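One way to avoid the eval pattern, sketched here under the assumption that the data have already been read by some netCDF library, is to keep variables in a mapping keyed by their netCDF names. The names and placeholder values are hypothetical.

```python
# Hypothetical variable names as they might be read from a file, including
# one containing an operator character and one crafted to look like code.
names_from_file = ["tas", "a-b", 'system("rm -rf .")']

# Keep the data in a dictionary keyed by the netCDF name. The names are
# only ever used as strings and are never evaluated as program code.
data = {name: [0.0, 1.0, 2.0] for name in names_from_file}

# Any name can be looked up, whatever characters it contains.
for name in names_from_file:
    print(repr(name), "->", data[name])
```

With this design a name such as "a-b" is just a dictionary key, and the hostile name is as harmless as any other string.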

DocOtak · 2020-01-30T18:47:24Z

I had the same thoughts as @zklaus when thinking about the security implications of what I could only imagine was an eval(var_name). I've even seen some Matlab code which does exactly this to load all the variables into a Matlab namespace. I'd even go so far as to recommend that the CF document itself warn against doing this...

martinjuckes · 2020-11-17T09:48:25Z

I agree that some use cases would be helpful. I'm not sure about the specific proposal that initiated the discussion, but I do agree with the thought behind it that we should have a considered and reasoned policy on this, rather than just having a frozen-in rule based on past library constraints.

One reason that we might want to depart from the full freedom allowed in NetCDF is that we have, in CF, a range of different attributes to describe a variable. The long_name is designed to hold human-readable text, whereas the standard_name and units both have strongly constrained values.

Some application libraries need, in places, identifiers with a restricted character set. For example, I can construct a collections.namedtuple with name tas, but not with name tas.Amon because, in Python, "Type names and field names can only contain alphanumeric characters and underscores" (cited from an error message generated by collections.namedtuple). Could this be considered as a use case for having a place in the convention to specify, for CF objects, an identifier which is composed of "alphanumeric characters and underscores"? The variable name is the de facto place which many people use for this kind of identifier (perhaps because of legacy packages).
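The namedtuple behaviour can be reproduced with the short sketch below. The helper can_be_namedtuple_name is hypothetical, and the exact wording of the error message varies between Python versions. Note also that Python 3 accepts non-ASCII letters in identifiers, so its rule is not the same as "ASCII alphanumeric characters and underscores".

```python
from collections import namedtuple


def can_be_namedtuple_name(name: str) -> bool:
    """Return True if collections.namedtuple accepts name as a type name.

    Hypothetical helper: namedtuple raises ValueError for names it rejects.
    """
    try:
        namedtuple(name, ["value"])
    except ValueError:
        return False
    return True


for candidate in ["tas", "tas.Amon", "tas_Amon"]:
    print(candidate, can_be_namedtuple_name(candidate))
```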

Note that the standard_name fits the character restriction, but does not fit the use case because different variables may have the same standard_name.

Another potential use case is for identifiers of concepts described in RDF Turtle which has a character restriction on object names, broader, I think, than "alphanumeric characters and underscores", but definitely narrower than the 137 thousand characters available in UTF-8.

The desire to have a simple identifier is linked, in my mind at least, to the concept of a namespace, which is being discussed in the context of NetCDF (see NetCDF-ld and discussion on namespace delimiters). I don't think this is simply a matter of upgrading software to make it accept generic strings: there is a wide range of applications that exploit identifiers constructed from a limited character set in order to enable the use of identifiers within a text string.

zklaus · 2020-11-23T13:14:54Z

One potential use case that always came to my mind, without an actual example at hand, is the native names of weather stations, say a temperature time-series from the Umeå station, where the variable name contains the station name.

What makes this particularly interesting is that it seems to be permitted already under current CF conventions, since under CF-1.8, Section 2.3 Naming Conventions it says:

Variable, dimension, attribute and group names should begin with a letter and be composed of letters, digits, and underscores. [...] Languages other than English are permitted for variables, dimensions, and non-standardized attributes.

martinjuckes · 2020-11-23T15:37:48Z

Hi @zklaus : good point about the existing rules.

Regarding your use case; wouldn't that use case be covered by setting the long_name to "Temperature time-series from the Umeå station"? The current convention appears to permit "Umeå_station", but not "Umeå station" (blanks not allowed).

The cfchecker (4.0) takes a narrower view of what is allowed, restricting variable names to strings matching the Python regex: '^[a-zA-Z][a-zA-Z0-9_]*$'.
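For illustration, the quoted pattern can be applied with Python's re module as follows. This is a minimal sketch, not cfchecker's actual code; the helper name matches_quoted_pattern and the sample names are assumptions.

```python
import re

# The pattern as quoted above: an ASCII letter followed by any number of
# ASCII letters, digits or underscores.
QUOTED_PATTERN = re.compile(r"^[a-zA-Z][a-zA-Z0-9_]*$")


def matches_quoted_pattern(name: str) -> bool:
    """Return True if the whole name matches the quoted pattern.

    fullmatch is used so that a trailing newline is not silently accepted.
    """
    return QUOTED_PATTERN.fullmatch(name) is not None


for candidate in ["tas", "tas_2m", "2m_tas", "Umeå_station", "Umeå station"]:
    print(candidate, matches_quoted_pattern(candidate))
```

Under this pattern both "Umeå_station" and "Umeå station" are rejected, the first because of the non-ASCII letter alone.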

zklaus · 2020-11-23T16:05:30Z

Yes, that might be a good way to encode the information. What I wanted to say is this: I find it very plausible that in a national weather service a group sits together and decides to code their station data using variable names tas_station-name with a number of non ascii letters in the station names. Furthermore, that would appear to be perfectly valid CF.

So I think being more explicit about what is meant by "letter" would be good, even if that means saying that only ascii letters are allowed.

JonathanGregory · 2024-01-08T18:12:22Z

I believe that this issue is waiting for the outcome of #477 - is that right, @Dave-Allured?

Dave-Allured · 2024-01-08T18:33:26Z

@JonathanGregory, no, this issue is not waiting on #477. This issue #237 is a free-standing proposal to remove all CF-specific restrictions on Netcdf object names. In my view, this #237 is currently an open discussion, and waiting vaguely on a general consensus.

larsbarring · 2024-01-09T12:40:22Z

Early on in this thread there were references to work on "Netcdf-LD", and I found a github repo. Anyone know the current status of this proposal in general, and in relation to OGC? Maybe @marqh or @ethanrd?

I am asking because of the comment that

... at minimum, we should disallow the use of slashes ('/') or backslashes ('\') in names, and should call out two or more sequential underscores ('__') as reserved.

ethanrd · 2024-01-09T16:30:40Z

Hi Lars @larsbarring - I believe this OGC netCDF-LD GH repo is the more current one. It provides a link to the OGC netCDF-LD draft specification.

The OGC process involves a public comment period before proceeding to a vote. If I'm remembering correctly, the specification went out for public comment but hasn't yet gone out for a vote. Mark @marqh may be able to provide more details.

sethmcg · 2024-02-28T17:29:15Z

In my view, this #237 is currently an open discussion, and waiting vaguely on a general consensus.

If this is waiting on general consensus to come to a resolution, I'll jump in and say that I oppose this proposal.

A lot of very serious interoperability and security concerns have been raised about the idea of removing all restrictions on naming, and I don't see any benefits that outweigh them. Moreover, we don't have an actual motivating use case; this is an anticipatory change, which CF generally tries to avoid.

I'm open to motivated proposals that extend the allowed set of characters in a specific and more limited way, such as #477 (which has been accepted and is just waiting for a PR), but I think the discussion there demonstrates why it's important to be conservative and carefully discuss all the impacts of adding new allowed characters.

larsbarring · 2024-03-01T10:09:21Z

I fully agree with @sethmcg.

Moreover, the opening sentence of Section 2.3 reads

Variable, dimension, attribute and group names should begin with a letter and be composed of letters, digits, and underscores. By the word letters we mean the standard ASCII letters uppercase A to Z and lowercase a to z. By the word digits we mean the standard ASCII digits 0 to 9, and similarly underscores means the standard ASCII underscore _.

where the operative word is should, which, if we interpret it as being in uppercase according to BCP14/RFC2119 means:

SHOULD This word, or the adjective "RECOMMENDED", mean that there
may exist valid reasons in particular circumstances to ignore a
particular item, but the full implications must be understood and
carefully weighed before choosing a different course.

This interpretation of "should" strikes me as a reasonable balance between strictness/limitations and openness/flexibility. If the CF Community moves to introduce BCP14 in the Conventions document there is of course the possibility that the word should is replaced by MUST, but that is a good time to revisit this issue.

JonathanGregory · 2024-05-16T08:26:54Z

The opening sentence of Section 2.3 states that

Variable, dimension, attribute and group names should begin with a letter and be composed of letters, digits, and underscores.

Lars is correct that the word "should" here is a recommendation, as is clarified by Sect 2.3 of the conformance document. The conformance document further clarifies it

This corresponds to ASCII characters in the decimal ranges (65-90), (97-122), (48-57), and (95). The corresponding Unicode codepoints are (U+0041-U+005A), (U+0061-U+007A), (U+0030-U+0039), and (U+005F).

and both the standard and the conformance document add (again, as a recommendation)

ASCII period (.) and ASCII hyphen (-) may also be included in attribute names only.

which results from the agreed proposal #477 of @Dave-Allured.

@larsbarring and @sethmcg have expressed views against a blanket removal of restrictions on the characters to be used in CF-netCDF object names. I agree that removing all restrictions would not be consistent with the usual CF approach. Normally, we consider specific proposals to change the status quo, motivated by present use cases. Are there other views on this question? It would be good to reach a consensus. Thanks.

larsbarring · 2024-05-16T12:36:38Z

The sections @JonathanGregory points at essentially provide a whitelist of explicitly allowed characters; all other characters are not recommended (or recommended against) but not explicitly disallowed. But throughout this conversation there have been several remarks that some characters should indeed be explicitly disallowed. This could easily be done by amending the text in section 2.3 to list which characters and character ranges CF explicitly disallows, i.e. creating a blacklist. All other characters would then belong to a "greylist" where users are on their own and cannot expect the same level of interoperability and support from common libraries and software tools.
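The three-way split could be sketched as below. The recommended code points are the ranges from the conformance document quoted earlier in this thread; the blacklist is purely an assumption for illustration (slash and backslash, the characters raised most often above), since CF has not agreed one. Position rules, such as beginning with a letter, are not modelled.

```python
# Recommended code points from the conformance document text quoted in
# this thread: A-Z, a-z, 0-9 and underscore.
RECOMMENDED = (
    set(range(0x41, 0x5B))
    | set(range(0x61, 0x7B))
    | set(range(0x30, 0x3A))
    | {0x5F}
)

# ASSUMPTION: an illustrative blacklist only, not an agreed CF list.
HYPOTHETICAL_BLACKLIST = {ord("/"), ord("\\")}


def classify_character(ch: str) -> str:
    """Classify one character as recommended, disallowed or greylist."""
    code = ord(ch)
    if code in RECOMMENDED:
        return "recommended"
    if code in HYPOTHETICAL_BLACKLIST:
        return "disallowed"
    return "greylist"


def classify_name(name: str) -> str:
    """Classify a name by the least acceptable character it contains."""
    kinds = {classify_character(ch) for ch in name}
    if "disallowed" in kinds:
        return "disallowed"
    if "greylist" in kinds:
        return "greylist"
    return "recommended"


for candidate in ["tas_2m", "Umeå_station", "a/b"]:
    print(candidate, classify_name(candidate))
```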

Dave-Allured · 2024-05-17T01:06:22Z

the word "should" here is a recommendation, as is clarified by Sect 2.3 of the conformance document

This wording with "should" is confusing and unfriendly in context of that opening paragraph on netCDF object names. Witness multiple tickets filed to remove character restrictions which did not really exist. If that were simply reworded to clearly express the allowed versus recommended character sets, that would be sufficient. CF is for scientists and programmers, not lawyers.

JonathanGregory · 2024-05-17T21:55:22Z

We've already agreed elsewhere that we will check all the "must", "should" etc. words to make them conform to BCP-14, in which "should" indicates a recommendation. In this case, our interpretation has apparently changed. The text in sect 2.3

Variable, dimension, attribute and group names should begin with a letter and be composed of letters, digits, and underscores.

has been the same since CF version 1.0. However, up to version 1.7 of the conformance document this was listed as a requirement

Variable, dimension and attribute names must begin with a letter and be composed of letters, digits, and underscores.

In version 1.8 of the conformance document it turned into a recommendation

Variable, dimension and attribute names should begin with a letter and be composed of letters, digits, and underscores.

That change was made by @davidhassell in 2a44ccc and c3fa6fd. Do you remember why this change was made, David?

According to principle 9 of sect 1.2, we shouldn't revert to making it a requirement:

Because many datasets remain in use for a long time after production, it is desirable that metadata written according to previous versions of the convention should also be compliant with and have the same interpretation under later versions.

Therefore I propose that we change the first sentence of 2.3 to read

It is recommended that variable, dimension, attribute and group names should begin with a letter and be composed of letters, digits, and underscores.

which makes it consistent with the present conformance document. I believe that all those who've contributed recently think that this is what the text should mean. Are you content with making this change?

davidhassell · 2024-05-18T07:44:03Z

Hello @JonathanGregory,

That change was made by @davidhassell in 2a44ccc and c3fa6fd. Do you remember why this change was made, David?

Those commits were from PR #227 that fixed issue #226 (Correct the wording in the conformance document section 2.3 "Naming Conventions").

Thanks, David

JonathanGregory · 2024-05-20T11:19:46Z

Dear @davidhassell

Thanks. I didn't remember about #226, where we previously decided that "should" was intended to mean a recommendation. Since the discussion above shows that it is open to question, I believe that my proposal to change the text in sect 2.3 would be helpful, from

Variable, dimension, attribute and group names should begin with a letter and be composed of letters, digits, and underscores.

to

It is recommended that variable, dimension, attribute and group names should begin with a letter and be composed of letters, digits, and underscores.

I'm relabelling this issue as a defect, meaning that the above change will be adopted three weeks from now (10th June) if no-one disagrees before then.

Best wishes

Jonathan

larsbarring · 2024-05-20T12:03:10Z

Well aware of my ever so often much too "free and relaxed interpretation" of English spelling and grammar, I nevertheless venture to ask if it would be possible to somehow exclude the "should" in the suggested wording:

It is recommended that variable, dimension, attribute and group names ~~should~~ begin with a letter and be composed of letters, digits, and underscores.

?

davidhassell · 2024-05-20T12:21:09Z

Hi Lars,

That sounds like a good suggestion. BCP14 says

SHOULD   This word, or the adjective "RECOMMENDED", mean that there
may exist valid reasons in particular circumstances to ignore a
particular item, but the full implications must be understood and
carefully weighed before choosing a different course.

so providing both words (... recommended ... should ...) doesn't add anything beyond using just one of them.

JonathanGregory · 2024-05-20T12:51:24Z

It's true that "should" doesn't convey any information, given "recommended" for clarity. It would be OK in English to say

It is recommended that variable, dimension, attribute and group names begin with a letter and be composed of letters, digits, and underscores.

where begin is a subjunctive (a vestigial feature of English grammar). That's not such a common construction though. Maybe some readers might find it obscure? What do you think of

It is recommended for variable, dimension, attribute and group names to begin with a letter and be composed of letters, digits, and underscores.

or

Variable, dimension, attribute and group names are recommended to begin with a letter and be composed of letters, digits, and underscores.

taylor13 · 2024-05-20T14:12:58Z

I too recommend that we should avoid both "should" and "recommend" in the same sentence. :) . Personally, I prefer the first of the 3 options appearing in the previous post (with the subjunctive construct). I don't find it confusing. Perhaps I'm just a vestige of a disappearing generation, so as a second choice I might slightly prefer "are recommended to begin", but that seems a bit awkward to me.

MTG-Formats · 2024-05-21T08:03:21Z

Is there some reference that can be added where users can read the disadvantages/problems they may have if they don't follow the recommendations?

JonathanGregory · 2024-05-21T12:29:20Z

Is there some reference that can be added where users can read the disadvantages/problems they may have if they don't follow the recommendations?

There is quite a lot of discussion of pros and cons earlier in this issue. Jonathan

JonathanGregory · 2024-06-17T10:59:55Z

Four weeks have passed without objection to the proposed remedy for the defect. Therefore we've agreed to make the change, and I've prepared pull request 526 to implement it. The PR replaces the existing sentence in 2.3

Variable, dimension, attribute and group names should begin with a letter and be composed of letters, digits, and underscores.

with the wording preferred by @larsbarring and Karl @taylor13

It is recommended that variable, dimension, attribute and group names begin with a letter and be composed of letters, digits, and underscores.

to indicate that this is not a requirement, but a recommendation, as shown by the conformance document. Please could someone check and merge this PR e.g. @larsbarring or @davidhassell?

In addition, I am labelling this issue for consideration as a FAQ, in view of the question from Tim @MTG-Formats "Is there some reference that can be added where users can read the disadvantages/problems they may have if they don't follow the recommendations?" It seems to me that if someone has time it would be useful to summarise the early discussion about the advantages of sticking to the convention in the FAQ, or at least we could refer to this issue as a reference from the FAQ.

Thanks to all for contributions to this issue and to @Dave-Allured for raising it.

PS Discussion 323 on creating a character blacklist is also relevant.

larsbarring · 2024-06-17T11:36:44Z

@JonathanGregory I have just approved and merged the PR. But it just struck me that the label "change agreed" is both correct, as we did agree on some changes, and incorrect, as the changes we agreed on are rather the opposite of the initial suggestion. I wonder whether it would be prudent/relevant/informative to actually use both labels, "change agreed" and "agreement not to change"? This may seem confusing for a reader, but it may seem even more confusing if the reader is searching the conventions text for CF support for a wide range of Unicode characters.

JonathanGregory · 2024-06-17T12:01:54Z

Dear @larsbarring

Thanks for merging the PR. I see your point about "change agreed". It was intended to be the opposite of "agreement not to change". They should be mutually exclusive. The aim is to indicate why the issue was closed. To avoid this possible confusion, I suggest we should rename "change agreed" to something clearer, which indicates that some change was agreed, although not necessarily what was originally proposed. That's often the case, of course. The renamed label would appear in all the places where "change agreed" currently appears, i.e. it's the same identity, just different text.

Best wishes

Jonathan

JonathanGregory · 2024-06-17T12:54:32Z

What do you think of "convention was changed"? Does it avoid the problem you raised?

Dave-Allured added the "enhancement" label Jan 23, 2020

ethanrd mentioned this issue Jan 28, 2020

Feature supporting of different charsets Unidata/netcdf-java#184

Merged

davidhassell mentioned this issue May 11, 2020

Planning for the 2020 CF meeting: Santander, 9-11 June cf-convention/discuss#35

Closed

ethanrd mentioned this issue Nov 9, 2020

Clarify the set of characters allowed in netCDF object names Unidata/netcdf#32

Open

Dave-Allured mentioned this issue Nov 16, 2020

Character set permitted for variable and attribute names. #307

Closed

davidhassell mentioned this issue Jun 25, 2021

Inviting suggestions for breakout discussion topics at the 2021 CF workshop cf-convention/discuss#114

Closed

pp-mo mentioned this issue Mar 18, 2022

Relax varnames pp-mo/ugrid-checks#48

Merged

larsbarring mentioned this issue Aug 3, 2023

Naming Conventions and periods (.) cf-convention/discuss#256

Closed

pp-mo mentioned this issue May 1, 2024

Iris incorrectly failing valid netcdf names SciTools/iris#5929

Open

JonathanGregory added the "defect" label and removed the "enhancement" label May 20, 2024

JonathanGregory mentioned this issue Jun 17, 2024

Clarify that the character set given in section 2.3 for variable, dimension, attribute and group names is a recommendation, not a requirement #526

Merged

JonathanGregory linked a pull request Jun 17, 2024 that will close this issue

Clarify that the character set given in section 2.3 for variable, dimension, attribute and group names is a recommendation, not a requirement #526

Merged

JonathanGregory added the "change agreed" and "frequently asked question" labels Jun 17, 2024

larsbarring closed this as completed in #526 Jun 17, 2024
