Adopting BCP 14 (RFC 2119, RFC 8174) requirement levels in CF #258

Open
DocOtak opened this issue Aug 17, 2023 · 9 comments
Labels
enhancement Proposals to improve the tables or format of standard names or other controlled vocabulary

Comments

@DocOtak
Member

DocOtak commented Aug 17, 2023

In the discussion of #256, I raised the idea of formally adopting the RFC 2119 requirement level key words and, perhaps foolishly 😉, offered to start the effort if desired. This received immediate support from @larsbarring and @JonathanGregory. So this is a new issue in which I describe what BCP 14 is, what changes would need to be adopted, and what I think the steps to implementing it might be. Adopting this will likely touch every section of the CF document and affect new and existing proposed changes to CF.

What is BCP 14?

RFC 2119 defines a set of key words that indicate the requirement levels of a specification. Requirement levels range from absolute requirements (i.e. not following this recommendation means this file is NOT CF) to things that are truly optional (e.g. the use of standard names on data variables). RFC 8174 updated RFC 2119 to clarify that a key word is a requirement level key word only when it appears in all upper case. For example, "the attribute standard_name may exist on a variable" has no requirement level, whereas "the attribute standard_name MAY exist on a variable" does. Together, these two RFCs make up BCP 14 (Best Current Practice 14).
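
As a rough illustration of the RFC 8174 rule, the Python sketch below treats a word as a requirement level key word only when it is written entirely in upper case. The name `requirement_keywords` and the way the pattern is built are illustrative assumptions, not part of CF or BCP 14.

```python
import re

# The BCP 14 key words as listed in RFC 8174. Negated phrases come first
# so that "MUST NOT" is matched before the bare "MUST".
BCP14_KEYWORDS = [
    "MUST NOT", "SHALL NOT", "SHOULD NOT", "NOT RECOMMENDED",
    "MUST", "REQUIRED", "SHALL", "SHOULD", "RECOMMENDED", "MAY", "OPTIONAL",
]

# Case-sensitive on purpose: lower-case "may" carries no requirement level.
_PATTERN = re.compile(r"\b(?:" + "|".join(BCP14_KEYWORDS) + r")\b")


def requirement_keywords(text):
    """Return the upper-case BCP 14 key words found in text, in order."""
    return _PATTERN.findall(text)


plain = "the attribute standard_name may exist on a variable"
normative = "the attribute standard_name MAY exist on a variable"
print(requirement_keywords(plain))      # []
print(requirement_keywords(normative))  # ['MAY']
```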

Changes to the CF document

There are two changes that would need to be made to the CF document:

  1. Include the following text near the start of the document as per BCP 14:

    The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.

  2. Update all instances of the above key words to their all-capital form wherever they are intended to be interpreted as a requirement level.

Point 2 is where the real effort lies. Here is a breakdown of how many times these words currently appear in CF 1.10:

Key Word         Frequency
MUST             161
MUST NOT         8
REQUIRED         69
SHALL            14
SHALL NOT        3
SHOULD           119
SHOULD NOT       13
RECOMMENDED      43
NOT RECOMMENDED  3
MAY              236
OPTIONAL         93
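
How the figures in this table were counted is not stated. One way such a tally could be produced is sketched below in Python, as an assumption rather than a description of the actual method: matching is case-insensitive, and negated phrases are tried first so that "must not" is not also counted under "must". The name `count_keywords` is hypothetical.

```python
import re
from collections import Counter

KEYWORDS = [
    "MUST NOT", "SHALL NOT", "SHOULD NOT", "NOT RECOMMENDED",
    "MUST", "REQUIRED", "SHALL", "SHOULD", "RECOMMENDED", "MAY", "OPTIONAL",
]

# \s+ lets a two-word phrase match across a line break in the source text.
_PATTERN = re.compile(
    r"\b(?:" + "|".join(k.replace(" ", r"\s+") for k in KEYWORDS) + r")\b",
    re.IGNORECASE,
)


def count_keywords(text):
    """Count key word occurrences regardless of case."""
    counts = Counter({k: 0 for k in KEYWORDS})
    for match in _PATTERN.finditer(text):
        # Normalise case and whitespace so "must\nnot" counts as "MUST NOT".
        counts[" ".join(match.group().upper().split())] += 1
    return counts


sample = (
    "Variables must not share names. A name should be meaningful and May be long.\n"
    "It is not\nrecommended to rely on case."
)
for keyword, n in count_keywords(sample).items():
    if n:
        print(keyword, n)
```

A tally that instead counts "must not" under both MUST and MUST NOT would give different numbers, so the counting rule should be fixed before comparing results between versions of the document.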

Not all these occurrences actually indicate the intent to make a requirement and not all the usages of the above words indicate the intent as defined by BCP 14. The conformance document/section will come in quite handy as the various sections are updated with these key words.

Implementation

Naturally following the Rules for CF Conventions Changes, implementation starts with this discussion. There should be feedback here (this very issue as per rules) but I also feel this is such a large change that discussion should be had at the 2023 CF Workshop (#243) in early October before any "final" decisions are made.

Practically, if we do decide to adopt BCP 14, the fact that the key words must appear in all capital letters means we can adopt it immediately by adding the text in point 1 above (perhaps with a caveat that it is a work in progress), and the CF conventions would be unchanged in meaning. We could then gradually update the text of the document in small pieces to use the all-capital key words. This would avoid the need to prepare a single very large change set, and the work could be done in multiple separate pull requests.

@DocOtak DocOtak added the enhancement Proposals to improve the tables or format of standard names or other controlled vocabulary label Aug 17, 2023
@erget
Member

erget commented Aug 17, 2023

@DocOtak you have my respect for proposing an improvement that would genuinely add value - while being a ton of work! I think this is a good idea. If we decide to go down this route, my recommendation would indeed be either to

  • update the whole document at once, so there's no confusion as to whether or not the key words are ambiguous, or
  • keep track of what documents have already been reviewed for promoting keywords to all-caps or not, so that we don't have islands of ambiguity that we forget about.

The first variant would likely be more reader-friendly, but would prove a lot of work and would need to be done in a short time frame in order to keep us from struggling with merging in all changes that get accepted in parallel.

Very happy to discuss this in greater detail at the upcoming workshop.

@larsbarring

larsbarring commented Aug 17, 2023

@DocOtak, Andrew, again thank you for this excellent summary and outline of the work needed! I agree that this is an undertaking that first should (perhaps even MUST ;-) be discussed at the CF2023 workshop.

And I agree with @erget that it is important to keep the work on updating the text separate enough to not confuse readers of the "official" Working Draft. Not knowing much about github I still imagine that this could be handled by having a totally separate working branch/fork/repo (or whatever it might be called) and then when we are reasonably close to finished the changes can be systematically moved to the main Working Draft?

@sadielbartholomew
Member

I agree it would be best to discuss this at the forthcoming workshop, but to add, I think it is a really good idea overall due to the significant reduction of ambiguity it will lead to, and so I am happy to register my support (and possibly, offer to help implement if extra help is needed and if I can find time). 💯

@JonathanGregory
Contributor

Dear @DocOtak

I would say your offer was generous and valiant, rather than foolish. Thanks again. This is very helpful.

It sounds like BCP 14 has more levels than CF does. The conformance document will clarify which are requirements and recommendations. I suppose these can be equated with two of the BCP levels. We will have to discuss whether other optional things that aren't explicitly recommended, just allowed, and consequently not mentioned in the conformance document, are all in the same level. Also, as I mentioned before, there may be requirements and recommendations on the same levels as those in the conformance document, but they're not in that document because they cannot be checked automatically.

Like Daniel @erget, I would prefer the all-at-once update, but I agree it will take time to prepare. Following Lars's idea, could it be done in a branch of the public repo? I presume people can make pull requests to a branch (other than main) of the public repo from branches in their own repos - right? It would be necessary to pull updates from main to the branch periodically and some of them would conflict, but not most, I guess.

Best wishes

Jonathan

@erget
Member

erget commented Aug 18, 2023

I think we can figure out the details of the update itself once we have the substantive discussion out of the way, but in a nutshell I do think that we can do an update as you @JonathanGregory propose: a public branch where we work on the BCP 14 updates, against which pull requests could be submitted. Should we decide to ask people to continue submitting PRs against the already-published or already-agreed draft, we could still integrate at the end. Ultimately we just need to decide how we want to approach that particular release. In all cases, I think it makes sense to work in open view.

@DocOtak
Member Author

DocOtak commented Aug 18, 2023

Things can absolutely take place publicly in a branch. I don't think merging in changes from the main branch would be too difficult; there are several options (rebase, merge commits, etc.). I also don't think that making this change incrementally rather than as one large changeset changes the effort involved for those of us who would be making the actual modifications. In my opinion, considerations about how external readers might be affected should take precedence in the decision.

@JonathanGregory
Contributor

Dear Andrew @DocOtak

It occurs to me that DEPRECATED should also appear in the list of words to be changed. We have used this word to mean the same as "not recommended". By both "deprecated" and "not recommended" we mean "recommended against".

Cheers

Jonathan

@larsbarring

I did a little bit of playing around with some ideas and put together some python code that paints the selected keywords. Basically it takes the .adoc files located in the parent directory (which is where they are as usual) and adds some asciidoc coloring tags and saves the results in the same directory as the python code. And if you have asciidoctor installed it creates html in a subdirectory. All is available in my fork under branch BCP14, in particular in the subdirectory BCP-14.

Perhaps this could be a starting point to build on, but I have rather limited experience of github and do not know how to set up a branch that will be efficient for collaboration.

Another thing that I thought of is what words/phrases to work on. BCP14 has a clearly defined list. I think that @DocOtak wanted to restrict the work to only these. I like this because this is a rather extensive undertaking anyway. On the other hand, if we are going to do all this work, I think that it might at least be useful to be aware of closely related words/phrases so we don't miss some important context. In the python code I have these preliminary lists (which have to be checked/updated !!):

BCP14 = [
    "MUST NOT", "SHALL NOT", "SHOULD NOT",
    "MUST", "REQUIRED", "SHALL", "SHOULD",
    "RECOMMENDED", "MAY", "OPTIONAL"
]
EXTENDED_BCP14 = [
    "NOT RECOMMENDED", "RECOMMENDS* NOT","RECOMMENDS*",
    "CAN NOT", "COULD NOT", "CAN", "COULD", 
    "DEPRECATED", "HAVE TO"
]
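
A minimal sketch of how lists like these might drive the highlighting, and not the code in the fork. It assumes that the list entries are meant as regular-expression fragments (so `RECOMMENDS*` matches both "recommend" and "recommends") and that each match is wrapped in an AsciiDoc inline role span. The role names `bcp14` and `bcp14-extended` and the function `paint` are hypothetical.

```python
import re

BCP14 = [
    "MUST NOT", "SHALL NOT", "SHOULD NOT",
    "MUST", "REQUIRED", "SHALL", "SHOULD",
    "RECOMMENDED", "MAY", "OPTIONAL",
]
EXTENDED_BCP14 = [
    "NOT RECOMMENDED", "RECOMMENDS* NOT", "RECOMMENDS*",
    "CAN NOT", "COULD NOT", "CAN", "COULD",
    "DEPRECATED", "HAVE TO",
]


def _alternation(words):
    # Longest entries first so "MUST NOT" wins over "MUST". Entries are used
    # as regex fragments (assumption), which is why "S*" is not escaped.
    ordered = sorted(words, key=len, reverse=True)
    return "|".join(w.replace(" ", r"\s+") for w in ordered)


# One combined pattern, applied in a single pass, so that a phrase such as
# "not recommended" is painted once and never re-matched as "recommended".
_PATTERN = re.compile(
    r"\b(?:(?P<ext>" + _alternation(EXTENDED_BCP14) + r")"
    r"|(?P<core>" + _alternation(BCP14) + r"))\b",
    re.IGNORECASE,
)


def paint(text):
    """Wrap each key word in an AsciiDoc inline role span."""
    def wrap(match):
        role = "bcp14" if match.lastgroup == "core" else "bcp14-extended"
        return f"[.{role}]#{match.group()}#"
    return _PATTERN.sub(wrap, text)


print(paint("Names should not differ only by case; this is not recommended."))
# Names [.bcp14]#should not# differ only by case; this is [.bcp14-extended]#not recommended#.
```

The two roles would still need CSS rules to give them colours, and this sketch makes no attempt to skip code listings or attribute names inside the .adoc sources.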

Here is a screen clip of what it looks like:

[image: screen clip]

@JonathanGregory
Contributor

Dear @larsbarring

Thanks for doing this. I agree with you, and I'd nominate permitted and required as two more words we should consider. Your screenshot is a good example of the need for context. The word recommended clarifies that those two statements are both recommendations, not requirements. The first of them appears as a recommendation in the conformance document (Sect 2.3). We are not saying that "names MUST NOT be distinguishable only by case" or that "names MUST be obviously meaningful". The word should by itself has various meanings. E.g. in "The crs_wkt attribute should comprise a text string that conforms to the WKT syntax" (Sect 5.6.1.), the word should means MUST. This statement is given as a requirement in the conformance document (Sect 5.6).

Best wishes

Jonathan
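
Since the same word can carry different requirement levels depending on its context, each occurrence has to be read in its sentence and compared with the conformance document by a person. The Python sketch below shows one way to list candidates together with their sentences for that kind of review. The word list, the naive sentence splitter, and the name `candidates_in_context` are all illustrative assumptions.

```python
import re

# Words whose lower-case use needs a human decision (illustrative subset).
CANDIDATES = ["should", "must", "may", "recommended", "required"]
_WORD = re.compile(r"\b(" + "|".join(CANDIDATES) + r")\b")

# Naive sentence splitter: break at whitespace that follows ., ! or ?.
# Abbreviations such as "Sect." will be split incorrectly.
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def candidates_in_context(text):
    """Yield (word, sentence) for every lower-case candidate key word."""
    for sentence in _SENTENCE_END.split(text):
        for match in _WORD.finditer(sentence):
            yield match.group(1), sentence.strip()


text = (
    "The crs_wkt attribute should comprise a text string. "
    "It is recommended that names be meaningful."
)
for word, sentence in candidates_in_context(text):
    print(f"{word}: {sentence}")
```

Deciding whether a given "should" becomes MUST or SHOULD remains a manual step; the listing only gathers the material to review.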


5 participants