Unicode terminology and representation updates. #39

Merged
merged 23 commits into from
Sep 28, 2023

Conversation

Member

@gkellogg gkellogg commented Aug 23, 2023

@gkellogg gkellogg added the spec:editorial Minor issue or proposed change in the specification (markup, typo, informative text) label Aug 23, 2023
@gkellogg gkellogg requested review from domel and TallTed August 23, 2023 20:50
Member

@TallTed TallTed left a comment

Just a few, relatively speaking

spec/index.html Outdated
characters.</li>
<li>Literals delimited by <code>'''</code> may not contain the sequence of characters
<code>'''</code>.</li>
<li>Literals delimited by a sequence of three <code>'</code> (apostrophe, code point <code class="codepoint">U+0027</code>) may not contain a sequence of three <code>'</code> (apostrophe, code point <code class="codepoint">U+0027</code>).</li>
Contributor

It looks better to actually show the syntax '''.

Member

@TallTed TallTed Sep 19, 2023

Perhaps...? (Also taking into account the need to wrap these sequences...)

Suggested change
<li>Literals delimited by a sequence of three <code>'</code> (apostrophe, code point <code class="codepoint">U+0027</code>) may not contain a sequence of three <code>'</code> (apostrophe, code point <code class="codepoint">U+0027</code>).</li>
<li>Literals delimited by a sequence of three '<code>'</code>' (apostrophe,
code point <code class="codepoint">U+0027</code>), i.e.,
'<code>'''</code>', may not contain such a sequence.</li>

Member

or

Suggested change
<li>Literals delimited by a sequence of three <code>'</code> (apostrophe, code point <code class="codepoint">U+0027</code>) may not contain a sequence of three <code>'</code> (apostrophe, code point <code class="codepoint">U+0027</code>).</li>
<li>Literals delimited by a sequence of three "<code>'</code>" (apostrophe,
code point <code class="codepoint">U+0027</code>), i.e.,
"<code>'''</code>", may not contain such a sequence.</li>

Contributor

different colors are insufficient

Putting "..." around ''' makes it confusing, especially if you factor in not being colour sensitive.

Member Author

We do tend to quote single characters within the spec, although not entirely consistently right now. The EBNF also quotes these as tokens, so there is precedent. I'll update so that all the characters or sequences in this list are quoted, and generally update to favor the use of double quotes.

Member Author

I changed these to make them simpler, but still precise. I think we're getting carried away with being overly precise to the point that it makes it actually harder to understand.

Literals delimited by "'''" (sequence of three apostrophes, code point U+0027) may not contain "'''".
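
For anyone checking the simplified wording against behaviour, the rule can be sketched in Python. This is only an illustration under stated assumptions: the function name `contains_unescaped_delimiter` is hypothetical, a backslash is assumed to escape exactly the next character, and only the three-quote rule is checked, not the rest of the string-literal grammar.

```python
def contains_unescaped_delimiter(body: str, quote: str = "'") -> bool:
    """Return True if `body` holds three consecutive unescaped `quote` characters.

    Hypothetical helper: a backslash is assumed to escape the next character,
    so an escaped quote breaks the run of consecutive quotes.
    """
    run = 0  # length of the current run of unescaped quote characters
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == "\\":
            # Skip the backslash and the character it escapes.
            run = 0
            i += 2
            continue
        if ch == quote:
            run += 1
            if run == 3:
                return True
        else:
            run = 0
        i += 1
    return False
```

Under these assumptions, a body such as `a'''b` would be rejected for a literal delimited by three apostrophes, while the same body would be acceptable inside a literal delimited by three quotation marks.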

Contributor

In the grammar table, STRING_LITERAL* has, for example:

The characters between the outermost '"""'s

No colours.

Two of the boxes have open parens with no close.

Member Author

See 0cd2f05.

spec/index.html Outdated
Comment on lines 483 to 485
<li>Literals delimited by a sequence of three <code>&quot;</code> (quotation mark, code point <code class="codepoint">U+0022</code>) may not contain a sequence of three <code>&quot;</code> (quotation mark, code point <code class="codepoint">U+0022</code>).</li>
<li>Literals delimited by <code>"""</code> may not contain the sequence of characters
<code>"""</code>.</li>
Member

@TallTed TallTed Sep 19, 2023

Removing redundant bullet. Adjusting newer text to match even newer suggestion by @afs.

Suggested change
<li>Literals delimited by a sequence of three <code>&quot;</code> (quotation mark, code point <code class="codepoint">U+0022</code>) may not contain a sequence of three <code>&quot;</code> (quotation mark, code point <code class="codepoint">U+0022</code>).</li>
<li>Literals delimited by <code>"""</code> may not contain the sequence of characters
<code>"""</code>.</li>
<li>Literals delimited by a sequence of three '<code>&quot;</code>'
(quotation mark, code point <code class="codepoint">U+0022</code>), i.e.,
'<code>&quot;&quot;&quot;</code>', may not contain such a sequence.</li>

spec/index.html Outdated
<tr id="handle-STRING_LITERAL_QUOTE" ><td style="text-align:left;" ><a class="type lexicalForm" href="#grammar-production-STRING_LITERAL_QUOTE" >STRING_LITERAL_QUOTE </a></td><td><a data-cite="RDF12-CONCEPTS#dfn-lexical-form"> lexical form </a></td><td>The characters between the outermost '"'s are taken, with <a href="#numeric">numeric</a> and <a href="#string">string</a> escape sequences unescaped, to form the <a data-cite="RDF12-CONCEPTS#dfn-rdf-string">RDF string</a> of a lexical form.</td></tr>
<tr id="handle-STRING_LITERAL_LONG_SINGLE_QUOTE"><td style="text-align:left;" ><a class="type lexicalForm" href="#grammar-production-STRING_LITERAL_LONG_SINGLE_QUOTE">STRING_LITERAL_LONG_SINGLE_QUOTE</a></td><td><a data-cite="RDF12-CONCEPTS#dfn-lexical-form"> lexical form </a></td><td>The characters between the outermost "'''"s are taken, with <a href="#numeric">numeric</a> and <a href="#string">string</a> escape sequences unescaped, to form the <a data-cite="RDF12-CONCEPTS#dfn-rdf-string">RDF string</a> of a lexical form.</td></tr>
<tr id="handle-STRING_LITERAL_LONG_QUOTE" ><td style="text-align:left;" ><a class="type lexicalForm" href="#grammar-production-STRING_LITERAL_LONG_QUOTE" >STRING_LITERAL_LONG_QUOTE </a></td><td><a data-cite="RDF12-CONCEPTS#dfn-lexical-form"> lexical form </a></td><td>The characters between the outermost '"""'s (sequence of three quotation mark characters, <code class="codepoint">U+0022</code>are taken, with <a href="#numeric">numeric</a> and <a href="#string">string</a> escape sequences unescaped, to form the <a data-cite="RDF12-CONCEPTS#dfn-rdf-string">RDF string</a> of a lexical form.</td></tr>
<tr id="handle-STRING_LITERAL_SINGLE_QUOTE" ><td style="text-align:left;" ><a class="type lexicalForm" href="#grammar-production-STRING_LITERAL_SINGLE_QUOTE" >STRING_LITERAL_SINGLE_QUOTE </a></td><td><a data-cite="RDF12-CONCEPTS#dfn-lexical-form"> lexical form </a></td><td>The characters between the outermost "<code>'</code>"s (apostrophe, <code class="codepoint">U+0027</code>are taken, with <a href="#numeric">numeric</a> and <a href="#string">string</a> escape sequences unescaped, to form the <a data-cite="RDF12-CONCEPTS#dfn-rdf-string">RDF string</a> of a lexical form.</td></tr>
Contributor

@afs afs Sep 20, 2023

As mentioned elsewhere, I think quoting quotes makes the spec less clear.

Member Author

So, you would like to see the following used universally: outermost <code>'</code>s (apostrophe, <code class="codepoint">U+0027</code>) and not quote characters or tokens in the text?

spec/index.html Outdated
<tr id="handle-STRING_LITERAL_QUOTE" ><td style="text-align:left;" ><a class="type lexicalForm" href="#grammar-production-STRING_LITERAL_QUOTE" >STRING_LITERAL_QUOTE </a></td><td><a data-cite="RDF12-CONCEPTS#dfn-lexical-form"> lexical form </a></td><td>The characters between the outermost '"'s are taken, with <a href="#numeric">numeric</a> and <a href="#string">string</a> escape sequences unescaped, to form the <a data-cite="RDF12-CONCEPTS#dfn-rdf-string">RDF string</a> of a lexical form.</td></tr>
<tr id="handle-STRING_LITERAL_LONG_SINGLE_QUOTE"><td style="text-align:left;" ><a class="type lexicalForm" href="#grammar-production-STRING_LITERAL_LONG_SINGLE_QUOTE">STRING_LITERAL_LONG_SINGLE_QUOTE</a></td><td><a data-cite="RDF12-CONCEPTS#dfn-lexical-form"> lexical form </a></td><td>The characters between the outermost "'''"s are taken, with <a href="#numeric">numeric</a> and <a href="#string">string</a> escape sequences unescaped, to form the <a data-cite="RDF12-CONCEPTS#dfn-rdf-string">RDF string</a> of a lexical form.</td></tr>
<tr id="handle-STRING_LITERAL_LONG_QUOTE" ><td style="text-align:left;" ><a class="type lexicalForm" href="#grammar-production-STRING_LITERAL_LONG_QUOTE" >STRING_LITERAL_LONG_QUOTE </a></td><td><a data-cite="RDF12-CONCEPTS#dfn-lexical-form"> lexical form </a></td><td>The characters between the outermost '"""'s (sequence of three quotation mark characters, <code class="codepoint">U+0022</code>are taken, with <a href="#numeric">numeric</a> and <a href="#string">string</a> escape sequences unescaped, to form the <a data-cite="RDF12-CONCEPTS#dfn-rdf-string">RDF string</a> of a lexical form.</td></tr>
<tr id="handle-STRING_LITERAL_SINGLE_QUOTE" ><td style="text-align:left;" ><a class="type lexicalForm" href="#grammar-production-STRING_LITERAL_SINGLE_QUOTE" >STRING_LITERAL_SINGLE_QUOTE </a></td><td><a data-cite="RDF12-CONCEPTS#dfn-lexical-form"> lexical form </a></td><td>The characters between the outermost "<code>'</code>"s (apostrophe, <code class="codepoint">U+0027</code>are taken, with <a href="#numeric">numeric</a> and <a href="#string">string</a> escape sequences unescaped, to form the <a data-cite="RDF12-CONCEPTS#dfn-rdf-string">RDF string</a> of a lexical form.</td></tr>
Contributor

Suggested change
<tr id="handle-STRING_LITERAL_SINGLE_QUOTE" ><td style="text-align:left;" ><a class="type lexicalForm" href="#grammar-production-STRING_LITERAL_SINGLE_QUOTE" >STRING_LITERAL_SINGLE_QUOTE </a></td><td><a data-cite="RDF12-CONCEPTS#dfn-lexical-form"> lexical form </a></td><td>The characters between the outermost "<code>'</code>"s (apostrophe, <code class="codepoint">U+0027</code>are taken, with <a href="#numeric">numeric</a> and <a href="#string">string</a> escape sequences unescaped, to form the <a data-cite="RDF12-CONCEPTS#dfn-rdf-string">RDF string</a> of a lexical form.</td></tr>
<tr id="handle-STRING_LITERAL_SINGLE_QUOTE" ><td style="text-align:left;" ><a class="type lexicalForm" href="#grammar-production-STRING_LITERAL_SINGLE_QUOTE" >STRING_LITERAL_SINGLE_QUOTE </a></td><td><a data-cite="RDF12-CONCEPTS#dfn-lexical-form"> lexical form </a></td><td>The characters between the outermost "<code>'</code>"s (apostrophe, <code class="codepoint">U+0027</code>) are taken, with <a href="#numeric">numeric</a> and <a href="#string">string</a> escape sequences unescaped, to form the <a data-cite="RDF12-CONCEPTS#dfn-rdf-string">RDF string</a> of a lexical form.</td></tr>

Member

Suggested change
<tr id="handle-STRING_LITERAL_SINGLE_QUOTE" ><td style="text-align:left;" ><a class="type lexicalForm" href="#grammar-production-STRING_LITERAL_SINGLE_QUOTE" >STRING_LITERAL_SINGLE_QUOTE </a></td><td><a data-cite="RDF12-CONCEPTS#dfn-lexical-form"> lexical form </a></td><td>The characters between the outermost "<code>'</code>"s (apostrophe, <code class="codepoint">U+0027</code>are taken, with <a href="#numeric">numeric</a> and <a href="#string">string</a> escape sequences unescaped, to form the <a data-cite="RDF12-CONCEPTS#dfn-rdf-string">RDF string</a> of a lexical form.</td></tr>
<tr id="handle-STRING_LITERAL_SINGLE_QUOTE" ><td style="text-align:left;" ><a class="type lexicalForm" href="#grammar-production-STRING_LITERAL_SINGLE_QUOTE" >STRING_LITERAL_SINGLE_QUOTE </a></td><td><a data-cite="RDF12-CONCEPTS#dfn-lexical-form"> lexical form </a></td><td>The characters between the outermost "<code>'</code>"s (apostrophe, <code class="codepoint">U+0027</code>) are taken, after <a href="#numeric">numeric</a> and <a href="#string">string</a> escape sequences are replaced with the characters that they represent, to form the <a data-cite="RDF12-CONCEPTS#dfn-rdf-string">RDF string</a> of a lexical form.</td></tr>

Contributor

@afs afs left a comment

The quoting for the grammar has become inconsistent with the text.

1 - keywords
2 - Quoting quotes and lack of directly saying 3-quote forms makes it unclear what to look for in the grammar

Member

@TallTed TallTed left a comment

I fear the excess whitespace that was retained throughout this document, while the linebreaks that made that whitespace make sense as indents were removed, is making things more difficult to comprehend and address.

I suggest that with <a href="#numeric">numeric</a> and <a href="#string">string</a> escape sequences unescaped be replaced by after <a href="#numeric">numeric</a> and <a href="#string">string</a> escape sequences are replaced with the characters that they represent, throughout, and that similar phrasing be used in place of unescaped in any other occurrence.

I will have further comment after the above are done to this PR, which I will submit as a subsequent review.
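
To make "replaced with the characters that they represent" concrete, here is a minimal Python sketch of that replacement. The function name `unescape` and the helper names are hypothetical. It assumes the string escapes are `\t`, `\b`, `\n`, `\r`, `\f`, `\"`, `\'`, and `\\`, and that the numeric escapes are `\uXXXX` and `\UXXXXXXXX`. It does not reject malformed escapes, and a `\U` value above U+10FFFF would raise `ValueError`.

```python
import re

# Assumed string escapes and the characters they stand for.
_STRING_ESCAPES = {
    "t": "\t", "b": "\b", "n": "\n", "r": "\r", "f": "\f",
    '"': '"', "'": "'", "\\": "\\",
}

# One alternation so that escapes are consumed left to right, never twice.
_ESCAPE = re.compile(r"\\(?:u([0-9A-Fa-f]{4})|U([0-9A-Fa-f]{8})|([tbnrf\"'\\]))")


def unescape(body: str) -> str:
    """Replace numeric and string escape sequences with their characters."""
    def replace(match: re.Match) -> str:
        if match.group(1):  # \uXXXX
            return chr(int(match.group(1), 16))
        if match.group(2):  # \UXXXXXXXX
            return chr(int(match.group(2), 16))
        return _STRING_ESCAPES[match.group(3)]
    return _ESCAPE.sub(replace, body)
```

Using a single pattern matters here: an escaped backslash followed by `n` yields a backslash and the letter `n`, not a newline.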

spec/index.html Outdated
<tr id="handle-STRING_LITERAL_QUOTE" ><td style="text-align:left;" ><a class="type lexicalForm" href="#grammar-production-STRING_LITERAL_QUOTE" >STRING_LITERAL_QUOTE </a></td><td><a data-cite="RDF12-CONCEPTS#dfn-lexical-form"> lexical form </a></td><td>The characters between the outermost '"'s are taken, with <a href="#numeric">numeric</a> and <a href="#string">string</a> escape sequences unescaped, to form the <a data-cite="RDF12-CONCEPTS#dfn-rdf-string">RDF string</a> of a lexical form.</td></tr>
<tr id="handle-STRING_LITERAL_LONG_SINGLE_QUOTE"><td style="text-align:left;" ><a class="type lexicalForm" href="#grammar-production-STRING_LITERAL_LONG_SINGLE_QUOTE">STRING_LITERAL_LONG_SINGLE_QUOTE</a></td><td><a data-cite="RDF12-CONCEPTS#dfn-lexical-form"> lexical form </a></td><td>The characters between the outermost "'''"s are taken, with <a href="#numeric">numeric</a> and <a href="#string">string</a> escape sequences unescaped, to form the <a data-cite="RDF12-CONCEPTS#dfn-rdf-string">RDF string</a> of a lexical form.</td></tr>
<tr id="handle-STRING_LITERAL_LONG_QUOTE" ><td style="text-align:left;" ><a class="type lexicalForm" href="#grammar-production-STRING_LITERAL_LONG_QUOTE" >STRING_LITERAL_LONG_QUOTE </a></td><td><a data-cite="RDF12-CONCEPTS#dfn-lexical-form"> lexical form </a></td><td>The characters between the outermost '"""'s (sequence of three quotation mark characters, <code class="codepoint">U+0022</code>are taken, with <a href="#numeric">numeric</a> and <a href="#string">string</a> escape sequences unescaped, to form the <a data-cite="RDF12-CONCEPTS#dfn-rdf-string">RDF string</a> of a lexical form.</td></tr>
<tr id="handle-STRING_LITERAL_SINGLE_QUOTE" ><td style="text-align:left;" ><a class="type lexicalForm" href="#grammar-production-STRING_LITERAL_SINGLE_QUOTE" >STRING_LITERAL_SINGLE_QUOTE </a></td><td><a data-cite="RDF12-CONCEPTS#dfn-lexical-form"> lexical form </a></td><td>The characters between the outermost "<code>'</code>"s (apostrophe, <code class="codepoint">U+0027</code>are taken, with <a href="#numeric">numeric</a> and <a href="#string">string</a> escape sequences unescaped, to form the <a data-cite="RDF12-CONCEPTS#dfn-rdf-string">RDF string</a> of a lexical form.</td></tr>
Member

Suggested change
<tr id="handle-STRING_LITERAL_SINGLE_QUOTE" ><td style="text-align:left;" ><a class="type lexicalForm" href="#grammar-production-STRING_LITERAL_SINGLE_QUOTE" >STRING_LITERAL_SINGLE_QUOTE </a></td><td><a data-cite="RDF12-CONCEPTS#dfn-lexical-form"> lexical form </a></td><td>The characters between the outermost "<code>'</code>"s (apostrophe, <code class="codepoint">U+0027</code>are taken, with <a href="#numeric">numeric</a> and <a href="#string">string</a> escape sequences unescaped, to form the <a data-cite="RDF12-CONCEPTS#dfn-rdf-string">RDF string</a> of a lexical form.</td></tr>
<tr id="handle-STRING_LITERAL_SINGLE_QUOTE" ><td style="text-align:left;" ><a class="type lexicalForm" href="#grammar-production-STRING_LITERAL_SINGLE_QUOTE" >STRING_LITERAL_SINGLE_QUOTE </a></td><td><a data-cite="RDF12-CONCEPTS#dfn-lexical-form"> lexical form </a></td><td>The characters between the outermost "<code>'</code>"s (apostrophe, <code class="codepoint">U+0027</code>) are taken, after <a href="#numeric">numeric</a> and <a href="#string">string</a> escape sequences are replaced with the characters that they represent, to form the <a data-cite="RDF12-CONCEPTS#dfn-rdf-string">RDF string</a> of a lexical form.</td></tr>

spec/index.html Outdated
Comment on lines 1392 to 1393
<tr id="handle-STRING_LITERAL_QUOTE" ><td style="text-align:left;" ><a class="type lexicalForm" href="#grammar-production-STRING_LITERAL_QUOTE" >STRING_LITERAL_QUOTE </a></td><td><a data-cite="RDF12-CONCEPTS#dfn-lexical-form"> lexical form </a></td><td>The characters between the outermost '<code>"</code>'s are taken, with <a href="#numeric">numeric</a> and <a href="#string">string</a> escape sequences unescaped, to form the <a data-cite="RDF12-CONCEPTS#dfn-rdf-string">RDF string</a> of a lexical form.</td></tr>
<tr id="handle-STRING_LITERAL_LONG_SINGLE_QUOTE"><td style="text-align:left;" ><a class="type lexicalForm" href="#grammar-production-STRING_LITERAL_LONG_SINGLE_QUOTE">STRING_LITERAL_LONG_SINGLE_QUOTE</a></td><td><a data-cite="RDF12-CONCEPTS#dfn-lexical-form"> lexical form </a></td><td>The characters between the outermost "<code>'''</code>"s are taken, with <a href="#numeric">numeric</a> and <a href="#string">string</a> escape sequences unescaped, to form the <a data-cite="RDF12-CONCEPTS#dfn-rdf-string">RDF string</a> of a lexical form.</td></tr>
Member

Suggested change
<tr id="handle-STRING_LITERAL_QUOTE" ><td style="text-align:left;" ><a class="type lexicalForm" href="#grammar-production-STRING_LITERAL_QUOTE" >STRING_LITERAL_QUOTE </a></td><td><a data-cite="RDF12-CONCEPTS#dfn-lexical-form"> lexical form </a></td><td>The characters between the outermost '<code>"</code>'s are taken, with <a href="#numeric">numeric</a> and <a href="#string">string</a> escape sequences unescaped, to form the <a data-cite="RDF12-CONCEPTS#dfn-rdf-string">RDF string</a> of a lexical form.</td></tr>
<tr id="handle-STRING_LITERAL_LONG_SINGLE_QUOTE"><td style="text-align:left;" ><a class="type lexicalForm" href="#grammar-production-STRING_LITERAL_LONG_SINGLE_QUOTE">STRING_LITERAL_LONG_SINGLE_QUOTE</a></td><td><a data-cite="RDF12-CONCEPTS#dfn-lexical-form"> lexical form </a></td><td>The characters between the outermost "<code>'''</code>"s are taken, with <a href="#numeric">numeric</a> and <a href="#string">string</a> escape sequences unescaped, to form the <a data-cite="RDF12-CONCEPTS#dfn-rdf-string">RDF string</a> of a lexical form.</td></tr>
<tr id="handle-STRING_LITERAL_QUOTE"><td style="text-align:left;"><a class="type lexicalForm" href="#grammar-production-STRING_LITERAL_QUOTE">STRING_LITERAL_QUOTE </a></td><td><a data-cite="RDF12-CONCEPTS#dfn-lexical-form"> lexical form </a></td><td>The characters between the outermost '<code>"</code>'s are taken, after <a href="#numeric">numeric</a> and <a href="#string">string</a> escape sequences are replaced with the characters that they represent, to form the <a data-cite="RDF12-CONCEPTS#dfn-rdf-string">RDF string</a> of a lexical form.</td></tr>
<tr id="handle-STRING_LITERAL_LONG_SINGLE_QUOTE"><td style="text-align:left;"><a class="type lexicalForm" href="#grammar-production-STRING_LITERAL_LONG_SINGLE_QUOTE">STRING_LITERAL_LONG_SINGLE_QUOTE</a></td><td><a data-cite="RDF12-CONCEPTS#dfn-lexical-form"> lexical form </a></td><td>The characters between the outermost "<code>'''</code>"s are taken, after <a href="#numeric">numeric</a> and <a href="#string">string</a> escape sequences are replaced with the characters that they represent, to form the <a data-cite="RDF12-CONCEPTS#dfn-rdf-string">RDF string</a> of a lexical form.</td></tr>

gkellogg and others added 9 commits September 23, 2023 12:54
Co-authored-by: Ted Thibodeau Jr <tthibodeau@openlinksw.com>
@gkellogg
Member Author

The rebased branch addresses the various points raised by @afs and @TallTed. Generally, we avoid quotes around characters and tokens now. The BNF is also updated to use a consistent quote style.

spec/index.html Outdated
@@ -252,10 +254,10 @@ <h3>Object Lists</h3>
<p>
As with predicates often objects are repeated with the same subject and predicate.
The <a href="#grammar-production-objectList">objectList production</a>
matches a series of objects separated by '<code>,</code>' following a predicate.
matches a series of objects separated by <code>,</code> following a predicate.
This expresses a series of RDF Triples with the corresponding subject and predicate
Member

I think "corresponding" is the wrong word here, but I can't quickly come up with the right one... Perhaps later, or someone else will.

Member

Suggested change
This expresses a series of RDF Triples with the corresponding subject and predicate
This expresses a series of RDF Triples with the same subject and predicate
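
As a quick check on the "same subject and predicate" wording, the expansion can be sketched in Python. The function name `expand_object_list` and the example terms are hypothetical, and terms are treated as opaque strings rather than parsed.

```python
def expand_object_list(subject: str, predicate: str, objects: list[str]) -> list[tuple[str, str, str]]:
    """Return one (subject, predicate, object) triple per listed object."""
    return [(subject, predicate, obj) for obj in objects]


# Hypothetical terms: two objects separated by "," after one predicate.
triples = expand_object_list("<#alice>", "foaf:knows", ["<#bob>", "<#carol>"])
```

Every resulting triple repeats the subject and predicate unchanged, which is what "same" says more directly than "corresponding".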

Co-authored-by: Ted Thibodeau Jr <tthibodeau@openlinksw.com>
@gkellogg
Member Author

I accepted the last suggestions for spelling out code points for each character used as a token. But I feel that this is really messing up the flow of the document and repeating these code point values excessively. I think a future editorial round should consider describing the character tokens explicitly in a "How to Read this Document" section, with each use of a character linking back into this list. For example:

# How to Read this Document

The following characters and tokens are used throughout this document and have specific Unicode code points:

<dl>
<dt><code id="cp-colon">:</code></dt>
<dd>"colon", code point <code class="codepoint">U+003A</code></dd>
</dl>

Then subsequent uses of that character token could be like the following:

A <span id="prefixed-name"><dfn>prefixed name</dfn></span> is a prefix label and a local part,  separated by a <a href="#cp-colon"><code>:</code></a>.

This would be a pattern to add to the Editor's Guide to follow in other specifications.
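
If such a list were adopted, the name and code point text for each entry could be generated rather than typed by hand. A minimal Python sketch, assuming lowercase Unicode character names are acceptable wording; the function name `describe` is hypothetical:

```python
import unicodedata


def describe(ch: str) -> str:
    """Return text such as 'colon, code point U+003A' for a single character."""
    name = unicodedata.name(ch).lower()
    return f"{name}, code point U+{ord(ch):04X}"


# Characters discussed in this PR.
entries = {ch: describe(ch) for ch in (":", ".", "'", '"')}
```

Generating the text this way would also keep the names consistent with the Unicode character names (for example "full stop" and "quotation mark").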

@TallTed
Member

TallTed commented Sep 28, 2023

Some of the bits that seem redundant in my latest suggestions, including within the same sentence, could be rephrased. For instance --

 <p>While the <code>@prefix</code> and <code>@base</code> directives
        require a trailing <code>.</code> (full stop, <code class="codepoint">U+002E</code>) after the IRI,
        the equivalent <code>PREFIX</code> and <code>BASE</code>
        must not have a trailing <code>.</code> (full stop, <code class="codepoint">U+002E</code>) after the IRI part of the directive.
        The <code>PREFIX</code> and <code>BASE</code> are case-insensitive
        and can be written as <code>prefix</code> and <code>base</code>
        or use mixed case.

--- might work as --

 <p>While the <code>@prefix</code> and <code>@base</code> directives
        <em class="rfc2119">MUST</em>,
        the equivalent <code>PREFIX</code> and <code>BASE</code>
        <em class="rfc2119">MUST NOT</em>, have a trailing <code>.</code> (full stop,
        <code class="codepoint">U+002E</code>) after the IRI part of the directive.
        The <code>PREFIX</code> and <code>BASE</code> are case-insensitive
        and can be written as <code>prefix</code> and <code>base</code>
        or use mixed case.

I could probably live with the "How to Read this Document" section.

My concern remains that MANY of the visually "clear" and "obvious" punctuation characters are, in fact, visually ambiguous, especially but not only to readers who are more accustomed to non-Latin character sets, no matter that Latin character sets are typical for W3C and/or SDO documents.

(The EBNF presentations will remain quite problematic, but I don't have a good way to fix those, short of using names and/or code points for all such characters as I suggested previously, which I'll grant tend to make the EBNF much harder to read, which is the only reason I haven't fought for them.)

@gkellogg
Member Author

For instance --

 <p>While the <code>@prefix</code> and <code>@base</code> directives
        require a trailing <code>.</code> (full stop, <code class="codepoint">U+002E</code>) after the IRI,
        the equivalent <code>PREFIX</code> and <code>BASE</code>
        must not have a trailing <code>.</code> (full stop, <code class="codepoint">U+002E</code>) after the IRI part of the directive.
        The <code>PREFIX</code> and <code>BASE</code> are case-insensitive
        and can be written as <code>prefix</code> and <code>base</code>
        or use mixed case.

--- might work as --

 <p>While the <code>@prefix</code> and <code>@base</code> directives
        <em class="rfc2119">MUST</em>,
        the equivalent <code>PREFIX</code> and <code>BASE</code>
        <em class="rfc2119">MUST NOT</em>, have a trailing <code>.</code> (full stop,
        <code class="codepoint">U+002E</code>) after the IRI part of the directive.
        The <code>PREFIX</code> and <code>BASE</code> are case-insensitive
        and can be written as <code>prefix</code> and <code>base</code>
        or use mixed case.

Leaving this for now; this is a Note, so normative keywords MUST and MUST NOT are not in force. Thus, the use of "must not" instead of "MUST NOT". I'll change to "do not", which I think is better informative language.

@gkellogg gkellogg merged commit e882b8a into main Sep 28, 2023
2 checks passed
@gkellogg gkellogg deleted the unicode branch September 28, 2023 20:52
@TallTed
Member

TallTed commented Sep 29, 2023

[@gkellogg] Leaving this for now; this is a Note, so normative keywords MUST and MUST NOT are not in force. Thus, the use of "must not" instead of "MUST NOT". I'll change to "do not", which I think is better informative language.

I wish I'd seen this before you merged it. I think the same strength of language should be used on both PREFIX and @prefix, so I suggest --

<p>While the <code>@prefix</code> and <code>@base</code> directives
        require a trailing <code>.</code> (full stop, 
        <code class="codepoint">U+002E</code>) after the IRI,
        the equivalent <code>PREFIX</code> and <code>BASE</code> directives
        require that there be no trailing <code>.</code> (full stop, 
        <code class="codepoint">U+002E</code>) after the IRI.
        The <code>PREFIX</code> and <code>BASE</code> are case-insensitive
        and can be written as <code>prefix</code> and <code>base</code>
        or use mixed case.

-- which I'll make into a PR if needed.
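
For reference, the asymmetry described in that paragraph can be sketched as a small Python check. This is an illustration only: the function name `classify_directive` and the return labels are hypothetical, the prefix-label pattern is a deliberate simplification of the real grammar, and each directive is assumed to sit on one line.

```python
import re

# Simplified patterns (assumptions, not the full grammar productions).
_IRIREF = r"<[^<>\"{}|^`\\\x00-\x20]*>"
_PNAME_NS = r"(?:[A-Za-z][\w.-]*)?:"

# @prefix / @base: lowercase only, trailing "." required.
_AT_FORM = re.compile(rf"@(?:prefix\s+{_PNAME_NS}\s*|base\s+){_IRIREF}\s*\.")
# PREFIX / BASE: case-insensitive, no trailing "." allowed.
_KEYWORD_FORM = re.compile(
    rf"(?:PREFIX\s+{_PNAME_NS}\s*|BASE\s+){_IRIREF}", re.IGNORECASE
)


def classify_directive(line: str) -> str:
    """Return 'at-form', 'keyword-form', or 'invalid' for one directive line."""
    text = line.strip()
    if _AT_FORM.fullmatch(text):
        return "at-form"
    if _KEYWORD_FORM.fullmatch(text):
        return "keyword-form"
    return "invalid"
```

With this sketch, `@prefix ex: <http://example.org/> .` and `prefix ex: <http://example.org/>` are both accepted, while dropping the full stop from the first or adding one to the second makes the line invalid.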

Labels
spec:editorial Minor issue or proposed change in the specification (markup, typo, informative text)
Projects
None yet

5 participants