Unicode terminology and representation updates. #39

gkellogg · 2023-08-23T20:50:32Z

TallTed

Just a few, relatively speaking

spec/index.html

afs · 2023-09-19T07:09:27Z

spec/index.html

          characters.</li>
-        <li>Literals delimited by <code>'''</code> may not contain the sequence of characters
-          <code>'''</code>.</li>
+        <li>Literals delimited by a sequence of three <code>'</code> (apostrophe, code point <code class="codepoint">U+0027</code>) may not contain a sequence of three <code>'</code> (apostrophe, code point <code class="codepoint">U+0027</code>).</li>        


It looks better to actually show the syntax '''.

Perhaps...? (Also taking into account the need to wrap these sequences...)

Suggested change

<li>Literals delimited by a sequence of three <code>'</code> (apostrophe, code point <code class="codepoint">U+0027</code>) may not contain a sequence of three <code>'</code> (apostrophe, code point <code class="codepoint">U+0027</code>).</li>

<li>Literals delimited by a sequence of three '<code>'</code>' (apostrophe,

code point <code class="codepoint">U+0027</code>), i.e.,

'<code>'''</code>', may not contain such a sequence.</li>

or

Suggested change

<li>Literals delimited by a sequence of three <code>'</code> (apostrophe, code point <code class="codepoint">U+0027</code>) may not contain a sequence of three <code>'</code> (apostrophe, code point <code class="codepoint">U+0027</code>).</li>

<li>Literals delimited by a sequence of three "<code>'</code>" (apostrophe,

code point <code class="codepoint">U+0027</code>), i.e.,

"<code>'''</code>", may not contain such a sequence.</li>

different colors are insufficient

Putting "..." around ''' makes it confusing, especially once you factor in readers who are not colour sensitive.

We do tend to quote single characters within the spec, although not entirely consistently right now. The EBNF also quotes these as tokens, so there is precedent. I'll update so that all the characters or sequences in this list are quoted, and generally update to favor the use of double quotes.

I changed these to make them simpler, but still precise. I think we're getting carried away with being overly precise to the point that it makes it actually harder to understand.

Literals delimited by "'''" (sequence of three apostrophes, code point U+0027) may not contain "'''".
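
To make the rule in that bullet concrete, here is a minimal Python sketch. It is not taken from the spec or from any parser: `is_valid_long_body` is a hypothetical helper, and it assumes it receives the raw text between the delimiters and that a backslash escapes the character after it. It checks only the "no run of three" rule stated above, not the other constraints of the grammar.

```python
def is_valid_long_body(body: str, quote: str = "'") -> bool:
    """Return True if `body` contains no run of three unescaped `quote` characters.

    Hypothetical helper for illustration: `body` is the raw text between the
    delimiters, and a backslash escapes whatever character follows it.
    """
    run = 0  # length of the current run of unescaped quote characters
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == "\\":
            # An escaped character cannot be part of a delimiter; skip it.
            i += 2
            run = 0
            continue
        run = run + 1 if ch == quote else 0
        if run == 3:
            return False
        i += 1
    return True
```

The same function covers the quotation mark case by passing `quote='"'`.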

In the grammar table, STRING_LITERAL* has, for example:

The characters between the outermost '"""'s

No colours.

Two of the boxes have open parens with no close.

See 0cd2f05.

spec/index.html

TallTed · 2023-09-19T14:03:28Z

spec/index.html

+        <li>Literals delimited by a sequence of three <code>&quot;</code> (quotation mark, code point <code class="codepoint">U+0022</code>) may not contain a sequence of three <code>&quot;</code> (quotation mark, code point <code class="codepoint">U+0022</code>).</li>
        <li>Literals delimited by <code>"""</code> may not contain the sequence of characters
          <code>"""</code>.</li>


Removing redundant bullet. Adjusting newer text to match even newer suggestion by @afs.

Suggested change

<li>Literals delimited by a sequence of three <code>"</code> (quotation mark, code point <code class="codepoint">U+0022</code>) may not contain a sequence of three <code>"</code> (quotation mark, code point <code class="codepoint">U+0022</code>).</li>

<li>Literals delimited by <code>"""</code> may not contain the sequence of characters

<code>"""</code>.</li>

<li>Literals delimited by a sequence of three '<code>"</code>'

(quotation mark, code point <code class="codepoint">U+0022</code>), i.e.,

'<code>"""</code>', may not contain such a sequence.</li>

afs · 2023-09-20T07:53:43Z

spec/index.html

-      <tr id="handle-STRING_LITERAL_QUOTE"            ><td style="text-align:left;"            ><a class="type lexicalForm"  href="#grammar-production-STRING_LITERAL_QUOTE"            >STRING_LITERAL_QUOTE            </a></td><td><a data-cite="RDF12-CONCEPTS#dfn-lexical-form"> lexical form </a></td><td>The characters between the outermost '"'s   are taken, with <a href="#numeric">numeric</a> and <a href="#string">string</a> escape sequences unescaped, to form the <a data-cite="RDF12-CONCEPTS#dfn-rdf-string">RDF string</a> of a lexical form.</td></tr>
-      <tr id="handle-STRING_LITERAL_LONG_SINGLE_QUOTE"><td style="text-align:left;"            ><a class="type lexicalForm"  href="#grammar-production-STRING_LITERAL_LONG_SINGLE_QUOTE">STRING_LITERAL_LONG_SINGLE_QUOTE</a></td><td><a data-cite="RDF12-CONCEPTS#dfn-lexical-form"> lexical form </a></td><td>The characters between the outermost "'''"s are taken, with <a href="#numeric">numeric</a> and <a href="#string">string</a> escape sequences unescaped, to form the <a data-cite="RDF12-CONCEPTS#dfn-rdf-string">RDF string</a> of a lexical form.</td></tr>
-      <tr id="handle-STRING_LITERAL_LONG_QUOTE"       ><td style="text-align:left;"            ><a class="type lexicalForm"  href="#grammar-production-STRING_LITERAL_LONG_QUOTE"       >STRING_LITERAL_LONG_QUOTE       </a></td><td><a data-cite="RDF12-CONCEPTS#dfn-lexical-form"> lexical form </a></td><td>The characters between the outermost '"""'s (sequence of three quotation mark characters, <code class="codepoint">U+0022</code>are taken, with <a href="#numeric">numeric</a> and <a href="#string">string</a> escape sequences unescaped, to form the <a data-cite="RDF12-CONCEPTS#dfn-rdf-string">RDF string</a> of a lexical form.</td></tr>
+      <tr id="handle-STRING_LITERAL_SINGLE_QUOTE"     ><td style="text-align:left;"            ><a class="type lexicalForm"  href="#grammar-production-STRING_LITERAL_SINGLE_QUOTE"     >STRING_LITERAL_SINGLE_QUOTE     </a></td><td><a data-cite="RDF12-CONCEPTS#dfn-lexical-form"> lexical form </a></td><td>The characters between the outermost "<code>'</code>"s  (apostrophe, <code class="codepoint">U+0027</code>are taken, with <a href="#numeric">numeric</a> and <a href="#string">string</a> escape sequences unescaped, to form the <a data-cite="RDF12-CONCEPTS#dfn-rdf-string">RDF string</a> of a lexical form.</td></tr>


As mentioned elsewhere, I think quoting quotes makes the spec less clear.

So, you would like to see the following used universally: outermost <code>'</code>s (apostrophe, <code class="codepoint">U+0027</code>) and not quote characters or tokens in the text?

afs · 2023-09-20T07:56:10Z

spec/index.html

-      <tr id="handle-STRING_LITERAL_QUOTE"            ><td style="text-align:left;"            ><a class="type lexicalForm"  href="#grammar-production-STRING_LITERAL_QUOTE"            >STRING_LITERAL_QUOTE            </a></td><td><a data-cite="RDF12-CONCEPTS#dfn-lexical-form"> lexical form </a></td><td>The characters between the outermost '"'s   are taken, with <a href="#numeric">numeric</a> and <a href="#string">string</a> escape sequences unescaped, to form the <a data-cite="RDF12-CONCEPTS#dfn-rdf-string">RDF string</a> of a lexical form.</td></tr>
-      <tr id="handle-STRING_LITERAL_LONG_SINGLE_QUOTE"><td style="text-align:left;"            ><a class="type lexicalForm"  href="#grammar-production-STRING_LITERAL_LONG_SINGLE_QUOTE">STRING_LITERAL_LONG_SINGLE_QUOTE</a></td><td><a data-cite="RDF12-CONCEPTS#dfn-lexical-form"> lexical form </a></td><td>The characters between the outermost "'''"s are taken, with <a href="#numeric">numeric</a> and <a href="#string">string</a> escape sequences unescaped, to form the <a data-cite="RDF12-CONCEPTS#dfn-rdf-string">RDF string</a> of a lexical form.</td></tr>
-      <tr id="handle-STRING_LITERAL_LONG_QUOTE"       ><td style="text-align:left;"            ><a class="type lexicalForm"  href="#grammar-production-STRING_LITERAL_LONG_QUOTE"       >STRING_LITERAL_LONG_QUOTE       </a></td><td><a data-cite="RDF12-CONCEPTS#dfn-lexical-form"> lexical form </a></td><td>The characters between the outermost '"""'s (sequence of three quotation mark characters, <code class="codepoint">U+0022</code>are taken, with <a href="#numeric">numeric</a> and <a href="#string">string</a> escape sequences unescaped, to form the <a data-cite="RDF12-CONCEPTS#dfn-rdf-string">RDF string</a> of a lexical form.</td></tr>
+      <tr id="handle-STRING_LITERAL_SINGLE_QUOTE"     ><td style="text-align:left;"            ><a class="type lexicalForm"  href="#grammar-production-STRING_LITERAL_SINGLE_QUOTE"     >STRING_LITERAL_SINGLE_QUOTE     </a></td><td><a data-cite="RDF12-CONCEPTS#dfn-lexical-form"> lexical form </a></td><td>The characters between the outermost "<code>'</code>"s  (apostrophe, <code class="codepoint">U+0027</code>are taken, with <a href="#numeric">numeric</a> and <a href="#string">string</a> escape sequences unescaped, to form the <a data-cite="RDF12-CONCEPTS#dfn-rdf-string">RDF string</a> of a lexical form.</td></tr>


Suggested change

<tr id="handle-STRING_LITERAL_SINGLE_QUOTE" ><td style="text-align:left;" ><a class="type lexicalForm" href="#grammar-production-STRING_LITERAL_SINGLE_QUOTE" >STRING_LITERAL_SINGLE_QUOTE </a></td><td><a data-cite="RDF12-CONCEPTS#dfn-lexical-form"> lexical form </a></td><td>The characters between the outermost "<code>'</code>"s (apostrophe, <code class="codepoint">U+0027</code>are taken, with <a href="#numeric">numeric</a> and <a href="#string">string</a> escape sequences unescaped, to form the <a data-cite="RDF12-CONCEPTS#dfn-rdf-string">RDF string</a> of a lexical form.</td></tr>

<tr id="handle-STRING_LITERAL_SINGLE_QUOTE" ><td style="text-align:left;" ><a class="type lexicalForm" href="#grammar-production-STRING_LITERAL_SINGLE_QUOTE" >STRING_LITERAL_SINGLE_QUOTE </a></td><td><a data-cite="RDF12-CONCEPTS#dfn-lexical-form"> lexical form </a></td><td>The characters between the outermost "<code>'</code>"s (apostrophe, <code class="codepoint">U+0027</code>) are taken, with <a href="#numeric">numeric</a> and <a href="#string">string</a> escape sequences unescaped, to form the <a data-cite="RDF12-CONCEPTS#dfn-rdf-string">RDF string</a> of a lexical form.</td></tr>

Suggested change

<tr id="handle-STRING_LITERAL_SINGLE_QUOTE" ><td style="text-align:left;" ><a class="type lexicalForm" href="#grammar-production-STRING_LITERAL_SINGLE_QUOTE" >STRING_LITERAL_SINGLE_QUOTE </a></td><td><a data-cite="RDF12-CONCEPTS#dfn-lexical-form"> lexical form </a></td><td>The characters between the outermost "<code>'</code>"s (apostrophe, <code class="codepoint">U+0027</code>are taken, with <a href="#numeric">numeric</a> and <a href="#string">string</a> escape sequences unescaped, to form the <a data-cite="RDF12-CONCEPTS#dfn-rdf-string">RDF string</a> of a lexical form.</td></tr>

<tr id="handle-STRING_LITERAL_SINGLE_QUOTE" ><td style="text-align:left;" ><a class="type lexicalForm" href="#grammar-production-STRING_LITERAL_SINGLE_QUOTE" >STRING_LITERAL_SINGLE_QUOTE </a></td><td><a data-cite="RDF12-CONCEPTS#dfn-lexical-form"> lexical form </a></td><td>The characters between the outermost "<code>'</code>"s (apostrophe, <code class="codepoint">U+0027</code>) are taken, after <a href="#numeric">numeric</a> and <a href="#string">string</a> escape sequences are replaced with the characters that they represent, to form the <a data-cite="RDF12-CONCEPTS#dfn-rdf-string">RDF string</a> of a lexical form.</td></tr>

spec/index.html

afs

The quoting for the grammar has become inconsistent with the text.

1 - keywords
2 - Quoting quotes, and not directly showing the 3-quote forms, makes it unclear what to look for in the grammar

TallTed

I fear the excess whitespace that was retained throughout this document, while the linebreaks that made that whitespace make sense as indents were removed, is making things more difficult to comprehend and address.

I suggest that `with <a href="#numeric">numeric</a> and <a href="#string">string</a> escape sequences unescaped` be replaced by `after <a href="#numeric">numeric</a> and <a href="#string">string</a> escape sequences are replaced with the characters that they represent,` throughout, and that similar phrasing be used in place of `unescaped` in any other occurrence.
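
For readers unsure what "replaced with the characters that they represent" amounts to in practice, the following Python sketch may help. It is an illustration only: the function name `unescape` and the escape table are assumptions based on the usual `\uXXXX`, `\UXXXXXXXX`, and single-character escapes, and it does not validate the resulting code points (surrogates, for example, are not rejected).

```python
import re

# String escapes and the characters they stand for (assumed set).
_ECHAR = {
    "t": "\t", "b": "\b", "n": "\n", "r": "\r", "f": "\f",
    '"': '"', "'": "'", "\\": "\\",
}

# One pass over the text: \uXXXX, \UXXXXXXXX, or a single-character escape.
_ESCAPE = re.compile(r"\\(?:u([0-9A-Fa-f]{4})|U([0-9A-Fa-f]{8})|(.))", re.DOTALL)


def unescape(lexical: str) -> str:
    """Replace numeric and string escape sequences with the characters they represent."""
    def replace(match):
        short_hex, long_hex, echar = match.groups()
        if short_hex or long_hex:
            # Numeric escape: the hex digits name a code point.
            return chr(int(short_hex or long_hex, 16))
        if echar in _ECHAR:
            return _ECHAR[echar]
        raise ValueError(f"unknown escape sequence: \\{echar}")

    return _ESCAPE.sub(replace, lexical)
```

Doing the replacement in a single pass matters: an escaped backslash followed by `n` stays a backslash and an `n`, rather than being turned into a newline by a second pass.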

I will have further comment after the above are done to this PR, which I will submit as a subsequent review.

spec/index.html

TallTed · 2023-09-20T14:52:20Z

spec/index.html

-      <tr id="handle-STRING_LITERAL_QUOTE"            ><td style="text-align:left;"            ><a class="type lexicalForm"  href="#grammar-production-STRING_LITERAL_QUOTE"            >STRING_LITERAL_QUOTE            </a></td><td><a data-cite="RDF12-CONCEPTS#dfn-lexical-form"> lexical form </a></td><td>The characters between the outermost '"'s   are taken, with <a href="#numeric">numeric</a> and <a href="#string">string</a> escape sequences unescaped, to form the <a data-cite="RDF12-CONCEPTS#dfn-rdf-string">RDF string</a> of a lexical form.</td></tr>
-      <tr id="handle-STRING_LITERAL_LONG_SINGLE_QUOTE"><td style="text-align:left;"            ><a class="type lexicalForm"  href="#grammar-production-STRING_LITERAL_LONG_SINGLE_QUOTE">STRING_LITERAL_LONG_SINGLE_QUOTE</a></td><td><a data-cite="RDF12-CONCEPTS#dfn-lexical-form"> lexical form </a></td><td>The characters between the outermost "'''"s are taken, with <a href="#numeric">numeric</a> and <a href="#string">string</a> escape sequences unescaped, to form the <a data-cite="RDF12-CONCEPTS#dfn-rdf-string">RDF string</a> of a lexical form.</td></tr>
-      <tr id="handle-STRING_LITERAL_LONG_QUOTE"       ><td style="text-align:left;"            ><a class="type lexicalForm"  href="#grammar-production-STRING_LITERAL_LONG_QUOTE"       >STRING_LITERAL_LONG_QUOTE       </a></td><td><a data-cite="RDF12-CONCEPTS#dfn-lexical-form"> lexical form </a></td><td>The characters between the outermost '"""'s (sequence of three quotation mark characters, <code class="codepoint">U+0022</code>are taken, with <a href="#numeric">numeric</a> and <a href="#string">string</a> escape sequences unescaped, to form the <a data-cite="RDF12-CONCEPTS#dfn-rdf-string">RDF string</a> of a lexical form.</td></tr>
+      <tr id="handle-STRING_LITERAL_SINGLE_QUOTE"     ><td style="text-align:left;"            ><a class="type lexicalForm"  href="#grammar-production-STRING_LITERAL_SINGLE_QUOTE"     >STRING_LITERAL_SINGLE_QUOTE     </a></td><td><a data-cite="RDF12-CONCEPTS#dfn-lexical-form"> lexical form </a></td><td>The characters between the outermost "<code>'</code>"s  (apostrophe, <code class="codepoint">U+0027</code>are taken, with <a href="#numeric">numeric</a> and <a href="#string">string</a> escape sequences unescaped, to form the <a data-cite="RDF12-CONCEPTS#dfn-rdf-string">RDF string</a> of a lexical form.</td></tr>


Suggested change

<tr id="handle-STRING_LITERAL_SINGLE_QUOTE" ><td style="text-align:left;" ><a class="type lexicalForm" href="#grammar-production-STRING_LITERAL_SINGLE_QUOTE" >STRING_LITERAL_SINGLE_QUOTE </a></td><td><a data-cite="RDF12-CONCEPTS#dfn-lexical-form"> lexical form </a></td><td>The characters between the outermost "<code>'</code>"s (apostrophe, <code class="codepoint">U+0027</code>are taken, with <a href="#numeric">numeric</a> and <a href="#string">string</a> escape sequences unescaped, to form the <a data-cite="RDF12-CONCEPTS#dfn-rdf-string">RDF string</a> of a lexical form.</td></tr>

<tr id="handle-STRING_LITERAL_SINGLE_QUOTE" ><td style="text-align:left;" ><a class="type lexicalForm" href="#grammar-production-STRING_LITERAL_SINGLE_QUOTE" >STRING_LITERAL_SINGLE_QUOTE </a></td><td><a data-cite="RDF12-CONCEPTS#dfn-lexical-form"> lexical form </a></td><td>The characters between the outermost "<code>'</code>"s (apostrophe, <code class="codepoint">U+0027</code>) are taken, after <a href="#numeric">numeric</a> and <a href="#string">string</a> escape sequences are replaced with the characters that they represent, to form the <a data-cite="RDF12-CONCEPTS#dfn-rdf-string">RDF string</a> of a lexical form.</td></tr>

spec/index.html

TallTed · 2023-09-20T19:07:26Z

spec/index.html

      <tr id="handle-STRING_LITERAL_QUOTE"            ><td style="text-align:left;"            ><a class="type lexicalForm"  href="#grammar-production-STRING_LITERAL_QUOTE"            >STRING_LITERAL_QUOTE            </a></td><td><a data-cite="RDF12-CONCEPTS#dfn-lexical-form"> lexical form </a></td><td>The characters between the outermost '<code>"</code>'s   are taken, with <a href="#numeric">numeric</a> and <a href="#string">string</a> escape sequences unescaped, to form the <a data-cite="RDF12-CONCEPTS#dfn-rdf-string">RDF string</a> of a lexical form.</td></tr>
      <tr id="handle-STRING_LITERAL_LONG_SINGLE_QUOTE"><td style="text-align:left;"            ><a class="type lexicalForm"  href="#grammar-production-STRING_LITERAL_LONG_SINGLE_QUOTE">STRING_LITERAL_LONG_SINGLE_QUOTE</a></td><td><a data-cite="RDF12-CONCEPTS#dfn-lexical-form"> lexical form </a></td><td>The characters between the outermost "<code>'''</code>"s are taken, with <a href="#numeric">numeric</a> and <a href="#string">string</a> escape sequences unescaped, to form the <a data-cite="RDF12-CONCEPTS#dfn-rdf-string">RDF string</a> of a lexical form.</td></tr>


Suggested change

<tr id="handle-STRING_LITERAL_QUOTE" ><td style="text-align:left;" ><a class="type lexicalForm" href="#grammar-production-STRING_LITERAL_QUOTE" >STRING_LITERAL_QUOTE </a></td><td><a data-cite="RDF12-CONCEPTS#dfn-lexical-form"> lexical form </a></td><td>The characters between the outermost '<code>"</code>'s are taken, with <a href="#numeric">numeric</a> and <a href="#string">string</a> escape sequences unescaped, to form the <a data-cite="RDF12-CONCEPTS#dfn-rdf-string">RDF string</a> of a lexical form.</td></tr>

<tr id="handle-STRING_LITERAL_LONG_SINGLE_QUOTE"><td style="text-align:left;" ><a class="type lexicalForm" href="#grammar-production-STRING_LITERAL_LONG_SINGLE_QUOTE">STRING_LITERAL_LONG_SINGLE_QUOTE</a></td><td><a data-cite="RDF12-CONCEPTS#dfn-lexical-form"> lexical form </a></td><td>The characters between the outermost "<code>'''</code>"s are taken, with <a href="#numeric">numeric</a> and <a href="#string">string</a> escape sequences unescaped, to form the <a data-cite="RDF12-CONCEPTS#dfn-rdf-string">RDF string</a> of a lexical form.</td></tr>

<tr id="handle-STRING_LITERAL_QUOTE"><td style="text-align:left;"><a class="type lexicalForm" href="#grammar-production-STRING_LITERAL_QUOTE">STRING_LITERAL_QUOTE </a></td><td><a data-cite="RDF12-CONCEPTS#dfn-lexical-form"> lexical form </a></td><td>The characters between the outermost '<code>"</code>'s are taken, after <a href="#numeric">numeric</a> and <a href="#string">string</a> escape sequences are replaced with the characters that they represent, to form the <a data-cite="RDF12-CONCEPTS#dfn-rdf-string">RDF string</a> of a lexical form.</td></tr>

<tr id="handle-STRING_LITERAL_LONG_SINGLE_QUOTE"><td style="text-align:left;"><a class="type lexicalForm" href="#grammar-production-STRING_LITERAL_LONG_SINGLE_QUOTE">STRING_LITERAL_LONG_SINGLE_QUOTE</a></td><td><a data-cite="RDF12-CONCEPTS#dfn-lexical-form"> lexical form </a></td><td>The characters between the outermost "<code>'''</code>"s are taken, after <a href="#numeric">numeric</a> and <a href="#string">string</a> escape sequences are replaced with the characters that they represent, to form the <a data-cite="RDF12-CONCEPTS#dfn-rdf-string">RDF string</a> of a lexical form.</td></tr>

spec/index.html

Co-authored-by: Ted Thibodeau Jr <tthibodeau@openlinksw.com>

gkellogg · 2023-09-23T21:01:18Z

The rebased branch addresses the various points raised by @afs and @TallTed. Generally, we avoid quotes around characters and tokens now. The BNF is also updated to use a consistent quote style.

spec/index.html

TallTed · 2023-09-28T01:53:44Z

spec/index.html

@@ -252,10 +254,10 @@ <h3>Object Lists</h3>
    <p>
      As with predicates often objects are repeated with the same subject and predicate.
      The <a href="#grammar-production-objectList">objectList production</a>
-      matches a series of objects separated by '<code>,</code>' following a predicate.
+      matches a series of objects separated by <code>,</code> following a predicate.
      This expresses a series of RDF Triples with the corresponding subject and predicate


I think "corresponding" is the wrong word here, but I can't quickly come up with the right one... Perhaps later, or someone else will.

Suggested change

This expresses a series of RDF Triples with the corresponding subject and predicate

This expresses a series of RDF Triples with the same subject and predicate
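
As a quick illustration of why "same" reads more naturally here, the expansion an object list implies can be sketched in Python. This is a toy model, not parser code: terms are plain strings and `expand_object_list` is a hypothetical name.

```python
from typing import Iterable, List, Tuple

Triple = Tuple[str, str, str]


def expand_object_list(subject: str, predicate: str, objects: Iterable[str]) -> List[Triple]:
    """Return one triple per object, all sharing the same subject and predicate."""
    return [(subject, predicate, obj) for obj in objects]
```

Every resulting triple repeats the one subject and the one predicate; only the object varies.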

spec/index.html

Co-authored-by: Ted Thibodeau Jr <tthibodeau@openlinksw.com>

gkellogg · 2023-09-28T05:50:54Z

I accepted the last suggestions for spelling out code points for each character used as a token. However, I feel that this really disrupts the flow of the document and repeats these code point values excessively. I think a future editorial round should consider describing the character tokens explicitly in a "How to Read this Document" section, with each use of a character token linking back into that list. For example:

# How to Read this Document

The following characters and tokens are used throughout this document and have specific Unicode code points:

<dl>
<dt><code id="cp-colon">:</code></dt>
<dd>"colon", code point <code class="codepoint">U+003A</code></dd>
</dl>

Then subsequent uses of that character token could be like the following:

A <span id="prefixed-name"><dfn>prefixed name</dfn></span> is a prefix label and a local part,  separated by a <a href="#cp-colon"><code>:</code></a>.

This would be a pattern to add to the Editor's Guide to follow in other specifications.
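
If that pattern were adopted, the list entries could plausibly be generated rather than typed by hand. The Python sketch below is only an illustration of the idea: `definition_entry` and the anchor naming are assumptions, and the character names come from the standard `unicodedata` module rather than from any spec tooling.

```python
import unicodedata
from html import escape


def definition_entry(char: str, anchor: str) -> str:
    """Build one <dt>/<dd> pair; `anchor` is the id that later links point at."""
    name = unicodedata.name(char).lower()   # e.g. "colon"
    code_point = f"U+{ord(char):04X}"       # e.g. "U+003A"
    return (
        f'<dt><code id="{anchor}">{escape(char)}</code></dt>\n'
        f'<dd>"{name}", code point <code class="codepoint">{code_point}</code></dd>'
    )
```

Calling `definition_entry(":", "cp-colon")` yields an entry of the same shape as the hand-written example, with the name and code point derived from the character itself so the two cannot drift apart.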

TallTed · 2023-09-28T15:34:44Z

Some of the bits that seem redundant in my latest suggestions, including within the same sentence, could be rephrased. For instance --

 <p>While the <code>@prefix</code> and <code>@base</code> directives
        require a trailing <code>.</code> (full stop, <code class="codepoint">U+002E</code>) after the IRI,
        the equivalent <code>PREFIX</code> and <code>BASE</code>
        must not have a trailing <code>.</code> (full stop, <code class="codepoint">U+002E</code>) after the IRI part of the directive.
        The <code>PREFIX</code> and <code>BASE</code> are case-insensitive
        and can be written as <code>prefix</code> and <code>base</code>
        or use mixed case.

--- might work as --

 <p>While the <code>@prefix</code> and <code>@base</code> directives
        <em class="rfc2119">MUST</em>,
        the equivalent <code>PREFIX</code> and <code>BASE</code>
        <em class="rfc2119">MUST NOT</em>, have a trailing <code>.</code> (full stop,
        <code class="codepoint">U+002E</code>) after the IRI part of the directive.
        The <code>PREFIX</code> and <code>BASE</code> are case-insensitive
        and can be written as <code>prefix</code> and <code>base</code>
        or use mixed case.

I could probably live with the "How to Read this Document" section.

My concern remains that MANY of the visually "clear" and "obvious" punctuation characters are, in fact, visually ambiguous, especially but not only to readers who are more accustomed to non-Latin character sets, no matter that Latin character sets are typical for W3C and/or SDO documents.

(The EBNF presentations will remain quite problematic, but I don't have a good way to fix those, short of using names and/or code points for all such characters as I suggested previously, which I'll grant tend to make the EBNF much harder to read, which is the only reason I haven't fought for them.)

gkellogg · 2023-09-28T20:48:58Z

For instance --

 <p>While the <code>@prefix</code> and <code>@base</code> directives
        require a trailing <code>.</code> (full stop, <code class="codepoint">U+002E</code>) after the IRI,
        the equivalent <code>PREFIX</code> and <code>BASE</code>
        must not have a trailing <code>.</code> (full stop, <code class="codepoint">U+002E</code>) after the IRI part of the directive.
        The <code>PREFIX</code> and <code>BASE</code> are case-insensitive
        and can be written as <code>prefix</code> and <code>base</code>
        or use mixed case.

--- might work as --

 <p>While the <code>@prefix</code> and <code>@base</code> directives
        <em class="rfc2119">MUST</em>,
        the equivalent <code>PREFIX</code> and <code>BASE</code>
        <em class="rfc2119">MUST NOT</em>, have a trailing <code>.</code> (full stop,
        <code class="codepoint">U+002E</code>) after the IRI part of the directive.
        The <code>PREFIX</code> and <code>BASE</code> are case-insensitive
        and can be written as <code>prefix</code> and <code>base</code>
        or use mixed case.

Leaving this for now; this is a Note, so normative keywords MUST and MUST NOT are not in force. Thus, the use of "must not" instead of "MUST NOT". I'll change to "do not", which I think is better informative language.

TallTed · 2023-09-29T21:49:11Z

[@gkellogg] Leaving this for now; this is a Note, so normative keywords MUST and MUST NOT are not in force. Thus, the use of "must not" instead of "MUST NOT". I'll change to "do not", which I think is better informative language.

I wish I'd seen this before you merged it. I think the same strength of language should be used on both PREFIX and @prefix, so I suggest --

<p>While the <code>@prefix</code> and <code>@base</code> directives
        require a trailing <code>.</code> (full stop, 
        <code class="codepoint">U+002E</code>) after the IRI,
        the equivalent <code>PREFIX</code> and <code>BASE</code> directives
        require that there be no trailing <code>.</code> (full stop, 
        <code class="codepoint">U+002E</code>) after the IRI.
        The <code>PREFIX</code> and <code>BASE</code> are case-insensitive
        and can be written as <code>prefix</code> and <code>base</code>
        or use mixed case.

-- which I'll make into a PR if needed.
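
The asymmetry being described can be sketched as a small Python check. This is a deliberately loose illustration, not the grammar: the patterns for the prefix label and IRI are simplified assumptions, and `is_valid_directive` is a hypothetical name.

```python
import re

# Simplified patterns (assumption): the prefix label and IRI are matched loosely.
# @prefix / @base require a trailing full stop (U+002E) after the IRI.
_AT_STYLE = re.compile(r"^@(prefix\s+\S*:|base)\s+<[^>]*>\s*\.$")
# PREFIX / BASE are case-insensitive and take no trailing full stop.
_KEYWORD_STYLE = re.compile(r"^(prefix\s+\S*:|base)\s+<[^>]*>$", re.IGNORECASE)


def is_valid_directive(line: str) -> bool:
    """Return True if `line` matches one of the two directive styles."""
    line = line.strip()
    return bool(_AT_STYLE.match(line) or _KEYWORD_STYLE.match(line))
```

Under these assumptions, `@prefix ex: <http://example.org/> .` and `PrEfIx ex: <http://example.org/>` are both accepted, while swapping the trailing full stop between the two styles is rejected in either direction.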

gkellogg added the spec:editorial label (minor issue or proposed change in the specification: markup, typo, informative text) Aug 23, 2023

gkellogg requested review from domel and TallTed August 23, 2023 20:50

TallTed suggested changes Aug 23, 2023

TallTed reviewed Aug 23, 2023

TallTed reviewed Aug 24, 2023

gkellogg requested a review from TallTed September 16, 2023 21:11

domel approved these changes Sep 16, 2023

gkellogg force-pushed the unicode branch from af35ee1 to 0b135ca Compare September 18, 2023 21:15

TallTed suggested changes Sep 18, 2023

Tpt reviewed Sep 19, 2023

afs reviewed Sep 19, 2023

TallTed reviewed Sep 19, 2023

afs reviewed Sep 20, 2023

afs requested changes Sep 20, 2023

TallTed suggested changes Sep 20, 2023

TallTed reviewed Sep 20, 2023

TallTed suggested changes Sep 21, 2023

gkellogg and others added 9 commits September 23, 2023 12:54

Unicode terminology and representation updates.

d0e0709

Apply suggestions from code review

edccb91

Co-authored-by: Ted Thibodeau Jr <tthibodeau@openlinksw.com>

More spelled out characters.

2c2cf0d

Reorder character ranges for readability

66d8fb5

Co-authored-by: Ted Thibodeau Jr <tthibodeau@openlinksw.com>

Minor grammar update

5f2a89b

Co-authored-by: Ted Thibodeau Jr <tthibodeau@openlinksw.com>

Updates leveraging "RDF string" definition in RDF Concepts.

e23dc4d

Use code elements for codepoints.

3c60ac7

Apply suggestions from code review

b1573af

Co-authored-by: Ted Thibodeau Jr <tthibodeau@openlinksw.com>

Fix markup

6be9ef0

gkellogg and others added 12 commits September 23, 2023 12:54

Be more explicit about codes for '_' and '.'.

9eb0465

More consistency in character representation.

9d3f28a

Use of \UXXXXXXXX escapes

bc9a340

Quotes in grammar table.

a434638

Apply suggestions from code review

804ef0c

Co-authored-by: Ted Thibodeau Jr <tthibodeau@openlinksw.com>

Apply suggestions from code review

7b0c4ad

Co-authored-by: Ted Thibodeau Jr <tthibodeau@openlinksw.com>

Update quotes in EBNF HTML grammar to match that from the raw EBNF.

d8c9057

Update EBNF (raw and HTML) to make 'a', 'true', and 'false' case-sensitive.

a67dbd8

Re-align table elements in grammar table.

1854796

Make use of quoted tokens consistent in BNF.

f19f54e

Make sure examples brought in through data-include are marked as examples.

ab4a930

Remove most quotes around single characters and tokens.

4b27b1c

gkellogg force-pushed the unicode branch from c906b82 to 4b27b1c Compare September 23, 2023 20:58

gkellogg requested review from afs, TallTed and Tpt September 23, 2023 21:01

afs approved these changes Sep 27, 2023

TallTed suggested changes Sep 28, 2023

More explicit code points for characters

77cf2b5

Co-authored-by: Ted Thibodeau Jr <tthibodeau@openlinksw.com>

Tpt approved these changes Sep 28, 2023

Minor fixes suggested from code review.

51bfdc9

gkellogg merged commit e882b8a into main Sep 28, 2023
2 checks passed

gkellogg deleted the unicode branch September 28, 2023 20:52

gkellogg mentioned this pull request Sep 28, 2023

Simplify character token representation #45

Closed
