Commit

Address #69
Add a reference to 4647 when talking about BCP47

Also moved requirements above explanation per our standard (thanks
@r12a)
aphillips committed Jan 27, 2023
1 parent 7b24ade commit 85fc0dc
Showing 1 changed file with 11 additions and 12 deletions.
23 changes: 11 additions & 12 deletions index.html
@@ -303,10 +303,10 @@ <h3>Defining language values</h3>


<div class="req" id="lang_bcp_not_rfc">
<p class="advisement">Refer to BCP 47, not to RFC 5646.</p>
<p class="advisement">Refer to BCP 47, not to its constituent parts, such as RFC 5646 or RFC 4647.</p>
</div>

<p>The link to and name of BCP 47 was created specifically so that there is an unchanging reference to the definition of Tags for the Identification of Languages. RFCs 1766, 3066, 4646 were previous (superseded) versions and 5646 is the current version of BCP 47.</p>
<p>The link to and name of BCP 47 was created specifically so that there is an unchanging reference to the definition of <cite>Tags for the Identification of Languages</cite>. RFCs 1766, 3066, 4646 were previous (superseded) versions. The current version of BCP 47 is made up of two RFCs: 5646 and 4647.</p>



@@ -2825,6 +2825,13 @@ <h3>Specifying sort and search functionality</h3>

<p>Applications often need to organize sets of information or content. Frequently this involves sorting the content so that users can find what they are looking for. Many data types, such as numbers or dates, are easily sorted by comparing the values. When it comes to textual information, however, the nature of character encodings and user expectations regarding "alphabetical" order brings some additional complexity.</p>

<div class="req" id="char_sort_internal_only">
<p class="advisement">Specifications or implementations that require a program-internal, fast, and deterministic sorting of text which is not intended for human viewing or interaction SHOULD specify that strings are sorted according to their definition of string. For string types based on UTF-16 (such as DOMString or JavaScript), specify <em>ascending code unit</em> order. For data that uses scalar value strings (such as USVString or many XML processes), specify <em>ascending code point</em> order.</p>
<details class="links"><summary>explanations &amp; examples</summary>
<a href="#char_string">Defining 'string'</a>
</details>
</div>

<p>One key choice is whether the sorting of textual data will be shown to users and thus need to be [=locale=]-sensitive (that is, following the sorting rules of a specific language or culture) or whether the sorting is strictly internal. There are two potential internal sorting sequences: ordering by Unicode [=code point=] or ordering by [=code unit=]. For either type of ordering, the resulting list will not match any particular alphabet or lexicographical order.</p>

<p>Sorting by [=code point=] makes sense when strings are stored and processed as a sequence of code points, such as in a <a href="https://webidl.spec.whatwg.org/#idl-USVString">USVString</a>. Sorting by [=code unit=] makes sense when strings are stored and processed using the underlying encoding, such as in a <a href="https://webidl.spec.whatwg.org/#idl-DOMString">DOMString</a>.</p>
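
The difference between the two internal orderings only shows up when supplementary characters are compared with BMP characters above the surrogate range. The following is a minimal sketch, assuming two sample characters and a hypothetical helper name (`utf16_code_unit_key`) that do not come from the specification text:

```python
# Illustrative only: the sample strings and helper name are assumptions,
# not taken from the specification text.

def utf16_code_unit_key(s: str) -> bytes:
    """Sort key that orders strings by ascending UTF-16 code unit.

    Big-endian UTF-16 bytes compare in the same order as the 16-bit
    code units they encode.
    """
    return s.encode("utf-16-be")


# U+1F600 is a supplementary character; U+FF5E is a BMP character.
samples = ["\U0001F600", "\uFF5E"]

# Python compares str values by code point, which matches the ordering
# a USVString-style definition of string would use.
by_code_point = sorted(samples)

# DOMString-style ordering: U+1F600 is stored as the surrogate pair
# D83D DE00, and 0xD83D sorts before 0xFF5E.
by_code_unit = sorted(samples, key=utf16_code_unit_key)

print([f"U+{ord(s):04X}" for s in by_code_point])  # ['U+FF5E', 'U+1F600']
print([f"U+{ord(s):04X}" for s in by_code_unit])   # ['U+1F600', 'U+FF5E']
```

Both results are deterministic, but they disagree, which is why a specification needs to name the ordering that matches its own definition of string.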
@@ -2848,16 +2855,6 @@ <h3>Specifying sort and search functionality</h3>
<p>Note that UTF-8 <em>code unit order</em> (that is, when sorting by byte values in UTF-8 encoded byte strings) is the same as code point order.</p>
</aside>
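
The note about UTF-8 can be checked with a short experiment. This is a sketch under assumptions: the sample strings are arbitrary, well-formed (no lone surrogates), and chosen only to cover one-, two-, three-, and four-byte UTF-8 sequences.

```python
# Illustrative only: the sample strings are assumptions chosen to cover
# 1-, 2-, 3-, and 4-byte UTF-8 sequences.
samples = ["\U0001F600", "\uFF5E", "z", "\u00E9", "A"]

# Python's default str comparison is by code point.
by_code_point = sorted(samples)

# Sorting by the raw UTF-8 bytes of each string.
by_utf8_bytes = sorted(samples, key=lambda s: s.encode("utf-8"))

# Sorting by UTF-16 code units (big-endian bytes preserve unit order).
by_utf16_units = sorted(samples, key=lambda s: s.encode("utf-16-be"))

print(by_utf8_bytes == by_code_point)   # True
print(by_utf16_units == by_code_point)  # False
```

For this sample, the UTF-8 byte order agrees with code point order, while the UTF-16 code unit order does not, because the surrogate pair for U+1F600 sorts before U+FF5E.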


<div class="req" id="char_sort_internal_only">
<p class="advisement">Specifications or implementations that require a program-internal, fast, and deterministic sorting of text which is not intended for human viewing or interaction SHOULD specify that strings are sorted according to their definition of string. For string types based on UTF-16 (such as DOMString or JavaScript), specify <em>ascending code unit</em> order. For data that uses scalar value strings (such as USVString or many XML processes), specify <em>ascending code point</em> order.</p>
<details class="links"><summary>explanations &amp; examples</summary>
<a href="#char_string">Defining 'string'</a>
</details>
</div>

<p>Specifications or applications that need to deal with sorting natural language text for display to users face some additional complexity. Unicode defines a default collation (sorting) order as part of the <cite>Unicode Collation Algorithm</cite> [[UTS10]], which can then be tailored to meet the needs of specific languages, <a>locales</a>, and cultures.</p>


<div class="req" id="char_sort_units">
<p class="advisement">Software that sorts or searches text for display to users SHOULD do so on the basis of appropriate collation units and ordering rules for the relevant language and/or application.</p>
@@ -2866,6 +2863,8 @@ <h3>Specifying sort and search functionality</h3>
</details>
</div>

<p>Specifications or applications that need to deal with sorting natural language text for display to users face some additional complexity. Unicode defines a default collation (sorting) order as part of the <cite>Unicode Collation Algorithm</cite> [[UTS10]], which can then be tailored to meet the needs of specific languages, <a>locales</a>, and cultures.</p>
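
To illustrate why an internal ordering is not suitable for display, the sketch below sorts a small, hypothetical word list by code point. It does not implement the Unicode Collation Algorithm or any tailoring; a real implementation would normally rely on a collation library for that.

```python
# Illustrative only: the word list is an assumption, not taken from the
# specification text. "\u00C4pfel" is "Äpfel" with a precomposed U+00C4.
words = ["Zebra", "apple", "\u00C4pfel", "banana"]

# Plain code point order: uppercase ASCII sorts before lowercase ASCII,
# and U+00C4 sorts after every ASCII letter.
internal_order = sorted(words)

print(internal_order)  # ['Zebra', 'apple', 'banana', 'Äpfel']
```

A reader of German or English would typically expect "Zebra" to come last, and a German reader would not expect "Äpfel" to follow "banana". Meeting those expectations is the job of language-sensitive collation rather than code point or code unit comparison.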

<aside class="issue" id="char_sort_user_issue">
<p>The following requirement is somewhat unclear for specification authors. There are many places where what I'd want to advise specs to do is follow the language (locale) of the given document or of the application or to provide controls so that the application can choose appropriately. The "current user", where it means "operating system" or "user agent host system's locale" or "browser's localization" is not always what is expected.</p>
</aside>
