Commit

Revisions to introduction to address @xfq's comment. Basically I removed the forward definition of Unicode locale (and all of the associated text--moving a relevant bit of it into the section below). I then added text that made clear the purpose of this document.

I also added a reference to string-meta.
aphillips committed Jul 5, 2020
1 parent 60bd306 commit 645ee75
Showing 1 changed file with 8 additions and 14 deletions.
22 changes: 8 additions & 14 deletions index.html
@@ -73,24 +73,16 @@
<section id="introduction">
<h2>Introduction</h2>

<p>Tags for identifying the <a>natural language</a> of content or the <a>international preferences</a> of users (in order to select the natural language of content) are one of the fundamental building blocks of the Web. The language tags found in Web and Internet formats and protocols are defined by [[BCP47]]. Consistent use of language tags provides applications the ability to perform language-specific formatting or processing. For example, a user-agent might use the language to select an appropriate font for displaying text or a Web page designer might style text differently in one language than in another.</p>
<p>Tags for identifying the <a>natural language</a> of content or the <a>international preferences</a> of users (in order to select the natural language of content) are one of the fundamental building blocks of the Web. The <a>language tags</a> found in Web and Internet formats and protocols are defined by [[BCP47]]. Consistent use of language tags provides applications the ability to perform language-specific formatting or processing. For example, a user-agent might use the language to select an appropriate font for displaying text or a Web page designer might style text differently in one language than in another.</p>

<p>In addition, language tags are also used to identify <a>international preferences</a> linked to culture, region, or language. Because these preferences are linked to the natural language or regional association of the end user, [[BCP47]] provides a natural way of encoding these preferences also. Such preferences are applied to processes such as presenting numbers, dates, or times; sorting lists linguistically; providing defaults for items such as the presentation of a calendar, common units of measurement, or 12- vs. 24-hour time presentation; and many other details that users might find too tedious to set individually. Collectively, these preferences are usually called a <a>locale</a>.</p>
<p><a>Language tags</a> can also be used to identify <a>international preferences</a> associated with a given piece of content or user because these preferences are linked to the natural language, regional association, or culture of the end user. Such preferences are applied to processes such as presenting numbers, dates, or times; sorting lists linguistically; providing defaults for items such as the presentation of a calendar, or common units of measurement; selecting between 12- vs. 24-hour time presentation; and many other details that users might find too tedious to set individually. Collectively, these preferences are usually called a <a>locale</a>.</p>

<p>This document describes best practices for the adoption and use of [[BCP47]] language tags for the identification of <a>natural language</a> content as well as the use of language tags to represent the locale preferences of the user or content author. It describes how document formats, specifications, and implementations should handle the language tags described by [[BCP47]], as well as data structures that extend these tags to describe <a>international preferences</a> (see also sec. 3.1 in [[WS-I18N-SCENARIOS]]). </p>

<p>Identification of language and locale has a broad range of applications within the World Wide Web. Existing standards which make use of language identification include the <code>xml:lang</code> attribute in [[XML10]], the <code>lang</code> and <code>hreflang</code> attributes in [[HTML]], the <code>language</code> property in [[XSL10]], and the <code>:lang</code> pseudo-class in CSS [[CSS3-SELECTORS]]. Language tags are also used to identify locales, such as in the Unicode Common Locale Data Repository or "CLDR" project [[CLDR]].</p>
<p>Historically, locales were identified by the programming language or operating environment of the user. This application-specific identifier was often inferred from language tags. For example, an implementation could map a language tag from an existing protocol, such as HTTP's Accept-Language header, to its locale model. Locales may also be identified directly by using the language tag syntax in data items (elements, attributes, headers, etc.) that explicitly serve the purpose of locale identification.</p>
<p>Identification of language and locale has a broad range of applications within the World Wide Web. Existing standards which make use of language identification include the <code>xml:lang</code> attribute in [[XML10]], the <code>lang</code> and <code>hreflang</code> attributes in [[HTML]], the <code>language</code> property in [[XSL10]], and the <code>:lang</code> pseudo-class in CSS [[CSS3-SELECTORS]]. Document formats and protocols often need to identify and exchange information about the <a>natural language</a> of content or perform <a>language negotiation</a> when selecting appropriate content on the Web. For more information and best practices related to this type of language metadata, see [[STRING-META]].</p>

<p>This document's focus is on the interplay between <a>language tags</a> and the <a>internationalization</a> of the Web. Many applications need to provide for the <a>localized</a> presentation of data values (such as numbers or dates). This document provides <a href="#best-practices">best practices</a> for specification authors who need to define language tags in their document formats or protocols, including common operations such as <a>language negotiation</a>. It also provides recommendations for how to specify locale-affected behavior and defines core terminology that specifications might need to refer to these behaviors or capabilities.</p>

<p>Unicode Locales are an extension of [[BCP47]] defined as part of Unicode's [[CLDR]] project. These identifiers are identical to language tags, but apply additional rules about the content of certain language tags. Unicode Locales increasingly form the basis for <a>internationalization</a> on the Web, particularly as part of the <code>Intl</code> locale framework in JavaScript.</p>

<section id="out_of_scope">
<h3>Out of Scope</h3>
<p>This specification does not deal with formats for locale data or
actual locale data. One source of locale data and data formats is the
Unicode Common Locale Data Repository project ([[CLDR]]).</p>
</section>
</section>


@@ -142,9 +134,11 @@ <h3>What are Internationalization and Localization?</h3>

<p><a>Language tags</a> can provide information about the language, script, region, and various specially-registered variants using subtags. But sometimes there are international preferences that do not correlate directly with any of these. For example, many cultures have more than one way of sorting content items, and so the appropriate sort ordering cannot always be inferred from the language tag by itself. Thus a German language user might want to choose between the sort ordering used in a dictionary versus that used in a phone book.</p>
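The German sorting example can be sketched in JavaScript with `Intl.Collator`. This is an illustration only and not part of the commit: the sample words are assumptions chosen to expose the difference, the `co` keyword comes from the Unicode locale extension, and the resulting order depends on the collation data shipped with the runtime.

```javascript
// Illustrative sketch: one language (German), two sort orders.
// The word list is an assumption chosen to expose the difference.
const words = ["Afrika", "Ärzte"];

// Default German collation (dictionary-style ordering).
const dictionary = new Intl.Collator("de");

// Phone book ordering, requested through the Unicode locale "co" keyword.
const phonebook = new Intl.Collator("de-u-co-phonebk");

// Copy before sorting so the original list is left untouched.
const dictionaryOrder = [...words].sort(dictionary.compare);
const phonebookOrder = [...words].sort(phonebook.compare);

console.log(dictionaryOrder);
console.log(phonebookOrder);

// Reports which collation the runtime actually applied; it may fall back
// to the default if phone book data is not available.
console.log(phonebook.resolvedOptions().collation);
```

The base language tag is `de` in both cases; only the extension differs, which is why the sort order cannot be inferred from the language subtag alone.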

<p>Historically, locales were identified by the programming language or operating environment of the user. This application-specific identifier was often inferred from language tags. For example, an implementation could map a language tag from an existing protocol, such as HTTP's Accept-Language header, to its locale model.</p>

<p><dfn data-lt="common locale data repository|CLDR">Common Locale Data Repository</dfn> or <em>CLDR</em> [[CLDR]] is a Unicode Consortium project that defines, collects, and curates sets of <a>locale</a> data needed to <a>enable</a> systems or operating environments. CLDR data and its locale model are widely adopted, particularly in browsers.</p>

<p><dfn>Unicode Locale</dfn>. A language tag extension [[RFC6067]] and additional processing rules defined by [[CLDR]] to support <a>locales</a> defined by Unicode. Unicode locales provide the ability to specify in a language tag a number of the international preference variations that users or content authors might wish to specify directly. The language tag extension uses the <code>-u-</code> subtag.</p>
<p><dfn>Unicode Locale</dfn>. A language tag extension [[RFC6067]] and additional processing rules defined by [[CLDR]] to support <a>locales</a> defined by Unicode. Unicode locales provide the ability to specify in a language tag a number of the international preference variations that users or content authors might wish to specify directly. The language tag extension uses the <code>-u-</code> subtag. These identifiers are identical to language tags, but apply additional rules about the content of certain language tags. Unicode Locales increasingly form the basis for <a>internationalization</a> on the Web, particularly as part of the <code>Intl</code> locale framework in JavaScript.</p>
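As a minimal sketch of how a Unicode locale rides on ordinary language tag syntax, the JavaScript `Intl.Locale` object can split a tag into its base language tag and its `-u-` preferences. The helper name `describeLocale` and the sample tag are hypothetical and chosen only for illustration; they do not come from the commit.

```javascript
// Hypothetical helper (not from the specification): expose the parts of a
// Unicode locale identifier using the standard Intl.Locale object.
function describeLocale(tag) {
  const locale = new Intl.Locale(tag);
  return {
    baseName: locale.baseName,     // the language tag without extensions
    language: locale.language,     // language subtag
    region: locale.region,         // region subtag, if present
    collation: locale.collation,   // value of the "co" keyword, if present
    hourCycle: locale.hourCycle,   // value of the "hc" keyword, if present
  };
}

// Sample tag (an assumption): German as used in Germany, with phone book
// sorting and a 24-hour clock requested through the -u- extension.
const localeParts = describeLocale("de-DE-u-co-phonebk-hc-h23");
console.log(localeParts);
```

Everything before `-u-` is a plain [[BCP47]] language tag, so a consumer that ignores the extension still receives usable language information.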

<p>Unicode's [[CLDR]] project also maintains [[RFC6497]], which defines a [[BCP47]] registered extension (using the <code>-t-</code> subtag) which describes <em>transformations</em> (generally text transformations, such as transliteration between scripts).</p>
