[selectors-4] Canonicalize into extlang form prior to :lang() matching (#4212)

Fixes #4154

Co-authored-by: Chris Lilley <chris@w3.org>
frivoal and svgeesus committed Mar 6, 2024
1 parent 65d3182 commit f2506ef
Showing 1 changed file with 13 additions and 10 deletions.
23 changes: 13 additions & 10 deletions selectors-4/Overview.bs
@@ -27,7 +27,7 @@ At Risk: the column combinator
At Risk: [=user action pseudo-classes=] applying to non-[=tree-abiding=] [=pseudo-elements=]
At Risk: the '':blank'' pseudo-class
Ignored Terms: function token, Document, DocumentFragment, math, h1, shadow tree, querySelector(), quirks mode, button, a, span, object, p, div, q, area, link, label, input, html, em, li, ol, pre, CSS Value Definition Syntax
-Ignored Vars: identifier, extended filtering, i
+Ignored Vars: identifier, i
</pre>
<pre class=link-defaults>
spec:css-values-4; type:dfn; text:identifier
@@ -1977,21 +1977,24 @@ The Language Pseudo-class: '':lang()''</h3>
e.g. '':lang(\*-Latn)'' or '':lang("*-Latn")''.)

Note: The <a>content language</a> of an element is defined by the document language.
-For example, in HTML [[HTML5]], the <a>content language</a> is determined
+
+For example, in HTML [[HTML5]], the <a>content language</a> is determined
by a combination of the <code>lang</code> attribute,
information from <a element>meta</a> elements,
and possibly also the protocol (e.g. from HTTP headers).
XML languages can use the <code>xml:lang</code> attribute
to indicate language information for an element. [[XML10]]

-An element's <a>content language</a> matches a <a>language range</a> if,
-when represented in BCP 47 syntax [[BCP47]],
-it matches that <a>language range</a> in an <var>extended filtering</var>
-operation per [[RFC4647]] <cite>Matching of Language Tags</cite> (section 3.3.2).
-For this purpose, a wildcard [=language range=] (<code>"*"</code>) does not match
-elements whose language is not tagged (e.g. <code>lang=""</code>),
-but does match elements whose language is tagged as undetermined (<code>lang=und</code>).
-The matching is performed [=ASCII case-insensitively=].
+The element's <a>content language</a> matches a <a>language range</a> if
+its <a>content language</a>, as represented in BCP 47 syntax,
+matches the given <a>language range</a> in an <i>extended filtering</i>
+operation per [[!RFC4647]] <cite>Matching of Language Tags</cite> (section 3.3.2).
+Both the [=content language=] and the [=language range=]
+must be <i>canonicalized</i>
+and converted to <i>extlang form</i> as per section 4.5 of [[!RFC5646]]
+prior to the <i>extended filtering</i> operation.
+The matching is performed case-insensitively within the ASCII range.

The <a>language range</a> does not need to be a valid language code to
perform this comparison.
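
Reviewer's sketch (not part of the commit): the matching rule in the added paragraph can be illustrated roughly as below, in Python. This is a minimal sketch under stated assumptions, not the spec's or any browser's implementation. The names `EXTLANG_PREFIX`, `to_extlang_form`, `extended_filter` and `lang_matches` are hypothetical. The extlang table is a tiny hand-picked excerpt standing in for the IANA Language Subtag Registry, and the canonicalization step only prepends the extlang prefix; the other canonicalization steps of RFC 5646 section 4.5 (such as replacing deprecated subtags) are omitted. Treating an untagged element as never matching is also an assumption of this sketch.

```python
# Hypothetical excerpt of extlang -> prefix mappings; a real implementation
# would consult the full IANA Language Subtag Registry.
EXTLANG_PREFIX = {"yue": "zh", "cmn": "zh", "hak": "zh"}


def to_extlang_form(tag: str) -> str:
    """Lowercase the tag and prepend the prefix when the primary subtag is
    a known extlang (e.g. "yue" -> "zh-yue"). Simplified on purpose."""
    subtags = tag.lower().split("-")
    prefix = EXTLANG_PREFIX.get(subtags[0])
    if prefix is not None:
        subtags.insert(0, prefix)
    return "-".join(subtags)


def extended_filter(language_range: str, tag: str) -> bool:
    """Extended filtering following the steps of RFC 4647 section 3.3.2."""
    r = language_range.lower().split("-")
    t = tag.lower().split("-")
    # The first subtags must match unless the range starts with a wildcard.
    if r[0] != "*" and r[0] != t[0]:
        return False
    ri, ti = 1, 1
    while ri < len(r):
        if r[ri] == "*":
            ri += 1            # a wildcard subtag in the range is skipped
        elif ti >= len(t):
            return False       # the tag ran out of subtags
        elif r[ri] == t[ti]:
            ri += 1            # subtags match: advance in both
            ti += 1
        elif len(t[ti]) == 1:
            return False       # a singleton in the tag stops the search
        else:
            ti += 1            # skip a non-matching subtag in the tag
    return True


def lang_matches(language_range: str, content_language: str) -> bool:
    """Canonicalize both sides to extlang form, then apply extended filtering."""
    if content_language == "":
        return False           # assumption: untagged content never matches
    return extended_filter(to_extlang_form(language_range),
                           to_extlang_form(content_language))
```

Under these assumptions, `lang_matches("zh", "yue")` and `lang_matches("yue", "zh-yue")` both succeed because each side is converted to `zh-yue` before filtering, while `lang_matches("de-DE", "de-x-DE")` fails because the singleton `x` stops the search.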
