
[e] (0) apply wg decision

Fixing http://www.w3.org/Bugs/Public/show_bug.cgi?id=8207

git-svn-id: http://svn.whatwg.org/webapps@6007 340c8d12-0b0e-0410-8428-c7bf67bfef74
1 parent 26203de commit c2695e182da54fbe020a5eaf91808916382d0055 @Hixie Hixie committed Apr 14, 2011
Showing with 848 additions and 130 deletions.
  1. +272 −42 complete.html
  2. +272 −42 index
  3. +304 −46 source
314 complete.html
@@ -372,8 +372,10 @@ <h2 class="no-num no-toc" id=contents>Table of contents</h2>
<li><a href=#urls><span class=secno>2.6 </span>URLs</a>
<ol>
<li><a href=#terminology-0><span class=secno>2.6.1 </span>Terminology</a></li>
- <li><a href=#dynamic-changes-to-base-urls><span class=secno>2.6.2 </span>Dynamic changes to base URLs</a></li>
- <li><a href=#interfaces-for-url-manipulation><span class=secno>2.6.3 </span>Interfaces for URL manipulation</a></ol></li>
+ <li><a href=#parsing-urls><span class=secno>2.6.2 </span>Parsing URLs</a></li>
+ <li><a href=#resolving-urls><span class=secno>2.6.3 </span>Resolving URLs</a></li>
+ <li><a href=#dynamic-changes-to-base-urls><span class=secno>2.6.4 </span>Dynamic changes to base URLs</a></li>
+ <li><a href=#interfaces-for-url-manipulation><span class=secno>2.6.5 </span>Interfaces for URL manipulation</a></ol></li>
<li><a href=#fetching-resources><span class=secno>2.7 </span>Fetching resources</a>
<ol>
<li><a href=#concept-http-equivalent><span class=secno>2.7.1 </span>Protocol concepts</a></li>
@@ -5998,9 +6000,21 @@ <h4 id=mq><span class=secno>2.5.10 </span>Media queries</h4>
<h3 id=urls><span class=secno>2.6 </span>URLs</h3>
- <h4 id=terminology-0><span class=secno>2.6.1 </span>Terminology</h4>
+ <p>This specification defines the term <a href=#url>URL</a>, and defines
+ various algorithms for dealing with URLs, because for historical
+ reasons the rules defined by the URI and IRI specifications are not
+ a complete description of what HTML user agents need to implement to
+ be compatible with Web content.</p>
+
+ <p class=note>The term "URL" in this specification is used in a
+ manner distinct from the precise technical meaning it is given in
+ RFC 3986. Readers familiar with that RFC will find it easier to read
+ <em>this</em> specification if they pretend the term "URL" as used
+ herein is really called something else altogether. This is a
+ <a href=#willful-violation>willful violation</a> of RFC 3986. <a href=#refsRFC3986>[RFC3986]</a></p>
+
- <!-- see also: svn diff -r3244:3245 source -->
+ <h4 id=terminology-0><span class=secno>2.6.1 </span>Terminology</h4>
<p>A <dfn id=url>URL</dfn> is a string used to identify a resource.</p>
@@ -6031,24 +6045,155 @@ <h4 id=terminology-0><span class=secno>2.6.1 </span>Terminology</h4>
whitespace">stripping leading and trailing whitespace</a> from
it, it is a <a href=#valid-non-empty-url>valid non-empty URL</a>.</p>
+ <p>This specification defines the URL
+ <dfn id=about:legacy-compat><code>about:legacy-compat</code></dfn> as a reserved, though
+ unresolvable, <code title="">about:</code> URI, for use in <a href=#syntax-doctype title=syntax-doctype>DOCTYPE</a>s in <a href=#html-documents>HTML
+ documents</a> when needed for compatibility with XML tools. <a href=#refsABOUT>[ABOUT]</a></p>
+
+ <p>This specification defines the URL
+ <dfn id=about:srcdoc><code>about:srcdoc</code></dfn> as a reserved, though
+ unresolvable, <code title="">about:</code> URI, that is used as
+ <a href="#the-document's-address">the document's address</a> of <a href=#an-iframe-srcdoc-document title="an iframe srcdoc
+ document"><code>iframe</code> <code title=attr-iframe-srcdoc>srcdoc</code> documents</a>. <a href=#refsABOUT>[ABOUT]</a></p>
+
+
<div class=impl>
+ <h4 id=parsing-urls><span class=secno>2.6.2 </span>Parsing URLs</h4>
+
<p>To <dfn id=parse-a-url>parse a URL</dfn> <var title="">url</var> into its
- component parts, the user agent must use the <span class=XXX>parse
- an address</span> algorithm defined by the IRI specification. <a href=#refsRFC3987>[RFC3987]</a></p>
-
- <p>Parsing a URL can fail. If it does not, then it results in the
- following components, again as defined by the IRI specification:</p>
-
- <ul class=brief><li><dfn id=url-scheme title=url-scheme>&lt;scheme&gt;</dfn></li>
- <li><dfn id=url-host title=url-host>&lt;host&gt;</dfn></li>
- <li><dfn id=url-port title=url-port>&lt;port&gt;</dfn></li>
- <li><dfn id=url-hostport title=url-hostport>&lt;hostport&gt;</dfn></li>
- <li><dfn id=url-path title=url-path>&lt;path&gt;</dfn></li>
- <li><dfn id=url-query title=url-query>&lt;query&gt;</dfn></li>
- <li><dfn id=url-fragment title=url-fragment>&lt;fragment&gt;</dfn></li>
- <li><dfn id=url-host-specific title=url-host-specific>&lt;host-specific&gt;</dfn></li>
- </ul><hr><p>To <dfn id=resolve-a-url>resolve a URL</dfn> to an <a href=#absolute-url>absolute URL</a>
+ component parts, the user agent must use the following steps:</p>
+
+ <ol><li><p>Strip leading and trailing <a href=#space-character title="space
+ character">space characters</a> from <var title="">url</var>.</li>
+
+ <li>
+
+ <p>Parse <var title="">url</var> in the manner defined by RFC
+ 3986, with the following exceptions:</p>
+
+ <ul><li>Add all characters with code points less than or equal to
+ U+0020 or greater than or equal to U+007F to the
+ &lt;unreserved&gt; production.</li>
+
+ <li>Add the characters U+0022, U+003C, U+003E, U+005B .. U+005E,
+ U+0060, and U+007B .. U+007D to the &lt;unreserved&gt;
+ production.
+ <!--
+ 0022 QUOTATION MARK
+ 003C LESS-THAN SIGN
+ 003E GREATER-THAN SIGN
+ 005B LEFT SQUARE BRACKET
+ 005C REVERSE SOLIDUS
+ 005D RIGHT SQUARE BRACKET
+ 005E CIRCUMFLEX ACCENT
+ 0060 GRAVE ACCENT
+ 007B LEFT CURLY BRACKET
+ 007C VERTICAL LINE
+ 007D RIGHT CURLY BRACKET
+ -->
+ </li>
+
+ <li>Add a single U+0025 PERCENT SIGN character as a second
+ alternative way of matching the &lt;pct-encoded&gt; production,
+ except when the &lt;pct-encoded&gt; is used in the
+ &lt;reg-name&gt; production.</li>
+
+ <li>Add the U+0023 NUMBER SIGN character to the characters
+ allowed in the &lt;fragment&gt; production.</li>
+
+ <!-- some browsers also have other differences, e.g. Mozilla
+ seems to treat ";" as if it was not in sub-delims, if the scheme
+ is "ftp". -->
+
+ </ul></li>
+
+ <li>
+
+ <p>If <var title="">url</var> doesn't match the
+ &lt;URI-reference&gt; production, even after the above changes are
+ made to the ABNF definitions, then parsing the URL fails with an
+ error. <a href=#refsRFC3986>[RFC3986]</a></p>
+
+ <p>Otherwise, parsing <var title="">url</var> was successful; the
+ components of the URL are substrings of <var title="">url</var>
+ defined as follows:</p>
+
+ <dl><dt><dfn id=url-scheme title=url-scheme>&lt;scheme&gt;</dfn></dt>
+
+ <dd><p>The substring matched by the &lt;scheme&gt; production, if any.</dd>
+
+
+ <dt><dfn id=url-host title=url-host>&lt;host&gt;</dfn></dt>
+
+ <dd><p>The substring matched by the &lt;host&gt; production, if any.</dd>
+
+
+ <dt><dfn id=url-port title=url-port>&lt;port&gt;</dfn></dt>
+
+ <dd><p>The substring matched by the &lt;port&gt; production, if any.</dd>
+
+
+ <dt><dfn id=url-hostport title=url-hostport>&lt;hostport&gt;</dfn></dt>
+
+ <dd><p>If there is a &lt;scheme&gt; component and a &lt;port&gt;
+ component and the port given by the &lt;port&gt; component is
+ different than the default port defined for the protocol given by
+ the &lt;scheme&gt; component, then &lt;hostport&gt; is the
+ substring that starts with the substring matched by the
+ &lt;host&gt; production and ends with the substring matched by the
+ &lt;port&gt; production, and includes the colon in between the
+ two. Otherwise, it is the same as the &lt;host&gt; component.</p>
+
+
+ <dt><dfn id=url-path title=url-path>&lt;path&gt;</dfn></dt>
+
+ <dd>
+
+ <p>The substring matched by one of the following productions, if
+ one of them was matched:</p>
+
+ <ul class=brief><li>&lt;path-abempty&gt;</li>
+ <li>&lt;path-absolute&gt;</li>
+ <li>&lt;path-noscheme&gt;</li>
+ <li>&lt;path-rootless&gt;</li>
+ <li>&lt;path-empty&gt;</li>
+ </ul></dd>
+
+
+ <dt><dfn id=url-query title=url-query>&lt;query&gt;</dfn></dt>
+
+ <dd><p>The substring matched by the &lt;query&gt; production, if any.</dd>
+
+
+ <dt><dfn id=url-fragment title=url-fragment>&lt;fragment&gt;</dfn></dt>
+
+ <dd><p>The substring matched by the &lt;fragment&gt; production, if any.</dd>
+
+
+ <dt><dfn id=url-host-specific title=url-host-specific>&lt;host-specific&gt;</dfn></dt>
+
+ <dd><p>The substring that <em>follows</em> the substring matched
+ by the &lt;authority&gt; production, or the whole string if the
+ &lt;authority&gt; production wasn't matched.</dd>
+
+ </dl></li>
+
+ </ol><p class=note>These parsing rules are a <a href=#willful-violation>willful
+ violation</a> of RFC 3986 and RFC 3987 (which do not define error
+ handling), motivated by a desire to handle legacy content. <a href=#refsRFC3986>[RFC3986]</a> <a href=#refsRFC3987>[RFC3987]</a></p>
+
+ </div>
+
+
+ <h4 id=resolving-urls><span class=secno>2.6.3 </span>Resolving URLs</h4>
+
+ <p>Resolving a URL is the process of taking a relative URL and
+ obtaining the absolute URL that it implies.</p>
+
+ <div class=impl>
+
+ <p>To <dfn id=resolve-a-url>resolve a URL</dfn> to an <a href=#absolute-url>absolute URL</a>
relative to either another <a href=#absolute-url>absolute URL</a> or an element,
the user agent must use the following steps. Resolving a URL can
result in an error, in which case the URL is not resolvable.</p>
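
Note on the parsing steps added in 2.6.2 above: the component definitions can be illustrated with a rough Python sketch. It uses the generic splitting regular expression from RFC 3986 Appendix B rather than the amended ABNF, so it never reports a parse failure. The function name `split_url` and the `DEFAULT_PORTS` table are assumptions made for illustration; neither appears in the spec text.

```python
import re

# Generic component-splitting regex from RFC 3986 Appendix B.
URI_RE = re.compile(
    r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?",
    re.DOTALL,
)

# Assumed default-port table, for illustration only.
DEFAULT_PORTS = {"http": 80, "https": 443, "ftp": 21}


def split_url(url):
    """Split url into the components named in the diff (hypothetical helper)."""
    # Step 1: strip leading and trailing space characters.
    url = url.strip(" \t\n\f\r")
    m = URI_RE.match(url)  # every group is optional, so this always matches
    scheme, authority, path, query, fragment = m.group(2, 4, 5, 7, 9)

    host = port = hostport = None
    if authority is not None:
        # Drop any userinfo, then separate host from an optional port.
        host_and_port = authority.rpartition("@")[2]
        hp = re.fullmatch(r"(\[[^\]]*\]|[^:]*)(?::([0-9]*))?", host_and_port)
        host, port = hp.groups() if hp else (host_and_port, None)
        # <hostport> includes the port only when it is not the scheme default.
        hostport = host
        if scheme and port and int(port) != DEFAULT_PORTS.get(scheme.lower()):
            hostport = host + ":" + port

    # <host-specific>: what follows the authority, or the whole string.
    host_specific = url[m.end(3):] if authority is not None else url
    return {
        "scheme": scheme, "host": host, "port": port, "hostport": hostport,
        "path": path, "query": query, "fragment": fragment,
        "host_specific": host_specific,
    }
```

For example, `split_url("http://example.com:80/")["hostport"]` gives `"example.com"`, because port 80 is the assumed default for `http`.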
@@ -6150,11 +6295,113 @@ <h4 id=terminology-0><span class=secno>2.6.1 </span>Terminology</h4>
</ol></li>
- <li><p>Return the result of applying the <span class=XXX>resolve
- an address</span> algorithm defined by the IRI specification to
- resolve <var title="">url</var> relative to <var title="">base</var> using encoding <var title="">encoding</var>. <a href=#refsRFC3987>[RFC3987]</a></li>
+ <li><p><a href=#parse-a-url title="parse a URL">Parse</a> <var title="">url</var> into its component parts.</li>
- </ol></div>
+ <li>
+
+ <p>If parsing <var title="">url</var> resulted in a <a href=#url-host title=url-host>&lt;host&gt;</a> component, then replace the
+ matching substring of <var title="">url</var> with the string that
+ results from expanding any sequences of percent-encoded octets in
+ that component that are valid UTF-8 sequences into Unicode
+ characters as defined by UTF-8.</p>
+
+ <p>If any percent-encoded octets in that component are not valid
+ UTF-8 sequences, then return an error and abort these steps.</p>
+
+ <p>Apply the IDNA ToASCII algorithm to the matching substring,
+ with both the AllowUnassigned and UseSTD3ASCIIRules flags
+ set. Replace the matching substring with the result of the ToASCII
+ algorithm.</p>
+
+ <p>If ToASCII fails to convert one of the components of the
+ string, e.g. because it is too long or because it contains invalid
+ characters, then return an error and abort these steps. <a href=#refsRFC3490>[RFC3490]</a></p>
+
+ </li>
+
+ <li>
+
+ <p>If parsing <var title="">url</var> resulted in a <a href=#url-path title=url-path>&lt;path&gt;</a> component, then replace the
+ matching substring of <var title="">url</var> with the string that
+ results from applying the following steps to each character other
+ than U+0025 PERCENT SIGN (%) that doesn't match the original
+ &lt;path&gt; production defined in RFC 3986:</p>
+
+ <ol><li>Encode the character into a sequence of octets as defined by
+ UTF-8.</li>
+
+ <li>Replace the character with the percent-encoded form of those
+ octets. <a href=#refsRFC3986>[RFC3986]</a></li>
+
+ </ol><div class=example>
+
+ <p>For instance if <var title="">url</var> was "<code title="">//example.com/a^b&#9786;c%FFd%z/?e</code>", then the
+ <a href=#url-path title=url-path>&lt;path&gt;</a> component's substring
+ would be "<code title="">/a^b&#9786;c%FFd%z/</code>" and the two
+ characters that would have to be escaped would be "<code title="">^</code>" and "<code title="">&#9786;</code>". The
+ result after this step was applied would therefore be that <var title="">url</var> now had the value "<code title="">//example.com/a%5Eb%E2%98%BAc%FFd%z/?e</code>".</p>
+
+ </div>
+
+ </li>
+
+ <li>
+
+ <p>If parsing <var title="">url</var> resulted in a <a href=#url-query title=url-query>&lt;query&gt;</a> component, then replace the
+ matching substring of <var title="">url</var> with the string that
+ results from applying the following steps to each character other
+ than U+0025 PERCENT SIGN (%) that doesn't match the original
+ &lt;query&gt; production defined in RFC 3986:</p>
+
+ <ol><li>If the character in question cannot be expressed in the
+ encoding <var title="">encoding</var>, then replace it with a
+ single 0x3F octet (an ASCII question mark) and skip the remaining
+ substeps for this character.</li>
+
+ <li>Encode the character into a sequence of octets as defined by
+ the encoding <var title="">encoding</var>.</li>
+
+ <li>Replace the character with the percent-encoded form of those
+ octets. <a href=#refsRFC3986>[RFC3986]</a></li>
+
+ </ol></li>
+
+ <li><p>Apply the algorithm described in RFC 3986 section 5.2
+ Relative Resolution, using <var title="">url</var> as the
+ potentially relative URI reference (<var title="">R</var>), and
+ <var title="">base</var> as the base URI (<var title="">Base</var>). <a href=#refsRFC3986>[RFC3986]</a></li>
+
+ <li>
+
+ <p>Apply any relevant conformance criteria of RFC 3986 and RFC
+ 3987, returning an error and aborting these steps if
+ appropriate. <a href=#refsRFC3986>[RFC3986]</a> <a href=#refsRFC3987>[RFC3987]</a></p>
+
+ <p class=example>For instance, if an absolute URI that would be
+ returned by the above algorithm violates the restrictions specific
+ to its scheme, e.g. a <code title="">data:</code> URI using the
+ "<code title="">//</code>" server-based naming authority syntax,
+ then user agents are to treat this as an error instead.<!-- RFC
+ 3986, 3.1 Scheme --></p>
+
+ </li>
+
+ <li><p>Let <var title="">result</var> be the target URI (<var title="">T</var>) returned by the Relative Resolution
+ algorithm.</li>
+
+ <li><p>If <var title="">result</var> uses a scheme with a
+ server-based naming authority, replace all U+005C REVERSE SOLIDUS
+ (\) characters in <var title="">result</var> with U+002F SOLIDUS
+ (/) characters.</li>
+
+ <li><p>Return <var title="">result</var>.</li>
+
+ </ol><p class=note>Some of the steps in these rules, for example the
+ processing of U+005C REVERSE SOLIDUS (\) characters, are a
+ <a href=#willful-violation>willful violation</a> of RFC 3986 and RFC 3987, motivated
+ by a desire to handle legacy content. <a href=#refsRFC3986>[RFC3986]</a> <a href=#refsRFC3987>[RFC3987]</a></p>
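
Note on the &lt;host&gt; step above: a minimal Python sketch, assuming the standard library's `idna` codec is an acceptable stand-in for the IDNA ToASCII algorithm. That codec does not expose the AllowUnassigned and UseSTD3ASCIIRules flags, so this only approximates the step as written. The function name `normalize_host` is hypothetical.

```python
from urllib.parse import unquote_to_bytes


def normalize_host(host):
    """Percent-decode host as UTF-8, then apply ToASCII (approximation)."""
    try:
        # Expand percent-encoded octets; invalid UTF-8 is an error.
        decoded = unquote_to_bytes(host).decode("utf-8")
        # Python's built-in codec implements RFC 3490 ToASCII per label.
        return decoded.encode("idna").decode("ascii")
    except UnicodeError as exc:
        # Covers both invalid UTF-8 and ToASCII failures.
        raise ValueError(f"host is not resolvable: {host!r}") from exc
```

A host such as `b%C3%BCcher.example` would come out as `xn--bcher-kva.example`, while `%FF.example` raises because 0xFF is not valid UTF-8.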
+
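
Note on the &lt;path&gt; and &lt;query&gt; steps above: the two escaping passes differ only in the encoding used and in the fallback for unencodable characters. The sketch below is one way to express them in Python; the helper names are hypothetical, and the allowed-character sets are transcribed from the RFC 3986 `pchar` and `query` productions.

```python
import string

# Characters allowed by the original RFC 3986 productions ("%" is
# handled separately, since existing percent signs are left alone).
UNRESERVED = string.ascii_letters + string.digits + "-._~"
SUB_DELIMS = "!$&'()*+,;="
PATH_CHARS = set(UNRESERVED + SUB_DELIMS + ":@/")
QUERY_CHARS = PATH_CHARS | {"?"}


def _percent_encode(octets):
    return "".join(f"%{b:02X}" for b in octets)


def escape_path(path):
    """Escape disallowed path characters, always using UTF-8."""
    return "".join(
        ch if ch == "%" or ch in PATH_CHARS
        else _percent_encode(ch.encode("utf-8"))
        for ch in path
    )


def escape_query(query, encoding):
    """Escape disallowed query characters using the given encoding."""
    out = []
    for ch in query:
        if ch == "%" or ch in QUERY_CHARS:
            out.append(ch)
            continue
        try:
            octets = ch.encode(encoding)
        except UnicodeEncodeError:
            # Not expressible in the encoding: a single 0x3F octet.
            out.append("?")
            continue
        out.append(_percent_encode(octets))
    return "".join(out)
```

Applied to the path from the example in the diff, `escape_path` turns `/a^b☺c%FFd%z/` into `/a%5Eb%E2%98%BAc%FFd%z/`, leaving the existing `%FF` and the stray `%z` untouched.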
+ </div>
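
Note on the final steps above (relative resolution, then REVERSE SOLIDUS replacement): a small Python sketch, assuming `urllib.parse.urljoin` is close enough to RFC 3986 section 5.2 for illustration. Treating "uses a scheme with a server-based naming authority" as "the scheme is immediately followed by `//`" is also an assumption here, and the function name `resolve` is hypothetical.

```python
import re
from urllib.parse import urljoin

# Assumed test: a scheme immediately followed by "//".
SERVER_BASED = re.compile(r"[A-Za-z][A-Za-z0-9+.-]*://")


def resolve(url, base):
    """Resolve url against base, then normalise backslashes."""
    # Relative Resolution, delegated to the standard library.
    result = urljoin(base, url)
    # Replace U+005C with U+002F only for server-based schemes.
    if SERVER_BASED.match(result):
        result = result.replace("\\", "/")
    return result
```

With base `http://example.com/a/`, the reference `x\y` resolves to `http://example.com/a/x/y`, whereas a `mailto:` URL containing a backslash is returned unchanged.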
<p>A <a href=#url>URL</a> is an <dfn id=absolute-url>absolute URL</dfn> if <a href=#resolve-a-url title="resolve a url">resolving</a> it results in the same output
regardless of what it is resolved relative to, and that output is
@@ -6170,28 +6417,11 @@ <h4 id=terminology-0><span class=secno>2.6.1 </span>Terminology</h4>
immediately after the <a href=#url-scheme title=url-scheme>&lt;scheme&gt;</a>
component and they are both U+002F SOLIDUS characters (//).</p>
- <hr><p>This specification defines the URL
- <dfn id=about:legacy-compat><code>about:legacy-compat</code></dfn> as a reserved, though
- unresolvable, <code title="">about:</code> URI, for use in <a href=#syntax-doctype title=syntax-doctype>DOCTYPE</a>s in <a href=#html-documents>HTML
- documents</a> when needed for compatibility with XML tools. <a href=#refsABOUT>[ABOUT]</a></p>
-
- <p>This specification defines the URL
- <dfn id=about:srcdoc><code>about:srcdoc</code></dfn> as a reserved, though
- unresolvable, <code title="">about:</code> URI, that is used as
- <a href="#the-document's-address">the document's address</a> of <a href=#an-iframe-srcdoc-document title="an iframe srcdoc
- document"><code>iframe</code> <code title=attr-iframe-srcdoc>srcdoc</code> documents</a>. <a href=#refsABOUT>[ABOUT]</a></p>
-
- <p class=note>The term "URL" in this specification is used in a
- manner distinct from the precise technical meaning it is given in
- RFC 3986. Readers familiar with that RFC will find it easier to read
- <em>this</em> specification if they pretend the term "URL" as used
- herein is really called something else altogether. This is a
- <a href=#willful-violation>willful violation</a> of RFC 3986. <a href=#refsRFC3986>[RFC3986]</a></p>
<div class=impl>
- <h4 id=dynamic-changes-to-base-urls><span class=secno>2.6.2 </span>Dynamic changes to base URLs</h4>
+ <h4 id=dynamic-changes-to-base-urls><span class=secno>2.6.4 </span>Dynamic changes to base URLs</h4>
<p>When an <code title=attr-xml-base><a href=#the-xml:base-attribute-(xml-only)>xml:base</a></code> attribute
changes, the attribute's element, and all descendant elements, are
@@ -6264,7 +6494,7 @@ <h4 id=dynamic-changes-to-base-urls><span class=secno>2.6.2 </span>Dynamic chang
- <h4 id=interfaces-for-url-manipulation><span class=secno>2.6.3 </span>Interfaces for URL manipulation</h4>
+ <h4 id=interfaces-for-url-manipulation><span class=secno>2.6.5 </span>Interfaces for URL manipulation</h4>
<p>An interface that has a complement of <dfn id=url-decomposition-idl-attributes>URL decomposition IDL
attributes</dfn> has seven attributes with the following
314 index
@@ -380,8 +380,10 @@
<li><a href=#urls><span class=secno>2.6 </span>URLs</a>
<ol>
<li><a href=#terminology-0><span class=secno>2.6.1 </span>Terminology</a></li>
- <li><a href=#dynamic-changes-to-base-urls><span class=secno>2.6.2 </span>Dynamic changes to base URLs</a></li>
- <li><a href=#interfaces-for-url-manipulation><span class=secno>2.6.3 </span>Interfaces for URL manipulation</a></ol></li>
+ <li><a href=#parsing-urls><span class=secno>2.6.2 </span>Parsing URLs</a></li>
+ <li><a href=#resolving-urls><span class=secno>2.6.3 </span>Resolving URLs</a></li>
+ <li><a href=#dynamic-changes-to-base-urls><span class=secno>2.6.4 </span>Dynamic changes to base URLs</a></li>
+ <li><a href=#interfaces-for-url-manipulation><span class=secno>2.6.5 </span>Interfaces for URL manipulation</a></ol></li>
<li><a href=#fetching-resources><span class=secno>2.7 </span>Fetching resources</a>
<ol>
<li><a href=#concept-http-equivalent><span class=secno>2.7.1 </span>Protocol concepts</a></li>
@@ -6007,9 +6009,21 @@ explained in the previous section, which talks about RFC 2119. -->
<h3 id=urls><span class=secno>2.6 </span>URLs</h3>
- <h4 id=terminology-0><span class=secno>2.6.1 </span>Terminology</h4>
+ <p>This specification defines the term <a href=#url>URL</a>, and defines
+ various algorithms for dealing with URLs, because for historical
+ reasons the rules defined by the URI and IRI specifications are not
+ a complete description of what HTML user agents need to implement to
+ be compatible with Web content.</p>
+
+ <p class=note>The term "URL" in this specification is used in a
+ manner distinct from the precise technical meaning it is given in
+ RFC 3986. Readers familiar with that RFC will find it easier to read
+ <em>this</em> specification if they pretend the term "URL" as used
+ herein is really called something else altogether. This is a
+ <a href=#willful-violation>willful violation</a> of RFC 3986. <a href=#refsRFC3986>[RFC3986]</a></p>
+
- <!-- see also: svn diff -r3244:3245 source -->
+ <h4 id=terminology-0><span class=secno>2.6.1 </span>Terminology</h4>
<p>A <dfn id=url>URL</dfn> is a string used to identify a resource.</p>
@@ -6040,24 +6054,155 @@ explained in the previous section, which talks about RFC 2119. -->
whitespace">stripping leading and trailing whitespace</a> from
it, it is a <a href=#valid-non-empty-url>valid non-empty URL</a>.</p>
+ <p>This specification defines the URL
+ <dfn id=about:legacy-compat><code>about:legacy-compat</code></dfn> as a reserved, though
+ unresolvable, <code title="">about:</code> URI, for use in <a href=#syntax-doctype title=syntax-doctype>DOCTYPE</a>s in <a href=#html-documents>HTML
+ documents</a> when needed for compatibility with XML tools. <a href=#refsABOUT>[ABOUT]</a></p>
+
+ <p>This specification defines the URL
+ <dfn id=about:srcdoc><code>about:srcdoc</code></dfn> as a reserved, though
+ unresolvable, <code title="">about:</code> URI, that is used as
+ <a href="#the-document's-address">the document's address</a> of <a href=#an-iframe-srcdoc-document title="an iframe srcdoc
+ document"><code>iframe</code> <code title=attr-iframe-srcdoc>srcdoc</code> documents</a>. <a href=#refsABOUT>[ABOUT]</a></p>
+
+
<div class=impl>
+ <h4 id=parsing-urls><span class=secno>2.6.2 </span>Parsing URLs</h4>
+
<p>To <dfn id=parse-a-url>parse a URL</dfn> <var title="">url</var> into its
- component parts, the user agent must use the <span class=XXX>parse
- an address</span> algorithm defined by the IRI specification. <a href=#refsRFC3987>[RFC3987]</a></p>
-
- <p>Parsing a URL can fail. If it does not, then it results in the
- following components, again as defined by the IRI specification:</p>
-
- <ul class=brief><li><dfn id=url-scheme title=url-scheme>&lt;scheme&gt;</dfn></li>
- <li><dfn id=url-host title=url-host>&lt;host&gt;</dfn></li>
- <li><dfn id=url-port title=url-port>&lt;port&gt;</dfn></li>
- <li><dfn id=url-hostport title=url-hostport>&lt;hostport&gt;</dfn></li>
- <li><dfn id=url-path title=url-path>&lt;path&gt;</dfn></li>
- <li><dfn id=url-query title=url-query>&lt;query&gt;</dfn></li>
- <li><dfn id=url-fragment title=url-fragment>&lt;fragment&gt;</dfn></li>
- <li><dfn id=url-host-specific title=url-host-specific>&lt;host-specific&gt;</dfn></li>
- </ul><hr><p>To <dfn id=resolve-a-url>resolve a URL</dfn> to an <a href=#absolute-url>absolute URL</a>
+ component parts, the user agent must use the following steps:</p>
+
+ <ol><li><p>Strip leading and trailing <a href=#space-character title="space
+ character">space characters</a> from <var title="">url</var>.</li>
+
+ <li>
+
+ <p>Parse <var title="">url</var> in the manner defined by RFC
+ 3986, with the following exceptions:</p>
+
+ <ul><li>Add all characters with code points less than or equal to
+ U+0020 or greater than or equal to U+007F to the
+ &lt;unreserved&gt; production.</li>
+
+ <li>Add the characters U+0022, U+003C, U+003E, U+005B .. U+005E,
+ U+0060, and U+007B .. U+007D to the &lt;unreserved&gt;
+ production.
+ <!--
+ 0022 QUOTATION MARK
+ 003C LESS-THAN SIGN
+ 003E GREATER-THAN SIGN
+ 005B LEFT SQUARE BRACKET
+ 005C REVERSE SOLIDUS
+ 005D RIGHT SQUARE BRACKET
+ 005E CIRCUMFLEX ACCENT
+ 0060 GRAVE ACCENT
+ 007B LEFT CURLY BRACKET
+ 007C VERTICAL LINE
+ 007D RIGHT CURLY BRACKET
+ -->
+ </li>
+
+ <li>Add a single U+0025 PERCENT SIGN character as a second
+ alternative way of matching the &lt;pct-encoded&gt; production,
+ except when the &lt;pct-encoded&gt; is used in the
+ &lt;reg-name&gt; production.</li>
+
+ <li>Add the U+0023 NUMBER SIGN character to the characters
+ allowed in the &lt;fragment&gt; production.</li>
+
+ <!-- some browsers also have other differences, e.g. Mozilla
+ seems to treat ";" as if it was not in sub-delims, if the scheme
+ is "ftp". -->
+
+ </ul></li>
+
+ <li>
+
+ <p>If <var title="">url</var> doesn't match the
+ &lt;URI-reference&gt; production, even after the above changes are
+ made to the ABNF definitions, then parsing the URL fails with an
+ error. <a href=#refsRFC3986>[RFC3986]</a></p>
+
+ <p>Otherwise, parsing <var title="">url</var> was successful; the
+ components of the URL are substrings of <var title="">url</var>
+ defined as follows:</p>
+
+ <dl><dt><dfn id=url-scheme title=url-scheme>&lt;scheme&gt;</dfn></dt>
+
+ <dd><p>The substring matched by the &lt;scheme&gt; production, if any.</dd>
+
+
+ <dt><dfn id=url-host title=url-host>&lt;host&gt;</dfn></dt>
+
+ <dd><p>The substring matched by the &lt;host&gt; production, if any.</dd>
+
+
+ <dt><dfn id=url-port title=url-port>&lt;port&gt;</dfn></dt>
+
+ <dd><p>The substring matched by the &lt;port&gt; production, if any.</dd>
+
+
+ <dt><dfn id=url-hostport title=url-hostport>&lt;hostport&gt;</dfn></dt>
+
+ <dd><p>If there is a &lt;scheme&gt; component and a &lt;port&gt;
+ component and the port given by the &lt;port&gt; component is
+ different than the default port defined for the protocol given by
+ the &lt;scheme&gt; component, then &lt;hostport&gt; is the
+ substring that starts with the substring matched by the
+ &lt;host&gt; production and ends with the substring matched by the
+ &lt;port&gt; production, and includes the colon in between the
+ two. Otherwise, it is the same as the &lt;host&gt; component.</p>
+
+
+ <dt><dfn id=url-path title=url-path>&lt;path&gt;</dfn></dt>
+
+ <dd>
+
+ <p>The substring matched by one of the following productions, if
+ one of them was matched:</p>
+
+ <ul class=brief><li>&lt;path-abempty&gt;</li>
+ <li>&lt;path-absolute&gt;</li>
+ <li>&lt;path-noscheme&gt;</li>
+ <li>&lt;path-rootless&gt;</li>
+ <li>&lt;path-empty&gt;</li>
+ </ul></dd>
+
+
+ <dt><dfn id=url-query title=url-query>&lt;query&gt;</dfn></dt>
+
+ <dd><p>The substring matched by the &lt;query&gt; production, if any.</dd>
+
+
+ <dt><dfn id=url-fragment title=url-fragment>&lt;fragment&gt;</dfn></dt>
+
+ <dd><p>The substring matched by the &lt;fragment&gt; production, if any.</dd>
+
+
+ <dt><dfn id=url-host-specific title=url-host-specific>&lt;host-specific&gt;</dfn></dt>
+
+ <dd><p>The substring that <em>follows</em> the substring matched
+ by the &lt;authority&gt; production, or the whole string if the
+ &lt;authority&gt; production wasn't matched.</dd>
+
+ </dl></li>
+
+ </ol><p class=note>These parsing rules are a <a href=#willful-violation>willful
+ violation</a> of RFC 3986 and RFC 3987 (which do not define error
+ handling), motivated by a desire to handle legacy content. <a href=#refsRFC3986>[RFC3986]</a> <a href=#refsRFC3987>[RFC3987]</a></p>
+
+ </div>
+
+
+ <h4 id=resolving-urls><span class=secno>2.6.3 </span>Resolving URLs</h4>
+
+ <p>Resolving a URL is the process of taking a relative URL and
+ obtaining the absolute URL that it implies.</p>
+
+ <div class=impl>
+
+ <p>To <dfn id=resolve-a-url>resolve a URL</dfn> to an <a href=#absolute-url>absolute URL</a>
relative to either another <a href=#absolute-url>absolute URL</a> or an element,
the user agent must use the following steps. Resolving a URL can
result in an error, in which case the URL is not resolvable.</p>
@@ -6159,11 +6304,113 @@ explained in the previous section, which talks about RFC 2119. -->
</ol></li>
- <li><p>Return the result of applying the <span class=XXX>resolve
- an address</span> algorithm defined by the IRI specification to
- resolve <var title="">url</var> relative to <var title="">base</var> using encoding <var title="">encoding</var>. <a href=#refsRFC3987>[RFC3987]</a></li>
+ <li><p><a href=#parse-a-url title="parse a URL">Parse</a> <var title="">url</var> into its component parts.</li>
- </ol></div>
+ <li>
+
+ <p>If parsing <var title="">url</var> resulted in a <a href=#url-host title=url-host>&lt;host&gt;</a> component, then replace the
+ matching substring of <var title="">url</var> with the string that
+ results from expanding any sequences of percent-encoded octets in
+ that component that are valid UTF-8 sequences into Unicode
+ characters as defined by UTF-8.</p>
+
+ <p>If any percent-encoded octets in that component are not valid
+ UTF-8 sequences, then return an error and abort these steps.</p>
+
+ <p>Apply the IDNA ToASCII algorithm to the matching substring,
+ with both the AllowUnassigned and UseSTD3ASCIIRules flags
+ set. Replace the matching substring with the result of the ToASCII
+ algorithm.</p>
+
+ <p>If ToASCII fails to convert one of the components of the
+ string, e.g. because it is too long or because it contains invalid
+ characters, then return an error and abort these steps. <a href=#refsRFC3490>[RFC3490]</a></p>
+
+ </li>
+
+ <li>
+
+ <p>If parsing <var title="">url</var> resulted in a <a href=#url-path title=url-path>&lt;path&gt;</a> component, then replace the
+ matching substring of <var title="">url</var> with the string that
+ results from applying the following steps to each character other
+ than U+0025 PERCENT SIGN (%) that doesn't match the original
+ &lt;path&gt; production defined in RFC 3986:</p>
+
+ <ol><li>Encode the character into a sequence of octets as defined by
+ UTF-8.</li>
+
+ <li>Replace the character with the percent-encoded form of those
+ octets. <a href=#refsRFC3986>[RFC3986]</a></li>
+
+ </ol><div class=example>
+
+ <p>For instance if <var title="">url</var> was "<code title="">//example.com/a^b&#9786;c%FFd%z/?e</code>", then the
+ <a href=#url-path title=url-path>&lt;path&gt;</a> component's substring
+ would be "<code title="">/a^b&#9786;c%FFd%z/</code>" and the two
+ characters that would have to be escaped would be "<code title="">^</code>" and "<code title="">&#9786;</code>". The
+ result after this step was applied would therefore be that <var title="">url</var> now had the value "<code title="">//example.com/a%5Eb%E2%98%BAc%FFd%z/?e</code>".</p>
+
+ </div>
+
+ </li>
+
+ <li>
+
+ <p>If parsing <var title="">url</var> resulted in a <a href=#url-query title=url-query>&lt;query&gt;</a> component, then replace the
+ matching substring of <var title="">url</var> with the string that
+ results from applying the following steps to each character other
+ than U+0025 PERCENT SIGN (%) that doesn't match the original
+ &lt;query&gt; production defined in RFC 3986:</p>
+
+ <ol><li>If the character in question cannot be expressed in the
+ encoding <var title="">encoding</var>, then replace it with a
+ single 0x3F octet (an ASCII question mark) and skip the remaining
+ substeps for this character.</li>
+
+ <li>Encode the character into a sequence of octets as defined by
+ the encoding <var title="">encoding</var>.</li>
+
+ <li>Replace the character with the percent-encoded form of those
+ octets. <a href=#refsRFC3986>[RFC3986]</a></li>
+
+ </ol></li>
+
+ <li><p>Apply the algorithm described in RFC 3986 section 5.2
+ Relative Resolution, using <var title="">url</var> as the
+ potentially relative URI reference (<var title="">R</var>), and
+ <var title="">base</var> as the base URI (<var title="">Base</var>). <a href=#refsRFC3986>[RFC3986]</a></li>
+
+ <li>
+
+ <p>Apply any relevant conformance criteria of RFC 3986 and RFC
+ 3987, returning an error and aborting these steps if
+ appropriate. <a href=#refsRFC3986>[RFC3986]</a> <a href=#refsRFC3987>[RFC3987]</a></p>
+
+ <p class=example>For instance, if an absolute URI that would be
+ returned by the above algorithm violates the restrictions specific
+ to its scheme, e.g. a <code title="">data:</code> URI using the
+ "<code title="">//</code>" server-based naming authority syntax,
+ then user agents are to treat this as an error instead.<!-- RFC
+ 3986, 3.1 Scheme --></p>
+
+ </li>
+
+ <li><p>Let <var title="">result</var> be the target URI (<var title="">T</var>) returned by the Relative Resolution
+ algorithm.</li>
+
+ <li><p>If <var title="">result</var> uses a scheme with a
+ server-based naming authority, replace all U+005C REVERSE SOLIDUS
+ (\) characters in <var title="">result</var> with U+002F SOLIDUS
+ (/) characters.</li>
+
+ <li><p>Return <var title="">result</var>.</li>
+
+ </ol><p class=note>Some of the steps in these rules, for example the
+ processing of U+005C REVERSE SOLIDUS (\) characters, are a
+ <a href=#willful-violation>willful violation</a> of RFC 3986 and RFC 3987, motivated
+ by a desire to handle legacy content. <a href=#refsRFC3986>[RFC3986]</a> <a href=#refsRFC3987>[RFC3987]</a></p>
+
+ </div>
<p>A <a href=#url>URL</a> is an <dfn id=absolute-url>absolute URL</dfn> if <a href=#resolve-a-url title="resolve a url">resolving</a> it results in the same output
regardless of what it is resolved relative to, and that output is
@@ -6179,28 +6426,11 @@ explained in the previous section, which talks about RFC 2119. -->
immediately after the <a href=#url-scheme title=url-scheme>&lt;scheme&gt;</a>
component and they are both U+002F SOLIDUS characters (//).</p>
- <hr><p>This specification defines the URL
- <dfn id=about:legacy-compat><code>about:legacy-compat</code></dfn> as a reserved, though
- unresolvable, <code title="">about:</code> URI, for use in <a href=#syntax-doctype title=syntax-doctype>DOCTYPE</a>s in <a href=#html-documents>HTML
- documents</a> when needed for compatibility with XML tools. <a href=#refsABOUT>[ABOUT]</a></p>
-
- <p>This specification defines the URL
- <dfn id=about:srcdoc><code>about:srcdoc</code></dfn> as a reserved, though
- unresolvable, <code title="">about:</code> URI, that is used as
- <a href="#the-document's-address">the document's address</a> of <a href=#an-iframe-srcdoc-document title="an iframe srcdoc
- document"><code>iframe</code> <code title=attr-iframe-srcdoc>srcdoc</code> documents</a>. <a href=#refsABOUT>[ABOUT]</a></p>
-
- <p class=note>The term "URL" in this specification is used in a
- manner distinct from the precise technical meaning it is given in
- RFC 3986. Readers familiar with that RFC will find it easier to read
- <em>this</em> specification if they pretend the term "URL" as used
- herein is really called something else altogether. This is a
- <a href=#willful-violation>willful violation</a> of RFC 3986. <a href=#refsRFC3986>[RFC3986]</a></p>
<div class=impl>
- <h4 id=dynamic-changes-to-base-urls><span class=secno>2.6.2 </span>Dynamic changes to base URLs</h4>
+ <h4 id=dynamic-changes-to-base-urls><span class=secno>2.6.4 </span>Dynamic changes to base URLs</h4>
<p>When an <code title=attr-xml-base><a href=#the-xml:base-attribute-(xml-only)>xml:base</a></code> attribute
changes, the attribute's element, and all descendant elements, are
@@ -6273,7 +6503,7 @@ explained in the previous section, which talks about RFC 2119. -->
- <h4 id=interfaces-for-url-manipulation><span class=secno>2.6.3 </span>Interfaces for URL manipulation</h4>
+ <h4 id=interfaces-for-url-manipulation><span class=secno>2.6.5 </span>Interfaces for URL manipulation</h4>
<p>An interface that has a complement of <dfn id=url-decomposition-idl-attributes>URL decomposition IDL
attributes</dfn> has seven attributes with the following
350 source
@@ -5609,9 +5609,22 @@ is conforming depends on which specs apply, and leaves it at that. -->
<h3>URLs</h3>
- <h4>Terminology</h4>
+ <p>This specification defines the term <span>URL</span>, and defines
+ various algorithms for dealing with URLs, because for historical
+ reasons the rules defined by the URI and IRI specifications are not
+ a complete description of what HTML user agents need to implement to
+ be compatible with Web content.</p>
- <!-- see also: svn diff -r3244:3245 source -->
+ <p class="note">The term "URL" in this specification is used in a
+ manner distinct from the precise technical meaning it is given in
+ RFC 3986. Readers familiar with that RFC will find it easier to read
+ <em>this</em> specification if they pretend the term "URL" as used
+ herein is really called something else altogether. This is a
+ <span>willful violation</span> of RFC 3986. <a
+ href="#refsRFC3986">[RFC3986]</a></p>
+
+
+ <h4>Terminology</h4>
<p>A <dfn>URL</dfn> is a string used to identify a resource.</p>
@@ -5650,28 +5663,175 @@ is conforming depends on which specs apply, and leaves it at that. -->
whitespace">stripping leading and trailing whitespace</span> from
it, it is a <span>valid non-empty URL</span>.</p>
+ <p>This specification defines the URL
+ <dfn><code>about:legacy-compat</code></dfn> as a reserved, though
+ unresolvable, <code title="">about:</code> URI, for use in <span
+ title="syntax-doctype">DOCTYPE</span>s in <span>HTML
+ documents</span> when needed for compatibility with XML tools. <a
+ href="#refsABOUT">[ABOUT]</a></p>
+
+ <p>This specification defines the URL
+ <dfn><code>about:srcdoc</code></dfn> as a reserved, though
+ unresolvable, <code title="">about:</code> URI, that is used as
+ <span>the document's address</span> of <span title="an iframe srcdoc
+ document"><code>iframe</code> <code
+ title="attr-iframe-srcdoc">srcdoc</code> documents</span>. <a
+ href="#refsABOUT">[ABOUT]</a></p>
+
+
<div class="impl">
+ <h4>Parsing URLs</h4>
+
<p>To <dfn>parse a URL</dfn> <var title="">url</var> into its
- component parts, the user agent must use the <span class="XXX">parse
- an address</span> algorithm defined by the IRI specification. <a
+ component parts, the user agent must use the following steps:</p>
+
+ <ol>
+
+ <li><p>Strip leading and trailing <span title="space
+ character">space characters</span> from <var
+ title="">url</var>.</p></li>
+
+ <li>
+
+ <p>Parse <var title="">url</var> in the manner defined by RFC
+ 3986, with the following exceptions:</p>
+
+ <ul>
+
+ <li>Add all characters with code points less than or equal to
+ U+0020 or greater than or equal to U+007F to the
+ &lt;unreserved&gt; production.</li>
+
+ <li>Add the characters U+0022, U+003C, U+003E, U+005B .. U+005E,
+ U+0060, and U+007B .. U+007D to the &lt;unreserved&gt;
+ production.
+ <!--
+ 0022 QUOTATION MARK
+ 003C LESS-THAN SIGN
+ 003E GREATER-THAN SIGN
+ 005B LEFT SQUARE BRACKET
+ 005C REVERSE SOLIDUS
+ 005D RIGHT SQUARE BRACKET
+ 005E CIRCUMFLEX ACCENT
+ 0060 GRAVE ACCENT
+ 007B LEFT CURLY BRACKET
+ 007C VERTICAL LINE
+ 007D RIGHT CURLY BRACKET
+ -->
+ </li>
+
+ <li>Add a single U+0025 PERCENT SIGN character as a second
+ alternative way of matching the &lt;pct-encoded&gt; production,
+ except when the &lt;pct-encoded&gt; is used in the
+ &lt;reg-name&gt; production.</li>
+
+ <li>Add the U+0023 NUMBER SIGN character to the characters
+ allowed in the &lt;fragment&gt; production.</li>
+
+ <!-- some browsers also have other differences, e.g. Mozilla
+ seems to treat ";" as if it was not in sub-delims, if the scheme
+ is "ftp". -->
+
+ </ul>
+
+ </li>
+
+ <li>
+
+ <p>If <var title="">url</var> doesn't match the
+ &lt;URI-reference&gt; production, even after the above changes are
+ made to the ABNF definitions, then parsing the URL fails with an
+ error. <a href="#refsRFC3986">[RFC3986]</a></p>
+
+ <p>Otherwise, parsing <var title="">url</var> was successful; the
+ components of the URL are substrings of <var title="">url</var>
+ defined as follows:</p>
+
+ <dl>
+
+ <dt><dfn title="url-scheme">&lt;scheme&gt;</dfn></dt>
+
+ <dd><p>The substring matched by the &lt;scheme&gt; production, if any.</p></dd>
+
+
+ <dt><dfn title="url-host">&lt;host&gt;</dfn></dt>
+
+ <dd><p>The substring matched by the &lt;host&gt; production, if any.</p></dd>
+
+
+ <dt><dfn title="url-port">&lt;port&gt;</dfn></dt>
+
+ <dd><p>The substring matched by the &lt;port&gt; production, if any.</p></dd>
+
+
+ <dt><dfn title="url-hostport">&lt;hostport&gt;</dfn></dt>
+
+ <dd><p>If there is a &lt;scheme&gt; component and a &lt;port&gt;
+ component and the port given by the &lt;port&gt; component is
+ different from the default port defined for the protocol given by
+ the &lt;scheme&gt; component, then &lt;hostport&gt; is the
+ substring that starts with the substring matched by the
+ &lt;host&gt; production and ends with the substring matched by the
+ &lt;port&gt; production, and includes the colon in between the
+ two. Otherwise, it is the same as the &lt;host&gt; component.</p></dd>
+
+
+ <dt><dfn title="url-path">&lt;path&gt;</dfn></dt>
+
+ <dd>
+
+ <p>The substring matched by one of the following productions, if
+ one of them was matched:</p>
+
+ <ul class="brief">
+ <li>&lt;path-abempty&gt;</li>
+ <li>&lt;path-absolute&gt;</li>
+ <li>&lt;path-noscheme&gt;</li>
+ <li>&lt;path-rootless&gt;</li>
+ <li>&lt;path-empty&gt;</li>
+ </ul>
+
+ </dd>
+
+
+ <dt><dfn title="url-query">&lt;query&gt;</dfn></dt>
+
+ <dd><p>The substring matched by the &lt;query&gt; production, if any.</p></dd>
+
+
+ <dt><dfn title="url-fragment">&lt;fragment&gt;</dfn></dt>
+
+ <dd><p>The substring matched by the &lt;fragment&gt; production, if any.</p></dd>
+
+
+ <dt><dfn title="url-host-specific">&lt;host-specific&gt;</dfn></dt>
+
+ <dd><p>The substring that <em>follows</em> the substring matched
+ by the &lt;authority&gt; production, or the whole string if the
+ &lt;authority&gt; production wasn't matched.</p></dd>
+
+ </dl>
+
+ </li>
+
+ </ol>
+
+ <p class="note">These parsing rules are a <span>willful
+ violation</span> of RFC 3986 and RFC 3987 (which do not define error
+ handling), motivated by a desire to handle legacy content. <a
+ href="#refsRFC3986">[RFC3986]</a> <a
href="#refsRFC3987">[RFC3987]</a></p>
- <p>Parsing a URL can fail. If it does not, then it results in the
- following components, again as defined by the IRI specification:</p>
+ </div>
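Reviewer sketch (not part of the commit): the <hostport> definition in the parsing rules above is the only component that is computed rather than read directly off a grammar production. A minimal Python illustration follows. The `DEFAULT_PORTS` table and the function name `hostport` are hypothetical. Treating an empty <port> like a missing one, and treating a scheme with no known default as "port differs", are assumptions the spec text does not settle.

```python
from typing import Optional

# Illustrative default-port table; not taken from the spec text.
DEFAULT_PORTS = {"http": 80, "https": 443, "ftp": 21}


def hostport(scheme: Optional[str], host: str, port: Optional[str]) -> str:
    """Return <hostport> given already-parsed <scheme>, <host>, <port>."""
    # Assumption: an empty <port> is handled like an absent one.
    if scheme is not None and port:
        # Assumption: a scheme without a known default never matches.
        if int(port) != DEFAULT_PORTS.get(scheme.lower()):
            return f"{host}:{port}"
    return host
```

For example, `hostport("http", "example.com", "8080")` keeps the port, while `hostport("http", "example.com", "80")` drops it because 80 is the default for `http` in the table.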
- <ul class="brief">
- <li><dfn title="url-scheme">&lt;scheme&gt;</dfn></li>
- <li><dfn title="url-host">&lt;host&gt;</dfn></li>
- <li><dfn title="url-port">&lt;port&gt;</dfn></li>
- <li><dfn title="url-hostport">&lt;hostport&gt;</dfn></li>
- <li><dfn title="url-path">&lt;path&gt;</dfn></li>
- <li><dfn title="url-query">&lt;query&gt;</dfn></li>
- <li><dfn title="url-fragment">&lt;fragment&gt;</dfn></li>
- <li><dfn title="url-host-specific">&lt;host-specific&gt;</dfn></li>
- </ul>
- <hr>
+ <h4>Resolving URLs</h4>
+
+ <p>Resolving a URL is the process of taking a relative URL and
+ obtaining the absolute URL that it implies.</p>
+
+ <div class="impl">
<p>To <dfn>resolve a URL</dfn> to an <span>absolute URL</span>
relative to either another <span>absolute URL</span> or an element,
@@ -5791,14 +5951,136 @@ is conforming depends on which specs apply, and leaves it at that. -->
</li>
- <li><p>Return the result of applying the <span class="XXX">resolve
- an address</span> algorithm defined by the IRI specification to
- resolve <var title="">url</var> relative to <var
- title="">base</var> using encoding <var title="">encoding</var>. <a
- href="#refsRFC3987">[RFC3987]</a></p></li>
+ <li><p><span title="parse a URL">Parse</span> <var
+ title="">url</var> into its component parts.</p></li>
+
+ <li>
+
+ <p>If parsing <var title="">url</var> resulted in a <span
+ title="url-host">&lt;host&gt;</span> component, then replace the
+ matching substring of <var title="">url</var> with the string that
+ results from expanding any sequences of percent-encoded octets in
+ that component that are valid UTF-8 sequences into Unicode
+ characters as defined by UTF-8.</p>
+
+ <p>If any percent-encoded octets in that component are not valid
+ UTF-8 sequences, then return an error and abort these steps.</p>
+
+ <p>Apply the IDNA ToASCII algorithm to the matching substring,
+ with both the AllowUnassigned and UseSTD3ASCIIRules flags
+ set. Replace the matching substring with the result of the ToASCII
+ algorithm.</p>
+
+ <p>If ToASCII fails to convert one of the components of the
+ string, e.g. because it is too long or because it contains invalid
+ characters, then return an error and abort these steps. <a
+ href="#refsRFC3490">[RFC3490]</a></p>
+
+ </li>
+
+ <li>
+
+ <p>If parsing <var title="">url</var> resulted in a <span
+ title="url-path">&lt;path&gt;</span> component, then replace the
+ matching substring of <var title="">url</var> with the string that
+ results from applying the following steps to each character other
+ than U+0025 PERCENT SIGN (%) that doesn't match the original
+ &lt;path&gt; production defined in RFC 3986:</p>
+
+ <ol>
+
+ <li>Encode the character into a sequence of octets as defined by
+ UTF-8.</li>
+
+ <li>Replace the character with the percent-encoded form of those
+ octets. <a href="#refsRFC3986">[RFC3986]</a></li>
+
+ </ol>
+
+ <div class="example">
+
+ <p>For instance, if <var title="">url</var> was "<code
+ title="">//example.com/a^b&#x263a;c%FFd%z/?e</code>", then the
+ <span title="url-path">&lt;path&gt;</span> component's substring
+ would be "<code title="">/a^b&#x263a;c%FFd%z/</code>" and the two
+ characters that would have to be escaped would be "<code
+ title="">^</code>" and "<code title="">&#x263a;</code>". The
+ result after this step was applied would therefore be that <var
+ title="">url</var> now had the value "<code
+ title="">//example.com/a%5Eb%E2%98%BAc%FFd%z/?e</code>".</p>
+
+ </div>
+
+ </li>
+
+ <li>
+
+ <p>If parsing <var title="">url</var> resulted in a <span
+ title="url-query">&lt;query&gt;</span> component, then replace the
+ matching substring of <var title="">url</var> with the string that
+ results from applying the following steps to each character other
+ than U+0025 PERCENT SIGN (%) that doesn't match the original
+ &lt;query&gt; production defined in RFC 3986:</p>
+
+ <ol>
+
+ <li>If the character in question cannot be expressed in the
+ encoding <var title="">encoding</var>, then replace it with a
+ single 0x3F octet (an ASCII question mark) and skip the remaining
+ substeps for this character.</li>
+
+ <li>Encode the character into a sequence of octets as defined by
+ the encoding <var title="">encoding</var>.</li>
+
+ <li>Replace the character with the percent-encoded form of those
+ octets. <a href="#refsRFC3986">[RFC3986]</a></li>
+
+ </ol>
+
+ </li>
+
+ <li><p>Apply the algorithm described in RFC 3986 section 5.2
+ Relative Resolution, using <var title="">url</var> as the
+ potentially relative URI reference (<var title="">R</var>), and
+ <var title="">base</var> as the base URI (<var
+ title="">Base</var>). <a href="#refsRFC3986">[RFC3986]</a></p></li>
+
+ <li>
+
+ <p>Apply any relevant conformance criteria of RFC 3986 and RFC
+ 3987, returning an error and aborting these steps if
+ appropriate. <a href="#refsRFC3986">[RFC3986]</a> <a
+ href="#refsRFC3987">[RFC3987]</a></p>
+
+ <p class="example">For instance, if an absolute URI that would be
+ returned by the above algorithm violates the restrictions specific
+ to its scheme, e.g. a <code title="">data:</code> URI using the
+ "<code title="">//</code>" server-based naming authority syntax,
+ then user agents are to treat this as an error instead.<!-- RFC
+ 3986, 3.1 Scheme --></p>
+
+ </li>
+
+ <li><p>Let <var title="">result</var> be the target URI (<var
+ title="">T</var>) returned by the Relative Resolution
+ algorithm.</p></li>
+
+ <li><p>If <var title="">result</var> uses a scheme with a
+ server-based naming authority, replace all U+005C REVERSE SOLIDUS
+ (\) characters in <var title="">result</var> with U+002F SOLIDUS
+ (/) characters.</p></li>
+
+ <li><p>Return <var title="">result</var>.</p></li>
</ol>
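Reviewer sketch (not part of the commit): the <path> escaping step in the list above can be checked against the spec's own example with a few lines of Python. The safe-character set below is my reading of the RFC 3986 path productions (unreserved, sub-delims, ":", "@", and "/"); the names `PATH_SAFE` and `escape_path` are hypothetical.

```python
import string

# Characters that match the RFC 3986 path productions without escaping:
# unreserved, sub-delims, ":" and "@" (from pchar), plus "/".
PATH_SAFE = set(
    string.ascii_letters + string.digits + "-._~" + "!$&'()*+,;=" + ":@/"
)


def escape_path(path: str) -> str:
    """Percent-encode, as UTF-8, every character outside the path grammar."""
    out = []
    for ch in path:
        # U+0025 PERCENT SIGN is explicitly left alone by this step.
        if ch == "%" or ch in PATH_SAFE:
            out.append(ch)
        else:
            out.append("".join(f"%{b:02X}" for b in ch.encode("utf-8")))
    return "".join(out)
```

Applied to the path from the example, `escape_path("/a^b\u263ac%FFd%z/")` yields `/a%5Eb%E2%98%BAc%FFd%z/`: only "^" and U+263A are escaped, and the malformed "%z" passes through unchanged.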
+ <p class="note">Some of the steps in these rules, for example the
+ processing of U+005C REVERSE SOLIDUS (\) characters, are a
+ <span>willful violation</span> of RFC 3986 and RFC 3987, motivated
+ by a desire to handle legacy content. <a
+ href="#refsRFC3986">[RFC3986]</a> <a
+ href="#refsRFC3987">[RFC3987]</a></p>
+
</div>
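Reviewer sketch (not part of the commit): the <query> step differs from the <path> step in two ways: it uses the caller-supplied encoding rather than UTF-8, and a character that the encoding cannot express becomes a literal "?" (a single 0x3F octet) instead of being percent-encoded. A rough Python rendering, assuming an ASCII-compatible codec name that Python recognises; `QUERY_SAFE` and `escape_query` are hypothetical names.

```python
import string

# RFC 3986 query = *( pchar / "/" / "?" ), so the path-safe set plus "?".
QUERY_SAFE = set(
    string.ascii_letters + string.digits + "-._~" + "!$&'()*+,;=" + ":@/?"
)


def escape_query(query: str, encoding: str) -> str:
    """Escape characters outside the query grammar using `encoding`."""
    out = []
    for ch in query:
        # U+0025 PERCENT SIGN is explicitly left alone by this step.
        if ch == "%" or ch in QUERY_SAFE:
            out.append(ch)
            continue
        try:
            octets = ch.encode(encoding)
        except UnicodeEncodeError:
            # Not expressible in the encoding: emit a bare question mark
            # and skip the remaining substeps for this character.
            out.append("?")
            continue
        out.append("".join(f"%{b:02X}" for b in octets))
    return "".join(out)
```

The same input therefore produces different output depending on the document encoding: U+00E9 becomes `%E9` under ISO-8859-1 and `%C3%A9` under UTF-8.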
<p>A <span>URL</span> is an <dfn>absolute URL</dfn> if <span
@@ -5818,30 +6100,6 @@ is conforming depends on which specs apply, and leaves it at that. -->
immediately after the <span title="url-scheme">&lt;scheme&gt;</span>
component and they are both U+002F SOLIDUS characters (//).</p>
- <hr>
-
- <p>This specification defines the URL
- <dfn><code>about:legacy-compat</code></dfn> as a reserved, though
- unresolvable, <code title="">about:</code> URI, for use in <span
- title="syntax-doctype">DOCTYPE</span>s in <span>HTML
- documents</span> when needed for compatibility with XML tools. <a
- href="#refsABOUT">[ABOUT]</a></p>
-
- <p>This specification defines the URL
- <dfn><code>about:srcdoc</code></dfn> as a reserved, though
- unresolvable, <code title="">about:</code> URI, that is used as
- <span>the document's address</span> of <span title="an iframe srcdoc
- document"><code>iframe</code> <code
- title="attr-iframe-srcdoc">srcdoc</code> documents</span>. <a
- href="#refsABOUT">[ABOUT]</a></p>
-
- <p class="note">The term "URL" in this specification is used in a
- manner distinct from the precise technical meaning it is given in
- RFC 3986. Readers familiar with that RFC will find it easier to read
- <em>this</em> specification if they pretend the term "URL" as used
- herein is really called something else altogether. This is a
- <span>willful violation</span> of RFC 3986. <a
- href="#refsRFC3986">[RFC3986]</a></p>
<div class="impl">
