diff --git a/review-drafts/2018-12.bs b/review-drafts/2018-12.bs
new file mode 100644
index 0000000..7ce875d
--- /dev/null
+++ b/review-drafts/2018-12.bs
@@ -0,0 +1,3241 @@
+<pre class=metadata>
+Group: WHATWG
+Date: 2018-12-11
+H1: Encoding
+Shortname: encoding
+Text Macro: TWITTER encodings
+Abstract: The Encoding Standard defines encodings and their JavaScript API.
+Translation: ja https://triple-underscore.github.io/Encoding-ja.html
+Markup Shorthands: css off
+Translate IDs: dictdef-textdecoderoptions textdecoderoptions,dictdef-textdecodeoptions textdecodeoptions,index section-index
+</pre>
+
+<link rel=stylesheet href=visualization-colors.css>
+
+<pre class=link-defaults>
+spec:infra; type:dfn;
+    text:code point
+    text:ascii case-insensitive
+spec:streams;
+    type:interface; text:ReadableStream
+</pre>
+
+
+
+<h2 id=preface>Preface</h2>
+
+<p>The UTF-8 encoding is the most appropriate encoding for interchange of Unicode, the
+universal coded character set. Therefore, for new protocols and formats, as well as
+existing formats deployed in new contexts, this specification requires (and defines) the
+UTF-8 encoding.
+
+<p>The other (legacy) encodings have been defined to some extent in the past. However,
+user agents have not always implemented them in the same way, have not always used the
+same labels, and often differ in dealing with undefined and former proprietary areas of
+encodings. This specification addresses those gaps so that new user agents do not have to
+reverse engineer encoding implementations and existing user agents can converge.
+
+<p>In particular, this specification defines all those encodings, their algorithms to go
+from bytes to scalar values and back, and their canonical names and identifying labels.
+This specification also defines an API to expose part of the encoding algorithms to
+JavaScript.
+
+<p>User agents have also significantly deviated from the labels listed in the
+<a href=https://www.iana.org/assignments/character-sets/character-sets.xhtml>IANA Character Sets registry</a>.
+To stop spreading legacy encodings further, this specification is exhaustive about the
+aforementioned details and therefore has no need for the registry. In particular, this
+specification does not provide a mechanism for extending any aspect of encodings.
+
+
+
+<h2 id=security-background>Security background</h2>
+
+<p>There is a set of encoding security issues when the producer and consumer do not agree
+on the encoding in use, or on the way a given encoding is to be implemented. For instance,
+an attack was reported in 2011 where a <a>Shift_JIS</a> lead byte 0x82 was used to
+“mask” a 0x22 trail byte in a JSON resource of which an attacker could control some field.
+The producer did not see the problem even though this is an illegal byte combination. The
+consumer decoded it as a single U+FFFD and therefore changed the overall interpretation as
+U+0022 is an important delimiter. Decoders of encodings that use multiple bytes for scalar
+values now require that in case of an illegal byte combination, a scalar value in the
+range U+0000 to U+007F, inclusive, cannot be “masked”. For the aforementioned sequence the
+output would be U+FFFD U+0022.
+
+<p>This is a larger issue for encodings that map anything that is an <a>ASCII byte</a> to
+something that is not an <a>ASCII code point</a>, when there is no lead byte present. These
+are “ASCII-incompatible” encodings and other than <a>ISO-2022-JP</a>, <a>UTF-16BE</a>,
+and <a>UTF-16LE</a>, which are unfortunately required due to deployed content, they are not
+supported. (Investigation is
+<a href=https://github.com/whatwg/encoding/issues/8 lt="Add more labels to the replacement encoding">ongoing</a>
+into whether more labels of other such encodings can be mapped to the <a>replacement</a>
+encoding, rather than the unknown encoding fallback.) An example attack is injecting
+carefully crafted content into a resource and then encouraging the user to override the
+encoding, resulting in e.g. script execution.
+
+<p>Encoders used by URLs found in HTML and HTML's form feature can also result in slight
+information loss when an encoding is used that cannot represent all scalar values. E.g.
+when a resource uses the <a>windows-1252</a> encoding a server will not be able to
+distinguish between an end user entering “💩” and “&amp;#128169;” into a form.
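The information loss described above can be reproduced with a short sketch. It uses two stand-ins that are assumptions, not part of this specification: Python's built-in `cp1252` codec in place of the windows-1252 encoder, and its `xmlcharrefreplace` error handler in place of the "html" error mode. For this particular input the stand-ins should behave the same way.

```python
# Stand-ins (assumptions): "cp1252" approximates the windows-1252 encoder and
# "xmlcharrefreplace" approximates the "html" error mode.
typed_emoji = "\U0001F4A9"      # the user typed U+1F4A9
typed_reference = "&#128169;"   # the user typed nine ASCII characters

bytes_from_emoji = typed_emoji.encode("cp1252", errors="xmlcharrefreplace")
bytes_from_reference = typed_reference.encode("cp1252", errors="xmlcharrefreplace")

# Both inputs produce the same bytes, so a server cannot tell them apart.
same_in_legacy = bytes_from_emoji == bytes_from_reference

# With UTF-8 every scalar value is representable, so the inputs stay distinct.
same_in_utf8 = typed_emoji.encode("utf-8") == typed_reference.encode("utf-8")
```

Here `same_in_legacy` is true while `same_in_utf8` is false, which is the distinction the paragraph above draws.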
+
+<p>The problems outlined here go away when exclusively using UTF-8, which is one of the
+many reasons UTF-8 is now the mandatory encoding for all things.
+
+<p class=note>See also the <a href=#browser-ui>Browser UI</a> chapter.
+
+
+
+<h2 id=terminology>Terminology</h2>
+
+<p>This specification depends on the Infra Standard. [[!INFRA]]
+
+<p>Hexadecimal numbers are prefixed with "0x".
+
+<p>In equations, all numbers are integers, addition is represented by "+", subtraction by "&minus;",
+multiplication by "×", integer division by "/" (returns the quotient), modulo by "%" (returns the
+remainder of an integer division), logical left shifts by "&lt;&lt;", logical right shifts by ">>",
+bitwise AND by "&amp;", and bitwise OR by "|".
+
+<p>For logical right shifts, operands must have at least twenty-one bits of precision.
+
+<hr>
+
+<p>A <dfn id=concept-token>token</dfn> is a piece of data, such as a <a>byte</a>
+or <a>code point</a>.
+
+<p>A <dfn id=concept-stream>stream</dfn> represents an ordered sequence of
+<a>tokens</a>. <dfn>End-of-stream</dfn> is a special
+<a>token</a> that signifies no more
+<a>tokens</a> are in the
+<a for=/>stream</a>.
+
+<p>When a <a>token</a> is
+<dfn id=concept-stream-read for=stream>read</dfn> from a <a for=/>stream</a>,
+the first token in the stream must be returned and subsequently removed. If the stream
+contains no tokens, <a>end-of-stream</a> must be returned instead.
+<!-- this means read is blocking on e.g. networking activity;
+     SimonSapin thinks this is fine, blame him if not -->
+
+<p>When one or more <a>tokens</a> are
+<dfn id=concept-stream-prepend for=stream>prepended</dfn> to a
+<a for=/>stream</a>, those tokens must be inserted, in given order,
+before the first token in the stream.
+
+<p class=example id=example-tokens>Prepending the sequence of tokens <code>&amp;#128169;</code>
+to a stream "<code> hello world</code>" results in a stream
+"<code>&amp;#128169; hello world</code>". The next token to be read would be
+<code>&amp;</code>. <!-- 💩 -->
+
+<p>When one or more <a>tokens</a> are
+<dfn id=concept-stream-push for=stream>pushed</dfn> to a <a for=/>stream</a>,
+those tokens must be inserted, in given order, after the last token in the stream.
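The three stream operations defined above (read, prepend, push) can be sketched as a small class. This is only an illustration: the names `Stream` and `END_OF_STREAM` are hypothetical, and the sketch does not model a read that blocks while waiting for more input.

```python
from collections import deque

END_OF_STREAM = object()  # hypothetical sentinel for the end-of-stream token


class Stream:
    """Ordered sequence of tokens supporting read, prepend, and push."""

    def __init__(self, tokens=()):
        self._tokens = deque(tokens)

    def read(self):
        # Return and remove the first token, or end-of-stream if none remain.
        return self._tokens.popleft() if self._tokens else END_OF_STREAM

    def prepend(self, tokens):
        # Insert before the first token, preserving the given order.
        self._tokens.extendleft(reversed(list(tokens)))

    def push(self, tokens):
        # Insert after the last token, preserving the given order.
        self._tokens.extend(tokens)

    def tokens(self):
        return list(self._tokens)


# Mirrors the prepend example given earlier in this section.
stream = Stream(" hello world")
stream.prepend("&#128169;")
after_prepend = "".join(stream.tokens())
first_token = stream.read()
```

After these calls `after_prepend` holds the prepended reference followed by the original text, and `first_token` is the ampersand.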
+
+
+
+<h2 id=encodings>Encodings</h2>
+
+<p>An <dfn export>encoding</dfn> defines a mapping from a <a>scalar value</a> sequence to
+a <a>byte</a> sequence (and vice versa). Each <a for=/>encoding</a> has a
+<dfn id=name export for=encoding>name</dfn>, and one or more
+<dfn id=label export for=encoding lt=label>labels</dfn>.
+
+<p class="note no-backref">This specification defines three <a for=/>encodings</a> with the same
+names as <i>encoding schemes</i> defined in the Unicode standard: <a>UTF-8</a>, <a>UTF-16LE</a>, and
+<a>UTF-16BE</a>. The <a for=/>encodings</a> differ from the <i>encoding schemes</i> by byte order
+mark (also known as BOM) handling not being part of the <a for=/>encodings</a> themselves and
+instead being part of wrapper algorithms in this specification, whereas byte order mark handling is
+part of the definition of the <i>encoding schemes</i> in the Unicode Standard. <a>UTF-8</a> used
+together with the <a>UTF-8 decode</a> algorithm matches the <i>encoding scheme</i> of the same name.
+This specification does not provide wrapper algorithms that would combine with <a>UTF-16LE</a> and
+<a>UTF-16BE</a> to match the similarly-named <i>encoding schemes</i>. [[UNICODE]]
+
+
+<h3 id=encoders-and-decoders>Encoders and decoders</h3>
+
+<p>Each <a for=/>encoding</a> has an associated <dfn>decoder</dfn> and most of them have an
+associated <dfn>encoder</dfn>. Each <a for=/>decoder</a> and <a for=/>encoder</a> has a
+<dfn>handler</dfn> algorithm. A <a>handler</a> algorithm takes an input
+<a for=/>stream</a> and a <a>token</a>, and returns
+<dfn>finished</dfn>, one or more <a>tokens</a>, <dfn>error</dfn>
+optionally with a <a>code point</a>, or <dfn>continue</dfn>.
+
+<p class="note no-backref">The <a>replacement</a>, <a>UTF-16BE</a>, and
+<a>UTF-16LE</a> <a for=/>encodings</a> have no <a for=/>encoder</a>.
+
+<p>An <dfn>error mode</dfn> as used below is "<code>replacement</code>" (default) or
+"<code>fatal</code>" for a <a for=/>decoder</a> and "<code>fatal</code>" (default) or
+"<code>html</code>" for an <a for=/>encoder</a>.
+
+<p class=note>An XML processor would set <a for=/>error mode</a> to "<code>fatal</code>".
+[[XML]]
+
+<p class=note>"<code>html</code>" exists as an <a for=/>error mode</a> due to URLs and HTML forms
+requiring a non-terminating legacy <a for=/>encoder</a>. The "<code>html</code>"
+<a for=/>error mode</a> causes a sequence to be emitted that cannot be distinguished from
+legitimate input and can therefore lead to silent data loss. Developers are strongly
+encouraged to use the <a>UTF-8</a> <a for=/>encoding</a> to prevent this from
+happening.
+[[URL]]
+[[HTML]]
+
+<p>To <dfn id=concept-encoding-run for=encoding>run</dfn> an <a for=/>encoding</a>'s
+<a for=/>decoder</a> or <a for=/>encoder</a> <var>encoderDecoder</var> with input
+<a for=/>stream</a> <var>input</var>, output
+<a for=/>stream</a> <var>output</var>, and optional
+<a for=/>error mode</a> <var>mode</var>, run these steps:
+
+<ol>
+ <li><p>If <var>mode</var> is not given, set it to "<code>replacement</code>" if
+ <var>encoderDecoder</var> is a <a for=/>decoder</a>, and to "<code>fatal</code>" otherwise.
+
+ <li><p>Let <var>encoderDecoderInstance</var> be a new <var>encoderDecoder</var>.
+
+ <li>
+  <p>While true:
+
+  <ol>
+   <li><p>Let <var>result</var> be the result of
+   <a>processing</a> the result of
+   <a>reading</a> from <var>input</var> for
+   <var>encoderDecoderInstance</var>, <var>input</var>, <var>output</var>, and
+   <var>mode</var>.
+
+   <li><p>If <var>result</var> is not <a>continue</a>, return <var>result</var>.
+
+   <li><p>Otherwise, do nothing.
+  </ol>
+</ol>
+
+<p>To <dfn id=concept-encoding-process for=encoding>process</dfn> a
+<a>token</a> <var>token</var> for an <a for=/>encoding</a>'s
+<a for=/>encoder</a> or <a for=/>decoder</a> instance <var>encoderDecoderInstance</var>,
+<a for=/>stream</a> <var>input</var>, output
+<a for=/>stream</a> <var>output</var>, and optional
+<a for=/>error mode</a> <var>mode</var>, run these steps:
+
+<ol>
+ <li><p>If <var>mode</var> is not given, set it to "<code>replacement</code>" if
+ <var>encoderDecoderInstance</var> is a <a for=/>decoder</a> instance, and to
+ "<code>fatal</code>" otherwise.
+
+ <li><p>Let <var>result</var> be the result of running <var>encoderDecoderInstance</var>'s
+ <a>handler</a> on <var>input</var> and <var>token</var>.
+
+ <li><p>If <var>result</var> is <a>continue</a> or <a>finished</a>, return
+ <var>result</var>.
+
+ <li><p>Otherwise, if <var>result</var> is one or more
+ <a>tokens</a>, <a>push</a>
+ <var>result</var> to <var>output</var>.
+
+ <li>
+  <p>Otherwise, if <var>result</var> is <a>error</a>, switch on <var>mode</var> and
+  run the associated steps:
+
+  <dl class=switch>
+   <dt>"<code>replacement</code>"
+   <dd><a>Push</a> U+FFFD to <var>output</var>.
+   <dt>"<code>html</code>"
+   <dd><a>Prepend</a> U+0026, U+0023, followed by the
+   shortest sequence of <a>ASCII digits</a> representing <var>result</var>'s
+   <a>code point</a> in base ten, followed by U+003B to <var>input</var>.
+   <!-- &# ... ; -->
+   <dt>"<code>fatal</code>"
+   <dd>Return <a>error</a>.
+  </dl>
+
+ <li>Return <a>continue</a>.
+</ol>
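The run and process steps above can be sketched as follows. The two handlers are hypothetical ASCII-only toys written for this illustration, not encodings defined by this specification. They are stateless functions, whereas real decoders and encoders are instances that may keep state, and the error mode is passed explicitly rather than defaulted.

```python
from collections import deque

END = object()        # end-of-stream token
FINISHED = "finished"
CONTINUE = "continue"


class Error:
    """The error result, optionally carrying a code point."""

    def __init__(self, code_point=None):
        self.code_point = code_point


def read(stream):
    return stream.popleft() if stream else END


def toy_ascii_decoder(input_stream, token):
    # Hypothetical handler: bytes 0x00 to 0x7F map to the same code point.
    if token is END:
        return FINISHED
    return [token] if token <= 0x7F else Error()


def toy_ascii_encoder(input_stream, token):
    # Hypothetical handler: code points U+0000 to U+007F map to the same byte.
    if token is END:
        return FINISHED
    return [token] if token <= 0x7F else Error(token)


def process(token, handler, input_stream, output_stream, mode):
    result = handler(input_stream, token)
    if result in (CONTINUE, FINISHED):
        return result
    if isinstance(result, Error):
        if mode == "replacement":
            output_stream.append(0xFFFD)
        elif mode == "html":
            # Prepend "&#", the decimal digits, and ";" to the input.
            reference = [ord(c) for c in "&#%d;" % result.code_point]
            input_stream.extendleft(reversed(reference))
        else:  # "fatal"
            return result
    else:
        output_stream.extend(result)  # one or more tokens
    return CONTINUE


def run(handler, input_stream, output_stream, mode):
    while True:
        token = read(input_stream)
        result = process(token, handler, input_stream, output_stream, mode)
        if result != CONTINUE:
            return result


decoded = deque()
decode_result = run(toy_ascii_decoder, deque(b"hi\xff"), decoded, "replacement")

encoded = deque()
code_points = deque(ord(c) for c in "a\U0001F4A9")
encode_result = run(toy_ascii_encoder, code_points, encoded, "html")
```

In the "html" case the prepended reference is read back through the same handler, which is why the resulting bytes cannot be told apart from input that already contained the reference.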
+
+
+<h3 id=names-and-labels>Names and labels</h3>
+
+<p>The table below lists all <a for=/>encodings</a>
+and their <a>labels</a> user agents must support.
+User agents must not support any other <a for=/>encodings</a>
+or <a>labels</a>.
+
+<p class=note>For each encoding, <a lt="ASCII lowercase">ASCII-lowercasing</a> its
+<a for=encoding>name</a> yields one of its <a for=encoding>labels</a>.
+
+<p>Authors must use the <a>UTF-8</a> <a for=/>encoding</a> and must use the
+<a>ASCII case-insensitive</a> "<code>utf-8</code>" <a>label</a> to
+identify it.
+
+<p>New protocols and formats, as well as existing formats deployed in new contexts, must
+use the <a>UTF-8</a> <a for=/>encoding</a> exclusively. If these protocols and
+formats need to expose the <a for=/>encoding</a>'s <a>name</a> or
+<a>label</a>, they must expose it as "<code>utf-8</code>".
+
+<p>To
+<dfn export lt="get an encoding|getting an encoding" id=concept-encoding-get>get an encoding</dfn>
+from a string <var>label</var>, run these steps:
+
+<ol>
+ <li><p>Remove any leading and trailing <a>ASCII whitespace</a> from
+ <var>label</var>.
+
+ <li><p>If <var>label</var> is an <a>ASCII case-insensitive</a>
+ match for any of the <a>labels</a> listed in the table
+ below, return the corresponding <a for=/>encoding</a>; otherwise, return failure.
+</ol>
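A minimal sketch of the two steps above, assuming the label table is held as a dictionary. `LABELS` is a small excerpt of the table in this section, and the function name and the use of `None` for failure are choices made for this illustration.

```python
import string

# Small excerpt of the label table; the full table is given in this section.
LABELS = {
    "unicode-1-1-utf-8": "UTF-8",
    "utf-8": "UTF-8",
    "utf8": "UTF-8",
    "koi8-r": "KOI8-R",
    "ascii": "windows-1252",
    "iso-8859-1": "windows-1252",
    "latin1": "windows-1252",
    "sjis": "Shift_JIS",
    "utf-16": "UTF-16LE",
}

ASCII_WHITESPACE = "\t\n\f\r "  # ASCII whitespace as defined by the Infra Standard
ASCII_LOWER = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)


def get_encoding(label):
    """Return the encoding name for label, or None for failure."""
    label = label.strip(ASCII_WHITESPACE)
    # Lowercase only A to Z so that non-ASCII look-alikes never match.
    return LABELS.get(label.translate(ASCII_LOWER))
```

The sketch avoids `str.lower()` and the default `str.strip()` on purpose: both also act on non-ASCII characters (for example U+212A KELVIN SIGN lowercases to "k", and U+00A0 counts as whitespace), which would accept labels that the algorithm rejects.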
+
+<p class="note no-backref">This is a more basic and restrictive algorithm for mapping <a>labels</a>
+to <a for=/>encodings</a> than
+<a href=https://www.unicode.org/reports/tr22/tr22-8.html#Charset_Alias_Matching>section 1.4 of Unicode Technical Standard #22</a>
+prescribes, as that is necessary to be compatible with deployed content.
+
+<table>
+ <thead>
+  <tr>
+   <th><a>Name</a>
+   <th><a>Labels</a>
+ <tbody>
+  <tr><th colspan=2><a href=#the-encoding>The Encoding</a>
+  <tr>
+   <td rowspan=3><a>UTF-8</a>
+   <td>"<code>unicode-1-1-utf-8</code>"
+  <tr><td>"<code>utf-8</code>"
+  <tr><td>"<code>utf8</code>"
+ <tbody>
+  <tr><th colspan=2><a href=#legacy-single-byte-encodings>Legacy single-byte encodings</a>
+  <tr>
+   <td rowspan=4><a>IBM866</a>
+   <td>"<code>866</code>"
+  <tr><td>"<code>cp866</code>"
+  <tr><td>"<code>csibm866</code>"
+  <tr><td>"<code>ibm866</code>"
+  <tr>
+   <td rowspan=9><a>ISO-8859-2</a>
+   <td>"<code>csisolatin2</code>"
+  <tr><td>"<code>iso-8859-2</code>"
+  <tr><td>"<code>iso-ir-101</code>"
+  <tr><td>"<code>iso8859-2</code>"
+  <tr><td>"<code>iso88592</code>"
+  <tr><td>"<code>iso_8859-2</code>"
+  <tr><td>"<code>iso_8859-2:1987</code>"
+  <tr><td>"<code>l2</code>"
+  <tr><td>"<code>latin2</code>"
+  <tr>
+   <td rowspan=9><a>ISO-8859-3</a>
+   <td>"<code>csisolatin3</code>"
+  <tr><td>"<code>iso-8859-3</code>"
+  <tr><td>"<code>iso-ir-109</code>"
+  <tr><td>"<code>iso8859-3</code>"
+  <tr><td>"<code>iso88593</code>"
+  <tr><td>"<code>iso_8859-3</code>"
+  <tr><td>"<code>iso_8859-3:1988</code>"
+  <tr><td>"<code>l3</code>"
+  <tr><td>"<code>latin3</code>"
+  <tr>
+   <td rowspan=9><a>ISO-8859-4</a>
+   <td>"<code>csisolatin4</code>"
+  <tr><td>"<code>iso-8859-4</code>"
+  <tr><td>"<code>iso-ir-110</code>"
+  <tr><td>"<code>iso8859-4</code>"
+  <tr><td>"<code>iso88594</code>"
+  <tr><td>"<code>iso_8859-4</code>"
+  <tr><td>"<code>iso_8859-4:1988</code>"
+  <tr><td>"<code>l4</code>"
+  <tr><td>"<code>latin4</code>"
+  <tr>
+   <td rowspan=8><a>ISO-8859-5</a>
+   <td>"<code>csisolatincyrillic</code>"
+  <tr><td>"<code>cyrillic</code>"
+  <tr><td>"<code>iso-8859-5</code>"
+  <tr><td>"<code>iso-ir-144</code>"
+  <tr><td>"<code>iso8859-5</code>"
+  <tr><td>"<code>iso88595</code>"
+  <tr><td>"<code>iso_8859-5</code>"
+  <tr><td>"<code>iso_8859-5:1988</code>"
+  <tr>
+   <td rowspan=14><a>ISO-8859-6</a>
+   <td>"<code>arabic</code>"
+  <tr><td>"<code>asmo-708</code>"
+  <tr><td>"<code>csiso88596e</code>"
+  <tr><td>"<code>csiso88596i</code>"
+  <tr><td>"<code>csisolatinarabic</code>"
+  <tr><td>"<code>ecma-114</code>"
+  <tr><td>"<code>iso-8859-6</code>"
+  <tr><td>"<code>iso-8859-6-e</code>"
+  <tr><td>"<code>iso-8859-6-i</code>"
+  <tr><td>"<code>iso-ir-127</code>"
+  <tr><td>"<code>iso8859-6</code>"
+  <tr><td>"<code>iso88596</code>"
+  <tr><td>"<code>iso_8859-6</code>"
+  <tr><td>"<code>iso_8859-6:1987</code>"
+  <tr>
+   <td rowspan=12><a>ISO-8859-7</a>
+   <td>"<code>csisolatingreek</code>"
+  <tr><td>"<code>ecma-118</code>"
+  <tr><td>"<code>elot_928</code>"
+  <tr><td>"<code>greek</code>"
+  <tr><td>"<code>greek8</code>"
+  <tr><td>"<code>iso-8859-7</code>"
+  <tr><td>"<code>iso-ir-126</code>"
+  <tr><td>"<code>iso8859-7</code>"
+  <tr><td>"<code>iso88597</code>"
+  <tr><td>"<code>iso_8859-7</code>"
+  <tr><td>"<code>iso_8859-7:1987</code>"
+  <tr><td>"<code>sun_eu_greek</code>"
+  <tr>
+   <td rowspan=11><a>ISO-8859-8</a>
+   <td>"<code>csiso88598e</code>"
+  <tr><td>"<code>csisolatinhebrew</code>"
+  <tr><td>"<code>hebrew</code>"
+  <tr><td>"<code>iso-8859-8</code>"
+  <tr><td>"<code>iso-8859-8-e</code>"
+  <tr><td>"<code>iso-ir-138</code>"
+  <tr><td>"<code>iso8859-8</code>"
+  <tr><td>"<code>iso88598</code>"
+  <tr><td>"<code>iso_8859-8</code>"
+  <tr><td>"<code>iso_8859-8:1988</code>"
+  <tr><td>"<code>visual</code>"
+  <tr>
+   <td rowspan=3><a>ISO-8859-8-I</a>
+   <td>"<code>csiso88598i</code>"
+  <tr><td>"<code>iso-8859-8-i</code>"
+  <tr><td>"<code>logical</code>"
+  <tr>
+   <td rowspan=7><a>ISO-8859-10</a>
+   <td>"<code>csisolatin6</code>"
+  <tr><td>"<code>iso-8859-10</code>"
+  <tr><td>"<code>iso-ir-157</code>"
+  <tr><td>"<code>iso8859-10</code>"
+  <tr><td>"<code>iso885910</code>"
+  <tr><td>"<code>l6</code>"
+  <tr><td>"<code>latin6</code>"
+  <tr>
+   <td rowspan=3><a>ISO-8859-13</a>
+   <td>"<code>iso-8859-13</code>"
+  <tr><td>"<code>iso8859-13</code>"
+  <tr><td>"<code>iso885913</code>"
+  <tr>
+   <td rowspan=3><a>ISO-8859-14</a>
+   <td>"<code>iso-8859-14</code>"
+  <tr><td>"<code>iso8859-14</code>"
+  <tr><td>"<code>iso885914</code>"
+  <tr>
+   <td rowspan=6><a>ISO-8859-15</a>
+   <td>"<code>csisolatin9</code>"
+  <tr><td>"<code>iso-8859-15</code>"
+  <tr><td>"<code>iso8859-15</code>"
+  <tr><td>"<code>iso885915</code>"
+  <tr><td>"<code>iso_8859-15</code>"
+  <tr><td>"<code>l9</code>"
+  <tr>
+   <td><a>ISO-8859-16</a>
+   <td>"<code>iso-8859-16</code>"
+  <tr>
+   <td rowspan=5><a>KOI8-R</a>
+   <td>"<code>cskoi8r</code>"
+  <tr><td>"<code>koi</code>"
+  <tr><td>"<code>koi8</code>"
+  <tr><td>"<code>koi8-r</code>"
+  <tr><td>"<code>koi8_r</code>"
+  <tr>
+   <td rowspan=2><a>KOI8-U</a>
+   <td>"<code>koi8-ru</code>"
+  <tr><td>"<code>koi8-u</code>"
+  <tr>
+   <td rowspan=4><a>macintosh</a>
+   <td>"<code>csmacintosh</code>"
+  <tr><td>"<code>mac</code>"
+  <tr><td>"<code>macintosh</code>"
+  <tr><td>"<code>x-mac-roman</code>"
+  <tr>
+   <td rowspan=6><a>windows-874</a>
+   <td>"<code>dos-874</code>"
+  <tr><td>"<code>iso-8859-11</code>"
+  <tr><td>"<code>iso8859-11</code>"
+  <tr><td>"<code>iso885911</code>"
+  <tr><td>"<code>tis-620</code>"
+  <tr><td>"<code>windows-874</code>"
+  <tr>
+   <td rowspan=3><a>windows-1250</a>
+   <td>"<code>cp1250</code>"
+  <tr><td>"<code>windows-1250</code>"
+  <tr><td>"<code>x-cp1250</code>"
+  <tr>
+   <td rowspan=3><a>windows-1251</a>
+   <td>"<code>cp1251</code>"
+  <tr><td>"<code>windows-1251</code>"
+  <tr><td>"<code>x-cp1251</code>"
+  <tr>
+   <td rowspan=17><a>windows-1252</a>
+   <td>"<code>ansi_x3.4-1968</code>"
+  <tr><td>"<code>ascii</code>"
+  <tr><td>"<code>cp1252</code>"
+  <tr><td>"<code>cp819</code>"
+  <tr><td>"<code>csisolatin1</code>"
+  <tr><td>"<code>ibm819</code>"
+  <tr><td>"<code>iso-8859-1</code>"
+  <tr><td>"<code>iso-ir-100</code>"
+  <tr><td>"<code>iso8859-1</code>"
+  <tr><td>"<code>iso88591</code>"
+  <tr><td>"<code>iso_8859-1</code>"
+  <tr><td>"<code>iso_8859-1:1987</code>"
+  <tr><td>"<code>l1</code>"
+  <tr><td>"<code>latin1</code>"
+  <tr><td>"<code>us-ascii</code>"
+  <tr><td>"<code>windows-1252</code>"
+  <tr><td>"<code>x-cp1252</code>"
+  <tr>
+   <td rowspan=3><a>windows-1253</a>
+   <td>"<code>cp1253</code>"
+  <tr><td>"<code>windows-1253</code>"
+  <tr><td>"<code>x-cp1253</code>"
+  <tr>
+   <td rowspan=12><a>windows-1254</a>
+   <td>"<code>cp1254</code>"
+  <tr><td>"<code>csisolatin5</code>"
+  <tr><td>"<code>iso-8859-9</code>"
+  <tr><td>"<code>iso-ir-148</code>"
+  <tr><td>"<code>iso8859-9</code>"
+  <tr><td>"<code>iso88599</code>"
+  <tr><td>"<code>iso_8859-9</code>"
+  <tr><td>"<code>iso_8859-9:1989</code>"
+  <tr><td>"<code>l5</code>"
+  <tr><td>"<code>latin5</code>"
+  <tr><td>"<code>windows-1254</code>"
+  <tr><td>"<code>x-cp1254</code>"
+  <tr>
+   <td rowspan=3><a>windows-1255</a>
+   <td>"<code>cp1255</code>"
+  <tr><td>"<code>windows-1255</code>"
+  <tr><td>"<code>x-cp1255</code>"
+  <tr>
+   <td rowspan=3><a>windows-1256</a>
+   <td>"<code>cp1256</code>"
+  <tr><td>"<code>windows-1256</code>"
+  <tr><td>"<code>x-cp1256</code>"
+  <tr>
+   <td rowspan=3><a>windows-1257</a>
+   <td>"<code>cp1257</code>"
+  <tr><td>"<code>windows-1257</code>"
+  <tr><td>"<code>x-cp1257</code>"
+  <tr>
+   <td rowspan=3><a>windows-1258</a>
+   <td>"<code>cp1258</code>"
+  <tr><td>"<code>windows-1258</code>"
+  <tr><td>"<code>x-cp1258</code>"
+  <tr>
+   <td rowspan=2><a>x-mac-cyrillic</a>
+   <td>"<code>x-mac-cyrillic</code>"
+  <tr><td>"<code>x-mac-ukrainian</code>"
+ <tbody>
+  <tr><th colspan=2><a href=#legacy-multi-byte-chinese-(simplified)-encodings>Legacy multi-byte Chinese (simplified) encodings</a>
+  <tr>
+   <td rowspan=9><a>GBK</a>
+   <td>"<code>chinese</code>"
+  <tr><td>"<code>csgb2312</code>"
+  <tr><td>"<code>csiso58gb231280</code>"
+  <tr><td>"<code>gb2312</code>"
+  <tr><td>"<code>gb_2312</code>"
+  <tr><td>"<code>gb_2312-80</code>"
+  <tr><td>"<code>gbk</code>"
+  <tr><td>"<code>iso-ir-58</code>"
+  <tr><td>"<code>x-gbk</code>"
+  <tr>
+   <td><a>gb18030</a>
+   <td>"<code>gb18030</code>"
+ <tbody>
+  <tr><th colspan=2><a href=#legacy-multi-byte-chinese-(traditional)-encodings>Legacy multi-byte Chinese (traditional) encodings</a>
+  <tr>
+   <td rowspan=5><a>Big5</a>
+   <td>"<code>big5</code>"
+  <tr><td>"<code>big5-hkscs</code>"
+  <tr><td>"<code>cn-big5</code>"
+  <tr><td>"<code>csbig5</code>"
+  <tr><td>"<code>x-x-big5</code>"
+ <tbody>
+  <tr><th colspan=2><a href=#legacy-multi-byte-japanese-encodings>Legacy multi-byte Japanese encodings</a>
+  <tr>
+   <td rowspan=3><a>EUC-JP</a>
+   <td>"<code>cseucpkdfmtjapanese</code>"
+  <tr><td>"<code>euc-jp</code>"
+  <tr><td>"<code>x-euc-jp</code>"
+  <tr>
+   <td rowspan=2><a>ISO-2022-JP</a>
+   <td>"<code>csiso2022jp</code>"
+  <tr><td>"<code>iso-2022-jp</code>"
+  <tr>
+   <td rowspan=8><a>Shift_JIS</a>
+   <td>"<code>csshiftjis</code>"
+  <tr><td>"<code>ms932</code>"
+  <tr><td>"<code>ms_kanji</code>"
+  <tr><td>"<code>shift-jis</code>"
+  <tr><td>"<code>shift_jis</code>"
+  <tr><td>"<code>sjis</code>"
+  <tr><td>"<code>windows-31j</code>"
+  <tr><td>"<code>x-sjis</code>"
+ <tbody>
+  <tr><th colspan=2><a href=#legacy-multi-byte-korean-encodings>Legacy multi-byte Korean encodings</a>
+  <tr>
+   <td rowspan=10><a>EUC-KR</a>
+   <td>"<code>cseuckr</code>"
+  <tr><td>"<code>csksc56011987</code>"
+  <tr><td>"<code>euc-kr</code>"
+  <tr><td>"<code>iso-ir-149</code>"
+  <tr><td>"<code>korean</code>"
+  <tr><td>"<code>ks_c_5601-1987</code>"
+  <tr><td>"<code>ks_c_5601-1989</code>"
+  <tr><td>"<code>ksc5601</code>"
+  <tr><td>"<code>ksc_5601</code>"
+  <tr><td>"<code>windows-949</code>"
+ <tbody>
+  <tr><th colspan=2><a href=#legacy-miscellaneous-encodings>Legacy miscellaneous encodings</a>
+  <tr>
+   <td rowspan=6><a>replacement</a>
+   <td>"<code>csiso2022kr</code>"
+  <tr><td>"<code>hz-gb-2312</code>"
+  <tr><td>"<code>iso-2022-cn</code>"
+  <tr><td>"<code>iso-2022-cn-ext</code>"
+  <tr><td>"<code>iso-2022-kr</code>"
+  <tr><td>"<code>replacement</code>"
+  <tr>
+   <td><a>UTF-16BE</a>
+   <td>"<code>utf-16be</code>"
+  <tr>
+   <td rowspan=2><a>UTF-16LE</a>
+   <td>"<code>utf-16</code>"
+  <tr><td>"<code>utf-16le</code>"
+  <tr>
+   <td><a>x-user-defined</a>
+   <td>"<code>x-user-defined</code>"
+</table>
+
+<p class=note>All <a for=/>encodings</a> and their
+<a>labels</a> are also available as a non-normative
+<a href=encodings.json>encodings.json</a> resource.
+
+
+<h3 id=output-encodings>Output encodings</h3>
+
+<p>To <dfn export>get an output encoding</dfn> from an <a for=/>encoding</a>
+<var>encoding</var>, run these steps:
+
+<ol>
+ <li><p>If <var>encoding</var> is <a>replacement</a>, <a>UTF-16BE</a>, or
+ <a>UTF-16LE</a>, return <a>UTF-8</a>.
+
+ <li><p>Return <var>encoding</var>.
+</ol>
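The steps above amount to a single check. In this sketch encodings are represented by their names as plain strings, which is an assumption made for illustration only.

```python
def get_output_encoding(encoding):
    # replacement, UTF-16BE, and UTF-16LE have no encoder, so UTF-8 is used.
    if encoding in ("replacement", "UTF-16BE", "UTF-16LE"):
        return "UTF-8"
    return encoding
```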
+
+<p class=note>The <a>get an output encoding</a> algorithm is useful for URL parsing and HTML
+form submission, which both need exactly this.
+
+
+
+<h2 id=indexes>Indexes</h2>
+
+<p>Most legacy <a for=/>encodings</a> make use of an <dfn id=index>index</dfn>. An
+<a>index</a> is an ordered list of entries, each entry consisting of a pointer and a
+corresponding code point. Within an <a>index</a> pointers are unique and code points can be
+duplicated.
+
+<p class="note no-backref">An efficient implementation likely has two
+<a lt=index>indexes</a> per <a for=/>encoding</a>: one optimized for its
+<a for=/>decoder</a> and one for its <a for=/>encoder</a>.
+
+<p>To find the pointers and their corresponding code points in an <a>index</a>, let
+<var>lines</var> be the result of splitting the contents of the index's resource on U+000A. Then
+remove each item in <var>lines</var> that is the empty string or starts with U+0023. The pointers
+and their corresponding code points are then found by splitting each remaining item in
+<var>lines</var> on U+0009: the first subitem is the pointer (as a decimal number) and the second
+is the corresponding code point (as a hexadecimal number). Other subitems are not relevant.
+
+<p class="note no-backref">To signify changes an <a>index</a> includes an
+<i>Identifier</i> and a <i>Date</i>. If an <i>Identifier</i> has
+changed, so has the <a>index</a>.
+
+<p>The <dfn>index code point</dfn> for <var>pointer</var> in
+<var>index</var> is the code point corresponding to
+<var>pointer</var> in <var>index</var>, or null if
+<var>pointer</var> is not in <var>index</var>.
+
+<p>The <dfn>index pointer</dfn> for <var>code point</var> in
+<var>index</var> is the <em>first</em> pointer corresponding to
+<var>code point</var> in <var>index</var>, or null if
+<var>code point</var> is not in <var>index</var>.
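The parsing rules and the two lookups above can be sketched as follows. `SAMPLE` is a hypothetical resource written for this illustration: it follows the described line format, but its entries are made up and do not come from any real index. The lookups use linear scans for clarity, where an efficient implementation would use per-direction tables.

```python
# Hypothetical sample in the described format; the entries are made up.
SAMPLE = """\
# Identifier: hypothetical-sample
# Date: hypothetical

0\t0x0041\tA (LATIN CAPITAL LETTER A)
1\t0x00E9\t\u00e9 (LATIN SMALL LETTER E WITH ACUTE)
5\t0x0041\tA (LATIN CAPITAL LETTER A)
"""


def parse_index(text):
    """Return a list of (pointer, code point) entries in file order."""
    entries = []
    for line in text.split("\n"):
        if line == "" or line.startswith("#"):
            continue  # skip empty lines and comment lines (U+0023)
        subitems = line.split("\t")
        # First subitem: decimal pointer. Second: hexadecimal code point.
        entries.append((int(subitems[0], 10), int(subitems[1], 16)))
    return entries


def index_code_point(pointer, index):
    for entry_pointer, code_point in index:
        if entry_pointer == pointer:
            return code_point
    return None


def index_pointer(code_point, index):
    # Entries are ordered, so the first match is the first pointer.
    for pointer, entry_code_point in index:
        if entry_code_point == code_point:
            return pointer
    return None


index = parse_index(SAMPLE)
```

Because code points can be duplicated, `index_pointer` returns the first matching pointer (0 rather than 5 for U+0041 in the sample).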
+
+<div class=note id=visualization>
+ <p>There is a non-normative visualization for each <a>index</a> other than
+ <a>index gb18030 ranges</a> and <a>index ISO-2022-JP katakana</a>. <a>index jis0208</a> also has an
+ alternative <a>Shift_JIS</a> visualization. Additionally, there is a visualization of the Basic
+ Multilingual Plane coverage of each index other than <a>index gb18030 ranges</a> and
+ <a>index ISO-2022-JP katakana</a>.
+
+ <p>The legend for the visualizations is:
+
+ <ul class=visualizationlegend>
+  <li class=unmapped>Unmapped
+  <li class=mid>Two bytes in UTF-8
+  <li class="mid contiguous">Two bytes in UTF-8, code point immediately follows the code point of
+  the previous pointer
+  <li class=upper>Three bytes in UTF-8 (non-PUA)
+  <li class="upper contiguous">Three bytes in UTF-8 (non-PUA), code point immediately follows the
+  code point of the previous pointer
+  <li class=pua>Private Use
+  <li class="pua contiguous">Private Use, code point immediately follows the code point of the
+  previous pointer
+  <li class=astral>Four bytes in UTF-8
+  <li class="astral contiguous">Four bytes in UTF-8, code point immediately follows the code point
+  of the previous pointer
+  <li class=duplicate>Duplicate code point already mapped at an earlier index
+  <li class=compatibility>CJK Compatibility Ideograph
+  <li class=ext>CJK Unified Ideographs Extension A
+ </ul>
+</div>
+
+<p>These are the <a lt=index>indexes</a> defined by this
+specification, excluding <a>index single-byte</a>, which have their own table:
+
+<table>
+ <tbody><tr><th colspan=4><a>Index</a><th>Notes
+ <tr>
+  <td><dfn export>index Big5</dfn>
+  <td><a href=index-big5.txt>index-big5.txt</a>
+  <td><a href=big5.html>index Big5 visualization</a>
+  <td><a href=big5-bmp.html>index Big5 BMP coverage</a>
+  <td>This matches the Big5 standard in combination with the
+  Hong Kong Supplementary Character Set and other common extensions.
+ <tr>
+  <td><dfn export>index EUC-KR</dfn>
+  <td><a href=index-euc-kr.txt>index-euc-kr.txt</a>
+  <td><a href=euc-kr.html>index EUC-KR visualization</a>
+  <td><a href=euc-kr-bmp.html>index EUC-KR BMP coverage</a>
+  <td>This matches the KS X 1001 standard and the Unified Hangul Code, more commonly known together
+  as Windows Codepage 949. It covers the Hangul Syllables block of Unicode in its entirety. The
+  Hangul block whose top left corner in the visualization is at pointer 9026 is in the Unicode
+  order. Taken separately, the rest of the Hangul syllables in this index are in the Unicode order,
+  too.
+ <tr>
+  <td><dfn export>index gb18030</dfn>
+  <td><a href=index-gb18030.txt>index-gb18030.txt</a>
+  <td><a href=gb18030.html>index gb18030 visualization</a>
+  <td><a href=gb18030-bmp.html>index gb18030 BMP coverage</a>
+  <td>This matches the GB18030-2005 standard for code points encoded as two bytes, except for
+  0xA3 0xA0 which maps to U+3000 to be compatible with deployed content. This index covers the
+  CJK Unified Ideographs block of Unicode in its entirety. Entries from that block that are above or
+  to the left of (the first) U+3000 in the visualization are in the Unicode order.
+  <!-- https://bugzilla.mozilla.org/show_bug.cgi?id=131837
+       https://bugs.webkit.org/show_bug.cgi?id=17014
+       https://www.w3.org/Bugs/Public/show_bug.cgi?id=25396
+       https://github.com/whatwg/encoding/issues/17 -->
+ <tr>
+  <td><dfn export>index gb18030 ranges</dfn>
+  <td colspan=3><a href=index-gb18030-ranges.txt>index-gb18030-ranges.txt</a>
+  <td>This <a>index</a> works differently from all others. Listing all code points would result
+  in over a million items whereas they can be represented neatly in 207 ranges combined with trivial
+  limit checks. It therefore only superficially matches the GB18030-2005 standard for code points
+  encoded as four bytes. See also <a>index gb18030 ranges code point</a> and
+  <a>index gb18030 ranges pointer</a> below.
+ <tr>
+  <td><dfn export>index jis0208</dfn>
+  <td><a href=index-jis0208.txt>index-jis0208.txt</a>
+  <td><a href=jis0208.html>index jis0208 visualization</a>, <a href=shift_jis.html>Shift_JIS visualization</a>
+  <td><a href=jis0208-bmp.html>index jis0208 BMP coverage</a>
+  <td>This is the JIS X 0208 standard including formerly proprietary
+  extensions from IBM and NEC.
+  <!-- NEC = Nippon Electric Company -->
+ <tr>
+  <td><dfn export>index jis0212</dfn>
+  <td><a href=index-jis0212.txt>index-jis0212.txt</a>
+  <td><a href=jis0212.html>index jis0212 visualization</a>
+  <td><a href=jis0212-bmp.html>index jis0212 BMP coverage</a>
+  <td>This is the JIS X 0212 standard. It is only used by the <a>EUC-JP decoder</a>
+  due to lack of widespread support elsewhere.
+  <!--
+   No JIS X 0212 EUC-JP encoder support:
+     https://bugzilla.mozilla.org/show_bug.cgi?id=600715
+     https://code.google.com/p/chromium/issues/detail?id=78847
+
+   No JIS X 0212 ISO-2022-JP support:
+     https://www.w3.org/Bugs/Public/show_bug.cgi?id=26885
+  -->
+ <tr>
+  <td><dfn export>index ISO-2022-JP katakana</dfn>
+  <td colspan=3><a href=index-iso-2022-jp-katakana.txt>index-iso-2022-jp-katakana.txt</a>
+  <td>This maps halfwidth to fullwidth katakana as per Unicode Normalization Form KC, except that
+  U+FF9E and U+FF9F map to U+309B and U+309C rather than U+3099 and U+309A. It is only used by the
+  <a>ISO-2022-JP encoder</a>. [[UNICODE]]
+</table>
+
+<p>The <dfn>index gb18030 ranges code point</dfn> for <var>pointer</var> is
+the return value of these steps:
+
+<ol>
+ <li><p>If <var>pointer</var> is greater than 39419 and less than
+ 189000, or <var>pointer</var> is greater than 1237575, return null.
+
+ <li><p>If <var>pointer</var> is 7457, return code point U+E7C7.
+ <!-- 7457 is 0x81 0x35 0xF4 0x37 -->
+
+ <li><p>Let <var>offset</var> be the last pointer in
+ <a>index gb18030 ranges</a> that is equal to or less than
+ <var>pointer</var> and let <var>code point offset</var> be its
+ corresponding code point.
+
+ <li><p>Return a code point whose value is
+ <var>code point offset</var> + <var>pointer</var> &minus; <var>offset</var>.
+</ol>
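+
+<div class=example>
+ <p>The steps above can be sketched in JavaScript as follows. This is a minimal, non-normative
+ sketch. The function name <code>gb18030RangesCodePoint</code> and the
+ <code>RANGES_EXCERPT</code> table are assumptions made for illustration. The table holds only a
+ few entries of <a>index gb18030 ranges</a>, so results are only meaningful for pointers that fall
+ in the ranges those entries start.
```javascript
// Abbreviated, illustrative excerpt of index gb18030 ranges as
// [pointer, code point] pairs sorted by pointer. The real index has many
// more entries; this excerpt is an assumption made for the sketch.
const RANGES_EXCERPT = [
  [0, 0x0080],
  [36, 0x00a5],
  [38, 0x00a9],
  [189000, 0x10000],
];

function gb18030RangesCodePoint(pointer, ranges = RANGES_EXCERPT) {
  // Step 1: pointers in the unassigned gap, or past U+10FFFF, map to null.
  if ((pointer > 39419 && pointer < 189000) || pointer > 1237575) {
    return null;
  }
  // Step 2: the single special case.
  if (pointer === 7457) {
    return 0xe7c7;
  }
  // Step 3: find the last entry whose pointer is <= the given pointer.
  let offset = 0;
  let codePointOffset = 0;
  for (const [rangePointer, rangeCodePoint] of ranges) {
    if (rangePointer > pointer) break;
    offset = rangePointer;
    codePointOffset = rangeCodePoint;
  }
  // Step 4: apply the distance from the range start to the code point.
  return codePointOffset + pointer - offset;
}
```
+ <p>With this excerpt, <code>gb18030RangesCodePoint(37)</code> falls in the range starting at
+ pointer 36 and therefore yields U+00A6.
+</div>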
+
+<p>The <dfn>index gb18030 ranges pointer</dfn> for <var>code point</var> is
+the return value of these steps:
+
+<ol>
+ <li><p>If <var>code point</var> is U+E7C7, return pointer 7457.
+
+ <li><p>Let <var>offset</var> be the last code point in
+ <a>index gb18030 ranges</a> that is equal to or less than
+ <var>code point</var> and let <var>pointer offset</var> be its
+ corresponding pointer.
+
+ <li><p>Return a pointer whose value is
+ <var>pointer offset</var> + <var>code point</var> &minus; <var>offset</var>.
+</ol>
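+
+<div class=example>
+ <p>The reverse lookup can be sketched the same way. This is a non-normative sketch. The names
+ <code>gb18030RangesPointer</code>, <code>pointerToFourBytes</code>, and
+ <code>RANGES_BY_CODE_POINT</code> are assumptions made for illustration, and the table is only a
+ small excerpt of <a>index gb18030 ranges</a>. The byte layout in <code>pointerToFourBytes</code>
+ assumes the usual gb18030 four-byte structure and is included only to check the byte sequence
+ quoted for pointer 7457.
```javascript
// Abbreviated, illustrative excerpt of index gb18030 ranges as
// [pointer, code point] pairs sorted by code point (an assumption made
// for this sketch; the real index has many more entries).
const RANGES_BY_CODE_POINT = [
  [0, 0x0080],
  [36, 0x00a5],
  [38, 0x00a9],
  [189000, 0x10000],
];

function gb18030RangesPointer(codePoint, ranges = RANGES_BY_CODE_POINT) {
  // Step 1: the single special case.
  if (codePoint === 0xe7c7) {
    return 7457;
  }
  // Step 2: find the last entry whose code point is <= the given code point.
  let offset = 0;
  let pointerOffset = 0;
  for (const [rangePointer, rangeCodePoint] of ranges) {
    if (rangeCodePoint > codePoint) break;
    offset = rangeCodePoint;
    pointerOffset = rangePointer;
  }
  // Step 3: apply the distance from the range start to the pointer.
  return pointerOffset + codePoint - offset;
}

// Hypothetical helper: split a pointer into four gb18030 bytes, assuming
// lead/third bytes start at 0x81 and second/fourth bytes start at 0x30.
function pointerToFourBytes(pointer) {
  const byte1 = Math.floor(pointer / (10 * 126 * 10));
  pointer %= 10 * 126 * 10;
  const byte2 = Math.floor(pointer / (10 * 126));
  pointer %= 10 * 126;
  const byte3 = Math.floor(pointer / 10);
  const byte4 = pointer % 10;
  return [byte1 + 0x81, byte2 + 0x30, byte3 + 0x81, byte4 + 0x30];
}
```
+</div>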
+
+<p>The <dfn>index Shift_JIS pointer</dfn> for <var>code point</var> is the return value of these
+steps:
+
+<ol>
+ <li>
+  <p>Let <var>index</var> be <a>index jis0208</a> excluding all entries whose pointer is in
+  the range 8272 to 8835, inclusive.
+  <!-- selected NEC duplicates from IBM extensions later in the index; need to use IBM
+       extensions when going back to bytes -->
+
+  <p class=note>The <a>index jis0208</a> contains duplicate code points so the exclusion of
+  these entries causes later code points to be used.
+
+ <li><p>Return the <a>index pointer</a> for <var>code point</var> in
+ <var>index</var>.
+</ol>
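+
+<div class=example>
+ <p>A minimal, non-normative sketch of these steps follows. The index is modeled as an array of
+ <code>[pointer, code point]</code> pairs sorted by pointer. That representation, the function
+ names, and the entries used in the usage line are assumptions made for illustration; they are not
+ real <a>index jis0208</a> data.
```javascript
// Hypothetical helper modeling "index pointer": the first pointer whose
// entry maps to codePoint, or null if there is none.
function indexPointer(codePoint, index) {
  for (const [pointer, entryCodePoint] of index) {
    if (entryCodePoint === codePoint) return pointer;
  }
  return null;
}

function indexShiftJISPointer(codePoint, jis0208) {
  // Step 1: drop entries whose pointer is in the range 8272 to 8835, inclusive.
  const index = jis0208.filter(
    ([pointer]) => pointer < 8272 || pointer > 8835
  );
  // Step 2: ordinary first-match lookup in the filtered index.
  return indexPointer(codePoint, index);
}
```
+ <p>For a made-up index <code>[[8300, 0x7e8a], [10744, 0x7e8a]]</code>, the lookup for 0x7E8A
+ skips pointer 8300 and returns 10744, which is the effect the note above describes.
+</div>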
+
+<p>The <dfn>index Big5 pointer</dfn> for <var>code point</var> is the return value of
+these steps:
+
+<ol>
+ <li>
+  <p>Let <var>index</var> be <a>index Big5</a> excluding all entries whose pointer is less
+  than (0xA1 - 0x81) × 157.
+
+  <p class=note>Avoid returning Hong Kong Supplementary Character Set extensions literally.
+
+ <li>
+  <p>If <var>code point</var> is U+2550, U+255E, U+2561, U+256A, U+5341, or U+5345,
+  return the <em>last</em> pointer corresponding to <var>code point</var> in
+  <var>index</var>.
+  <!-- https://www.w3.org/Bugs/Public/show_bug.cgi?id=27878 -->
+
+  <p class=note>There are other duplicate code points, but for those the <em>first</em> pointer is
+  to be used.
+
+ <li><p>Return the <a>index pointer</a> for <var>code point</var> in
+ <var>index</var>.
+</ol>
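+
+<div class=example>
+ <p>A minimal, non-normative sketch of these steps follows. The index is modeled as an array of
+ <code>[pointer, code point]</code> pairs sorted by pointer. That representation and the names
+ used are assumptions made for illustration; no real <a>index Big5</a> data is included.
```javascript
// (0xA1 - 0x81) * 157 = 5024: entries below this pointer are excluded.
const BIG5_MIN_POINTER = (0xa1 - 0x81) * 157;

// Code points for which the last matching pointer is used.
const USE_LAST_POINTER = new Set([
  0x2550, 0x255e, 0x2561, 0x256a, 0x5341, 0x5345,
]);

function indexBig5Pointer(codePoint, big5) {
  // Step 1: exclude the low pointers.
  const index = big5.filter(([pointer]) => pointer >= BIG5_MIN_POINTER);

  // Step 2: for the listed code points, return the last matching pointer.
  if (USE_LAST_POINTER.has(codePoint)) {
    let last = null;
    for (const [pointer, entryCodePoint] of index) {
      if (entryCodePoint === codePoint) last = pointer;
    }
    return last;
  }

  // Step 3: otherwise return the first matching pointer, or null.
  for (const [pointer, entryCodePoint] of index) {
    if (entryCodePoint === codePoint) return pointer;
  }
  return null;
}
```
+</div>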
+
+<hr>
+
+<p class="note no-backref">All <a lt=index>indexes</a> are also available as a non-normative
+<a href=indexes.json>indexes.json</a> resource. (<a>Index gb18030 ranges</a> has a slightly
+different format here, to be able to represent ranges.)
+
+
+
+<h2 id=specification-hooks>Hooks for standards</h2>
+
+<div class=note>
+ <p>The algorithms defined below (<a>decode</a>, <a>UTF-8 decode</a>,
+ <a>UTF-8 decode without BOM</a>, <a>UTF-8 decode without BOM or fail</a>, <a for=/>encode</a>, and
+ <a>UTF-8 encode</a>) are intended for usage by other standards.
+
+ <p>For decoding, <a>UTF-8 decode</a> is to be used by new formats. For identifiers or byte
+ sequences within a format or protocol, use <a>UTF-8 decode without BOM</a> or
+ <a>UTF-8 decode without BOM or fail</a>.
+
+ <p>For encoding, <a>UTF-8 encode</a> is to be used.
+
+ <p>Standards are strongly discouraged from using <a>decode</a> and <a for=/>encode</a>, except as
+ needed for compatibility.
+
+ <p>The <a>get an encoding</a> algorithm is to be used to turn a <a>label</a> into an
+ <a for=/>encoding</a>.
+</div>
+
+<p>To <dfn export>decode</dfn> a byte stream <var>stream</var> using
+fallback encoding <var>encoding</var>, run these steps:
+
+<ol>
+ <li><p>Let <var>buffer</var> be an empty byte sequence.
+
+ <li><p>Let <var>BOM seen flag</var> be unset.
+
+ <li><p><a>Read</a> bytes from <var>stream</var>
+ into <var>buffer</var> until either <var>buffer</var> contains three bytes or
+ <a>read</a> returns <a>end-of-stream</a>.
+
+ <li>
+  <p>For each of the rows in the table below, starting with the first
+  one and going down, if the first bytes of <var>buffer</var> match
+  all the bytes given in the first column, then set <var>encoding</var>
+  to the <a for=/>encoding</a> given in the cell in the second column of
+  that row and set <var>BOM seen flag</var>.
+
+  <table>
+   <tbody><tr><th>Byte order mark<th>Encoding
+   <tr><td>0xEF 0xBB 0xBF<td><a>UTF-8</a>
+   <tr><td>0xFE 0xFF<td><a>UTF-16BE</a>
+   <tr><td>0xFF 0xFE<td><a>UTF-16LE</a>
+  </table>
+
+  <p class=note>For compatibility with deployed content, the byte order mark is more authoritative
+  than anything else. In a context where HTTP is used this is in violation of the semantics of the
+  `<code>Content-Type</code>` header.
+
+ <li><p>If <var>BOM seen flag</var> is unset,
+ <a>prepend</a> <var>buffer</var> to <var>stream</var>.
+
+ <li><p>Otherwise, if <var>BOM seen flag</var> is set, <var>encoding</var> is not
+ <a>UTF-8</a>, and <var>buffer</var> contains three bytes,
+ <a>prepend</a> the last byte of <var>buffer</var> to
+ <var>stream</var>.
+
+ <li><p>Let <var>output</var> be a code point <a for=/>stream</a>.
+
+ <li><p><a>Run</a> <var>encoding</var>'s
+ <a for=/>decoder</a> with <var>stream</var> and <var>output</var>.
+
+ <li><p>Return <var>output</var>.
+</ol>
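+
+<div class=example>
+ <p>The byte order mark handling of <a>decode</a> can be approximated in JavaScript as follows.
+ This is a non-normative sketch that works on a complete {{Uint8Array}} rather than a stream. The
+ function names <code>sniffBOM</code> and <code>decodeWithBOMSniffing</code> are assumptions made
+ for illustration, and the sketch relies on {{TextDecoder}} supporting the chosen encoding.
```javascript
// Returns the encoding to use and how many leading bytes the BOM occupies.
function sniffBOM(bytes, fallbackEncoding) {
  const b = bytes.subarray(0, 3);
  if (b.length >= 3 && b[0] === 0xef && b[1] === 0xbb && b[2] === 0xbf) {
    return { encoding: "UTF-8", bomLength: 3 };
  }
  if (b.length >= 2 && b[0] === 0xfe && b[1] === 0xff) {
    return { encoding: "UTF-16BE", bomLength: 2 };
  }
  if (b.length >= 2 && b[0] === 0xff && b[1] === 0xfe) {
    return { encoding: "UTF-16LE", bomLength: 2 };
  }
  // No BOM: keep the fallback encoding and consume nothing.
  return { encoding: fallbackEncoding, bomLength: 0 };
}

function decodeWithBOMSniffing(bytes, fallbackEncoding) {
  const { encoding, bomLength } = sniffBOM(bytes, fallbackEncoding);
  // The BOM, if any, is skipped here, so ignoreBOM keeps TextDecoder from
  // stripping a second U+FEFF that is part of the content.
  const decoder = new TextDecoder(encoding, { ignoreBOM: true });
  return decoder.decode(bytes.subarray(bomLength));
}
```
+ <p>Because the byte order mark wins over the fallback,
+ <code>decodeWithBOMSniffing(Uint8Array.of(0xFF, 0xFE, 0x41, 0x00), "utf-8")</code> decodes the
+ input as <a>UTF-16LE</a>.
+</div>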
+
+<p>To <dfn export>UTF-8 decode</dfn> a byte stream <var>stream</var>, run
+these steps:
+
+<ol>
+ <li><p>Let <var>buffer</var> be an empty byte sequence.
+
+ <li><p><a>Read</a> three bytes from <var>stream</var>
+ into <var>buffer</var>.
+
+ <li><p>If <var>buffer</var> does not match 0xEF 0xBB 0xBF,
+ <a>prepend</a> <var>buffer</var> to <var>stream</var>.
+
+ <li><p>Let <var>output</var> be a code point <a for=/>stream</a>.
+
+ <li><p><a>Run</a> <a>UTF-8</a>'s
+ <a for=/>decoder</a> with <var>stream</var> and <var>output</var>.
+
+ <li><p>Return <var>output</var>.
+</ol>
+
+<p>To <dfn export>UTF-8 decode without BOM</dfn> a byte stream <var>stream</var>, run these
+steps:
+
+<ol>
+ <li><p>Let <var>output</var> be a code point <a for=/>stream</a>.
+
+ <li><p><a>Run</a> <a>UTF-8</a>'s
+ <a for=/>decoder</a> with <var>stream</var> and <var>output</var>.
+
+ <li><p>Return <var>output</var>.
+</ol>
+
+<p>To <dfn export>UTF-8 decode without BOM or fail</dfn> a byte stream <var>stream</var>, run these
+steps:
+<!-- Needed by https://tools.ietf.org/html/rfc6455#section-8.1 and
+     https://webassembly.github.io/spec/js-api/#dom-module-customsections-moduleobject-sectionname
+     -->
+
+<ol>
+ <li><p>Let <var>output</var> be a code point <a for=/>stream</a>.
+
+ <li><p>Let <var>potentialError</var> be the result of <a>running</a>
+ <a>UTF-8</a>'s <a for=/>decoder</a> with <var>stream</var>, <var>output</var>, and
+ "<code>fatal</code>".
+
+ <li><p>If <var>potentialError</var> is <a>error</a>, return failure.
+
+ <li><p>Return <var>output</var>.
+</ol>
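+
+<div class=example>
+ <p>From script, the three UTF-8 hooks above can be approximated with {{TextDecoder}} options.
+ This is a non-normative sketch: the function names are assumptions made for illustration, the
+ functions take a complete {{Uint8Array}} rather than a stream, and <code>null</code> stands in
+ for failure.
```javascript
// Approximates "UTF-8 decode": a leading BOM is stripped, errors become U+FFFD.
function utf8Decode(bytes) {
  return new TextDecoder("utf-8").decode(bytes);
}

// Approximates "UTF-8 decode without BOM": a leading BOM is kept as U+FEFF.
function utf8DecodeWithoutBOM(bytes) {
  return new TextDecoder("utf-8", { ignoreBOM: true }).decode(bytes);
}

// Approximates "UTF-8 decode without BOM or fail": null stands in for failure.
function utf8DecodeWithoutBOMOrFail(bytes) {
  try {
    const decoder = new TextDecoder("utf-8", { fatal: true, ignoreBOM: true });
    return decoder.decode(bytes);
  } catch (error) {
    if (error instanceof TypeError) return null;
    throw error;
  }
}
```
+</div>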
+
+<hr>
+
+<p>To <dfn export>encode</dfn> a code point stream <var>stream</var> using
+encoding <var>encoding</var>, run these steps:
+
+<ol>
+ <li><p>Assert: <var>encoding</var> is not <a>replacement</a>, <a>UTF-16BE</a>, or
+ <a>UTF-16LE</a>.
+
+ <li><p>Let <var>output</var> be a byte <a for=/>stream</a>.
+
+ <li><p><a>Run</a> <var>encoding</var>'s
+ <a for=/>encoder</a> with <var>stream</var>, <var>output</var>, and "<code>html</code>".
+
+ <li><p>Return <var>output</var>.
+</ol>
+
+<p class="note no-backref">This is mostly a legacy hook for URLs and HTML forms. Layering
+<a>UTF-8 encode</a> on top is safe as it never triggers
+<a>errors</a>.
+[[URL]]
+[[HTML]]
+
+<p>To <dfn export>UTF-8 encode</dfn> a code point stream <var>stream</var>,
+return the result of <a lt=encode for=/>encoding</a>
+<var>stream</var> using encoding <a>UTF-8</a>.
+
+
+
+<h2 id=api>API</h2>
+
+<p>This section uses terminology from Web IDL. Browser user agents must support this API. JavaScript
+implementations should support this API. Other user agents or programming languages are encouraged
+to use an API suitable to their needs, which might not be this one. [[!WEBIDL]]
+
+<div class=example id=example-textencoder>
+ <p>The following example uses the {{TextEncoder}} object to encode an array of strings into an
+ {{ArrayBuffer}}. The resulting buffer contains the number of strings (as a big-endian unsigned
+ 32-bit integer), followed by the byte length of the first string (as a big-endian unsigned 32-bit
+ integer), the <a>UTF-8</a> encoded string data, the byte length of the second string, the string
+ data, and so on.
+ <pre><code class=lang-javascript>
+function encodeArrayOfStrings(strings) {
+  var encoder, encoded, len, bytes, view, offset, i;
+
+  encoder = new TextEncoder();
+  encoded = [];
+
+  // First pass: encode each string and compute the total size.
+  len = Uint32Array.BYTES_PER_ELEMENT; // room for the string count
+  for (i = 0; i &lt; strings.length; i++) {
+    len += Uint32Array.BYTES_PER_ELEMENT; // room for the length prefix
+    encoded[i] = encoder.encode(strings[i]);
+    len += encoded[i].byteLength;
+  }
+
+  bytes = new Uint8Array(len);
+  view = new DataView(bytes.buffer);
+  offset = 0;
+
+  // Second pass: write the count, then each length prefix and its data.
+  // DataView.setUint32() writes big-endian unless told otherwise.
+  view.setUint32(offset, strings.length);
+  offset += Uint32Array.BYTES_PER_ELEMENT;
+  for (i = 0; i &lt; encoded.length; i++) {
+    len = encoded[i].byteLength;
+    view.setUint32(offset, len);
+    offset += Uint32Array.BYTES_PER_ELEMENT;
+    bytes.set(encoded[i], offset);
+    offset += len;
+  }
+  return bytes.buffer;
+}</code></pre>
+
+ <p>The following example decodes an {{ArrayBuffer}} containing data encoded in the
+ format produced by the previous example, or an equivalent algorithm for encodings other than
+ <a>UTF-8</a>, back into an array of strings.
+
+ <pre><code class=lang-javascript>
+function decodeArrayOfStrings(buffer, encoding) {
+  var decoder, view, offset, num_strings, strings, len;
+
+  decoder = new TextDecoder(encoding);
+  view = new DataView(buffer);
+  offset = 0;
+  strings = [];
+
+  num_strings = view.getUint32(offset);
+  offset += Uint32Array.BYTES_PER_ELEMENT;
+  for (var i = 0; i &lt; num_strings; i++) {
+    len = view.getUint32(offset);
+    offset += Uint32Array.BYTES_PER_ELEMENT;
+    strings[i] = decoder.decode(
+      new DataView(view.buffer, offset, len));
+    offset += len;
+  }
+  return strings;
+}</code></pre>
+</div>
+
+
+<h3 id=interface-mixin-textdecodercommon>Interface mixin {{TextDecoderCommon}}</h3>
+
+<pre class=idl>
+interface mixin TextDecoderCommon {
+  readonly attribute DOMString encoding;
+  readonly attribute boolean fatal;
+  readonly attribute boolean ignoreBOM;
+};
+</pre>
+
+<p>The {{TextDecoderCommon}} interface mixin defines common attributes that are shared between
+{{TextDecoder}} and {{TextDecoderStream}} objects. These objects have an associated
+<dfn id=textdecoder-encoding for=TextDecoderCommon>encoding</dfn>,
+<dfn id=textdecoder-ignore-bom-flag for=TextDecoderCommon>ignore BOM flag</dfn> (initially unset),
+<dfn id=textdecoder-bom-seen-flag for=TextDecoderCommon>BOM seen flag</dfn> (initially unset), and
+<dfn id=textdecoder-error-mode for=TextDecoderCommon>error mode</dfn> (initially
+"<code>replacement</code>").
+
+<p>These objects also have an associated
+<dfn id=concept-td-serialize for=TextDecoderCommon>serialize stream</dfn> algorithm that, given a
+<a for=/>stream</a> <var>stream</var>, runs these steps:
+
+<ol>
+ <li><p>Let <var>output</var> be the empty string.
+
+ <li>
+  <p>While true:
+
+  <ol>
+   <li><p>Let <var>token</var> be the result of <a>reading</a> from <var>stream</var>.
+
+   <li>
+    <p>If <a for=TextDecoderCommon>encoding</a> is <a>UTF-8</a>, <a>UTF-16BE</a>, or
+    <a>UTF-16LE</a>, and <a for=TextDecoderCommon>ignore BOM flag</a> and
+    <a for=TextDecoderCommon>BOM seen flag</a> are unset, then:
+
+    <ol>
+     <li><p>If <var>token</var> is U+FEFF, then set <a for=TextDecoderCommon>BOM seen flag</a>.
+
+     <li><p>Otherwise, if <var>token</var> is not <a>end-of-stream</a>, then set
+     <a for=TextDecoderCommon>BOM seen flag</a> and append <var>token</var> to <var>output</var>.
+
+     <li><p>Otherwise, return <var>output</var>.
+    </ol>
+
+   <li><p>Otherwise, if <var>token</var> is not <a>end-of-stream</a>, then append <var>token</var>
+   to <var>output</var>.
+
+   <li><p>Otherwise, return <var>output</var>.
+  </ol>
+</ol>
+
+<p class=note>This algorithm is intentionally different with respect to BOM handling from
+the <a for=/>decode</a> algorithm used by the rest of the platform to give API users more
+control.
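+
+<div class=example>
+ <p>The BOM handling of <a for=TextDecoderCommon>serialize stream</a> can be modeled in JavaScript
+ as follows. This is a non-normative sketch: the function name, the <code>state</code> object with
+ its <code>encoding</code>, <code>ignoreBOM</code>, and <code>bomSeen</code> properties, and the
+ use of an array of code points (whose end stands in for <a>end-of-stream</a>) are assumptions
+ made for illustration.
```javascript
const BOM_AWARE_ENCODINGS = ["UTF-8", "UTF-16BE", "UTF-16LE"];

// tokens: array of code points; state: { encoding, ignoreBOM, bomSeen }.
function serializeStream(tokens, state) {
  let output = "";
  for (const token of tokens) {
    const checkForBOM =
      BOM_AWARE_ENCODINGS.includes(state.encoding) &&
      !state.ignoreBOM &&
      !state.bomSeen;
    if (checkForBOM) {
      // The first token decides: either way the BOM seen flag is now set.
      state.bomSeen = true;
      if (token === 0xfeff) continue; // drop a leading U+FEFF
    }
    output += String.fromCodePoint(token);
  }
  // Running out of tokens models end-of-stream.
  return output;
}
```
+ <p>Because <code>state</code> persists across calls, only the first U+FEFF of a sequence of calls
+ is dropped; a later U+FEFF is kept as content.
+</div>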
+
+<hr>
+
+<p>The <dfn attribute id=dom-textdecoder-encoding for=TextDecoderCommon><code>encoding</code></dfn>
+attribute's getter, when invoked, must return this object's <a for=TextDecoderCommon>encoding</a>'s
+<a for=encoding>name</a> in <a>ASCII lowercase</a>.
+
+<p>The <dfn attribute id=dom-textdecoder-fatal for=TextDecoderCommon><code>fatal</code></dfn>
+attribute's getter, when invoked, must return true if this object's
+<a for=TextDecoderCommon>error mode</a> is "<code>fatal</code>", and false otherwise.
+
+<p>The
+<dfn attribute id=dom-textdecoder-ignorebom for=TextDecoderCommon><code>ignoreBOM</code></dfn>
+attribute's getter, when invoked, must return true if this object's
+<a for=TextDecoderCommon>ignore BOM flag</a> is set, and false otherwise.
+
+
+<h3 id=interface-textdecoder>Interface {{TextDecoder}}</h3>
+
+<pre class=idl>
+dictionary TextDecoderOptions {
+  boolean fatal = false;
+  boolean ignoreBOM = false;
+};
+
+dictionary TextDecodeOptions {
+  boolean stream = false;
+};
+
+[Constructor(optional DOMString label = "utf-8", optional TextDecoderOptions options),
+ Exposed=(Window,Worker)]
+interface TextDecoder {
+  USVString decode(optional BufferSource input, optional TextDecodeOptions options);
+};
+TextDecoder includes TextDecoderCommon;
+</pre>
+
+<p>A {{TextDecoder}} object has an associated <dfn for=TextDecoder>decoder</dfn>,
+<dfn for=TextDecoder>stream</dfn>, and <dfn for=TextDecoder>do not flush flag</dfn> (initially
+unset).
+
+<dl class=domintro>
+ <dt><code><var>decoder</var> = new <a constructor for=TextDecoder lt=TextDecoder()>TextDecoder([<var>label</var> = "utf-8" [, <var>options</var>]])</a></code>
+ <dd>
+  <p>Returns a new {{TextDecoder}} object.
+  <p>If <var>label</var> is either not a <a>label</a> or is a
+  <a>label</a> for <a>replacement</a>,
+  <a>throws</a> a
+  {{RangeError}}.
+
+ <dt><code><var>decoder</var> . <a attribute for=TextDecoderCommon>encoding</a></code>
+ <dd><p>Returns <a for=TextDecoderCommon>encoding</a>'s <a>name</a>, lowercased.
+
+ <dt><code><var>decoder</var> . <a attribute for=TextDecoderCommon>fatal</a></code>
+ <dd><p>Returns true if <a for=TextDecoderCommon>error mode</a> is "<code>fatal</code>", and
+ false otherwise.
+
+ <dt><code><var>decoder</var> . <a attribute for=TextDecoderCommon>ignoreBOM</a></code>
+ <dd><p>Returns true if <a for=TextDecoderCommon>ignore BOM flag</a> is set, and false
+ otherwise.
+
+ <dt><code><var>decoder</var> . <a method for=TextDecoder lt=decode()>decode([<var>input</var> [, <var>options</var>]])</a></code>
+ <dd>
+  <p>Returns the result of running <a for=TextDecoderCommon>encoding</a>'s <a for=/>decoder</a>.
+  The method can be invoked zero or more times with <var>options</var>'s <code>stream</code> set to
+  true, and then once without <var>options</var>'s <code>stream</code> (or set to false), to process
+  a fragmented stream. If the invocation without <var>options</var>'s <code>stream</code> (or set to
+  false) has no <var>input</var>, it's clearest to omit both arguments.
+
+  <pre class=example id=example-end-of-stream><code class=lang-javascript>
+var string = "", decoder = new TextDecoder(encoding), buffer;
+while(buffer = next_chunk()) {
+  string += decoder.decode(buffer, {stream:true});
+}
+string += decoder.decode(); // end-of-stream</code></pre>
+
+  <p>If the <a for=TextDecoderCommon>error mode</a> is "<code>fatal</code>" and
+  <a for=TextDecoderCommon>encoding</a>'s <a for=/>decoder</a> returns <a>error</a>,
+  <a>throws</a> a {{TypeError}}.
+</dl>
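+
+<div class=example>
+ <p>The following non-normative sketch shows the effect of the <code>stream</code> option when a
+ multi-byte sequence is split across chunks, and of the <code>fatal</code> option on invalid
+ input. The helper names <code>decodeChunks</code> and <code>isWellFormed</code> are assumptions
+ made for illustration.
```javascript
// Decodes each chunk with { stream: true }, then flushes with a final
// argument-less call. Returns the string produced by every call.
function decodeChunks(chunks, label = "utf-8", options = {}) {
  const decoder = new TextDecoder(label, options);
  const pieces = chunks.map((chunk) => decoder.decode(chunk, { stream: true }));
  pieces.push(decoder.decode()); // end-of-stream
  return pieces;
}

// Reports whether bytes decode without error in fatal mode.
function isWellFormed(bytes, label = "utf-8") {
  try {
    new TextDecoder(label, { fatal: true }).decode(bytes);
    return true;
  } catch (error) {
    if (error instanceof TypeError) return false;
    throw error;
  }
}

// U+20AC EURO SIGN is 0xE2 0x82 0xAC in UTF-8; split it after two bytes.
const euro = Uint8Array.of(0xe2, 0x82, 0xac);
const pieces = decodeChunks([euro.subarray(0, 2), euro.subarray(2)]);
```
+ <p>The first call has only an incomplete sequence and returns the empty string; the second call
+ completes the sequence and returns the character.
+</div>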
+
+<p>The
+<dfn constructor for=TextDecoder id=dom-textdecoder><code>TextDecoder(<var>label</var>, <var>options</var>)</code></dfn>
+constructor, when invoked, must run these steps:
+
+<ol>
+ <li><p>Let <var>encoding</var> be the result of <a>getting an encoding</a> from <var>label</var>.
+
+ <li><p>If <var>encoding</var> is failure or <a>replacement</a>, then <a>throw</a> a {{RangeError}}.
+
+ <li><p>Let <var>dec</var> be a new {{TextDecoder}} object.
+
+ <li><p>Set <var>dec</var>'s <a for=TextDecoderCommon>encoding</a> to <var>encoding</var>.
+
+ <li><p>If <var>options</var>'s <code>fatal</code> member is true, then set <var>dec</var>'s
+ <a for=TextDecoderCommon>error mode</a> to "<code>fatal</code>".
+
+ <li><p>If <var>options</var>'s <code>ignoreBOM</code> member is true, then set <var>dec</var>'s
+ <a for=TextDecoderCommon>ignore BOM flag</a>.
+
+ <li><p>Return <var>dec</var>.
+</ol>
+
+<p>The <dfn method for=TextDecoder><code>decode(<var>input</var>, <var>options</var>)</code></dfn>
+method, when invoked, must run these steps:
+
+<ol>
+ <li><p>If the <a for=TextDecoder>do not flush flag</a> is unset, set <a for=TextDecoder>decoder</a>
+ to a new <a for=TextDecoderCommon>encoding</a>'s <a for=/>decoder</a>, set
+ <a for=TextDecoder>stream</a> to a new <a for=/>stream</a>, and unset the
+ <a for=TextDecoderCommon>BOM seen flag</a>.
+
+ <li><p>If <var>options</var>'s <code>stream</code> is true, then set the
+ <a for=TextDecoder>do not flush flag</a>; otherwise, unset the
+ <a for=TextDecoder>do not flush flag</a>.
+
+ <li>
+  <p>If <var>input</var> is given, then <a>push</a> a
+  <a lt="get a copy of the buffer source">copy of</a> <var>input</var> to
+  <a for=TextDecoder>stream</a>.
+
+  <p class=note>Implementations are strongly encouraged to use an implementation strategy that
+  avoids this copy. When doing so they will have to make sure that changes to <var>input</var> do
+  not affect future calls to <a method><code>decode()</code></a>.
+
+ <li><p>Let <var>output</var> be a new <a for=/>stream</a>.
+
+ <li>
+  <p>While true:
+
+  <ol>
+   <li><p>Let <var>token</var> be the result of <a>reading</a> from <a for=TextDecoder>stream</a>.
+
+   <li>
+    <p>If <var>token</var> is <a>end-of-stream</a> and the <a for=TextDecoder>do not flush flag</a>
+    is set, then return <var>output</var>,
+    <a lt="serialize stream" for=TextDecoderCommon>serialized</a>.
+
+    <p class=note>The way streaming works is to not handle <a>end-of-stream</a> here when the
+    <a for=TextDecoder>do not flush flag</a> is set and to not unset that flag. That way in a
+    subsequent invocation <a for=TextDecoder>decoder</a> is not set anew in the first step of the
+    algorithm and its state is preserved.
+
+   <li>
+    <p>Otherwise:
+
+    <ol>
+     <li><p>Let <var>result</var> be the result of <a>processing</a> <var>token</var> for
+     <a for=TextDecoder>decoder</a>, <a for=TextDecoder>stream</a>, <var>output</var>, and
+     <a for=TextDecoderCommon>error mode</a>.
+
+     <li><p>If <var>result</var> is <a>finished</a>, then return <var>output</var>,
+     <a lt="serialize stream" for=TextDecoderCommon>serialized</a>.
+
+     <li><p>Otherwise, if <var>result</var> is <a>error</a>, then <a lt=throw>throw</a> a
+     {{TypeError}}.
+    </ol>
+  </ol>
+</ol>
+
+<h3 id=interface-mixin-textencodercommon>Interface mixin {{TextEncoderCommon}}</h3>
+
+<pre class=idl>
+interface mixin TextEncoderCommon {
+  readonly attribute DOMString encoding;
+};
+</pre>
+
+<p>The {{TextEncoderCommon}} interface mixin defines common attributes that are shared between
+{{TextEncoder}} and {{TextEncoderStream}} objects.
+
+<p>The <dfn attribute id=dom-textencoder-encoding for=TextEncoderCommon><code>encoding</code></dfn>
+attribute's getter, when invoked, must return "<code>utf-8</code>".
+
+
+<h3 id=interface-textencoder>Interface {{TextEncoder}}</h3>
+
+<pre class=idl>
+[Constructor,
+ Exposed=(Window,Worker)]
+interface TextEncoder {
+  [NewObject] Uint8Array encode(optional USVString input = "");
+};
+TextEncoder includes TextEncoderCommon;
+</pre>
+
+<p>A {{TextEncoder}} object has an associated <dfn for=TextEncoder>encoder</dfn>.
+
+<p class="note no-backref">A {{TextEncoder}} object offers no <var>label</var> argument as it only
+supports <a>UTF-8</a>. It also offers no <code>stream</code> option as no <a for=/>encoder</a>
+requires buffering of scalar values.
+
+<hr>
+
+<dl class=domintro>
+ <dt><code><var>encoder</var> = new <a constructor for=TextEncoder>TextEncoder()</a></code>
+ <dd><p>Returns a new {{TextEncoder}} object.
+
+ <dt><code><var>encoder</var> . <a attribute for=TextEncoderCommon>encoding</a></code>
+ <dd><p>Returns "<code>utf-8</code>".
+
+ <dt><code><var>encoder</var> . <a method for=TextEncoder lt=encode()>encode([<var>input</var> = ""])</a></code>
+ <dd><p>Returns the result of running <a>UTF-8</a>'s <a for=/>encoder</a>.
+</dl>
+
+<p>The <dfn constructor for=TextEncoder id=dom-textencoder><code>TextEncoder()</code></dfn>
+constructor, when invoked, must run these steps:
+
+<ol>
+ <li><p>Let <var>enc</var> be a new {{TextEncoder}} object.
+
+ <li><p>Set <var>enc</var>'s <a for=TextEncoder>encoder</a> to <a>UTF-8</a>'s <a for=/>encoder</a>.
+
+ <li><p>Return <var>enc</var>.
+</ol>
+
+<p>The <dfn method for=TextEncoder><code>encode(<var>input</var>)</code></dfn> method, when invoked,
+must run these steps:
+
+<ol>
+ <li><p>Convert <var>input</var> to a <a for=/>stream</a>.
+
+ <li><p>Let <var>output</var> be a new <a for=/>stream</a>.
+
+ <li>
+  <p>While true:
+
+  <ol>
+   <li><p>Let <var>token</var> be the result of
+   <a>reading</a> from <var>input</var>.
+
+   <li><p>Let <var>result</var> be the result of
+   <a>processing</a> <var>token</var> for
+   <a for=TextEncoder>encoder</a>, <var>input</var>, <var>output</var>.
+
+   <li>
+    <p>If <var>result</var> is <a>finished</a>, convert <var>output</var> into a
+    byte sequence, and then return a {{Uint8Array}} object wrapping an
+    {{ArrayBuffer}} containing <var>output</var>.
+    <!-- XXX https://www.w3.org/Bugs/Public/show_bug.cgi?id=26966 -->
+
+    <p class=note><a>UTF-8</a> cannot return <a>error</a>.
+  </ol>
+</ol>
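+
+<div class=example>
+ <p>A short, non-normative illustration of <code>encode()</code> follows. The helper
+ <code>toHex</code> is an assumption made for illustration. The lone surrogate case relies on the
+ {{USVString}} conversion of <var>input</var>, which replaces unpaired surrogates with U+FFFD
+ before encoding.
```javascript
const encoder = new TextEncoder();

// Hypothetical helper: formats bytes as space-separated lowercase hex.
function toHex(bytes) {
  return Array.from(bytes, (b) => b.toString(16).padStart(2, "0")).join(" ");
}

const euroBytes = toHex(encoder.encode("\u20AC"));       // three-byte sequence
const emojiBytes = toHex(encoder.encode("\u{1F600}"));   // four-byte sequence
const loneSurrogate = toHex(encoder.encode("\uD83D"));   // encoded as U+FFFD
const noArgument = encoder.encode();                     // input defaults to ""
```
+</div>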
+
+
+<h3 id=interface-mixin-generictransformstream>Interface mixin {{GenericTransformStream}}</h3>
+
+<p>The {{GenericTransformStream}} interface mixin represents the concept of a
+<a>transform stream</a> in IDL. It is not a {{TransformStream}}, though it has the same interface
+and it delegates to one.
+
+<pre class=idl>
+interface mixin GenericTransformStream {
+  readonly attribute ReadableStream readable;
+  readonly attribute WritableStream writable;
+};
+</pre>
+
+<p>An object that includes {{GenericTransformStream}} has an associated
+<dfn for=GenericTransformStream>transform</dfn> of type {{TransformStream}}.
+
+<p>The <dfn attribute for=GenericTransformStream><code>readable</code></dfn> attribute's getter,
+when invoked, must return this object's <a for=GenericTransformStream>transform</a>.\[[readable]].
+
+<p>The <dfn attribute for=GenericTransformStream><code>writable</code></dfn> attribute's getter,
+when invoked, must return this object's <a for=GenericTransformStream>transform</a>.\[[writable]].
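+
+<div class=example>
+ <p>The delegation pattern described here can be sketched in JavaScript as a class that holds a
+ {{TransformStream}} privately and exposes only its two sides. This is a non-normative sketch;
+ the class name <code>UppercaseStream</code> and its transformation are assumptions made for
+ illustration and are not part of this standard.
```javascript
class UppercaseStream {
  // The associated transform; not exposed directly.
  #transform;

  constructor() {
    this.#transform = new TransformStream({
      transform(chunk, controller) {
        controller.enqueue(String(chunk).toUpperCase());
      },
    });
  }

  // Both getters delegate to the underlying TransformStream.
  get readable() {
    return this.#transform.readable;
  }

  get writable() {
    return this.#transform.writable;
  }
}

const upper = new UppercaseStream();
```
+ <p>An instance can be passed to <code>pipeThrough()</code> like a {{TransformStream}}, since that
+ method only needs the <code>readable</code> and <code>writable</code> properties.
+</div>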
+
+
+<h3 id=interface-textdecoderstream>Interface {{TextDecoderStream}}</h3>
+
+<pre class=idl>
+[Constructor(optional DOMString label = "utf-8", optional TextDecoderOptions options),
+ Exposed=(Window,Worker)]
+interface TextDecoderStream {
+};
+TextDecoderStream includes TextDecoderCommon;
+TextDecoderStream includes GenericTransformStream;
+</pre>
+
+<p>A {{TextDecoderStream}} object has an associated
+<dfn for=TextDecoderStream>decoder</dfn> and <dfn for=TextDecoderStream>stream</dfn>.
+
+<dl class=domintro>
+ <dt><code><var>decoder</var> = new
+ <a constructor for=TextDecoderStream lt=TextDecoderStream()>TextDecoderStream([<var>label</var> =
+ "utf-8" [, <var>options</var>]])</a></code>
+ <dd>
+  <p>Returns a new {{TextDecoderStream}} object.
+  <p>If <var>label</var> is either not a <a>label</a> or is a <a>label</a> for <a>replacement</a>,
+  <a>throws</a> a {{RangeError}}.
+
+ <dt><code><var>decoder</var> . <a attribute for=TextDecoderCommon>encoding</a></code>
+ <dd><p>Returns <a for=TextDecoderCommon>encoding</a>'s <a>name</a>, lowercased.
+
+ <dt><code><var>decoder</var> . <a attribute for=TextDecoderCommon>fatal</a></code>
+ <dd><p>Returns true if <a for=TextDecoderCommon>error mode</a> is "<code>fatal</code>", and
+ false otherwise.
+
+ <dt><code><var>decoder</var> . <a attribute for=TextDecoderCommon>ignoreBOM</a></code>
+ <dd><p>Returns true if <a for=TextDecoderCommon>ignore BOM flag</a> is set, and false
+ otherwise.
+
+ <dt><code><var>decoder</var> . <a attribute for=GenericTransformStream>readable</a></code>
+ <dd>
+  <p>Returns a <a>readable stream</a> whose <a>chunks</a> are strings resulting from running
+  <a for=TextDecoderCommon>encoding</a>'s <a for=/>decoder</a> on the chunks written to
+  {{GenericTransformStream/writable}}.
+
+ <dt><code><var>decoder</var> . <a attribute for=GenericTransformStream>writable</a></code>
+ <dd>
+  <p>Returns a <a>writable stream</a> which accepts {{BufferSource}} chunks and runs them through
+  <a for=TextDecoderCommon>encoding</a>'s <a for=/>decoder</a> before making them available to
+  {{GenericTransformStream/readable}}.
+
+  <p>Typically this will be used via the {{ReadableStream/pipeThrough()}} method on a
+  {{ReadableStream}} source.
+
+  <pre class=example id=example-textdecoderstream-writable><code class=lang-javascript>
+var decoder = new TextDecoderStream(encoding);
+byteReadable
+  .pipeThrough(decoder)
+  .pipeTo(textWritable);</code></pre>
+
+  <p>If the <a for=TextDecoderCommon>error mode</a> is "<code>fatal</code>" and
+  <a for=TextDecoderCommon>encoding</a>'s <a for=/>decoder</a> returns <a>error</a>, both
+  {{GenericTransformStream/readable}} and {{GenericTransformStream/writable}} will be errored with a
+  {{TypeError}}.
+</dl>
+
+<p>The
+<dfn constructor for=TextDecoderStream id=dom-textdecoderstream><code>TextDecoderStream(<var>label</var>,
+<var>options</var>)</code></dfn> constructor, when invoked, must run these steps:
+
+<ol>
+ <li><p>Let <var>encoding</var> be the result of <a>getting an encoding</a> from <var>label</var>.
+
+ <li><p>If <var>encoding</var> is failure or <a>replacement</a>, then <a>throw</a> a {{RangeError}}.
+
+ <li><p>Let <var>dec</var> be a new {{TextDecoderStream}} object.
+
+ <li><p>Set <var>dec</var>'s <a for=TextDecoderCommon>encoding</a> to <var>encoding</var>.
+
+ <li><p>If <var>options</var>'s <code>fatal</code> member is true, then set <var>dec</var>'s
+ <a for=TextDecoderCommon>error mode</a> to "<code>fatal</code>".
+
+ <li><p>If <var>options</var>'s <code>ignoreBOM</code> member is true, then set <var>dec</var>'s
+ <a for=TextDecoderCommon>ignore BOM flag</a>.
+
+ <li>
+  <p>Set <var>dec</var>'s <a for=TextDecoderStream>decoder</a> to a new <a for=/>decoder</a>
+  for <var>dec</var>'s <a for=TextDecoderCommon>encoding</a>, and set <var>dec</var>'s
+  <a for=TextDecoderStream>stream</a> to a new <a for=/>stream</a>.
+
+ <li><p>Let <var>startAlgorithm</var> be an algorithm that takes no arguments and returns nothing.
+
+ <li><p>Let <var>transformAlgorithm</var> be an algorithm which takes a <var>chunk</var> argument
+ and runs the <a>decode and enqueue a chunk</a> algorithm with <var>dec</var> and
+ <var>chunk</var>.
+
+ <li><p>Let <var>flushAlgorithm</var> be an algorithm which takes no arguments and runs the <a>flush
+ and enqueue</a> algorithm with <var>dec</var>.
+
+ <li><p>Let <var>transform</var> be the result of calling
+ <a abstract-op>CreateTransformStream</a>(<var>startAlgorithm</var>, <var>transformAlgorithm</var>,
+ <var>flushAlgorithm</var>).
+
+ <li><p>Set <var>dec</var>'s <a for=GenericTransformStream>transform</a> to <var>transform</var>.
+
+ <li><p>Return <var>dec</var>.
+</ol>
+
+<p>The <dfn>decode and enqueue a chunk</dfn> algorithm, given a {{TextDecoderStream}} object
+<var>dec</var> and a <var>chunk</var>, runs these steps:
+
+<ol>
+ <li><p>Let <var>bufferSource</var> be the result of
+ <a lt="converted to an IDL value">converting</a> <var>chunk</var> to a {{BufferSource}}. If this
+ throws an exception, then return a promise rejected with that exception.
+
+ <li><p><a>Push</a> a <a lt="get a copy of the buffer source">copy of</a> <var>bufferSource</var> to
+ <var>dec</var>'s <a for=TextDecoderStream>stream</a>. If this throws an exception, then return a
+ promise rejected with that exception.
+
+ <li><p>Let <var>controller</var> be <var>dec</var>'s
+ <a for=GenericTransformStream>transform</a>.\[[transformStreamController]].
+
+ <li><p>Let <var>output</var> be a new <a for=/>stream</a>.
+
+ <li>
+  <p>While true, run these steps:
+
+  <ol>
+   <li><p>Let <var>token</var> be the result of <a>reading</a> from <var>dec</var>'s
+   <a for=TextDecoderStream>stream</a>.
+
+   <li>
+    <p>If <var>token</var> is <a>end-of-stream</a>, run these steps:
+    <ol>
+     <li><p>Let <var>outputChunk</var> be <var>output</var>,
+     <a lt="serialize stream" for=TextDecoderCommon>serialized</a>.
+
+     <li><p>If <var>outputChunk</var> is non-empty, call
+     <a abstract-op>TransformStreamDefaultControllerEnqueue</a>(<var>controller</var>,
+     <var>outputChunk</var>).
+
+     <li><p>Return a new promise resolved with undefined.
+    </ol>
+
+   <li><p>Let <var>result</var> be the result of <a>processing</a> <var>token</var> for
+   <var>dec</var>'s <a for=TextDecoderStream>decoder</a>, <var>dec</var>'s
+   <a for=TextDecoderStream>stream</a>, <var>output</var>, and <var>dec</var>'s
+   <a for=TextDecoderCommon>error mode</a>.
+
+   <li><p>If <var>result</var> is <a>error</a>, then return a new promise rejected with a
+   {{TypeError}} exception.
+  </ol>
+</ol>
+
+<p>The <dfn>flush and enqueue</dfn> algorithm, which handles the end of data from the input
+{{ReadableStream}} object, given a {{TextDecoderStream}} object <var>dec</var>, runs these steps:
+
+<ol>
+ <li><p>Let <var>output</var> be a new <a for=/>stream</a>.
+
+ <li><p>Let <var>result</var> be the result of <a>processing</a> <a>end-of-stream</a> for
+ <var>dec</var>'s <a for=TextDecoderStream>decoder</a>, <var>dec</var>'s
+ <a for=TextDecoderStream>stream</a>, <var>output</var>, and <var>dec</var>'s
+ <a for=TextDecoderCommon>error mode</a>.
+
+ <li><p>If <var>result</var> is <a>finished</a>, run these steps:
+ <ol>
+  <li><p>Let <var>outputChunk</var> be <var>output</var>,
+  <a lt="serialize stream" for=TextDecoderCommon>serialized</a>.
+
+  <li><p>Let <var>controller</var> be <var>dec</var>'s
+  <a for=GenericTransformStream>transform</a>.\[[transformStreamController]].
+
+  <li><p>If <var>outputChunk</var> is non-empty, call
+  <a abstract-op>TransformStreamDefaultControllerEnqueue</a>(<var>controller</var>,
+  <var>outputChunk</var>).
+
+  <li><p>Return a new promise resolved with undefined.
+ </ol>
+
+ <li><p>Otherwise, return a new promise rejected with a {{TypeError}} exception.
+</ol>
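+
+<div class=example>
+ <p>The two algorithms above can be approximated in script by a transformer object built on
+ {{TextDecoder}}. This is a non-normative sketch under the assumption that
+ <code>decode()</code> with the <code>stream</code> option matches the per-chunk behavior closely
+ enough for illustration; the function name <code>makeTextDecoderTransformer</code> is
+ hypothetical.
```javascript
function makeTextDecoderTransformer(label = "utf-8", options = {}) {
  const decoder = new TextDecoder(label, options);
  return {
    // Counterpart of "decode and enqueue a chunk".
    transform(chunk, controller) {
      // Throws a TypeError for non-BufferSource chunks and, in fatal mode,
      // for invalid input; a TransformStream turns that into a stream error.
      const text = decoder.decode(chunk, { stream: true });
      if (text !== "") controller.enqueue(text); // skip empty chunks
    },
    // Counterpart of "flush and enqueue".
    flush(controller) {
      const text = decoder.decode(); // end-of-stream
      if (text !== "") controller.enqueue(text);
    },
  };
}
```
+ <p>The returned object can be passed to the {{TransformStream}} constructor, for example
+ <code>new TransformStream(makeTextDecoderTransformer("utf-8"))</code>.
+</div>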
+
+
+<h3 id=interface-textencoderstream>Interface {{TextEncoderStream}}</h3>
+
+<pre class=idl>
+[Constructor,
+ Exposed=(Window,Worker)]
+interface TextEncoderStream {
+};
+TextEncoderStream includes TextEncoderCommon;
+TextEncoderStream includes GenericTransformStream;
+</pre>
+
+<p>A {{TextEncoderStream}} object has an associated <dfn for=TextEncoderStream>encoder</dfn>
+and <dfn for=TextEncoderStream>pending high surrogate</dfn> (initially null).
+
+<p class="note no-backref">A {{TextEncoderStream}} object offers no <var>label</var> argument as it
+only supports <a>UTF-8</a>.
+
+<dl class=domintro>
+ <dt><code><var>encoder</var> = new <a constructor for=TextEncoderStream>TextEncoderStream()</a></code>
+ <dd><p>Returns a new {{TextEncoderStream}} object.
+
+ <dt><code><var>encoder</var> . <a attribute for=TextEncoderCommon>encoding</a></code>
+ <dd><p>Returns "<code>utf-8</code>".
+
+ <dt><code><var>encoder</var> . <a attribute for=GenericTransformStream>readable</a></code>
+ <dd>
+  <p>Returns a <a>readable stream</a> whose <a>chunks</a> are {{Uint8Array}}s resulting from running
+  <a>UTF-8</a>'s <a for=/>encoder</a> on the chunks written to {{GenericTransformStream/writable}}.
+
+ <dt><code><var>encoder</var> . <a attribute for=GenericTransformStream>writable</a></code>
+ <dd>
+  <p>Returns a <a>writable stream</a> which accepts string chunks and runs them through
+  <a>UTF-8</a>'s <a for=/>encoder</a> before making them available to
+  {{GenericTransformStream/readable}}.
+
+  <p>Typically this will be used via the {{ReadableStream/pipeThrough()}} method on a
+  {{ReadableStream}} source.
+
+  <pre class=example id=example-textencoderstream-writable><code class=lang-javascript>
+textReadable
+  .pipeThrough(new TextEncoderStream())
+  .pipeTo(byteWritable);</code></pre>
+</dl>
+
+<p>The
+<dfn constructor for=TextEncoderStream id=dom-textencoderstream><code>TextEncoderStream()</code></dfn>
+constructor, when invoked, must run these steps:
+
+<ol>
+ <li><p>Let <var>enc</var> be a new {{TextEncoderStream}} object.
+
+ <li><p>Set <var>enc</var>'s <a for=TextEncoderStream>encoder</a> to <a>UTF-8</a>'s
+ <a for=/>encoder</a>.
+
+ <li><p>Let <var>startAlgorithm</var> be an algorithm that takes no arguments and returns nothing.
+
+ <li><p>Let <var>transformAlgorithm</var> be an algorithm which takes a <var>chunk</var> argument
+ and runs the <a>encode and enqueue a chunk</a> algorithm with <var>enc</var> and <var>chunk</var>.
+
+ <li><p>Let <var>flushAlgorithm</var> be an algorithm which runs the <a>encode and flush</a>
+ algorithm with <var>enc</var>.
+
+ <li><p>Let <var>transform</var> be the result of calling
+ <a abstract-op>CreateTransformStream</a>(<var>startAlgorithm</var>, <var>transformAlgorithm</var>,
+ <var>flushAlgorithm</var>).
+
+ <li><p>Set <var>enc</var>'s <a for=GenericTransformStream>transform</a> to <var>transform</var>.
+
+ <li><p>Return <var>enc</var>.
+</ol>
+
+<hr>
+
+<p>The <dfn>encode and enqueue a chunk</dfn> algorithm, given a {{TextEncoderStream}} object
+<var>enc</var> and <var>chunk</var>, runs these steps:
+
+<ol>
+ <li><p>Let <var>input</var> be the result of <a lt="converted to an IDL value">converting</a>
+ <var>chunk</var> to a {{DOMString}}. If this throws an exception, then return a promise rejected
+ with that exception.
+
+ <p class=note>{{DOMString}} is used here so that a surrogate pair that is split between chunks can
+ be reassembled into the appropriate scalar value. The behavior is otherwise identical to
+ {{USVString}}. In particular, lone surrogates will be replaced with U+FFFD.
+
+ <li><p>Convert <var>input</var> to a <a for=/>stream</a>.
+
+ <li><p>Let <var>output</var> be a new <a for=/>stream</a>.
+
+ <li><p>Let <var>controller</var> be <var>enc</var>'s
+ <a for=GenericTransformStream>transform</a>.\[[transformStreamController]].
+
+ <li>
+  <p>While true, run these steps:
+
+  <ol>
+   <li><p>Let <var>token</var> be the result of <a>reading</a> from <var>input</var>.
+
+   <li>
+    <p>If <var>token</var> is <a>end-of-stream</a>, run these steps:
+
+    <ol>
+     <li><p>Convert <var>output</var> into a byte sequence.
+
+     <li>
+      <p>If <var>output</var> is non-empty, run these steps:
+
+      <ol>
+       <li><p>Let <var>chunk</var> be a {{Uint8Array}} object wrapping an {{ArrayBuffer}} containing
+       <var>output</var>.
+
+       <li><p>Call <a abstract-op>TransformStreamDefaultControllerEnqueue</a>(<var>controller</var>,
+       <var>chunk</var>).
+      </ol>
+
+     <li><p>Return a new promise resolved with undefined.
+    </ol>
+
+   <li><p>Let <var>result</var> be the result of executing the <a>convert code unit to scalar
+   value</a> algorithm with <var>enc</var>, <var>token</var> and <var>input</var>.
+
+   <li><p>If <var>result</var> is not <a>continue</a>, then <a>process</a> <var>result</var> for
+   <var>enc</var>'s <a for=TextEncoderStream>encoder</a>, <var>input</var>, <var>output</var>.
+
+  </ol>
+</ol>
+
+<p>The <dfn>convert code unit to scalar value</dfn> algorithm, given a {{TextEncoderStream}} object
+<var>enc</var>, <var>token</var>, and stream <var>input</var>, runs these steps:
+
+<ol>
+ <li>
+  <p>If <var>enc</var>'s <a>pending high surrogate</a> is non-null, run these steps:
+
+  <ol>
+   <li><p>Let <var>high surrogate</var> be <var>enc</var>'s <a>pending high surrogate</a>.
+
+   <li><p>Set <var>enc</var>'s <a>pending high surrogate</a> to null.
+
+   <li><p>If <var>token</var> is in the range U+DC00 to U+DFFF, inclusive, then return a code point
+   whose value is 0x10000 + ((<var>high surrogate</var> &minus; 0xD800) &lt;&lt; 10) +
+   (<var>token</var> &minus; 0xDC00).
+
+   <li><p><a>Prepend</a> <var>token</var> to <var>input</var>.
+
+   <li><p>Return U+FFFD.
+  </ol>
+
+ <li><p>If <var>token</var> is in the range U+D800 to U+DBFF, inclusive, then set <var>enc</var>'s
+ <a>pending high surrogate</a> to <var>token</var> and return <a>continue</a>.
+
+ <li><p>If <var>token</var> is in the range U+DC00 to U+DFFF, inclusive, then return U+FFFD.
+
+ <li><p>Return <var>token</var>.
+</ol>
+
+<p class=note>This is equivalent to the "<a>convert</a> a <a>JavaScript string</a> into a <a>scalar
+value string</a>" algorithm from the Infra Standard, but allows for surrogate pairs that are split
+between strings. [[!INFRA]]
+
+<p>The <dfn>encode and flush</dfn> algorithm, given a {{TextEncoderStream}} object <var>enc</var>,
+runs these steps:
+
+<ol>
+ <li>
+  <p>If <var>enc</var>'s <a>pending high surrogate</a> is non-null, run these steps:
+
+  <ol>
+   <li><p>Let <var>controller</var> be <var>enc</var>'s
+   <a for=GenericTransformStream>transform</a>.\[[transformStreamController]].
+
+   <li>
+    <p>Let <var>output</var> be the byte sequence 0xEF 0xBF 0xBD.
+
+    <p class=note>This is the replacement character U+FFFD encoded as UTF-8.
+
+   <li><p>Let <var>chunk</var> be a {{Uint8Array}} object wrapping an {{ArrayBuffer}} containing
+   <var>output</var>.
+
+   <li><p>Call <a abstract-op>TransformStreamDefaultControllerEnqueue</a>(<var>controller</var>,
+   <var>chunk</var>).
+  </ol>
+
+ <li><p>Return a new promise resolved with undefined.
+</ol>
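+
+<p class=note>The following non-normative sketch illustrates how the <a>convert code unit to scalar
+value</a> and <a>encode and flush</a> steps above fit together. The class name
+<code>SurrogateJoiner</code> and its <code>push()</code> and <code>flush()</code> methods are
+hypothetical names chosen for illustration; they are not part of this standard. The sketch returns
+arrays of scalar values rather than UTF-8 bytes, and it steps the loop index back where the
+algorithm says to <a>prepend</a> <var>token</var> to <var>input</var>.

```javascript
// Non-normative sketch: reassemble surrogate pairs that are split across chunks.
// SurrogateJoiner, push() and flush() are hypothetical names, not spec API.
class SurrogateJoiner {
  constructor() {
    this.pendingHighSurrogate = null; // mirrors "pending high surrogate"
  }

  // Converts one string chunk to an array of scalar values.
  push(chunk) {
    const scalars = [];
    for (let i = 0; i < chunk.length; i++) {
      const unit = chunk.charCodeAt(i);
      if (this.pendingHighSurrogate !== null) {
        const high = this.pendingHighSurrogate;
        this.pendingHighSurrogate = null;
        if (unit >= 0xDC00 && unit <= 0xDFFF) {
          // A valid pair: combine both halves into one scalar value.
          scalars.push(0x10000 + ((high - 0xD800) << 10) + (unit - 0xDC00));
          continue;
        }
        // Unpaired high surrogate: emit U+FFFD and reprocess this code unit.
        scalars.push(0xFFFD);
        i--; // stands in for "prepend token to input"
        continue;
      }
      if (unit >= 0xD800 && unit <= 0xDBFF) {
        this.pendingHighSurrogate = unit; // wait for the next code unit
      } else if (unit >= 0xDC00 && unit <= 0xDFFF) {
        scalars.push(0xFFFD); // lone low surrogate
      } else {
        scalars.push(unit);
      }
    }
    return scalars;
  }

  // Called once when the input ends; a dangling high surrogate becomes U+FFFD.
  flush() {
    return this.pendingHighSurrogate === null ? [] : [0xFFFD];
  }
}
```

+<p class=note>In this sketch, pushing <code>"a\uD83D"</code> and then <code>"\uDE00b"</code> yields
+U+0061 from the first call and U+1F600 followed by U+0062 from the second, because the high
+surrogate is held back until the next chunk arrives.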
+
+
+
+<h2 id=the-encoding>The encoding</h2>
+
+<h3 id=utf-8 dfn export>UTF-8</h3>
+
+<h4 id=utf-8-decoder dfn export>UTF-8 decoder</h4>
+
+<p class="note no-backref">A byte order mark has priority over a <a>label</a> as it has been found
+to be more accurate in deployed content. Therefore it is not part of the <a>UTF-8 decoder</a>
+algorithm but rather the <a>decode</a> and <a>UTF-8 decode</a> algorithms.
+
+<p><a>UTF-8</a>'s <a for=/>decoder</a> has an associated
+<dfn>UTF-8 code point</dfn>, <dfn>UTF-8 bytes seen</dfn>, and
+<dfn>UTF-8 bytes needed</dfn> (all initially 0), a <dfn>UTF-8 lower boundary</dfn>
+(initially 0x80), and a <dfn>UTF-8 upper boundary</dfn> (initially 0xBF).
+
+<p><a>UTF-8</a>'s <a for=/>decoder</a>'s <a>handler</a>, given a
+<var>stream</var> and <var>byte</var>, runs these steps:
+
+<ol>
+ <li><p>If <var>byte</var> is <a>end-of-stream</a> and
+ <a>UTF-8 bytes needed</a> is not 0, set
+ <a>UTF-8 bytes needed</a> to 0 and return <a>error</a>.
+
+ <li><p>If <var>byte</var> is <a>end-of-stream</a>, return
+ <a>finished</a>.
+
+ <li>
+  <p>If <a>UTF-8 bytes needed</a> is 0, based on <var>byte</var>:
+
+  <dl class=switch>
+   <dt>0x00 to 0x7F
+   <dd><p>Return a code point whose value is <var>byte</var>.
+
+   <dt>0xC2 to 0xDF
+   <dd>
+    <ol>
+     <li><p>Set <a>UTF-8 bytes needed</a> to 1.
+
+     <li>
+      <p>Set <a>UTF-8 code point</a> to <var>byte</var> &amp; 0x1F.
+
+      <p class=note>The five least significant bits of <var>byte</var>.
+    </ol>
+
+   <dt>0xE0 to 0xEF
+   <dd>
+    <ol>
+     <li><p>If <var>byte</var> is 0xE0, set
+     <a>UTF-8 lower boundary</a> to 0xA0.
+
+     <li><p>If <var>byte</var> is 0xED, set
+     <a>UTF-8 upper boundary</a> to 0x9F.
+
+     <li><p>Set <a>UTF-8 bytes needed</a> to 2.
+
+     <li>
+      <p>Set <a>UTF-8 code point</a> to <var>byte</var> &amp; 0xF.
+
+      <p class=note>The four least significant bits of <var>byte</var>.
+    </ol>
+
+   <dt>0xF0 to 0xF4
+   <dd>
+    <ol>
+     <li><p>If <var>byte</var> is 0xF0, set
+     <a>UTF-8 lower boundary</a> to 0x90.
+
+     <li><p>If <var>byte</var> is 0xF4, set
+     <a>UTF-8 upper boundary</a> to 0x8F.
+
+     <li><p>Set <a>UTF-8 bytes needed</a> to 3.
+
+     <li>
+      <p>Set <a>UTF-8 code point</a> to <var>byte</var> &amp; 0x7.
+
+      <p class=note>The three least significant bits of <var>byte</var>.
+    </ol>
+
+   <dt>Otherwise
+   <dd><p>Return <a>error</a>.
+  </dl>
+
+  <p>Return <a>continue</a>.
+
+ <li>
+  <p>If <var>byte</var> is not in the range <a>UTF-8 lower boundary</a> to
+  <a>UTF-8 upper boundary</a>, inclusive, then:
+
+  <ol>
+   <li><p>Set <a>UTF-8 code point</a>,
+   <a>UTF-8 bytes needed</a>, and <a>UTF-8 bytes seen</a> to 0,
+   set <a>UTF-8 lower boundary</a> to 0x80, and set
+   <a>UTF-8 upper boundary</a> to 0xBF.
+
+   <li><p><a>Prepend</a> <var>byte</var> to
+   <var>stream</var>.
+
+   <li><p>Return <a>error</a>.
+  </ol>
+
+ <li><p>Set <a>UTF-8 lower boundary</a> to 0x80 and
+ <a>UTF-8 upper boundary</a> to 0xBF.
+
+ <li>
+  <p>Set <a>UTF-8 code point</a> to (<a>UTF-8 code point</a> &lt;&lt; 6) |
+  (<var>byte</var> &amp; 0x3F).
+
+  <p class="note no-backref">Shift the existing bits of <a>UTF-8 code point</a> left by six
+  places and set the newly-vacated six least significant bits to the six least significant bits of
+  <var>byte</var>.
+
+ <li><p>Increase <a>UTF-8 bytes seen</a> by one.
+
+ <li><p>If <a>UTF-8 bytes seen</a> is not equal to
+ <a>UTF-8 bytes needed</a>, return <a>continue</a>.
+
+ <li><p>Let <var>code point</var> be <a>UTF-8 code point</a>.
+
+ <li><p>Set <a>UTF-8 code point</a>,
+ <a>UTF-8 bytes needed</a>, and <a>UTF-8 bytes seen</a> to 0.
+
+ <li><p>Return a code point whose value is <var>code point</var>.
+</ol>
+
+<p class=note>The constraints in the <a>UTF-8 decoder</a> above match
+“Best Practices for Using U+FFFD” from the Unicode standard. No other
+behavior is permitted per the Encoding Standard (other algorithms that
+achieve the same result are fine, even encouraged).
+[[!UNICODE]]
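+
+<p class=note>As a non-normative illustration, the handler steps above can be sketched as a small
+state machine. The class name <code>Utf8DecoderSketch</code> and its <code>decodeAll()</code>
+method are hypothetical. The sketch assumes the whole input is available at once, substitutes
+U+FFFD for every <a>error</a>, and steps the loop index back where the algorithm says to
+<a>prepend</a> <var>byte</var> to <var>stream</var>.

```javascript
// Non-normative sketch of the UTF-8 decoder's handler.
// Utf8DecoderSketch and decodeAll() are hypothetical names, not spec API.
class Utf8DecoderSketch {
  constructor() {
    this.codePoint = 0;
    this.bytesSeen = 0;
    this.bytesNeeded = 0;
    this.lower = 0x80; // "UTF-8 lower boundary"
    this.upper = 0xBF; // "UTF-8 upper boundary"
  }

  // Decodes an array of bytes to a string, replacing each error with U+FFFD.
  decodeAll(bytes) {
    const out = [];
    for (let i = 0; i < bytes.length; i++) {
      const byte = bytes[i];
      if (this.bytesNeeded === 0) {
        if (byte <= 0x7F) {
          out.push(byte);
        } else if (byte >= 0xC2 && byte <= 0xDF) {
          this.bytesNeeded = 1;
          this.codePoint = byte & 0x1F;
        } else if (byte >= 0xE0 && byte <= 0xEF) {
          if (byte === 0xE0) this.lower = 0xA0; // rejects overlong forms
          if (byte === 0xED) this.upper = 0x9F; // rejects surrogates
          this.bytesNeeded = 2;
          this.codePoint = byte & 0xF;
        } else if (byte >= 0xF0 && byte <= 0xF4) {
          if (byte === 0xF0) this.lower = 0x90; // rejects overlong forms
          if (byte === 0xF4) this.upper = 0x8F; // rejects values above U+10FFFF
          this.bytesNeeded = 3;
          this.codePoint = byte & 0x7;
        } else {
          out.push(0xFFFD);
        }
        continue;
      }
      if (byte < this.lower || byte > this.upper) {
        this.codePoint = this.bytesNeeded = this.bytesSeen = 0;
        this.lower = 0x80;
        this.upper = 0xBF;
        i--; // stands in for "prepend byte to stream"
        out.push(0xFFFD);
        continue;
      }
      this.lower = 0x80;
      this.upper = 0xBF;
      this.codePoint = (this.codePoint << 6) | (byte & 0x3F);
      this.bytesSeen += 1;
      if (this.bytesSeen !== this.bytesNeeded) continue;
      out.push(this.codePoint);
      this.codePoint = this.bytesNeeded = this.bytesSeen = 0;
    }
    // End-of-stream in the middle of a sequence is an error.
    if (this.bytesNeeded !== 0) {
      this.bytesNeeded = 0;
      out.push(0xFFFD);
    }
    return String.fromCodePoint(...out);
  }
}
```

+<p class=note>For instance, the bytes 0xE2 0x82 0xAC decode to U+20AC, while 0xED 0xA0 0x80 (an
+encoded surrogate) produces three U+FFFD code points, because 0xA0 falls outside the narrowed
+upper boundary and each remaining byte is then an error on its own.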
+
+
+<h4 id=utf-8-encoder dfn export>UTF-8 encoder</h4>
+
+<p><a>UTF-8</a>'s <a for=/>encoder</a>'s <a>handler</a>, given a
+<var>stream</var> and <var>code point</var>, runs these steps:
+
+<ol>
+ <li><p>If <var>code point</var> is <a>end-of-stream</a>, return
+ <a>finished</a>.
+
+ <li><p>If <var>code point</var> is an <a>ASCII code point</a>, return
+ a byte whose value is <var>code point</var>.
+
+ <li>
+  <p>Set <var>count</var> and <var>offset</var> based on the
+  range <var>code point</var> is in:
+
+  <dl class=switch>
+   <dt>U+0080 to U+07FF, inclusive
+   <dd>1 and 0xC0
+   <dt>U+0800 to U+FFFF, inclusive
+   <dd>2 and 0xE0
+   <dt>U+10000 to U+10FFFF, inclusive
+   <dd>3 and 0xF0
+  </dl>
+
+ <li><p>Let <var>bytes</var> be a byte sequence whose first byte is
+ (<var>code point</var> >> (6 × <var>count</var>)) + <var>offset</var>.
+
+ <li>
+  <p>While <var>count</var> is greater than 0:
+
+  <ol>
+   <li><p>Set <var>temp</var> to
+   <var>code point</var> >> (6 × (<var>count</var> &minus; 1)).
+
+   <li><p>Append to <var>bytes</var> 0x80 | (<var>temp</var> &amp; 0x3F).
+
+   <li><p>Decrease <var>count</var> by one.
+  </ol>
+
+ <li><p>Return bytes <var>bytes</var>, in order.
+</ol>
+
+<p class=note>This algorithm has identical results to the one described in the Unicode standard. It
+is included here for completeness. [[!UNICODE]]
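+
+<p class=note>The encoder steps above translate almost line for line into code. The following
+non-normative sketch uses a hypothetical function name, <code>utf8EncodeCodePoint()</code>, and
+assumes its argument is already a scalar value (surrogates are not checked).

```javascript
// Non-normative sketch of the UTF-8 encoder's handler for one scalar value.
// utf8EncodeCodePoint is a hypothetical name, not spec API.
function utf8EncodeCodePoint(codePoint) {
  // ASCII code points map to a single identical byte.
  if (codePoint <= 0x7F) return [codePoint];

  // count is the number of continuation bytes; offset marks the lead byte.
  let count;
  let offset;
  if (codePoint <= 0x7FF) {
    count = 1;
    offset = 0xC0;
  } else if (codePoint <= 0xFFFF) {
    count = 2;
    offset = 0xE0;
  } else {
    count = 3;
    offset = 0xF0;
  }

  const bytes = [(codePoint >> (6 * count)) + offset];
  while (count > 0) {
    const temp = codePoint >> (6 * (count - 1));
    bytes.push(0x80 | (temp & 0x3F)); // six payload bits per continuation byte
    count -= 1;
  }
  return bytes;
}
```

+<p class=note>With this sketch, U+20AC yields the bytes 0xE2 0x82 0xAC and U+1F600 yields
+0xF0 0x9F 0x98 0x80.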
+
+
+
+<h2 id=legacy-single-byte-encodings>Legacy single-byte encodings</h2>
+
+<p>An <a for=/>encoding</a> where each byte is either a single code point or
+nothing is a <dfn>single-byte encoding</dfn>.
+<a>Single-byte encodings</a> share the
+<a for=/>decoder</a> and <a for=/>encoder</a>. <dfn>Index single-byte</dfn>,
+as referenced by the <a>single-byte decoder</a> and
+<a>single-byte encoder</a>, is defined by the following table, and
+depends on the <a>single-byte encoding</a> in use. All but two
+<a>single-byte encodings</a> have a
+unique <a>index</a>; <a>ISO-8859-8</a> and <a>ISO-8859-8-I</a> share one.
+
+<table>
+ <tr><td><dfn export>IBM866</dfn><td><a href=index-ibm866.txt>index-ibm866.txt</a><td><a href=ibm866.html>index IBM866 visualization</a><td><a href=ibm866-bmp.html>index IBM866 BMP coverage</a>
+ <tr><td><dfn export>ISO-8859-2</dfn><td><a href=index-iso-8859-2.txt>index-iso-8859-2.txt</a><td><a href=iso-8859-2.html>index ISO-8859-2 visualization</a><td><a href=iso-8859-2-bmp.html>index ISO-8859-2 BMP coverage</a>
+ <tr><td><dfn export>ISO-8859-3</dfn><td><a href=index-iso-8859-3.txt>index-iso-8859-3.txt</a><td><a href=iso-8859-3.html>index ISO-8859-3 visualization</a><td><a href=iso-8859-3-bmp.html>index ISO-8859-3 BMP coverage</a>
+ <tr><td><dfn export>ISO-8859-4</dfn><td><a href=index-iso-8859-4.txt>index-iso-8859-4.txt</a><td><a href=iso-8859-4.html>index ISO-8859-4 visualization</a><td><a href=iso-8859-4-bmp.html>index ISO-8859-4 BMP coverage</a>
+ <tr><td><dfn export>ISO-8859-5</dfn><td><a href=index-iso-8859-5.txt>index-iso-8859-5.txt</a><td><a href=iso-8859-5.html>index ISO-8859-5 visualization</a><td><a href=iso-8859-5-bmp.html>index ISO-8859-5 BMP coverage</a>
+ <tr><td><dfn export>ISO-8859-6</dfn><td><a href=index-iso-8859-6.txt>index-iso-8859-6.txt</a><td><a href=iso-8859-6.html>index ISO-8859-6 visualization</a><td><a href=iso-8859-6-bmp.html>index ISO-8859-6 BMP coverage</a>
+ <tr><td><dfn export>ISO-8859-7</dfn><td><a href=index-iso-8859-7.txt>index-iso-8859-7.txt</a><td><a href=iso-8859-7.html>index ISO-8859-7 visualization</a><td><a href=iso-8859-7-bmp.html>index ISO-8859-7 BMP coverage</a>
+ <tr><td><dfn export>ISO-8859-8</dfn><td rowspan=2><a href=index-iso-8859-8.txt>index-iso-8859-8.txt</a><td rowspan=2><a href=iso-8859-8.html>index ISO-8859-8 visualization</a><td rowspan=2><a href=iso-8859-8-bmp.html>index ISO-8859-8 BMP coverage</a>
+ <tr><td><dfn export>ISO-8859-8-I</dfn>
+ <tr><td><dfn export>ISO-8859-10</dfn><td><a href=index-iso-8859-10.txt>index-iso-8859-10.txt</a><td><a href=iso-8859-10.html>index ISO-8859-10 visualization</a><td><a href=iso-8859-10-bmp.html>index ISO-8859-10 BMP coverage</a>
+ <tr><td><dfn export>ISO-8859-13</dfn><td><a href=index-iso-8859-13.txt>index-iso-8859-13.txt</a><td><a href=iso-8859-13.html>index ISO-8859-13 visualization</a><td><a href=iso-8859-13-bmp.html>index ISO-8859-13 BMP coverage</a>
+ <tr><td><dfn export>ISO-8859-14</dfn><td><a href=index-iso-8859-14.txt>index-iso-8859-14.txt</a><td><a href=iso-8859-14.html>index ISO-8859-14 visualization</a><td><a href=iso-8859-14-bmp.html>index ISO-8859-14 BMP coverage</a>
+ <tr><td><dfn export>ISO-8859-15</dfn><td><a href=index-iso-8859-15.txt>index-iso-8859-15.txt</a><td><a href=iso-8859-15.html>index ISO-8859-15 visualization</a><td><a href=iso-8859-15-bmp.html>index ISO-8859-15 BMP coverage</a>
+ <tr><td><dfn export>ISO-8859-16</dfn><td><a href=index-iso-8859-16.txt>index-iso-8859-16.txt</a><td><a href=iso-8859-16.html>index ISO-8859-16 visualization</a><td><a href=iso-8859-16-bmp.html>index ISO-8859-16 BMP coverage</a>
+ <tr><td><dfn export>KOI8-R</dfn><td><a href=index-koi8-r.txt>index-koi8-r.txt</a><td><a href=koi8-r.html>index KOI8-R visualization</a><td><a href=koi8-r-bmp.html>index KOI8-R BMP coverage</a>
+ <tr><td><dfn export>KOI8-U</dfn><td><a href=index-koi8-u.txt>index-koi8-u.txt</a><td><a href=koi8-u.html>index KOI8-U visualization</a><td><a href=koi8-u-bmp.html>index KOI8-U BMP coverage</a>
+ <tr><td><dfn export>macintosh</dfn><td><a href=index-macintosh.txt>index-macintosh.txt</a><td><a href=macintosh.html>index macintosh visualization</a><td><a href=macintosh-bmp.html>index macintosh BMP coverage</a>
+ <tr><td><dfn export>windows-874</dfn><td><a href=index-windows-874.txt>index-windows-874.txt</a><td><a href=windows-874.html>index windows-874 visualization</a><td><a href=windows-874-bmp.html>index windows-874 BMP coverage</a>
+ <tr><td><dfn export>windows-1250</dfn><td><a href=index-windows-1250.txt>index-windows-1250.txt</a><td><a href=windows-1250.html>index windows-1250 visualization</a><td><a href=windows-1250-bmp.html>index windows-1250 BMP coverage</a>
+ <tr><td><dfn export>windows-1251</dfn><td><a href=index-windows-1251.txt>index-windows-1251.txt</a><td><a href=windows-1251.html>index windows-1251 visualization</a><td><a href=windows-1251-bmp.html>index windows-1251 BMP coverage</a>
+ <tr><td><dfn export>windows-1252</dfn><td><a href=index-windows-1252.txt>index-windows-1252.txt</a><td><a href=windows-1252.html>index windows-1252 visualization</a><td><a href=windows-1252-bmp.html>index windows-1252 BMP coverage</a>
+ <tr><td><dfn export>windows-1253</dfn><td><a href=index-windows-1253.txt>index-windows-1253.txt</a><td><a href=windows-1253.html>index windows-1253 visualization</a><td><a href=windows-1253-bmp.html>index windows-1253 BMP coverage</a>
+ <tr><td><dfn export>windows-1254</dfn><td><a href=index-windows-1254.txt>index-windows-1254.txt</a><td><a href=windows-1254.html>index windows-1254 visualization</a><td><a href=windows-1254-bmp.html>index windows-1254 BMP coverage</a>
+ <tr><td><dfn export>windows-1255</dfn><td><a href=index-windows-1255.txt>index-windows-1255.txt</a><td><a href=windows-1255.html>index windows-1255 visualization</a><td><a href=windows-1255-bmp.html>index windows-1255 BMP coverage</a>
+ <tr><td><dfn export>windows-1256</dfn><td><a href=index-windows-1256.txt>index-windows-1256.txt</a><td><a href=windows-1256.html>index windows-1256 visualization</a><td><a href=windows-1256-bmp.html>index windows-1256 BMP coverage</a>
+ <tr><td><dfn export>windows-1257</dfn><td><a href=index-windows-1257.txt>index-windows-1257.txt</a><td><a href=windows-1257.html>index windows-1257 visualization</a><td><a href=windows-1257-bmp.html>index windows-1257 BMP coverage</a>
+ <tr><td><dfn export>windows-1258</dfn><td><a href=index-windows-1258.txt>index-windows-1258.txt</a><td><a href=windows-1258.html>index windows-1258 visualization</a><td><a href=windows-1258-bmp.html>index windows-1258 BMP coverage</a>
+ <tr><td><dfn export>x-mac-cyrillic</dfn><td><a href=index-x-mac-cyrillic.txt>index-x-mac-cyrillic.txt</a><td><a href=x-mac-cyrillic.html>index x-mac-cyrillic visualization</a><td><a href=x-mac-cyrillic-bmp.html>index x-mac-cyrillic BMP coverage</a>
+ </table>
+
+<p class=note><a>ISO-8859-8</a> and <a>ISO-8859-8-I</a> are
+distinct <a for=/>encoding</a> <a for=encoding>names</a>, because
+<a>ISO-8859-8</a> influences the layout direction. Although
+historically this might also have been the case for <a>ISO-8859-6</a> and
+"ISO-8859-6-I", that is no longer true.
+<!-- https://www.w3.org/Bugs/Public/show_bug.cgi?id=19505 -->
+
+<h3 id=single-byte-decoder dfn export>single-byte decoder</h3>
+
+<p><a>Single-byte encodings</a>'
+<a for=/>decoder</a>'s <a>handler</a>, given a
+<var>stream</var> and <var>byte</var>, runs these steps:
+
+<ol>
+ <li><p>If <var>byte</var> is <a>end-of-stream</a>, return
+ <a>finished</a>.
+
+ <li><p>If <var>byte</var> is an <a>ASCII byte</a>, return a code point whose value
+ is <var>byte</var>.
+
+ <li><p>Let <var>code point</var> be the <a>index code point</a>
+ for <var>byte</var> &minus; 0x80 in <a>index single-byte</a>.
+
+ <li><p>If <var>code point</var> is null, return <a>error</a>.
+
+ <li><p>Return a code point whose value is <var>code point</var>.
+</ol>
+
+<h3 id=single-byte-encoder export dfn>single-byte encoder</h3>
+
+<p><a>Single-byte encodings</a>'
+<a for=/>encoder</a>'s <a>handler</a>, given a
+<var>stream</var> and <var>code point</var>, runs these steps:
+
+<ol>
+ <li><p>If <var>code point</var> is <a>end-of-stream</a>, return
+ <a>finished</a>.
+
+ <li><p>If <var>code point</var> is an <a>ASCII code point</a>, return
+ a byte whose value is <var>code point</var>.
+
+ <li><p>Let <var>pointer</var> be the <a>index pointer</a> for
+ <var>code point</var> in <a>index single-byte</a>.
+
+ <li><p>If <var>pointer</var> is null, return <a>error</a> with
+ <var>code point</var>.
+
+ <li><p>Return a byte whose value is <var>pointer</var> + 0x80.
+</ol>
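+
+<p class=note>The two handlers above are simple table lookups, as the following non-normative
+sketch shows. The <code>toyIndex</code> array is a made-up, mostly empty index used only for
+illustration and does not correspond to any index in this standard; the function names and the
+<code>"error"</code> return value are likewise hypothetical. The <a>index pointer</a> lookup is
+approximated with <code>Array.prototype.indexOf()</code>, which returns the first match.

```javascript
// Non-normative sketch of the single-byte decoder and encoder handlers.
// toyIndex is a hypothetical index with 128 entries, almost all unmapped (null).
const toyIndex = new Array(128).fill(null);
toyIndex[0x00] = 0x20AC; // pointer 0x00 -> U+20AC (illustrative only)
toyIndex[0x69] = 0x00E9; // pointer 0x69 -> U+00E9 (illustrative only)

// Maps one byte to a code point, or "error" when the index has no entry.
function singleByteDecode(byte, index) {
  if (byte <= 0x7F) return byte; // ASCII bytes pass through
  const codePoint = index[byte - 0x80];
  return codePoint === null ? "error" : codePoint;
}

// Maps one code point to a byte, or "error" when the index lacks it.
function singleByteEncode(codePoint, index) {
  if (codePoint <= 0x7F) return codePoint; // ASCII code points pass through
  const pointer = index.indexOf(codePoint); // first pointer for this code point
  return pointer === -1 ? "error" : pointer + 0x80;
}
```

+<p class=note>Because pointers start at zero and the first 128 byte values are reserved for ASCII,
+both directions shift by 0x80: pointer 0x00 corresponds to byte 0x80 and pointer 0x7F to byte 0xFF.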
+
+
+
+<h2 id=legacy-multi-byte-chinese-(simplified)-encodings>Legacy multi-byte Chinese (simplified) encodings</h2>
+
+<h3 id=gbk dfn export>GBK</h3>
+
+<h4 id=gbk-decoder dfn export>GBK decoder</h4>
+
+<p><a>GBK</a>'s <a for=/>decoder</a> is <a>gb18030</a>'s <a for=/>decoder</a>.
+
+
+<h4 id=gbk-encoder dfn export>GBK encoder</h4>
+
+<p><a>GBK</a>'s <a for=/>encoder</a> is <a>gb18030</a>'s <a for=/>encoder</a>
+with its <a>GBK flag</a> set.
+
+<p class="note no-backref">Not fully aliasing <a>GBK</a> with <a>gb18030</a>
+is a conservative move to decrease the chances of breaking legacy servers and other
+consumers of content generated with <a>GBK</a>'s <a for=/>encoder</a>.
+
+
+<h3 id=gb18030 dfn export>gb18030</h3>
+
+<h4 id=gb18030-decoder dfn export>gb18030 decoder</h4>
+
+<p><a>gb18030</a>'s <a for=/>decoder</a> has an associated <dfn>gb18030 first</dfn>,
+<dfn>gb18030 second</dfn>, and <dfn>gb18030 third</dfn> (all initially 0x00).
+
+<p><a>gb18030</a>'s <a for=/>decoder</a>'s <a>handler</a>, given a
+<var>stream</var> and <var>byte</var>, runs these steps:
+
+<ol>
+ <li><p>If <var>byte</var> is <a>end-of-stream</a> and
+ <a>gb18030 first</a>, <a>gb18030 second</a>, and <a>gb18030 third</a>
+ are 0x00, return <a>finished</a>.
+
+ <li><p>If <var>byte</var> is <a>end-of-stream</a>, and
+ <a>gb18030 first</a>, <a>gb18030 second</a>, or <a>gb18030 third</a>
+ is not 0x00, set <a>gb18030 first</a>, <a>gb18030 second</a>, and
+ <a>gb18030 third</a> to 0x00, and return <a>error</a>.
+
+ <li>
+  <p>If <a>gb18030 third</a> is not 0x00, then:
+
+  <ol>
+   <li>
+    <p>If <var>byte</var> is not in the range 0x30 to 0x39, inclusive, then:
+
+    <ol>
+     <li><p><a>Prepend</a> <a>gb18030 second</a>, <a>gb18030 third</a>, and <var>byte</var> to
+     <var>stream</var>.
+
+     <li><p>Set <a>gb18030 first</a>, <a>gb18030 second</a>, and <a>gb18030 third</a> to 0x00.
+
+     <li><p>Return <a>error</a>.
+    </ol>
+
+   <li><p>Let <var>code point</var> be the <a>index gb18030 ranges code point</a> for
+   ((<a>gb18030 first</a> &minus; 0x81) × (10 × 126 × 10)) +
+   ((<a>gb18030 second</a> &minus; 0x30) × (10 × 126)) +
+   ((<a>gb18030 third</a> &minus; 0x81) × 10) + <var>byte</var> &minus; 0x30.
+
+   <li><p>Set <a>gb18030 first</a>, <a>gb18030 second</a>, and <a>gb18030 third</a> to 0x00.
+
+   <li><p>If <var>code point</var> is null, return <a>error</a>.
+
+   <li><p>Return a code point whose value is <var>code point</var>.
+  </ol>
+
+ <li>
+  <p>If <a>gb18030 second</a> is not 0x00, then:
+
+  <ol>
+   <li><p>If <var>byte</var> is in the range 0x81 to 0xFE, inclusive, set
+   <a>gb18030 third</a> to <var>byte</var> and return <a>continue</a>.
+
+   <li><p><a>Prepend</a> <a>gb18030 second</a>
+   followed by <var>byte</var> to <var>stream</var>, set
+   <a>gb18030 first</a> and <a>gb18030 second</a> to 0x00, and return
+   <a>error</a>.
+  </ol>
+
+ <li>
+  <p>If <a>gb18030 first</a> is not 0x00, then:
+
+  <ol>
+   <li><p>If <var>byte</var> is in the range 0x30 to 0x39, inclusive, set
+   <a>gb18030 second</a> to <var>byte</var> and return <a>continue</a>.
+
+   <li><p>Let <var>lead</var> be <a>gb18030 first</a>, let
+   <var>pointer</var> be null, and set <a>gb18030 first</a> to 0x00.
+
+   <li><p>Let <var>offset</var> be 0x40 if <var>byte</var> is
+   less than 0x7F and 0x41 otherwise.
+
+   <li><p>If <var>byte</var> is in the range 0x40 to 0x7E, inclusive, or
+   0x80 to 0xFE, inclusive, set <var>pointer</var> to
+   (<var>lead</var> &minus; 0x81) × 190 + (<var>byte</var> &minus; <var>offset</var>).
+
+   <li><p>Let <var>code point</var> be null if
+   <var>pointer</var> is null and the <a>index code point</a>
+   for <var>pointer</var> in <a>index gb18030</a> otherwise.
+
+   <li><p>If <var>code point</var> is non-null, return a code point whose value is
+   <var>code point</var>.
+
+   <li><p>If <var>byte</var> is an <a>ASCII byte</a>, <a>prepend</a> <var>byte</var> to
+   <var>stream</var>.
+
+   <li><p>Return <a>error</a>.
+  </ol>
+
+ <li><p>If <var>byte</var> is an <a>ASCII byte</a>, return
+ a code point whose value is <var>byte</var>.
+
+ <li><p>If <var>byte</var> is 0x80, return code point U+20AC.
+
+ <li><p>If <var>byte</var> is in the range 0x81 to 0xFE, inclusive, set
+ <a>gb18030 first</a> to <var>byte</var> and return <a>continue</a>.
+
+ <li><p>Return <a>error</a>.
+</ol>
+
+
+<h4 id=gb18030-encoder dfn export>gb18030 encoder</h4>
+
+<p><a>gb18030</a>'s <a for=/>encoder</a> has an associated <dfn>GBK flag</dfn>
+(initially unset).
+
+<p><a>gb18030</a>'s <a for=/>encoder</a>'s <a>handler</a>, given a
+<var>stream</var> and <var>code point</var>, runs these steps:
+
+<ol>
+ <li><p>If <var>code point</var> is <a>end-of-stream</a>, return
+ <a>finished</a>.
+
+ <li><p>If <var>code point</var> is an <a>ASCII code point</a>, return
+ a byte whose value is <var>code point</var>.
+
+ <li>
+  <p>If <var>code point</var> is U+E5E5, return <a>error</a> with <var>code point</var>.
+
+  <p class=note><a>Index gb18030</a> maps 0xA3 0xA0 to U+3000 rather than U+E5E5 for
+  compatibility with deployed content. Therefore U+E5E5 cannot round-trip.
+
+ <li><p>If the <a>GBK flag</a> is set and <var>code point</var> is
+ U+20AC, return byte 0x80.
+
+ <li><p>Let <var>pointer</var> be the <a>index pointer</a> for
+ <var>code point</var> in <a>index gb18030</a>.
+
+ <li>
+  <p>If <var>pointer</var> is non-null, then:
+
+  <ol>
+   <li><p>Let <var>lead</var> be <var>pointer</var> / 190 + 0x81.
+
+   <li><p>Let <var>trail</var> be <var>pointer</var> % 190.
+
+   <li><p>Let <var>offset</var> be 0x40 if <var>trail</var> is
+   less than 0x3F<!--0x7F-0x40--> and 0x41 otherwise.
+
+   <li><p>Return two bytes whose values are <var>lead</var> and
+   <var>trail</var> + <var>offset</var>.
+  </ol>
+
+ <li><p>If <a>GBK flag</a> is set, return <a>error</a> with
+ <var>code point</var>.
+
+ <li><p>Set <var>pointer</var> to the
+ <a>index gb18030 ranges pointer</a> for <var>code point</var>.
+
+ <li><p>Let <var>byte1</var> be <var>pointer</var> / (10 × 126 × 10).
+
+ <li><p>Set <var>pointer</var> to <var>pointer</var> % (10 × 126 × 10).
+
+ <li><p>Let <var>byte2</var> be <var>pointer</var> / (10 × 126).
+
+ <li><p>Set <var>pointer</var> to <var>pointer</var> % (10 × 126).
+
+ <li><p>Let <var>byte3</var> be <var>pointer</var> / 10.
+
+ <li><p>Let <var>byte4</var> be <var>pointer</var> % 10.
+
+ <li><p>Return four bytes whose values are <var>byte1</var> + 0x81,
+ <var>byte2</var> + 0x30, <var>byte3</var> + 0x81,
+ <var>byte4</var> + 0x30.
+</ol>
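+
+<p class=note>The pointer arithmetic used by the <a>gb18030</a> decoder and encoder can be checked
+in isolation. The following non-normative sketch covers only the conversion between bytes and
+pointers; the lookups in <a>index gb18030</a> and the gb18030 ranges index are left out, and all
+four function names are hypothetical.

```javascript
// Non-normative sketch of gb18030 pointer arithmetic (index lookups omitted).
// All function names here are hypothetical, not spec API.

// Two-byte form, decoder direction: returns null for an invalid byte pair.
function gb18030TwoBytePointer(lead, trail) {
  const validTrail =
    (trail >= 0x40 && trail <= 0x7E) || (trail >= 0x80 && trail <= 0xFE);
  if (lead < 0x81 || lead > 0xFE || !validTrail) return null;
  const offset = trail < 0x7F ? 0x40 : 0x41; // 0x7F is skipped in the trail range
  return (lead - 0x81) * 190 + (trail - offset);
}

// Two-byte form, encoder direction.
function gb18030TwoByteBytes(pointer) {
  const lead = Math.floor(pointer / 190) + 0x81;
  const trail = pointer % 190;
  const offset = trail < 0x3F ? 0x40 : 0x41;
  return [lead, trail + offset];
}

// Four-byte form, decoder direction (byte ranges assumed already validated).
function gb18030FourBytePointer(b1, b2, b3, b4) {
  return (b1 - 0x81) * (10 * 126 * 10) +
         (b2 - 0x30) * (10 * 126) +
         (b3 - 0x81) * 10 +
         (b4 - 0x30);
}

// Four-byte form, encoder direction.
function gb18030FourByteBytes(pointer) {
  const byte1 = Math.floor(pointer / (10 * 126 * 10));
  pointer %= 10 * 126 * 10;
  const byte2 = Math.floor(pointer / (10 * 126));
  pointer %= 10 * 126;
  const byte3 = Math.floor(pointer / 10);
  const byte4 = pointer % 10;
  return [byte1 + 0x81, byte2 + 0x30, byte3 + 0x81, byte4 + 0x30];
}
```

+<p class=note>The constant 190 is the number of valid trail bytes (0x40 to 0x7E and 0x80 to 0xFE),
+and the four-byte form is a mixed-radix number with radixes 126, 10, 126, and 10. The division in
+the encoder steps is integer division, which the sketch spells out with
+<code>Math.floor()</code>.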
+
+
+
+<h2 id=legacy-multi-byte-chinese-(traditional)-encodings>Legacy multi-byte Chinese (traditional) encodings</h2>
+
+<!--
+ Lead:  0x81 to 0xFE
+ Trail: 0x40 to 0x7E or 0xA1 to 0xFE
+-->
+
+
+<h3 id=big5 dfn export>Big5</h3>
+
+<h4 id=big5-decoder dfn export>Big5 decoder</h4>
+
+<p><a>Big5</a>'s <a for=/>decoder</a> has an associated
+<dfn>Big5 lead</dfn> (initially 0x00).
+
+<p><a>Big5</a>'s <a for=/>decoder</a>'s <a>handler</a>, given a
+<var>stream</var> and <var>byte</var>, runs these steps:
+
+<ol>
+ <li><p>If <var>byte</var> is <a>end-of-stream</a> and <a>Big5 lead</a>
+ is not 0x00, set <a>Big5 lead</a> to 0x00 and return <a>error</a>.
+
+ <li><p>If <var>byte</var> is <a>end-of-stream</a> and <a>Big5 lead</a>
+ is 0x00, return <a>finished</a>.
+
+ <li>
+  <p>If <a>Big5 lead</a> is not 0x00, let <var>lead</var> be
+  <a>Big5 lead</a>, let <var>pointer</var> be null, set
+  <a>Big5 lead</a> to 0x00, and then:
+
+  <ol>
+   <li><p>Let <var>offset</var> be 0x40 if <var>byte</var> is
+   less than 0x7F and 0x62 otherwise.
+   <!-- 0x62 = 0xA1-0x7E+1+0x40 -->
+
+   <li><p>If <var>byte</var> is in the range 0x40 to 0x7E, inclusive, or
+   0xA1 to 0xFE, inclusive, set <var>pointer</var> to
+   (<var>lead</var> &minus; 0x81) × 157 + (<var>byte</var> &minus; <var>offset</var>).
+
+   <li>
+    <p>If there is a row in the table below whose first column is
+    <var>pointer</var>, return the <em>two</em> code points listed in
+    its second column (the third column is irrelevant):
+
+    <table>
+     <tbody><tr><th>Pointer<th>Code points<th>Notes<!-- https://www.unicode.org/Public/UNIDATA/NamedSequences.txt -->
+     <tr><td>1133<!-- 0x88 0x62 --><td>U+00CA U+0304<td>Ê̄ (LATIN CAPITAL LETTER E WITH CIRCUMFLEX AND MACRON)
+     <tr><td>1135<!-- 0x88 0x64 --><td>U+00CA U+030C<td>Ê̌ (LATIN CAPITAL LETTER E WITH CIRCUMFLEX AND CARON)
+     <tr><td>1164<!-- 0x88 0xA3 --><td>U+00EA U+0304<td>ê̄ (LATIN SMALL LETTER E WITH CIRCUMFLEX AND MACRON)
+     <tr><td>1166<!-- 0x88 0xA5 --><td>U+00EA U+030C<td>ê̌ (LATIN SMALL LETTER E WITH CIRCUMFLEX AND CARON)
+    </table>
+    <!-- we do this to avoid PUA -->
+
+    <p class=note>Since <a lt=index>indexes</a> are limited to
+    single code points, this table is used for these pointers.
+
+   <li><p>Let <var>code point</var> be null if
+   <var>pointer</var> is null and the <a>index code point</a>
+   for <var>pointer</var> in <a>index Big5</a> otherwise.
+
+   <li><p>If <var>code point</var> is non-null, return a code point whose value is
+   <var>code point</var>.
+
+   <li><p>If <var>byte</var> is an <a>ASCII byte</a>, <a>prepend</a> <var>byte</var> to
+   <var>stream</var>.
+
+   <li><p>Return <a>error</a>.
+  </ol>
+
+ <li><p>If <var>byte</var> is an <a>ASCII byte</a>, return
+ a code point whose value is <var>byte</var>.
+
+ <li><p>If <var>byte</var> is in the range 0x81 to 0xFE, inclusive, set
+ <a>Big5 lead</a> to <var>byte</var> and return <a>continue</a>.
+
+ <li><p>Return <a>error</a>.
+</ol>
+
+
+<h4 id=big5-encoder dfn export>Big5 encoder</h4>
+
+<p><a>Big5</a>'s <a for=/>encoder</a>'s <a>handler</a>, given a
+<var>stream</var> and <var>code point</var>, runs these steps:
+
+<ol>
+ <li><p>If <var>code point</var> is <a>end-of-stream</a>, return
+ <a>finished</a>.
+
+ <li><p>If <var>code point</var> is an <a>ASCII code point</a>, return
+ a byte whose value is <var>code point</var>.
+
+ <li><p>Let <var>pointer</var> be the <a>index Big5 pointer</a> for
+ <var>code point</var>.
+
+ <li><p>If <var>pointer</var> is null, return <a>error</a> with
+ <var>code point</var>.
+
+ <li><p>Let <var>lead</var> be <var>pointer</var> / 157 + 0x81.
+
+ <li><p>Let <var>trail</var> be <var>pointer</var> % 157.
+
+ <li><p>Let <var>offset</var> be 0x40 if <var>trail</var> is
+ less than 0x3F<!--0x7F-0x40--> and 0x62<!--0xA1-0x3F--> otherwise.
+
+ <li><p>Return two bytes whose values are <var>lead</var> and
+ <var>trail</var> + <var>offset</var>.
+</ol>
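+
+<p class=note>The following non-normative sketch isolates the pointer arithmetic shared by the
+<a>Big5</a> decoder and encoder, together with the four pointers that decode to two code points.
+The lookups in <a>index Big5</a> are omitted, and the names <code>big5Pointer()</code>,
+<code>big5Bytes()</code>, and <code>BIG5_TWO_CODE_POINTS</code> are hypothetical.

```javascript
// Non-normative sketch of Big5 pointer arithmetic (index lookups omitted).
// All names here are hypothetical, not spec API.

// Pointers that decode to two code points instead of one index entry.
const BIG5_TWO_CODE_POINTS = new Map([
  [1133, [0x00CA, 0x0304]],
  [1135, [0x00CA, 0x030C]],
  [1164, [0x00EA, 0x0304]],
  [1166, [0x00EA, 0x030C]],
]);

// Decoder direction: returns null when the byte pair is not a valid sequence.
function big5Pointer(lead, trail) {
  const validTrail =
    (trail >= 0x40 && trail <= 0x7E) || (trail >= 0xA1 && trail <= 0xFE);
  if (lead < 0x81 || lead > 0xFE || !validTrail) return null;
  const offset = trail < 0x7F ? 0x40 : 0x62; // 0x7F to 0xA0 are not trail bytes
  return (lead - 0x81) * 157 + (trail - offset);
}

// Encoder direction: splits a pointer back into lead and trail bytes.
function big5Bytes(pointer) {
  const lead = Math.floor(pointer / 157) + 0x81;
  const trail = pointer % 157;
  const offset = trail < 0x3F ? 0x40 : 0x62;
  return [lead, trail + offset];
}
```

+<p class=note>For example, the bytes 0x88 0x62 give pointer (0x88 &minus; 0x81) × 157 +
+(0x62 &minus; 0x40) = 1133, which is the first row of the two-code-point table above.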
+
+
+
+<h2 id=legacy-multi-byte-japanese-encodings>Legacy multi-byte Japanese encodings</h2>
+
+<h3 id=euc-jp dfn export>EUC-JP</h3>
+<!-- https://www.iana.org/assignments/charset-reg/CP51932 -->
+
+<h4 id=euc-jp-decoder dfn export>EUC-JP decoder</h4>
+
+<p><a>EUC-JP</a>'s <a for=/>decoder</a> has an associated
+<dfn>EUC-JP jis0212 flag</dfn> (initially unset) and
+<dfn>EUC-JP lead</dfn> (initially 0x00).
+
+<p><a>EUC-JP</a>'s <a for=/>decoder</a>'s <a>handler</a>, given a
+<var>stream</var> and <var>byte</var>, runs these steps:
+
+<ol>
+ <li><p>If <var>byte</var> is <a>end-of-stream</a> and
+ <a>EUC-JP lead</a> is not 0x00, set <a>EUC-JP lead</a> to 0x00, and return
+ <a>error</a>.
+
+ <li><p>If <var>byte</var> is <a>end-of-stream</a> and
+ <a>EUC-JP lead</a> is 0x00, return <a>finished</a>.
+
+ <li><p>If <a>EUC-JP lead</a> is 0x8E and <var>byte</var> is
+ in the range 0xA1 to 0xDF, inclusive, set <a>EUC-JP lead</a> to 0x00 and return
+ a code point whose value is 0xFF61 &minus; 0xA1 + <var>byte</var>.
+ <!-- Katakana; subtraction is done first to avoid upsetting compilers -->
+
+ <li><p>If <a>EUC-JP lead</a> is 0x8F and <var>byte</var> is in the range
+ 0xA1 to 0xFE, inclusive, set the <a>EUC-JP jis0212 flag</a>, set
+ <a>EUC-JP lead</a> to <var>byte</var>, and return <a>continue</a>.
+
+ <li>
+  <p>If <a>EUC-JP lead</a> is not 0x00, let <var>lead</var> be <a>EUC-JP lead</a>, set
+  <a>EUC-JP lead</a> to 0x00, and then:
+
+  <ol>
+   <li><p>Let <var>code point</var> be null.
+
+   <li><p>If <var>lead</var> and <var>byte</var> are both in the
+   range 0xA1 to 0xFE, inclusive, set <var>code point</var> to the
+   <a>index code point</a> for
+   (<var>lead</var> &minus; 0xA1) × 94 + <var>byte</var> &minus; 0xA1
+   in <a>index jis0208</a> if the <a>EUC-JP jis0212 flag</a> is unset and in
+   <a>index jis0212</a> otherwise.
+
+   <li><p>Unset the <a>EUC-JP jis0212 flag</a>.
+
+   <li><p>If <var>code point</var> is non-null, return a code point whose value is
+   <var>code point</var>.
+
+   <li><p>If <var>byte</var> is an <a>ASCII byte</a>, <a>prepend</a> <var>byte</var> to
+   <var>stream</var>.
+
+   <li><p>Return <a>error</a>.
+  </ol>
+
+ <li><p>If <var>byte</var> is an <a>ASCII byte</a>, return
+ a code point whose value is <var>byte</var>.
+
+ <li><p>If <var>byte</var> is 0x8E, 0x8F, or in the range 0xA1 to
+ 0xFE, inclusive, set <a>EUC-JP lead</a> to <var>byte</var> and return
+ <a>continue</a>.
+
+ <li><p>Return <a>error</a>.
+</ol>
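+
+<p class=note>As a non-normative illustration, the handler steps above can be sketched in
+JavaScript as a loop over a byte array. The function name, the use of U+FFFD for every
+<a>error</a>, and the <code>Map</code> arguments standing in for <a>index jis0208</a> and
+<a>index jis0212</a> are assumptions of this sketch and not part of this specification.

```javascript
// Sketch only: `jis0208` and `jis0212` are hypothetical Map<pointer, codePoint>
// stand-ins for the real indexes; every error is emitted as U+FFFD.
function decodeEucJp(bytes, jis0208, jis0212) {
  const out = [];
  const inRange = (b, lo, hi) => b >= lo && b <= hi;
  let lead = 0x00;
  let jis0212Flag = false;
  let i = 0;
  while (true) {
    if (i >= bytes.length) {
      // End-of-stream: a pending lead byte is an error.
      if (lead !== 0x00) out.push(0xFFFD);
      return out;
    }
    const byte = bytes[i++];
    if (lead === 0x8E && inRange(byte, 0xA1, 0xDF)) {
      lead = 0x00;
      out.push(0xFF61 - 0xA1 + byte); // half-width katakana
      continue;
    }
    if (lead === 0x8F && inRange(byte, 0xA1, 0xFE)) {
      jis0212Flag = true;
      lead = byte;
      continue;
    }
    if (lead !== 0x00) {
      const first = lead;
      lead = 0x00;
      let codePoint = null;
      if (inRange(first, 0xA1, 0xFE) && inRange(byte, 0xA1, 0xFE)) {
        const pointer = (first - 0xA1) * 94 + byte - 0xA1;
        const index = jis0212Flag ? jis0212 : jis0208;
        codePoint = index.has(pointer) ? index.get(pointer) : null;
      }
      jis0212Flag = false;
      if (codePoint !== null) {
        out.push(codePoint);
        continue;
      }
      if (byte <= 0x7F) i--; // "prepend byte to stream": read it again next
      out.push(0xFFFD);
      continue;
    }
    if (byte <= 0x7F) {
      out.push(byte);
    } else if (byte === 0x8E || byte === 0x8F || inRange(byte, 0xA1, 0xFE)) {
      lead = byte;
    } else {
      out.push(0xFFFD);
    }
  }
}
```

+<p class=note>With a one-entry excerpt mapping pointer 283 to U+3042, the bytes 0xA4 0xA2 decode
+to U+3042, since (0xA4 &minus; 0xA1) × 94 + 0xA2 &minus; 0xA1 is 283.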
+
+
+<h4 id=euc-jp-encoder dfn export>EUC-JP encoder</h4>
+
+<p><a>EUC-JP</a>'s <a for=/>encoder</a>'s <a>handler</a>, given a
+<var>stream</var> and <var>code point</var>, runs these steps:
+
+<ol>
+ <li><p>If <var>code point</var> is <a>end-of-stream</a>, return
+ <a>finished</a>.
+
+ <li><p>If <var>code point</var> is an <a>ASCII code point</a>, return
+ a byte whose value is <var>code point</var>.
+
+ <li><p>If <var>code point</var> is U+00A5, return byte 0x5C.
+
+ <li><p>If <var>code point</var> is U+203E, return byte 0x7E.
+
+ <li><p>If <var>code point</var> is in the range U+FF61 to U+FF9F, inclusive, return
+ two bytes whose values are 0x8E and <var>code point</var> &minus; 0xFF61 + 0xA1.
+
+ <li><p>If <var>code point</var> is U+2212, set it to U+FF0D.
+
+ <li>
+  <p>Let <var>pointer</var> be the <a>index pointer</a> for <var>code point</var> in
+  <a>index jis0208</a>.
+
+  <p class=note>If <var>pointer</var> is non-null, it is less than 8836 due to the nature of
+  <a>index jis0208</a> and the <a>index pointer</a> operation.
+
+ <li><p>If <var>pointer</var> is null, return <a>error</a> with
+ <var>code point</var>.
+
+ <li><p>Let <var>lead</var> be <var>pointer</var> / 94 + 0xA1.
+
+ <li><p>Let <var>trail</var> be <var>pointer</var> % 94 + 0xA1.
+
+ <li><p>Return two bytes whose values are <var>lead</var> and
+ <var>trail</var>.
+</ol>
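+
+<p class=note>The following non-normative sketch mirrors the encoder steps above for a single code
+point. The function name and the <code>pointerFor</code> callback (a stand-in for the
+<a>index pointer</a> operation on <a>index jis0208</a>) are hypothetical, and returning
+<code>null</code> in place of <a>error</a> is a simplification.

```javascript
// Sketch only: returns the bytes for one code point, or null where the
// handler would return error. `pointerFor` is a hypothetical lookup
// (code point -> pointer or null) standing in for index jis0208.
function encodeEucJpCodePoint(codePoint, pointerFor) {
  if (codePoint <= 0x7F) return [codePoint];
  if (codePoint === 0xA5) return [0x5C];
  if (codePoint === 0x203E) return [0x7E];
  if (codePoint >= 0xFF61 && codePoint <= 0xFF9F) {
    return [0x8E, codePoint - 0xFF61 + 0xA1]; // half-width katakana
  }
  if (codePoint === 0x2212) codePoint = 0xFF0D;
  const pointer = pointerFor(codePoint);
  if (pointer === null) return null;
  // Integer division and remainder split the pointer into row and cell.
  return [Math.floor(pointer / 94) + 0xA1, (pointer % 94) + 0xA1];
}
```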
+
+
+<h3 id=iso-2022-jp dfn export>ISO-2022-JP</h3>
+<!--
+ https://tools.ietf.org/html/rfc1468
+ https://tools.ietf.org/html/rfc2237 (ISO-2022-JP-1; not used)
+ "ESC ) I" is from ISO-2022-JP-3 reportedly
+-->
+
+<h4 id=iso-2022-jp-decoder dfn export>ISO-2022-JP decoder</h4>
+
+<p><a>ISO-2022-JP</a>'s <a for=/>decoder</a> has an associated
+<dfn>ISO-2022-JP decoder state</dfn> (initially
+<a lt="ISO-2022-JP decoder ASCII">ASCII</a>),
+<dfn>ISO-2022-JP decoder output state</dfn> (initially
+<a lt="ISO-2022-JP decoder ASCII">ASCII</a>),
+<dfn>ISO-2022-JP lead</dfn> (initially 0x00), and
+<dfn>ISO-2022-JP output flag</dfn> (initially unset).
+
+<p><a>ISO-2022-JP</a>'s <a for=/>decoder</a>'s <a>handler</a>, given a
+<var>stream</var> and <var>byte</var>, runs these steps, switching on
+<a>ISO-2022-JP decoder state</a>:
+
+<dl class=switch>
+ <dt><dfn lt="ISO-2022-JP decoder ASCII">ASCII</dfn>
+ <dd>
+  <p>Based on <var>byte</var>:
+
+  <dl class=switch>
+   <dt>0x1B
+   <dd><p>Set <a>ISO-2022-JP decoder state</a> to
+   <a lt="ISO-2022-JP decoder escape start">escape start</a> and return
+   <a>continue</a>.
+
+   <dt>0x00 to 0x7F, excluding 0x0E, 0x0F, and 0x1B
+   <dd><p>Unset the <a>ISO-2022-JP output flag</a> and return a code point whose
+   value is <var>byte</var>.
+
+   <dt><a>end-of-stream</a>
+   <dd><p>Return <a>finished</a>.
+
+   <dt>Otherwise
+   <dd><p>Unset the <a>ISO-2022-JP output flag</a> and return <a>error</a>.
+  </dl>
+
+ <dt><dfn lt="ISO-2022-JP decoder Roman">Roman</dfn>
+ <dd>
+  <p>Based on <var>byte</var>:
+
+  <dl class=switch>
+   <dt>0x1B
+   <dd><p>Set <a>ISO-2022-JP decoder state</a> to
+   <a lt="ISO-2022-JP decoder escape start">escape start</a> and return
+   <a>continue</a>.
+
+   <dt>0x5C
+   <dd><p>Unset the <a>ISO-2022-JP output flag</a> and return code point U+00A5.
+
+   <dt>0x7E
+   <dd><p>Unset the <a>ISO-2022-JP output flag</a> and return code point U+203E.
+
+   <dt>0x00 to 0x7F, excluding 0x0E, 0x0F, 0x1B, 0x5C, and 0x7E
+   <dd><p>Unset the <a>ISO-2022-JP output flag</a> and return a code point whose
+   value is <var>byte</var>.
+
+   <dt><a>end-of-stream</a>
+   <dd><p>Return <a>finished</a>.
+
+   <dt>Otherwise
+   <dd><p>Unset the <a>ISO-2022-JP output flag</a> and return <a>error</a>.
+  </dl>
+
+ <dt><dfn lt="ISO-2022-JP decoder katakana">katakana</dfn>
+ <dd>
+  <p>Based on <var>byte</var>:
+  <dl class=switch>
+   <dt>0x1B
+   <dd><p>Set <a>ISO-2022-JP decoder state</a> to
+   <a lt="ISO-2022-JP decoder escape start">escape start</a> and return
+   <a>continue</a>.
+
+   <dt>0x21 to 0x5F
+   <dd><p>Unset the <a>ISO-2022-JP output flag</a> and return a code point whose
+   value is 0xFF61 &minus; 0x21 + <var>byte</var>.
+   <!-- Katakana; subtraction is done first to avoid upsetting compilers -->
+
+   <dt><a>end-of-stream</a>
+   <dd><p>Return <a>finished</a>.
+
+   <dt>Otherwise
+   <dd><p>Unset the <a>ISO-2022-JP output flag</a> and return <a>error</a>.
+  </dl>
+
+ <dt><dfn lt="ISO-2022-JP decoder lead byte">Lead byte</dfn>
+ <dd>
+  <p>Based on <var>byte</var>:
+  <dl class=switch>
+   <dt>0x1B
+   <dd><p>Set <a>ISO-2022-JP decoder state</a> to
+   <a lt="ISO-2022-JP decoder escape start">escape start</a> and return
+   <a>continue</a>.
+
+   <dt>0x21 to 0x7E
+   <dd><p>Unset the <a>ISO-2022-JP output flag</a>, set
+   <a>ISO-2022-JP lead</a> to <var>byte</var>,
+   <a>ISO-2022-JP decoder state</a> to
+   <a lt="ISO-2022-JP decoder trail byte">trail byte</a>, and return
+   <a>continue</a>.
+
+   <dt><a>end-of-stream</a>
+   <dd><p>Return <a>finished</a>.
+
+   <dt>Otherwise
+   <dd><p>Unset the <a>ISO-2022-JP output flag</a> and return <a>error</a>.
+  </dl>
+
+ <dt><dfn lt="ISO-2022-JP decoder trail byte">Trail byte</dfn>
+ <dd>
+  <p>Based on <var>byte</var>:
+  <dl class=switch>
+   <dt>0x1B
+   <dd><p>Set <a>ISO-2022-JP decoder state</a> to
+   <a lt="ISO-2022-JP decoder escape start">escape start</a> and return
+   <a>error</a>.
+   <!-- ISO-2022-JP decoder output state is still lead byte -->
+
+   <dt>0x21 to 0x7E
+   <dd>
+    <ol>
+     <li><p>Set the <a>ISO-2022-JP decoder state</a> to
+     <a lt="ISO-2022-JP decoder lead byte">lead byte</a>.
+
+     <li><p>Let <var>pointer</var> be
+     (<a>ISO-2022-JP lead</a> &minus; 0x21) × 94 + <var>byte</var> &minus; 0x21.
+
+     <li><p>Let <var>code point</var> be the <a>index code point</a> for
+     <var>pointer</var> in <a>index jis0208</a>.
+
+     <li><p>If <var>code point</var> is null, return <a>error</a>.
+
+     <li><p>Return a code point whose value is <var>code point</var>.
+    </ol>
+
+   <dt><a>end-of-stream</a>
+   <dd><p>Set the <a>ISO-2022-JP decoder state</a> to
+   <a lt="ISO-2022-JP decoder lead byte">lead byte</a>,
+   <a>prepend</a> <var>byte</var> to
+   <var>stream</var>, and return <a>error</a>.
+
+   <dt>Otherwise
+   <dd><p>Set <a>ISO-2022-JP decoder state</a> to
+   <a lt="ISO-2022-JP decoder lead byte">lead byte</a> and return
+   <a>error</a>.
+   <!-- ISO-2022-JP decoder output state is still lead byte -->
+  </dl>
+
+ <dt><dfn lt="ISO-2022-JP decoder escape start">Escape start</dfn>
+ <dd>
+  <ol>
+   <li><p>If <var>byte</var> is either <!--$-->0x24 or <!--(-->0x28, set
+   <a>ISO-2022-JP lead</a> to <var>byte</var>,
+   <a>ISO-2022-JP decoder state</a> to
+   <a lt="ISO-2022-JP decoder escape">escape</a>, and return
+   <a>continue</a>.
+
+   <li><p><a>Prepend</a> <var>byte</var> to
+   <var>stream</var>.
+
+   <li><p>Unset the <a>ISO-2022-JP output flag</a>, set
+   <a>ISO-2022-JP decoder state</a> to
+   <a>ISO-2022-JP decoder output state</a>, and return <a>error</a>.
+  </ol>
+
+ <dt><dfn lt="ISO-2022-JP decoder escape">Escape</dfn>
+ <dd>
+  <ol>
+   <li><p>Let <var>lead</var> be <a>ISO-2022-JP lead</a> and set
+   <a>ISO-2022-JP lead</a> to 0x00.
+
+   <li><p>Let <var>state</var> be null.
+
+   <li><p>If <var>lead</var> is 0x28 and <var>byte</var> is 0x42<!--B-->, set
+   <var>state</var> to <a lt="ISO-2022-JP decoder ASCII">ASCII</a>.
+
+   <li><p>If <var>lead</var> is 0x28 and <var>byte</var> is 0x4A<!--J-->, set
+   <var>state</var> to <a lt="ISO-2022-JP decoder Roman">Roman</a>.
+
+   <li><p>If <var>lead</var> is 0x28 and <var>byte</var> is 0x49<!--I-->, set
+   <var>state</var> to <a lt="ISO-2022-JP decoder katakana">katakana</a>.
+
+   <li><p>If <var>lead</var> is 0x24 and <var>byte</var> is either
+   0x40<!--@--> or 0x42<!--B-->, set <var>state</var> to
+   <a lt="ISO-2022-JP decoder lead byte">lead byte</a>.
+
+   <li>
+    <p>If <var>state</var> is non-null, then:
+
+    <ol>
+     <li><p>Set <a>ISO-2022-JP decoder state</a> and
+     <a>ISO-2022-JP decoder output state</a> to <var>state</var>.
+
+     <li><p>Let <var>output flag</var> be the <a>ISO-2022-JP output flag</a>.
+
+     <li><p>Set the <a>ISO-2022-JP output flag</a>.
+
+     <li><p>Return <a>continue</a>, if <var>output flag</var> is unset, and
+     <a>error</a> otherwise.
+    </ol>
+
+   <li><p><a>Prepend</a>
+   <var>lead</var> and <var>byte</var> to <var>stream</var>.
+
+   <li><p>Unset the <a>ISO-2022-JP output flag</a>, set
+   <a>ISO-2022-JP decoder state</a> to <a>ISO-2022-JP decoder output state</a>,
+   and return <a>error</a>.
+  </ol>
+</dl>
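+
+<p class=note>The state machine above can be sketched, non-normatively, as follows. The function
+name, the string state names, the <code>EOS</code> sentinel, the emission of U+FFFD for every
+<a>error</a>, and the <code>Map</code> standing in for <a>index jis0208</a> are assumptions of this
+sketch. The four output states are merged into one branch because they share their handling of
+0x1B and <a>end-of-stream</a>.

```javascript
const EOS = -1; // stand-in for end-of-stream

// Sketch only: `jis0208` is a hypothetical Map<pointer, codePoint>.
function decodeIso2022Jp(bytes, jis0208) {
  const stream = Array.from(bytes);
  // Prepending end-of-stream is a no-op: an empty stream yields EOS again.
  const prepend = (...tokens) => stream.unshift(...tokens.filter((t) => t !== EOS));
  const out = [];
  let state = "ASCII", outputState = "ASCII", lead = 0x00, outputFlag = false;
  while (true) {
    const byte = stream.length > 0 ? stream.shift() : EOS;
    if (state === "escape start") {
      if (byte === 0x24 || byte === 0x28) {
        lead = byte;
        state = "escape";
        continue;
      }
      prepend(byte);
      outputFlag = false;
      state = outputState;
      out.push(0xFFFD);
    } else if (state === "escape") {
      const first = lead;
      lead = 0x00;
      let next = null;
      if (first === 0x28 && byte === 0x42) next = "ASCII";
      if (first === 0x28 && byte === 0x4A) next = "Roman";
      if (first === 0x28 && byte === 0x49) next = "katakana";
      if (first === 0x24 && (byte === 0x40 || byte === 0x42)) next = "lead byte";
      if (next !== null) {
        state = outputState = next;
        const previousFlag = outputFlag;
        outputFlag = true;
        if (previousFlag) out.push(0xFFFD); // two escapes with no output between
        continue;
      }
      prepend(first, byte);
      outputFlag = false;
      state = outputState;
      out.push(0xFFFD);
    } else if (state === "trail byte") {
      if (byte === 0x1B) {
        state = "escape start";
        out.push(0xFFFD);
        continue;
      }
      state = "lead byte";
      if (byte >= 0x21 && byte <= 0x7E) {
        const pointer = (lead - 0x21) * 94 + byte - 0x21;
        out.push(jis0208.has(pointer) ? jis0208.get(pointer) : 0xFFFD);
      } else {
        out.push(0xFFFD); // covers end-of-stream and any other byte
      }
    } else {
      // ASCII, Roman, katakana, and lead byte.
      if (byte === 0x1B) { state = "escape start"; continue; }
      if (byte === EOS) return out;
      outputFlag = false;
      if (state === "lead byte") {
        if (byte >= 0x21 && byte <= 0x7E) { lead = byte; state = "trail byte"; }
        else out.push(0xFFFD);
      } else if (state === "katakana") {
        out.push(byte >= 0x21 && byte <= 0x5F ? 0xFF61 - 0x21 + byte : 0xFFFD);
      } else if (byte > 0x7F || byte === 0x0E || byte === 0x0F) {
        out.push(0xFFFD);
      } else if (state === "Roman" && byte === 0x5C) {
        out.push(0xA5);
      } else if (state === "Roman" && byte === 0x7E) {
        out.push(0x203E);
      } else {
        out.push(byte);
      }
    }
  }
}
```

+<p class=note>Feeding this sketch the byte sequence 0x1B 0x28 0x4A 0x5C 0x1B 0x28 0x42 twice in a
+row produces U+00A5 U+FFFD U+00A5, because the second escape sequence arrives while the
+<a>ISO-2022-JP output flag</a> is still set.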
+
+
+<h4 id=iso-2022-jp-encoder dfn export>ISO-2022-JP encoder</h4>
+
+<div class="note no-backref">
+ <p>The <a>ISO-2022-JP encoder</a> is the only <a for=/>encoder</a> for which the concatenation of
+ multiple outputs can result in an <a>error</a> when run through the corresponding
+ <a for=/>decoder</a>.
+
+ <p class=example id=example-iso-2022-jp-encoder-oddity>Encoding U+00A5 gives 0x1B 0x28 0x4A 0x5C
+ 0x1B 0x28 0x42. Doing that twice, concatenating the results, and then decoding yields U+00A5 U+FFFD
+ U+00A5.
+</div>
+
+<p><a>ISO-2022-JP</a>'s <a for=/>encoder</a> has an associated
+<dfn>ISO-2022-JP encoder state</dfn> which is <dfn lt="ISO-2022-JP encoder ASCII">ASCII</dfn>,
+<dfn lt="ISO-2022-JP encoder Roman">Roman</dfn>, or
+<dfn lt="ISO-2022-JP encoder jis0208">jis0208</dfn> (initially
+<a lt="ISO-2022-JP encoder ASCII">ASCII</a>).
+
+<p><a>ISO-2022-JP</a>'s <a for=/>encoder</a>'s <a>handler</a>, given a
+<var>stream</var> and <var>code point</var>, runs these steps:
+
+<ol>
+ <li><p>If <var>code point</var> is <a>end-of-stream</a> and
+ <a>ISO-2022-JP encoder state</a> is not
+ <a lt="ISO-2022-JP encoder ASCII">ASCII</a>,
+ <a>prepend</a> <var>code point</var> to
+ <var>stream</var>, set <a>ISO-2022-JP encoder state</a> to
+ <a lt="ISO-2022-JP encoder ASCII">ASCII</a>, and return three bytes
+ 0x1B 0x28 0x42.
+
+ <li><p>If <var>code point</var> is <a>end-of-stream</a> and
+ <a>ISO-2022-JP encoder state</a> is
+ <a lt="ISO-2022-JP encoder ASCII">ASCII</a>, return <a>finished</a>.
+
+ <li>
+  <p>If <a>ISO-2022-JP encoder state</a> is
+  <a lt="ISO-2022-JP encoder ASCII">ASCII</a> or
+  <a lt="ISO-2022-JP encoder Roman">Roman</a>, and <var>code point</var> is U+000E, U+000F,
+  or U+001B, return <a>error</a> with U+FFFD.
+
+  <p class=note>This returns U+FFFD rather than <var>code point</var> to prevent attacks.
+  <!-- https://github.com/whatwg/encoding/issues/15 -->
+
+ <li><p>If <a>ISO-2022-JP encoder state</a> is
+ <a lt="ISO-2022-JP encoder ASCII">ASCII</a> and <var>code point</var> is an
+ <a>ASCII code point</a>, return a byte whose value is <var>code point</var>.
+
+ <li>
+  <p>If <a>ISO-2022-JP encoder state</a> is <a lt="ISO-2022-JP encoder Roman">Roman</a> and
+  <var>code point</var> is an <a>ASCII code point</a>, excluding U+005C and U+007E, or is U+00A5 or
+  U+203E, then:
+
+  <ol>
+   <li><p>If <var>code point</var> is an <a>ASCII code point</a>, return a byte
+   whose value is <var>code point</var>.
+
+   <li><p>If <var>code point</var> is U+00A5, return byte 0x5C.
+
+   <li><p>If <var>code point</var> is U+203E, return byte 0x7E.
+  </ol>
+
+ <li><p>If <var>code point</var> is an <a>ASCII code point</a>, and
+ <a>ISO-2022-JP encoder state</a> is not
+ <a lt="ISO-2022-JP encoder ASCII">ASCII</a>,
+ <a>prepend</a> <var>code point</var> to
+ <var>stream</var>, set <a>ISO-2022-JP encoder state</a> to
+ <a lt="ISO-2022-JP encoder ASCII">ASCII</a>, and return three bytes
+ 0x1B 0x28 0x42.
+
+ <li><p>If <var>code point</var> is either U+00A5 or U+203E, and
+ <a>ISO-2022-JP encoder state</a> is not
+ <a lt="ISO-2022-JP encoder Roman">Roman</a>,
+ <a>prepend</a> <var>code point</var> to
+ <var>stream</var>, set <a>ISO-2022-JP encoder state</a> to
+ <a lt="ISO-2022-JP encoder Roman">Roman</a>, and return three bytes
+ 0x1B 0x28 0x4A.
+
+ <li><p>If <var>code point</var> is U+2212, set it to U+FF0D.
+
+ <li><p>If <var>code point</var> is in the range U+FF61 to U+FF9F, inclusive, set it to the
+ <a>index code point</a> for <var>code point</var> &minus; 0xFF61 in
+ <a>index ISO-2022-JP katakana</a>.
+
+ <li>
+  <p>Let <var>pointer</var> be the <a>index pointer</a> for <var>code point</var> in
+  <a>index jis0208</a>.
+
+  <p class=note>If <var>pointer</var> is non-null, it is less than 8836 due to the nature of
+  <a>index jis0208</a> and the <a>index pointer</a> operation.
+
+ <li><p>If <var>pointer</var> is null, return <a>error</a> with
+ <var>code point</var>.
+
+ <li><p>If <a>ISO-2022-JP encoder state</a> is not
+ <a lt="ISO-2022-JP encoder jis0208">jis0208</a>,
+ <a>prepend</a> <var>code point</var> to
+ <var>stream</var>, set <a>ISO-2022-JP encoder state</a> to
+ <a lt="ISO-2022-JP encoder jis0208">jis0208</a>, and return three bytes
+ 0x1B 0x24 0x42.
+
+ <li><p>Let <var>lead</var> be <var>pointer</var> / 94 + 0x21.
+
+ <li><p>Let <var>trail</var> be <var>pointer</var> % 94 + 0x21.
+
+ <li><p>Return two bytes whose values are <var>lead</var> and
+ <var>trail</var>.
+</ol>
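+
+<p class=note>A non-normative sketch of the encoder steps above follows. The function name, the
+<code>pointerFor</code> and <code>katakana</code> callbacks (stand-ins for <a>index jis0208</a> and
+<a>index ISO-2022-JP katakana</a>), and the choice to throw a <code>RangeError</code> wherever the
+handler returns <a>error</a> are assumptions made for illustration.

```javascript
// Sketch only: `pointerFor` (code point -> pointer or null) and `katakana`
// (pointer -> code point) are hypothetical index lookups.
function encodeIso2022Jp(codePoints, pointerFor, katakana) {
  const input = Array.from(codePoints);
  const out = [];
  const isAscii = (c) => c <= 0x7F;
  let state = "ASCII";
  let i = 0;
  // Not advancing `i` plays the role of "prepend code point to stream".
  while (true) {
    if (i >= input.length) {
      if (state === "ASCII") return out;
      state = "ASCII";
      out.push(0x1B, 0x28, 0x42);
      continue;
    }
    let c = input[i];
    if ((state === "ASCII" || state === "Roman") &&
        (c === 0x0E || c === 0x0F || c === 0x1B)) {
      throw new RangeError("error with U+FFFD");
    }
    if (state === "ASCII" && isAscii(c)) {
      out.push(c);
      i++;
      continue;
    }
    if (state === "Roman" &&
        ((isAscii(c) && c !== 0x5C && c !== 0x7E) || c === 0xA5 || c === 0x203E)) {
      out.push(isAscii(c) ? c : c === 0xA5 ? 0x5C : 0x7E);
      i++;
      continue;
    }
    if (isAscii(c) && state !== "ASCII") {
      state = "ASCII";
      out.push(0x1B, 0x28, 0x42);
      continue;
    }
    if ((c === 0xA5 || c === 0x203E) && state !== "Roman") {
      state = "Roman";
      out.push(0x1B, 0x28, 0x4A);
      continue;
    }
    if (c === 0x2212) c = 0xFF0D;
    if (c >= 0xFF61 && c <= 0xFF9F) c = katakana(c - 0xFF61);
    const pointer = pointerFor(c);
    if (pointer === null) throw new RangeError("error with code point");
    if (state !== "jis0208") {
      state = "jis0208";
      out.push(0x1B, 0x24, 0x42);
      continue;
    }
    out.push(Math.floor(pointer / 94) + 0x21, (pointer % 94) + 0x21);
    i++;
  }
}
```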
+
+
+<h3 id=shift_jis dfn export>Shift_JIS</h3>
+
+<h4 id=shift_jis-decoder dfn export>Shift_JIS decoder</h4>
+
+<p><a>Shift_JIS</a>'s <a for=/>decoder</a> has an associated
+<dfn>Shift_JIS lead</dfn> (initially 0x00).
+
+<p><a>Shift_JIS</a>'s <a for=/>decoder</a>'s <a>handler</a>, given a
+<var>stream</var> and <var>byte</var>, runs these steps:
+
+<ol>
+ <li><p>If <var>byte</var> is <a>end-of-stream</a> and
+ <a>Shift_JIS lead</a> is not 0x00, set <a>Shift_JIS lead</a> to 0x00 and
+ return <a>error</a>.
+
+ <li><p>If <var>byte</var> is <a>end-of-stream</a> and
+ <a>Shift_JIS lead</a> is 0x00, return <a>finished</a>.
+
+ <li>
+  <p>If <a>Shift_JIS lead</a> is not 0x00, let <var>lead</var> be <a>Shift_JIS lead</a>, let
+  <var>pointer</var> be null, set <a>Shift_JIS lead</a> to 0x00, and then:
+
+  <ol>
+   <li><p>Let <var>offset</var> be 0x40, if <var>byte</var> is
+   less than 0x7F, and 0x41 otherwise.
+
+   <li><p>Let <var>lead offset</var> be 0x81, if <var>lead</var>
+   is less than 0xA0, and 0xC1 otherwise.
+
+   <li><p>If <var>byte</var> is in the range 0x40 to 0x7E, inclusive, or
+   0x80 to 0xFC, inclusive, set <var>pointer</var> to
+   (<var>lead</var> &minus; <var>lead offset</var>) × 188 + <var>byte</var> &minus; <var>offset</var>.
+
+   <li>
+    <p>If <var>pointer</var> is in the range 8836 to 10715, inclusive, return a code point whose
+    value is 0xE000 &minus; 8836 + <var>pointer</var>.
+    <!-- subtraction is done first to avoid upsetting compilers -->
+
+    <p class=note>This is interoperable legacy from Windows known as EUDC.
+    <!-- PUA -->
+
+   <li><p>Let <var>code point</var> be null, if
+   <var>pointer</var> is null, and the <a>index code point</a>
+   for <var>pointer</var> in <a>index jis0208</a> otherwise.
+
+   <li><p>If <var>code point</var> is non-null, return a code point whose value is
+   <var>code point</var>.
+
+   <li><p>If <var>byte</var> is an <a>ASCII byte</a>, <a>prepend</a> <var>byte</var> to
+   <var>stream</var>.
+
+   <li><p>Return <a>error</a>.
+  </ol>
+
+ <li><p>If <var>byte</var> is an <a>ASCII byte</a> or 0x80, return a code point
+ whose value is <var>byte</var>.
+ <!-- Opera has 0x7E -->
+
+ <li><p>If <var>byte</var> is in the range 0xA1 to 0xDF, inclusive, return
+ a code point whose value is 0xFF61 &minus; 0xA1 + <var>byte</var>.
+ <!-- Katakana; subtraction is done first to avoid upsetting compilers -->
+
+ <li><p>If <var>byte</var> is in the range 0x81 to 0x9F, inclusive, or 0xE0 to 0xFC,
+ inclusive, set <a>Shift_JIS lead</a> to <var>byte</var> and return
+ <a>continue</a>.
+
+ <li><p>Return <a>error</a>.
+</ol>
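+
+<p class=note>The pointer arithmetic in the two-byte branch above can be illustrated with the
+following non-normative sketch. Both function names are hypothetical. The lookup in
+<a>index jis0208</a> is deliberately left out, so only the offsets and the EUDC range are shown.

```javascript
// Sketch only: returns the pointer for a lead/trail pair, or null when the
// trail byte is outside 0x40-0x7E and 0x80-0xFC.
function shiftJisPointer(lead, byte) {
  const offset = byte < 0x7F ? 0x40 : 0x41;
  const leadOffset = lead < 0xA0 ? 0x81 : 0xC1;
  const trailOk = (byte >= 0x40 && byte <= 0x7E) || (byte >= 0x80 && byte <= 0xFC);
  return trailOk ? (lead - leadOffset) * 188 + byte - offset : null;
}

// Pointers 8836 to 10715 map directly to U+E000 to U+E757 (EUDC);
// any other pointer needs an index lookup, signalled here by null.
function shiftJisEudcCodePoint(pointer) {
  if (pointer === null || pointer < 8836 || pointer > 10715) return null;
  return 0xE000 - 8836 + pointer;
}
```

+<p class=note>For instance, the bytes 0x82 0xA0 give pointer (0x82 &minus; 0x81) × 188 + 0xA0
+&minus; 0x41, which is 283, and the bytes 0xF0 0x40 give pointer 8836, the first EUDC pointer.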
+
+
+<h4 id=shift_jis-encoder dfn export>Shift_JIS encoder</h4>
+
+<p><a>Shift_JIS</a>'s <a for=/>encoder</a>'s <a>handler</a>, given a
+<var>stream</var> and <var>code point</var>, runs these steps:
+
+<ol>
+ <li><p>If <var>code point</var> is <a>end-of-stream</a>, return
+ <a>finished</a>.
+
+ <li><p>If <var>code point</var> is an <a>ASCII code point</a> or U+0080, return
+ a byte whose value is <var>code point</var>.
+
+ <li><p>If <var>code point</var> is U+00A5, return byte 0x5C.
+
+ <li><p>If <var>code point</var> is U+203E, return byte 0x7E.
+
+ <li><p>If <var>code point</var> is in the range U+FF61 to U+FF9F, inclusive, return
+ a byte whose value is <var>code point</var> &minus; 0xFF61 + 0xA1.
+
+ <li><p>If <var>code point</var> is U+2212, set it to U+FF0D.
+
+ <li><p>Let <var>pointer</var> be the <a>index Shift_JIS pointer</a> for
+ <var>code point</var>.
+
+ <li><p>If <var>pointer</var> is null, return <a>error</a> with
+ <var>code point</var>.
+
+ <li><p>Let <var>lead</var> be <var>pointer</var> / 188.
+
+ <li><p>Let <var>lead offset</var> be 0x81, if <var>lead</var> is
+ less than 0x1F, and 0xC1 otherwise.
+ <!-- 0xA0-0x81 -->
+
+ <li><p>Let <var>trail</var> be <var>pointer</var> % 188.
+
+ <li><p>Let <var>offset</var> be 0x40, if <var>trail</var> is
+ less than 0x3F, and 0x41 otherwise.
+
+ <li><p>Return two bytes whose values are
+ <var>lead</var> + <var>lead offset</var> and
+ <var>trail</var> + <var>offset</var>.
+</ol>
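+
+<p class=note>The final five steps above, which turn a pointer into two bytes, can be sketched
+non-normatively as follows. The function name is hypothetical and the pointer is assumed to be
+non-null.

```javascript
// Sketch only: converts a non-null index Shift_JIS pointer into its two bytes.
function shiftJisBytes(pointer) {
  const lead = Math.floor(pointer / 188);
  const leadOffset = lead < 0x1F ? 0x81 : 0xC1; // skips lead bytes 0xA0-0xDF
  const trail = pointer % 188;
  const offset = trail < 0x3F ? 0x40 : 0x41; // skips trail byte 0x7F
  return [lead + leadOffset, trail + offset];
}
```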
+
+
+
+<h2 id=legacy-multi-byte-korean-encodings>Legacy multi-byte Korean encodings</h2>
+
+<h3 id=euc-kr dfn export>EUC-KR</h3>
+
+<h4 id=euc-kr-decoder dfn export>EUC-KR decoder</h4>
+
+<p><a>EUC-KR</a>'s <a for=/>decoder</a> has an associated
+<dfn>EUC-KR lead</dfn> (initially 0x00).
+
+<p><a>EUC-KR</a>'s <a for=/>decoder</a>'s <a>handler</a>, given a
+<var>stream</var> and <var>byte</var>, runs these steps:
+
+<ol>
+ <li><p>If <var>byte</var> is <a>end-of-stream</a> and
+ <a>EUC-KR lead</a> is not 0x00, set <a>EUC-KR lead</a> to 0x00
+ and return <a>error</a>.
+
+ <li><p>If <var>byte</var> is <a>end-of-stream</a> and
+ <a>EUC-KR lead</a> is 0x00, return <a>finished</a>.
+
+ <li>
+  <p>If <a>EUC-KR lead</a> is not 0x00, let <var>lead</var> be <a>EUC-KR lead</a>, let
+  <var>pointer</var> be null, set <a>EUC-KR lead</a> to 0x00, and then:
+
+  <ol>
+   <li><p>If <var>byte</var> is in the range 0x41 to 0xFE, inclusive, set
+   <var>pointer</var> to
+   (<var>lead</var> &minus; 0x81) × 190 + (<var>byte</var> &minus; 0x41).
+
+   <li><p>Let <var>code point</var> be null, if <var>pointer</var> is null,
+   and the <a>index code point</a> for <var>pointer</var> in
+   <a>index EUC-KR</a> otherwise.
+
+   <li><p>If <var>code point</var> is non-null, return a code point whose value is
+   <var>code point</var>.
+
+   <li><p>If <var>byte</var> is an <a>ASCII byte</a>, <a>prepend</a> <var>byte</var> to
+   <var>stream</var>.
+
+   <li><p>Return <a>error</a>.
+  </ol>
+
+ <li><p>If <var>byte</var> is an <a>ASCII byte</a>, return
+ a code point whose value is <var>byte</var>.
+
+ <li><p>If <var>byte</var> is in the range 0x81 to 0xFE, inclusive, set
+ <a>EUC-KR lead</a> to <var>byte</var> and return <a>continue</a>.
+
+ <li><p>Return <a>error</a>.
+</ol>
+
+
+<h4 id=euc-kr-encoder dfn export>EUC-KR encoder</h4>
+
+<p><a>EUC-KR</a>'s <a for=/>encoder</a>'s <a>handler</a>, given a
+<var>stream</var> and <var>code point</var>, runs these steps:
+
+<ol>
+ <li><p>If <var>code point</var> is <a>end-of-stream</a>, return
+ <a>finished</a>.
+
+ <li><p>If <var>code point</var> is an <a>ASCII code point</a>, return
+ a byte whose value is <var>code point</var>.
+
+ <li><p>Let <var>pointer</var> be the <a>index pointer</a> for
+ <var>code point</var> in <a>index EUC-KR</a>.
+
+ <li><p>If <var>pointer</var> is null, return <a>error</a> with
+ <var>code point</var>.
+
+ <li><p>Let <var>lead</var> be <var>pointer</var> / 190 + 0x81.
+
+ <li><p>Let <var>trail</var> be <var>pointer</var> % 190 + 0x41.
+
+ <li><p>Return two bytes whose values are <var>lead</var> and <var>trail</var>.
+</ol>
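+
+<p class=note>The pointer arithmetic used by the <a>EUC-KR decoder</a> and
+<a>EUC-KR encoder</a> is symmetric, as the following non-normative sketch illustrates. The function
+names are hypothetical and the lookups in <a>index EUC-KR</a> are omitted.

```javascript
// Sketch only: decoder direction, returns null when the trail byte is out of range.
function eucKrPointer(lead, byte) {
  if (byte < 0x41 || byte > 0xFE) return null;
  return (lead - 0x81) * 190 + (byte - 0x41);
}

// Sketch only: encoder direction, for a non-null pointer.
function eucKrBytes(pointer) {
  return [Math.floor(pointer / 190) + 0x81, (pointer % 190) + 0x41];
}
```

+<p class=note>For instance, the bytes 0xB0 0xA1 correspond to pointer 9026, and pointer 9026
+yields the bytes 0xB0 0xA1 again.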
+
+
+
+<h2 id=legacy-miscellaneous-encodings>Legacy miscellaneous encodings</h2>
+
+<h3 id=replacement dfn export>replacement</h3>
+
+<p class=note>The <a>replacement</a> <a for=/>encoding</a> exists to prevent certain
+attacks that abuse a mismatch between <a for=/>encodings</a> supported on
+the server and the client.
+
+
+<h4 id=replacement-decoder dfn export>replacement decoder</h4>
+
+<p><a>replacement</a>'s <a for=/>decoder</a> has an associated
+<dfn>replacement error returned flag</dfn> (initially unset).
+
+<p><a>replacement</a>'s <a for=/>decoder</a>'s <a>handler</a>, given a
+<var>stream</var> and <var>byte</var>, runs these steps:
+
+<ol>
+ <li><p>If <var>byte</var> is <a>end-of-stream</a>, return <a>finished</a>.
+
+ <li><p>If <a>replacement error returned flag</a> is unset, set the
+ <a>replacement error returned flag</a> and return <a>error</a>.
+
+ <li><p>Return <a>finished</a>.
+</ol>
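+
+<p class=note>In effect, any non-empty input yields exactly one <a>error</a> and nothing else. A
+non-normative sketch, in which the function name and the use of U+FFFD for the <a>error</a> are
+assumptions:

```javascript
// Sketch only: mirrors the handler by tracking the "error returned" flag.
function decodeReplacement(bytes) {
  const out = [];
  let errorReturned = false;
  for (let i = 0; i < bytes.length; i++) {
    if (errorReturned) break; // the handler returns finished from now on
    errorReturned = true;
    out.push(0xFFFD);
  }
  return out;
}
```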
+
+
+<h3 id=common-infrastructure-for-utf-16be-and-utf-16le>Common infrastructure for <a>UTF-16BE</a> and <a>UTF-16LE</a></h3>
+
+<h4 id=shared-utf-16-decoder dfn export>shared UTF-16 decoder</h4>
+
+<p class="note no-backref">A byte order mark has priority over a <a>label</a> as it
+has been found to be more accurate in deployed content. Therefore it is not part of the
+<a>shared UTF-16 decoder</a> algorithm but rather the <a>decode</a> algorithm.
+
+<p>The <a>shared UTF-16 decoder</a> has an associated <dfn>UTF-16 lead byte</dfn> and
+<dfn>UTF-16 lead surrogate</dfn> (both initially null), and a
+<dfn>UTF-16BE decoder flag</dfn> (initially unset).
+
+<p>The <a>shared UTF-16 decoder</a>'s <a>handler</a>, given a <var>stream</var>
+and <var>byte</var>, runs these steps:
+
+<ol>
+ <li><p>If <var>byte</var> is <a>end-of-stream</a> and either
+ <a>UTF-16 lead byte</a> or <a>UTF-16 lead surrogate</a> is non-null, set
+ <a>UTF-16 lead byte</a> and <a>UTF-16 lead surrogate</a> to null, and return
+ <a>error</a>.
+
+ <li><p>If <var>byte</var> is <a>end-of-stream</a> and
+ <a>UTF-16 lead byte</a> and <a>UTF-16 lead surrogate</a> are null, return
+ <a>finished</a>.
+
+ <li><p>If <a>UTF-16 lead byte</a> is null, set <a>UTF-16 lead byte</a> to
+ <var>byte</var> and return <a>continue</a>.
+
+ <li>
+  <p>Let <var>code unit</var> be the result of:
+
+  <dl class=switch>
+   <dt><a>UTF-16BE decoder flag</a> is set
+   <dd><p>(<a>UTF-16 lead byte</a> &lt;&lt; 8) + <var>byte</var>.
+   <dt><a>UTF-16BE decoder flag</a> is unset
+   <dd><p>(<var>byte</var> &lt;&lt; 8) + <a>UTF-16 lead byte</a>.
+  </dl>
+
+  <p>Then set <a>UTF-16 lead byte</a> to null.
+
+ <li>
+  <p>If <a>UTF-16 lead surrogate</a> is non-null, let <var>lead surrogate</var> be
+  <a>UTF-16 lead surrogate</a>, set <a>UTF-16 lead surrogate</a> to null, and then:
+
+  <ol>
+   <li><p>If <var>code unit</var> is in the range U+DC00 to U+DFFF, inclusive,
+   return a code point whose value is
+   0x10000 + ((<var>lead surrogate</var> &minus; 0xD800) &lt;&lt; 10) + (<var>code unit</var> &minus; 0xDC00).
+
+   <li><p>Let <var>byte1</var> be <var>code unit</var> &gt;&gt; 8.
+
+   <li><p>Let <var>byte2</var> be <var>code unit</var> &amp; 0x00FF.
+
+   <li><p>Let <var>bytes</var> be two bytes whose values are <var>byte1</var> and <var>byte2</var>,
+   if the <a>UTF-16BE decoder flag</a> is set, and <var>byte2</var> and <var>byte1</var> otherwise.
+
+   <li><p><a>Prepend</a> the <var>bytes</var> to
+   <var>stream</var> and return <a>error</a>.
+   <!-- unpaired surrogates; IE/WebKit output them, Gecko/Opera U+FFFD them -->
+  </ol>
+
+ <li><p>If <var>code unit</var> is in the range U+D800 to U+DBFF, inclusive, set
+ <a>UTF-16 lead surrogate</a> to <var>code unit</var> and return
+ <a>continue</a>.
+
+ <li><p>If <var>code unit</var> is in the range U+DC00 to U+DFFF, inclusive,
+ return <a>error</a>.
+ <!-- unpaired surrogates; IE/WebKit output them, Gecko/Opera U+FFFD them -->
+
+ <li><p>Return code point <var>code unit</var>.
+</ol>
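+
+<p class=note>The handler above can be sketched non-normatively as follows. The function name, the
+<code>bigEndian</code> parameter (standing in for the <a>UTF-16BE decoder flag</a>), and the
+emission of U+FFFD for every <a>error</a> are assumptions of this sketch. Because the two bytes
+that get prepended are always the two most recently read, the sketch rewinds its read position
+instead of maintaining a separate stream.

```javascript
// Sketch only: decodes bytes into an array of code points.
function decodeUtf16(bytes, bigEndian = false) {
  const out = [];
  let leadByte = null;
  let leadSurrogate = null;
  let i = 0;
  while (true) {
    if (i >= bytes.length) {
      // End-of-stream with half a code unit or half a pair pending.
      if (leadByte !== null || leadSurrogate !== null) out.push(0xFFFD);
      return out;
    }
    const byte = bytes[i++];
    if (leadByte === null) {
      leadByte = byte;
      continue;
    }
    const codeUnit = bigEndian ? (leadByte << 8) + byte : (byte << 8) + leadByte;
    leadByte = null;
    if (leadSurrogate !== null) {
      const lead = leadSurrogate;
      leadSurrogate = null;
      if (codeUnit >= 0xDC00 && codeUnit <= 0xDFFF) {
        out.push(0x10000 + ((lead - 0xD800) << 10) + (codeUnit - 0xDC00));
        continue;
      }
      i -= 2; // prepend the code unit's two bytes so they are decoded again
      out.push(0xFFFD);
      continue;
    }
    if (codeUnit >= 0xD800 && codeUnit <= 0xDBFF) {
      leadSurrogate = codeUnit;
    } else if (codeUnit >= 0xDC00 && codeUnit <= 0xDFFF) {
      out.push(0xFFFD); // unpaired trail surrogate
    } else {
      out.push(codeUnit);
    }
  }
}
```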
+
+
+<h3 id=utf-16be dfn export>UTF-16BE</h3>
+
+<h4 id=utf-16be-decoder dfn export>UTF-16BE decoder</h4>
+
+<p><a>UTF-16BE</a>'s <a for=/>decoder</a> is the <a>shared UTF-16 decoder</a> with
+its <a>UTF-16BE decoder flag</a> set.
+
+
+<h3 id=utf-16le dfn export>UTF-16LE</h3>
+
+<p class="note no-backref">Both "<code>utf-16</code>" and
+"<code>utf-16le</code>" are <a>labels</a> for
+<a>UTF-16LE</a> to deal with deployed content.
+
+
+<h4 id=utf-16le-decoder dfn export>UTF-16LE decoder</h4>
+
+<p><a>UTF-16LE</a>'s <a for=/>decoder</a> is the <a>shared UTF-16 decoder</a>.
+
+
+<h3 id=x-user-defined dfn export>x-user-defined</h3>
+
+<p class=note>While technically this is a <a>single-byte encoding</a>,
+it is defined separately as it can be implemented algorithmically.
+
+<!--
+This encoding is silly, however, the web depends on it:
+
+https://krijnhoetmer.nl/irc-logs/whatwg/20121003#l-461
+https://krijnhoetmer.nl/irc-logs/whatwg/20121010#l-812
+
+https://stackoverflow.com/questions/6986789/why-are-some-bytes-prefixed-with-0xf7-when-using-charset-x-user-defined-with-xm
+-->
+
+<h4 id=x-user-defined-decoder dfn export>x-user-defined decoder</h4>
+
+<p><a>x-user-defined</a>'s <a for=/>decoder</a>'s <a>handler</a>, given a
+<var>stream</var> and <var>byte</var>, runs these steps:
+
+<ol>
+ <li><p>If <var>byte</var> is <a>end-of-stream</a>, return
+ <a>finished</a>.
+
+ <li><p>If <var>byte</var> is an <a>ASCII byte</a>, return
+ a code point whose value is <var>byte</var>.
+
+ <li><p>Return a code point whose value is 0xF780 + <var>byte</var> &minus; 0x80.
+</ol>
+
+
+<h4 id=x-user-defined-encoder dfn export>x-user-defined encoder</h4>
+
+<p><a>x-user-defined</a>'s <a for=/>encoder</a>'s <a>handler</a>, given a
+<var>stream</var> and <var>code point</var>, runs these steps:
+
+<ol>
+ <li><p>If <var>code point</var> is <a>end-of-stream</a>, return
+ <a>finished</a>.
+
+ <li><p>If <var>code point</var> is an <a>ASCII code point</a>, return
+ a byte whose value is <var>code point</var>.
+
+ <li><p>If <var>code point</var> is in the range U+F780 to U+F7FF, inclusive, return
+ a byte whose value is <var>code point</var> &minus; 0xF780 + 0x80.
+
+ <li><p>Return <a>error</a> with <var>code point</var>.
+</ol>
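+
+<p class=note>Both <a>x-user-defined</a> handlers reduce to simple arithmetic, as the following
+non-normative sketch shows. The function names are hypothetical and <code>null</code> stands in for
+<a>error</a>.

```javascript
// Sketch only: decoder direction, one byte to one code point.
function decodeXUserDefinedByte(byte) {
  return byte <= 0x7F ? byte : 0xF780 + byte - 0x80;
}

// Sketch only: encoder direction, returns null where the handler returns error.
function encodeXUserDefinedCodePoint(codePoint) {
  if (codePoint <= 0x7F) return codePoint;
  if (codePoint >= 0xF780 && codePoint <= 0xF7FF) return codePoint - 0xF780 + 0x80;
  return null;
}
```

+<p class=note>Since the non-ASCII bytes map to U+F780 to U+F7FF, masking a decoded code point with
+0xFF recovers the original byte.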
+
+
+
+<h2 id=browser-ui>Browser UI</h2>
+
+<p>Browsers are encouraged not to enable overriding the encoding of a resource. If such a
+feature is nonetheless present, browsers should not offer either
+<a>UTF-16BE</a> or <a>UTF-16LE</a> as an option, due to the aforementioned security
+issues. Browsers should also disable this feature if the resource was decoded using either
+<a>UTF-16BE</a> or <a>UTF-16LE</a>.
+
+
+
+<h2 class=no-num id=implementation-considerations>Implementation considerations</h2>
+
+<p>Instead of supporting <a for=/>streams</a> with arbitrary <a for=stream>prepend</a>, the
+<a for=/>decoders</a> for <a for=/>encodings</a> in this standard could be implemented with:
+
+<ol>
+ <li><p>The ability to unread the current byte.
+
+ <li>
+  <p>A single-byte buffer for <a>gb18030</a> (an <a>ASCII byte</a>) and <a>ISO-2022-JP</a> (0x24 or
+  0x28).
+
+  <p class=example id=example-gb18030-implementation-strategy>For <a>gb18030</a> when hitting a
+  bogus byte while <a>gb18030 third</a> is not 0x00, <a>gb18030 second</a> could be moved into the
+  single-byte buffer to be returned next, and <a>gb18030 third</a> would be the new
+  <a>gb18030 first</a>, checked for not being 0x00 after the single-byte buffer was returned and
+  emptied. This is possible as the range for the first and third byte in <a>gb18030</a> is
+  identical.
+</ol>
+
+<p>The <a>ISO-2022-JP encoder</a> needs <a>ISO-2022-JP encoder state</a> as additional state, but
+other than that, none of the <a for=/>encoders</a> for <a for=/>encodings</a> in this standard
+require additional state or buffers.
+
+
+
+<h2 class=no-num id=acknowledgments>Acknowledgments</h2>
+
+<p>Many people have helped make encodings more
+interoperable over the years and thereby furthered the goals of this
+standard. Likewise, many people have helped make this standard what it is
+today.
+
+<p>With that, many thanks to
+Adam Rice,
+Alan Chaney,
+Alexander Shtuchkin,
+Allen Wirfs-Brock,
+Aneesh Agrawal,
+Arkadiusz Michalski,
+Asmus Freytag,
+Ben Noordhuis,
+Boris Zbarsky,
+Bruno Haible,
+Cameron McCormack,
+Charles McCathieNeville,
+Christopher Foo,
+David Carlisle,
+Domenic Denicola,
+Dominique Hazaël-Massieux,
+Doug Ewell,
+Erik van der Poel,
+譚永鋒 (Frank Yung-Fong Tang),
+Geoffrey Sneddon,
+Glenn Maynard,
+Gordon P. Hemsley,
+Henri Sivonen,
+Ian Hickson,
+James Graham,
+Jeffrey Yasskin,
+John Tamplin,
+Joshua Bell,
+村井純 (Jun Murai),
+신정식 (Jungshik Shin),
+Jxck,
+강 성훈 (Kang Seonghoon),<!-- space is intentional: https://www.w3.org/Bugs/Public/show_bug.cgi?id=27675#c2 -->
+川幡太一 (Kawabata Taichi),
+Ken Lunde,
+Ken Whistler,
+Kenneth Russell,
+田村健人 (Kent Tamura),
+Leif Halvard Silli,
+Makoto Kato,
+Mark Callow,
+Mark Crispin,
+Mark Davis,
+Martin Dürst,
+Masatoshi Kimura,
+Mattias Buelens,
+Ms2ger,
+Nigel Megitt,
+Nigel Tao,
+Norbert Lindenberg,
+Øistein E. Andersen,
+Peter Krefting,
+Philip Jägenstedt,
+Philip Taylor,
+Richard Ishida,
+Robbert Broersma,
+Robert Mustacchi,
+Ryan Dahl,
+Shawn Steele,
+Simon Montagu,
+Simon Pieters,
+Simon Sapin,
+寺田健 (Takeshi Terada),
+Vyacheslav Matva, and
+成瀬ゆい (Yui Naruse)
+for being awesome.
+
+<p>This standard is written by
+<a href=https://annevankesteren.nl/ lang=nl>Anne van Kesteren</a>
+(<a href=https://www.mozilla.org/>Mozilla</a>,
+<a href=mailto:annevk@annevk.nl>annevk@annevk.nl</a>). The <a href=#api>API</a> chapter
+was initially written by Joshua Bell (<a href=https://www.google.com/>Google</a>).

Name +	Labels +
The Encoding +
UTF-8 +	"`unicode-1-1-utf-8`" +
	"`utf-8`" +
	"`utf8`" +
Legacy single-byte encodings
IBM866	"`866`"
	"`cp866`"
	"`csibm866`"
	"`ibm866`"
ISO-8859-2	"`csisolatin2`"
	"`iso-8859-2`"
	"`iso-ir-101`"
	"`iso8859-2`"
	"`iso88592`"
	"`iso_8859-2`"
	"`iso_8859-2:1987`"
	"`l2`"
	"`latin2`"
ISO-8859-3	"`csisolatin3`"
	"`iso-8859-3`"
	"`iso-ir-109`"
	"`iso8859-3`"
	"`iso88593`"
	"`iso_8859-3`"
	"`iso_8859-3:1988`"
	"`l3`"
	"`latin3`"
ISO-8859-4	"`csisolatin4`"
	"`iso-8859-4`"
	"`iso-ir-110`"
	"`iso8859-4`"
	"`iso88594`"
	"`iso_8859-4`"
	"`iso_8859-4:1988`"
	"`l4`"
	"`latin4`"
ISO-8859-5	"`csisolatincyrillic`"
	"`cyrillic`"
	"`iso-8859-5`"
	"`iso-ir-144`"
	"`iso8859-5`"
	"`iso88595`"
	"`iso_8859-5`"
	"`iso_8859-5:1988`"
ISO-8859-6	"`arabic`"
	"`asmo-708`"
	"`csiso88596e`"
	"`csiso88596i`"
	"`csisolatinarabic`"
	"`ecma-114`"
	"`iso-8859-6`"
	"`iso-8859-6-e`"
	"`iso-8859-6-i`"
	"`iso-ir-127`"
	"`iso8859-6`"
	"`iso88596`"
	"`iso_8859-6`"
	"`iso_8859-6:1987`"
ISO-8859-7	"`csisolatingreek`"
	"`ecma-118`"
	"`elot_928`"
	"`greek`"
	"`greek8`"
	"`iso-8859-7`"
	"`iso-ir-126`"
	"`iso8859-7`"
	"`iso88597`"
	"`iso_8859-7`"
	"`iso_8859-7:1987`"
	"`sun_eu_greek`"
ISO-8859-8	"`csiso88598e`"
	"`csisolatinhebrew`"
	"`hebrew`"
	"`iso-8859-8`"
	"`iso-8859-8-e`"
	"`iso-ir-138`"
	"`iso8859-8`"
	"`iso88598`"
	"`iso_8859-8`"
	"`iso_8859-8:1988`"
	"`visual`"
ISO-8859-8-I	"`csiso88598i`"
	"`iso-8859-8-i`"
	"`logical`"
ISO-8859-10	"`csisolatin6`"
	"`iso-8859-10`"
	"`iso-ir-157`"
	"`iso8859-10`"
	"`iso885910`"
	"`l6`"
	"`latin6`"
ISO-8859-13	"`iso-8859-13`"
	"`iso8859-13`"
	"`iso885913`"
ISO-8859-14	"`iso-8859-14`"
	"`iso8859-14`"
	"`iso885914`"
ISO-8859-15	"`csisolatin9`"
	"`iso-8859-15`"
	"`iso8859-15`"
	"`iso885915`"
	"`iso_8859-15`"
	"`l9`"
ISO-8859-16	"`iso-8859-16`"
KOI8-R	"`cskoi8r`"
	"`koi`"
	"`koi8`"
	"`koi8-r`"
	"`koi8_r`"
KOI8-U	"`koi8-ru`"
	"`koi8-u`"
macintosh	"`csmacintosh`"
	"`mac`"
	"`macintosh`"
	"`x-mac-roman`"
windows-874	"`dos-874`"
	"`iso-8859-11`"
	"`iso8859-11`"
	"`iso885911`"
	"`tis-620`"
	"`windows-874`"
windows-1250	"`cp1250`"
	"`windows-1250`"
	"`x-cp1250`"
windows-1251	"`cp1251`"
	"`windows-1251`"
	"`x-cp1251`"
windows-1252	"`ansi_x3.4-1968`"
	"`ascii`"
	"`cp1252`"
	"`cp819`"
	"`csisolatin1`"
	"`ibm819`"
	"`iso-8859-1`"
	"`iso-ir-100`"
	"`iso8859-1`"
	"`iso88591`"
	"`iso_8859-1`"
	"`iso_8859-1:1987`"
	"`l1`"
	"`latin1`"
	"`us-ascii`"
	"`windows-1252`"
	"`x-cp1252`"
windows-1253	"`cp1253`"
	"`windows-1253`"
	"`x-cp1253`"
windows-1254	"`cp1254`"
	"`csisolatin5`"
	"`iso-8859-9`"
	"`iso-ir-148`"
	"`iso8859-9`"
	"`iso88599`"
	"`iso_8859-9`"
	"`iso_8859-9:1989`"
	"`l5`"
	"`latin5`"
	"`windows-1254`"
	"`x-cp1254`"
windows-1255	"`cp1255`"
	"`windows-1255`"
	"`x-cp1255`"
windows-1256	"`cp1256`"
	"`windows-1256`"
	"`x-cp1256`"
windows-1257	"`cp1257`"
	"`windows-1257`"
	"`x-cp1257`"
windows-1258	"`cp1258`"
	"`windows-1258`"
	"`x-cp1258`"
x-mac-cyrillic	"`x-mac-cyrillic`"
	"`x-mac-ukrainian`"
Legacy multi-byte Chinese (simplified) encodings
GBK	"`chinese`"
	"`csgb2312`"
	"`csiso58gb231280`"
	"`gb2312`"
	"`gb_2312`"
	"`gb_2312-80`"
	"`gbk`"
	"`iso-ir-58`"
	"`x-gbk`"
gb18030	"`gb18030`"
Legacy multi-byte Chinese (traditional) encodings
Big5	"`big5`"
	"`big5-hkscs`"
	"`cn-big5`"
	"`csbig5`"
	"`x-x-big5`"
Legacy multi-byte Japanese encodings
EUC-JP	"`cseucpkdfmtjapanese`"
	"`euc-jp`"
	"`x-euc-jp`"
ISO-2022-JP	"`csiso2022jp`"
	"`iso-2022-jp`"
Shift_JIS	"`csshiftjis`"
	"`ms932`"
	"`ms_kanji`"
	"`shift-jis`"
	"`shift_jis`"
	"`sjis`"
	"`windows-31j`"
	"`x-sjis`"
Legacy multi-byte Korean encodings
EUC-KR	"`cseuckr`"
	"`csksc56011987`"
	"`euc-kr`"
	"`iso-ir-149`"
	"`korean`"
	"`ks_c_5601-1987`"
	"`ks_c_5601-1989`"
	"`ksc5601`"
	"`ksc_5601`"
	"`windows-949`"
Legacy miscellaneous encodings
replacement	"`csiso2022kr`"
	"`hz-gb-2312`"
	"`iso-2022-cn`"
	"`iso-2022-cn-ext`"
	"`iso-2022-kr`"
	"`replacement`"
UTF-16BE	"`utf-16be`"
UTF-16LE	"`utf-16`"
	"`utf-16le`"
x-user-defined	"`x-user-defined`"
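
As an illustration of how the label table is meant to be consulted, the following is a minimal sketch of a label lookup. It assumes the lookup rule of the standard's "get an encoding" algorithm (strip leading and trailing ASCII whitespace, then match ASCII case-insensitively), which is not restated in this table. The names `LABELS` and `getEncoding` are hypothetical, the map holds only a small excerpt of the rows above, and returning `null` for an unknown label is a choice made for this sketch.

```javascript
// Hypothetical excerpt of the label table above: label -> encoding name.
// A complete implementation would list every row of the table.
const LABELS = new Map([
  ["unicode-1-1-utf-8", "UTF-8"],
  ["utf-8", "UTF-8"],
  ["utf8", "UTF-8"],
  ["ascii", "windows-1252"],
  ["iso-8859-1", "windows-1252"],
  ["latin1", "windows-1252"],
  ["koi8-ru", "KOI8-U"],
  ["big5-hkscs", "Big5"],
  ["iso-2022-kr", "replacement"],
  ["utf-16", "UTF-16LE"],
]);

// ASCII whitespace is TAB, LF, FF, CR, and SPACE.
const ASCII_WHITESPACE_ENDS = /^[\t\n\f\r ]+|[\t\n\f\r ]+$/g;

// Returns the encoding name for a label, or null if the label is unknown
// (assumption: null stands in for "failure").
function getEncoding(label) {
  const key = label
    .replace(ASCII_WHITESPACE_ENDS, "")
    // ASCII-only lowercasing, so non-ASCII letters are never folded.
    .replace(/[A-Z]/g, (c) => c.toLowerCase());
  return LABELS.has(key) ? LABELS.get(key) : null;
}
```

With this excerpt, `getEncoding(" Latin1\n")` yields `"windows-1252"` and `getEncoding("utf-7")` yields `null`, since "`utf-7`" is not a label in the table.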
Index				Notes
index Big5	index-big5.txt	index Big5 visualization	index Big5 BMP coverage	This matches the Big5 standard in combination with the Hong Kong Supplementary Character Set and other common extensions.
index EUC-KR	index-euc-kr.txt	index EUC-KR visualization	index EUC-KR BMP coverage	This matches the KS X 1001 standard and the Unified Hangul Code, more commonly known together as Windows Codepage 949. It covers the Hangul Syllables block of Unicode in its entirety. The Hangul block whose top left corner in the visualization is at pointer 9026 is in the Unicode order. Taken separately, the rest of the Hangul syllables in this index are in the Unicode order, too.
index gb18030	index-gb18030.txt	index gb18030 visualization	index gb18030 BMP coverage	This matches the GB18030-2005 standard for code points encoded as two bytes, except for 0xA3 0xA0 which maps to U+3000 to be compatible with deployed content. This index covers the CJK Unified Ideographs block of Unicode in its entirety. Entries from that block that are above or to the left of (the first) U+3000 in the visualization are in the Unicode order.
index gb18030 ranges	index-gb18030-ranges.txt			This index works differently from all others. Listing all code points would result in over a million items whereas they can be represented neatly in 207 ranges combined with trivial limit checks. It therefore only superficially matches the GB18030-2005 standard for code points encoded as four bytes. See also index gb18030 ranges code point and index gb18030 ranges pointer below.
index jis0208	index-jis0208.txt	index jis0208 visualization, Shift_JIS visualization	index jis0208 BMP coverage	This is the JIS X 0208 standard including formerly proprietary extensions from IBM and NEC.
index jis0212	index-jis0212.txt	index jis0212 visualization	index jis0212 BMP coverage	This is the JIS X 0212 standard. It is only used by the EUC-JP decoder due to lack of widespread support elsewhere.
index ISO-2022-JP katakana	index-iso-2022-jp-katakana.txt			This maps halfwidth to fullwidth katakana as per Unicode Normalization Form KC, except that U+FF9E and U+FF9F map to U+309B and U+309C rather than U+3099 and U+309A. It is only used by the ISO-2022-JP encoder. [[UNICODE]]
IBM866	index-ibm866.txt	index IBM866 visualization	index IBM866 BMP coverage
ISO-8859-2	index-iso-8859-2.txt	index ISO-8859-2 visualization	index ISO-8859-2 BMP coverage
ISO-8859-3	index-iso-8859-3.txt	index ISO-8859-3 visualization	index ISO-8859-3 BMP coverage
ISO-8859-4	index-iso-8859-4.txt	index ISO-8859-4 visualization	index ISO-8859-4 BMP coverage
ISO-8859-5	index-iso-8859-5.txt	index ISO-8859-5 visualization	index ISO-8859-5 BMP coverage
ISO-8859-6	index-iso-8859-6.txt	index ISO-8859-6 visualization	index ISO-8859-6 BMP coverage
ISO-8859-7	index-iso-8859-7.txt	index ISO-8859-7 visualization	index ISO-8859-7 BMP coverage
ISO-8859-8	index-iso-8859-8.txt	index ISO-8859-8 visualization	index ISO-8859-8 BMP coverage
ISO-8859-8-I	index-iso-8859-8.txt	index ISO-8859-8 visualization	index ISO-8859-8 BMP coverage
ISO-8859-10	index-iso-8859-10.txt	index ISO-8859-10 visualization	index ISO-8859-10 BMP coverage
ISO-8859-13	index-iso-8859-13.txt	index ISO-8859-13 visualization	index ISO-8859-13 BMP coverage
ISO-8859-14	index-iso-8859-14.txt	index ISO-8859-14 visualization	index ISO-8859-14 BMP coverage
ISO-8859-15	index-iso-8859-15.txt	index ISO-8859-15 visualization	index ISO-8859-15 BMP coverage
ISO-8859-16	index-iso-8859-16.txt	index ISO-8859-16 visualization	index ISO-8859-16 BMP coverage
KOI8-R	index-koi8-r.txt	index KOI8-R visualization	index KOI8-R BMP coverage
KOI8-U	index-koi8-u.txt	index KOI8-U visualization	index KOI8-U BMP coverage
macintosh	index-macintosh.txt	index macintosh visualization	index macintosh BMP coverage
windows-874	index-windows-874.txt	index windows-874 visualization	index windows-874 BMP coverage
windows-1250	index-windows-1250.txt	index windows-1250 visualization	index windows-1250 BMP coverage
windows-1251	index-windows-1251.txt	index windows-1251 visualization	index windows-1251 BMP coverage
windows-1252	index-windows-1252.txt	index windows-1252 visualization	index windows-1252 BMP coverage
windows-1253	index-windows-1253.txt	index windows-1253 visualization	index windows-1253 BMP coverage
windows-1254	index-windows-1254.txt	index windows-1254 visualization	index windows-1254 BMP coverage
windows-1255	index-windows-1255.txt	index windows-1255 visualization	index windows-1255 BMP coverage
windows-1256	index-windows-1256.txt	index windows-1256 visualization	index windows-1256 BMP coverage
windows-1257	index-windows-1257.txt	index windows-1257 visualization	index windows-1257 BMP coverage
windows-1258	index-windows-1258.txt	index windows-1258 visualization	index windows-1258 BMP coverage
x-mac-cyrillic	index-x-mac-cyrillic.txt	index x-mac-cyrillic visualization	index x-mac-cyrillic BMP coverage
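
The note on index gb18030 ranges in the table above describes a lookup by ranges plus limit checks rather than by individual entries. A rough sketch of that idea follows. The limits (39419, 189000, 1237575) and the special case for pointer 7457 are taken from the standard's "index gb18030 ranges code point" algorithm as understood by this sketch, not from the table itself. `RANGES_EXCERPT` is a hypothetical five-entry excerpt believed to match the first four lines and the last line of index-gb18030-ranges.txt; it should be checked against that file, and the sketch refuses pointers the excerpt does not cover.

```javascript
// Excerpt only (assumption): [pointer, code point] pairs believed to match
// the first four entries and the final entry of index-gb18030-ranges.txt.
// The real index has 207 entries.
const RANGES_EXCERPT = [
  [0, 0x0080],
  [36, 0x00a5],
  [38, 0x00a9],
  [45, 0x00b2],
  [189000, 0x10000],
];

// Maps a four-byte gb18030 pointer to a code point, or null if out of range.
function gb18030RangesCodePoint(pointer) {
  // Trivial limit checks: gaps where no code point is assigned.
  if ((pointer > 39419 && pointer < 189000) || pointer > 1237575) return null;
  // Single special-cased pointer.
  if (pointer === 7457) return 0xe7c7;
  // Guard specific to this sketch: the excerpt omits the ranges that
  // start at pointers 50 through 39394.
  if (pointer >= 50 && pointer <= 39419) {
    throw new RangeError("pointer not covered by RANGES_EXCERPT");
  }
  // Find the last range whose starting pointer is <= pointer.
  let offset = 0;
  let codePointOffset = 0;
  for (const [start, codePoint] of RANGES_EXCERPT) {
    if (start > pointer) break;
    offset = start;
    codePointOffset = codePoint;
  }
  return codePointOffset + pointer - offset;
}
```

A linear scan is enough for a sketch; with the full 207-entry index a binary search over the starting pointers would be the more natural choice.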
Pointer	Code points	Notes
1133	U+00CA U+0304	Ê̄ (LATIN CAPITAL LETTER E WITH CIRCUMFLEX AND MACRON)
1135	U+00CA U+030C	Ê̌ (LATIN CAPITAL LETTER E WITH CIRCUMFLEX AND CARON)
1164	U+00EA U+0304	ê̄ (LATIN SMALL LETTER E WITH CIRCUMFLEX AND MACRON)
1166	U+00EA U+030C	ê̌ (LATIN SMALL LETTER E WITH CIRCUMFLEX AND CARON)
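
The four pointers in the table above each map to two code points instead of one. The sketch below shows one way a Big5 decoder could special-case them. It assumes the pointer formula of the standard's Big5 decoder (lead byte 0x81 to 0xFE, 157 trail positions per lead byte), which is not restated in this table. The names `BIG5_TWO_CODE_POINT_POINTERS`, `big5Pointer`, and `decodeSpecialPair` are hypothetical.

```javascript
// The four pointers from the table above and their code point pairs.
const BIG5_TWO_CODE_POINT_POINTERS = new Map([
  [1133, [0x00ca, 0x0304]],
  [1135, [0x00ca, 0x030c]],
  [1164, [0x00ea, 0x0304]],
  [1166, [0x00ea, 0x030c]],
]);

// Computes the index pointer for a lead/trail byte pair, or null if the
// bytes are outside the ranges the decoder accepts (assumed formula).
function big5Pointer(lead, trail) {
  const trailOk =
    (trail >= 0x40 && trail <= 0x7e) || (trail >= 0xa1 && trail <= 0xfe);
  if (lead < 0x81 || lead > 0xfe || !trailOk) return null;
  const offset = trail < 0x7f ? 0x40 : 0x62;
  return (lead - 0x81) * 157 + (trail - offset);
}

// Returns the two-code-point string for one of the special pointers,
// or null when the pair is not one of the four special cases.
function decodeSpecialPair(lead, trail) {
  const pointer = big5Pointer(lead, trail);
  const codePoints = BIG5_TWO_CODE_POINT_POINTERS.get(pointer);
  return codePoints ? String.fromCodePoint(...codePoints) : null;
}
```

Under these assumptions the byte pair 0x88 0x62 gives pointer 1133, so `decodeSpecialPair(0x88, 0x62)` returns U+00CA followed by U+0304. Any other pair returns `null` here and would go through the regular index Big5 lookup in a full decoder.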