Fix #20: rather than relying on integer division, use floor()
annevk committed Dec 16, 2015
1 parent 95f85a6 commit 929a3ff
Showing 2 changed files with 29 additions and 37 deletions.
34 changes: 15 additions & 19 deletions Overview.html
@@ -7,7 +7,7 @@

<p><a class="logo" href="https://whatwg.org/"><img alt="WHATWG" height="100" src="https://resources.whatwg.org/logo-encoding.svg" width="100"></a></p>
<h1>Encoding</h1>
<h2 class="no-num no-toc" id="living-standard-—-last-updated-15-december-2015">Living Standard — Last Updated 15 December 2015</h2>
<h2 class="no-num no-toc" id="living-standard-—-last-updated-16-december-2015">Living Standard — Last Updated 16 December 2015</h2>

<dl>
<dt>Participate:
@@ -214,11 +214,11 @@ <h2 id="terminology"><span class="secno">4 </span>Terminology</h2>

<p>Hexadecimal numbers are prefixed with "0x".

<p>In equations, all numbers are integers, addition is represented by "+",
subtraction by "−", multiplication by "×", division by "/",
calculating the remainder of a division (also known as modulo) by "%",
logical left shifts by "&lt;&lt;", logical right shifts by "&gt;&gt;",
bitwise AND by "&amp;", and bitwise OR by "|".
<p>In equations, all numbers are integers, addition is represented by "+", subtraction by "−",
multiplication by "×", division by "/", calculating the remainder of a division (also known as
modulo) by "%", logical left shifts by "&lt;&lt;", logical right shifts by "&gt;&gt;", bitwise AND by
"&amp;", and bitwise OR by "|". floor(<var>x</var>) is the largest integer not greater than
<var>x</var>.

<p>For logical right shifts operands must have at least twenty-one bits precision.

@@ -1776,8 +1776,7 @@ <h4 id="gb18030-encoder"><span class="secno">11.2.2 </span><dfn>gb18030 encoder<
<p>If <var>pointer</var> is not null, run these substeps:

<ol>
<li><p>Let <var>lead</var> be
<var>pointer</var> / 190 + 0x81.
<li><p>Let <var>lead</var> be floor(<var>pointer</var> / 190) + 0x81.

<li><p>Let <var>trail</var> be <var>pointer</var> % 190.

@@ -1794,20 +1793,17 @@ <h4 id="gb18030-encoder"><span class="secno">11.2.2 </span><dfn>gb18030 encoder<
<li><p>Set <var>pointer</var> to the
<a href="#index-gb18030-ranges-pointer">index gb18030 ranges pointer</a> for <var>code point</var>.

<li><p>Let <var>byte1</var> be
<var>pointer</var> / 10 / 126 / 10.
<li><p>Let <var>byte1</var> be floor(<var>pointer</var> / 10 / 126 / 10).

domenic Dec 16, 2015

Member

floor(pointer / 10 / 126 / 10) is different from floor(floor(floor(pointer / 10) / 126) / 10). I don't know the context but wanted to make sure the former was intended?

annevk Dec 16, 2015

Author Member

Can you give me two numbers where this matters? E.g., going from 201599 / 10 / 126 / 10 to 201600 / 10 / 126 / 10 seems fine. I suppose I should technically use more floor() though since it's supposed to all be integer division.

cscott Dec 16, 2015

Well, floor((7/4) / (1/4)) != floor(7/4) / floor(1/4). But I think if all the divisors are integers it should be okay?

domenic Dec 16, 2015

Member

You're right it doesn't seem to matter. I wrote a quick script to check and up to 100000000 there are no differences at least :). There is probably a quick proof that could be done, although I am sadly out of practice and so have to resort to silly looping checks instead of proving it :(
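A looping check of this kind could look roughly like the sketch below. It is not the script mentioned above, only an illustration; the function names are made up for this example, and `fractions.Fraction` is used so that the single outer floor is taken over an exact quotient rather than a floating-point one.

```python
import math
from fractions import Fraction


def single_floor(pointer: int) -> int:
    """floor(pointer / 10 / 126 / 10), evaluated with exact rationals."""
    return math.floor(Fraction(pointer) / 10 / 126 / 10)


def nested_floor(pointer: int) -> int:
    """floor(floor(floor(pointer / 10) / 126) / 10), via integer division."""
    return pointer // 10 // 126 // 10


def first_mismatch(limit: int):
    """Return the first pointer below `limit` where the two forms differ, else None."""
    for pointer in range(limit):
        if single_floor(pointer) != nested_floor(pointer):
            return pointer
    return None


# The limit is kept small here only so the exact-rational loop finishes quickly.
print(first_mismatch(50_000))
```

The boundary values from earlier in the thread (201599 and 201600) can be passed to both functions directly to compare them.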

cscott Dec 16, 2015

Consider (X/Y)/Z. Given a nonnegative integer X and positive integers Y and Z, the exact result of X/Y will be N + f, where N is an integer and 0 <= f <= (Y-1)/Y (since X is an integer). Then (X/Y)/Z = (N + f)/Z = N/Z + f/Z = (N' + f') + f/Z, where N' is an integer and 0 <= f' <= (Z-1)/Z. We want to ensure that f' + f/Z < 1. Expanding, f' + f/Z <= (Z-1)/Z + (Y-1)/(YZ), and so f' + f/Z <= (YZ-1)/(YZ). That will always be less than 1, so floor((X/Y)/Z) = N' = floor(floor(X/Y)/Z), assuming X is a nonnegative integer and Y and Z are positive integers (which they seem to be in this case).
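To tie this back to the gb18030 steps under review, here is a minimal sketch of the byte1 to byte4 computation written two ways: once following the spec steps with a single floor per step, and once with chained `divmod` calls, which floor after every division. Both function names are hypothetical, and the values returned are the raw digits before any byte offsets are added, since those later steps are outside the hunk shown here.

```python
def split_pointer(pointer: int) -> tuple:
    """Follow the steps in the diff: one floor per step, then subtract."""
    byte1 = pointer // (10 * 126 * 10)
    pointer -= byte1 * 10 * 126 * 10
    byte2 = pointer // (10 * 126)
    pointer -= byte2 * 10 * 126
    byte3 = pointer // 10
    byte4 = pointer - byte3 * 10
    return byte1, byte2, byte3, byte4


def split_pointer_divmod(pointer: int) -> tuple:
    """Same decomposition, peeling digits off from the right with nested floors."""
    rest, byte4 = divmod(pointer, 10)
    rest, byte3 = divmod(rest, 126)
    byte1, byte2 = divmod(rest, 10)
    return byte1, byte2, byte3, byte4


print(split_pointer(201599))         # (15, 9, 125, 9)
print(split_pointer_divmod(201599))  # (15, 9, 125, 9)
```

If the argument above holds, the two functions agree for every nonnegative pointer, so an implementation can use whichever form is more convenient.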


<li><p>Set <var>pointer</var> to
<var>pointer</var> − <var>byte1</var> × 10 × 126 × 10.

<li><p>Let <var>byte2</var> be
<var>pointer</var> / 10 / 126.
<li><p>Let <var>byte2</var> be floor(<var>pointer</var> / 10 / 126).

<li><p>Set <var>pointer</var> to
<var>pointer</var> − <var>byte2</var> × 10 × 126.

<li><p>Let <var>byte3</var> be
<var>pointer</var> / 10.
<li><p>Let <var>byte3</var> be floor(<var>pointer</var> / 10).

<li><p>Let <var>byte4</var> be
<var>pointer</var> − <var>byte3</var> × 10.
@@ -1916,7 +1912,7 @@ <h4 id="big5-encoder"><span class="secno">12.1.2 </span><dfn>big5 encoder</dfn><
<li><p>If <var>pointer</var> is null, return <a href="#error">error</a> with
<var>code point</var>.

<li><p>Let <var>lead</var> be <var>pointer</var> / 157 + 0x81.
<li><p>Let <var>lead</var> be floor(<var>pointer</var> / 157) + 0x81.

<li><p>Let <var>trail</var> be <var>pointer</var> % 157.

@@ -2023,7 +2019,7 @@ <h4 id="euc-jp-encoder"><span class="secno">13.1.2 </span><dfn>euc-jp encoder</d
<li><p>If <var>pointer</var> is null, return <a href="#error">error</a> with
<var>code point</var>.

<li><p>Let <var>lead</var> be <var>pointer</var> / 94 + 0xA1.
<li><p>Let <var>lead</var> be floor(<var>pointer</var> / 94) + 0xA1.

<li><p>Let <var>trail</var> be <var>pointer</var> % 94 + 0xA1.

@@ -2324,7 +2320,7 @@ <h4 id="iso-2022-jp-encoder"><span class="secno">13.2.2 </span><dfn>iso-2022-jp
<a href="#iso-2022-jp-encoder-jis0208" title="iso-2022-jp encoder jis0208">jis0208</a>, and return three bytes
0x1B 0x24 0x42.

<li><p>Let <var>lead</var> be <var>pointer</var> / 94 + 0x21.
<li><p>Let <var>lead</var> be floor(<var>pointer</var> / 94) + 0x21.

<li><p>Let <var>trail</var> be <var>pointer</var> % 94 + 0x21.

@@ -2431,7 +2427,7 @@ <h4 id="shift_jis-encoder"><span class="secno">13.3.2 </span><dfn>shift_jis enco
<li><p>If <var>pointer</var> is null, return <a href="#error">error</a> with
<var>code point</var>.

<li><p>Let <var>lead</var> be <var>pointer</var> / 188.
<li><p>Let <var>lead</var> be floor(<var>pointer</var> / 188).

<li><p>Let <var>lead offset</var> be 0x81, if <var>lead</var> is
less than 0x1F, and 0xC1 otherwise.
@@ -2520,7 +2516,7 @@ <h4 id="euc-kr-encoder"><span class="secno">14.1.2 </span><dfn>euc-kr encoder</d
<li><p>If <var>pointer</var> is null, return <a href="#error">error</a> with
<var>code point</var>.

<li><p>Let <var>lead</var> be <var>pointer</var> / 190 + 0x81.
<li><p>Let <var>lead</var> be floor(<var>pointer</var> / 190) + 0x81.

<li><p>Let <var>trail</var> be <var>pointer</var> % 190 + 0x41.

32 changes: 14 additions & 18 deletions Overview.src.html
@@ -128,11 +128,11 @@ <h2>Terminology</h2>

<p>Hexadecimal numbers are prefixed with "0x".

<p>In equations, all numbers are integers, addition is represented by "+",
subtraction by "&minus;", multiplication by "&times;", division by "/",
calculating the remainder of a division (also known as modulo) by "%",
logical left shifts by "&lt;&lt;", logical right shifts by ">>",
bitwise AND by "&amp;", and bitwise OR by "|".
<p>In equations, all numbers are integers, addition is represented by "+", subtraction by "&minus;",
multiplication by "&times;", division by "/", calculating the remainder of a division (also known as
modulo) by "%", logical left shifts by "&lt;&lt;", logical right shifts by ">>", bitwise AND by
"&amp;", and bitwise OR by "|". floor(<var>x</var>) is the largest integer not greater than
<var>x</var>.

<p>For logical right shifts operands must have at least twenty-one bits precision.

@@ -1690,8 +1690,7 @@ <h4><dfn>gb18030 encoder</dfn></h4>
<p>If <var>pointer</var> is not null, run these substeps:

<ol>
<li><p>Let <var>lead</var> be
<var>pointer</var> / 190 + 0x81.
<li><p>Let <var>lead</var> be floor(<var>pointer</var> / 190) + 0x81.

<li><p>Let <var>trail</var> be <var>pointer</var> % 190.

@@ -1708,20 +1707,17 @@ <h4><dfn>gb18030 encoder</dfn></h4>
<li><p>Set <var>pointer</var> to the
<span>index gb18030 ranges pointer</span> for <var>code point</var>.

<li><p>Let <var>byte1</var> be
<var>pointer</var> / 10 / 126 / 10.
<li><p>Let <var>byte1</var> be floor(<var>pointer</var> / 10 / 126 / 10).

<li><p>Set <var>pointer</var> to
<var>pointer</var> &minus; <var>byte1</var> &times; 10 &times; 126 &times; 10.

<li><p>Let <var>byte2</var> be
<var>pointer</var> / 10 / 126.
<li><p>Let <var>byte2</var> be floor(<var>pointer</var> / 10 / 126).

<li><p>Set <var>pointer</var> to
<var>pointer</var> &minus; <var>byte2</var> &times; 10 &times; 126.

<li><p>Let <var>byte3</var> be
<var>pointer</var> / 10.
<li><p>Let <var>byte3</var> be floor(<var>pointer</var> / 10).

<li><p>Let <var>byte4</var> be
<var>pointer</var> &minus; <var>byte3</var> &times; 10.
@@ -1830,7 +1826,7 @@ <h4><dfn>big5 encoder</dfn></h4>
<li><p>If <var>pointer</var> is null, return <span>error</span> with
<var>code point</var>.

<li><p>Let <var>lead</var> be <var>pointer</var> / 157 + 0x81.
<li><p>Let <var>lead</var> be floor(<var>pointer</var> / 157) + 0x81.

<li><p>Let <var>trail</var> be <var>pointer</var> % 157.

@@ -1937,7 +1933,7 @@ <h4><dfn>euc-jp encoder</dfn></h4>
<li><p>If <var>pointer</var> is null, return <span>error</span> with
<var>code point</var>.

<li><p>Let <var>lead</var> be <var>pointer</var> / 94 + 0xA1.
<li><p>Let <var>lead</var> be floor(<var>pointer</var> / 94) + 0xA1.

<li><p>Let <var>trail</var> be <var>pointer</var> % 94 + 0xA1.

@@ -2238,7 +2234,7 @@ <h4><dfn>iso-2022-jp encoder</dfn></h4>
<span title="iso-2022-jp encoder jis0208">jis0208</span>, and return three bytes
0x1B 0x24 0x42.

<li><p>Let <var>lead</var> be <var>pointer</var> / 94 + 0x21.
<li><p>Let <var>lead</var> be floor(<var>pointer</var> / 94) + 0x21.

<li><p>Let <var>trail</var> be <var>pointer</var> % 94 + 0x21.

@@ -2345,7 +2341,7 @@ <h4><dfn>shift_jis encoder</dfn></h4>
<li><p>If <var>pointer</var> is null, return <span>error</span> with
<var>code point</var>.

<li><p>Let <var>lead</var> be <var>pointer</var> / 188.
<li><p>Let <var>lead</var> be floor(<var>pointer</var> / 188).

<li><p>Let <var>lead offset</var> be 0x81, if <var>lead</var> is
less than 0x1F, and 0xC1 otherwise.
@@ -2434,7 +2430,7 @@ <h4><dfn>euc-kr encoder</dfn></h4>
<li><p>If <var>pointer</var> is null, return <span>error</span> with
<var>code point</var>.

<li><p>Let <var>lead</var> be <var>pointer</var> / 190 + 0x81.
<li><p>Let <var>lead</var> be floor(<var>pointer</var> / 190) + 0x81.

<li><p>Let <var>trail</var> be <var>pointer</var> % 190 + 0x41.


0 comments on commit 929a3ff
