Editorial: Explicitly note mathematical values #1135

Merged
merged 2 commits into from
Jul 4, 2019

littledan
Member

This patch changes math in the ECMAScript specification to be based on
a concrete division into mathematical values and Numbers. All
numeric values and operations are explicitly either one or the other
with this patch. Previously, math took place in mathematical values,
based on a system of implicit conversions between these values and
Numbers.

This patch makes a big change in the idea of how most math operations
are evaluated internally, from mathematical values to Numbers, but the
goal is to make no observable change in semantics. The notation here
is intended to be relatively unintrusive and terse but unambiguous and
readable; I'd appreciate any feedback and suggestions. The hope is that
this notation will extend cleanly to BigInt as well as specifications for
embedders such as the Web Platform if they choose.

Thanks for your patience in this patch; as @bterlson pointed out, I promised
to make this change almost three years ago as part of the SIMD.js effort :)
The change here ended up touching much less of the specification than
I expected.

Closes tc39/proposal-bigint#10

If interested, I'd really appreciate reviews from @anba @jmdyck @claudepache @domenic @annevk

Member

@ljharb ljharb left a comment

This might obsolete or impact #1086?

spec.html Outdated
<li><em>Mathematical value</em>: Arbitrary real numbers, used for specific situations.</li>
</ul>

<p>In the language of this specification, numerical values and operations (including addition, subtraction, negation, multiplication, addition, and comparison) are distinguished among different numeric kinds using subscripts. The subscript <sub><dfn id="𝕗">𝕗</dfn></sub> refers to Numbers, and the subscript <sub><dfn id="𝕧">𝕧</dfn></sub> refers to numeric values. A subscript is used following each numeric value and operation. For brevity, the <sub>𝕗</sub> subscript can be omitted on Number values--a numeric value with no subscript is interpreted to be a Number. An operation with no subscript is interpreted to be a Number operation, unless one of the parameters has a particular subscript, in which case the operation adopts that subscript. For example, 1<sub>𝕧</sub> + 2<sub>𝕧</sub> = 3<sub>𝕧</sub> is a statement about mathematical values, and 1 + 2 = 3 is a statement about Numbers. It is not defined to mix Numbers and mathematical values in either arithmetic or comparison operations, and any such undefined operation would be an editorial error in this specification text.</p>
Member

numeric values → mathematical values?

Member

Also for the 1v + 2v operation, should that be 1v + 2 = 3 instead?

Member Author

> numeric values → mathematical values?

I don't really understand this suggestion; I used the phrase "numeric values" here in order to range over Numbers and mathematical values. Do you have a better word than "numeric" to use here? Note that the BigInt ToNumeric abstract operation also corresponds to this concept.

> Also for the 1v + 2v operation, should that be 1v + 2 = 3 instead?

In the notation in this PR, it's required to put a v subscript on all literal mathematical values. The expression 1v + 2 is undefined, since you can't add a mathematical value and a number. The only implicit subscripts are for operators, not values. Do you see anything I could clarify about the text in this section to explain this better?

Collaborator

@jmdyck jmdyck Mar 11, 2018

re numeric values → mathematical values?:
I suspect ljharb is referring to a different occurrence than you're looking at, specifically
<sub><dfn id="𝕧">𝕧</dfn></sub> refers to numeric values.
Anyhow, that's the change I made in 9b68de0.

Member

This sentence talks about the operation adopting a subscript when one of the operands has a subscript and the other does not. I’m suggesting showing an example of one explicit and one missing subscript, which to my reading means that 1v + 2 is quite defined to be “1v + 2v” - whereas 1v + 2f might be undefined.

Member

Hmm, that seems very strange to have a binary operator not be defined solely based on its two operands

Collaborator

Huh? How is plus not being defined solely based on its operands?

  • If its operands are both mathematical values, it performs a mathematical addition.
  • If its operands are both Number values, it performs a Number addition.
  • If its operands are anything else, the spec explodes.
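
A minimal sketch of this operand-driven reading, in JavaScript. It is only an illustration, not spec text: BigInt stands in for mathematical values (so only integers are modelled), ordinary JS numbers stand in for Number values, and `plus` is a hypothetical helper name.

```javascript
// Hypothetical model: BigInt stands in for (integer) mathematical values,
// ordinary JS numbers stand in for Number values.
function plus(a, b) {
  if (typeof a === "bigint" && typeof b === "bigint") {
    return a + b; // exact "mathematical" addition
  }
  if (typeof a === "number" && typeof b === "number") {
    return a + b; // IEEE 754 double-precision addition
  }
  // Mixed operands: the operation is undefined, i.e. an editorial error.
  throw new TypeError("cannot mix Numbers and mathematical values");
}

const exactSum = plus(2n ** 53n, 1n); // 9007199254740993n
const roundedSum = plus(2 ** 53, 1);  // 9007199254740992 (rounded)
```

The kind of addition is inferred from the operands alone, and a mixed call such as `plus(1n, 2)` throws rather than silently converting.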

Member

> So, roughly speaking, the 'v' subscript on the '1' propagates to the plus, but not to the '2'.

If it solely depends on the operates, then why would anything propagate to the plus?

Collaborator

> If it solely depends on the operates [operands?], then why would anything propagate to the plus?

I don't really understand the question, because to me those are two ways of saying about the same thing.

Like I said, the "propagates" terminology is roughly speaking. It would be more correct to say that, given an unsubscripted operator, the presence of a v-subscript on an operand allows you to infer a v-subscript for the operator.

Member Author

I was trying to leave the extent of "propagation" really minimal and syntactically obvious.

New plan for omitting subscripts:

  • For operations such as +, decide on the type based on the operands.
  • For values, do as @jmdyck suggested and omit subscripts on "safe" values, including them on all "unsafe" values. Note that, when we add BigInt, the set of safe values will get smaller, but AFAICT not in a way which affects any of the spec algorithms that I saw.

spec.html Outdated
<p>In the language of this specification, numerical values and operations (including addition, subtraction, negation, multiplication, addition, and comparison) are distinguished among different numeric kinds using subscripts. The subscript <sub><dfn id="𝕗">𝕗</dfn></sub> refers to Numbers, and the subscript <sub><dfn id="𝕧">𝕧</dfn></sub> refers to numeric values. A subscript is used following each numeric value and operation. For brevity, the <sub>𝕗</sub> subscript can be omitted on Number values--a numeric value with no subscript is interpreted to be a Number. An operation with no subscript is interpreted to be a Number operation, unless one of the parameters has a particular subscript, in which case the operation adopts that subscript. For example, 1<sub>𝕧</sub> + 2<sub>𝕧</sub> = 3<sub>𝕧</sub> is a statement about mathematical values, and 1 + 2 = 3 is a statement about Numbers. It is not defined to mix Numbers and mathematical values in either arithmetic or comparison operations, and any such undefined operation would be an editorial error in this specification text.</p>
<p>The Number value 0, alternatively written 0<sub>𝕗</sub>, is defined as the double-precision floating point positive zero value. In certain contexts, it may also be written as +0 for clarity.</p>
<p>This specification denotes most numeric values in base 10; it also uses numeric values of the form 0x followed by digits 0-9 or A-F as base-16 values.</p>
<p>In certain contexts, an operation is specified which is generic between Numbers and mathematical values. In these cases, the subscript can be a variable; _t_ is often used for this purpose, for example 5<sub>_t_</sub> * 10<sub>_t_</sub> =<sub>_t_</sub> 50<sub>_t_</sub> for any _t_ ranging over 𝕧 and 𝕗.</p>
Member

It looks like you have t50t here, should the first one be removed?

Member

I think the first one is attached to the equals sign (i.e. equality comparison operation).

IMO this paragraph should state (probably in more elegant words) that the reason that example works is that 5, 10, and 50 are in the right range so as to not encounter the vagaries of floating point arithmetic.
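
To illustrate the point about range, here is a rough JavaScript check. It is an illustration only: BigInt is used as a stand-in for exact integer arithmetic, and `agrees` is a hypothetical helper that accepts integer Numbers.

```javascript
// Returns true when multiplying as Numbers gives the same integer
// as multiplying exactly (BigInt is the exact reference).
function agrees(a, b) {
  const asNumber = a * b;
  const exact = BigInt(a) * BigInt(b);
  return Number.isInteger(asNumber) && BigInt(asNumber) === exact;
}

const small = agrees(5, 10);          // true: both kinds give 50
const large = agrees(2 ** 53 - 1, 3); // false: the Number product is rounded
```

For 5, 10, and 50 the two interpretations coincide; once the product leaves the exactly representable range, they no longer do.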

Member Author

@domenic Yes, that was the intention.

Good suggestion, added a note explaining this reason.

spec.html Outdated
@@ -3856,7 +3867,7 @@ <h1>ToInt32 ( _argument_ )</h1>
<emu-alg>
1. Let _number_ be ? ToNumber(_argument_).
1. If _number_ is *NaN*, *+0*, *-0*, *+&infin;*, or *-&infin;*, return *+0*.
-1. Let _int_ be the mathematical value that is the same sign as _number_ and whose magnitude is floor(abs(_number_)).
+1. Let _int_ be the Number value that is the same sign as _number_ and whose magnitude is floor(abs(_number_)).
Member

Why are these changes safe to make?

Member Author

Because the values involved are within the range where the calculation means the same thing for Numbers as well as mathematical values. Almost all of the specification is this way, and analogous changes are implicitly made everywhere else (based on the rewriting of the above section for mathematical operations).
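
As a rough illustration of that claim, the ToInt32 steps can be written entirely with Number arithmetic. This is a sketch, not the spec algorithm itself: `toInt32` is a hypothetical name, `Number(argument)` only approximates ToNumber, and the modulo and wrap-around steps are filled in from the well-known behaviour of `x | 0`.

```javascript
// Sketch of the ToInt32 steps using ordinary Number arithmetic.
function toInt32(argument) {
  const number = Number(argument); // approximation of ToNumber
  if (Number.isNaN(number) || number === 0 || !Number.isFinite(number)) {
    return 0; // NaN, +0, -0, +Infinity, -Infinity all map to +0
  }
  // Same sign as number, magnitude floor(abs(number)).
  const int = Math.sign(number) * Math.floor(Math.abs(number));
  const two32 = 2 ** 32;
  // Non-negative modulo; JS % alone would keep the sign of the dividend.
  const int32bit = ((int % two32) + two32) % two32;
  return int32bit >= 2 ** 31 ? int32bit - two32 : int32bit;
}
```

Every intermediate value here is an integer that a double represents exactly, which is why doing the steps on Numbers gives the same answers as doing them on mathematical values.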

spec.html Outdated
@@ -24131,7 +24142,7 @@ <h1>parseFloat ( _string_ )</h1>
1. If neither _trimmedString_ nor any prefix of _trimmedString_ satisfies the syntax of a |StrDecimalLiteral| (see <emu-xref href="#sec-tonumber-applied-to-the-string-type"></emu-xref>), return *NaN*.
1. Let _numberString_ be the longest prefix of _trimmedString_, which might be _trimmedString_ itself, that satisfies the syntax of a |StrDecimalLiteral|.
1. Let _mathFloat_ be MV of _numberString_.

This comment was marked as resolved.

@jmdyck
Collaborator

jmdyck commented Mar 10, 2018

In this scheme, do all of the phrases listed here return a Number value?

Member

@domenic domenic left a comment

Overall quite exciting, and seems like the right approach. I made it through a bit before realizing I should stop working on the weekend; I'll try and check out more later.

One initial comment, which may prove unfounded after a more thorough review, is that you define a decent number of shorthands, but then use the longhands as well, which seems a bit wasteful. Similarly, some of the genericness seems excessive, e.g. integer<sub>_t_</sub> is only used in one place, and I'm not sure yet why abs, floor, and friends need to operate on mathematical values.

Also, it'll be important to get cross-links going on for all the notations here. I'll try to build the spec myself with these changes and make sure it all lines up.

spec.html Outdated
<p>The Number value 0, alternatively written 0<sub>𝕗</sub>, is defined as the double-precision floating point positive zero value. In certain contexts, it may also be written as +0 for clarity.</p>
<p>This specification denotes most numeric values in base 10; it also uses numeric values of the form 0x followed by digits 0-9 or A-F as base-16 values.</p>
<p>In certain contexts, an operation is specified which is generic between Numbers and mathematical values. In these cases, the subscript can be a variable; _t_ is often used for this purpose, for example 5<sub>_t_</sub> * 10<sub>_t_</sub> =<sub>_t_</sub> 50<sub>_t_</sub> for any _t_ ranging over 𝕧 and 𝕗.</p>
<p>Conversions between mathematical values and numbers are never implicit, and always explicit in this document. A conversion from a mathematical value to a Number is denoted as "the <dfn id="number-value">Number value</dfn> of _x_", or 𝕗(_x_). A conversion from a Number to a mathematical value is denoted as "the <dfn id="mathematical-value">mathematical value</dfn> of _x_", or 𝕧(_x_). Note that the mathematical value of non-finite values is not defined, and the mathematical value of *+0* and *-0* is the mathematical value 0<sub>𝕧</sub>.
Member

This seems to underdefine these conversions. For example, what is 𝕗(1 × 10^1000)?

Further on it seems these are actually defined; you'll want to make the actual definition the <dfn> and have this be a link to it.
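
For a concrete feel of what a full definition has to pin down, here is a sketch of the two conversions in JavaScript. The assumptions are loud ones: BigInt models mathematical values (integers only), `toNumberValue` and `toMathematicalValue` are hypothetical names, and the overflow behaviour shown is simply what JavaScript's own BigInt-to-Number conversion does, not something this hunk specifies.

```javascript
// 𝕗(x): exact integer -> Number. Number(bigint) rounds to a
// representable double and overflows to Infinity.
function toNumberValue(x) {
  return Number(x);
}

// 𝕧(x): Number -> exact integer. Not defined for NaN and ±Infinity;
// both +0 and -0 map to the single mathematical zero.
function toMathematicalValue(x) {
  if (!Number.isFinite(x)) {
    throw new RangeError("mathematical value of a non-finite Number is not defined");
  }
  if (!Number.isInteger(x)) {
    throw new RangeError("this sketch only models integers");
  }
  return BigInt(x); // BigInt(-0) is 0n
}

const huge = toNumberValue(10n ** 1000n); // Infinity
```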

spec.html Outdated
<p>In certain contexts, an operation is specified which is generic between Numbers and mathematical values. In these cases, the subscript can be a variable; _t_ is often used for this purpose, for example 5<sub>_t_</sub> * 10<sub>_t_</sub> =<sub>_t_</sub> 50<sub>_t_</sub> for any _t_ ranging over 𝕧 and 𝕗.</p>
<p>Conversions between mathematical values and numbers are never implicit, and always explicit in this document. A conversion from a mathematical value to a Number is denoted as "the <dfn id="number-value">Number value</dfn> of _x_", or 𝕗(_x_). A conversion from a Number to a mathematical value is denoted as "the <dfn id="mathematical-value">mathematical value</dfn> of _x_", or 𝕧(_x_). Note that the mathematical value of non-finite values is not defined, and the mathematical value of *+0* and *-0* is the mathematical value 0<sub>𝕧</sub>.
<p>When the term <dfn id="integer">integer</dfn> is used in this specification, it refers to a Number value which whose mathematical value is in set of integers, unless otherwise stated: when the term <dfn id="mathematical integer">mathematical integer</dfn> is used in this specification, it refers to a mathematical value which is in the set of integers. As shorthand, integer<sub>_t_</sub> can be used to refer to either of the two, as determined by _t_.</p>
<p>The mathematical function <emu-eqn id="eqn-abs" aoid="abs">abs<sub>_t_</sub>(_x_)</emu-eqn> produces the absolute value of _x_, which is <emu-eqn>-<sub>_t_</sub>_x_</emu-eqn> if _x_ &lt;<sub>_t_</sub> 0<sub>_t_</sub> and otherwise is _x_ itself.</p>
Member

I would remove "mathematical" here since it operates over both Number and mathematical values.

Collaborator

Except then we're just calling it a function, which it isn't, in the ES sense. We could call it an operation.
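
A small sketch of what "generic over _t_" amounts to, assuming BigInt as a stand-in for (integer) mathematical values; `absGeneric` is a hypothetical name.

```javascript
// abs generic over the two modelled kinds: the comparison and negation
// use the same kind as the argument (0n for BigInt, 0 for Number).
function absGeneric(x) {
  const zero = typeof x === "bigint" ? 0n : 0;
  // Following the quoted definition literally, -0 is not less than 0,
  // so it is returned unchanged (unlike Math.abs, which gives +0).
  return x < zero ? -x : x;
}
```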

spec.html Outdated
<emu-note>
<p>The bit pattern that might be observed in an ArrayBuffer (see <emu-xref href="#sec-arraybuffer-objects"></emu-xref>) or a SharedArrayBuffer (see <emu-xref href="#sec-sharedarraybuffer-objects"></emu-xref>) after a Number value has been stored into it is not necessarily the same as the internal representation of that Number value used by the ECMAScript implementation.</p>
</emu-note>
<p>There are two other special values, called *positive Infinity* and *negative Infinity*. For brevity, these values are also referred to for expository purposes by the symbols *+&infin;* and *-&infin;*, respectively. (Note that these two infinite Number values are produced by the program expressions `+Infinity` (or simply `Infinity`) and `-Infinity`.)</p>
-<p>The other 18437736874454810624 (that is, <emu-eqn>2<sup>64</sup>-2<sup>53</sup></emu-eqn>) values are called the finite numbers. Half of these are positive numbers and half are negative numbers; for every finite positive Number value there is a corresponding negative value having the same magnitude.</p>
+<p>The other 18437736874454810624<sub>𝕧</sub> (that is, <emu-eqn>2<sub>𝕧</sub><sup>64<sub>𝕧</sub></sup>-2<sub>𝕧</sub><sup>53<sub>𝕧</sub></sup></emu-eqn>) values are called the finite numbers. Half of these are positive numbers and half are negative numbers; for every finite positive Number value there is a corresponding negative value having the same magnitude.</p>
Member

Half<sub>𝕧</sub>? Kidding ;)

Member Author

Heh, I thought about that sort of thing a little bit, maybe more than it deserves. There are all sorts of numbers in the prose of the spec, but I didn't really see any sort of semantic ambiguity with them (all seem well within range where Number and mathematical value coincide).
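
As a quick check of the count in the hunk above, here is the same arithmetic done both ways in JavaScript (illustration only; BigInt is used as the exact reference).

```javascript
// Exact computation with BigInt.
const exactCount = 2n ** 64n - 2n ** 53n; // 18437736874454810624n

// The same expression in Number arithmetic. The result, 2047 * 2^53,
// happens to be exactly representable even though it is far above
// Number.MAX_SAFE_INTEGER.
const numberCount = 2 ** 64 - 2 ** 53;
const coincide = BigInt(numberCount) === exactCount; // true
```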

@jmdyck
Collaborator

jmdyck commented Mar 11, 2018

In the current spec, there are things that are clearly Number values, and things that are clearly mathematical values, and a lot of things where it's unclear one way or the other.

As far as the unclear things are concerned, it looks like this PR makes them all Number values (other than a few that it explicitly makes mathematical). I think I'd prefer a solution in which most of them become mathematical values.

The main reason for my preference is that Number is a complicated type. Its value space is non-uniform (NaN, 2 infinities, 2 zeroes, normalized and denormalized values), and the semantics of its operations involve rounding and exceptions, defined by an external (payware) spec. So applying Number semantics where it isn't needed (i.e., where mathematical reals suffice) is an unnecessary complication.

@littledan, you refer to a subset of Number (which I'll call 'safe') for which the Number semantics are the same as for the corresponding mathematical values. This PR appears to be asserting that all of the spec's unclear cases involve the 'safe' subset. Is that correct? Have you checked?


To put it another way, every time the spec manipulates a Number value, it should be true that either:
(a) the values and operations involved are in the 'safe' subset, or
(b) they aren't (always) safe, but the IEEE 754 semantics are precisely what the spec wants at that point.

I'm worried that:

(1) If we decree that the unclear cases are Numbers without confirming that they're safe, we could be changing the spec to say things that we don't intend, which could come back to bite us.

(2) For future changes/additions to the spec, people will probably use unprefixed numbers/operators, perhaps without confirming that the above is true.

(3) Even if we convince ourselves that a given situation is 'safe', if we don't mark it as such (somehow), then we're requiring implementors to use IEEE 754 semantics unless they too can convince themselves that it's safe. (I'm assuming that at least some implementors would like to know when they're obliged to use IEEE 754 double-precision floating point semantics, and when they're at liberty to use something else.)

@claudepache
Contributor

From the PR

> A subscript is used following each numeric value and operation. For brevity, the 𝕗 subscript can be omitted on Number values--a numeric value with no subscript is interpreted to be a Number.

I think that “no subscript means Number” is the wrong default, even if we intend to use Number values more often than mathematical values. If some default is to be used, no subscript should mean “mathematical value” (although I think we should leave it as “implied by the context” until we have reviewed and clarified all occurrences).

The reason is that in general prose (outside the spec) a bare number represents more often what corresponds to a “mathematical value” than an instance of some IEEE-defined type. In some specialised context (e.g., computer code), it may refer more often to some IEEE-defined type, but even in those cases, when it is important to make the distinction between the two notions, I think it is customary to use explicit special marking for the specialised meaning (e.g., by rendering the code with fixed-width font).

Some examples from the spec where a bare number is clearly a mathematical notion and where it would be strange to use some subscript:

  • “a String of length 2” / “a List of length 1”
  • “The other 18437736874454810624 (that is, 2<sup>64</sup>-2<sup>53</sup>) values are called the finite numbers.”

Maybe you could argue that those occurrences are more English prose than ECMASpeak code and are thus outside the scope of this convention; but I don’t think it is a good idea to introduce a discrepancy between English and ECMASpeak.

An example where a bare number represents clearly a Number value and where it seems more appropriate to use an explicit annotation rather than relying on default if we want to be über-precise:

  • “The value of Number.MAX_SAFE_INTEGER is 9007199254740991 (2<sup>53</sup>-1).”

@littledan
Member Author

@claudepache The goal of this approach is specifically to minimize the use of real numbers in everyday algorithms in the ES specification. Real numbers seem like a funny abstraction here to base the definition of a programming language on--most of them are uncomputable, for one.

I can understand how, abstractly, it's nice if numbers have their ordinary, mathematical meaning too--I think JavaScript programmers start with the same intuition as well, which is why floating point inaccuracy can be confusing. But in both cases, we need something well-defined which can be evaluated efficiently on computers, and arbitrary real numbers don't really meet that goal in general.

The previous approach was to use mathematical values by default and rely on implicit conversions to Numbers, but this was ambiguous. The goal of this PR is to be explicit about all conversions, try to default to Number whenever possible, and avoid excessive notation overhead.

Do you see any cases which need a subscript for accuracy? I tried to put subscripts everywhere where required, but I might've missed some cases. (Seems like @jmdyck filled in a few in a PR; reviewing now).

Note that, with the BigInt feature, we'll have another numeric type that we can use internally in specification algorithms. It would be fine to add this type before it's exposed to JavaScript, but I haven't done that here as I didn't see any algorithms which would require it.

Bikeshedding: Should we actually use mathy R for mathematical values, and then mathy Z for BigInts, in spec-land?

@jmdyck
Collaborator

jmdyck commented Mar 13, 2018

What if we defined the 'safe' subset (where the semantics of Numbers and Reals agree), and then said that unsubscripted numbers/operators use the safe subset. That is, if some unsubscripted operation would yield different results depending on whether it's interpreted as Numbers or Reals, then that's a spec bug.

Also, I'm wondering if the safe subset is (or needs to be) more than just the range of safe integers. I.e., are there unclear cases where the value(s) in question aren't integers? I'm not finding many.
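
A few JavaScript data points on where Number and real arithmetic agree, offered only as illustration of the question and not as a definition of the safe subset:

```javascript
// Dyadic fractions (sums of powers of two) are exactly representable,
// so Number arithmetic on them matches real arithmetic here.
const dyadicOk = 0.5 + 0.25 === 0.75; // true

// 0.1, 0.2 and 0.3 are not exactly representable, so the Number result
// differs from what real arithmetic would give.
const decimalOk = 0.1 + 0.2 === 0.3; // false

// Integers stay exact only up to Number.MAX_SAFE_INTEGER (2^53 - 1):
// beyond it, distinct real results collapse to the same Number.
const collapses =
  Number.MAX_SAFE_INTEGER + 1 === Number.MAX_SAFE_INTEGER + 2; // true
```

So some non-integers behave "safely" under particular operations, but any general definition beyond the safe integers would need care.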

@littledan
Member Author

@jmdyck That's a great idea. It will be useful to call out those places where they differ explicitly. I think I'll redo this patch in those terms.

@littledan
Member Author

@domenic

> One initial comment, which may prove unfounded after a more thorough review, is that you define a decent number of shorthands, but then use the longhands as well, which seems a bit wasteful. Similarly, some of the genericness seems excessive, e.g. integer<sub>_t_</sub> is only used in one place, and I'm not sure yet why abs, floor, and friends need to operate on mathematical values.

That's a really good point. I'll switch to using the shorthand everywhere; the links that we have now should make it clear what's being referred to.

About the unused generic-ness: I think it'll be used a little more when BigInt comes in, but this is a good point: for now, I'll remove it in favor of defining these as functions on Numbers; we can consider adding it back in later.

@domenic
Member

domenic commented Mar 13, 2018

Still haven't gotten around to a full review, but I want to disagree with the above folks pushing for mathematical values as a default, and agree strongly with your reasoning for Number as the default. (Most mathematical values being uncomputable is my favorite.) I think explicitly noting a safe subset where Numbers convert losslessly to mathematical values is fine, but still all values should have a type, and that type should default to Number. And ideally all numbers should be notated one way or the other, or at least use the generic variant.

@littledan
Member Author

> I think explicitly noting a safe subset where Numbers convert losslessly to mathematical values is fine, but still all values should have a type, and that type should default to Number.

Are you saying you don't like the idea of most values noted in the specification being in a sort of superposition between Number and mathematical value? The thing I liked about that idea is that it makes it obvious where floating point lossiness might occur. This may be useful, for example, if someone is trying to understand whether it's valid to implement an algorithm in terms of integers rather than floating point values.

@littledan
Member Author

Pushed a new patch to try to embody @claudepache and @jmdyck's suggestion. Mixed feelings.

Editorial: Add a "safe integer" numerical kind

This patch makes three numeric kinds: safe integers, Numbers and
mathematical values. It also makes the subscript for operations always
implicit--if at least one operand is a Number or mathematical value,
then it is an operation outputting that kind, otherwise it's an operation
on safe integers which implicitly contains an assertion that the output
is a safe integer. This phrasing has significantly fewer subscripts
while (implicitly) including more assertions that semantics aren't
changing.

Overall, I'm not sure if this is the best way to go. Because operations
don't have subscripts, random constants in the expression are enlisted
to have subscripts to force the output into a certain kind. Overall, it
seems like it takes more close reading to figure out what's going on.
I'm wondering if it would be better to require subscripts on all
non-"safe integer" operations. This would require, though, that we use
some different notation for exponentiation, as there's currently no
place to put the subscript.
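
A sketch of what "an operation on safe integers which implicitly contains an assertion" could look like if written out, assuming a hypothetical helper `safeAdd` (the patch itself expresses this through notation, not code):

```javascript
// Addition on the "safe integer" kind: both operands and the result
// must be safe integers, otherwise the spec text would be in error.
function safeAdd(a, b) {
  if (!Number.isSafeInteger(a) || !Number.isSafeInteger(b)) {
    throw new RangeError("operands must be safe integers");
  }
  const result = a + b;
  if (!Number.isSafeInteger(result)) {
    throw new RangeError("result left the safe integer range");
  }
  return result;
}
```

Making the assertion explicit like this is what costs the extra checks on incoming JavaScript values that the later comments weigh up.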

@domenic
Member

domenic commented May 1, 2018

I still much prefer the two-type version :(. Mathematical values have no place in the specification of a language implemented by computers, IMO.

@littledan
Member Author

littledan commented May 1, 2018

By two types, you mean Number and BigInt, and avoid using mathematical values at all? A couple places where this gets a little difficult:

  • Calculations which sort of span Number and large integer (e.g., converting Numbers to and from strings) are a little more awkward
  • A few places where we can be just a little bit sloppy about overflow, such as the definition of %
  • Unclear how to word definitions such as Math.PI being the Number value of pi.

I get how a computer program makes do in these situations, by doing funny calculations to avoid overflow, but it seems like such calculations would be a distraction when trying to read a specification.
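
One way to see the Number/large-integer span in string conversion, as an illustration in plain JavaScript (BigInt plays the role of the large exact integer):

```javascript
const n = 2 ** 70; // exactly representable as a Number

// Number-to-string picks a short form that round-trips, here in
// exponent notation.
const shortest = String(n);

// The exact decimal expansion needs integer arithmetic well beyond
// the safe integer range.
const exactDigits = BigInt(n).toString(); // "1180591620717411303424"
```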

@domenic
Member

domenic commented May 1, 2018

Yeah, those two types.

> Calculations which sort of span Number and large integer (e.g., converting Numbers to and from strings) are a little more awkward

How do you think a computer would perform these calculations? I'd suggest e.g. using Numbers representing the Unicode code point values.

> Unclear how to word definitions such as Math.PI being the Number value of pi.

Looking at this definition I don't see where it requires mathematical integers. It requires some concept of "π, the ratio of the circumference of a circle to its diameter", and then it names a specific approximate Number value, but it doesn't have some kind of Gödelian uncomputable existence proof of pi's value that would require a non-Number mathematical value.

@littledan
Member Author

How do you think a computer would perform these calculations? I'd suggest e.g. using Numbers representing the Unicode code point values.

As others have said above, Numbers are pretty specific and complicated. We don't actually want these to be outside of the set of integers, for example. By using a "safe integer" spec type, we assert that full floating point semantics are not needed to model them.

@domenic
Member

domenic commented May 1, 2018

I don't think that assertion is valuable, given the baggage of an entire third numeric type that it brings along.

@littledan
Member Author

The most recent patch I uploaded in the series, c3c0838, doesn't seem complete: there are a bunch of places which are a bit sloppy in mixing safe integers and Numbers. In general, fixing this would seem to require adding a whole bunch of assertions/casts that JavaScript values which come in are safe integers. I'm not sure if all of those assertions pay for themselves, as @domenic is saying.
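As a rough picture of what each such assertion/cast amounts to, here is a sketch of a hypothetical helper (`toSafeIntegerValue` is an illustrative name, not something from the patch). It checks that an incoming value is a safe integer and returns its exact integer value.

```javascript
// Hypothetical helper: assert that an incoming JavaScript value is a safe
// integer, then hand back its exact integer value (as a BigInt here).
function toSafeIntegerValue(x) {
  if (typeof x !== "number" || !Number.isSafeInteger(x)) {
    throw new TypeError(`expected a safe integer, got ${String(x)}`);
  }
  return BigInt(x);
}

console.log(toSafeIntegerValue(42));
```

Every boundary between JavaScript values and the safe-integer type would need a step of this shape, which is the cost being weighed.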

@claudepache @jmdyck Where would you recommend to go from here? It seems you were skeptical of switching everything to Number, but I'm not sure what the alternative is.

@jmdyck
Collaborator

jmdyck commented May 1, 2018

I still much prefer the two-type version. [Number and BigInt, no mathematical values]

Does this actually exist? I.e., is there a (proposed) version of the spec that has no use of mathematical values?

@littledan
Member Author

littledan commented May 1, 2018

@jmdyck I'm thinking of going back to the Number/mathematical value version. I haven't seen a purely Number/BigInt version. You made some critical comments about that version above; I'm wondering what you think our next steps should be.

littledan added a commit to littledan/ecma402 that referenced this pull request May 1, 2018
This patch brings Intl.NumberFormat support to BigInt, and
adds a BigInt.prototype.toLocaleString method based on it.

The design here is to include overloading between BigInt and Number
as arguments for the format and formatToParts methods based on
ToNumeric. This means that, for example, string arguments are
cast to Number, rather than BigInt. This design preserves
compatibility and consistency with operators like unary -

This definition permits options in the NumberFormat to force
decimal places, e.g., 1n formatting as 1.00000 if the minimum
fractional digits is 5. Alternative semantics would be to
throw an exception in this case.

For the algorithm text itself: the specification algorithms
ToRawPrecision and ToRawFixed are now used for both Numbers
and BigInts. Given the ECMAScript specification's use of implicit
coercions between Numbers and mathematical values, I believe
that this is valid without any special changes; the phrasing
may change in the future [1].

ICU4C-based implementations of ECMAScript can use
LocalizedNumberFormatter::formatDecimal [2] or
unum_formatDecimal [3] to implement the algorithms in this patch.

[1] tc39/ecma262#1135
[2] http://icu-project.org/apiref/icu4c/classicu_1_1number_1_1LocalizedNumberFormatter.html#a29cd3d107b784496e19175ce0115f26f
[3] http://icu-project.org/apiref/icu4c/unum_8h.html#a59870a322f012dc1b9d99cf8a7b708f1

Closes tc39#218
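
The forced-decimal-places behavior described in the commit message above can be tried with a short sketch. It assumes an engine with BigInt support in Intl.NumberFormat and uses the "en" locale; the variable names are illustrative.

```javascript
// A BigInt formatted with a minimum number of fraction digits still gets
// the decimal places, just as a Number would.
const formatter = new Intl.NumberFormat("en", { minimumFractionDigits: 5 });

const formattedBigInt = formatter.format(1n);
const formattedNumber = formatter.format(1);

console.log(formattedBigInt, formattedNumber);
```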
@littledan
Member Author

OK, I reverted to the previous Number/mathematical value text, just taking in @jmdyck's changes and renaming the subscripts. If you want to see a few half-finished crazy ideas, check out this branch.

@caiolima
Contributor

IMO, this change is good. It will make the spec very clear about which numeric type we should use.

@caiolima
Contributor

caiolima commented Jun 6, 2019

@zenparsing @ljharb FYI, I rebased this PR. It needs to land before we land #1515.

@caiolima
Contributor

@zenparsing ping

@claudepache
Contributor

The patch introduces the character ℝ in spec.html, but that file contains the declaration
<meta charset="ascii"> at line 2, which will make my text editor unhappy. I suggest replacing it with
<meta charset="utf-8">.

leobalter pushed a commit to tc39/ecma402 that referenced this pull request Jun 13, 2019
@ljharb ljharb self-assigned this Jun 26, 2019
ljharb pushed a commit to littledan/ecma262 that referenced this pull request Jun 26, 2019
This patch changes math in the ECMAScript specification to be based on
a concrete division into mathematical values and Numbers. All
numeric values and operations are explicitly either one or the other
with this patch. Previously, math took place in mathematical values,
based on a system of implicit conversions between these values and
Numbers.

Closes tc39/proposal-bigint#10

Additional commits:
 - Markup: fix 2 well-formedness errors
 - "numeric" -> "mathematical"
     You set up the dichotomy between Numbers and mathematicals, surely the f/v distinction is the same dichotomy.
 - in "the number value that is", change "number" -> "Number"
 - delete "the Number value of"
     Elsewhere, you changed "the Number value of the Element Size" to just "the Element Size". Here's a couple you missed.
 - remove space between operator and subscript
 - insert "the mathematical value of" in a few places
 - insert "<sub>v</sub>" in a few places
 - Re-work a step.
     One problem is that
         the Number value of X &times; Y
     is ambiguous -- it could mean:
         (the Number value of X) &times; Y
     or:
         the Number value of (X &times; Y)

     Another problem is that _n_ is mathematical, so in 10<sup>_n_</sup>, 10 should be mathematical too.
     And then, depending on how the above ambiguity resolves, you'd have to either convert that to a Number or change &times; to a mathematical operator.

     It's simpler (I think) to do all the arithmetic in the math realm, and then convert just the result to a Number.

     (I'm assuming that 'plus' and 'times' as *words* are mathematical operators.)
 - fix some typos in the definition of "integer"
 - Editorial: Switch from f and v to F and R
 - Editorial: Specify that general phrases mean Number
 - Editorial: "Number value of" -> "Number value for"
 - Conversion of mathematical values to Number values is already described in "The Number Type", and is denoted by the phrase "the Number value for X".
 - Editorial: insert "the Number value for" in a few spots (no implicit conversions!)
 - Editorial: reword "the number whose value is MV of |NumericLiteral|"
     The phrase:
         "the number whose value is MV of |NumericLiteral|"
     could be interpreted as:
         "the Number value for MV of |NumericLiteral|"
     but this would be incorrect, because "the Number value for" (6.1.6) doesn't completely describe the rounding that is applied to the MV when obtaining the Number represented by |NumericLiteral|.

     Instead, change to:
         "the Number value represented by |NumericLiteral|"
     which leaves MV out of it, and so refers to 11.8.3's whole process for arriving at a Number value.
 - Fixing some nits on equations
 - Fixing some issues on text and removing redundant subscripts from operations of some equations
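
The ambiguity described under "Re-work a step" in the commit list above can be seen numerically with a minimal sketch. It assumes a hypothetical exact rational representation (`{ num, den }` with BigInt fields) as a stand-in for mathematical values; `numberValueFor` is an illustrative name that only handles small operands.

```javascript
// Hypothetical exact rationals standing in for mathematical values.
const x = { num: 1n, den: 10n }; // mathematical value 1/10
const y = { num: 3n, den: 1n };  // mathematical value 3

// Convert a rational to a Number. For these small operands both fields are
// exactly representable, so one correctly rounded division is enough.
function numberValueFor(r) {
  return Number(r.num) / Number(r.den);
}

// Reading 1: (the Number value of X) × Y, multiplying Numbers.
const roundEarly = numberValueFor(x) * numberValueFor(y);

// Reading 2: the Number value of (X × Y), multiplying in the math realm
// and converting only the result.
const product = { num: x.num * y.num, den: x.den * y.den };
const roundLate = numberValueFor(product);

console.log(roundEarly, roundLate, roundEarly === roundLate);
```

The two readings give different Numbers, which is why the reworked step does all the arithmetic on mathematical values and converts only once at the end.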
ljharb pushed a commit to littledan/ecma262 that referenced this pull request Jun 26, 2019
littledan and others added 2 commits July 3, 2019 23:13
@ljharb ljharb merged commit dc1e21c into tc39:master Jul 4, 2019
jmdyck added a commit to jmdyck/ecma262 that referenced this pull request Jul 30, 2019
caiolima pushed a commit to caiolima/ecma262 that referenced this pull request Aug 15, 2019
caiolima pushed a commit to caiolima/ecma262 that referenced this pull request Aug 30, 2019
caiolima pushed a commit to caiolima/ecma262 that referenced this pull request Sep 17, 2019
Development

Successfully merging this pull request may close these issues.

Editorial: How should the final spec deal with mathematical values, Numbers and Integers?
8 participants