spec: floating point numbers #113

Closed
alandonovan opened this issue Oct 2, 2020 · 7 comments

@alandonovan
Contributor

alandonovan commented Oct 2, 2020

This proposal argues for Starlark support for floating point numbers.

The Starlark implementations in Go and Java both support bigint (multiprecision) integers, allowing them to represent and manipulate all kinds of signed and unsigned numbers, ranging from 8 bits to 64, that appear in protocol messages and other binary interchange formats. Proposal #112 argues for a byte string data type. The only remaining type required to support all values in protocol messages is floating-point numbers.

I propose that we support:

  • floating-point numeric literals of the form -1.23e+45, following Python.
  • a new data type, float, capable of representing all IEEE-754 double-precision (float64) values without loss.
  • arithmetic operations: unary + and -; binary +, -, *, /, //, and %, over float x float.
  • implicit conversion of int to float in binary operations of the form float x int and int x float.
  • support for %e, %f, and %g in string formatting (str % tuple).
  • total ordering <= of (non-NaN) float and integer values according to mathematical tradition.
  • float(int) and int(float) conversions.
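As a rough illustration of the list above, here is how the same operations behave in Python, which the proposal follows for literal syntax. This is a sketch in Python, not Starlark, and it assumes Starlark would match Python on each point shown; in particular, the truncating behavior of int(float) is Python's and is only an assumption for Starlark.

```python
# Python stand-in for the proposed semantics; not Starlark itself.
x = -1.23e+45                      # floating-point literal
assert isinstance(x, float)

# float x int and int x float: the int operand is converted to float.
assert 1.5 + 2 == 3.5
assert isinstance(2 * 0.5, float)

# / is true division; // and % use floored division.
assert 7.0 / 2 == 3.5
assert 7.0 // 2 == 3.0
assert 7.0 % 2 == 1.0

# %e, %f, and %g in str % tuple.
formatted = "%e %f %g" % (1234.5, 1234.5, 1234.5)

# Ordering across int and float follows the mathematical values.
assert 1 < 1.5 < 2

# Explicit conversions (Python truncates toward zero in int(float)).
assert float(3) == 3.0
assert int(3.9) == 3
assert int(-3.9) == -3
```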

All of this is already implemented in go.starlark.net. See https://github.com/google/starlark-go/blob/master/doc/spec.md#floating-point-numbers for details.

On the equivalence of 1 and 1.0: the Go implementation uses mathematical equivalence for == and hashing. Until recently Java used j.l.Integer for its integers, so that class determined the hash and equivalence relations, and Integers are never equal to Floats. However, as of last week, the Java implementation uses its own bigint type, so we are free to define our own float type and have the two types collaborate on hash and equals.
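Python already behaves the way the Go implementation is described here, so it can serve as a small sketch of what "collaborating on hash and equals" means in practice. The snippet is Python, not Starlark, and is offered only as an analogy.

```python
# Mathematical equivalence: equal values must also hash equally.
assert 1 == 1.0
assert hash(1) == hash(1.0)

# Consequently 1 and 1.0 are the same dict key.
d = {1: "int"}
d[1.0] = "float"          # overwrites the value stored under key 1
assert len(d) == 1
assert d[1] == "float"

# Equality is exact, not "equal after conversion to float":
# 2**53 + 1 is not representable as a double, so the comparison is False.
big = 2**53 + 1
assert big != float(big)
```

The last assertion is the subtle part: comparing by first converting the int to float would wrongly report equality for large integers.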

On ordering and NaN: in the Go implementation, all ordering is expressed in terms of <= operations, so NaN <= x and x <= NaN are both false, and this does not imply NaN == x. However, Java's Comparator mechanism uses three-way comparison, like Python's cmp, and cannot accommodate NaN. One possibility is to stop using Java's comparator and comparable, or perhaps expose it to Java but don't use it in the Starlark operators <= and sorted. After all, Comparator doesn't work the same way as Java's own <= operator w.r.t. NaN.
[UPDATE: the proposed spec change orders all floats, with NaN == NaN and NaN > +Inf.]
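To make the contrast concrete, the sketch below first shows the IEEE-754 behavior (every <= involving NaN is false) and then a hypothetical sort key, float_order_key, that yields the total order mentioned in the update: all NaNs compare equal and sort above +Inf. The helper name and the tuple encoding are illustrative assumptions, not taken from any Starlark implementation.

```python
import math

nan = float("nan")

# IEEE-754 comparison: NaN is unordered with respect to everything.
assert not (nan <= 1.0)
assert not (1.0 <= nan)

def float_order_key(x):
    """Hypothetical key giving a total order: NaN == NaN and NaN > +Inf."""
    x = float(x)  # note: very large ints would overflow here
    if math.isnan(x):
        return (1, 0.0)   # every NaN maps to the same, largest key
    return (0, x)

values = [nan, 1.0, float("inf"), -2, float("-inf"), 0.5]
ordered = sorted(values, key=float_order_key)
```

A three-way comparator built on such a key is well defined for every pair of inputs, which is what Java's Comparator contract requires.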

On rounding: past discussions have raised the specter of hardware floating-point operations revealing greater precision than IEEE-754 double-precision for intermediate results in e.g. x+y*z. I do not believe this is a real concern for Starlark. First, it's easy to enable strictfp arithmetic in the Java implementation. But more importantly, this is a Python-like language we are talking about, and there is so much loading and storing into the heap, and our interpreters are so unsophisticated, that there is no danger of a Starlark expression x+y*z being reduced to two ALU instructions on three registers. (We can dream...)

@ulfjack

ulfjack commented Oct 2, 2020

The list of arithmetic operators is missing multiplication *. It might also be useful to define // and % operators.

@alandonovan
Contributor Author

Thanks Ulf, I added *. The other two operators are already there; perhaps I misunderstand what you mean?

@brandjon
Member

brandjon commented Oct 2, 2020

  • I notice Python allows float(), int(), and bool() to be called with no arguments and produce the zero value for those types. But the Java implementation only allows you to do that for bool. Is that a bug?

  • In Python 3, you can put float('nan') in a dict key. You can also retrieve it if it's the same identical nan object, but not if it's a separately constructed one. What behavior should we specify in Starlark? If nan is the first non-reflexive value to be added to Starlark, does that invalidate a bunch of identity comparison shortcuts?

  • See sys.float_repr_style. Apparently there are multiple ways to stringify floats, so we should be careful to choose the one we like now and avoid a backwards-incompatible repr change later on.

  • A consequence of numeric equality is that [1, 2] == [1.0, 2.0], and they hash the same as well (not that the user can observe the hash directly). I take it this has no negative consequences for us?

  • Will we match Python's behavior for coercing floats to ints? Will we add support for ceil/floor/round? (We also have no abs function.)
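The NaN and numeric-equality questions above can be reproduced directly in Python 3. The sketch below shows only Python's behavior; whether Starlark should copy it is exactly the open question in this comment.

```python
nan = float("nan")
other = float("nan")      # a separately constructed NaN object

# NaN is non-reflexive under ==.
assert nan != nan

# Dict lookup checks identity before equality, so the identical
# object is found but a different NaN object is not.
d = {nan: "first"}
assert d[nan] == "first"
assert other not in d

# The same identity shortcut is visible in list comparison.
assert [nan] == [nan]
assert [nan] != [other]

# Numeric equality carries over to containers and their hashes.
assert [1, 2] == [1.0, 2.0]
assert hash((1, 2)) == hash((1.0, 2.0))
```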

> support for %e, %f, and %g in string formatting (str % tuple).

Depending on how far down the compatible-with-Python rabbit hole we want to go, it can be enormously complex to get all the formatting details right, especially across two different formatting systems (% operator and str.format()). See for instance Python's Format Specification Mini-Language, which applies to str.format() but not %.
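For a sense of the gap between the two systems, here is a small Python comparison. The conversions shared by both systems agree, while the last two specs exist only in the Format Specification Mini-Language. This shows Python's behavior only and makes no claim about what Starlark should adopt.

```python
value = 1234.5678

# Conversions available in both systems give identical results.
percent_style = "%.3g|%8.2f|%e" % (value, value, value)
format_style = "{:.3g}|{:8.2f}|{:e}".format(value, value, value)
assert percent_style == format_style

# Features that exist only in str.format()'s mini-language.
grouped = "{:,.2f}".format(1234567.891)   # thousands separator
as_percent = "{:.1%}".format(0.25)        # percentage presentation type
```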

> capable of representing all IEEE-754 double-precision (float64) values [... and discussion on rounding]

Even if we'll behave as IEEE-754 in practice, do we want to put that in the spec or leave it implementation-defined?

@ulfjack

ulfjack commented Oct 5, 2020

I meant: specify the semantics of // and % operators; I'm not sure it's completely clear what their meaning is for floating-point numbers, and there's not even a common definition for integers.

@brandjon
Member

brandjon commented Oct 5, 2020

The source code (now in StarlarkInt.java) has this link explaining Python's rationale. (I tried to come up with a concise English description, but it's pretty hard to not involve extra vars, or else my arithmetic vocabulary isn't good enough.) In any case, this definition stays the same for floats.
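Python's definition can be stated with two properties: the quotient is rounded toward negative infinity, and a nonzero remainder takes the sign of the divisor, so that x == (x // y) * y + (x % y). The sketch below checks both properties in Python for ints and floats, using values that are exactly representable so the identity holds without rounding error. It assumes, as the comment above says, that Starlark keeps the same definition for floats.

```python
cases = [(7, 2), (-7, 2), (7, -2), (7.5, 2), (-7.5, 2), (7.5, -2)]
results = [(x // y, x % y) for x, y in cases]

for (x, y), (q, r) in zip(cases, results):
    # The quotient and remainder reconstruct the dividend.
    assert q * y + r == x
    # A nonzero remainder has the same sign as the divisor.
    assert r == 0 or (r < 0) == (y < 0)
    # The remainder is smaller in magnitude than the divisor.
    assert abs(r) < abs(y)
```

Languages such as C and Java truncate toward zero instead, which gives -7 / 2 == -3 with remainder -1; this is the lack of a common definition mentioned earlier in the thread.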

adonovan added a commit to adonovan/starlark-spec-fork that referenced this issue Oct 16, 2020
Fixes bazelbuild#113

Change-Id: I33d7fb9f7b6e4227261ad484a97fd1c998c6355a
@alandonovan
Contributor Author

@ulfjack, would really appreciate your review of adonovan@265dd04 as a rare expert on both Starlark and floating-point.

@alandonovan
Contributor Author

@b5, would also appreciate your review as a user of Starlark for database queries. (I presume some of them contain floating point numbers.) Note that this spec would govern the Go implementation too, so changes are in the pipeline.

adonovan added a commit to adonovan/starlark-spec-fork that referenced this issue Oct 22, 2020
Fixes bazelbuild#113

Change-Id: I912998a190ea11c5455e6753d55e0948e80e5da1
adonovan added a commit to adonovan/starlark-spec-fork that referenced this issue Nov 2, 2020
Fixes bazelbuild#113

Change-Id: I912998a190ea11c5455e6753d55e0948e80e5da1
adonovan added a commit to adonovan/starlark-spec-fork that referenced this issue Nov 3, 2020
Fixes bazelbuild#113

Change-Id: I912998a190ea11c5455e6753d55e0948e80e5da1
adonovan added a commit to adonovan/starlark-spec-fork that referenced this issue Nov 3, 2020
Fixes bazelbuild#113

Change-Id: I912998a190ea11c5455e6753d55e0948e80e5da1
adonovan added a commit to adonovan/starlark-spec-fork that referenced this issue Nov 3, 2020
Fixes bazelbuild#113

Change-Id: I912998a190ea11c5455e6753d55e0948e80e5da1
adonovan added a commit to adonovan/starlark-spec-fork that referenced this issue Nov 10, 2020
Fixes bazelbuild#113

Change-Id: I612221610ac6e8a63c14fef2edfe5b0c72c3990c
adonovan added a commit to adonovan/starlark-spec-fork that referenced this issue Nov 11, 2020
Fixes bazelbuild#113

Change-Id: Id04ce3ce07b528c6bccb6143189da2870bb22d05
adonovan added a commit to adonovan/starlark-spec-fork that referenced this issue Nov 16, 2020
Fixes bazelbuild#113

Change-Id: Iaa676da8cd6c3142537b5bc5a654490a4a1f0887

3 participants