Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

spec: add floating-point numbers #119

Merged
merged 2 commits into from
Nov 16, 2020
Merged

Conversation

alandonovan
Copy link
Contributor

No description provided.

Copy link
Member

@brandjon brandjon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's my comments, but I don't think we should approve until we have community feedback.

spec.md Show resolved Hide resolved
spec.md Outdated Show resolved Hide resolved
spec.md Show resolved Hide resolved
spec.md Outdated Show resolved Hide resolved
spec.md Show resolved Hide resolved
spec.md Show resolved Hide resolved
spec.md Show resolved Hide resolved
spec.md Show resolved Hide resolved
spec.md Show resolved Hide resolved
spec.md Show resolved Hide resolved
@laurentlb
Copy link
Contributor

laurentlb commented Oct 15, 2020

A few thoughts on this

  • nan == nan

That's interesting, because it removes the problems that appear when an object is not equal to itself (e.g. when used in a set or as a dict key).

  • float < float is a transitive relation

This is nice, as it removes lots of weird cases.

  • nan > inf

That seems to be a natural consequence of the previous choices, but this remains surprising. IEEE operations return nan when a computation is dubious. It seems to me that the alternative would be to raise an error (in IEEE 754, it seems that they use the nan value each time they would like to throw an exception, but they can't). In my experience, it's rare to have an intentional nan inside a computation. It's most likely a mistake, that should ideally be reported earlier.

Since nan is "not a number", maybe this value shouldn't support operators that expect numbers.

  • implicit conversion from int to float

Lots of languages do that, but it's worth discussing it. Languages like C typically convert types to the "bigger" type. In Starlark, int are technically larger than floats. Of course, an implicit conversion to int would be even more surprising. Have you considered raising an error here and requiring explicit conversions? Implicit conversions often make type inference more complicated, for example.

In the past, we've often preferred raising errors in cases where no design was obvious. This allowed us to change the behavior later, once we have more data and user feedback (without breaking compatibility).

@brandjon
Copy link
Member

If you make it illegal to do regular arithmetic operations on NaN, you're really just deferring an error that should've been raised at the time the NaN was generated. I think the choice should be to either allow NaN arithmetic (including ordering), or eliminate NaN entirely.

The existence of NaN indicates a design philosophy that errors should not be show stoppers. It makes sense when you consider large matrix operations -- do you really want to guard every pixel's transformation with if statements, or replace bad pixels with dummy values? (Of course, that line of reasoning says that all operations should be defined, including division by zero, for which Python raises an exception.)

Ultimately, I think our motivation for adding floats to Starlark is configuration (e.g. possibly in rule attributes) and data interchange (protos, json), not numerical computation. This is why it makes sense to me to treat NaN as a singleton value (with nan == nan), rather than as a placeholder for "don't trust this computation" (with nan != nan). You want to test if the item you serialize to a file is NaN, not whether the computation of a/b == c - d where a and b are 0 and c and d are infinity. Likewise, you probably want to sort a bunch of values for dumping to a file, not necessarily to numerical walk them in a mathematically meaningful way.

The numerical use case could be handled by libraries and not in a first-class way. And you'd need an application-defined custom float type if you wanted to support overflowing to infinity without exception. And even then you still couldn't implement NaN unorderability because we want to require that even application-defined types use strict-weak orders (if they have ordering at all).

@laurentlb
Copy link
Contributor

either allow NaN arithmetic (including ordering), or eliminate NaN entirely.

There's a middle ground where you can explicitly create the nan value and pass it to functions (e.g. application-specific functions, storing the value in a proto...), but it's not part of arithmetic operations (neither an input nor an output).

Most Starlark types are not comparable. NaN feels like None to me (if we view it as a singleton value), comparisons don't make sense. If you do need to put all the NaNs at the beginning/end of the list, it's trivial to pass your own key function in the call to sorted.

@brandjon
Copy link
Member

Well now I really don't understand github's commenting model. You force-pushed, which usually destroys comment history by disassociating it from the code, but it appears to still be there. Maybe it does some diff cleverness. In any case, I suggest being wary of force-pushing again with many unresolved comments.

@adonovan adonovan force-pushed the spec-float branch 2 times, most recently from 265dd04 to b34d5d6 Compare October 22, 2020 22:09
spec.md Outdated Show resolved Hide resolved
spec.md Outdated Show resolved Hide resolved
@brandjon
Copy link
Member

I made another pass to refresh my memory and close resolved comment threads. Let me know when I should look again -- and watch out for force pushes that may delink comments.

@alandonovan
Copy link
Contributor Author

either allow NaN arithmetic (including ordering), or eliminate NaN entirely.

There's a middle ground where you can explicitly create the nan value and pass it to functions (e.g. application-specific functions, storing the value in a proto...), but it's not part of arithmetic operations (neither an input nor an output).

The point of NaN is much like the point of U+FFFD in Unicode: it provides a predictable way to deal with the result of a bad operation on one element, or a single screwed up value, that is an inevitable possibility of the data type's representation. This approach makes tools more reliable, especially when strings or numbers coming from outside the program have occasional errors.

adonovan added a commit to google/starlark-go that referenced this pull request Nov 2, 2020
This change makes go.starlark.net's floating-point implementation
match the proposed spec (see bazelbuild/starlark#119),
and thus much more closely match the behavior of the Java implementation.

The major changes are:
- Float values are totally ordered; NaN compares greater than +Inf.
- The string form of a finite float value always contains an exponent
  or a decimal point, so they are self-evidently not int values.
- Operations that would cause a large integer to become rounded to
  an infinite float are now an error.

The resolve.AllowFloat boolean, and the corresponding -float command-line
flag, now have no effect. Floating point support is always enabled.

Change-Id: I359e17910e06aba18ab216cd6511b56e8df005d5
adonovan added a commit to google/starlark-go that referenced this pull request Nov 2, 2020
This change makes go.starlark.net's floating-point implementation
match the proposed spec (see bazelbuild/starlark#119),
and thus much more closely match the behavior of the Java implementation.

The major changes are:
- Float values are totally ordered; NaN compares greater than +Inf.
- The string form of a finite float value always contains an exponent
  or a decimal point, so they are self-evidently not int values.
- Operations that would cause a large integer to become rounded to
  an infinite float are now an error.

The resolve.AllowFloat boolean, and the corresponding -float command-line
flag, now have no effect. Floating point support is always enabled.

Change-Id: I359e17910e06aba18ab216cd6511b56e8df005d5
@adonovan adonovan force-pushed the spec-float branch 3 times, most recently from 013ead3 to bddfa71 Compare November 3, 2020 00:39
adonovan added a commit to google/starlark-go that referenced this pull request Nov 10, 2020
This change makes go.starlark.net's floating-point implementation
match the proposed spec (see bazelbuild/starlark#119),
and thus much more closely match the behavior of the Java implementation.

The major changes are:
- Float values are totally ordered; NaN compares greater than +Inf.
- The string form of a finite float value always contains an exponent
  or a decimal point, so they are self-evidently not int values.
- Operations that would cause a large integer to become rounded to
  an infinite float are now an error.

The resolve.AllowFloat boolean, and the corresponding -float command-line
flag, now have no effect. Floating point support is always enabled.

Change-Id: Ia692f765f575cf0d13438906e70a3c29dac2ae59
@alandonovan
Copy link
Contributor Author

@brandjon, could you approve?

@alandonovan
Copy link
Contributor Author

I've addressed all the comments I overlooked before. (I still haven't got the hang of GH's code review system.)

Great comments, all of them; thanks. PTAL.

@brandjon
Copy link
Member

Left a few more replies. I think the only substantial issue remaining is whether we are specifying SWO as a requirement of all Starlark values. I think that it's reasonable to require this because it makes min/max/sorted make sense. Otherwise we either have to do a lot of work to produce exact specs for those (which we wanted to avoid, hence the tradeoffs we made for NaN in this spec); or else we have to be careful about exactly what the spec of those functions is in the presence of applicatoin-defined non-SWO types.

adonovan added a commit to google/starlark-go that referenced this pull request Nov 11, 2020
This change makes go.starlark.net's floating-point implementation
match the proposed spec (see bazelbuild/starlark#119),
and thus much more closely match the behavior of the Java implementation.

The major changes are:
- Float values are totally ordered; NaN compares greater than +Inf.
- The string form of a finite float value always contains an exponent
  or a decimal point, so they are self-evidently not int values.
- Operations that would cause a large integer to become rounded to
  an infinite float are now an error.

The resolve.AllowFloat boolean, and the corresponding -float command-line
flag, now have no effect. Floating point support is always enabled.

Change-Id: Ia692f765f575cf0d13438906e70a3c29dac2ae59
alandonovan pushed a commit to google/starlark-go that referenced this pull request Nov 11, 2020
This change makes go.starlark.net's floating-point implementation
match the proposed spec (see bazelbuild/starlark#119),
and thus much more closely match the behavior of the Java implementation.

The major changes are:
- Float values are totally ordered; NaN compares greater than +Inf.
- The string form of a finite float value always contains an exponent
  or a decimal point, so they are self-evidently not int values.
- Operations that would cause a large integer to become rounded to
  an infinite float are now an error.

The resolve.AllowFloat boolean, and the corresponding -float command-line
flag, now have no effect. Floating point support is always enabled.
Copy link
Contributor Author

@alandonovan alandonovan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the only substantial issue remaining is whether we are specifying SWO as a requirement of all Starlark values. I think that it's reasonable...

Alright, I've added the SWO requirement, as it must be true of all implementations.

spec.md Show resolved Hide resolved
@alandonovan
Copy link
Contributor Author

I saw only one addition comment (about ordering). Were there others?

@alandonovan
Copy link
Contributor Author

PTAL

@brandjon
Copy link
Member

Approved, thanks for bearing with me.

Nit on line 584:

float < float and its inverse (<=) are transitive relations

Did you mean >=?

Fixes bazelbuild#113

Change-Id: Iaa676da8cd6c3142537b5bc5a654490a4a1f0887
@alandonovan
Copy link
Contributor Author

Approved, thanks for bearing with me.

Nit on line 584:

float < float and its inverse (<=) are transitive relations

Did you mean >=?

Updated to mention all five transitive comparison operators.

Change-Id: I2585f7668ff5b0f382434b35206d51282a1eefba
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants