Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make float.__floordiv__ return an int #103445

Closed
Gouvernathor opened this issue Apr 11, 2023 · 11 comments
Closed

Make float.__floordiv__ return an int #103445

Gouvernathor opened this issue Apr 11, 2023 · 11 comments
Labels
type-feature A feature request or enhancement

Comments

@Gouvernathor
Copy link
Contributor

Feature or enhancement

int.__floordiv__ and float.__floordiv__ are the only two builtin implementations of the // operator.
This is to make float.__floordiv__ return an int, so that the // operator always returns an int when used on builtin types.

Pitch

Opens the result of a floor-division involving float to the int API, which includes all the int methods on the one part, and the ability to be used in range, slices, list/tuple multiplication... on another part.
Floor division was first described as a true division followed by a call to math.floor, which got broken in PEP 3141 which changed math.floor's return type but seemingly forgot to change float.__floordiv__'s as well. This puts them back together.
Read the PEP draft - though this may not require a PEP - for a more fuller and detailed rationale.

Previous discussion

Make float.__(i/r)floordiv__ return an int on Ideas.

@Gouvernathor Gouvernathor added the type-feature A feature request or enhancement label Apr 11, 2023
@JelleZijlstra
Copy link
Member

Several core developers on the linked Discourse thread were skeptical about this proposal for various technical reasons. We shouldn't make this change without carefully considering these objections, and that will probably need a PEP.

@Gouvernathor
Copy link
Contributor Author

I think that people definitely should read the Discourse thread to make up their mind.
I won't try to sum up the arguments of those opposing the proposition, as I wouldn't do so accurately enough, but you can read here how I recognize some of the drawbacks and consider them acceptable, as well as my position on just how much the existing behavior is to be considered "documented", here.
There are also the transition steps I described in the PEP draft, to be considered as well.

@mdickinson
Copy link
Member

Regardless of the desirability or not of an int result, one of the biggest blockers here is that the proposed change will break code, and no-one has yet proposed a workable backwards compatibility mechanism. Without a solution to the backwards compatibility problem, I don't see this going anywhere.

A critical part of any backwards compatibility mechanism would be a way for developers to opt-in to the new behaviour at a time of their choosing during the usual two-Python-version transition period. I just don't see a solution to this that doesn't involve either significant complexity in the Python core (__future__ imports, new bytecodes, possibly new special methods, etc.) or significant code churn in 3rd party code, or both. The complexity of any solution looks out of proportion to any benefit.

@Gouvernathor
Copy link
Contributor Author

Gouvernathor commented Apr 11, 2023

I don't think this requires any compatibility mechanism bigger than the non-togglable one I outlined in the document.
Could you define precisely what you mean by a "workable backwards compatibility mechanism", and why it would need to be bigger than this ?

When the random.seed function restricted the types it takes, and random.sample started rejecting sets and dicts, there was some transition period but no toggle between the old and the new behavior was introduced. It was the former one, right up until it was the new one. This is the most evident changes I could cite from my perspective, but there are surely dozens of other examples in various other parts of python.

I seriously doubt this change would break any more code than these changes to the random module did. Not to mention the deafening lack of documentation about this : nowhere in the doc is there any mention that float.__floordiv__ does return a float, there are vague warnings at best. There is even an ambiguous (if not plain mistaken) phrasing in the divmod doc implying that the first element, which is the same as the floordiv result, is "usually math.floor(a / b)", a.k.a an int ever since PEP 3141 was passed. There is another equally dubious phrasing in 3141 itself (see the Discourse thread for more details).

Why would an undocumented behavior with such little impact require such an extensive adaptative mechanism ?
If developers want to transition step by step, they can get the old and new behaviors by casting the calculation in the float or int callables, respectively. Yes, that would lose some execution time, but I guess that's the price to pay for relying on those.
These two recipes have the merit of existing and being very simple, as opposed to most if not all __future__ toggles. And they're much easier to batch-replace in existing code than any incompatible change to any function : taking the round builtin for example, where you could have locally-defined round_up or around mismatches, well that can't happen here, because outside a comment or string, // is only //.

@mdickinson
Copy link
Member

the non-togglable one I outlined in the document [...]

I'm afraid that that section of the document is rather unclear to me. Could you please outline (here, not in the document) precisely which code (and under what conditions) would be expected to issue a DeprecationWarning during the proposed transition period?

@Gouvernathor
Copy link
Contributor Author

Gouvernathor commented Apr 13, 2023

It would be right about here (and the same in the float_divmod function, but let's keep that one on the side while we work the floordiv case).
After the floordiv result is computed, we do something similar to this code, by testing for infinity and nan-ness of the value and emit warnings if so.
That is for the period of time before any actual change is done to the values returned by the function.

In a second period of time, if we want to be extra-careful, we can have those ifs emit the warning and return the float-converted value, while the else (the normal behavior for non-infinity values of the floordiv variable) returns an int. That would make the operator still raise no exception but it would become mixed-types, so there are pros and cons about that.
But the main place where I would put deprecation warnings would be as said above, to warn about the non-int values.

And then after the deprecation period, we would remove those ifs and let the PyLong_FromDouble exceptions take care of it (or make the ifs raise the exceptions if we want to word them differently).

Should you consider that to be insufficient, I presume it would be because of the other possible breaking points than inf and nan being returned. So these would only be for direct type checks on the floorquotient using isinstance, type or .__class__, and calls to .hex(), right ?

@mdickinson
Copy link
Member

mdickinson commented Apr 13, 2023

Sorry, I meant which Python code. I'm asking about the user-visible behaviour change, not the implementation. That is, under what conditions should a Python user expect a DeprecationWarning to be issued during the transition period? E.g., would every x // y operation in which both x and y are instances of float issue a warning?

[...] other possible breaking points [...]

Indeed it's the type change that concerns me much more than the behaviour of NaNs or infinities. The ramifications of that type change aren't limited to direct type checking in user code. As a random example, consider this NumPy-using session:

Python 3.10.10 (main, Feb 10 2023, 08:56:54) [Clang 13.1.6 (clang-1316.0.21.2.5)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy as np
>>> (np.arange(1000000) * (1e40 // 1e10)).dtype
dtype('float64')
>>> (np.arange(1000000) * int(1e40 // 1e10)).dtype
dtype('O')

The second result is the result that we'd get under your proposal after the type change. And that's a big difference: an efficient array of 1 million packed floats has become an inefficient array of pointers to unboxed integers. That's the sort of change which should not occur silently.

@mdickinson
Copy link
Member

mdickinson commented Apr 13, 2023

Here's another example where an integer-valued float being replaced by an int is far from innocuous:

>>> x = 9.223372036854775e+18
>>> np.arange(3) * x / x
array([0., 1., 2.])
>>> y = int(x)
>>> np.arange(3) * y / y
array([ 0.00000000e+00,  1.00000000e+00, -2.22044605e-16])

Note that last value in the result - in one case it's 2.; in the other it's something tiny.

@Gouvernathor
Copy link
Contributor Author

Gouvernathor commented Apr 13, 2023

an efficient array of 1 million packed floats has become an inefficient array of pointers to unboxed integers

Yes, but in other words, an array of imprecise values has become more precise. If developers (or modules, Numpy in this case) want to ignore that precision they should do so themselves, Python shouldn't crop things in their place. For me that's a feature, not a bug - and I'll note it only degrades performance and does not yield new errors.

Note that last value in the result - in one case it's 2.; in the other it's something tiny.

I'm afraid I don't understand precisely how the problem happens. It's very weird that a calculation involving less floats and more ints makes it less accurate. It seems to be in the conversion of y back to a float before the multiplications occur ? And that the conversion doesn't yield a float equal to x ?
If so, the bug is there, either in the Python C implementation or in Numpy's, and in that case it's covered by the broad "float maths are inaccurate" blanket.
If the code as you wrote it, which means without using the // operator (but I understand your point of course), shows that weird inaccuracy, then it seems it could happen anytime anyway, so we shouldn't take it into account right here and now.

But of course if I misunderstood the second behavior it may be another bag of beans altogether.

@mdickinson

This comment was marked as abuse.

@AlexWaygood AlexWaygood closed this as not planned Won't fix, can't repro, duplicate, stale Apr 13, 2023
@python python locked as resolved and limited conversation to collaborators Apr 13, 2023
@python python unlocked this conversation Apr 13, 2023
@ethanfurman
Copy link
Member

@Gouvernathor There is enough disagreement about this proposal that you will need to finish your PEP and get it submitted for this change to make it into Python.

@ethanfurman ethanfurman reopened this Apr 13, 2023
@ethanfurman ethanfurman closed this as not planned Won't fix, can't repro, duplicate, stale Apr 13, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type-feature A feature request or enhancement
Projects
None yet
Development

No branches or pull requests

5 participants