-
-
Notifications
You must be signed in to change notification settings - Fork 30.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Using PyNumber_AsSsize_t in itertools.islice #74722
Comments
In a question on StackOverflow (https://stackoverflow.com/questions/44302946/itertools-does-not-recognize-numpy-ints-as-valid-inputs-on-python-3-6) it was mentioned that numpys scalars cannot be used as index for The reason for this is because the numbers are converted with I'm not sure if there are many use-cases where numpy scalars make sense as inputs for |
Note that this issue also seems to affect other methods in the itertools package, such as permutations. |
Adding support of more general int-like objects in islice() looks reasonable to me. I'm not sure about permutations(), but if make this change, use PyIndex_Check() instead of PyNumber_Check(), or don't use the special check at all, allowing PyNumber_AsSsize_t() to fail. The __setstate__() methods shouldn't be changed. We know that the __reduce__() methods return exact ints, not general int-like objects. When replace PyLong_AsSsize_t with PyNumber_AsSsize_t take into account that PyLong_AsSsize_t is atomic and thread-safe, while PyNumber_AsSsize_t can call Python code and release the GIL. |
[Michael Seifert]
Likewise, I'm not sure there is any evidence of a use-case. The incompatible relationship between numpy ints and PyLong_AsSsize_t() has been around for a long time and no one seems to have even noticed until now. Also, you're right that the error message isn't very helpful unless you already understand that np.int32 isn't a subclass of int. [Serhiy]
Conceptually, anything with an __index__ method might make sense in the context of slicing, so the proposal doesn't sound unreasonable on the face of it. Practically though, there doesn't seem to be any evidence of a real use case here. It is unclear whether allowing np.int32 would ever benefit anyone and whether dropping the current requirement for an explicit int() cast would encourage weird and inefficient code. Either way, this should be categorized as a feature request. CPython itself is passing tests and isn't breaking any promises. [Will Roberts]
If we do make some sort of change, it should be limited it to just islice(). To me, this only makes sense in the context of index arguments unless someone makes a broad and consequential executive decision that everything in Python that accepts an integer needs to also check for int-like objects as well. |
Thanks for feedback, Serhiy and Raymond! Github PR now has reverted changes except to the calls in islice_new; I am happy to squash if you would like. Serhiy, this is my first time poking around in CPython code. What are the potential consequences of making Raymond, I'd also be curious to learn about any code weirdness or inefficiency you have in mind. I agree with you that there might not be a compelling use-case here. The SO question looks to be a bit contrived; however, the interesting bits to me here are that the behaviour of numpy interacting with itertools has changed since py27, and also that the proposed workarounds/solutions seem ... inelegant? Need a numpy integer be explicitly coerced to int in this context when the type implements an __index__ method? |
My note was not directly related to your patch. It was a warning about general replacement of PyLong_AsSsize_t with PyNumber_AsSsize_t. There is a code for which this is dangerous. |
The current behaviour of islice() seems inconsistent with the rest of Python. All other functions taking start, stop and step arguments like slice(), range() and itertools.count() do accept integer-like objects. The code given as "roughly equivalent" in the documentation of islice() accepts integer-like objects, and so does regular list slicing. In fact, the __index__() method was introduced in PEP-357 specifically for slicing. In Python 2, islice() supported it as well. I think the expectation that islice() in Python 3 also supports it is entirely reasonable, and I can't see any strong arguments for breaking that assumption. |
Github PR adds simple test, as well as an entry in Misc/NEWS. |
Applied patch. In theory, this was reasonable. Since islice() deals with indices, checking __index__ seems reasonable. On the other hand, stable high-performance code was changed without any known reasonable use cases, and no one will ever likely see a benefit. A cheap atomic operation was replaced with a more complex and less thread-safe chain of calls (I'm no longer even sure that the test cases cover all the possible code paths). Users might be better off by not seeing an unhelpful error message when passing in a numpy.int32, or they might be worse-off by having lost a cue that they were writing inefficient code (which invisibly creates temporary integer objects on every call when presumably the whole reason for using numpy was a concern for efficiency). |
Just to note that this bug affects SageMath too: https://trac.sagemath.org/ticket/24528 Thanks for fixing! |
While it's true there are many Numpy users who don't understand what they're doing, any time they go into the Python interpreter they're losing the benefits of Numpy anyways (which are fast vectorized operations on arrays). |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: