Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Using PyNumber_AsSsize_t in itertools.islice #74722

Closed
MSeifert04 mannequin opened this issue Jun 1, 2017 · 12 comments
Closed

Using PyNumber_AsSsize_t in itertools.islice #74722

MSeifert04 mannequin opened this issue Jun 1, 2017 · 12 comments
Assignees
Labels
3.7 stdlib type-feature

Comments

@MSeifert04
Copy link
Mannequin

@MSeifert04 MSeifert04 mannequin commented Jun 1, 2017

BPO 30537
Nosy @rhettinger, @mdickinson, @embray, @smarnach, @serhiy-storchaka, @jdemeyer, @MSeifert04, @wroberts
PRs
  • #1918
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/rhettinger'
    closed_at = <Date 2017-06-08.06:41:03.998>
    created_at = <Date 2017-06-01.11:26:09.913>
    labels = ['3.7', 'type-feature', 'library']
    title = 'Using PyNumber_AsSsize_t in itertools.islice'
    updated_at = <Date 2018-01-12.15:09:24.449>
    user = 'https://github.com/MSeifert04'

    bugs.python.org fields:

    activity = <Date 2018-01-12.15:09:24.449>
    actor = 'erik.bray'
    assignee = 'rhettinger'
    closed = True
    closed_date = <Date 2017-06-08.06:41:03.998>
    closer = 'rhettinger'
    components = ['Library (Lib)']
    creation = <Date 2017-06-01.11:26:09.913>
    creator = 'MSeifert'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 30537
    keywords = []
    message_count = 12.0
    messages = ['294934', '295000', '295088', '295173', '295185', '295192', '295327', '295330', '295379', '295383', '309854', '309859']
    nosy_count = 8.0
    nosy_names = ['rhettinger', 'mark.dickinson', 'erik.bray', 'smarnach', 'serhiy.storchaka', 'jdemeyer', 'MSeifert', 'Will Roberts']
    pr_nums = ['1918']
    priority = 'low'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue30537'
    versions = ['Python 3.7']

    @MSeifert04
    Copy link
    Mannequin Author

    @MSeifert04 MSeifert04 mannequin commented Jun 1, 2017

    In a question on StackOverflow (https://stackoverflow.com/questions/44302946/itertools-does-not-recognize-numpy-ints-as-valid-inputs-on-python-3-6) it was mentioned that numpys scalars cannot be used as index for itertools.islice.

    The reason for this is because the numbers are converted with PyLong_AsSsize_t (which requires a PyLongObject [or subclass]) instead of PyNumber_AsSsize_t (which only requires an __index__ method).

    I'm not sure if there are many use-cases where numpy scalars make sense as inputs for islice but it's definetly unexpected that itertools.islice([1, 2, 3, 4, 5, 6], np.int32(3)) currently throws a ValueError: Stop argument for islice() must be None or an integer: 0 <= x <= sys.maxsize.

    @MSeifert04 MSeifert04 mannequin added type-bug 3.7 stdlib labels Jun 1, 2017
    @wroberts
    Copy link
    Mannequin

    @wroberts wroberts mannequin commented Jun 2, 2017

    Note that this issue also seems to affect other methods in the itertools package, such as permutations.

    @serhiy-storchaka
    Copy link
    Member

    @serhiy-storchaka serhiy-storchaka commented Jun 3, 2017

    Adding support of more general int-like objects in islice() looks reasonable to me. I'm not sure about permutations(), but if make this change, use PyIndex_Check() instead of PyNumber_Check(), or don't use the special check at all, allowing PyNumber_AsSsize_t() to fail.

    The __setstate__() methods shouldn't be changed. We know that the __reduce__() methods return exact ints, not general int-like objects.

    When replace PyLong_AsSsize_t with PyNumber_AsSsize_t take into account that PyLong_AsSsize_t is atomic and thread-safe, while PyNumber_AsSsize_t can call Python code and release the GIL.

    @rhettinger
    Copy link
    Contributor

    @rhettinger rhettinger commented Jun 5, 2017

    [Michael Seifert]

    I'm not sure if there are many use-cases where numpy scalars
    make sense as inputs for islice

    Likewise, I'm not sure there is any evidence of a use-case. The incompatible relationship between numpy ints and PyLong_AsSsize_t() has been around for a long time and no one seems to have even noticed until now.

    Also, you're right that the error message isn't very helpful unless you already understand that np.int32 isn't a subclass of int.

    [Serhiy]

    Adding support of more general int-like objects in islice()
    looks reasonable to me.

    Conceptually, anything with an __index__ method might make sense in the context of slicing, so the proposal doesn't sound unreasonable on the face of it.

    Practically though, there doesn't seem to be any evidence of a real use case here. It is unclear whether allowing np.int32 would ever benefit anyone and whether dropping the current requirement for an explicit int() cast would encourage weird and inefficient code.

    Either way, this should be categorized as a feature request. CPython itself is passing tests and isn't breaking any promises.

    [Will Roberts]

    Note that this issue also seems to affect other methods
    in the itertools package, such as permutations.

    If we do make some sort of change, it should be limited it to just islice(). To me, this only makes sense in the context of index arguments unless someone makes a broad and consequential executive decision that everything in Python that accepts an integer needs to also check for int-like objects as well.

    @rhettinger rhettinger added type-feature and removed type-bug labels Jun 5, 2017
    @wroberts
    Copy link
    Mannequin

    @wroberts wroberts mannequin commented Jun 5, 2017

    Thanks for feedback, Serhiy and Raymond! Github PR now has reverted changes except to the calls in islice_new; I am happy to squash if you would like.

    Serhiy, this is my first time poking around in CPython code. What are the potential consequences of making itertools.islice less atomic/thread-safe, and/or possibly releasing the GIL? Are there any gotchas to watch out for in this patch specifically? I've modelled my changes on the code in listobject.c [list_subscript], but I would love to hear if there's a better way to do things.

    Raymond, I'd also be curious to learn about any code weirdness or inefficiency you have in mind. I agree with you that there might not be a compelling use-case here. The SO question looks to be a bit contrived; however, the interesting bits to me here are that the behaviour of numpy interacting with itertools has changed since py27, and also that the proposed workarounds/solutions seem ... inelegant? Need a numpy integer be explicitly coerced to int in this context when the type implements an __index__ method?

    @serhiy-storchaka
    Copy link
    Member

    @serhiy-storchaka serhiy-storchaka commented Jun 5, 2017

    My note was not directly related to your patch. It was a warning about general replacement of PyLong_AsSsize_t with PyNumber_AsSsize_t. There is a code for which this is dangerous.

    @smarnach
    Copy link
    Mannequin

    @smarnach smarnach mannequin commented Jun 7, 2017

    The current behaviour of islice() seems inconsistent with the rest of Python. All other functions taking start, stop and step arguments like slice(), range() and itertools.count() do accept integer-like objects. The code given as "roughly equivalent" in the documentation of islice() accepts integer-like objects, and so does regular list slicing. In fact, the __index__() method was introduced in PEP-357 specifically for slicing. In Python 2, islice() supported it as well. I think the expectation that islice() in Python 3 also supports it is entirely reasonable, and I can't see any strong arguments for breaking that assumption.

    @wroberts
    Copy link
    Mannequin

    @wroberts wroberts mannequin commented Jun 7, 2017

    Github PR adds simple test, as well as an entry in Misc/NEWS.

    @rhettinger
    Copy link
    Contributor

    @rhettinger rhettinger commented Jun 8, 2017

    New changeset 0ecdc52 by Raymond Hettinger (Will Roberts) in branch 'master':
    bpo-30537: use PyNumber in itertools.islice instead of PyLong (bpo-1918)
    0ecdc52

    @rhettinger
    Copy link
    Contributor

    @rhettinger rhettinger commented Jun 8, 2017

    Applied patch. In theory, this was reasonable. Since islice() deals with indices, checking __index__ seems reasonable.

    On the other hand, stable high-performance code was changed without any known reasonable use cases, and no one will ever likely see a benefit. A cheap atomic operation was replaced with a more complex and less thread-safe chain of calls (I'm no longer even sure that the test cases cover all the possible code paths).

    Users might be better off by not seeing an unhelpful error message when passing in a numpy.int32, or they might be worse-off by having lost a cue that they were writing inefficient code (which invisibly creates temporary integer objects on every call when presumably the whole reason for using numpy was a concern for efficiency).

    @jdemeyer
    Copy link
    Contributor

    @jdemeyer jdemeyer commented Jan 12, 2018

    Just to note that this bug affects SageMath too: https://trac.sagemath.org/ticket/24528

    Thanks for fixing!

    @embray
    Copy link
    Contributor

    @embray embray commented Jan 12, 2018

    Users might be better off by not seeing an unhelpful error message when passing in a numpy.int32, or they might be worse-off by having lost a cue that they were writing inefficient code (which invisibly creates temporary integer objects on every call when presumably the whole reason for using numpy was a concern for efficiency).

    While it's true there are many Numpy users who don't understand what they're doing, any time they go into the Python interpreter they're losing the benefits of Numpy anyways (which are fast vectorized operations on arrays).

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.7 stdlib type-feature
    Projects
    None yet
    Development

    No branches or pull requests

    4 participants