Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deprecate using random.randrange() with non-integers #81500

Closed
serhiy-storchaka opened this issue Jun 17, 2019 · 15 comments
Closed

Deprecate using random.randrange() with non-integers #81500

serhiy-storchaka opened this issue Jun 17, 2019 · 15 comments
Labels
3.10 stdlib type-feature

Comments

@serhiy-storchaka
Copy link
Member

@serhiy-storchaka serhiy-storchaka commented Jun 17, 2019

BPO 37319
Nosy @tim-one, @rhettinger, @mdickinson, @serhiy-storchaka
PRs
  • #19112
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2021-01-25.21:03:21.912>
    created_at = <Date 2019-06-17.19:52:35.511>
    labels = ['type-feature', 'library', '3.10']
    title = 'Deprecate using random.randrange() with non-integers'
    updated_at = <Date 2021-01-25.21:03:21.911>
    user = 'https://github.com/serhiy-storchaka'

    bugs.python.org fields:

    activity = <Date 2021-01-25.21:03:21.911>
    actor = 'serhiy.storchaka'
    assignee = 'none'
    closed = True
    closed_date = <Date 2021-01-25.21:03:21.912>
    closer = 'serhiy.storchaka'
    components = ['Library (Lib)']
    creation = <Date 2019-06-17.19:52:35.511>
    creator = 'serhiy.storchaka'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 37319
    keywords = ['patch']
    message_count = 15.0
    messages = ['345892', '345935', '345937', '346106', '364819', '364822', '364823', '364834', '364839', '364857', '364872', '364875', '380095', '383918', '385663']
    nosy_count = 4.0
    nosy_names = ['tim.peters', 'rhettinger', 'mark.dickinson', 'serhiy.storchaka']
    pr_nums = ['19112']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue37319'
    versions = ['Python 3.10']

    @serhiy-storchaka
    Copy link
    Member Author

    @serhiy-storchaka serhiy-storchaka commented Jun 17, 2019

    Unlike other range()-like functions random.randrange() accepts not only integers and integer-like objects (implementing __index__), but any numbers with integral values, like 3.0 or Decimal(3). In 3.8 using __int__ for implicit converting to C integers were deprecated, and likely will be forbidden in 3.9. __index__ is now preferable.

    I propose to deprecate accepting non-integer arguments in random.randrange() and use __index__ instead of __int__ for converting values. At end it will lead to simpler and faster code. Instead of

            istart = _int(start)
            if istart != start:
                raise ValueError("non-integer arg 1 for randrange()")

    we could use just

            start = _index(start)

    It will help also in converting the randrange() implementation to C.

    See also bpo-37315.

    @serhiy-storchaka serhiy-storchaka added 3.9 stdlib type-feature labels Jun 17, 2019
    @rhettinger
    Copy link
    Contributor

    @rhettinger rhettinger commented Jun 17, 2019

    While there isn't anything intrinsically wrong with using index(), the premise of this issue is essentially a claim that most Python code that uses int() is wrong. That seems rather bold and flies in the face of common practices. I don't really like the trend of constantly rearranging code in awkward ways to solve problems that no one actually has.

    Also, I disagree with the notion that the downstream functions in the random module should be written in C. That makes them more bug prone and opaque (like when Victor broken seeding for 3.6.0). The trend for rewriting in C should be restricted to core building blocks like the underlying Mersenne Twister. Otherwise, we have to do dual maintenance for years and essentially advertise to people that we don't think our own language is good enough to use. Over time, I am less and less able to show people standard library code as it has become complexified with coding strategies well beyond the norms of day to day Python programmers. ISTM that most proposals to rewrite something in C come from folks who like rewriting things in C rather than from users or from people who do long term maintenance of the modules. Sidenote: we're now getting new core devs who don't read or write C, so each transformation like them excludes their ability to participate).

    @rhettinger
    Copy link
    Contributor

    @rhettinger rhettinger commented Jun 17, 2019

    We probably should not do this until we're willing to give PEP-8 guidance recommending operator.index() as preferable to int() for all cases needing integer offsets. Maybe we do want to go down that path or maybe we just stick with "practicality beats purity" and not create any new learning issues. The int() function is popular and understandable, but it is rare to find a user who knows and cares about index(). ISTM that randrange(3.0) isn't a problem in real life.

    @serhiy-storchaka
    Copy link
    Member Author

    @serhiy-storchaka serhiy-storchaka commented Jun 20, 2019

    Why this code uses int() at all? My guess is that it was added for earlier detection of user errors, otherwise the user could get a TypeError or AttributeError with unrelated message, or even a wrong result. int() is used just for type checking. But it is not good tool for this. int() does several different actions:

    1. Parses string representation of integer.

    2. Converts a real number to an integer with some truncation.

    3. Losslessly converts an int-like number into an instance of int.

    The first two options are not applicable here, so we need an additional check

            if istart != start:
                raise ValueError("non-integer arg 1 for randrange()")

    And this is not perfect: for randrange('5') we get a ValueError instead of expected TypeError.

    operator.index() is better tool for this.

    @serhiy-storchaka
    Copy link
    Member Author

    @serhiy-storchaka serhiy-storchaka commented Mar 22, 2020

    Many functions that accepts non-integer arguments (except float) and truncate them to integers emits a deprecation warning since 3.8 (see bpo-36048 and bpo-20092). They will be errors in 3.10 (see 37999).

    @rhettinger
    Copy link
    Contributor

    @rhettinger rhettinger commented Mar 22, 2020

    I'm generally against "fixing" things that aren't broken. AFAICT, this has never been an issue for any real user over its 20 year history. Also, the current PR makes the code look gross and may impact code that is currently working. Tim, this is mostly your code, what do you think?

    @serhiy-storchaka
    Copy link
    Member Author

    @serhiy-storchaka serhiy-storchaka commented Mar 22, 2020

    Using index() simplifies the code. Adding deprecation warnings complicates it. The PR consists of two commits, look at the first one to see how the code can finally look.

    If you think that the deprecation is not needed, we can just keep the simplified version.

    @tim-one
    Copy link
    Member

    @tim-one tim-one commented Mar 23, 2020

    I have scant memory of working on this code. But back then Python wasn't at all keen to enforce type checking in harmless contexts. If, e.g., someone found it convenient to pass integers that happened to be in float format to randrange(), why not? Going "tsk, tsk, tsk" won't help them get their work done ;-)

    If we were restarting from scratch, I'd be happy enough to say "screw floats here - _and_ screw operator.index() - this function requires ints, period". But, as is, I see insufficient benefit to potentially making code that's been working fine for decades "deprecated".

    Yes, I realize the code may eventually become marginally faster then. That doesn't, to my eyes, outweigh the certainty of annoying some users.

    @tim-one tim-one removed their assignment Mar 23, 2020
    @serhiy-storchaka
    Copy link
    Member Author

    @serhiy-storchaka serhiy-storchaka commented Mar 23, 2020

    If you are against deprecating this "feature", it should be at least documented and covered by tests.

    @rhettinger
    Copy link
    Contributor

    @rhettinger rhettinger commented Mar 23, 2020

    How about we just leave it alone. AFAICT nothing is broken -- no user is having any issues. We really don't need to go through every tool in the standard library that uses int() and document that it accepts any type that defines __int__, nor do we need to test those cases. I understand that you are bugged by this but it isn't a real problem.

    @tim-one
    Copy link
    Member

    @tim-one tim-one commented Mar 23, 2020

    It's neither "bug" nor "feature" - it's an inherited quirk CPython maintains to avoid annoying users for no good reason.

    If you absolutely have to "do something", how about adding a mere footnote? The docs already imply the args have the same restrictions as for range(). A footnote could add that, to preserve historical compatibility, CPython also accepts arguments x for which int(x) == x, and uses int(x) in such cases - but portable code shouldn't rely on that.

    But nobody cares, so putting that in the main part of the docs would also annoy 99.99% of users for no good reason ;-)

    @rhettinger
    Copy link
    Contributor

    @rhettinger rhettinger commented Mar 23, 2020

    Marking this as rejected. Nothing good would come from obfuscating the docs to note a minor implementation detail.

    @serhiy-storchaka
    Copy link
    Member Author

    @serhiy-storchaka serhiy-storchaka commented Oct 31, 2020

    See bpo-42222.

    As a side effect this change provides 10% speed up (which can be lager after the end of the deprecation period).

    @rhettinger
    Copy link
    Contributor

    @rhettinger rhettinger commented Dec 28, 2020

    I merged in PR 23064. If you want to make further modifications or improvements that would be welcome.

    @serhiy-storchaka
    Copy link
    Member Author

    @serhiy-storchaka serhiy-storchaka commented Jan 25, 2021

    New changeset f066bd9 by Serhiy Storchaka in branch 'master':
    bpo-37319: Improve documentation, code and tests of randrange. (GH-19112)
    f066bd9

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.10 stdlib type-feature
    Projects
    None yet
    Development

    No branches or pull requests

    3 participants