Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add random.binomialvariate() #81620

Closed
rhettinger opened this issue Jun 28, 2019 · 10 comments
Closed

Add random.binomialvariate() #81620

rhettinger opened this issue Jun 28, 2019 · 10 comments
Labels
3.11 only security fixes stdlib Python modules in the Lib dir type-feature A feature request or enhancement

Comments

@rhettinger
Copy link
Contributor

rhettinger commented Jun 28, 2019

BPO 37439
Nosy @tim-one, @rhettinger, @gpshead, @mdickinson, @avinassh, @zkneupper
PRs
  • bpo-37439 - Add random.binomialvariate() #14530
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = None
    created_at = <Date 2019-06-28.09:28:09.146>
    labels = ['type-feature', 'library', '3.11']
    title = 'Add random.binomialvariate()'
    updated_at = <Date 2021-05-25.20:32:21.646>
    user = 'https://github.com/rhettinger'

    bugs.python.org fields:

    activity = <Date 2021-05-25.20:32:21.646>
    actor = 'mark.dickinson'
    assignee = 'none'
    closed = False
    closed_date = None
    closer = None
    components = ['Library (Lib)']
    creation = <Date 2019-06-28.09:28:09.146>
    creator = 'rhettinger'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 37439
    keywords = ['patch']
    message_count = 10.0
    messages = ['346815', '347055', '347057', '347061', '347066', '350256', '393903', '394384', '394391', '394396']
    nosy_count = 6.0
    nosy_names = ['tim.peters', 'rhettinger', 'gregory.p.smith', 'mark.dickinson', 'avi', 'zkneupper']
    pr_nums = ['14530']
    priority = 'low'
    resolution = None
    stage = 'patch review'
    status = 'open'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue37439'
    versions = ['Python 3.11']

    Linked PRs

    @rhettinger
    Copy link
    Contributor Author

    This a small snippet of code I written dozens of times. It would be nice to have this in the library along with the other distributions.

    def binomialvariate(n, p):
        ''' Binomial distribution for the number of successes
            in *n* Bernoulli trials each with a probability *p*
            of success.
        Returns an integer in the range:  0 <= X <= n
    '''
    return sum(random() < p for i in range(n))
    

    @rhettinger rhettinger added 3.9 only security fixes stdlib Python modules in the Lib dir type-feature A feature request or enhancement labels Jun 28, 2019
    @mdickinson
    Copy link
    Member

    If you want to be able to switch to something more efficient for large n, Knuth describes a simple O(log(n)) algorithm in TAOCP volume 4 (and attributes it to J. H. Ahrens). It's essentially a bisection search on the value, using the fact that we can use the beta distribution to generate order statistics. Here's a (too) simple Python implementation. It still needs thorough testing, and could be optimised in many ways - e.g., using sensible crossover point for n and not recursing all the way to n = 0.

        def binomialvariate(n, p):
            if n == 0:
                return 0
            a, b = (1 + n)//2, 1 + n//2
            x = betavariate(a, b)
            if x < p:
                return a + binomialvariate(b - 1, (p - x)/(1 - x))
            else:
                return binomialvariate(a - 1, p/x)
    >>> binomialvariate(10**10, 0.5)
    4999944649

    @avinassh
    Copy link
    Mannequin

    avinassh mannequin commented Jul 1, 2019

    @mark - Newb here and before I could see your reply, I sent a PR cos I thought simple implementation provided by Raymond Hettinger was good enough. Shall I update the PR?

    I will add the tests soon.

    @mdickinson
    Copy link
    Member

    @avi That's Raymond's call, but IMO there's absolutely nothing wrong with getting a reference implementation (with tests, documentation, and all the usual trimmings) in first and then making algorithmic improvements later.

    Thanks for the PR!

    @mdickinson
    Copy link
    Member

    [...] and then making algorithmic improvements later.

    Though thinking about it, there is *one* catch: for the random module, it's nice if reproducibility isn't broken more often than necessary in released versions, so it would be best to do any algorithmic tweaking *before* 3.9.0 is released. But that still gives us some time ...

    @rhettinger
    Copy link
    Contributor Author

    [Avi]

    Shall I update the PR?

    Yes, please.

    @zkneupper
    Copy link
    Mannequin

    zkneupper mannequin commented May 18, 2021

    Is adding random.binomialvariate() still under consideration?

    I would be willing to work on this addition to Lib/random.py, as well as the accompanying tests and docs.

    Should I use the type of algorithm hinted at by Mark on 2019-07-01 17:56?

    @gpshead gpshead added 3.11 only security fixes and removed 3.9 only security fixes labels May 25, 2021
    @mdickinson
    Copy link
    Member

    I think the NumPy implementation may be from here: https://core.ac.uk/download/pdf/11007254.pdf

    (though I'm struggling to find a clear citation in the NumPy source)

    @mdickinson
    Copy link
    Member

    Nope, that's the wrong paper. It looks as though this is the right one, but it's hidden behind a paywall: https://dl.acm.org/doi/abs/10.1145/42372.42381

    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.11 only security fixes stdlib Python modules in the Lib dir type-feature A feature request or enhancement
    Projects
    None yet
    Development

    No branches or pull requests

    3 participants