Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add extra tests for random.binomialvariate #112325

Merged
merged 6 commits into from
Nov 23, 2023

Conversation

gaogaotiantian
Copy link
Member

@gaogaotiantian gaogaotiantian commented Nov 22, 2023

Currently random.binomialvariate does not check the type of n while the docs clearly states it should be an integer (and mathmatically it should be an integer). This could cause things like random.binomialvariate(3.5, 1) == 3.5. In this PR, a check for the type of n is added.

I also added the check for n == 0. This introduces an extra if check on the common path. n == 0 works without this check, it just requires some extra calculation. I can take it out if we want to make the common path slightly faster.

if not c is changed to if c == 0, which I believe is much better when checking a number against 0.

A couple of test cases are added as well, notably self.assertEqual(B(5, 1e-8), 0), which will trigger the uncovered path in BG method under if c == 0.

@rhettinger rhettinger self-assigned this Nov 23, 2023
@rhettinger
Copy link
Contributor

Is the solving a real problem that you've had or is this just a theoretical issue?

In general, the module shies away from such checks unless we find that they are really necessary. For example, gauss doesn't check to see if the sigma argument is negative. Mostly, these calls need to be short and fast without anything extraneous.

@gaogaotiantian
Copy link
Member Author

I happened to check the coverage of random module today and found some uncovered path which led to this PR.

I agree that the function should be short and fast, but should we keep the same standard? For example, the check for p prevents the function from a infinite loop which is considered necessary, but why do we check for n < 0? That'll just be a case of "invalud input, invalid output". Should we remove that check?

In gammavariate, we check if beta <= 0, that'll be another example of "invalid input, invalid output', as it was just used as a multiplication factor.

However, I understand if you don't want to add the extra 0 and integer check on n. I do think the comparison with 0 makes more sense than not c. Or we can only keep the new test cases.

@rhettinger
Copy link
Contributor

Or we can only keep the new test cases.

I think that would be best.

@gaogaotiantian
Copy link
Member Author

I think that would be best.

Could you share your thoughts about if not c vs if c == 0? That's actually the part I have a rather strong feeling. I think the latter statement is significantly better in this case.

@serhiy-storchaka
Copy link
Member

#81620 has been closed a long time ago. Could you please open a new issue for discussing the feasibility of such changes?

@rhettinger
Copy link
Contributor

rhettinger commented Nov 23, 2023

Could you share your thoughts about if not c vs if c == 0?

It's a bit of a personal preference. This module tends to use the former such as the if y in betavariate. The former is also faster than if c == 0 and is consistent with the practice for sequences of writing if not seq instead of if len(seq) == 0. Using the truthy value of an object is common in the Python (and some other languages). It is something you get used to.

@rhettinger
Copy link
Contributor

@serhiy-storchaka I don't think we need to open another issue for a minor code edit. I'll just remove the issue reference from this PR.

@rhettinger rhettinger changed the title gh-81620: Add extra checks in random.binomialvariate Add extra tests for random.binomialvariate Nov 23, 2023
@gaogaotiantian
Copy link
Member Author

It's a bit of a personal preference. This module tends to use the former such as the if y in betavariate. The former is also faster than if c == 0 and is consistent with the practice for sequences of writing if not seq instead of if len(seq) == 0. Using the truthy value of an object is common in the Python (and some other languages). It is something you get used to.

I'm actually perfectly fine with truthy value check for sequences. I only have hesitations about numerical values. It is faster than c == 0.

Anyway, I removed the code change in random.py and I agree we don't need a separate issue for this minor change.

@rhettinger rhettinger merged commit dc0adb4 into python:main Nov 23, 2023
29 checks passed
@rhettinger
Copy link
Contributor

Thanks for the extra tests and for the polite discussion.

@gaogaotiantian gaogaotiantian deleted the random-binomial branch November 27, 2023 05:54
aisk pushed a commit to aisk/cpython that referenced this pull request Feb 11, 2024
Glyphack pushed a commit to Glyphack/cpython that referenced this pull request Sep 2, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants