Multinomial without replacement produces samples that have zero probability #50034
Comments
Thanks for reporting it; I also faced this bug. Can this behaviour be considered part of the API now?
I'm also facing this bug. If the input array does have at least …
I ran into this same problem. Ideally, sampling from a multinomial distribution with …
For sampling with/without replacement, we first need to align on whether the multinomial distribution even has such a concept. Strictly speaking, it does not: if X follows a multivariate probability distribution, there is a dimensionality k, and a sampled point x is a vector, not a scalar or single value.
May I also ask what an "unsafe" API would be? If …
@min-jean-cho the expected behavior is that the snippet should print 0 wrongs, but PyTorch 2 reports 0.0042% while PyTorch 1.0 shows 0%. @ngimel FWIW, if I set the probs to double, the issue goes away in PyTorch 2.0. I don't think the behavior should change based on the data type. Specifically, running the double-precision version results in 0 wrongs:
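The snippet the comment refers to is not included above. A hypothetical reconstruction of that kind of check (the probabilities, trial count, and function name here are all assumptions, not the original code) counts how often a zero-probability index is drawn for each dtype:

```python
import torch

def count_wrongs(dtype, trials=1000):
    # Hypothetical check in the spirit of the comment above: one class has
    # probability 0, so without replacement it should never be sampled.
    probs = torch.tensor([0.5, 0.25, 0.25, 0.0], dtype=dtype)
    wrongs = 0
    for _ in range(trials):
        idx = torch.multinomial(probs, 1, replacement=False).item()
        if probs[idx] == 0:
            wrongs += 1
    return wrongs

# Per the comment, float32 can show a small nonzero rate on PyTorch 2.x while
# float64 shows 0; the exact counts depend on the PyTorch build.
print(count_wrongs(torch.float32), count_wrongs(torch.float64))
```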
I'm a bit confused by the initial bug report above, and responding to @lendle: I can reproduce this with a single 0.0 in a larger tensor while sampling only a single value. That tells me it doesn't depend on replacement or on requesting more samples than there are positive entries. I don't follow all the versions mentioned above, so I'm not sure whether my version should already include a fix; mine is '2.0.1+cu117'. Repro:
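The repro itself is not shown above; a sketch consistent with the description (the tensor size, the position of the zero entry, and the number of draws are made-up choices) would be:

```python
import torch

torch.manual_seed(0)
# A larger float32 tensor with a single 0.0 entry, as described above;
# the size (100) and zero position (57) are arbitrary, for illustration only.
probs = torch.ones(100, dtype=torch.float32)
probs[57] = 0.0

# Draw a single sample many times at once via a batched (2D) input.
batch = probs.unsqueeze(0).expand(10000, -1).contiguous()
samples = torch.multinomial(batch, 1, replacement=False)

# On affected builds (the commenter saw this on '2.0.1+cu117'), index 57 can
# appear despite having zero probability; on fixed builds hits stays 0.
hits = int((samples == 57).sum())
print("zero-probability hits:", hits)
```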
🐛 Bug
Since moving to a more efficient algorithm for sampling multinomial without replacement, we don't check whether the probability tensor has enough non-zero elements to draw the requested number of samples. We only check that at least one of the probabilities is positive and that the number of samples is not greater than the number of classes.
To Reproduce
Below, the probability tensor has 3 non-zero elements, so we should be able to generate at most 3 samples without replacement. When requesting 4, NumPy errors out, but PyTorch produces a sample with 0 probability (4).

Steps to reproduce the behavior:
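The exact tensor from the report is not shown on this page, so the values below are stand-ins: 5 classes with only 3 non-zero probabilities, requesting 4 samples without replacement. NumPy rejects the request, while affected PyTorch builds can return a zero-probability index:

```python
import numpy as np
import torch

# Stand-in for the reported tensor: 3 non-zero entries out of 5 classes.
p = [0.4, 0.3, 0.3, 0.0, 0.0]

# NumPy refuses to draw 4 values without replacement from 3 non-zero entries.
try:
    np.random.choice(len(p), size=4, replace=False, p=p)
except ValueError as err:
    print("numpy:", err)

# PyTorch on affected versions silently returns 4 indices, which can include
# a zero-probability class (index 3 or 4); a fixed build might raise instead.
try:
    print("torch:", torch.multinomial(torch.tensor(p), 4, replacement=False))
except RuntimeError as err:
    print("torch:", err)
```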
Checking the number of non-zeros in the probability tensor carries an additional perf penalty (in many cases, scanning the probability array already takes longer than the actual sampling), so we should decide whether we want this check, or whether we should offer an "unsafe" API that skips it.
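Until the operator itself performs such a check (or an explicit "unsafe" variant exists), callers can guard on their side. A minimal sketch — the helper name is invented, not a PyTorch API:

```python
import torch

def multinomial_checked(probs: torch.Tensor, num_samples: int) -> torch.Tensor:
    # Illustrative user-side guard, not a PyTorch API: refuse to draw more
    # samples without replacement than there are positive-probability classes.
    # The O(n) scan below is exactly the perf cost discussed in the issue.
    positive = int((probs > 0).sum())
    if num_samples > positive:
        raise ValueError(
            f"cannot draw {num_samples} samples without replacement from "
            f"only {positive} classes with positive probability"
        )
    return torch.multinomial(probs, num_samples, replacement=False)

probs = torch.tensor([0.5, 0.25, 0.25, 0.0], dtype=torch.float64)
print(multinomial_checked(probs, 3))  # fine: 3 positive classes
```

Note that this guard only rejects impossible requests; it does not prevent the float32 issue described in the comments, where a zero-probability index can be drawn even when the request is feasible.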
cc @fritzo @neerajprad @alicanb @vishwakftw @nikitaved @mruberry @rgommers @heitorschueroff @pbelevich @t-vi