ENH: random_state using scipy.stats.qmc engines #13368
Comments
I don't believe that implementing a QMC-based Implementing a class that has (at least) a subset of |
I am not sure I understand why we could not sample any distribution using QMC sampling.
Indeed I did this for the If you feel that what I propose here is ok, I could prepare a PR (well we still need to merge qmc... so right after). |
You can, but not using rejection samplers which are the dominant method used in |
Ah ok thanks. I thought only inverse CDF was used. |
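A minimal sketch of why rejection samplers sit poorly with QMC points, assuming a toy half-normal target with an Exp(1) proposal (this example is illustrative and is not taken from the thread or from any SciPy/NumPy internals). The sampler consumes a data-dependent number of uniforms per variate, so there is no fixed mapping from one QMC point to one output value, unlike inverse-CDF sampling.

```python
import math
import random


def rejection_sample_half_normal(rng):
    """Draw one half-normal variate by rejection from an Exp(1) proposal.

    Returns the variate and the number of uniforms consumed to produce it.
    """
    used = 0
    while True:
        u1 = 1.0 - rng.random()  # in (0, 1], avoids log(0)
        u2 = rng.random()
        used += 2
        x = -math.log(u1)  # Exp(1) proposal via inverse CDF
        # Acceptance probability f(x) / (M * g(x)) = exp(-(x - 1)^2 / 2)
        if u2 <= math.exp(-0.5 * (x - 1.0) ** 2):
            return x, used


rng = random.Random(0)
draws = [rejection_sample_half_normal(rng) for _ in range(1000)]
counts = [used for _, used in draws]
# The number of uniforms per variate varies from draw to draw.
print(sorted(set(counts)))
```

With a PRNG this variable consumption is harmless; with a low-discrepancy sequence of fixed dimension it breaks the structure that gives QMC its advantage.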
There are algorithms that work with QMC sampling; they just aren't the algorithms implemented in I would not modify
Sure, but that doesn't get you the |
True for the arguments and that's why I added the small logic with the The idea was to give a user a simple way to use the QMC engines all over the place by just substituting the I agree that, regardless of this, |
That's not the issue I'm pointing out. Compare the arguments to
It's a nice thought, but I don't think it ultimately works often enough to allow |
@rkern do you, somehow, see a future for this or shall I close it? |
I don't think this strategy is profitable to explore. Code written for PRNGs assumes things about the I think there's significant room to write some tools that help people sample from |
Thanks for the discussions here. I will close it then.
Do you have other ideas I should explore instead? |
The first most obvious thing would be to provide a function that takes a There also are variants of the acceptance-rejection method for QMC that you can implement with the same kind of interface. Some of the algorithms in UNU.RAN for automatically creating efficient acceptance-rejection algorithms for PRNGs will be useful for QMC, too (such as automatically deriving good proposal distributions, which are used in both the PRNG and QMC forms of AR). As the effort ramps up to integrate the UNU.RAN functionality, you might want to take a look and make sure that the algorithms are exposed in such a way that they will be useful both for PRNG sampling and QMC sampling. |
OK I was afraid this was too simple and would not be accepted. I could make a PR for that. Not sure about the interface and names here. But is the following what you mean?
class SampleDist:
    def __init__(self, qrng, dist):
        self.qrng = qrng  # QMC engine exposing random(n)
        self.dist = dist  # distribution exposing ppf
    def rvs(self, size):
        # Draw uniform QMC points, then map them through the inverse CDF.
        sample = self.qrng.random(size)
        return self.dist.ppf(sample)
I will keep an eye open during this integration work. Although MCMC and such are more the territory of PyMC for now. |
I was thinking more of a function rather than an object, but yes. It's possible that it falls under the category of "not every one-liner needs to be a function", but then again, not everyone knows that trick for getting QMC variates from non-uniform distributions. UNU.RAN is not an MCMC library. The stuff for MCMC in UNU.RAN is just applying the main algorithms of UNU.RAN to MCMC problems because automatically deriving fast samplers for new continuous distributions solves a particular problem with Gibbs sampling MCMC algorithms. That's not the main focus of the library or the planned integration. |
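A minimal sketch of the function form of that one-liner, assuming SciPy's `scipy.stats.qmc` module is available. The name `qmc_rvs` and its signature are hypothetical and were not agreed on in this thread; the only technique used is inverse-CDF sampling, where uniform QMC points are passed through the distribution's `ppf`.

```python
import numpy as np
from scipy import stats
from scipy.stats import qmc


def qmc_rvs(dist, engine, n):
    """Hypothetical helper: map n QMC points through dist.ppf.

    `dist` is any object with a `ppf` method (e.g. a frozen scipy.stats
    distribution); `engine` is a QMC engine with a `random(n)` method.
    """
    u = engine.random(n)  # shape (n, d), values in [0, 1)
    return dist.ppf(u)


# Scrambled Halton engine in one dimension (no seed fixed here).
engine = qmc.Halton(d=1, scramble=True)
sample = qmc_rvs(stats.norm(loc=0.0, scale=1.0), engine, 64)
print(sample.shape)
```

This only works for distributions with a usable `ppf`; distributions that are sampled by rejection need the QMC-specific algorithms mentioned above.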
I've read the discussion above. For univariate distributions ( |
But having explicit functions is more flexible, especially as the UNU.RAN algorithms become available. Inverting using the |
I doubt we'll bother replacing the internals with the UNU.RAN implementation, unless you think that's the best way to address the remaining Want to add a |
Done! Let's continue this discussion in gh-13319. |
Now that we have UNU.RAN and Right now, a user would need to be aware of these details in order to find/use these. IMHO this is not a trivial assumption to make. Following what you said @mdhaber, what if we added a |
I'm not sure now. I see three possibilities for the design
As a start, how about we just make them aware? We could include a note in the |
It makes me think about the discussion we had on the mailing list and In the meantime, I agree that a note in the doc seems sensible. We could write it once and make it available to all |
It seems that we have two big ideas for stats maintenance these days:
After a year of pretty intense work on There has also been some effort toward #1. @tirthasheshpatel and I have been talking in recent PRs about overhauling the test suite of distributions so that we can find all the obvious bugs where distribution methods are not living up to their public signatures. I don't know that we should work on all aspects of these projects simultaneously. So, in the case of working on overhaul of So maybe I'd suggest this order of high-level maintenance operations: I like some variety, but I'm not really efficient when I'm bouncing between dozens of PRs, waiting a few months at a time between updates and having to re-learn everything when I come back to it. I imagine that if a few of us were able to tag-team as authors and reviewers toward a common goal, we could get a lot done pretty quickly. What do you think? |
This action plan sounds reasonable 👍 The overhaul of the distribution is a massive undertaking for sure and we would need to be sure to have a few maintainers on board to first avoid late discussions and second to do the hard work. This should not be a 1-2 man only project. As usual, feel free to ping me if you feel I could help 😃 |
This is a follow up of discussions in #10844.
Describe the solution you'd like
All sampling methods in `scipy.stats` use `random_state`, which is a `np.random.Generator` (for new code). But this `numpy` generator is not aware of dimensions. Also, the new `scipy.stats.qmc` allows generating samples efficiently in n dimensions. It would be nice to bridge the gap between the two.
Currently there is a `qmc.MultivariateNormalQMC`. Instead of duplicating this for other distributions, a solution could be to make the QMC engines inherit from `np.random.BitGenerator`. We could then use the new QMC engines with all the existing distributions.
Describe alternatives you've considered
`BitGenerator`: Seemingly, it must be done in Cython. Otherwise I saw that there was a wrapper around `BitGenerator`, but this is not available in `numpy`... https://bashtage.github.io/randomgen/bit_generators/userbitgenerator.html from @bashtage. This is working but the underlying `numpy` code is not aware of dimensions (https://github.com/numpy/numpy/blob/e4feb7027e397925d220a10dd58b581b87ca1fec/numpy/random/_generator.pyx#L3562-L3568).
`np.random.Generator`: I am using `__getattr__` to mock calls to the distributions, so calls like `random_state.uniform(...)`. This seems to be working and n dimensions is OK too.