Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

weighted sampling #18

Open
gzt opened this issue Mar 12, 2019 · 5 comments · Fixed by #47 · May be fixed by #72
Open

weighted sampling #18

gzt opened this issue Mar 12, 2019 · 5 comments · Fixed by #47 · May be fixed by #72
Labels
enhancement New feature or request

Comments

@gzt
Copy link

gzt commented Mar 12, 2019

I ran into this thread (http://r.789695.n4.nabble.com/Bias-in-R-s-random-integers-td4752563.html) and saw your discussion of the sample feature - I have a suggestion if you're still thinking of implementing it: you could do stochastic acceptance in order to perform weighted sampling in this framework. See here for an example: https://jbn.github.io/fast_proportional_selection/ It has the overhead of having to find the maximum weight and burns at least one more call to runif(), but you might provide an optional argument of max weight to spare the need for searching - this can also be done so that the weights don't have to add up to 1 (ie sparing the user the need to add up and divide by the sum). When I have time I might be able to submit a pull request if you are interested and if I don't forget, as I have a use for something similar

@rstub
Copy link
Member

rstub commented Mar 12, 2019

Thanks for the suggestion! I am still interested in adding sample functionality to dqrng. However, I have no immediate need for it, so I cannot give a time frame for it. A PR would be welcome.

Some general comments:

  • Additional calls to the RNG shouldn't be to bad, since the RNGs used in dqrng are pretty fast. In addition, there already is code for generating an int-float-pair with a single RNG call, which could be made more general. Of course, that works only up to 11 bits of integer precision.

  • The python algorithm has the same bias that started that thread on R-devel: i = int(n * random.random()). One should use random.randomint() instead. BTW, the bias in R has been fixed (as good as possible) for the upcoming R 3.6.0.

  • I think it would be best to have at least BisectionSearch and StochasticAcceptance available at the C++ level. I am not sure yet how to design the R interface. Maybe give the user the ability to select between the different algorithms.

@rstub
Copy link
Member

rstub commented Mar 12, 2019

BTW, in the original paper the timings look much better for the StochasticAcceptance algorithm, being faster than BisectionSearch even without the (IMO) artificial change in weights. I guess that for the python examples the bisection algorithm is implemented in compiled code, while the other two are implemented in python.

@gzt
Copy link
Author

gzt commented Mar 12, 2019

All right, if I get around to doing this (not likely to be soon), I'll see about putting together a PR - I have something I'm working on that could use something similar (in C) and it might be easy to get it into a form compatible with this project.

  • I'll drop a note to the author of that post - it does the right thing in several other places.

@rstub rstub added the enhancement New feature or request label Mar 13, 2019
@rstub
Copy link
Member

rstub commented Dec 29, 2022

I am finally working on this: https://stubner.me/2022/12/roulette-wheel-selection-for-dqrng/, looks promising.

@rstub
Copy link
Member

rstub commented Aug 30, 2023

As noted in #52, there are some more things I need to consider w.r.t. to weighted sampling. I will have to back out that code for now in order to release the other changes that have accumulated.

@rstub rstub reopened this Aug 30, 2023
@rstub rstub linked a pull request Oct 7, 2023 that will close this issue
5 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants