Fixing questionable PRNG behavior #5613
…fy its behavior. Closes elastic#5454 and elastic#5578
Well, the reason we added the doc id is to better support consistent pagination, so it guarantees consistent random number generation over a consistent view of the indexes.
But, if you are allowing multiple indexes to invoke it, then isn't it going … Personally, I think that having only the seed gives it a more consistent … On Tue, Apr 1, 2014 at 6:51 AM, uboness notifications@github.com wrote:
Thinking it through some more, the … Of course, I am not an expert on random number theory, so take that with a grain of salt.
Yes, the idea initially was to be consistent with pagination across indices (where the … starts at 0 for each index). But thinking about it more, there are indeed too many "if"s here... we can remove the id and stick to the normal PRNG implementation, and whoever wants consistent pagination can just use seed+scroll. Btw, the seed is already biased here, as the original seed that the user sends (or the default current time millis) is merged with the shard id (which is a must so each shard will generate different random numbers per doc). I'll work on your PR, thx!
Works for me. The shard id modification of the …
@uboness I labeled this issue and assigned it to you - please feel free to re-label
thx @s1monw
Closes #5454 and #5578
This strengthens and simplifies the PRNG used by `random_score` by more closely mirroring the `Random.nextFloat()` method, rather than a mix of that, `nextInt` and `nextDouble`. The `docBase` or `docId` are no longer used as they were biasing the result (particularly if it was `0`, which consistently made it the highest scoring result in tests), which partially defeats the purpose of random scoring.
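
For readers who want to see what "mirroring `Random.nextFloat()`" can look like, here is a minimal sketch, not the code from this PR. It reproduces the documented `java.util.Random` linear congruential step and builds a float from the top 24 bits of the state, with no doc id involved. The class name `SeededFloatSketch` and the way the shard id is folded into the seed are assumptions made purely for illustration; the thread only says that the seed is merged with the shard id, not how.

```java
// Illustrative sketch only: the class name and the shard mixing are hypothetical.
final class SeededFloatSketch {
    // Constants documented for java.util.Random's 48-bit LCG.
    private static final long MULTIPLIER = 0x5DEECE66DL;
    private static final long ADDEND = 0xBL;
    private static final long MASK = (1L << 48) - 1;

    private long seed;

    SeededFloatSketch(long userSeed, int shardId) {
        // Assumption: fold the shard id into the user seed so that each shard
        // draws a different sequence. The real mixing may differ.
        long mixed = userSeed ^ (31L * shardId);
        // Same initial scramble that java.util.Random applies to its seed.
        this.seed = (mixed ^ MULTIPLIER) & MASK;
    }

    // Advances the state once and returns a float in [0, 1).
    // No docBase or docId enters the computation.
    float nextFloat() {
        seed = (seed * MULTIPLIER + ADDEND) & MASK;
        int top24 = (int) (seed >>> (48 - 24)); // keep the 24 highest bits
        return top24 / (float) (1 << 24);
    }

    public static void main(String[] args) {
        SeededFloatSketch shard0 = new SeededFloatSketch(42L, 0);
        SeededFloatSketch shard1 = new SeededFloatSketch(42L, 1);
        for (int i = 0; i < 3; i++) {
            System.out.println(shard0.nextFloat() + " " + shard1.nextFloat());
        }
    }
}
```

Because every value comes from the generator state alone, a document whose id happens to be `0` gets no special treatment. The trade-off, as discussed above, is that repeatable ordering across pages now has to come from reusing the same seed together with scroll.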