-
Notifications
You must be signed in to change notification settings - Fork 48
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Random.sample() results are biased #7
Comments
Your bug definitely seems valid. I'll try to reproduce and fix soon. |
I also encounter this problem, below is how I solve it temporarily, but this will break the original shuffle if I directly using myShuffle function as a replacement of the old shuffle var random = new Random(Random.engines.browserCrypto);
random.myShuffle = function (array, downTo) {
var length = array.length;
if (length) {
if (downTo == null) {
downTo = 0;
}
for (var i = (length - 1) >>> 0; i >= downTo; --i) {
var distribution = Random.integer(0, i);
var j = distribution(this.engine);
if (i !== j) {
var tmp = array[i];
array[i] = array[j];
array[j] = tmp;
}
}
}
return array;
};
random.mySample = function (population, sampleSize) {
if (sampleSize < 0 || sampleSize > population.length || !isFinite(sampleSize)) {
throw new RangeError("Expected sampleSize to be within 0 and the length of the population");
}
if (sampleSize === 0) {
return [];
}
var clone = population.slice();
var length = clone.length;
if (length === sampleSize) {
return this.myShuffle(clone, 0);
}
var tailLength = length - sampleSize;
return this.myShuffle(clone, tailLength).slice(tailLength);
}; |
Yep, logically, there was an off-by-one error where if you sampled by one, it would shuffle exactly 0 items rather than 1. This applied to sample and not shuffle because shuffle has no point in shuffling the very last item (since it can only shuffle with itself), whereas with sample, it needs to take the next item into account. |
Thanks for making a nice library. I was looking for a seed-able PRNG and random-js fits the bill.
I was using
Random.sample(population, 1)
and noticed that it always returns the same result, namely the last item in thepopulation
array.Here is test code:
When I run it with
node
, it prints out the following:{ e: 100000 }
When I use
sampleSize = 2
, the results also seem to heavily favor one of the elements of the population array:{ 'd,b': 20013,
'd,c': 19884,
'd,e': 20011,
'd,a': 20069,
'e,d': 20023 }
This smells like an off-by-one somewhere... Or am I missing something?
Thanks.
The text was updated successfully, but these errors were encountered: