New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
lcg #35
Conversation
When this lands we can remove the dev dependency on seedrandom |
src/lcg.js
Outdated
export default seed => { | ||
const a = 1664525, | ||
c = 1013904223, | ||
m = 4294967296; // 2^32 | ||
let s = Math.abs(a * +seed) || 1; | ||
return () => (s = (a * s + c) % m) / m; | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
export default seed => { | |
const a = 1664525, | |
c = 1013904223, | |
m = 4294967296; // 2^32 | |
let s = Math.abs(a * +seed) || 1; | |
return () => (s = (a * s + c) % m) / m; | |
}; | |
const a = 1664525; | |
const c = 1013904223; | |
const m = 4294967296; // 2^32 | |
export default function lcg(s = 1) { | |
if (!(s >= 0)) throw new Error(“invalid seed”); | |
return () => (s = (a * s + c) % m) / m; | |
} |
(I can’t figure out how to get iOS to type non-smart quotes…)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm fine with the seed being used as given—for reference multiplying by a was motivated by this comment d3/d3-force#160 (comment)
Can you clarify why the default export should be a named function in this case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another question here, again out of curiosity: when we define constants outside the function, don't they "exist" in memory even if they're not used? (Not that it matters here, but what if we needed to init a big chunk of memory?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regarding the seed, it’s a question of what the expected seed values are. If the expected value of a seed is between 0 and m - 1, then we shouldn’t do anything with it. If the expected value is a float between 0 and 1, or any JavaScript number, then we should make suitable adjustments. But my take was to make the definition as simple as possible and expect the caller to provide a good seed — but we should document what a good seed is, and probably give examples, e.g., using a large number for the seed rather than 1, 2, or 3.
As for the named function, that’s totally optional. It’s nice for debugging the nonminified JavaScript sometimes but I think often it ends up getting stripped or renamed by Rollup. That name is generally only visible within the debugger (though it can also be used for recursive invocations, but not needed here).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the explanations.
I've now "fixed" the seed, so we accept any number: if negative it will be negated, and numbers < 1 will be inflated, so it matches usual expectations.
Note: seedrandom accepts string seeds, which is nice, but would need more code. If we wanted to offer seedrandom (maybe as d3.randomRc4), then we could share its seed generator. However I don't think we need to overcomplicate this and I'm happy with not adding seedrandom, and the current state of the PR.
I've just tried to dogfood this by replacing seedrandom by d3.randomLcg, and a lot of tests fail… 👎 I'm pushing the failed tests in case it helps someone see what's wrong. One example (among 10):
|
Should we incorporate another open-source PRNG algorithm then, say alea? It’s slightly more code, but that seems reasonable for d3-random (and we can keep d3-force’s small because jiggle should be rarely used so it doesn’t matter than the randomness is poor quality). https://github.com/nquinlan/better-random-numbers-for-javascript-mirror#alea Do you have a notebook that shows how “bad” LCG is? I’m not familiar with techniques for quantitative evaluation of PRNGs. |
Co-authored-by: Mike Bostock <mbostock@gmail.com>
Co-authored-by: Mike Bostock <mbostock@gmail.com>
I just have my initial implementation notebook. I've tested various LCG parameters from the wp list, with no better success. alea and its friends look very interesting but they don't pass the tests either. I've tried substituting Math.random also—with more fails (on average) than LCG. So, either there is something truly magical with seedrandom, or the tests were tailored to pass for it… In fact, when I change the seed for the pareto(3) test above… it fails (and fails on first try, I haven't fished for a bad seed).
So—LCG is not that bad after all? |
I did another experiment by "randomly" changing all the seeds of seedrandom in the tests, and the average number of fails is ~18. I also prepared a stripped-down version of seedrandom, which could be used to create d3.randomSeedrandom if we wanted (and if the licence allows it). My current conclusion is that LCG is not as bad as I thought when 10 tests failed :-) |
negative numbers will be negated; numbers between 0 and 1 inflated (follow-up on #35 (comment))
Some of the tests were too stringent and almost never passed with Math.random. Rather than finding the right seed to have them pass, it seems better to make them a bit more tolerant. Changing variance to deviation (and taking the sqrt of the expected result) also allows a more balanced testing of distribution spread. A few tests in Pareto were testing the wrong values.
The tests in d3-random are not testing the quality of the random generator, but the shape of the transformed distributions. As such, they should work often (say, at least 95% of the time) with the uniform Math.random source, and should not require a specific seed. I've fixed the most offending tests, and checked that we could substitute various generators and various seeds. Now, a consequence of testing so many distributions is that it also somehow tests the uniformity of our random source. (If the source was not uniform many means, variances, skewness or kurtosis would fail.) But it's not in any case a quality test of the random source. For example, it does not test whether the difference between two successive values is correctly distributed. So, the fact that these ~10 tests failed for LCG was not saying that LCG was bad. OTOH the fact that LCG now passes does not say it's good :) |
…97450 Note: we do not want to guarantee that the generator will return the same sequence *across versions of d3-random*; mixing more, or taking a different seeding method would not be a breaking change.
Co-authored-by: Mike Bostock <mbostock@gmail.com>
update more tests that were too stringent
closes #33
many tests fail 🌶fixed!