Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

lcg #35

Merged
merged 18 commits into from Aug 19, 2020
Merged

lcg #35

merged 18 commits into from Aug 19, 2020

Conversation

Fil
Copy link
Member

@Fil Fil commented Aug 13, 2020

closes #33

many tests fail 🌶 fixed!

@Fil Fil added the feature label Aug 13, 2020
@Fil Fil requested a review from mbostock August 13, 2020 09:35
@Fil
Copy link
Member Author

Fil commented Aug 13, 2020

When this lands we can remove the dev dependency on seedrandom

src/index.js Outdated Show resolved Hide resolved
src/lcg.js Outdated
Comment on lines 1 to 7
export default seed => {
const a = 1664525,
c = 1013904223,
m = 4294967296; // 2^32
let s = Math.abs(a * +seed) || 1;
return () => (s = (a * s + c) % m) / m;
};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
export default seed => {
const a = 1664525,
c = 1013904223,
m = 4294967296; // 2^32
let s = Math.abs(a * +seed) || 1;
return () => (s = (a * s + c) % m) / m;
};
const a = 1664525;
const c = 1013904223;
const m = 4294967296; // 2^32
export default function lcg(s = 1) {
if (!(s >= 0)) throw new Error(“invalid seed”);
return () => (s = (a * s + c) % m) / m;
}

(I can’t figure out how to get iOS to type non-smart quotes…)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine with the seed being used as given—for reference multiplying by a was motivated by this comment d3/d3-force#160 (comment)

Can you clarify why the default export should be a named function in this case?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another question here, again out of curiosity: when we define constants outside the function, don't they "exist" in memory even if they're not used? (Not that it matters here, but what if we needed to init a big chunk of memory?)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding the seed, it’s a question of what the expected seed values are. If the expected value of a seed is between 0 and m - 1, then we shouldn’t do anything with it. If the expected value is a float between 0 and 1, or any JavaScript number, then we should make suitable adjustments. But my take was to make the definition as simple as possible and expect the caller to provide a good seed — but we should document what a good seed is, and probably give examples, e.g., using a large number for the seed rather than 1, 2, or 3.

As for the named function, that’s totally optional. It’s nice for debugging the nonminified JavaScript sometimes but I think often it ends up getting stripped or renamed by Rollup. That name is generally only visible within the debugger (though it can also be used for recursive invocations, but not needed here).

Copy link
Member Author

@Fil Fil Aug 14, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the explanations.

I've now "fixed" the seed, so we accept any number: if negative it will be negated, and numbers < 1 will be inflated, so it matches usual expectations.

Note: seedrandom accepts string seeds, which is nice, but would need more code. If we wanted to offer seedrandom (maybe as d3.randomRc4), then we could share its seed generator. However I don't think we need to overcomplicate this and I'm happy with not adding seedrandom, and the current state of the PR.

Fil and others added 3 commits August 13, 2020 18:04
Co-authored-by: Mike Bostock <mbostock@gmail.com>
Co-authored-by: Mike Bostock <mbostock@gmail.com>
@Fil
Copy link
Member Author

Fil commented Aug 13, 2020

I've just tried to dogfood this by replacing seedrandom by d3.randomLcg, and a lot of tests fail… 👎
So I guess this will need more research. (It's still fine for d3-force).

I'm pushing the failed tests in case it helps someone see what's wrong.

One example (among 10):

randomPareto(3) returns randoms with mean of 1.5 and deviation of 0.9

not ok 215 should be in delta

operator: inDelta
expected: [ 0.85, 0.9500000000000001 ]
actual: 0.7331891550506523

@Fil Fil mentioned this pull request Aug 13, 2020
Closed
README.md Outdated Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
@mbostock
Copy link
Member

Should we incorporate another open-source PRNG algorithm then, say alea? It’s slightly more code, but that seems reasonable for d3-random (and we can keep d3-force’s small because jiggle should be rarely used so it doesn’t matter than the randomness is poor quality).

https://github.com/nquinlan/better-random-numbers-for-javascript-mirror#alea

Do you have a notebook that shows how “bad” LCG is? I’m not familiar with techniques for quantitative evaluation of PRNGs.

Fil and others added 2 commits August 13, 2020 19:59
Co-authored-by: Mike Bostock <mbostock@gmail.com>
Co-authored-by: Mike Bostock <mbostock@gmail.com>
@Fil
Copy link
Member Author

Fil commented Aug 13, 2020

I just have my initial implementation notebook.

I've tested various LCG parameters from the wp list, with no better success.

alea and its friends look very interesting but they don't pass the tests either.

I've tried substituting Math.random also—with more fails (on average) than LCG.

So, either there is something truly magical with seedrandom, or the tests were tailored to pass for it… In fact, when I change the seed for the pareto(3) test above… it fails (and fails on first try, I haven't fished for a bad seed).

d3.randomPareto.source(seedrandom("aadffb6828b2b"));
expected: [ 0.85, 0.9500000000000001 ]
actual: 0.7777836018059778

So—LCG is not that bad after all?

@Fil
Copy link
Member Author

Fil commented Aug 13, 2020

I did another experiment by "randomly" changing all the seeds of seedrandom in the tests, and the average number of fails is ~18. I also prepared a stripped-down version of seedrandom, which could be used to create d3.randomSeedrandom if we wanted (and if the licence allows it).

My current conclusion is that LCG is not as bad as I thought when 10 tests failed :-)

negative numbers will be negated; numbers between 0 and 1 inflated
(follow-up on #35 (comment))
Some of the tests were too stringent and almost never passed with Math.random. Rather than finding the right seed to have them pass, it seems better to make them a bit more tolerant.

Changing variance to deviation (and taking the sqrt of the expected result) also allows a more balanced testing of distribution spread.

A few tests in Pareto were testing the wrong values.
@Fil
Copy link
Member Author

Fil commented Aug 14, 2020

The tests in d3-random are not testing the quality of the random generator, but the shape of the transformed distributions. As such, they should work often (say, at least 95% of the time) with the uniform Math.random source, and should not require a specific seed. I've fixed the most offending tests, and checked that we could substitute various generators and various seeds.

Now, a consequence of testing so many distributions is that it also somehow tests the uniformity of our random source. (If the source was not uniform many means, variances, skewness or kurtosis would fail.) But it's not in any case a quality test of the random source. For example, it does not test whether the difference between two successive values is correctly distributed.

So, the fact that these ~10 tests failed for LCG was not saying that LCG was bad. OTOH the fact that LCG now passes does not say it's good :)

…97450

Note: we do not want to guarantee that the generator will return the same sequence *across versions of d3-random*; mixing more, or taking a different seeding method would not be a breaking change.
README.md Outdated Show resolved Hide resolved
src/lcg.js Show resolved Hide resolved
Fil and others added 4 commits August 18, 2020 19:53
Co-authored-by: Mike Bostock <mbostock@gmail.com>
update more tests that were too stringent
@mbostock mbostock merged commit 1db9ce0 into master Aug 19, 2020
@mbostock mbostock deleted the lcg-33 branch August 19, 2020 18:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Development

Successfully merging this pull request may close these issues.

LCG?
2 participants