Deterministic option (fixed seed?) for Monte Carlo results #714

abyrd · 2021-03-22T05:25:49Z

Some of the Simpson Desert tests occasionally fail in GitHub Actions test runs. This is probably because they test how close our Monte Carlo results are to theoretical results, and there is always some probability that the MC results will fall far from the expected values. For reproducible testing we could seed all our random number generators, but arguably that reduces thoroughness, and in any case small changes to routing could still change the order in which numbers are produced and cause the tests to fail again. Maybe we should just use a very high number of MC draws in these tests.
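To illustrate the ordering concern, here is a minimal sketch (not R5 code). The class `SeedOrderDemo`, the method `drawOffsets`, the seed value, and the 600-second bound are all hypothetical. It assumes a single shared `java.util.Random` and shows that one extra draw early on shifts every later value, even though the seed is unchanged.

```java
import java.util.Arrays;
import java.util.Random;

final class SeedOrderDemo {
    /**
     * Draws `count` offsets in [0, 600) from a generator with the given seed,
     * after first consuming `extraDraws` numbers from the same generator.
     */
    static int[] drawOffsets(long seed, int extraDraws, int count) {
        Random random = new Random(seed);
        for (int i = 0; i < extraDraws; i++) {
            // Stands in for a new consumer of random numbers added by a routing change.
            random.nextInt(600);
        }
        int[] offsets = new int[count];
        for (int i = 0; i < count; i++) {
            offsets[i] = random.nextInt(600);
        }
        return offsets;
    }

    public static void main(String[] args) {
        // Same seed in both runs; the second run consumes one extra number first.
        System.out.println(Arrays.toString(drawOffsets(714L, 0, 4)));
        System.out.println(Arrays.toString(drawOffsets(714L, 1, 3)));
    }
}
```

The second line printed is the first line shifted by one position, so each consumer now receives the number that previously went to its neighbor. A seeded test that passed before such a change has no particular reason to pass after it.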

abyrd · 2021-05-07T05:53:54Z

Note that some of the slowness in these tests is due to building histograms at every destination, even though the test only looks at one of them. Removing that behavior would probably speed them up significantly, making it more reasonable to spend more time on more MC draws.

ansoncfit · 2023-05-04T13:58:57Z

Recent tests, including with the frequency-heavy network of Sao Paulo, prompted me to think about this again. Letting users toggle on deterministic seeding would be straightforward to implement (e.g. using a similar approach to the one in the multi-criteria router, at https://github.com/conveyal/r5/blob/v6.9/src/main/java/com/conveyal/r5/profile/McRaptorSuboptimalPathProfileRouter.java#L120-L123) and would help users resolve a common headache when they are doing scenario comparisons. We could still recommend networks with frequency-based routes be analyzed with fully randomized schedules first, to get a sense of the noise/uncertainty.
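A minimal sketch of what such a toggle could look like. This is not taken from the linked router code or from any existing R5 class: `FrequencyOffsetSampler`, its constructor argument, and `drawOffsets` are hypothetical names, and `java.util.Random` is used only for illustration. An empty seed keeps the current randomized behavior, while a supplied seed makes the offsets repeatable.

```java
import java.util.OptionalLong;
import java.util.Random;

/** Hypothetical helper that draws one random departure offset per frequency entry. */
final class FrequencyOffsetSampler {
    private final Random random;

    /** An empty seed keeps the default nondeterministic behavior. */
    FrequencyOffsetSampler(OptionalLong seed) {
        this.random = seed.isPresent() ? new Random(seed.getAsLong()) : new Random();
    }

    /** Returns, for each headway, an offset in [0, headwaySeconds). */
    int[] drawOffsets(int[] headwaySeconds) {
        int[] offsets = new int[headwaySeconds.length];
        for (int i = 0; i < headwaySeconds.length; i++) {
            offsets[i] = random.nextInt(headwaySeconds[i]);
        }
        return offsets;
    }
}
```

With this shape, two scenario runs given the same seed and the same list of headways draw identical offsets, which is the property that matters for scenario comparisons. Leaving the seed empty gives the fully randomized schedules recommended for a first look at the noise.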

Addresses #714

abyrd · 2023-08-04T07:16:12Z

We discussed this again recently. Results are expected to converge on stable values with an adequate number of MC draws. In the context of these stochastic methods, there does not seem to be a legitimate use for fixed seeds. Any use would amount to an illusion of artificial precision and could lead to inadvertent cherry-picking of results.

If expectations for stable results are not met, there are two main explanations:

1. The number of MC draws is simply too low: the analysis tries to simulate a truly frequency-based line without adequately sampling the possible departure times for a given (long) headway.
2. The modeler is assuming constraints on the line that are not included in the scenario, for example that departure times are synchronized to other routes or to specific clock times. In that case the scenario should be updated to use exact-times or phasing.
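As a rough guide to what "adequately sampling" a long headway means, consider a deliberately simplified case that is an assumption for illustration, not a description of how R5 computes results: a single line with headway $h$, a rider arriving at a fixed time, and $n$ independent draws of a uniformly random schedule offset. The wait $W$ in each draw and the standard error of the mean wait over $n$ draws are then:

```latex
W \sim \mathrm{Uniform}(0, h), \qquad
\mathbb{E}[W] = \frac{h}{2}, \qquad
\mathrm{Var}(W) = \frac{h^2}{12}, \qquad
\mathrm{SE}\!\left(\bar{W}_n\right) = \frac{h}{\sqrt{12\,n}}
```

For $h = 60$ minutes, $n = 100$ draws gives a standard error of about 1.73 minutes, and it takes $n = 1200$ draws to bring that down to 0.5 minutes. Real results combine many lines and report percentiles rather than means, so this only indicates the scaling: noise falls with $\sqrt{n}$ and grows in proportion to the headway.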

For tests, the solution is probably to increase the number of MC draws until they pass reliably. Test results would remain nondeterministic by nature, but the probability of failure can be lowered until it essentially never happens.
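One way to choose a draw count for a test is sketched below. It rests on simplifying assumptions that are not from R5: the tested quantity behaves like the mean of independent waits that are uniform on [0, headway), so its variance per draw is headway²/12, and the mean over many draws is treated as approximately normal. The class and method names are hypothetical.

```java
final class DrawCountEstimator {
    /**
     * Smallest number of draws for which `sigmas` standard errors of the mean wait
     * fit inside the test tolerance, assuming waits are uniform on [0, headway).
     */
    static long requiredDraws(double headwaySeconds, double toleranceSeconds, double sigmas) {
        double variancePerDraw = headwaySeconds * headwaySeconds / 12.0;
        double ratio = sigmas * sigmas * variancePerDraw / (toleranceSeconds * toleranceSeconds);
        return (long) Math.ceil(ratio);
    }

    public static void main(String[] args) {
        // One-hour headway, one-minute tolerance, four standard errors of margin.
        System.out.println(requiredDraws(3600.0, 60.0, 4.0)); // prints 4800
    }
}
```

Under the normal approximation, a four-sigma margin corresponds to a failure probability of roughly 6 in 100,000 per test run. Halving the tolerance quadruples the required draws, which is why speeding up each draw matters for these tests.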

For regular use we should provide guidance on exact-times, phasing, and number of MC draws, and explain clearly in documentation how and why these stabilize results.

ansoncfit · 2023-09-18T14:23:07Z

If we want to allow increasing the number of MC draws, we should also test and adjust the socket timeout settings (referenced at https://github.com/conveyal/r5/blob/v6.9/src/main/java/com/conveyal/analysis/controllers/BrokerController.java).

abyrd self-assigned this Mar 22, 2021

ansoncfit mentioned this issue May 4, 2023

"Snapshot tests" of analysis results #299

Closed

ansoncfit changed the title ~~Monte Carlo tests fail~~ Deterministic option (fixed seed?) for Monte Carlo results May 9, 2023

ansoncfit added a commit that referenced this issue Jun 3, 2023

Allow setting seed for frequency offsets

cf9a0af

Addresses #714

ansoncfit mentioned this issue Jun 13, 2023

Allow setting seed for frequency offsets #881

Draft
