-
Notifications
You must be signed in to change notification settings - Fork 15
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Unit tests broken under Julia 1.7 due to RNG change #121
Comments
Speculation on the GaussianProcessEmulator above is incorrect: |
tsj5
added a commit
to tsj5/CalibrateEmulateSample.jl
that referenced
this issue
Dec 5, 2021
fixes CliMA#121 After fixing the RNG algorithm used, alter the parameters of broken tests so that they pass. - Increase number of training points and decrease noise covariance in GaussianProcessEmulator tests so that "Test case 2: 2D input, 2D output" passes. - Change the sample observation compared against the return value of get_obs_sample() in Utilities tests.
bors bot
added a commit
to CliMA/EnsembleKalmanProcesses.jl
that referenced
this issue
Jan 22, 2022
95: Pass explicit RNG object to methods involving stochasticity r=ilopezgp a=tsj5 This is a rebase and squash of the feature branch for PR #81 and contains no new commits. The PR message for #81 is reproduced below. <hr> The motivation for this PR comes from CliMA/CalibrateEmulateSample.jl#121; see discussion there. In order to have reproducible tests involving randomness, it's not sufficient to set the global seed for the default RNG: the default RNG algorithm used by Julia was [changed](https://julialang.org/blog/2021/11/julia-1.7-highlights/#new_rng_reproducible_rng_in_tasks) in version 1.7, and the [docs](https://docs.julialang.org/en/v1/stdlib/Random/#Reproducibility) for the Random package note that the same seed may give different random values in different versions. For this reason, [recommended practice](https://discourse.julialang.org/t/set-the-state-of-the-global-rng/12599/3) for reproducible tests is to pass an RNG object explicitly to all methods which need it. Julia convention is that the RNG is an optional first positional argument in functions which require it (see e.g. Random, StatsBase, Turning.jl...), and this is done here. This PR implements that change; in brief - An `rng` field is added to `EnsembleKalmanProcesses` (i.e. independent of `Process`, for all implementations). - An optional `rng` kwarg is added to `construct_initial_ensemble()`, used in lieu of `rng_seed` if provided. - An additional method for `construct_initial_ensemble()` takes the rng as the first argument, causing kwargs to be ignored. - All methods for `sample_distribution()` now accept an rng as an optional first argument. In all of the cases above, `Random.GLOBAL_RNG` is used by default if an RNG is not given: this PR is fully backwards compatible with `main`. In particular, tests are updated to use an explicit rng, but examples have been left unchanged. It may be desirable to update them as well. Co-authored-by: Thomas Jackson <tom.jackson314@gmail.com>
bors bot
added a commit
to CliMA/EnsembleKalmanProcesses.jl
that referenced
this issue
Jan 22, 2022
95: Pass explicit RNG object to methods involving stochasticity r=ilopezgp a=tsj5 This is a rebase and squash of the feature branch for PR #81 and contains no new commits. The PR message for #81 is reproduced below. <hr> The motivation for this PR comes from CliMA/CalibrateEmulateSample.jl#121; see discussion there. In order to have reproducible tests involving randomness, it's not sufficient to set the global seed for the default RNG: the default RNG algorithm used by Julia was [changed](https://julialang.org/blog/2021/11/julia-1.7-highlights/#new_rng_reproducible_rng_in_tasks) in version 1.7, and the [docs](https://docs.julialang.org/en/v1/stdlib/Random/#Reproducibility) for the Random package note that the same seed may give different random values in different versions. For this reason, [recommended practice](https://discourse.julialang.org/t/set-the-state-of-the-global-rng/12599/3) for reproducible tests is to pass an RNG object explicitly to all methods which need it. Julia convention is that the RNG is an optional first positional argument in functions which require it (see e.g. Random, StatsBase, Turning.jl...), and this is done here. This PR implements that change; in brief - An `rng` field is added to `EnsembleKalmanProcesses` (i.e. independent of `Process`, for all implementations). - An optional `rng` kwarg is added to `construct_initial_ensemble()`, used in lieu of `rng_seed` if provided. - An additional method for `construct_initial_ensemble()` takes the rng as the first argument, causing kwargs to be ignored. - All methods for `sample_distribution()` now accept an rng as an optional first argument. In all of the cases above, `Random.GLOBAL_RNG` is used by default if an RNG is not given: this PR is fully backwards compatible with `main`. In particular, tests are updated to use an explicit rng, but examples have been left unchanged. It may be desirable to update them as well. Co-authored-by: Thomas Jackson <tom.jackson314@gmail.com>
bors bot
added a commit
that referenced
this issue
Feb 4, 2022
122: [WIP] Use EKP.jl 0.2.0 & pass explicit RNG object where used r=odunbar a=tsj5 [*Edited 1 Feb 2022 in response to updates*] This PR updates CES to use the [0.2.0 release](https://github.com/CliMA/EnsembleKalmanProcesses.jl/releases/tag/v0.2.0) of EnsembleKalmanProcesses.jl (EKP). In addition to using changed module and method names, the PR adds support in CES for use of an explicit random number generator where needed; this was done in EKP [#81](CliMA/EnsembleKalmanProcesses.jl#81). [Recommended practice](https://discourse.julialang.org/t/set-the-state-of-the-global-rng/12599/3) for reproducible tests is to pass an RNG object explicitly to all methods which need it. It's not sufficient to set the global seed for the default RNG: the default RNG algorithm used by Julia was [changed](https://julialang.org/blog/2021/11/julia-1.7-highlights/#new_rng_reproducible_rng_in_tasks) in version 1.7. This will affect us in the future when we update CES to support Julia > 1.6 (#121): the [docs](https://docs.julialang.org/en/v1/stdlib/Random/#Reproducibility) for the Random package note that the same seed may give different random values in different versions. Changes made to the CES API in this PR are minor (which is why the RNG change is combined with the EKP update): - An `rng` field is added to `MCMC`; the constructor sets this to `Random.GLOBAL_RNG` by default. - `rng` is recognized as an optional first argument to `get_obs_sample`, used in lieu of the `rng_seed` kwarg if provided. All changes are fully backwards compatible. In particular, tests are updated to use an explicit rng, but examples have been left unchanged. It may be desirable to update them as well. Co-authored-by: Thomas Jackson <tom.jackson314@gmail.com>
bors bot
added a commit
that referenced
this issue
Feb 15, 2022
122: [WIP] Use EKP.jl 0.2.0 & pass explicit RNG object where used r=odunbar a=tsj5 [*Edited 1 Feb 2022 in response to updates*] This PR updates CES to use the [0.2.0 release](https://github.com/CliMA/EnsembleKalmanProcesses.jl/releases/tag/v0.2.0) of EnsembleKalmanProcesses.jl (EKP). In addition to using changed module and method names, the PR adds support in CES for use of an explicit random number generator where needed; this was done in EKP [#81](CliMA/EnsembleKalmanProcesses.jl#81). [Recommended practice](https://discourse.julialang.org/t/set-the-state-of-the-global-rng/12599/3) for reproducible tests is to pass an RNG object explicitly to all methods which need it. It's not sufficient to set the global seed for the default RNG: the default RNG algorithm used by Julia was [changed](https://julialang.org/blog/2021/11/julia-1.7-highlights/#new_rng_reproducible_rng_in_tasks) in version 1.7. This will affect us in the future when we update CES to support Julia > 1.6 (#121): the [docs](https://docs.julialang.org/en/v1/stdlib/Random/#Reproducibility) for the Random package note that the same seed may give different random values in different versions. Changes made to the CES API in this PR are minor (which is why the RNG change is combined with the EKP update): - An `rng` field is added to `MCMC`; the constructor sets this to `Random.GLOBAL_RNG` by default. - `rng` is recognized as an optional first argument to `get_obs_sample`, used in lieu of the `rng_seed` kwarg if provided. All changes are fully backwards compatible. In particular, tests are updated to use an explicit rng, but examples have been left unchanged. It may be desirable to update them as well. Co-authored-by: Thomas Jackson <tom.jackson314@gmail.com>
bors bot
added a commit
that referenced
this issue
Feb 16, 2022
122: [WIP] Use EKP.jl 0.2.0 & pass explicit RNG object where used r=tsj5 a=tsj5 [*Edited 1 Feb 2022 in response to updates*] This PR updates CES to use the [0.2.0 release](https://github.com/CliMA/EnsembleKalmanProcesses.jl/releases/tag/v0.2.0) of EnsembleKalmanProcesses.jl (EKP). In addition to using changed module and method names, the PR adds support in CES for use of an explicit random number generator where needed; this was done in EKP [#81](CliMA/EnsembleKalmanProcesses.jl#81). [Recommended practice](https://discourse.julialang.org/t/set-the-state-of-the-global-rng/12599/3) for reproducible tests is to pass an RNG object explicitly to all methods which need it. It's not sufficient to set the global seed for the default RNG: the default RNG algorithm used by Julia was [changed](https://julialang.org/blog/2021/11/julia-1.7-highlights/#new_rng_reproducible_rng_in_tasks) in version 1.7. This will affect us in the future when we update CES to support Julia > 1.6 (#121): the [docs](https://docs.julialang.org/en/v1/stdlib/Random/#Reproducibility) for the Random package note that the same seed may give different random values in different versions. Changes made to the CES API in this PR are minor (which is why the RNG change is combined with the EKP update): - An `rng` field is added to `MCMC`; the constructor sets this to `Random.GLOBAL_RNG` by default. - `rng` is recognized as an optional first argument to `get_obs_sample`, used in lieu of the `rng_seed` kwarg if provided. All changes are fully backwards compatible. In particular, tests are updated to use an explicit rng, but examples have been left unchanged. It may be desirable to update them as well. Co-authored-by: Thomas Jackson <tom.jackson314@gmail.com>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Problem description
Running unit tests on a fresh installation of the
master
branch results in 3 failures: two inGaussianProcessEmulator
, one inUtilities
.Steps to reproduce
Under
arm64
julia 1.7.0 on macOS 12.0.1,julia --project -e 'using Pkg; Pkg.test()'
givesCLI output
(package dependency output omitted)
I haven't been able to verify that the tests in question pass on 1.6.x (since I'm on macOS, this involves installing intel-architecture versions of not only julia but the python dependencies, and I'm running into trouble rebuilding PyCall/ScikitLearn to pick everything up correctly).
Speculative diagnosis
The failures appear to be RNG-related:
get_obs_sample
simply returning the wrong randomly selected observation.noise_learn=true
and comparisons with a large abs tolerance, so I'm suspecting the same cause. For what it's worth, the failing values aren't far outside tolerance.Julia 1.7 changed the default RNG algorithm, but beyond that, the docs state that minor version changes may change the sequence of numbers returned from a given random seed due to internal implementation changes.
Proposed fix
A quick and dirty fix would be to monkey-patch
Random
in the unit tests and assign a MersenneTwister object (the algorithm used in julia 1.6 and earlier) toRandom.GLOBAL_RNG
. I've tried several constructions withcopy!
andeval
, but haven't managed to do this.Recommended best practice is to add an RNG argument to all functions that need to generate random variates, which allows reproducibility in terms of both seed and algorithm used. This shouldn't be painful from an end-user point of view, as the RNG object can be carried around in the top-level configuration object used in each stage.
Of course, proper MCMC convergence means aggregate statistics should be independent of the RNG sequence: this is only to guarantee reproducible tests.
The text was updated successfully, but these errors were encountered: