Reproducibly set random number generators #528

dschlaep · 2022-09-06T13:48:09Z

- pull in SOILWAT2 branch "feature_pcg_seeding"
  * a random number sequence can now be exactly reproduced ("initstate" and "initseq"), see "Feature pcg seeding" SOILWAT2#327
  * see also "RandNorm() may not reproduce first random number" SOILWAT2#326
  * function `RandSeed()` now has two arguments, "initstate" and "initseq" -> update STEPWAT2 calls to `RandSeed()` with the new arguments
- new `set_all_rngs()` sets each STEPWAT2 random number generator to produce sequences of random numbers that are reproducible (if the user-provided "seed" is non-zero) and
  * unique among RNGs, iterations, years, and grid cells (most RNGs)
  * unique among RNGs, iterations, and years but identical among grid cells (weather generator RNG)
- a user-provided "seed" of zero produces non-reproducible random number sequences that do not coincide among RNGs, iterations, and grid cells

- both non-gridded and gridded mode exactly reproduce weather among runs if seed != 0; weather is not reproduced among runs if seed == 0
- note: gridded mode reproduces weather almost but not exactly among cells (this is not the intended behavior and requires further investigation)
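To illustrate why a pair of "initstate" and "initseq" values is enough to reproduce a sequence exactly, here is a minimal sketch of a PCG32-style generator in C. It follows the widely published minimal PCG design; it is not SOILWAT2's code, and the names `pcg32_t`, `pcg32_seed()`, `pcg32_next()` and `same_sequence()` are hypothetical. How `RandSeed()` wraps the generator internally is not shown in this PR, so treat the correspondence as an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Minimal PCG32-style generator state (illustrative; names are hypothetical). */
typedef struct {
    uint64_t state; /* current position within the sequence */
    uint64_t inc;   /* stream selector; always odd */
} pcg32_t;

/* Advance the generator and return the next 32-bit value. */
static uint32_t pcg32_next(pcg32_t *rng) {
    assert(rng != NULL);
    uint64_t old = rng->state;
    rng->state = old * 6364136223846793005ULL + rng->inc;
    uint32_t xorshifted = (uint32_t)(((old >> 18u) ^ old) >> 27u);
    uint32_t rot = (uint32_t)(old >> 59u);
    return (xorshifted >> rot) | (xorshifted << ((-rot) & 31u));
}

/* Seed with a starting state and a stream id.
   The same (initstate, initseq) pair always yields the same sequence. */
static void pcg32_seed(pcg32_t *rng, uint64_t initstate, uint64_t initseq) {
    assert(rng != NULL);
    rng->state = 0u;
    rng->inc = (initseq << 1u) | 1u;
    (void)pcg32_next(rng);
    rng->state += initstate;
    (void)pcg32_next(rng);
}

/* Return 1 if two seedings produce the same first n values, 0 otherwise. */
static int same_sequence(uint64_t s1, uint64_t q1,
                         uint64_t s2, uint64_t q2, int n) {
    pcg32_t a, b;
    pcg32_seed(&a, s1, q1);
    pcg32_seed(&b, s2, q2);
    for (int i = 0; i < n; i++) {
        if (pcg32_next(&a) != pcg32_next(&b)) {
            return 0;
        }
    }
    return 1;
}
```

Calling `same_sequence(7, 1, 7, 1, 100)` compares two generators seeded identically, whereas changing either the state or the stream argument selects a different sequence. This is the property that lets each RNG, iteration, year and grid cell get its own reproducible stream.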

dschlaep · 2022-09-06T13:51:31Z

@kpalmqui I finally got around to implementing the updates to the random number generators that we discussed a couple of weeks ago! It appears to work, except that in gridded mode the grid cells reproduce weather almost, but not exactly.

- SOILWAT2 commit 947ae2f1b8a67119ff70762be5115d0ada60d15c updated "weathsetup.in"
- changes are SOILWAT2-standalone but reading that file correctly is required for STEPWAT2

This includes
- `SW_WTH_init_run()` now also initializes yesterday's weather values

The problem was that gridded mode used the last values in a year of one cell as the first values of the next cell. This went mostly unnoticed, but when the last day contains precipitation, it affects the weather generator's behavior on the first day of the next cell.

Two issues caused this:
(i) SOILWAT2's `SW_WTH_init_run()` did not zero out yesterday's weather values; this was an issue only for STEPWAT2's gridded mode, which does not deconstruct/construct each SOILWAT2 run (fixed with commit 8a0a754).
(ii) STEPWAT2's `load_cell()` did not call `SW_CTL_init_run()` (and `SW_WTH_init_run()`) or an equivalent to prevent the carry-over of values from one cell to the next (fixed with this commit).

This now works as expected, i.e.,
* if seed != 0 (output is reproduced among runs)
** weather is exactly identical among runs and cells
** weather is different among years, iterations and seeds
* if seed == 0 (output cannot be reproduced among runs)
** weather is different among cells, years, iterations and runs

Ideally, however, a grid cell should continue with the state that it ended at during the previous year (and not be zeroed out).
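The carry-over described above can be sketched with a toy model in C. Everything here is hypothetical and only illustrates the mechanism: `weather_state_t`, `weather_init_run()` and the wet-day probabilities (0.6 and 0.2) are made-up stand-ins, not SOILWAT2's data structures or parameters.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for weather values that persist from day to day. */
typedef struct {
    double ppt_yesterday; /* precipitation of the previous day */
} weather_state_t;

/* Toy first-order rule: the chance of a wet day depends on whether
   yesterday was wet (the probabilities are arbitrary example values). */
static double wet_probability(const weather_state_t *w) {
    assert(w != NULL);
    return (w->ppt_yesterday > 0.0) ? 0.6 : 0.2;
}

/* Zero out carried-over values, in the spirit of initializing
   yesterday's weather at the start of a run. */
static void weather_init_run(weather_state_t *w) {
    assert(w != NULL);
    w->ppt_yesterday = 0.0;
}

/* Record the precipitation of the last day of a cell's year. */
static void end_year_with_ppt(weather_state_t *w, double ppt) {
    assert(w != NULL);
    w->ppt_yesterday = ppt;
}

/* Wet-day probability on the first day of the next cell,
   with or without a reset when that cell is loaded. */
static double first_day_probability(int reset_on_load,
                                    double last_ppt_prev_cell) {
    weather_state_t shared = {0.0}; /* one state reused for all cells */
    end_year_with_ppt(&shared, last_ppt_prev_cell);
    if (reset_on_load) {
        weather_init_run(&shared);
    }
    return wet_probability(&shared);
}
```

Without the reset, a wet last day in the previous cell changes the first-day probability of the next cell, so the two cells consume identical random numbers but produce different weather. With the reset, every cell starts from the same state.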

dschlaep · 2022-09-12T17:44:13Z

This now works as expected: i.e.,

* if seed != 0 (output is reproduced among runs)
** weather is exactly identical among runs and cells
** weather is different among years, iterations and seeds
* if seed == 0 (output cannot be reproduced among runs)
** weather is different among cells, years, iterations and runs

Script to check these expectations (the seed in model.in must be adjusted manually during interactive use, as noted in the comments):

make clean

#-------------------------------------------------------------------------------
#--- Test nongridded mode ------------------------------------------------------

#-------------------------------------------------------------------------------
#--- * if seed != 0 (output is reproduced among runs) ------
# Set model.in: 3 100 7 (niter, nyrs, seed)
make bint_testing_nongridded
cp -r testing.sagebrush.master/Stepwat_Inputs/Output testing.sagebrush.master/Stepwat_Inputs/Output_i3y100s7_r1
make bint_testing_nongridded


#--- ** weather is exactly identical among runs ------
# Expect no differences
diff -aqr testing.sagebrush.master/Stepwat_Inputs/Output testing.sagebrush.master/Stepwat_Inputs/Output_i3y100s7_r1


#--- ** weather is different among years ------
Rscript -e 'x <- read.csv("testing.sagebrush.master/Stepwat_Inputs/Output/bmassavg.csv")[, c("PPT", "Temp")]; apply(x, 2, sd) > 0'

#--- ** weather is different among iterations ------
Rscript -e 'x <- read.csv("testing.sagebrush.master/Stepwat_Inputs/Output/bmassavg.csv")[, c("StdDev", "StdDev.1")]; apply(x, 2, sd) > 0'


#--- ** weather is different among seeds ------
# Set model.in: 3 100 6 (niter, nyrs, seed) # or any seed other than 0 and 7
make bint_testing_nongridded

# Expect differences
Rscript -e 'x1 <- read.csv("testing.sagebrush.master/Stepwat_Inputs/Output/bmassavg.csv")[, c("PPT", "Temp")]; x0 <- read.csv("testing.sagebrush.master/Stepwat_Inputs/Output_i3y100s7_r1/bmassavg.csv")[, c("PPT", "Temp")]; !isTRUE(all.equal(x1, x0))'




#-------------------------------------------------------------------------------
#--- * if seed == 0 (output cannot be reproduced among runs) ------
# Set model.in: 3 100 0 (niter, nyrs, seed)
make bint_testing_nongridded
cp -r testing.sagebrush.master/Stepwat_Inputs/Output testing.sagebrush.master/Stepwat_Inputs/Output_i3y100s0_r1
make bint_testing_nongridded


#--- ** weather is different among years ------
Rscript -e 'x <- read.csv("testing.sagebrush.master/Stepwat_Inputs/Output/bmassavg.csv")[, c("PPT", "Temp")]; apply(x, 2, sd) > 0'

#--- ** weather is different among iterations ------
Rscript -e 'x <- read.csv("testing.sagebrush.master/Stepwat_Inputs/Output/bmassavg.csv")[, c("StdDev", "StdDev.1")]; apply(x, 2, sd) > 0'


#--- ** weather is different among runs ------
Rscript -e 'x1 <- read.csv("testing.sagebrush.master/Stepwat_Inputs/Output/bmassavg.csv")[, c("PPT", "Temp")]; x0 <- read.csv("testing.sagebrush.master/Stepwat_Inputs/Output_i3y100s0_r1/bmassavg.csv")[, c("PPT", "Temp")]; !isTRUE(all.equal(x1, x0))'




#-------------------------------------------------------------------------------
#--- Test gridded mode ---------------------------------------------------------

#-------------------------------------------------------------------------------
#--- * if seed != 0 (output is reproduced among runs) ------
# Set model.in: 3 100 7 (niter, nyrs, seed)
make bint_testing_gridded
cp -r testing.sagebrush.master/Output testing.sagebrush.master/Output_i3y100s7_r1
make bint_testing_gridded


#--- ** weather is exactly identical among runs and cells ------
# Expect no differences among runs
diff -aqr testing.sagebrush.master/Output testing.sagebrush.master/Output_i3y100s7_r1

# Expect no differences among cells
Rscript -e 'x <- lapply(seq_len(4), function(k) read.csv(paste0("testing.sagebrush.master/Output/g_bmassavg", k - 1, ".csv"))[, c("PPT", "Temp")]); sapply(seq_len(4), function(k) isTRUE(all.equal(x[[k]], x[[1]])))'


#--- ** weather is different among years ------
Rscript -e 'x <- read.csv("testing.sagebrush.master/Output/g_bmass_cell_avg.csv")[, c("PPT", "Temp")]; apply(x, 2, sd) > 0'

#--- ** weather is different among iterations ------
Rscript -e 'x <- lapply(seq_len(4), function(k) read.csv(paste0("testing.sagebrush.master/Output/g_bmassavg", k - 1, ".csv"))[, c("StdDev", "StdDev.1")]); sapply(x, function(xk) apply(xk, 2, sd) > 0)'


#--- ** weather is different among seeds ------
# Set model.in: 3 100 6 (niter, nyrs, seed) # or any seed other than 0 and 7
make bint_testing_gridded

# Expect differences
Rscript -e 'x1 <- read.csv("testing.sagebrush.master/Output/g_bmass_cell_avg.csv")[, c("PPT", "Temp")]; x0 <- read.csv("testing.sagebrush.master/Output_i3y100s7_r1/g_bmass_cell_avg.csv")[, c("PPT", "Temp")]; !isTRUE(all.equal(x1, x0))'



#-------------------------------------------------------------------------------
#--- * if seed == 0 (output cannot be reproduced among runs) ------
# Set model.in: 3 100 0 (niter, nyrs, seed)
make bint_testing_gridded
cp -r testing.sagebrush.master/Output testing.sagebrush.master/Output_i3y100s0_r1
make bint_testing_gridded


#--- ** weather is different among cells ------
Rscript -e 'x <- lapply(seq_len(4), function(k) read.csv(paste0("testing.sagebrush.master/Output/g_bmassavg", k - 1, ".csv"))[, c("PPT", "Temp")]); sapply(seq_len(4)[-1], function(k) !isTRUE(all.equal(x[[k]], x[[1]])))'


#--- ** weather is different among years ------
Rscript -e 'x <- read.csv("testing.sagebrush.master/Output/g_bmass_cell_avg.csv")[, c("PPT", "Temp")]; apply(x, 2, sd) > 0'

#--- ** weather is different among iterations ------
Rscript -e 'x <- lapply(seq_len(4), function(k) read.csv(paste0("testing.sagebrush.master/Output/g_bmassavg", k - 1, ".csv"))[, c("StdDev", "StdDev.1")]); sapply(x, function(xk) apply(xk, 2, sd) > 0)'


#--- ** weather is different among runs ------
Rscript -e 'x1 <- read.csv("testing.sagebrush.master/Output/g_bmass_cell_avg.csv")[, c("PPT", "Temp")]; x0 <- read.csv("testing.sagebrush.master/Output_i3y100s0_r1/g_bmass_cell_avg.csv")[, c("PPT", "Temp")]; !isTRUE(all.equal(x1, x0))'

- SOILWAT2 branch "feature_read_weather" isolated the handling of daily weather data and moved it from within the simulation loop to the overall setup process
- STEPWAT2 now needs to handle daily weather data itself; there are two basic options:
  i) follow SOILWAT2's new approach and generate daily weather for all years of a simulation run (for each grid cell and iteration); this would require that each grid cell stores and handles a local copy of `SW_Weather`
  ii) stick with the previous approach, which generated daily weather for each year
- this commit follows option (ii), i.e., it generates daily weather for each year
  ** new `_sxw_generate_weather()` handles the generation of daily weather for the current year
  ** `Env_Generate()` now calls `_sxw_generate_weather()` before running SOILWAT2 for the current year
  ** non-gridded mode needed to set RNGs for each year (so that `markov_rng` gets updated with fresh values for each year)

-> this commit satisfies expectations, i.e., (script based on #528 (comment))
- if seed != 0 (output is reproduced among runs)
  ** weather is exactly identical among runs and cells
  ** weather is different among years, iterations and seeds
- if seed == 0 (output cannot be reproduced among runs)
  ** weather is different among cells, years, iterations and runs
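One way to picture setting RNGs for each year is a function that maps the user seed and the run coordinates to an ("initstate", "initseq") pair. The sketch below is an assumption for illustration only: the bit layout, the `RNG_*` identifiers and `derive_seed_pair()` are hypothetical and are not taken from `set_all_rngs()`. The seed == 0 case, which would need a non-deterministic source, is left out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical identifiers for independent generators. */
enum { RNG_WEATHER = 0, RNG_OTHER = 1 };

/* Pair of values that would be handed to a two-argument seeding function. */
typedef struct {
    uint64_t initstate;
    uint64_t initseq;
} seed_pair_t;

/* Combine a non-zero user seed with run coordinates.
   Assumed layout (illustrative only):
     initstate = seed | iteration | year   -> differs among seeds, iterations, years
     initseq   = rng_id | cell key         -> differs among RNGs and cells
   The weather generator ignores the cell so that all cells share weather. */
static seed_pair_t derive_seed_pair(uint32_t seed, int rng_id,
                                    uint16_t iteration, uint16_t year,
                                    uint32_t cell) {
    assert(seed != 0u);  /* seed == 0 (non-reproducible) is not covered here */
    assert(rng_id >= 0);

    seed_pair_t p;
    uint32_t cell_key = (rng_id == RNG_WEATHER) ? 0u : cell + 1u;

    p.initstate = ((uint64_t)seed << 32)
                | ((uint64_t)iteration << 16)
                | (uint64_t)year;
    p.initseq = ((uint64_t)rng_id << 32) | (uint64_t)cell_key;
    return p;
}
```

Calling such a function at the start of every simulated year gives each year a fresh pair, which matches the stated expectation that weather differs among years and iterations while staying identical among cells for the weather generator.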

STEPWAT2 should meet the reproducibility expectations formulated in PR #528, which was merged into the main branch on Sep 18, 2022 with commit 7dce3c7 "Exactly reproduce random number sequences".

This new bash script automatically runs gridded and nongridded example runs with STEPWAT2 using different seeds and uses 'diff' and 'Rscript' to check the following:
* if seed != 0 (output is reproduced among runs)
** weather is exactly identical among runs and cells
** weather is different among years, iterations and seeds
* if seed == 0 (output cannot be reproduced among runs)
** weather is different among cells, years, iterations and runs

Note that the 'master' and 'Seed_Dispersal' branches use different naming schemes for output files -> these need to be manually adjusted with the variables `tag_gridded_biomass_cellk` (line 30) and `fname_gridded_biomass_meancell` (line 35); they are currently set to the 'Seed_Dispersal' scheme.

dschlaep requested a review from kpalmqui September 6, 2022 13:48

dschlaep added 3 commits September 12, 2022 09:44

Update sxw example input "weathsetup.in"

5f4d4a3

- SOILWAT2 commit 947ae2f1b8a67119ff70762be5115d0ada60d15c updated "weathsetup.in"
- changes are SOILWAT2-standalone but reading that file correctly is required for STEPWAT2

Update SOILWAT2 to latest commit on "feature_pcg_seeding"

8a0a754

This includes
- `SW_WTH_init_run()` now also initializes yesterday's weather values

kpalmqui approved these changes Sep 14, 2022


Set SOILWAT2 to v6.6.0 (that includes new RNG functionality)

dc73ffc

dschlaep merged commit 7dce3c7 into master Sep 19, 2022

dschlaep deleted the reproducible_rngs branch September 19, 2022 02:33

dschlaep mentioned this pull request Jul 24, 2023

Inconsistent behavior of generated weather among cells in gridded mode if random seed is 0 #553

Open
