Checkpoint the lfric_apps random seed #427
Steve Mullerworth (stevemullerworth) wants to merge 6 commits into MetOffice:main from
Conversation
This branch is developed from stable, so the test won't work anyway until head (including the checkpointing of the Stochastic Physics fields) is merged in. It could be done once sci/tech is passed, or as a separate PR. The change adds an nrun/crun test for ral3-seuk 32-bit, which I think uses random numbers.
Ah yes, that makes sense - I think it would be worth doing that, either on this PR when merging up to main, or a separate PR, whichever is easier.
Your CLA signature was found on the base branch, but you appear to have modified the CONTRIBUTORS.md file in this PR. Please do not edit the CONTRIBUTORS.md file. If you have already signed the CLA, revert changes to the file and your signature will be picked up.
PR Summary
Sci/Tech Reviewer: allynt
Code Reviewer: Pierre Siddall (@Pierre-siddall)
Several lfric_atm configurations use random number generators. This PR improves checkpoint/restart for the random seed to support bit-reproducibility across checkpoint/restart boundaries.
The XIOS API does not support writing integers to the checkpoint dump. Previously, the random seed was converted to a real type for checkpointing (using the io_value_type). However, io_value_type is hard-wired to use r_def kind, which offers insufficient precision for storing a random number seed.
This change uses a new integer_io_value_type developed in lfric_core (linked: MetOffice/lfric_core#326), which checkpoints integer values regardless of real precision settings.
The change affects KGOs of CRUNs of configurations that use random numbers.
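The precision problem described above can be illustrated with a minimal Python sketch (this is not LFRic code). It assumes a 32-bit integer seed and a build in which the real kind used for I/O is 32-bit. The function names are hypothetical, and the round trip through `struct` only stands in for writing a value to a checkpoint dump and reading it back.

```python
import struct


def roundtrip_via_real32(seed: int) -> int:
    """Store an integer as a 32-bit real, then read it back as an integer."""
    packed = struct.pack("<f", float(seed))
    return int(struct.unpack("<f", packed)[0])


def roundtrip_via_real64(seed: int) -> int:
    """Store an integer as a 64-bit real, then read it back as an integer."""
    packed = struct.pack("<d", float(seed))
    return int(struct.unpack("<d", packed)[0])


# A 32-bit real has a 24-bit significand, so integers above 2**24
# are not all exactly representable.
small_seed = 12345
large_seed = 2**31 - 1  # largest positive 32-bit signed integer

for seed in (small_seed, large_seed):
    print(
        seed,
        "32-bit real preserves it:", roundtrip_via_real32(seed) == seed,
        "64-bit real preserves it:", roundtrip_via_real64(seed) == seed,
    )
```

Under these assumptions, a seed survives the real-valued round trip only when it happens to fit in the significand, which is why checkpointing it as an integer is more robust than relying on the real precision of the build.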
The change enabled ral3-seuk to bit-compare across checkpoint/restart boundaries, so nrun/crun comparison was enabled for this test. (It requires an extra 2:30-minute run of a 16-node job to generate the long-run checksum to compare with the matching nrun+crun.)
Other configurations with CRUNs are not, as yet, fully fixed by this change, as they use other arrays whose evolution depends on the random seed: PR #383 deals with some clim_gal9 stochastic physics SKEB and SPT schemes (where nrun/crun comparison is possible for 64-bit configurations). Issue #426 refers to the stochastic physics random parameter scheme.
MetOffice/lfric_core#326
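The nrun/crun comparison mentioned above can be sketched with a toy model in Python (again, not LFRic code). The generator, the "field" update and all function names are hypothetical stand-ins: a single long run is compared with a first run (nrun) plus a continuation run (crun) restarted from a checkpointed seed, once with the seed stored exactly as an integer and once squeezed through an assumed 32-bit real.

```python
import struct

MODULUS = 2**31 - 1   # toy multiplicative congruential generator
MULTIPLIER = 48271


def advance(seed, field, nsteps):
    """Advance the toy model: each step draws a number and updates the field."""
    for _ in range(nsteps):
        seed = (MULTIPLIER * seed) % MODULUS
        field = (field + seed) % MODULUS
    return seed, field


def checkpoint_as_integer(seed):
    """Seed written and re-read exactly."""
    return seed


def checkpoint_as_real32(seed):
    """Seed written as a 32-bit real and converted back (assumed lossy path)."""
    return int(struct.unpack("<f", struct.pack("<f", float(seed)))[0])


def nrun_crun(seed, field, nsteps, checkpoint):
    """Run nsteps (nrun), checkpoint the seed, then run nsteps more (crun)."""
    seed, field = advance(seed, field, nsteps)
    seed = checkpoint(seed)
    return advance(seed, field, nsteps)


def compare_runs(start_seed, nsteps):
    """Return (integer checkpoint matches long run, real32 checkpoint matches)."""
    long_run = advance(start_seed, 0, 2 * nsteps)
    exact = nrun_crun(start_seed, 0, nsteps, checkpoint_as_integer)
    lossy = nrun_crun(start_seed, 0, nsteps, checkpoint_as_real32)
    return long_run == exact, long_run == lossy


print(compare_runs(20240427, 100))
```

In this toy, the integer checkpoint always reproduces the long run, while the real-valued checkpoint reproduces it only if the seed at the restart point happens to be exactly representable. It also hints at why configurations with other seed-dependent arrays need those arrays checkpointed as well: anything that evolves with the random numbers has to be restored identically.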
Code Quality Checklist
Testing
trac.log
Test Suite Results - lfric_apps - chkpt_seed_review/run1
Suite Information
Task Information
✅ succeeded tasks - 1513
Security Considerations
Performance Impact
AI Assistance and Attribution
Documentation
PSyclone Approval
Sci/Tech Review
(Please alert the code reviewer via a tag when you have approved the SR)
Code Review