Unnecessary call to recover_fields for restart files can cause crashes. #694

Closed
SamuelTrahanNOAA opened this issue Sep 8, 2023 · 0 comments · Fixed by #695 or ufs-community/ufs-weather-model#1893
Labels
bug Something isn't working

Comments

@SamuelTrahanNOAA
Contributor

Description

While troubleshooting ufs-community/ufs-weather-model#1882, @junwang-noaa and @DusanJovic-NOAA discovered an unnecessary call to recover_fields. It only needs to be called for history files, not restart files. Presently, it is called for both.
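
As a rough illustration of the intended behavior (a sketch in Python, not the model's actual Fortran code): the post-processing step would be skipped whenever the output is a restart file. The names `prepare_bundle` and `is_restart`, the dictionary layout, and the stub body of `recover_fields` are all hypothetical and exist only for this example.

```python
def recover_fields(bundle):
    """Stand-in for the real routine; here it only records that it ran."""
    bundle["recovered"] = True
    return bundle


def prepare_bundle(bundle, is_restart):
    """Post-process an output bundle before it is written.

    Assumption for this sketch: only history bundles need recover_fields;
    restart bundles are passed through unchanged.
    """
    if not is_restart:
        recover_fields(bundle)
    return bundle


history = prepare_bundle({"name": "history", "recovered": False}, is_restart=False)
restart = prepare_bundle({"name": "restart", "recovered": False}, is_restart=True)
print(history["recovered"], restart["recovered"])
```

With a guard like this, the restart path never reaches the code that trips on the corrupted longitudes.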

When 32-bit physics is enabled, the quilting restart will frequently abort on Hera with the GNU compiler in debug mode. It encounters a floating-point exception due to corrupted longitudes in recover_fields. This only happens for restart files. Removing the call eliminates that crash. Unfortunately, the model still crashes due to another, unknown bug.

To Reproduce:

What compilers/machines are you seeing this with?
Give explicit steps to reproduce the behavior.

  1. Run the conus13km_debug test with quilting restart and the GNU compiler on Hera. You will need NOAA-GFDL/GFDL_atmos_cubed_sphere#280 ("initialize two arrays and fix fortran coding error plus PRs #285 and #276") to avoid another crash.
  2. Keep running until it fails in recover_fields. (It may fail earlier, in InitializeAdvertise.)

Additional context

The broader problem of quilting restart failing for 32-bit physics is discussed here:
