Segmentation fault - invalid memory with FMS-2023.02-01: MOM_io_infra.F90 #2070

Closed
jkbk2004 opened this issue Jan 3, 2024 · 16 comments
Labels
bug Something isn't working

Comments

@jkbk2004
Collaborator

jkbk2004 commented Jan 3, 2024

Description

  1. The cpld_control_p8_gnu regression test fails on Hercules/GNU with FMS-2023.02-01. The failure was reported with the spack-stack 1.5.1 update. The develop branch runs OK with FMS-2023.01; the test fails only with FMS-2023.02-01 (installed in both spack-stack 1.5.1 and 1.6.0). FMS-2023.04 also runs OK with spack-stack 1.6.0.
  2. Error messages:
    151: #4  0x3521462 in __mom_io_infra_MOD_read_field_2d
    151: at /work/noaa/epic/jongkim/rt-2013/MOM6-interface/MOM6/config_src/infra/FMS2/MOM_io_infra.F90:905
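To pull the failing frame out of a log such as err.log, the frame and location lines in the error message above can be matched with two regular expressions. This is a minimal sketch, assuming every line is prefixed with the MPI rank and a colon (`151:`) and that frames follow the `#N 0xADDR in symbol` / `at file:line` layout seen here; the name `parse_frames` is hypothetical and not part of any UFS or MOM6 tooling.

```python
import re

# Assumed layout: "<rank>: #<n> 0x<addr> in <symbol>" followed by "<rank>: at <file>:<line>".
FRAME_RE = re.compile(r"^\s*(\d+):\s*#(\d+)\s+(0x[0-9a-fA-F]+)\s+in\s+(\S+)")
LOC_RE = re.compile(r"^\s*(\d+):\s*at\s+(.+):(\d+)\s*$")


def parse_frames(lines):
    """Return one dict per backtrace frame found in `lines` (hypothetical helper)."""
    frames = []
    for line in lines:
        m = FRAME_RE.match(line)
        if m:
            frames.append({
                "rank": int(m.group(1)),
                "frame": int(m.group(2)),
                "address": m.group(3),
                "symbol": m.group(4),
                "file": None,
                "line": None,
            })
            continue
        m = LOC_RE.match(line)
        # Attach the location to the most recent frame reported by the same rank.
        if (m and frames and frames[-1]["file"] is None
                and frames[-1]["rank"] == int(m.group(1))):
            frames[-1]["file"] = m.group(2)
            frames[-1]["line"] = int(m.group(3))
    return frames


sample = [
    "151: #4  0x3521462 in __mom_io_infra_MOD_read_field_2d",
    "151: at /work/noaa/epic/jongkim/rt-2013/MOM6-interface/MOM6/"
    "config_src/infra/FMS2/MOM_io_infra.F90:905",
]
for f in parse_frames(sample):
    print(f["rank"], f["symbol"], f["file"], f["line"])
```

Grouping by rank matters because output from many MPI tasks is interleaved in the same log, so a location line is only attached when it comes from the rank that reported the frame.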

To Reproduce:

  1. Run PR #2013 ("Spack-stack 1.5.1, ESMF 8.5.0, FMS 2023.02.01 + Remove Gaea C4 + Fix build system to allow CMAKE_<COMPILER>_FLAGS to be specified for submodules #2052") on Hercules/GNU
  2. for the case cpld_control_p8_gnu

Additional context

Output

jkbk2004 added the bug label Jan 3, 2024
@jkbk2004
Collaborator Author

jkbk2004 commented Jan 3, 2024

err.log

@jkbk2004
Collaborator Author

jkbk2004 commented Jan 3, 2024

@laurenchilutti @bensonr @junwang-noaa @jiandewang @DeniseWorthen I am wondering whether any update is needed on the MOM6 side, in the lines that use fms_io, to work with FMS-2023.02-01?

@jkbk2004
Collaborator Author

jkbk2004 commented Jan 3, 2024

@junwang-noaa
Collaborator

@jkbk2004 My understanding is that the gnu compiler issue is known in fms 2023.02.01/2023.03 and is fixed in fms 2023.04. Can we comment out the failing gnu tests and turn them back on when we update the model to use spack-stack 1.6.0? Thanks

@jkbk2004
Collaborator Author

jkbk2004 commented Jan 3, 2024

@junwang-noaa That makes sense, since fms-2023.04 runs OK on hercules/gnu. We can turn off the failed gnu cases and move on.
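The version picture reported so far in this thread can be written down as a small lookup for deciding which GNU cases to switch off. This is only a sketch of what has been stated here for Hercules/GNU (2023.01 runs, 2023.02.01 fails, 2023.03 is described as affected, 2023.04 runs); the names `FMS_GNU_STATUS`, `normalize`, and `gnu_tests_enabled` are hypothetical and are not part of the regression test system.

```python
# Status as reported in this thread for Hercules/GNU; not an authoritative list.
FMS_GNU_STATUS = {
    "2023.01": "ok",
    "2023.02.01": "fails",
    "2023.03": "fails",  # described as affected; not tested in this issue
    "2023.04": "ok",
}


def normalize(version):
    """Accept spellings such as 'FMS-2023.02-01' or 'fms 2023.04'."""
    v = version.lower()
    if v.startswith("fms"):
        v = v[3:]
    return v.lstrip("-_ ").replace("-", ".")


def gnu_tests_enabled(fms_version):
    """Return True only for versions reported to run; unknown versions return False."""
    return FMS_GNU_STATUS.get(normalize(fms_version)) == "ok"


for v in ("FMS-2023.01", "FMS-2023.02-01", "fms 2023.04"):
    print(v, gnu_tests_enabled(v))
```

Treating unknown versions as disabled is a conservative choice for the sketch: a test is only turned back on once someone has reported a successful run.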

@jiandewang
Collaborator

see https://github.com/jiandewang/MOM6/blob/dev/emc/config_src/infra/FMS2/MOM_io_infra.F90#L902-L906
I think this version of the fms code had a bug when reading fixed files, which don't have a time level and dimension.

@jkbk2004
Collaborator Author

jkbk2004 commented Jan 3, 2024

NOAA-GFDL/FMS#1254

@jkbk2004 jkbk2004 mentioned this issue Jan 12, 2024
@junwang-noaa
Collaborator

@climbfuji I want to confirm with you that the failed gnu tests listed in this issue were able to run when a newer gnu compiler (v12) was used, so this is not an fms 2023.02.01 issue. Is that right?

@climbfuji
Collaborator

> @climbfuji I want to confirm with you that the failed gnu tests listed in this issue were able to run when a newer gnu compiler (v12) was used, so this is not an fms 2023.02.01 issue. Is that right?

My recollection is poor, but from the comments above your assumption sounds about right.

@jkbk2004
Collaborator Author

@junwang-noaa @BrianCurtis-NOAA @DeniseWorthen @zach1221 @FernandoAndrade-NOAA I take my words back. There is still a seg fault issue with the Hercules/gnu cpld cases. It's related to the MOM6 fms_io call. I agree the issue will be resolved with the new fms version.

@junwang-noaa
Collaborator

@RatkoVasic-NOAA Is spack-stack 1.6.0 ready on hercules and hera? If yes, @jkbk2004 would you please confirm this is resolved? Thanks

@RatkoVasic-NOAA
Collaborator

> @RatkoVasic-NOAA Is spack-stack 1.6.0 ready on hercules and hera? If yes, @jkbk2004 would you please confirm this is resolved? Thanks

@junwang-noaa
Hera: /scratch1/NCEPDEV/nems/role.epic/spack-stack/spack-stack-1.6.0/envs/unified-env-rocky8/install/modulefiles/Core
Hercules: /work/noaa/epic/role-epic/spack-stack/hercules/spack-stack-1.6.0/envs/unified-env/install/modulefiles/Core

@jkbk2004
Collaborator Author

> > @RatkoVasic-NOAA Is spack-stack 1.6.0 ready on hercules and hera? If yes, @jkbk2004 would you please confirm this is resolved? Thanks
>
> @junwang-noaa
> Hera: /scratch1/NCEPDEV/nems/role.epic/spack-stack/spack-stack-1.6.0/envs/unified-env-rocky8/install/modulefiles/Core
> Hercules: /work/noaa/epic/role-epic/spack-stack/hercules/spack-stack-1.6.0/envs/unified-env/install/modulefiles/Core

I think we will be able to close this issue when we update to spack-stack 1.6. Tests run OK on hera and other RDHPCS platforms with the fms 2023.04 update in #2093.

@junwang-noaa
Collaborator

@jkbk2004 have you turned those tests back on and run them successfully? Just want to confirm.

@jkbk2004
Collaborator Author

> @jkbk2004 have you turned those tests back on and run them successfully? Just want to confirm.

Yes, I checked again. It ran OK on hercules at /work2/noaa/stmp/jongkim/stmp/jongkim/FV3_RT/rt_1331586/cpld_control_p8_gnu. On hera, we still have the OSC pt2pt issue with gnu that we found during the Rocky8 migration.

@junwang-noaa
Collaborator

cpld_control_p8 was turned on in PR#2093. The issue is closed.
