compile GSI on Hera with Rocky 8 #710

Closed
hu5970 opened this issue Mar 8, 2024 · 19 comments · Fixed by #715

Comments

@hu5970
Collaborator

hu5970 commented Mar 8, 2024

Hera/Jet OS will be updated to Rocky 8 soon.

I tried compiling GSI on a Hera Rocky 8 node and got this error message:

[ 95%] Built target gsi_fortran_obj
[ 95%] Linking Fortran static library libgsi.a
/usr/bin/ar: Relink `/apps/oneapi/compiler/2022.0.2/linux/compiler/lib/intel64_lin/libimf.so' with `/lib64/libm.so.6' for IFUNC symbol `sinf'
Error running link command: Segmentation fault
make[2]: *** [src/gsi/CMakeFiles/gsi.dir/build.make:1336: src/gsi/libgsi.a] Error 1
make[2]: *** Deleting file 'src/gsi/libgsi.a'
make[1]: *** [CMakeFiles/Makefile2:212: src/gsi/CMakeFiles/gsi.dir/all] Error 2
make: *** [Makefile:146: all] Error 2
@hu5970
Collaborator Author

hu5970 commented Mar 8, 2024

There are new spack-stack installations for Rocky 8 on Hera and Jet.
ufs-community/ufs-weather-model#2167

Could we have "gsi-addon-dev" for Rocky 8 as soon as possible?

@hu5970
Collaborator Author

hu5970 commented Mar 8, 2024

@DavidHuber-NOAA @RussTreadon-NOAA All of Hera will move to Rocky 8 on 03/12/2024. Could you make a spack-stack-1.5.1/envs/gsi-addon for Rocky 8?

@RussTreadon-NOAA
Contributor

@hu5970 , GSI develop @ fca6bea (the current head) builds on Hera Rocky8 nodes.

Here's what I did

  • clone the current head of GSI develop
  • update modulefiles/gsi_hera.intel.lua to use rocky8 modules
  • execute ush/build.sh on hfe10 (Rocky-8 node)
  • configure the ctests to run the same rocky8 executable for updat and contrl

6 of the 7 ctests passed. hafs_4denvar_glbens failed due to

The runtime for hafs_4denvar_glbens_loproc_updat is 325.055592 seconds.  This has exceeded maximum allowable threshold time of 320.512357 seconds,
resulting in Failure time-thresh of the regression test.

A check of gsi.x wall times shows

hafs_4denvar_glbens_hiproc_contrl/stdout:The total amount of wall time                        = 234.970212
hafs_4denvar_glbens_hiproc_updat/stdout:The total amount of wall time                        = 231.002190
hafs_4denvar_glbens_loproc_contrl/stdout:The total amount of wall time                        = 291.374870
hafs_4denvar_glbens_loproc_updat/stdout:The total amount of wall time                        = 325.055592

Remember that updat and contrl ran the same gsi.x. The wall time variability is due to system issues (compute, network, disk), not gsi.x.
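
A minimal sketch of the arithmetic behind this time-thresh failure, in Python. The wall times are copied from the stdout lines above. The 10% tolerance is an assumption: it is inferred from the fact that 1.1 × 291.374870 matches the reported threshold of 320.512357 seconds, and is not taken from the regression test scripts. The names `wall_times`, `TOLERANCE`, and `check` are hypothetical.

```python
# Wall times in seconds, copied from the stdout excerpts in this comment.
wall_times = {
    "hiproc": {"contrl": 234.970212, "updat": 231.002190},
    "loproc": {"contrl": 291.374870, "updat": 325.055592},
}

# ASSUMPTION: updat may be at most 10% slower than contrl. This value is
# inferred from the numbers in the failure message, not read from the ctest code.
TOLERANCE = 1.10


def check(case):
    """Compare the updat wall time with the contrl-based threshold for one case."""
    contrl = wall_times[case]["contrl"]
    updat = wall_times[case]["updat"]
    threshold = contrl * TOLERANCE
    return {
        "threshold": threshold,
        "relative_change": (updat - contrl) / contrl,
        "passed": updat <= threshold,
    }


for case in wall_times:
    result = check(case)
    print(
        f"{case}: threshold={result['threshold']:.6f}s "
        f"change={result['relative_change']:+.1%} passed={result['passed']}"
    )
```

With these inputs the loproc updat run is about 11.6% slower than contrl even though both ran the same gsi.x, while the hiproc pair differs by under 2%. That spread is what you would expect from system noise rather than from a code change.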

@DavidHuber-NOAA
Collaborator

@hu5970: @HenryWinterbottom-NOAA is working on upgrading the global workflow, including the GSI, to Rocky8 on Hera. The overarching issue is NOAA-EMC/global-workflow#2329.

@RussTreadon-NOAA
Contributor

@hu5970 , here's the change I made to modulefiles/gsi_hera.intel.lua

-prepend_path("MODULEPATH", "/scratch1/NCEPDEV/nems/role.epic/spack-stack/spack-stack-1.6.0/envs/gsi-addon-dev/install/modulefiles/Core")
+prepend_path("MODULEPATH", "/scratch1/NCEPDEV/nems/role.epic/spack-stack/spack-stack-1.6.0/envs/gsi-addon-dev-rocky8/install/modulefiles/Core")

I simply added -rocky8 to the gsi-addon-dev path. Make the same change in your GSI clone of develop. The build should run to completion and ctests pass.
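
The same one-line change can be sketched as a small Python helper, for anyone who wants to apply it to several clones. This is only an illustration: `use_rocky8_env` and the two constants are hypothetical names, and the helper works on the modulefile text as a string rather than touching any file.

```python
# Environment directory names as they appear in the MODULEPATH line above.
OLD_ENV = "/envs/gsi-addon-dev/"
NEW_ENV = "/envs/gsi-addon-dev-rocky8/"


def use_rocky8_env(modulefile_text):
    """Return the modulefile text with the spack-stack env switched to the Rocky 8 build.

    The trailing slash in OLD_ENV keeps the replacement idempotent: text that
    already points at gsi-addon-dev-rocky8 is left unchanged.
    """
    return modulefile_text.replace(OLD_ENV, NEW_ENV)


line = (
    'prepend_path("MODULEPATH", "/scratch1/NCEPDEV/nems/role.epic/spack-stack/'
    'spack-stack-1.6.0/envs/gsi-addon-dev/install/modulefiles/Core")'
)
print(use_rocky8_env(line))
```

To use it on a real clone, read modulefiles/gsi_hera.intel.lua into a string, pass it through the helper, and write the result back.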

@hu5970
Collaborator Author

hu5970 commented Mar 8, 2024

@RussTreadon-NOAA Thanks, I am testing "gsi-addon-dev-rocky8" with the current GSI develop now. I will test RRFS with "gsi-addon-dev-rocky8" and let you know how it goes.

@hu5970
Collaborator Author

hu5970 commented Mar 8, 2024

@RussTreadon-NOAA I can successfully compile GSI with
prepend_path("MODULEPATH", "/scratch1/NCEPDEV/nems/role.epic/spack-stack/spack-stack-1.6.0/envs/gsi-addon-dev-rocky8/install/modulefiles/Core")

But I am having trouble using it to compile ufs-srweather-model and UPP. The ufs-srweather-model is still using spack-stack-1.5.1. Could David install "gsi-addon-dev-rocky8" with spack-stack-1.5.1?

The UPP is using "spack-stack-1.6.0" and I can compile UPP with Rocky 8 version from "unified-env-rocky8".
But the "gsi-addon-dev-rocky8" gave the following error message:

/lib64/libc.so.6: version `GLIBC_2.25' not found (required by /scratch1/NCEPDEV/nems/role.epic/spack-stack/spack-stack-1.6.0/envs/gsi-addon-dev-rocky8/install/intel/2021.5.0/openssl-3.1.3-2qax7pk/lib64/libcrypto.so.3)

I guess some library is missing from "gsi-addon-dev-rocky8"?
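
The message itself says that the C library on the node where the command ran is older than 2.25. Stock CentOS 7 ships glibc 2.17 and Rocky 8 ships glibc 2.28, so one possible explanation (an assumption, not confirmed in this thread) is that the Rocky 8 build of openssl was picked up on a node still running the older OS. A minimal Python sketch to check the node's glibc against the required version follows; `parse_version` and `glibc_satisfies` are hypothetical helper names.

```python
import platform

REQUIRED_GLIBC = "2.25"  # version named in the error message above


def parse_version(text):
    """Turn a dotted version such as '2.28' into a comparable tuple (2, 28)."""
    return tuple(int(part) for part in text.split("."))


def glibc_satisfies(required, available):
    """True when the available glibc is at least the required version."""
    return parse_version(available) >= parse_version(required)


# platform.libc_ver() returns ('', '') when the C library cannot be identified.
lib, version = platform.libc_ver()
if lib == "glibc" and version:
    ok = glibc_satisfies(REQUIRED_GLIBC, version)
    print(f"node glibc {version}; satisfies GLIBC_{REQUIRED_GLIBC}: {ok}")
else:
    print("could not determine the glibc version on this node")
```

Running this on both the login node and a compute node would show whether the two are on different OS images.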

@RussTreadon-NOAA
Contributor

I see gsi-addon-env-rocky8 in /scratch1/NCEPDEV/nems/role.epic/spack-stack/spack-stack-1.5.1/envs/

Hera(hfe10):/scratch1/NCEPDEV/da/Russ.Treadon/git/GDASApp/feature_rocky8/build$ ls -l /scratch1/NCEPDEV/nems/role.epic/spack-stack/spack-stack-1.5.1/envs/
total 32
drwxr-sr-x 7 role.epic nems 4096 Mar  6 20:09 fms-test-mar24
drwxr-sr-x 7 role.epic nems 4096 Mar  5 20:10 fms-test-mar24-rocky8
drwxr-sr-x 7 role.epic nems 4096 Nov 14 20:09 gsi-addon
drwxr-sr-x 7 role.epic nems 4096 Feb 14 16:49 gsi-addon-env-rocky8
drwxr-sr-x 7 role.epic nems 4096 Nov 16 19:27 gsi-addon-w3emc
drwxr-sr-x 7 role.epic nems 4096 Nov 16 17:52 gsi-w3emc-debug
drwxr-sr-x 6 role.epic nems 4096 Nov 17 19:54 unified-env
drwxr-sr-x 6 role.epic nems 4096 Feb 14 16:02 unified-env-rocky8

As you note, a library could be missing from 1.5.1 gsi-addon-env-rocky8

@hu5970
Collaborator Author

hu5970 commented Mar 8, 2024

@RussTreadon-NOAA Thanks for finding "gsi-addon-env-rocky8".

I just tried it:
"/scratch1/NCEPDEV/nems/role.epic/spack-stack/spack-stack-1.5.1/envs/gsi-addon-env-rocky8/install/modulefiles/Core"

It does seem to be missing some libraries:

Lmod has detected the following error: These module(s) or extension(s) exist but cannot be loaded as requested: "crtm/2.4.0.1", "python/3.11.6", "sp/2.5.0",
"netcdf-fortran/4.6.1", "prod_util/2.1.1"

@RussTreadon-NOAA
Contributor

@hu5970, I, like you, am a spack-stack user. Please reach out to the spack-stack team. I don't think the spack-stack team monitors GSI issues. Did you open a spack-stack issue to report this problem?

@hu5970
Collaborator Author

hu5970 commented Mar 8, 2024

@RussTreadon-NOAA Do you know who is responsible for "gsi-addon-env-rocky8"? Does the spack-stack team also take care of "gsi-addon-env-rocky8"?

@DavidHuber-NOAA
Collaborator

@hu5970 I checked out an older version of the GSI that used spack-stack v1.5.1 (hash 44a8f59ef) on a Rocky8 node (hfe11) then updated gsi_hera.intel.lua to point to the gsi-addon-env-rocky8 environment:

-prepend_path("MODULEPATH", "/scratch1/NCEPDEV/nems/role.epic/spack-stack/spack-stack-1.5.1/envs/gsi-addon/install/modulefiles/Core")
+prepend_path("MODULEPATH", "/scratch1/NCEPDEV/nems/role.epic/spack-stack/spack-stack-1.5.1/envs/gsi-addon-env-rocky8/install/modulefiles/Core")

After this, I was able to compile this version of the GSI.

The gsi-addon-env-rocky8 environment was installed by EPIC as part of this issue: JCSDA/spack-stack#1005. Though I developed the gsi-addon environment, the spack-stack community handles the installations. If you are still experiencing issues with it, then either commenting on that issue or opening a new one within the spack-stack repository would make sense.

@RussTreadon-NOAA
Contributor

@hu5970 , I am a spack-stack user. I do not know who is responsible for gsi-addon-env-rocky8.

As Hera admins have been telling users, they created a Rocky 8 transition document. Here is the link from their emails: https://docs.google.com/document/d/1oLqDkslD-99-zpkKD4MtKMmqdm2D4oAo1l7gHHfvKBM/edit?usp=sharing

I've been using this to guide me. When I get stuck I start by sending an email to rdhpcs.hera.help@noaa.gov with Rocky 8: in the subject line.

@hu5970
Collaborator Author

hu5970 commented Mar 8, 2024

@RussTreadon-NOAA Thanks, I just sent a help request to the Hera helpdesk about adding "gsi-addon" for Rocky 8.

@RussTreadon-NOAA
Contributor

@DavidHuber-NOAA , what's the plan for the Hera Rocky 8 upgrade in terms of what needs to change in various apps? Specifically for the GSI should we

  1. open a GSI PR to change the path to spack-stack as done in the test above?
  2. have the spack-stack team move the rocky8 stacks to the existing names when the Rocky 8 transition occurs? This way we don't change anything in the GSI.

Either approach requires precise timing. If we update too soon, the GSI won't build on non-Rocky 8 nodes. If we update too late, the GSI won't build on Rocky 8 nodes.

Hence my question about strategy and timing.
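
To make the timing problem concrete, here is a hedged Python sketch of the decision a build script would face on a mixed cluster: pick the environment name from the node's OS. This is not one of the two options above and is not something GSI does; it assumes both environments stay installed side by side, and `parse_os_release` and `pick_env` are hypothetical names. It reads the standard os-release `KEY=value` format.

```python
def parse_os_release(text):
    """Parse os-release style KEY=value lines into a dict, stripping quotes."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        fields[key] = value.strip().strip('"').strip("'")
    return fields


def pick_env(os_release_text):
    """Return the spack-stack env name that matches the node's OS.

    ASSUMPTION: Rocky 8 nodes report ID=rocky and a VERSION_ID starting with 8;
    anything else falls back to the original gsi-addon-dev environment.
    """
    fields = parse_os_release(os_release_text)
    major = fields.get("VERSION_ID", "").split(".")[0]
    if fields.get("ID") == "rocky" and major == "8":
        return "gsi-addon-dev-rocky8"
    return "gsi-addon-dev"


# Sample content for illustration; on a real node read /etc/os-release instead.
sample = 'NAME="Rocky Linux"\nID="rocky"\nVERSION_ID="8.9"\n'
print(pick_env(sample))
```

A hard-coded path, as in the test above, avoids this branching but is only correct on one side of the transition.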

@DavidHuber-NOAA
Collaborator

@RussTreadon-NOAA @HenryWinterbottom-NOAA The current installations of spack-stack will remain the same (so gsi-addon and gsi-addon-rocky8 will continue to coexist after the transition). Though I might suggest to the spack-stack team that they remove the CentOS versions once the transition is complete to reduce confusion. Renaming installations will, unfortunately, break the module files (and possibly other things).

So, that's a long-winded way to say we should probably open up a PR ASAP. Since Hera is now 2/3 Rocky8, I would vote for transitioning the GSI to Rocky8 sooner rather than later.

@RussTreadon-NOAA
Contributor

Thank you @DavidHuber-NOAA for the guidance.

@hu5970 , would you mind opening a GSI PR to update modulefiles/gsi_hera.intel.lua?

@DavidHuber-NOAA
Collaborator

A new openmpi package needs to be built on Hera-Rocky8 to support slurm compatibility on that system. The system admins will build it, after which GNU compilation can be tested.

@RussTreadon-NOAA
Contributor

Thank you, @DavidHuber-NOAA , for the update. We'll hit pause on GSI Hera Rocky 8 gnu testing until the software infrastructure is in place.
