Add Hercules support for GSI #574

Closed
aerorahul opened this issue May 11, 2023 · 12 comments

Comments

@aerorahul
Contributor

Though Hercules shares a file system with Orion, its system specs and software stack are significantly different. Therefore, to enable running the GSI on Hercules, several of the RT configs/scripts need to be updated. A new gsi_hercules.intel.lua file is also required.

Description

MSU Hercules was recently made available for NOAA R&D use.
The GSI needs to be ported to Hercules.

Solution

Update all necessary files to enable GSI functionality on Hercules.
A spack-stack is available on Hercules and is accessible via:

module use /work/noaa/epic-ps/role-epic-ps/spack-stack/modulefiles
@BijuThomas-NOAA

Has anybody tried running GSI with spack-stack (stack-intel/2021.7.1) on Hercules? It compiled successfully on Hercules, but I have issues at run time.

@jswhit
Contributor

jswhit commented Jun 28, 2023

I have compiled and run it with spack-stack 1.4.0 successfully on Orion, but on Hercules I get a segfault in CRTM (JCSDA/spack-stack#643).

@RussTreadon-NOAA
Contributor

Any updates on this issue?

Adding @dtkleist to keep him in the loop.

@aerorahul
Contributor Author

> Any updates on this issue?
>
> Adding @dtkleist to keep him in the loop.

None at this time.
A lot of progress was made, thanks to @DavidHuber-NOAA, on debugging, building, and running the GSI (and GSI-related components) with spack-stack (and updated compilers and libraries).

@DavidHuber-NOAA
Collaborator

@RussTreadon-NOAA There are a couple of upstream dependencies I need to get in first. First is a spack-stack version 1.5.0 build with gsi-ncdiag/1.1.2 and bufr/11.7.0. I've requested that build in JCSDA/spack-stack#841 on Hera and Orion to start. Once tested, it should be installed everywhere, including Hercules. Secondly, the UFS is currently in the process of upgrading to spack-stack/1.5.0 (ufs-community/ufs-weather-model#1920). They are having some trouble finding an open MPI solution that will work for Hercules. This would likely be an issue for the GSI as well, so waiting for that to resolve first would likely save a lot of work.

@RussTreadon-NOAA
Contributor

Thank you, @aerorahul, for the update. Thank you, @DavidHuber-NOAA, for getting us this far. We can't move forward until the items you mention are resolved.

@junwang-noaa

junwang-noaa commented Oct 30, 2023

@DavidHuber-NOAA @RussTreadon-NOAA we are committing PR#1920. The issue on Hercules is resolved. Thanks

@RussTreadon-NOAA
Contributor

Thank you @junwang-noaa for letting us know.

@junwang-noaa

The model is updated. You can see the Hercules module files here:

https://github.com/ufs-community/ufs-weather-model/blob/develop/modulefiles/ufs_hercules.intel.lua

Thanks

@DavidHuber-NOAA
Collaborator

Thank you @junwang-noaa!

DavidHuber-NOAA pushed a commit to DavidHuber-NOAA/GSI that referenced this issue Nov 9, 2023
DavidHuber-NOAA added a commit to DavidHuber-NOAA/GSI that referenced this issue Nov 9, 2023
DavidHuber-NOAA added a commit to DavidHuber-NOAA/GSI that referenced this issue Nov 9, 2023
@DavidHuber-NOAA
Collaborator

I ran the regression tests successfully on Hercules. Note that this run establishes the baseline, so the branch was compared against itself. All tests passed except hafs_4denvar_glbens, which failed for timethresh. This is fascinating given that the same test has had erratic results on Orion (see here and here). Perhaps there is an issue with the HAFS tests on Orion/Hercules, or a periodic issue with spack-stack. This should probably be investigated further, though I think it should get its own issue. I will also note that I ran these tests on /work2, which has shown more consistent results on Orion.

Below are the runtimes for all tests:

| test | hi/lo proc | branch (same for both) | run time |
|---|---|---|---|
| global_4denvar | hiproc | spack-stack | 281.424148 |
| global_4denvar | hiproc | spack-stack | 281.449064 |
| global_4denvar | loproc | spack-stack | 350.825092 |
| global_4denvar | loproc | spack-stack | 356.102541 |
| global_enkf | hiproc | spack-stack | 69.015155 |
| global_enkf | hiproc | spack-stack | 65.570708 |
| global_enkf | loproc | spack-stack | 108.236097 |
| global_enkf | loproc | spack-stack | 85.835989 |
| hafs_3denvar_hybens | hiproc | spack-stack | 205.907236 |
| hafs_3denvar_hybens | hiproc | spack-stack | 213.127396 |
| hafs_3denvar_hybens | loproc | spack-stack | 310.071541 |
| hafs_3denvar_hybens | loproc | spack-stack | 315.533777 |
| hafs_4denvar_glbens | hiproc | spack-stack | 241.022350 |
| hafs_4denvar_glbens | hiproc | spack-stack | 247.217117 |
| hafs_4denvar_glbens | loproc | spack-stack | 345.320915 |
| hafs_4denvar_glbens | loproc | spack-stack | 391.665063 |
| netcdf_fv3_regional | hiproc | spack-stack | 71.834018 |
| netcdf_fv3_regional | hiproc | spack-stack | 71.269336 |
| netcdf_fv3_regional | loproc | spack-stack | 87.092422 |
| netcdf_fv3_regional | loproc | spack-stack | 84.989027 |
| rrfs_3denvar_glbens | hiproc | spack-stack | 90.392998 |
| rrfs_3denvar_glbens | hiproc | spack-stack | 94.202024 |
| rrfs_3denvar_glbens | loproc | spack-stack | 128.244605 |
| rrfs_3denvar_glbens | loproc | spack-stack | 117.200131 |
| rtma | hiproc | spack-stack | 188.807147 |
| rtma | hiproc | spack-stack | 189.574274 |
| rtma | loproc | spack-stack | 199.250219 |
| rtma | loproc | spack-stack | 201.834955 |
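
For anyone following up on the run-to-run variability, here is a rough Python sketch that computes the spread between the two runs listed for each test/proc pair. The numbers are copied from the run times above. The `relative_spread` helper and its definition (difference divided by the faster run) are my own illustrative choices and are not the regression suite's timethresh logic, which is not shown in this thread.

```python
# Run times copied from the list above: (test, proc) -> (first listed, second listed).
# Which row is the "control" and which is the "update" run is not stated in the
# thread, so the two values are treated symmetrically here.
runtimes = {
    ("global_4denvar", "hiproc"): (281.424148, 281.449064),
    ("global_4denvar", "loproc"): (350.825092, 356.102541),
    ("global_enkf", "hiproc"): (69.015155, 65.570708),
    ("global_enkf", "loproc"): (108.236097, 85.835989),
    ("hafs_3denvar_hybens", "hiproc"): (205.907236, 213.127396),
    ("hafs_3denvar_hybens", "loproc"): (310.071541, 315.533777),
    ("hafs_4denvar_glbens", "hiproc"): (241.022350, 247.217117),
    ("hafs_4denvar_glbens", "loproc"): (345.320915, 391.665063),
    ("netcdf_fv3_regional", "hiproc"): (71.834018, 71.269336),
    ("netcdf_fv3_regional", "loproc"): (87.092422, 84.989027),
    ("rrfs_3denvar_glbens", "hiproc"): (90.392998, 94.202024),
    ("rrfs_3denvar_glbens", "loproc"): (128.244605, 117.200131),
    ("rtma", "hiproc"): (188.807147, 189.574274),
    ("rtma", "loproc"): (199.250219, 201.834955),
}


def relative_spread(a: float, b: float) -> float:
    """Absolute difference between two runs as a fraction of the faster run.

    Hypothetical metric chosen for illustration only.
    """
    return abs(a - b) / min(a, b)


spreads = {key: relative_spread(*pair) for key, pair in runtimes.items()}

# Print the pairs from most to least variable.
for (test, proc), spread in sorted(spreads.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{test:<22s} {proc:<7s} {100 * spread:6.2f}%")
```

By this measure, the largest spread is global_enkf loproc (about 26%), followed by hafs_4denvar_glbens loproc (about 13%), so the variability does not appear to be limited to the HAFS tests.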

@RussTreadon-NOAA
Contributor

PR #624 has been merged into develop. This PR added modulefiles/gsi_hercules.lua.
