Stop using stand-alone UPP #2437

Closed
4 tasks
WalterKolczynski-NOAA opened this issue Mar 28, 2024 · 10 comments · Fixed by #2663
Assignees
Labels
maintenance Regular updates and maintenance work

Comments

@WalterKolczynski-NOAA
Contributor

WalterKolczynski-NOAA commented Mar 28, 2024

What new functionality do you need?

As part of the Rocky 8 upgrade for Hera (PR #2421), we had to move to a stand-alone UPP version because the one in UFS has not yet been updated. Once the UPP version in UFS is updated to include the Rocky 8 updates, we should move back to using that version instead of checking out a separate version.

What are the requirements for the new functionality?

No separate UPP submodule

Acceptance Criteria

Dependency: ufs-community/ufs-weather-model#2213

  • Update UFS to hash containing UPP updates for Rocky 8
  • Remove UPP submodule
  • Update build_upp.sh to use ufs_model.fd/FV3/upp/tests instead of upp.fd/tests
  • Restore linking of ufs_model.fd/FV3/upp to sorc/upp.fd in link_workflow.sh
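The last task can be pictured with a minimal shell sketch. It assumes the link is relative and lives in the sorc directory; the real logic in link_workflow.sh may look different, and the function name `restore_upp_link` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper, not taken from link_workflow.sh.
# Points <sorc>/upp.fd at the UPP copy bundled with the UFS model.
restore_upp_link() {
  rul_sorc="$1"
  rul_target="ufs_model.fd/FV3/upp"   # path named in the task list above

  if [ ! -d "${rul_sorc}/${rul_target}" ]; then
    echo "missing ${rul_sorc}/${rul_target}" >&2
    return 1
  fi

  if [ -L "${rul_sorc}/upp.fd" ]; then
    # A stale link is safe to replace.
    rm "${rul_sorc}/upp.fd"
  elif [ -e "${rul_sorc}/upp.fd" ]; then
    # A real directory (e.g. the stand-alone checkout) is left for a human to remove.
    echo "${rul_sorc}/upp.fd exists and is not a symlink" >&2
    return 1
  fi

  # The relative target is resolved from the directory that holds the link.
  ln -s "${rul_target}" "${rul_sorc}/upp.fd"
}
```

Called as `restore_upp_link "${HOMEgfs}/sorc"`, this refuses to touch a real upp.fd directory, so the stand-alone checkout would have to be removed deliberately first.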

Suggest a solution (optional)

No response

@JessicaMeixner-NOAA
Contributor

FYI @WenMeng-NOAA

@WenMeng-NOAA
Contributor

@JessicaMeixner-NOAA @WalterKolczynski-NOAA I have been preparing my UFS PR for updating upp submodule.

@RussTreadon-NOAA
Contributor

@WalterKolczynski-NOAA and @WenMeng-NOAA: I assume from this issue that $HOMEgfs/sorc/upp.fd is the stand-alone UPP. Running sorc/build_all.sh in a working copy of develop at d6be3b5 on Hera reports a UPP build failure:

Running "module reset". Resetting modules to system default. The following $MODULEPATH directories have been removed: None
Building gsi_enkf, ufs, gfs_utils, gdas, ww3prepost, ufs_utils, gsi_utils, gsi_monitor, upp
Starting build_gsi_enkf.sh
Starting build_ufs.sh
Starting build_gfs_utils.sh
Starting build_gdas.sh
Starting build_ww3prepost.sh
Starting build_ufs_utils.sh
Starting build_gsi_utils.sh
Starting build_gsi_monitor.sh
Starting build_upp.sh
build_gsi_enkf.sh completed successfully!
build_gfs_utils.sh completed successfully!
build_ufs_utils.sh completed successfully!
build_gsi_utils.sh completed successfully!
build_gsi_monitor.sh completed successfully!
build_ww3prepost.sh completed successfully!
build_upp.sh failed with status 2!
build_ufs.sh completed successfully!
build_gdas.sh completed successfully!
BUILD ERROR: One or more components failed to build
  Check the associated build log(s) for details.

A check of sorc/logs/build_upp.log shows

[ 88%] Building Fortran object sorc/ncep_post.fd/CMakeFiles/upp.dir/OTLIFT.f.o
[ 89%] Building Fortran object sorc/ncep_post.fd/CMakeFiles/upp.dir/SURFCE.f.o
[ 90%] Linking Fortran static library libupp.a
/usr/bin/ar: Relink `/apps/oneapi/compiler/2022.0.2/linux/compiler/lib/intel64_lin/libimf.so' with `/lib64/libm.so.6' for IFUNC symbol `sinf'
Error running link command: Segmentation fault
make[2]: *** [sorc/ncep_post.fd/CMakeFiles/upp.dir/build.make:2182: sorc/ncep_post.fd/libupp.a] Error 1
make[2]: *** Deleting file 'sorc/ncep_post.fd/libupp.a'
make[1]: *** [CMakeFiles/Makefile2:133: sorc/ncep_post.fd/CMakeFiles/upp.dir/all] Error 2
make: *** [Makefile:136: all] Error 2
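
For long build logs, a small filter can point at the first failing line. This is a minimal sketch: the function name `first_build_error` is hypothetical, and the pattern only covers the two kinds of message seen in the log above (a segmentation fault and make's `*** ... Error N` lines).

```shell
#!/bin/sh
# Hypothetical helper: print the first log line (with its line number)
# that looks like a failure. Prints nothing for a log with no match.
first_build_error() {
  grep -n -E 'Segmentation fault|\*\*\* .*Error [0-9]+' "$1" | head -n 1
}
```

For example, `first_build_error sorc/logs/build_upp.log` would report the earliest matching line rather than the cascade of make errors that follows it.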

CI testing using C96C48_hybatmDA and C96C48_ufs_hybatmDA encounters failed jobs for gdasatmanlupp and gfsatmanlupp because $HOMEgfs/exec/upp.x does not exist. This is a soft link pointing at $HOMEgfs/sorc/upp.fd/exec/upp.x.
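
A quick way to tell a dangling soft link from a file that was never linked is sketched below. The function name `link_status` is hypothetical; it relies only on the standard `-e` and `-L` file tests.

```shell
#!/bin/sh
# Hypothetical helper: classify a path as ok, a dangling symlink, or missing.
link_status() {
  if [ -e "$1" ]; then
    # -e follows symlinks, so this is true only when the target exists.
    echo "ok"
  elif [ -L "$1" ]; then
    # The link itself exists but its target does not.
    echo "dangling"
  else
    echo "missing"
  fi
}
```

Running `link_status "${HOMEgfs}/exec/upp.x"` would distinguish a failed UPP build (link present, target absent) from a workflow that was never linked.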

Is this failure expected?

@WalterKolczynski-NOAA
Contributor Author

The failure is not expected from a fresh clone. If you pulled develop into an existing clone, you should have gotten a warning that git couldn't overwrite the upp.fd symlink. If that is the case, delete the symlink and then pull again (recursively, or run submodule update afterwards).

@RussTreadon-NOAA
Contributor

I manually removed sorc/upp.fd, then ran git submodule sync and git submodule update, and then manually executed ./build_upp.sh in $HOMEgfs/sorc. This worked: upp.x was created, and the rerun of gdasatmanlupp and gfsatmanlupp was successful.
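
The recovery in this comment can be sketched in shell as follows. The function names are hypothetical, and the `--init --recursive` flags are an assumption about how the submodule update was run; the comment itself only names `git submodule sync` and `git submodule update`.

```shell
#!/bin/sh
# Hypothetical helper: remove sorc/upp.fd only when it is a leftover symlink.
remove_stale_upp_link() {
  if [ -L "$1/sorc/upp.fd" ]; then
    rm "$1/sorc/upp.fd"
  fi
}

# Hypothetical helper: re-sync submodule URLs and check the submodules out again.
# The flags are an assumption; adjust them to match the local workflow.
refresh_submodules() {
  git -C "$1" submodule sync --recursive &&
    git -C "$1" submodule update --init --recursive
}

# Hypothetical wrapper for the full sequence, given the clone's top directory.
recover_upp() {
  remove_stale_upp_link "$1" &&
    refresh_submodules "$1" &&
    (cd "$1/sorc" && ./build_upp.sh)
}
```

Checking for a symlink before deleting keeps the helper from removing a real submodule checkout by accident.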

@WenMeng-NOAA
Contributor

@WalterKolczynski-NOAA @aerorahul The ufs-weather-model PR #2213 was submitted for updating upp submodule.

@WalterKolczynski-NOAA
Contributor Author

@WenMeng-NOAA thanks for keeping us updated. The first time we update UFS after that is merged, we can remove the temporary submodule.

@guoqing-noaa
Contributor

Manually remove sorc/upp.fd followed by git submodule sync and git submodule update. Then manually execute ./build_upp.sh in $HOMEgfs/sorc. This worked. upp.x created. Rerun of gdasatmanlupp and gfsatmanlupp was successful.

I got the same error on Hera (Rocky8) and the manual method did not work for me.

[ 89%] Building Fortran object sorc/ncep_post.fd/CMakeFiles/upp.dir/SURFCE.f.o     
[ 90%] Linking Fortran static library libupp.a
/usr/bin/ar: Relink `/apps/oneapi/compiler/2022.0.2/linux/compiler/lib/intel64_lin/libimf.so' with `/lib64/libm.so.6' for IFUNC symbol `sinf'
Error running link command: Segmentation fault
make[2]: *** [sorc/ncep_post.fd/CMakeFiles/upp.dir/build.make:2182: sorc/ncep_post.fd/libupp.a] Error 1
make[2]: *** Deleting file 'sorc/ncep_post.fd/libupp.a'
make[1]: *** [CMakeFiles/Makefile2:133: sorc/ncep_post.fd/CMakeFiles/upp.dir/all] Error 2
make: *** [Makefile:136: all] Error 2

I started from a clean recursive clone. I tried twice but got the same error.

Here are the steps I used to reproduce the error:

git clone --recursive https://github.com/NOAA-EMC/global-workflow
cd global-workflow/sorc
./build_upp.sh

Could this be related to any of my environment settings?

@WalterKolczynski-NOAA
Contributor Author

@guoqing-noaa we found there is actually an issue with the UPP hash. We added the fix into #2442, which should be merged soon.

@WenMeng-NOAA
Contributor

@WalterKolczynski-NOAA My UFS PR #2213 was merged today. You may update the global-workflow accordingly to solve this issue.

@HenryRWinterbottom HenryRWinterbottom self-assigned this Apr 22, 2024
@aerorahul aerorahul mentioned this issue Jun 6, 2024
7 tasks
RussTreadon-NOAA pushed a commit that referenced this issue Jun 24, 2024
Updates ufs-weather-model, this updates RDHPCS to the newer spack-stack
allowing some temporary fixes to be reverted.
* removes upp submodule
* uses upp from the ufs-weather-model
* restores the build and link that were hacked during the Hera Rocky 8
transition to allow for UPP submodule
* Removes forecast directories in clean-up

Resolves #2617 
Resolves #2437

---------

Co-authored-by: Rahul Mahajan <aerorahul@users.noreply.github.com>

6 participants