Roll out spack-stack-1.6.0 #924

Closed
45 of 47 tasks
climbfuji opened this issue Dec 28, 2023 · 49 comments
Labels
INFRA JEDI Infrastructure NOAA-EMC OAR-EPIC NOAA Oceanic and Atmospheric Research and Earth Prediction Innovation Center

Comments

@climbfuji
Collaborator

climbfuji commented Dec 28, 2023

Is your feature request related to a problem? Please describe.

This issue captures the necessary tasks for rolling out spack-stack-1.6.0.

Installation procedure:

  1. Install unified-env from release/1.6.0 branch of spack-stack (and spack submodule)
  2. Install gsi-addon-env on selected platforms as chained environment (see list of platforms below)
  3. Update documentation and site config for each site (as necessary), including information about gsi-addon-env

Describe the solution you'd like

  1. Installation completed following the above procedure:
  2. Prepare release tags
    • Tag spack submodule as spack-stack-1.6.0
    • Update release/1.6.0 to tag spack-stack-1.6.0 of spack submodule in container recipes and .gitmodules
    • Update documentation (container doc, readthedocs): version, etc.
    • Tag spack-stack as 1.6.0 and spack-stack-1.6.0
    • Switch readthedocs from release/1.6.0 to 1.6.0
    • Create and publish release notes
  3. Remove spack-stack-v1 (containing skylab-3.0.0 releases and older) from all supported platforms. Check if any of the legacy thirdparty packages can be removed. Configure remotes and check out final tags 1.6.0/spack-stack-1.6.0 in spack-stack-1.6.0 directory.
    • Hercules
    • Orion
    • Nautilus
    • Narwhal
    • S4
    • Discover
    • Hera
    • Jet
    • Gaea
    • Acorn
    • Casper
    • Derecho
    • AWS Single Node AMI Red Hat 8 (GNU)
    • JCSDA AWS Parallel Cluster (R&D)
    • NOAA Parallelworks Gcloud JCSDA
    • JCSDA CI containers n/a
  4. Merge release branches back into jcsda_emc_spack_stack / develop:

Additional context
n/a

@junwang-noaa

@climbfuji can you write a list of updated libraries in release/1.6.0? I think the library team (@Hang-Lei-NOAA @AlexanderRichert-NOAA ) needs to test them on acorn and, once confirmed, make requests for installation on wcoss2, which could take some time to finish. We are trying to keep the libraries on wcoss2 from diverging.

@climbfuji
Collaborator Author

Can EPIC or EMC do this please? @AlexanderRichert-NOAA @ulmononian

@climbfuji
Collaborator Author

Otherwise I'll do that when I draft the release notes in a few days

@AlexanderRichert-NOAA
Collaborator

Yeah I can do that. @junwang-noaa I assume we're just talking about UFS WM-related libraries?

@climbfuji
Collaborator Author

Thanks very much @AlexanderRichert-NOAA !

@junwang-noaa

@AlexanderRichert-NOAA Yes. Also, would you please install spack-stack 1.6.0 on acorn for us to do testing? Once it is done, we will ask @Hang-Lei-NOAA to make requests to install the new libraries on wcoss2. Thanks

@RatkoVasic-NOAA
Collaborator

@climbfuji you can add me in description for gaea-c5, hera and jet.

@climbfuji
Collaborator Author

> @climbfuji you can add me in description for gaea-c5, hera and jet.

Thanks @RatkoVasic-NOAA !

@natalie-perlin
Collaborator

@climbfuji - please add me in description for NOAA Parallelworks (AWS, GCloud, Azure)

@climbfuji
Collaborator Author

> @climbfuji - please add me in description for NOAA Parallelworks (AWS, GCloud, Azure)

Thanks very much @natalie-perlin! I had no problem rolling it out on gcloud, no site config updates needed. So hopefully this works just fine for you out of the box as it did for me.

@AlexanderRichert-NOAA
Collaborator

AlexanderRichert-NOAA commented Jan 5, 2024

@junwang-noaa:
The UFSWM-relevant packages that are changing versions between spack-stack 1.5.1 and 1.6.0 are fms, netcdf-fortran, crtm, and sp; here are the packages that will differ from the current (spack-stack 1.5.0-based) UFS WM:

| Package | Current UFSWM (ufs_common.lua) | spack-stack-1.5.1 (PR#2013) | spack-stack-1.6.0 |
| --- | --- | --- | --- |
| esmf | 8.4.2 | 8.5.0 | 8.5.0 & 8.6.0 |
| fms | 2023.01 | 2023.02.01 | 2023.04 |
| gftl-shared | 1.5.0 | 1.6.1 | 1.6.1 |
| mapl | 2.35.2 | 2.40.3 | 2.40.3 |
| netcdf-fortran | 4.6.0 | 4.6.0 | 4.6.1 |
| crtm | 2.4.0 | 2.4.0 | 2.4.0.1 |
| sp | 2.3.3 | 2.3.3 | 2.5.0 |

Everything else will be the same as in the current ufs_common.lua:
{["jasper"] = "2.0.32"},
{["zlib"] = "1.2.13"},
{["libpng"] = "1.6.37"},
{["hdf5"] = "1.14.0"},
{["netcdf-c"] = "4.9.2"},
{["parallelio"] = "2.5.10"},
{["bacio"] = "2.4.1"},
{["g2"] = "3.4.5"},
{["g2tmpl"] = "1.10.2"},
{["ip"] = "4.3.0"},
{["w3emc"] = "2.10.0"},
{["scotch"] = "7.0.4"},
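For anyone who wants to re-derive the list of changing packages, here is a minimal Python sketch of the comparison. The versions are copied from the table above; the dictionary names and the `changed_packages` helper are made up for illustration and are not part of spack-stack. A package counts as changed when a version used in the old set is no longer offered by the new one, which is why esmf (8.5.0 is still available in 1.6.0) does not show up in the 1.5.1 to 1.6.0 comparison.

```python
# Versions copied from the table in this comment; each value is the set of
# versions that the stack provides for that package.
current_ufswm = {
    "esmf": {"8.4.2"}, "fms": {"2023.01"}, "gftl-shared": {"1.5.0"},
    "mapl": {"2.35.2"}, "netcdf-fortran": {"4.6.0"}, "crtm": {"2.4.0"},
    "sp": {"2.3.3"},
}
spack_stack_151 = {
    "esmf": {"8.5.0"}, "fms": {"2023.02.01"}, "gftl-shared": {"1.6.1"},
    "mapl": {"2.40.3"}, "netcdf-fortran": {"4.6.0"}, "crtm": {"2.4.0"},
    "sp": {"2.3.3"},
}
spack_stack_160 = {
    "esmf": {"8.5.0", "8.6.0"}, "fms": {"2023.04"}, "gftl-shared": {"1.6.1"},
    "mapl": {"2.40.3"}, "netcdf-fortran": {"4.6.1"}, "crtm": {"2.4.0.1"},
    "sp": {"2.5.0"},
}


def changed_packages(old, new):
    """Return packages whose old version(s) are not all offered by `new`.

    Hypothetical helper for illustration only.
    """
    return [name for name, versions in old.items()
            if not versions <= new.get(name, set())]


print(changed_packages(spack_stack_151, spack_stack_160))
print(changed_packages(current_ufswm, spack_stack_160))
```

The first call reproduces the four packages named at the top of this comment; the second lists every package in the table, since all of them differ from the current UFS WM.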

@AlexanderRichert-NOAA
Collaborator

Acorn is done (no duplicates, no missing shared libs, permissions good): /lfs/h1/emc/nceplibs/noscrub/spack-stack/spack-stack-1.6.0/envs/unified-env

@climbfuji
Collaborator Author

@RatkoVasic-NOAA Can you please provide the locations of the unified-env installs and whether gsi-addon-env is also there (hera, jet)? Also, did you have to make any site config updates? If yes, please create a PR for these to the release/1.6.0 branch. Thank you!

@junwang-noaa

@climbfuji @AlexanderRichert-NOAA may I ask what new features are included in netcdf-fortran 4.6.1? Several libraries depend on netcdf, and those libraries would need to be rebuilt on wcoss2 with hpc-stack. It may take longer to get all of these libraries installed.

@RatkoVasic-NOAA
Collaborator

> @RatkoVasic-NOAA Can you please provide the locations of the unified-env installs and whether gsi-addon-env is also there (hera, jet)? Also, did you have to make any site config updates? If yes, please create a PR for these to the release/1.6.0 branch. Thank you!

@climbfuji
I) Installation locations:

  1. Hera: /scratch1/NCEPDEV/nems/role.epic/spack-stack/spack-stack-1.6.0/envs/unified-env
  2. Gaea-C5: /lustre/f2/dev/role.epic/contrib/spack-stack/c5/spack-stack-1.6.0/envs/unified-env
  3. Jet: /mnt/lfs4/HFIP/hfv3gfs/role.epic/spack-stack/spack-stack-1.6.0/envs/unified-env

II) gsi-addon-env: not sure. What is the procedure for that?

III) Changes:
I edited configs/sites/*/packages.yaml
following: https://github.com/JCSDA/spack-stack/pull/927/files#diff-b6d64c02a1900eb793faf0f0e1e7c0d7b9ac660229ae2e695254416d2fb9ca5c
I'll create a PR for these changes to the release/1.6.0 branch.

@AlexanderRichert-NOAA
Collaborator

@junwang-noaa looking at the release notes, it appears the main functional difference between 4.6.0 and 4.6.1 is a fix in the quantization.

@AlexanderRichert-NOAA
Collaborator

@RatkoVasic-NOAA re: II, see the instructions at https://spack-stack.readthedocs.io/en/latest/AddingTestPackages.html#add-test-packages. Basically, to install gsi-addon-env, use the --upstream /path/to/spack-stack-1.6.0/envs/unified-env/install option for the spack stack create env command, and use the --upstream-modules option for spack module lmod refresh to use the existing unified-env as an upstream environment. When concretizing, most packages (all of them other than the ones specially set in spack.yaml) should have "[^]" at the beginning of the line, indicating that the package was found in the upstream environment.
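As a quick sanity check on the concretization step described above, one could count how many lines carry the "[^]" marker. The Python sketch below assumes the concretize output has been captured as text; the sample lines, hashes, and the `split_by_upstream` helper are hypothetical and simplified, so treat this as an illustration rather than a parser for the exact spack output format.

```python
def split_by_upstream(concretize_output):
    """Split concretize output lines into upstream-reused and everything else.

    Hypothetical helper: a line counts as reused when it starts with "[^]".
    Blank lines and "==>" status lines are skipped.
    """
    reused, other = [], []
    for line in concretize_output.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("==>"):
            continue
        if stripped.startswith("[^]"):
            reused.append(stripped)
        else:
            other.append(stripped)
    return reused, other


# Made-up sample; real output carries full hashes, compilers and variants.
sample = """\
==> Concretized example
[^]  abc1234  zlib@1.2.13
[^]  def5678  netcdf-c@4.9.2
 -   9876fed  gsi-ncdiag@1.1.2
"""

reused, other = split_by_upstream(sample)
print(f"{len(reused)} from upstream, {len(other)} to be built or installed locally")
for line in other:
    print("  ", line)
```

If the second list is much longer than the handful of packages set specially in spack.yaml, the upstream path is probably not being picked up.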

@ulmononian
Collaborator

@AlexanderRichert-NOAA are you doing spack add gsi-addon-env or feeding spack stack create env the gsi-addon as the --template?

@natalie-perlin
Collaborator

natalie-perlin commented Jan 10, 2024

@climbfuji -
It's all done on AWS, GCloud, and being moved to a common location on Azure.
Different approaches were tested on the three platforms, leaving a few things still to be polished.

All of the stacks are installed in
/contrib/spack-stack/spack-stack-1.6.0/envs/unified-env

@climbfuji
Collaborator Author

Thanks @natalie-perlin !

@DavidHuber-NOAA
Collaborator

I see that 1.6.0 was installed on f2 on Gaea-C5. Was this intentional? I thought it was supposed to be installed on f5.

@climbfuji
Collaborator Author

climbfuji commented Jan 11, 2024 via email

@DavidHuber-NOAA
Collaborator

Thanks for clarifying, @climbfuji!

@junwang-noaa

@AlexanderRichert-NOAA I tested the spack-stack 1.6.0 with the updated libraries. The RT passed on acorn.
@Hang-Lei-NOAA would you please make requests to install the new libraries: ESMF/8.6.0, FMS/2023.04, crtm/2.4.0.1 and sp/2.5.0.

As for netcdf-fortran 4.6.1, @edwardhartnett may I ask what the major updates are in netcdf-fortran 4.6.1? I remember NCO may have some reservations about installing minor updates for major libraries that have many dependencies. I am not sure they will approve the request if we ask for 4.6.1. Thanks

@edwardhartnett
Collaborator

edwardhartnett commented Jan 11, 2024 via email

@climbfuji
Collaborator Author

So, typically patch version updates are bug fixes of existing versions that address real problems such as undefined constants for the above example, or security vulnerabilities, etc., without adding new functionality that has greater potential to break things. I think it would be highly advisable to revert the policy and always go for patch-level updates. For spack-stack in general, I suggest we always opt to use the bug-fixed version of a package with otherwise the same functionality.

@DavidHuber-NOAA
Collaborator

From @malloryprow:

I see in /scratch1/NCEPDEV/nems/role.epic/spack-stack/spack-stack-1.6.0/envs/unified-env/install/modulefiles/intel/2021.5.0 there is py-matplotlib/3.7.3.lua and py-cartopy/0.21.1.lua. These do not exist in /scratch1/NCEPDEV/nems/role.epic/spack-stack/spack-stack-1.6.0/envs/gsi-addon-dev/install/modulefiles/intel/2021.5.0.

Could these be populated into gsi-addon? They are needed for the EMC_verif-global repo.
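A comparison like the one above can be scripted. This is a minimal Python sketch, assuming both environments keep their Lua module files under a modulefiles/<compiler>/<version> directory as in the paths quoted above; the `missing_modules` helper is hypothetical, and the demo builds a throwaway directory tree instead of touching the real installs.

```python
import tempfile
from pathlib import Path


def missing_modules(reference_root, candidate_root):
    """List .lua module files found under reference_root but not candidate_root.

    Hypothetical helper; paths are compared relative to each root.
    """
    reference_root, candidate_root = Path(reference_root), Path(candidate_root)
    ref = {p.relative_to(reference_root).as_posix()
           for p in reference_root.rglob("*.lua")}
    cand = {p.relative_to(candidate_root).as_posix()
            for p in candidate_root.rglob("*.lua")}
    return sorted(ref - cand)


# Demo on a temporary tree; the netcdf-c entry is only an illustrative
# example of a module present in both environments.
with tempfile.TemporaryDirectory() as tmp:
    unified = Path(tmp) / "unified-env"
    addon = Path(tmp) / "gsi-addon-dev"
    layout = [
        (unified, ["py-matplotlib/3.7.3.lua", "py-cartopy/0.21.1.lua",
                   "netcdf-c/4.9.2.lua"]),
        (addon, ["netcdf-c/4.9.2.lua"]),
    ]
    for root, modules in layout:
        for module in modules:
            path = root / module
            path.parent.mkdir(parents=True, exist_ok=True)
            path.touch()
    missing = missing_modules(unified, addon)

print(missing)
```

On Hera, the two arguments would be the unified-env and gsi-addon-dev modulefiles/intel/2021.5.0 directories quoted above.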

@climbfuji
Collaborator Author

> From @malloryprow:
>
> I see in /scratch1/NCEPDEV/nems/role.epic/spack-stack/spack-stack-1.6.0/envs/unified-env/install/modulefiles/intel/2021.5.0 there is py-matplotlib/3.7.3.lua and py-cartopy/0.21.1.lua. These do not exist in /scratch1/NCEPDEV/nems/role.epic/spack-stack/spack-stack-1.6.0/envs/gsi-addon-dev/install/modulefiles/intel/2021.5.0.
>
> Could these be populated into gsi-addon? They are needed for the EMC_verif-global repo.

That will need to be done via the gsi-addon-env template.

@DavidHuber-NOAA
Collaborator

DavidHuber-NOAA commented Jan 11, 2024

@climbfuji since the gsi-addon-env uses unified-env as the upstream via

spack stack create env --site <site name> --template gsi-addon-dev --name gsi-addon --upstream <path/to/unified-env/install>

and subsequently

spack module lmod refresh --upstream-modules

shouldn't all of the unified-env modules populate into gsi-addon?

@climbfuji
Collaborator Author

No - only the ones that are requested by gsi-addon-env for which the gsi-addon-env requirements match existing packages in unified-env
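To make that rule concrete, here is a toy Python model of it. It is emphatically not how spack's concretizer works internally: the `modules_in_chained_env` helper, the exact-version matching, and the example request lists are all simplifying assumptions, with package versions borrowed from earlier comments.

```python
def modules_in_chained_env(requested, upstream):
    """Toy model: a package is reused from upstream only if the chained
    environment requests it and the requested version matches.

    requested: dict of package name -> required version (None means any).
    upstream:  dict of package name -> version installed in the upstream env.
    """
    reused = {}
    for name, wanted in requested.items():
        installed = upstream.get(name)
        if installed is not None and wanted in (None, installed):
            reused[name] = installed
    return reused


# Upstream (unified-env) contents, versions taken from this thread.
unified_env = {
    "py-matplotlib": "3.7.3",
    "py-cartopy": "0.21.1",
    "netcdf-c": "4.9.2",
}

# Hypothetical request list before and after editing the template.
template_before = {"netcdf-c": "4.9.2"}
template_after = {**template_before, "py-matplotlib": None, "py-cartopy": None}

before = modules_in_chained_env(template_before, unified_env)
after = modules_in_chained_env(template_after, unified_env)
print(sorted(before))
print(sorted(after))
```

In this model py-matplotlib and py-cartopy only appear once the template asks for them, which is why the fix has to go through the gsi-addon-env template.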

@DavidHuber-NOAA
Collaborator

Alright, thanks for that info. I will update and test the template and handle any needed installations in another issue.

@Hang-Lei-NOAA
Collaborator

Hang-Lei-NOAA commented Jan 11, 2024 via email

@climbfuji
Collaborator Author

@Hang-Lei-NOAA Can I ask an unrelated question on how boost is installed on WCOSS2?

@Hang-Lei-NOAA
Collaborator

Hang-Lei-NOAA commented Jan 11, 2024 via email

@junwang-noaa

@Hang-Lei-NOAA would you please provide a module file with the HPC-stack libraries on acorn? Currently the model only has a spack-stack module file on acorn:

https://github.com/ufs-community/ufs-weather-model/blob/develop/modulefiles/ufs_acorn.intel.lua

I am not sure how to combine these libraries with the libraries you installed with HPC stack.
Thanks

@Hang-Lei-NOAA
Collaborator

Hang-Lei-NOAA commented Jan 11, 2024 via email

@climbfuji
Collaborator Author

All done!

@DavidHuber-NOAA
Collaborator

@AlexanderRichert-NOAA Apologies for coming in late on this conversation, but is it possible to add a few additional install requests for WCOSS2 needed by the global workflow? In particular, we would need

gsi-ncdiag@1.1.2 built against netcdf-fortran@4.6.1
ncio@1.1.2 built against netcdf-fortran@4.6.1
nco@5.0.6
cdo@2.0.5
jasper@2.0.32
grib_util@1.3.0
w3emc@2.10.0

I'm happy to open another issue if that's what's needed here.

@AlexanderRichert-NOAA
Collaborator

@Hang-Lei-NOAA can you request those on WCOSS2?

@Hang-Lei-NOAA
Collaborator

@AlexanderRichert-NOAA I will handle these.

@DavidHuber-NOAA
This has been discussed in EMC for a while, and Ed and others have also commented on it. The netcdf-fortran/4.6.1 update is a minor change that will not affect our EMC models. Therefore no changes will be delivered to NCO to update the whole netcdf/4.9.2 frame (the NCO way, the current netcdf-fortran/4.6.1 is installed in netcdf/4.9.2).
gsi-ncdiag@1.1.2 built against netcdf-fortran@4.6.1
ncio@1.1.2 built against netcdf-fortran@4.6.1

nco@5.0.6: I delivered it before the new year. It has been installed by GDIT on wcoss2.
cdo@2.0.5: I will prepare the test on acorn, and then deliver it.
jasper@2.0.32: being tested on acorn before delivery.
grib_util@1.3.0: I will prepare the test on acorn, and then deliver it.
w3emc@2.10.0: being tested on acorn before delivery.

@DavidHuber-NOAA
Collaborator

@Hang-Lei-NOAA Thank you very much for the explanations and for working on this!


9 participants