Create stack-venv meta modules for environment views & generate fewer modules#1966

Merged
climbfuji merged 18 commits into JCSDA:develop from climbfuji:feature/mod_exclude_neptune_env_view
Apr 1, 2026

Conversation

@climbfuji
Collaborator

@climbfuji climbfuji commented Mar 30, 2026

Description

To avoid loading a huge number of modules every time, in particular every time a Python stack is loaded, this PR does the following:

  1. For every view configured for an environment, a module stack-venv/<view-name> is created in the modules/Core directory of the environment (the same directory as for stack-<compiler>/<version>). The stack-venv module requires the stack-<compiler> module and, if applicable, the <stack-mpi> module to be loaded. While working on this, I discovered that the current syntax for specifying module requirements was incorrect for lua/lmod. I updated the syntax to match the tcl environment modules syntax.
  2. Add a few basic packages to the module-exclude lists in configs/common/modules_*.yaml.

Note 1. Spack generates an activate script in /path/to/view/bin/ for every view. Unfortunately, this script isn't sufficient for using the view, because it doesn't set LD_LIBRARY_PATH. This causes problems, for example when importing eccodes in Python, because the ecCodes library (a separate package) cannot be found. Further, the prompt modification for the environment is long and cryptic (it contains part of the view hash). An alternative or addition to the stack-venv/<view-name> module is to create a separate directory with activation scripts that do the same as the stack-venv module. Those scripts could also manipulate the prompt, which is very difficult to do with modules across platforms and module types. Since activation scripts are not modules, that logic doesn't really fit into the setup-meta-modules extension, but making it its own extension would inevitably lead to code duplication. Thus, in the spirit of making smaller, incremental changes instead of a waterfall approach, I suggest moving ahead with the stack-venv module approach first and seeing how this pans out across teams and applications.
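
To illustrate the gap described in Note 1, here is a minimal sketch of the environment changes a view needs beyond what the activate script provides. The function name `view_environment` and the choice of `bin`, `lib` and `lib64` subdirectories are assumptions for illustration; the note only states that LD_LIBRARY_PATH is missing, and the variables actually set by the stack-venv module may differ.

```python
import os


def view_environment(view_prefix, env=None):
    """Return a copy of env with the view's bin and library dirs prepended.

    Hypothetical helper: the directory names below are assumed, not taken
    from the stack-venv module itself.
    """
    env = dict(os.environ if env is None else env)

    def prepend(var, path):
        # Put the new path first so the view takes precedence.
        current = env.get(var, "")
        env[var] = path if not current else path + os.pathsep + current

    prepend("PATH", os.path.join(view_prefix, "bin"))
    # Without this step, shared libraries installed in the view
    # (for example the ecCodes library) are not found at import time.
    for libdir in ("lib64", "lib"):
        prepend("LD_LIBRARY_PATH", os.path.join(view_prefix, libdir))
    return env
```

The returned dictionary can be passed as `env=` to `subprocess.run` to start a Python interpreter inside the view without touching the calling shell.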

Note 2. I tested this on HPCMP Blueback (Cray, tcl modules) and NRL Atlantis (lua modules) with the NEPTUNE end-to-end system (NEPTUNE + cylc).

Dependencies

None

Issues addressed

Loading too many modules can take a very long time, especially on Cray systems using tcl environment modules. It can even lead to odd failures where module li displays nothing, because one of the environment variables that the module system uses exceeds the maximum length allowed for the string.
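
A quick way to see how close a session is to this kind of failure is to measure the module bookkeeping variables. The sketch below assumes the relevant variables are `LOADEDMODULES` and `_LMFILES_`, which both Environment Modules and Lmod maintain; the PR does not say which variable overflows or what the limit is, so no threshold is checked here. The function names are hypothetical.

```python
import os


def module_env_sizes(env=None, names=("LOADEDMODULES", "_LMFILES_")):
    """Return the string length of each module bookkeeping variable."""
    env = os.environ if env is None else env
    return {name: len(env.get(name, "")) for name in names}


def loaded_module_count(env=None):
    """Count the colon-separated entries in LOADEDMODULES."""
    env = os.environ if env is None else env
    value = env.get("LOADEDMODULES", "")
    return len([entry for entry in value.split(":") if entry])


if __name__ == "__main__":
    print("loaded modules:", loaded_module_count())
    for name, size in module_env_sizes().items():
        print(f"{name}: {size} characters")
```

Running this before and after loading a full Python stack gives a rough measure of how much a single stack-venv module saves.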

Applications affected

NEPTUNE

Systems affected

None

Testing

  • CI: Note whether the automatic tests (GitHub actions tests that run automatically for every commit) pass or not
    • GitHub actions CI tests pass
    • GitHub actions CI tests do not pass (provide explanation)
    • GitHub actions CI tests skipped (provide explanation if necessary)
  • New tests added: List and describe any new tests added to GitHub actions
    • ...
  • Additional testing: Add information on any additional tests conducted
    • ...

Checklist

  • This PR addresses one issue/problem/enhancement or has a very good reason for not doing so.
  • These changes have been tested on the affected systems and applications.
  • All dependency PRs/issues have been resolved and this PR can be merged.
  • All necessary updates to the documentation (spack-stack wiki) will be made when this PR is merged

@climbfuji climbfuji changed the title from "WIP Create stack-venv meta modules for environment views" to "WIP Create stack-venv meta modules for environment views & generate fewer modules" Mar 31, 2026
@climbfuji climbfuji changed the title from "WIP Create stack-venv meta modules for environment views & generate fewer modules" to "Create stack-venv meta modules for environment views & generate fewer modules" Mar 31, 2026
Comment on lines +11 to +17
- neptune-env +debug +openmp
- neptune-env ~debug ~openmp
- neptune-env +debug ~openmp
- esmf +debug +openmp
- esmf ~debug ~openmp
- esmf +debug ~openmp
- ip ~openmp
Collaborator Author

A view can't have duplicates (see lines 24-27). If needed, we can add other views (debug-openmp, release-noopenmp, ...) later.
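
As a rough illustration of the constraint, the sketch below flags package names that occur more than once in a list of specs. Reading "duplicates" as "two specs with the same package name" is an assumption, and `package_name` is a deliberately simplified stand-in, not Spack's spec parser.

```python
from collections import Counter


def package_name(spec):
    """Extract the package name from a simple spec string.

    Simplified, hypothetical parsing: takes the first token and cuts it
    at the first version, variant or compiler sigil.
    """
    token = spec.split()[0]
    for sigil in ("@", "+", "~", "%"):
        token = token.split(sigil)[0]
    return token


def duplicate_packages(specs):
    """Return the sorted package names that appear more than once."""
    counts = Counter(package_name(spec) for spec in specs)
    return sorted(name for name, count in counts.items() if count > 1)


specs = [
    "neptune-env +debug +openmp",
    "neptune-env ~debug ~openmp",
    "neptune-env +debug ~openmp",
    "esmf +debug +openmp",
    "esmf ~debug ~openmp",
    "esmf +debug ~openmp",
    "ip ~openmp",
]
print(duplicate_packages(specs))
```

A check like this could run before a view is defined, so that each variant combination ends up in its own view.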

@climbfuji climbfuji moved this from Todo to In Progress in spack-stack-2.2.x (2026 Q?) Mar 31, 2026
@climbfuji climbfuji marked this pull request as ready for review March 31, 2026 14:29
@mathomp4
Collaborator

@climbfuji So, does this mean when we do our instructions on how to load an env (like say geos-gcm-env?) when we do module list we won't see 150+ modules loaded? Or just less than before?

if module_choice == "lmod":
    return f"""load("{module}")
prereq("{module}")\n"""
    return f"""if (mode() == "load" and not isloaded("{module}")) then
Collaborator Author

This logic matches the tcl environment module logic, and it prevents issues such as the unintentional unloading of the compiler and mpi module. That is, only unload compiler/mpi modules if they weren't loaded before the stack-venv module was loaded.
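
For readers unfamiliar with the Lmod side, here is a minimal sketch of a generator for the guarded load. It uses the Lmod modulefile functions `mode()`, `isloaded()` and `load()`; the function name `lmod_require` and the body of the `if` block are assumptions, since the excerpt above is truncated after the condition.

```python
def lmod_require(module):
    """Return Lua text that loads `module` only if it is not loaded yet.

    Hypothetical helper: only the condition line is taken from the diff
    excerpt; the rest of the block is an assumed completion.
    """
    return (
        f'if (mode() == "load" and not isloaded("{module}")) then\n'
        f'    load("{module}")\n'
        f"end\n"
    )


# Example with an illustrative module name.
print(lmod_require("stack-gcc/13.2.0"))
```

In this sketch a compiler or mpi module that the user loaded beforehand is left untouched, and because the guard is false in unload mode, unloading stack-venv does not pull it out either. How the merged code treats dependencies that stack-venv loaded itself is not visible in the excerpt.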

@climbfuji
Collaborator Author

@climbfuji So, does this mean when we do our instructions on how to load an env (like say geos-gcm-env?) when we do module list we won't see 150+ modules loaded? Or just less than before?

For now, what I am doing with this is:

  1. Load the regular neptune-env modules and its dependencies for building the NEPTUNE ecosystem. That's about 50 or so modules in total.
  2. During the installation of NEPTUNE Python packages and in the cylc workflow, I load stack-venv/release-openmp (one additional module) instead of neptune-python-env (another 30 modules or so).

We can take this further, and this might well be the way to go: load none of the individual modules, just stack-venv/name-of-view. This would make it very easy to load a debug environment (stack-venv/debug-openmp in the neptune-dev example), and one would only see three modules (compiler, mpi, venv). Alternatively, we could make our own activate scripts instead of loading a stack-venv module, which would allow us to do prompt modifications and so on.

The downside is that the user doesn't see which versions of which libraries are used. We could generate a script/reuse the existing script in utils/ to provide this information and display it after loading stack-venv.

I see this PR as a first step to greatly speed up loading the Python environments, especially on Cray and in workflows where these modules are loaded over and over again.

@climbfuji climbfuji enabled auto-merge (squash) March 31, 2026 20:42
@climbfuji climbfuji merged commit 4e8d04e into JCSDA:develop Apr 1, 2026
6 checks passed
@github-project-automation github-project-automation bot moved this from In Progress to Done in spack-stack-2.2.x (2026 Q?) Apr 1, 2026