configure Lmod via $LMOD_CONFIG_DIR and $LMOD_PACKAGE_PATH + don't set $LMOD_RC #524

Merged: bedroge merged 2 commits into EESSI:2023.06-software.eessi.io on Apr 2, 2024

Conversation

casparvl
Collaborator

The deployment of #496 (comment) was forgotten. We thought it had happened in #511 (comment), but for some reason that deployment did not include the lmodrc.lua file. As a result, we are now stuck with duplication: both lmodrc.lua and SitePackage.lua contain hook definitions...

[casparl@tcn1 ~]$ cat /cvmfs/software.eessi.io/versions/2023.06/software/linux/x86_64/amd/zen3/.lmod/lmodrc.lua | grep hook
local hook = require("Hook")
local function eessi_cuda_enabled_load_hook(t)
local function eessi_openmpi_load_hook(t)
-- Combine both functions into a single one, as we can only register one function as load hook in lmod
function eessi_load_hook(t)
    eessi_cuda_enabled_load_hook(t)
    eessi_openmpi_load_hook(t)
hook.register("load", eessi_load_hook)
[casparl@tcn1 ~]$ cat /cvmfs/software.eessi.io/versions/2023.06/software/linux/x86_64/amd/zen3/.lmod/SitePackage.lua | grep hook
local hook = require("Hook")
local function eessi_cuda_enabled_load_hook(t)
-- Combine both functions into a single one, as we can only register one function as load hook in lmod
function eessi_load_hook(t)
    eessi_cuda_enabled_load_hook(t)
hook.register("load", eessi_load_hook)
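The duplication shown above can also be checked mechanically. The following is only a rough sketch: `lua_function_names` and `duplicated_hooks` are hypothetical helper names, not part of Lmod or of the EESSI scripts, and the pattern only recognises plain `function name(` and `local function name(` definitions at the start of a line.

```shell
# Print the names of the functions defined in a Lua file, one per line.
# (Hypothetical helper; only matches "function name(" and "local function name(".)
lua_function_names() {
    sed -nE 's/^[[:space:]]*(local[[:space:]]+)?function[[:space:]]+([A-Za-z_][A-Za-z0-9_]*).*/\2/p' "$1" | sort -u
}

# Print every function name that is defined in both Lua files.
duplicated_hooks() {
    for name in $(lua_function_names "$1"); do
        if lua_function_names "$2" | grep -qxF -- "$name"; then
            printf '%s\n' "$name"
        fi
    done
}

# Example call (paths are placeholders for the two files compared above):
#   duplicated_hooks /path/to/.lmod/lmodrc.lua /path/to/.lmod/SitePackage.lua
```

An empty result would suggest that the hook definitions live in only one of the two files.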

We need to make a small change to convince the installation script to reinstall the lmodrc file. I've done that by simply adding a comment - not essential, but it doesn't hurt either.

Also, we forgot to remove the setting of the LMOD_RC environment variable. It is no longer needed, as we now set LMOD_CONFIG_DIR, which still allows host sites to provide something that comes 'later' in the search order (see the search order on https://lmod.readthedocs.io/en/latest/145_properties.html). Also, Alan moved the setting of LMOD_RC outside of the if-condition for the minimal EESSI environment. We should do the same for the full Lmod configuration, i.e. for setting both LMOD_CONFIG_DIR and LMOD_PACKAGE_PATH. That is done here.
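As an illustration of the intended end state, here is a minimal sketch. It is not the actual content of the init script: `configure_lmod` and its `lmod_dir` argument are hypothetical names, and it assumes that lmodrc.lua and SitePackage.lua sit in the same `.lmod` directory, as in the listing above.

```shell
# Sketch: point Lmod at a directory holding lmodrc.lua and SitePackage.lua.
# LMOD_RC is deliberately left untouched, so a host site can still use it.
configure_lmod() {
    lmod_dir="$1"   # hypothetical argument, e.g. <software subdir>/.lmod
    # Directory in which Lmod looks for lmodrc.lua
    export LMOD_CONFIG_DIR="$lmod_dir"
    # Directory in which Lmod looks for SitePackage.lua
    export LMOD_PACKAGE_PATH="$lmod_dir"
}

# Example call (placeholder path); in this sketch the call is made
# unconditionally, i.e. outside any "minimal environment" if-condition:
#   configure_lmod "/path/to/software/subdir/.lmod"
```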

eessi-bot-aws bot commented Mar 29, 2024

Instance eessi-bot-mc-aws is configured to build:

  • arch x86_64/generic for repo eessi-hpc.org-2023.06-compat
  • arch x86_64/generic for repo eessi-hpc.org-2023.06-software
  • arch x86_64/generic for repo eessi.io-2023.06-compat
  • arch x86_64/generic for repo eessi.io-2023.06-software
  • arch x86_64/intel/haswell for repo eessi-hpc.org-2023.06-compat
  • arch x86_64/intel/haswell for repo eessi-hpc.org-2023.06-software
  • arch x86_64/intel/haswell for repo eessi.io-2023.06-compat
  • arch x86_64/intel/haswell for repo eessi.io-2023.06-software
  • arch x86_64/intel/skylake_avx512 for repo eessi-hpc.org-2023.06-compat
  • arch x86_64/intel/skylake_avx512 for repo eessi-hpc.org-2023.06-software
  • arch x86_64/intel/skylake_avx512 for repo eessi.io-2023.06-compat
  • arch x86_64/intel/skylake_avx512 for repo eessi.io-2023.06-software
  • arch x86_64/amd/zen2 for repo eessi-hpc.org-2023.06-compat
  • arch x86_64/amd/zen2 for repo eessi-hpc.org-2023.06-software
  • arch x86_64/amd/zen2 for repo eessi.io-2023.06-compat
  • arch x86_64/amd/zen2 for repo eessi.io-2023.06-software
  • arch x86_64/amd/zen3 for repo eessi-hpc.org-2023.06-compat
  • arch x86_64/amd/zen3 for repo eessi-hpc.org-2023.06-software
  • arch x86_64/amd/zen3 for repo eessi.io-2023.06-compat
  • arch x86_64/amd/zen3 for repo eessi.io-2023.06-software
  • arch aarch64/generic for repo eessi-hpc.org-2023.06-compat
  • arch aarch64/generic for repo eessi-hpc.org-2023.06-software
  • arch aarch64/generic for repo eessi.io-2023.06-compat
  • arch aarch64/generic for repo eessi.io-2023.06-software
  • arch aarch64/neoverse_n1 for repo eessi-hpc.org-2023.06-compat
  • arch aarch64/neoverse_n1 for repo eessi-hpc.org-2023.06-software
  • arch aarch64/neoverse_n1 for repo eessi.io-2023.06-compat
  • arch aarch64/neoverse_n1 for repo eessi.io-2023.06-software
  • arch aarch64/neoverse_v1 for repo eessi-hpc.org-2023.06-compat
  • arch aarch64/neoverse_v1 for repo eessi-hpc.org-2023.06-software
  • arch aarch64/neoverse_v1 for repo eessi.io-2023.06-compat
  • arch aarch64/neoverse_v1 for repo eessi.io-2023.06-software

@casparvl
Collaborator Author

bot: build repo:eessi.io-2023.06-software arch:x86_64/generic
bot: build repo:eessi.io-2023.06-software arch:x86_64/intel/haswell
bot: build repo:eessi.io-2023.06-software arch:x86_64/intel/skylake_avx512
bot: build repo:eessi.io-2023.06-software arch:x86_64/amd/zen2
bot: build repo:eessi.io-2023.06-software arch:x86_64/amd/zen3
bot: build repo:eessi.io-2023.06-software arch:aarch64/generic
bot: build repo:eessi.io-2023.06-software arch:aarch64/neoverse_n1
bot: build repo:eessi.io-2023.06-software arch:aarch64/neoverse_v1
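For reference, a list of commands like the one above could be generated rather than typed by hand. This is just a convenience sketch; `bot_build_commands` is a hypothetical helper, and the output format simply mirrors the lines in this comment.

```shell
# Print one "bot: build" command per architecture for the given repository.
# Usage: bot_build_commands <repo> <arch> [<arch> ...]
bot_build_commands() {
    repo="$1"
    shift
    for arch in "$@"; do
        printf 'bot: build repo:%s arch:%s\n' "$repo" "$arch"
    done
}

# Example call:
#   bot_build_commands eessi.io-2023.06-software x86_64/generic aarch64/generic
```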

eessi-bot-aws bot commented Mar 29, 2024

Updates by the bot instance eessi-bot-mc-aws (click for details)

eessi-bot-aws bot commented Mar 29, 2024

New job on instance eessi-bot-mc-aws for architecture x86_64-generic for repository eessi.io-2023.06-software in job dir /project/def-users/SHARED/jobs/2024.03/pr_524/8760

date job status comment
Mar 29 14:58:30 UTC 2024 submitted job id 8760 awaits release by job manager
Mar 29 14:59:31 UTC 2024 released job awaits launch by Slurm scheduler
Mar 29 15:05:50 UTC 2024 running job 8760 is running
Mar 29 15:21:02 UTC 2024 finished
😁 SUCCESS (click triangle for details)
Details
✅ job output file slurm-8760.out
✅ no message matching ERROR:
✅ no message matching FAILED:
✅ no message matching required modules missing:
✅ found message(s) matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
eessi-2023.06-software-linux-x86_64-generic-1711724752.tar.gz
size: 0 MiB (1762 bytes)
entries: 2
modules under 2023.06/software/linux/x86_64/generic/modules/all
no module files in tarball
software under 2023.06/software/linux/x86_64/generic/software
no software packages in tarball
other under 2023.06/software/linux/x86_64/generic
2023.06/init/eessi_environment_variables
.lmod/lmodrc.lua
Mar 29 15:21:02 UTC 2024 test result
😁 SUCCESS (click triangle for details)
ReFrame Summary
[ PASSED ] Ran 9/9 test case(s) from 9 check(s) (0 failure(s), 0 skipped, 0 aborted)
Details
✅ job output file slurm-8760.out
✅ no message matching ERROR:
✅ no message matching [\s*FAILED\s*].*Ran .* test case
Mar 29 16:05:09 UTC 2024 uploaded transfer of eessi-2023.06-software-linux-x86_64-generic-1711724752.tar.gz to S3 bucket succeeded
Apr 02 13:52:00 UTC 2024 uploaded transfer of eessi-2023.06-software-linux-x86_64-generic-1711724752.tar.gz to S3 bucket succeeded
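The checks listed under "Details" in these job reports can be approximated by hand on a job output file. The sketch below is not the bot's implementation: `check_build_log` is a hypothetical name, and it treats every pattern as a fixed string, which may differ from how the bot matches messages.

```shell
# Rough re-creation of the build checks on a Slurm job output file.
# Returns 0 and prints SUCCESS if all checks pass, otherwise returns 1.
check_build_log() {
    log="$1"
    # Messages that must not appear in the log
    for pattern in 'ERROR:' 'FAILED:' 'required modules missing:'; do
        if grep -qF -- "$pattern" "$log"; then
            echo "found unexpected message matching $pattern"
            return 1
        fi
    done
    # Messages that must appear in the log
    for pattern in 'No missing installations' '.tar.gz created!'; do
        if ! grep -qF -- "$pattern" "$log"; then
            echo "no message matching $pattern"
            return 1
        fi
    done
    echo "SUCCESS"
}

# Example call:
#   check_build_log slurm-8760.out
```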

eessi-bot-aws bot commented Mar 29, 2024

New job on instance eessi-bot-mc-aws for architecture x86_64-intel-haswell for repository eessi.io-2023.06-software in job dir /project/def-users/SHARED/jobs/2024.03/pr_524/8761

date job status comment
Mar 29 14:58:34 UTC 2024 submitted job id 8761 awaits release by job manager
Mar 29 14:59:33 UTC 2024 released job awaits launch by Slurm scheduler
Mar 29 15:06:57 UTC 2024 running job 8761 is running
Mar 29 15:21:04 UTC 2024 finished
😁 SUCCESS (click triangle for details)
Details
✅ job output file slurm-8761.out
✅ no message matching ERROR:
✅ no message matching FAILED:
✅ no message matching required modules missing:
✅ found message(s) matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
eessi-2023.06-software-linux-x86_64-intel-haswell-1711724777.tar.gz
size: 0 MiB (1768 bytes)
entries: 2
modules under 2023.06/software/linux/x86_64/intel/haswell/modules/all
no module files in tarball
software under 2023.06/software/linux/x86_64/intel/haswell/software
no software packages in tarball
other under 2023.06/software/linux/x86_64/intel/haswell
2023.06/init/eessi_environment_variables
.lmod/lmodrc.lua
Mar 29 15:21:04 UTC 2024 test result
😁 SUCCESS (click triangle for details)
ReFrame Summary
[ PASSED ] Ran 9/9 test case(s) from 9 check(s) (0 failure(s), 0 skipped, 0 aborted)
Details
✅ job output file slurm-8761.out
✅ no message matching ERROR:
✅ no message matching [\s*FAILED\s*].*Ran .* test case
Mar 29 16:05:28 UTC 2024 uploaded transfer of eessi-2023.06-software-linux-x86_64-intel-haswell-1711724777.tar.gz to S3 bucket succeeded
Apr 02 13:52:20 UTC 2024 uploaded transfer of eessi-2023.06-software-linux-x86_64-intel-haswell-1711724777.tar.gz to S3 bucket succeeded

eessi-bot-aws bot commented Mar 29, 2024

New job on instance eessi-bot-mc-aws for architecture x86_64-intel-skylake_avx512 for repository eessi.io-2023.06-software in job dir /project/def-users/SHARED/jobs/2024.03/pr_524/8762

date job status comment
Mar 29 14:58:37 UTC 2024 submitted job id 8762 awaits release by job manager
Mar 29 14:59:34 UTC 2024 released job awaits launch by Slurm scheduler
Mar 29 15:06:58 UTC 2024 running job 8762 is running
Mar 29 15:20:00 UTC 2024 finished
😁 SUCCESS (click triangle for details)
Details
✅ job output file slurm-8762.out
✅ no message matching ERROR:
✅ no message matching FAILED:
✅ no message matching required modules missing:
✅ found message(s) matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
eessi-2023.06-software-linux-x86_64-intel-skylake_avx512-1711724776.tar.gz
size: 0 MiB (1778 bytes)
entries: 2
modules under 2023.06/software/linux/x86_64/intel/skylake_avx512/modules/all
no module files in tarball
software under 2023.06/software/linux/x86_64/intel/skylake_avx512/software
no software packages in tarball
other under 2023.06/software/linux/x86_64/intel/skylake_avx512
2023.06/init/eessi_environment_variables
.lmod/lmodrc.lua
Mar 29 15:20:00 UTC 2024 test result
😁 SUCCESS (click triangle for details)
ReFrame Summary
[ PASSED ] Ran 9/9 test case(s) from 9 check(s) (0 failure(s), 0 skipped, 0 aborted)
Details
✅ job output file slurm-8762.out
✅ no message matching ERROR:
✅ no message matching [\s*FAILED\s*].*Ran .* test case
Mar 29 16:05:51 UTC 2024 uploaded transfer of eessi-2023.06-software-linux-x86_64-intel-skylake_avx512-1711724776.tar.gz to S3 bucket succeeded
Apr 02 13:52:39 UTC 2024 uploaded transfer of eessi-2023.06-software-linux-x86_64-intel-skylake_avx512-1711724776.tar.gz to S3 bucket succeeded

eessi-bot-aws bot commented Mar 29, 2024

New job on instance eessi-bot-mc-aws for architecture x86_64-amd-zen2 for repository eessi.io-2023.06-software in job dir /project/def-users/SHARED/jobs/2024.03/pr_524/8763

date job status comment
Mar 29 14:58:41 UTC 2024 submitted job id 8763 awaits release by job manager
Mar 29 14:59:27 UTC 2024 released job awaits launch by Slurm scheduler
Mar 29 15:05:46 UTC 2024 running job 8763 is running
Mar 29 15:24:07 UTC 2024 finished
😁 SUCCESS (click triangle for details)
Details
✅ job output file slurm-8763.out
✅ no message matching ERROR:
✅ no message matching FAILED:
✅ no message matching required modules missing:
✅ found message(s) matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
eessi-2023.06-software-linux-x86_64-amd-zen2-1711724752.tar.gz
size: 0 MiB (1765 bytes)
entries: 2
modules under 2023.06/software/linux/x86_64/amd/zen2/modules/all
no module files in tarball
software under 2023.06/software/linux/x86_64/amd/zen2/software
no software packages in tarball
other under 2023.06/software/linux/x86_64/amd/zen2
2023.06/init/eessi_environment_variables
.lmod/lmodrc.lua
Mar 29 15:24:07 UTC 2024 test result
😁 SUCCESS (click triangle for details)
ReFrame Summary
[ PASSED ] Ran 9/9 test case(s) from 9 check(s) (0 failure(s), 0 skipped, 0 aborted)
Details
✅ job output file slurm-8763.out
✅ no message matching ERROR:
✅ no message matching [\s*FAILED\s*].*Ran .* test case
Mar 29 16:06:10 UTC 2024 uploaded transfer of eessi-2023.06-software-linux-x86_64-amd-zen2-1711724752.tar.gz to S3 bucket succeeded
Apr 02 13:52:58 UTC 2024 uploaded transfer of eessi-2023.06-software-linux-x86_64-amd-zen2-1711724752.tar.gz to S3 bucket succeeded

eessi-bot-aws bot commented Mar 29, 2024

New job on instance eessi-bot-mc-aws for architecture x86_64-amd-zen3 for repository eessi.io-2023.06-software in job dir /project/def-users/SHARED/jobs/2024.03/pr_524/8764

date job status comment
Mar 29 14:58:44 UTC 2024 submitted job id 8764 awaits release by job manager
Mar 29 14:59:29 UTC 2024 released job awaits launch by Slurm scheduler
Mar 29 15:05:48 UTC 2024 running job 8764 is running
Mar 29 15:16:49 UTC 2024 finished
😁 SUCCESS (click triangle for details)
Details
✅ job output file slurm-8764.out
✅ no message matching ERROR:
✅ no message matching FAILED:
✅ no message matching required modules missing:
✅ found message(s) matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
eessi-2023.06-software-linux-x86_64-amd-zen3-1711724749.tar.gz
size: 0 MiB (1766 bytes)
entries: 2
modules under 2023.06/software/linux/x86_64/amd/zen3/modules/all
no module files in tarball
software under 2023.06/software/linux/x86_64/amd/zen3/software
no software packages in tarball
other under 2023.06/software/linux/x86_64/amd/zen3
2023.06/init/eessi_environment_variables
.lmod/lmodrc.lua
Mar 29 15:16:49 UTC 2024 test result
😁 SUCCESS (click triangle for details)
ReFrame Summary
[ PASSED ] Ran 9/9 test case(s) from 9 check(s) (0 failure(s), 0 skipped, 0 aborted)
Details
✅ job output file slurm-8764.out
✅ no message matching ERROR:
✅ no message matching [\s*FAILED\s*].*Ran .* test case
Mar 29 16:06:29 UTC 2024 uploaded transfer of eessi-2023.06-software-linux-x86_64-amd-zen3-1711724749.tar.gz to S3 bucket succeeded
Apr 02 13:53:17 UTC 2024 uploaded transfer of eessi-2023.06-software-linux-x86_64-amd-zen3-1711724749.tar.gz to S3 bucket succeeded

eessi-bot-aws bot commented Mar 29, 2024

New job on instance eessi-bot-mc-aws for architecture aarch64-generic for repository eessi.io-2023.06-software in job dir /project/def-users/SHARED/jobs/2024.03/pr_524/8765

date job status comment
Mar 29 14:58:48 UTC 2024 submitted job id 8765 awaits release by job manager
Mar 29 14:59:22 UTC 2024 released job awaits launch by Slurm scheduler
Mar 29 15:03:36 UTC 2024 running job 8765 is running
Mar 29 15:13:35 UTC 2024 finished
😁 SUCCESS (click triangle for details)
Details
✅ job output file slurm-8765.out
✅ no message matching ERROR:
✅ no message matching FAILED:
✅ no message matching required modules missing:
✅ found message(s) matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
eessi-2023.06-software-linux-aarch64-generic-1711724639.tar.gz
size: 0 MiB (1765 bytes)
entries: 2
modules under 2023.06/software/linux/aarch64/generic/modules/all
no module files in tarball
software under 2023.06/software/linux/aarch64/generic/software
no software packages in tarball
other under 2023.06/software/linux/aarch64/generic
2023.06/init/eessi_environment_variables
.lmod/lmodrc.lua
Mar 29 15:13:35 UTC 2024 test result
😁 SUCCESS (click triangle for details)
ReFrame Summary
[ PASSED ] Ran 9/9 test case(s) from 9 check(s) (0 failure(s), 0 skipped, 0 aborted)
Details
✅ job output file slurm-8765.out
✅ no message matching ERROR:
✅ no message matching [\s*FAILED\s*].*Ran .* test case
Mar 29 16:06:48 UTC 2024 uploaded transfer of eessi-2023.06-software-linux-aarch64-generic-1711724639.tar.gz to S3 bucket succeeded
Apr 02 13:53:36 UTC 2024 uploaded transfer of eessi-2023.06-software-linux-aarch64-generic-1711724639.tar.gz to S3 bucket succeeded

eessi-bot-aws bot commented Mar 29, 2024

New job on instance eessi-bot-mc-aws for architecture aarch64-neoverse_n1 for repository eessi.io-2023.06-software in job dir /project/def-users/SHARED/jobs/2024.03/pr_524/8766

date job status comment
Mar 29 14:58:51 UTC 2024 submitted job id 8766 awaits release by job manager
Mar 29 14:59:24 UTC 2024 released job awaits launch by Slurm scheduler
Mar 29 15:03:38 UTC 2024 running job 8766 is running
Mar 29 15:13:37 UTC 2024 finished
😁 SUCCESS (click triangle for details)
Details
✅ job output file slurm-8766.out
✅ no message matching ERROR:
✅ no message matching FAILED:
✅ no message matching required modules missing:
✅ found message(s) matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
eessi-2023.06-software-linux-aarch64-neoverse_n1-1711724639.tar.gz
size: 0 MiB (1764 bytes)
entries: 2
modules under 2023.06/software/linux/aarch64/neoverse_n1/modules/all
no module files in tarball
software under 2023.06/software/linux/aarch64/neoverse_n1/software
no software packages in tarball
other under 2023.06/software/linux/aarch64/neoverse_n1
2023.06/init/eessi_environment_variables
.lmod/lmodrc.lua
Mar 29 15:13:37 UTC 2024 test result
😁 SUCCESS (click triangle for details)
ReFrame Summary
[ PASSED ] Ran 9/9 test case(s) from 9 check(s) (0 failure(s), 0 skipped, 0 aborted)
Details
✅ job output file slurm-8766.out
✅ no message matching ERROR:
✅ no message matching [\s*FAILED\s*].*Ran .* test case
Mar 29 16:07:07 UTC 2024 uploaded transfer of eessi-2023.06-software-linux-aarch64-neoverse_n1-1711724639.tar.gz to S3 bucket succeeded
Apr 02 13:53:55 UTC 2024 uploaded transfer of eessi-2023.06-software-linux-aarch64-neoverse_n1-1711724639.tar.gz to S3 bucket succeeded

eessi-bot-aws bot commented Mar 29, 2024

New job on instance eessi-bot-mc-aws for architecture aarch64-neoverse_v1 for repository eessi.io-2023.06-software in job dir /project/def-users/SHARED/jobs/2024.03/pr_524/8767

date job status comment
Mar 29 14:58:55 UTC 2024 submitted job id 8767 awaits release by job manager
Mar 29 14:59:26 UTC 2024 released job awaits launch by Slurm scheduler
Mar 29 15:03:39 UTC 2024 running job 8767 is running
Mar 29 15:11:24 UTC 2024 finished
😁 SUCCESS (click triangle for details)
Details
✅ job output file slurm-8767.out
✅ no message matching ERROR:
✅ no message matching FAILED:
✅ no message matching required modules missing:
✅ found message(s) matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
eessi-2023.06-software-linux-aarch64-neoverse_v1-1711724637.tar.gz
size: 0 MiB (1764 bytes)
entries: 2
modules under 2023.06/software/linux/aarch64/neoverse_v1/modules/all
no module files in tarball
software under 2023.06/software/linux/aarch64/neoverse_v1/software
no software packages in tarball
other under 2023.06/software/linux/aarch64/neoverse_v1
2023.06/init/eessi_environment_variables
.lmod/lmodrc.lua
Mar 29 15:11:24 UTC 2024 test result
😁 SUCCESS (click triangle for details)
ReFrame Summary
[ PASSED ] Ran 9/9 test case(s) from 9 check(s) (0 failure(s), 0 skipped, 0 aborted)
Details
✅ job output file slurm-8767.out
✅ no message matching ERROR:
✅ no message matching [\s*FAILED\s*].*Ran .* test case
Mar 29 16:07:25 UTC 2024 uploaded transfer of eessi-2023.06-software-linux-aarch64-neoverse_v1-1711724637.tar.gz to S3 bucket succeeded
Apr 02 13:54:14 UTC 2024 uploaded transfer of eessi-2023.06-software-linux-aarch64-neoverse_v1-1711724637.tar.gz to S3 bucket succeeded

@trz42
Collaborator

trz42 commented Mar 29, 2024

Checklist before starting deployment (setting bot:deploy label):

  • Check if the SPDX license identifier is provided
    • not needed, only updating files provided by this EESSI GitHub repo
  • Check whether builds for all required architectures succeed (SUCCESS message + reasonably sized tarball)
  • Check if the PR is up-to-date with the target branch 2023.06-software.eessi.io in the repository (if not what are the differences)
    • it actually lacks 834786e, but this should not result in conflicts
  • Assess if all requested changes are sound (checking files changed on GitHub.com)
    • changes make sense and are well described
  • Verify that all easyconfig/s being built are included with the EB version used (if not why not)
    • no easyconfigs were used
  • Review changes (if any) needed to get the build(s) to succeed (common changes for all architectures, changes for a single architecture, changes because of build environment specifics, etc.)
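
For the "reasonably sized tarball" item, a reviewer who has a tarball at hand could inspect it along these lines. This is only a sketch: `tarball_summary` is a hypothetical helper, and its entry count is simply the number of members listed by `tar`, which may not be how the bot computes "entries".

```shell
# Print the size, the number of members, and the member list of a gzipped tarball.
tarball_summary() {
    tarball="$1"
    size_bytes=$(wc -c < "$tarball" | tr -d ' ')
    entry_count=$(tar -tzf "$tarball" | wc -l | tr -d ' ')
    echo "size: $size_bytes bytes"
    echo "entries: $entry_count"
    # List the members so unexpected files are easy to spot
    tar -tzf "$tarball"
}

# Example call:
#   tarball_summary eessi-2023.06-software-linux-x86_64-generic-1711724752.tar.gz
```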

@trz42 trz42 added the bot:deploy Ask bot to deploy missing software installations to EESSI label Mar 29, 2024
@casparvl casparvl added the bug Something isn't working label Mar 29, 2024
@bedroge bedroge added bot:deploy Ask bot to deploy missing software installations to EESSI and removed bot:deploy Ask bot to deploy missing software installations to EESSI labels Apr 2, 2024
Collaborator

@bedroge bedroge left a comment

The tarballs have been ingested.

@bedroge bedroge merged commit 3bca5b7 into EESSI:2023.06-software.eessi.io Apr 2, 2024
33 checks passed
@boegel boegel changed the title from "Fix lmodrc" to "configure Lmod via $LMOD_CONFIG_DIR and $LMOD_PACKAGE_PATH + don't set $LMOD_RC" Apr 2, 2024
Labels
bot:deploy Ask bot to deploy missing software installations to EESSI bug Something isn't working

3 participants