Split off the Lmod hooks from lmodrc.lua into a separate SitePackage.lua file #496

Merged

Conversation

casparvl
Collaborator

@casparvl casparvl commented Mar 11, 2024

Fixes #491

This PR splits the current lmodrc.lua files into two parts. The first is an lmodrc.lua containing only what actually belongs there, i.e. the definitions of the propT and scDescriptT tables. The second is a SitePackage.lua containing the Lmod hooks (https://lmod.readthedocs.io/en/latest/170_hooks.html).
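
As a rough illustration of how the two files are picked up: Lmod reads the file named by `LMOD_RC` for the properties table and looks for `SitePackage.lua` in the directory named by `LMOD_PACKAGE_PATH`. The sketch below assumes both files sit in a `.lmod` directory under the software prefix, as the artefact listings in this thread suggest. The function name `eessi_lmod_env` is hypothetical and this is not the code changed by this PR.

```shell
#!/usr/bin/env bash
# Hypothetical helper (name and layout are assumptions): point Lmod at
# lmodrc.lua and SitePackage.lua found in <prefix>/.lmod.
eessi_lmod_env() {
    local prefix="$1"
    local lmod_dir="${prefix}/.lmod"

    # LMOD_RC names the file that defines the propT and scDescriptT tables.
    if [ -f "${lmod_dir}/lmodrc.lua" ]; then
        export LMOD_RC="${lmod_dir}/lmodrc.lua"
    fi
    # LMOD_PACKAGE_PATH names the directory in which Lmod looks for SitePackage.lua.
    if [ -f "${lmod_dir}/SitePackage.lua" ]; then
        export LMOD_PACKAGE_PATH="${lmod_dir}"
    fi
}
```

Each variable is only exported when the corresponding file exists, so a prefix that still ships a single combined lmodrc.lua is left to Lmod's default `SitePackage.lua` lookup.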

eessi-bot-aws bot commented Mar 11, 2024

Instance eessi-bot-mc-aws is configured to build:

  • arch x86_64/generic for repo eessi-hpc.org-2023.06-compat
  • arch x86_64/generic for repo eessi-hpc.org-2023.06-software
  • arch x86_64/generic for repo eessi.io-2023.06-compat
  • arch x86_64/generic for repo eessi.io-2023.06-software
  • arch x86_64/intel/haswell for repo eessi-hpc.org-2023.06-compat
  • arch x86_64/intel/haswell for repo eessi-hpc.org-2023.06-software
  • arch x86_64/intel/haswell for repo eessi.io-2023.06-compat
  • arch x86_64/intel/haswell for repo eessi.io-2023.06-software
  • arch x86_64/intel/skylake_avx512 for repo eessi-hpc.org-2023.06-compat
  • arch x86_64/intel/skylake_avx512 for repo eessi-hpc.org-2023.06-software
  • arch x86_64/intel/skylake_avx512 for repo eessi.io-2023.06-compat
  • arch x86_64/intel/skylake_avx512 for repo eessi.io-2023.06-software
  • arch x86_64/amd/zen2 for repo eessi-hpc.org-2023.06-compat
  • arch x86_64/amd/zen2 for repo eessi-hpc.org-2023.06-software
  • arch x86_64/amd/zen2 for repo eessi.io-2023.06-compat
  • arch x86_64/amd/zen2 for repo eessi.io-2023.06-software
  • arch x86_64/amd/zen3 for repo eessi-hpc.org-2023.06-compat
  • arch x86_64/amd/zen3 for repo eessi-hpc.org-2023.06-software
  • arch x86_64/amd/zen3 for repo eessi.io-2023.06-compat
  • arch x86_64/amd/zen3 for repo eessi.io-2023.06-software
  • arch aarch64/generic for repo eessi-hpc.org-2023.06-compat
  • arch aarch64/generic for repo eessi-hpc.org-2023.06-software
  • arch aarch64/generic for repo eessi.io-2023.06-compat
  • arch aarch64/generic for repo eessi.io-2023.06-software
  • arch aarch64/neoverse_n1 for repo eessi-hpc.org-2023.06-compat
  • arch aarch64/neoverse_n1 for repo eessi-hpc.org-2023.06-software
  • arch aarch64/neoverse_n1 for repo eessi.io-2023.06-compat
  • arch aarch64/neoverse_n1 for repo eessi.io-2023.06-software
  • arch aarch64/neoverse_v1 for repo eessi-hpc.org-2023.06-compat
  • arch aarch64/neoverse_v1 for repo eessi-hpc.org-2023.06-software
  • arch aarch64/neoverse_v1 for repo eessi.io-2023.06-compat
  • arch aarch64/neoverse_v1 for repo eessi.io-2023.06-software

@casparvl casparvl changed the title from "Split off the LMOD hooks from lrmodrc.lua into a seperate SitePackage.lua file" to "Split off the LMOD hooks from lmodrc.lua into a seperate SitePackage.lua file" Mar 11, 2024
@casparvl casparvl marked this pull request as ready for review March 11, 2024 15:24
Caspar van Leeuwen added 6 commits March 12, 2024 19:12
…n an explicit list from one dir to another. Use that generic function to replace the existing code copying the scripts dir, and the scripts/gpu_support/nvidia dir. Add to that a copy of the init dir and the current subdirs, again listing the files to copy explicitly for safety.
…ck for three arguments, it can be anything greater than 2
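
The commit messages above describe a generic helper that copies an explicit list of files from one directory to another, with an argument check that accepts anything greater than two arguments. A minimal sketch of that idea follows. The function name `copy_listed_files` and its argument order are assumptions, not the code in install_scripts.sh.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: copy an explicit list of files from one directory
# to another, creating the destination directory if needed.
# Usage: copy_listed_files <source_dir> <destination_dir> <file> [<file> ...]
copy_listed_files() {
    # Two directories plus at least one file name: any count greater than 2 is valid.
    if [ "$#" -le 2 ]; then
        echo "Usage: copy_listed_files <source_dir> <destination_dir> <file> [<file> ...]" >&2
        return 1
    fi
    local source_dir="$1"
    local destination_dir="$2"
    shift 2

    mkdir -p "${destination_dir}" || return 1
    local file
    for file in "$@"; do
        # Only files named explicitly are copied; anything else in source_dir is ignored.
        cp "${source_dir}/${file}" "${destination_dir}/${file}" || return 1
    done
}
```

Listing the files explicitly means a stray file in the source directory never ends up in the destination by accident, which matches the "for safety" motivation in the commit message.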
@casparvl
Collaborator Author

bot: build repo:eessi.io-2023.06-software arch:x86_64/generic

eessi-bot-aws bot commented Mar 12, 2024

Updates by the bot instance eessi-bot-mc-aws (click for details)

eessi-bot-aws bot commented Mar 12, 2024

New job on instance eessi-bot-mc-aws for architecture x86_64-generic for repository eessi.io-2023.06-software in job dir /project/def-users/SHARED/jobs/2024.03/pr_496/7600

date | job status | comment
Mar 12 18:26:16 UTC 2024 | submitted | job id 7600 awaits release by job manager
Mar 12 18:26:32 UTC 2024 | released | job awaits launch by Slurm scheduler
Mar 12 18:31:44 UTC 2024 | finished | 😁 SUCCESS
Details
✅ job output file slurm-7600.out
✅ no message matching ERROR:
✅ no message matching FAILED:
✅ no message matching required modules missing:
✅ found message(s) matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
eessi-2023.06-software-linux-x86_64-generic-1710268279.tar.gz
size: 0 MiB (206275 bytes)
entries: 5
modules under 2023.06/software/linux/x86_64/generic/modules/all
  no module files in tarball
software under 2023.06/software/linux/x86_64/generic/software
  no software packages in tarball
other under 2023.06/software/linux/x86_64/generic
  .lmod/cache/spiderT.lua
  .lmod/cache/spiderT.luac_5.1
  .lmod/cache/timestamp
  .lmod/lmodrc.lua
  .lmod/SitePackage.lua
Mar 12 18:31:44 UTC 2024 | test result | (no tests yet)

@casparvl
Collaborator Author

From the logs:

Files /project/60006/SHARED/jobs/2024.03/pr_496/event_00e423b0-e09e-11ee-9fcb-e53cb1a7fa7c/run_000/linux_x86_64_generic/eessi.io-2023.06-software/init/eessi_environment_variables and /cvmfs/software.eessi.io/versions/2023.06/init/eessi_environment_variables differ
File /project/60006/SHARED/jobs/2024.03/pr_496/event_00e423b0-e09e-11ee-9fcb-e53cb1a7fa7c/run_000/linux_x86_64_generic/eessi.io-2023.06-software/init/eessi_environment_variables copied to /cvmfs/software.eessi.io/versions/2023.06/init/eessi_environment_variables

It does get copied, but looking at the artefacts, it does not end up in the tarball. I'm pretty sure our create_tarball.sh needs a change so that it also considers anything in init. For scripts, for example, it does:

# include scripts that were copied by install_scripts.sh, which we want to ship in EESSI repository
if [ -d ${eessi_version}/scripts ]; then
    find ${eessi_version}/scripts -type f | grep -v '/\.wh\.' >> ${files_list}
fi
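
By analogy with the snippet above, a block for `init` might look like the following. This is only a sketch of the proposed change, not necessarily what was committed. It reuses the `eessi_version` and `files_list` names from create_tarball.sh, while the wrapper function `list_init_files` is hypothetical and exists only so the example can run on its own.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: append files under <eessi_version>/init to the list of
# files to ship, skipping entries whose name starts with '.wh.' (the same
# filter the scripts block applies).
list_init_files() {
    local eessi_version="$1"
    local files_list="$2"
    if [ -d "${eessi_version}/init" ]; then
        find "${eessi_version}/init" -type f | grep -v '/\.wh\.' >> "${files_list}"
    fi
    # grep exits non-zero when nothing is left after filtering; that is not an error here.
    return 0
}
```

Called from the directory that contains the version directory, e.g. `list_init_files 2023.06 files.list.txt`, it appends relative paths such as `2023.06/init/eessi_environment_variables` to the list.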

@casparvl
Collaborator Author

bot: build repo:eessi.io-2023.06-software arch:x86_64/generic

eessi-bot-aws bot commented Mar 14, 2024

Updates by the bot instance eessi-bot-mc-aws (click for details)
  • received bot command build repo:eessi.io-2023.06-software arch:x86_64/generic from casparvl
    • expanded format: build repository:eessi.io-2023.06-software architecture:x86_64/generic

eessi-bot-aws bot commented Mar 14, 2024

496.diff:187: trailing whitespace.
-- If we try to load CUDA itself, check if the full CUDA SDK was installed on the host in host_injections.
error: patch failed: init/eessi_environment_variables:76
error: init/eessi_environment_variables: patch does not apply
Unable to download or merge changes between the source branch and the destination branch.
Tip: This can usually be resolved by syncing your branch and resolving any merge conflicts.

@casparvl
Collaborator Author

bot: build repo:eessi.io-2023.06-software arch:x86_64/generic

eessi-bot-aws bot commented Mar 14, 2024

Updates by the bot instance eessi-bot-mc-aws (click for details)

eessi-bot-aws bot commented Mar 14, 2024

New job on instance eessi-bot-mc-aws for architecture x86_64-generic for repository eessi.io-2023.06-software in job dir /project/def-users/SHARED/jobs/2024.03/pr_496/8113

date | job status | comment
Mar 14 08:40:46 UTC 2024 | submitted | job id 8113 awaits release by job manager
Mar 14 08:41:08 UTC 2024 | released | job awaits launch by Slurm scheduler
Mar 14 08:45:28 UTC 2024 | finished | 😁 SUCCESS
Details
✅ job output file slurm-8113.out
✅ no message matching ERROR:
✅ no message matching FAILED:
✅ no message matching required modules missing:
✅ found message(s) matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
eessi-2023.06-software-linux-x86_64-generic-1710405898.tar.gz
size: 0 MiB (208739 bytes)
entries: 6
modules under 2023.06/software/linux/x86_64/generic/modules/all
  no module files in tarball
software under 2023.06/software/linux/x86_64/generic/software
  no software packages in tarball
other under 2023.06/software/linux/x86_64/generic
  2023.06/init/eessi_environment_variables
  .lmod/cache/spiderT.lua
  .lmod/cache/spiderT.luac_5.1
  .lmod/cache/timestamp
  .lmod/lmodrc.lua
  .lmod/SitePackage.lua
Mar 14 08:45:28 UTC 2024 | test result | (no tests yet)

Caspar van Leeuwen added 2 commits March 14, 2024 17:47
Collaborator

@trz42 trz42 left a comment

Looks good. Nice work. Walked through this with @casparvl during the hackathon on 2024-03-26.

The tarball contents listed by the bot in this PR are misleading. We looked at the actual contents and they are OK. Also, the deployment/ingestion procedure is essentially the same for init and software tarballs, so combining these in a single tarball should work for now.
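
One way to check what a build tarball really contains, rather than relying on the bot's summary, is to list its entries directly. The sketch below is an assumption about how such a check could be done, not a record of what was run at the hackathon. The function name `list_tarball_files` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the non-directory entries of a gzipped tarball,
# one path per line, sorted. Directory entries (ending in '/') are dropped.
list_tarball_files() {
    local tarball="$1"
    tar -tzf "${tarball}" | grep -v '/$' | sort
}
```

For example, `list_tarball_files eessi-2023.06-software-linux-x86_64-generic-1710405898.tar.gz` would print the full path of every file in the tarball from the second build above, which makes it easy to see whether files under `init` sit next to the `.lmod` files.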

@trz42 trz42 merged commit 784df83 into EESSI:2023.06-software.eessi.io Mar 26, 2024
34 checks passed
@bedroge
Collaborator

bedroge commented Mar 26, 2024

bot: build repo:eessi.io-2023.06-software arch:aarch64/generic
bot: build repo:eessi.io-2023.06-software arch:aarch64/neoverse_n1
bot: build repo:eessi.io-2023.06-software arch:aarch64/neoverse_v1
bot: build repo:eessi.io-2023.06-software arch:x86_64/generic
bot: build repo:eessi.io-2023.06-software arch:x86_64/amd/zen2
bot: build repo:eessi.io-2023.06-software arch:x86_64/amd/zen3
bot: build repo:eessi.io-2023.06-software arch:x86_64/intel/haswell
bot: build repo:eessi.io-2023.06-software arch:x86_64/intel/skylake_avx512

eessi-bot-aws bot commented Mar 26, 2024

Updates by the bot instance eessi-bot-mc-aws (click for details)
  • received bot command build repo:eessi.io-2023.06-software arch:aarch64/generic from bedroge
    • expanded format: build repository:eessi.io-2023.06-software architecture:aarch64/generic
  • received bot command build repo:eessi.io-2023.06-software arch:aarch64/neoverse_n1 from bedroge
    • expanded format: build repository:eessi.io-2023.06-software architecture:aarch64/neoverse_n1
  • received bot command build repo:eessi.io-2023.06-software arch:aarch64/neoverse_v1 from bedroge
    • expanded format: build repository:eessi.io-2023.06-software architecture:aarch64/neoverse_v1
  • received bot command build repo:eessi.io-2023.06-software arch:x86_64/generic from bedroge
    • expanded format: build repository:eessi.io-2023.06-software architecture:x86_64/generic
  • received bot command build repo:eessi.io-2023.06-software arch:x86_64/amd/zen2 from bedroge
    • expanded format: build repository:eessi.io-2023.06-software architecture:x86_64/amd/zen2
  • received bot command build repo:eessi.io-2023.06-software arch:x86_64/amd/zen3 from bedroge
    • expanded format: build repository:eessi.io-2023.06-software architecture:x86_64/amd/zen3
  • received bot command build repo:eessi.io-2023.06-software arch:x86_64/intel/haswell from bedroge
    • expanded format: build repository:eessi.io-2023.06-software architecture:x86_64/intel/haswell
  • received bot command build repo:eessi.io-2023.06-software arch:x86_64/intel/skylake_avx512 from bedroge
    • expanded format: build repository:eessi.io-2023.06-software architecture:x86_64/intel/skylake_avx512

eessi-bot-aws bot commented Mar 26, 2024

496.diff:187: trailing whitespace.
-- If we try to load CUDA itself, check if the full CUDA SDK was installed on the host in host_injections.
error: patch failed: EESSI-install-software.sh:248
error: EESSI-install-software.sh: patch does not apply
error: patch failed: create_lmodrc.py:17
error: create_lmodrc.py: patch does not apply
error: create_lmodsitepackage.py: already exists in working directory
error: patch failed: create_tarball.sh:51
error: create_tarball.sh: patch does not apply
error: patch failed: init/eessi_environment_variables:85
error: init/eessi_environment_variables: patch does not apply
error: patch failed: install_scripts.sh:25
error: install_scripts.sh: patch does not apply
Unable to download or merge changes between the source branch and the destination branch.
Tip: This can usually be resolved by syncing your branch and resolving any merge conflicts.

@bedroge
Collaborator

bedroge commented Mar 26, 2024

This should have been built and deployed (in order to get the new SitePackage.lua into the repo), but it doesn't seem like we can still do that now that the PR is merged.

@casparvl
Collaborator Author

Note that this did get deployed in the end, as part of another PR... See the artifacts in #511 (comment) :)

casparvl pushed a commit to casparvl/software-layer that referenced this pull request Mar 29, 2024
@boegel boegel changed the title from "Split off the LMOD hooks from lmodrc.lua into a seperate SitePackage.lua file" to "Split off the Lmod hooks from lmodrc.lua into a seperate SitePackage.lua file" Apr 4, 2024
trz42 pushed a commit to trz42/software-layer that referenced this pull request Apr 7, 2024
Successfully merging this pull request may close these issues.

Hooks should be defined in SitePackage.lua, not in lmodrc.lua
3 participants