clean ${tmpdir} after iterations in install_cuda_and_libraries.sh #814

Merged
trz42 merged 3 commits into EESSI:2023.06-software.eessi.io from TopRichard:eessi-2023.06-fix-clean-up-dir-install_cuda_and_libraries.sh
Nov 18, 2024

@TopRichard
Collaborator

When there are multiple easystacks/eessi-*CUDA*.yml files, ${tmpdir} can be cleaned after all iterations are done.

@eessi-bot

eessi-bot Bot commented Nov 14, 2024

Instance eessi-bot-mc-aws is configured to build for:

  • architectures: x86_64/generic, x86_64/intel/haswell, x86_64/intel/skylake_avx512, x86_64/amd/zen2, x86_64/amd/zen3, aarch64/generic, aarch64/neoverse_n1, aarch64/neoverse_v1
  • repositories: eessi.io-2023.06-compat, eessi-hpc.org-2023.06-software, eessi-hpc.org-2023.06-compat, eessi.io-2023.06-software

@riscv-eessi-io-bot

Instance eessi-bot-riscv is configured to build for:

  • architectures: riscv64/generic
  • repositories: riscv.eessi.io-20240402

@eessi-bot

eessi-bot Bot commented Nov 14, 2024

Instance eessi-bot-mc-azure is configured to build for:

  • architectures: x86_64/amd/zen4
  • repositories: eessi-hpc.org-2023.06-software, eessi-hpc.org-2023.06-compat, eessi.io-2023.06-software, eessi.io-2023.06-compat

@TopRichard TopRichard added the 2023.06-software.eessi.io (2023.06 version of software.eessi.io) and accel:nvidia labels Nov 14, 2024
@TopRichard
Collaborator Author

bot: build repo:eessi.io-2023.06-software arch:x86_64/generic

@riscv-eessi-io-bot

Updates by the bot instance eessi-bot-riscv (click for details)
  • account TopRichard has NO permission to send commands to the bot

@eessi-bot

eessi-bot Bot commented Nov 14, 2024

Updates by the bot instance eessi-bot-mc-azure (click for details)
  • received bot command build repo:eessi.io-2023.06-software arch:x86_64/generic from TopRichard

    • expanded format: build repository:eessi.io-2023.06-software architecture:x86_64/generic
  • handling command build repository:eessi.io-2023.06-software architecture:x86_64/generic resulted in:

    • no jobs were submitted

@eessi-bot

eessi-bot Bot commented Nov 14, 2024

Updates by the bot instance eessi-bot-mc-aws (click for details)
  • received bot command build repo:eessi.io-2023.06-software arch:x86_64/generic from TopRichard

    • expanded format: build repository:eessi.io-2023.06-software architecture:x86_64/generic
  • handling command build repository:eessi.io-2023.06-software architecture:x86_64/generic resulted in:

@eessi-bot

eessi-bot Bot commented Nov 14, 2024

New job on instance eessi-bot-mc-aws for CPU micro-architecture x86_64-generic for repository eessi.io-2023.06-software in job dir /project/def-users/SHARED/jobs/2024.11/pr_814/28848

date | job status | comment
Nov 14 10:42:43 UTC 2024 | submitted | job id 28848 awaits release by job manager
Nov 14 10:43:06 UTC 2024 | released | job awaits launch by Slurm scheduler
Nov 14 10:49:13 UTC 2024 | running | job 28848 is running
Nov 14 10:56:24 UTC 2024 | finished | 😁 SUCCESS (click triangle for details)
Details
✅ job output file slurm-28848.out
✅ no message matching ERROR:
✅ no message matching FAILED:
✅ no message matching required modules missing:
✅ found message(s) matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
eessi-2023.06-software-linux-x86_64-generic-1731581378.tar.gz
size: 0 MiB (3659 bytes)
entries: 1
modules under 2023.06/software/linux/x86_64/generic/modules/all
no module files in tarball
software under 2023.06/software/linux/x86_64/generic/software
no software packages in tarball
other under 2023.06/software/linux/x86_64/generic
2023.06/scripts/gpu_support/nvidia/install_cuda_and_libraries.sh
Nov 14 10:56:24 UTC 2024 | test result | 😁 SUCCESS (click triangle for details)
ReFrame Summary
[ OK ] ( 1/10) EESSI_LAMMPS_lj %scale=1_node %device_type=cpu %module_name=LAMMPS/29Aug2024-foss-2023b-kokkos /aeb2d9df @BotBuildTests:x86-64-generic-node+default
P: perf: 452.062 timesteps/s (r:0, l:None, u:None)
[ OK ] ( 2/10) EESSI_LAMMPS_lj %scale=1_node %device_type=cpu %module_name=LAMMPS/2Aug2023_update2-foss-2023a-kokkos /04ff9ece @BotBuildTests:x86-64-generic-node+default
P: perf: 460.258 timesteps/s (r:0, l:None, u:None)
[ OK ] ( 3/10) EESSI_OSU_Micro_Benchmarks_coll %benchmark_info=mpi.collective.osu_allreduce %scale=1_node %module_name=OSU-Micro-Benchmarks/7.2-gompi-2023b %device_type=cpu /31ac6ab9 @BotBuildTests:x86-64-generic-node+default
P: latency: 8.85 us (r:0, l:None, u:None)
[ OK ] ( 4/10) EESSI_OSU_Micro_Benchmarks_coll %benchmark_info=mpi.collective.osu_allreduce %scale=1_node %module_name=OSU-Micro-Benchmarks/7.1-1-gompi-2023a %device_type=cpu /f3be40a2 @BotBuildTests:x86-64-generic-node+default
P: latency: 5.17 us (r:0, l:None, u:None)
[ OK ] ( 5/10) EESSI_OSU_Micro_Benchmarks_coll %benchmark_info=mpi.collective.osu_alltoall %scale=1_node %module_name=OSU-Micro-Benchmarks/7.2-gompi-2023b %device_type=cpu /10e66fba @BotBuildTests:x86-64-generic-node+default
P: latency: 9.19 us (r:0, l:None, u:None)
[ OK ] ( 6/10) EESSI_OSU_Micro_Benchmarks_coll %benchmark_info=mpi.collective.osu_alltoall %scale=1_node %module_name=OSU-Micro-Benchmarks/7.1-1-gompi-2023a %device_type=cpu /5be57ae7 @BotBuildTests:x86-64-generic-node+default
P: latency: 7.88 us (r:0, l:None, u:None)
[ OK ] ( 7/10) EESSI_OSU_Micro_Benchmarks_pt2pt %benchmark_info=mpi.pt2pt.osu_latency %scale=1_node %module_name=OSU-Micro-Benchmarks/7.2-gompi-2023b %device_type=cpu /c8c9aff5 @BotBuildTests:x86-64-generic-node+default
P: latency: 0.69 us (r:0, l:None, u:None)
[ OK ] ( 8/10) EESSI_OSU_Micro_Benchmarks_pt2pt %benchmark_info=mpi.pt2pt.osu_latency %scale=1_node %module_name=OSU-Micro-Benchmarks/7.1-1-gompi-2023a %device_type=cpu /9795e491 @BotBuildTests:x86-64-generic-node+default
P: latency: 0.76 us (r:0, l:None, u:None)
[ OK ] ( 9/10) EESSI_OSU_Micro_Benchmarks_pt2pt %benchmark_info=mpi.pt2pt.osu_bw %scale=1_node %module_name=OSU-Micro-Benchmarks/7.2-gompi-2023b %device_type=cpu /48da21c5 @BotBuildTests:x86-64-generic-node+default
P: bandwidth: 10417.54 MB/s (r:0, l:None, u:None)
[ OK ] (10/10) EESSI_OSU_Micro_Benchmarks_pt2pt %benchmark_info=mpi.pt2pt.osu_bw %scale=1_node %module_name=OSU-Micro-Benchmarks/7.1-1-gompi-2023a %device_type=cpu /1b8c1ca2 @BotBuildTests:x86-64-generic-node+default
P: bandwidth: 10439.41 MB/s (r:0, l:None, u:None)
[ PASSED ] Ran 10/10 test case(s) from 10 check(s) (0 failure(s), 0 skipped, 0 aborted)
Details
✅ job output file slurm-28848.out
✅ no message matching ERROR:
✅ no message matching [\s*FAILED\s*].*Ran .* test case

@TopRichard TopRichard requested a review from boegel November 15, 2024 09:25
Collaborator

@trz42 trz42 left a comment

Removing the full ${tmpdir} inside the loop will likely make the script fail when it processes multiple easystack files. However, the tmpdir directory is only needed within each loop iteration, so the clean-up does not have to wait until the loop has processed all easystack files.
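
A minimal sketch of the per-iteration clean-up described in this comment, not the actual contents of install_cuda_and_libraries.sh. The easystack file names, the scratch file, and the counters `processed` and `dir_seen` are hypothetical and only there for illustration; `${tmpdir:?}` is an extra guard added here, not something taken from the PR.

```shell
# Sketch only: file names and the per-iteration "work" are hypothetical
# stand-ins, not taken from install_cuda_and_libraries.sh.
tmpdir=$(mktemp -d)
processed=0
dir_seen=0

for easystack_file in eessi-example-1-CUDA.yml eessi-example-2-CUDA.yml; do
    # Count the iterations that still find the directory in place.
    if [ -d "${tmpdir}" ]; then
        dir_seen=$((dir_seen + 1))
    fi

    # Stand-in for the real work: write scratch data into the shared tmpdir.
    echo "scratch data for ${easystack_file}" > "${tmpdir}/scratch.txt"
    processed=$((processed + 1))

    # Clean up tmpdir content only, so the directory survives for the next
    # iteration. ${tmpdir:?} aborts if tmpdir is unset or empty.
    rm -rf "${tmpdir:?}"/*
done

# Tidy up the sketch's own directory once the loop is finished.
rm -rf "${tmpdir}"

echo "processed=${processed} dir_seen=${dir_seen}"
```

Replacing the in-loop `rm` with `rm -rf "${tmpdir}"` would leave the second iteration without a directory to write into, which is the failure mode the comment warns about.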

Comment thread scripts/gpu_support/nvidia/install_cuda_and_libraries.sh Outdated
Richard Top added 2 commits November 18, 2024 12:53
…ftware-layer into eessi-2023.06-fix-clean-up-dir-install_cuda_and_libraries.sh
@TopRichard TopRichard force-pushed the eessi-2023.06-fix-clean-up-dir-install_cuda_and_libraries.sh branch from 1f6487e to 66c36c8 Compare November 18, 2024 13:03
@TopRichard TopRichard requested review from trz42 and removed request for boegel November 18, 2024 13:04
@TopRichard
Collaborator Author

bot: build repo:eessi.io-2023.06-software arch:x86_64/generic

@riscv-eessi-io-bot

Updates by the bot instance eessi-bot-riscv (click for details)
  • account TopRichard has NO permission to send commands to the bot

@eessi-bot

eessi-bot Bot commented Nov 18, 2024

Updates by the bot instance eessi-bot-mc-aws (click for details)
  • received bot command build repo:eessi.io-2023.06-software arch:x86_64/generic from TopRichard

    • expanded format: build repository:eessi.io-2023.06-software architecture:x86_64/generic
  • handling command build repository:eessi.io-2023.06-software architecture:x86_64/generic resulted in:

@eessi-bot

eessi-bot Bot commented Nov 18, 2024

Updates by the bot instance eessi-bot-mc-azure (click for details)
  • received bot command build repo:eessi.io-2023.06-software arch:x86_64/generic from TopRichard

    • expanded format: build repository:eessi.io-2023.06-software architecture:x86_64/generic
  • handling command build repository:eessi.io-2023.06-software architecture:x86_64/generic resulted in:

    • no jobs were submitted

@eessi-bot

eessi-bot Bot commented Nov 18, 2024

New job on instance eessi-bot-mc-aws for CPU micro-architecture x86_64-generic for repository eessi.io-2023.06-software in job dir /project/def-users/SHARED/jobs/2024.11/pr_814/29582

date | job status | comment
Nov 18 13:18:49 UTC 2024 | submitted | job id 29582 awaits release by job manager
Nov 18 13:18:52 UTC 2024 | released | job awaits launch by Slurm scheduler
Nov 18 13:24:54 UTC 2024 | running | job 29582 is running
Nov 18 13:32:01 UTC 2024 | finished | 😁 SUCCESS (click triangle for details)
Details
✅ job output file slurm-29582.out
✅ no message matching ERROR:
✅ no message matching FAILED:
✅ no message matching required modules missing:
✅ found message(s) matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
eessi-2023.06-software-linux-x86_64-generic-1731936299.tar.gz
size: 0 MiB (3663 bytes)
entries: 1
modules under 2023.06/software/linux/x86_64/generic/modules/all
no module files in tarball
software under 2023.06/software/linux/x86_64/generic/software
no software packages in tarball
other under 2023.06/software/linux/x86_64/generic
2023.06/scripts/gpu_support/nvidia/install_cuda_and_libraries.sh
Nov 18 13:32:01 UTC 2024 | test result | 😁 SUCCESS (click triangle for details)
ReFrame Summary
[ OK ] ( 1/10) EESSI_LAMMPS_lj %scale=1_node %device_type=cpu %module_name=LAMMPS/29Aug2024-foss-2023b-kokkos /aeb2d9df @BotBuildTests:x86-64-generic-node+default
P: perf: 458.116 timesteps/s (r:0, l:None, u:None)
[ OK ] ( 2/10) EESSI_LAMMPS_lj %scale=1_node %device_type=cpu %module_name=LAMMPS/2Aug2023_update2-foss-2023a-kokkos /04ff9ece @BotBuildTests:x86-64-generic-node+default
P: perf: 461.765 timesteps/s (r:0, l:None, u:None)
[ OK ] ( 3/10) EESSI_OSU_Micro_Benchmarks_coll %benchmark_info=mpi.collective.osu_allreduce %scale=1_node %module_name=OSU-Micro-Benchmarks/7.2-gompi-2023b %device_type=cpu /31ac6ab9 @BotBuildTests:x86-64-generic-node+default
P: latency: 5.25 us (r:0, l:None, u:None)
[ OK ] ( 4/10) EESSI_OSU_Micro_Benchmarks_coll %benchmark_info=mpi.collective.osu_allreduce %scale=1_node %module_name=OSU-Micro-Benchmarks/7.1-1-gompi-2023a %device_type=cpu /f3be40a2 @BotBuildTests:x86-64-generic-node+default
P: latency: 5.14 us (r:0, l:None, u:None)
[ OK ] ( 5/10) EESSI_OSU_Micro_Benchmarks_coll %benchmark_info=mpi.collective.osu_alltoall %scale=1_node %module_name=OSU-Micro-Benchmarks/7.2-gompi-2023b %device_type=cpu /10e66fba @BotBuildTests:x86-64-generic-node+default
P: latency: 7.94 us (r:0, l:None, u:None)
[ OK ] ( 6/10) EESSI_OSU_Micro_Benchmarks_coll %benchmark_info=mpi.collective.osu_alltoall %scale=1_node %module_name=OSU-Micro-Benchmarks/7.1-1-gompi-2023a %device_type=cpu /5be57ae7 @BotBuildTests:x86-64-generic-node+default
P: latency: 7.87 us (r:0, l:None, u:None)
[ OK ] ( 7/10) EESSI_OSU_Micro_Benchmarks_pt2pt %benchmark_info=mpi.pt2pt.osu_latency %scale=1_node %module_name=OSU-Micro-Benchmarks/7.2-gompi-2023b %device_type=cpu /c8c9aff5 @BotBuildTests:x86-64-generic-node+default
P: latency: 0.61 us (r:0, l:None, u:None)
[ OK ] ( 8/10) EESSI_OSU_Micro_Benchmarks_pt2pt %benchmark_info=mpi.pt2pt.osu_latency %scale=1_node %module_name=OSU-Micro-Benchmarks/7.1-1-gompi-2023a %device_type=cpu /9795e491 @BotBuildTests:x86-64-generic-node+default
P: latency: 0.71 us (r:0, l:None, u:None)
[ OK ] ( 9/10) EESSI_OSU_Micro_Benchmarks_pt2pt %benchmark_info=mpi.pt2pt.osu_bw %scale=1_node %module_name=OSU-Micro-Benchmarks/7.2-gompi-2023b %device_type=cpu /48da21c5 @BotBuildTests:x86-64-generic-node+default
P: bandwidth: 10523.26 MB/s (r:0, l:None, u:None)
[ OK ] (10/10) EESSI_OSU_Micro_Benchmarks_pt2pt %benchmark_info=mpi.pt2pt.osu_bw %scale=1_node %module_name=OSU-Micro-Benchmarks/7.1-1-gompi-2023a %device_type=cpu /1b8c1ca2 @BotBuildTests:x86-64-generic-node+default
P: bandwidth: 10426.54 MB/s (r:0, l:None, u:None)
[ PASSED ] Ran 10/10 test case(s) from 10 check(s) (0 failure(s), 0 skipped, 0 aborted)
Details
✅ job output file slurm-29582.out
✅ no message matching ERROR:
✅ no message matching [\s*FAILED\s*].*Ran .* test case
Nov 18 19:07:54 UTC 2024 | uploaded | transfer of eessi-2023.06-software-linux-x86_64-generic-1731936299.tar.gz to S3 bucket succeeded

Collaborator

@trz42 trz42 left a comment

lgtm

@trz42 trz42 added the bot:deploy (Ask bot to deploy missing software installations to EESSI) label Nov 18, 2024
@riscv-eessi-io-bot

Label bot:deploy has been set by user trz42, but this person does not have permission to trigger deployments

@trz42 trz42 merged commit 64545d2 into EESSI:2023.06-software.eessi.io Nov 18, 2024
@eessi-bot

eessi-bot Bot commented Nov 18, 2024

PR merged! Moved ['/project/def-users/SHARED/jobs/2024.11/pr_814/28848', '/project/def-users/SHARED/jobs/2024.11/pr_814/29582'] to /project/def-users/SHARED/trash_bin/EESSI/software-layer/2024.11.18

@riscv-eessi-io-bot

PR merged! Moved [] to /home/eessibot/shared/trash_bin/EESSI/software-layer/2024.11.18

@eessi-bot

eessi-bot Bot commented Nov 18, 2024

PR merged! Moved [] to /project/def-users/SHARED/trash_bin/EESSI/software-layer/2024.11.18

- # clean up tmpdir
- rm -rf "${tmpdir}"
+ # clean up tmpdir content
+ rm -rf "${tmpdir}"/*
Member

Shouldn't we then remove ${tmpdir} after the loop?

Collaborator Author

Member

Yes, but that approach only removes the contents of $tmpdir, never $tmpdir itself. Removing the directory itself can be done outside the loop, see #816
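
The difference between the two `rm` forms discussed in this thread can be sketched as follows. This is an illustration under assumptions, not code from the PR or from #816: the file names `visible.txt` and `.hidden` and the three result variables are hypothetical. It also shows a side effect of the glob form under default shell settings: `*` does not match names starting with a dot, so hidden files are left behind.

```shell
# Sketch only: file names and result variables are hypothetical.
tmpdir=$(mktemp -d)
touch "${tmpdir}/visible.txt" "${tmpdir}/.hidden"

# Content-only clean-up, as in the merged change. ${tmpdir:?} is an added
# guard that aborts if tmpdir is unset or empty.
rm -rf "${tmpdir:?}"/*

# The directory is still there, and so is the dotfile, because the default
# glob does not match names that start with a dot.
if [ -d "${tmpdir}" ]; then dir_after_glob=present; else dir_after_glob=gone; fi
if [ -e "${tmpdir}/.hidden" ]; then hidden_after_glob=present; else hidden_after_glob=gone; fi

# Removing the directory itself (outside any loop) takes everything with it.
rm -rf "${tmpdir}"
if [ -d "${tmpdir}" ]; then dir_after_rm=present; else dir_after_rm=gone; fi

echo "after glob: dir=${dir_after_glob} hidden=${hidden_after_glob}; after rm: dir=${dir_after_rm}"
```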

@TopRichard TopRichard deleted the eessi-2023.06-fix-clean-up-dir-install_cuda_and_libraries.sh branch March 30, 2026 15:32

Labels

2023.06-software.eessi.io 2023.06 version of software.eessi.io accel:nvidia bot:deploy Ask bot to deploy missing software installations to EESSI

3 participants