Debug amd gpu prefix #216

Closed
casparvl wants to merge 5 commits into EESSI:main from casparvl:debug_amd_gpu_prefix

@casparvl
Contributor

No description provided.

@casparvl
Contributor Author

bot: build repo:eessi.io-2025.06-software instance:eessi-bot-mc-aws on:arch=zen3 for:arch=x86_64/amd/zen3,accel=amd/gfx90a
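For readers unfamiliar with the command syntax, here is a minimal sketch of how such a comment could be split into its parts. The function name `parse_bot_command` and the returned dictionary layout are assumptions made for illustration; this is not the bot's actual parser.

```python
def parse_bot_command(line: str) -> dict:
    """Split a 'bot: <action> key:value ...' comment into a dictionary.

    Hypothetical helper for illustration only; not the EESSI bot's own code.
    """
    prefix, _, rest = line.partition(":")
    if prefix.strip() != "bot":
        raise ValueError("not a bot command")
    tokens = rest.split()
    if not tokens:
        raise ValueError("missing action")
    parsed = {"action": tokens[0]}
    for token in tokens[1:]:
        key, _, value = token.partition(":")
        if "=" in value:
            # Filters such as 'arch=...,accel=...' become nested key/value pairs.
            parsed[key] = dict(item.split("=", 1) for item in value.split(","))
        else:
            parsed[key] = value
    return parsed


command = (
    "bot: build repo:eessi.io-2025.06-software instance:eessi-bot-mc-aws "
    "on:arch=zen3 for:arch=x86_64/amd/zen3,accel=amd/gfx90a"
)
parsed = parse_bot_command(command)
print(parsed["for"])
```

Read this way, the command asks for a build on a zen3 node that targets the zen3 CPU architecture together with the amd/gfx90a accelerator.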

@eessi-bot-aws

eessi-bot-aws Bot commented Apr 22, 2026

New job on instance eessi-bot-mc-aws for repository eessi.io-2025.06-software
Building on: amd-zen3
Building for: x86_64/amd/zen3 and accelerator amd/gfx90a
Job dir: /project/def-users/SHARED/jobs/2026.04/pr_216/150451

date | job status | comment
Apr 22 17:10:06 UTC 2026 | submitted | job id 150451 awaits release by job manager
Apr 22 17:10:15 UTC 2026 | released | job awaits launch by Slurm scheduler
Apr 22 17:15:36 UTC 2026 | running | job 150451 is running
Apr 22 17:24:43 UTC 2026 | finished | job id 150451 was cancelled
Apr 22 17:25:58 UTC 2026 | finished | 🤷 UNKNOWN
  • Job results file _bot_job150451.result does not exist in job directory or reading it failed.
  • No artefacts were found/reported.
Apr 22 17:25:58 UTC 2026 | test result | 🤷 UNKNOWN
  • Job test file _bot_job150451.test does not exist in job directory or reading it failed.

@casparvl
Contributor Author

bot:cancel job:150451

@casparvl
Contributor Author

bot: build repo:eessi.io-2025.06-software instance:eessi-bot-mc-aws on:arch=zen3 for:arch=x86_64/amd/zen3,accel=amd/gfx90a

@eessi-bot-aws

eessi-bot-aws Bot commented Apr 22, 2026

New job on instance eessi-bot-mc-aws for repository eessi.io-2025.06-software
Building on: amd-zen3
Building for: x86_64/amd/zen3 and accelerator amd/gfx90a
Job dir: /project/def-users/SHARED/jobs/2026.04/pr_216/150452

date | job status | comment
Apr 22 17:24:59 UTC 2026 | submitted | job id 150452 awaits release by job manager
Apr 22 17:25:45 UTC 2026 | released | job awaits launch by Slurm scheduler
Apr 22 17:27:12 UTC 2026 | running | job 150452 is running
Apr 22 17:40:11 UTC 2026 | finished | job id 150452 was cancelled
Apr 22 17:41:38 UTC 2026 | finished | 🤷 UNKNOWN
  • Job results file _bot_job150452.result does not exist in job directory or reading it failed.
  • No artefacts were found/reported.
Apr 22 17:41:38 UTC 2026 | test result | 🤷 UNKNOWN
  • Job test file _bot_job150452.test does not exist in job directory or reading it failed.

@casparvl
Contributor Author

bot:cancel job:150452

@casparvl
Contributor Author

bot: build repo:eessi.io-2025.06-software instance:eessi-bot-mc-aws on:arch=zen3 for:arch=x86_64/amd/zen3,accel=amd/gfx90a

@eessi-bot-aws

eessi-bot-aws Bot commented Apr 22, 2026

New job on instance eessi-bot-mc-aws for repository eessi.io-2025.06-software
Building on: amd-zen3
Building for: x86_64/amd/zen3 and accelerator amd/gfx90a
Job dir: /project/def-users/SHARED/jobs/2026.04/pr_216/150453

date | job status | comment
Apr 22 17:40:32 UTC 2026 | submitted | job id 150453 awaits release by job manager
Apr 22 17:41:27 UTC 2026 | released | job awaits launch by Slurm scheduler
Apr 22 17:42:50 UTC 2026 | running | job 150453 is running
Apr 22 17:45:12 UTC 2026 | finished | 😢 FAILURE
Details
✅ job output file slurm-150453.out
✅ no message matching FATAL:
❌ found message matching ERROR:
❌ found message matching FAILED:
❌ found message matching required modules missing:
❌ no message matching No missing installations
✅ found message matching .tar.* created!
Artefacts
No artefacts were created or found.
Apr 22 17:45:12 UTC 2026 | test result | 😁 SUCCESS
ReFrame Summary
[ OK ] (1/5) EESSI_LAMMPS_lj %device_type=cpu %module_name=LAMMPS/22Jul2025-foss-2024a-kokkos %scale=1_node /ade8cad7 @BotBuildTests:x86-64-zen3+default
P: perf: 525.025 timesteps/s (r:0, l:None, u:None)
[ OK ] (2/5) EESSI_OSU_coll %benchmark_info=mpi.collective.osu_allreduce %module_name=OSU-Micro-Benchmarks/7.5-gompi-2025a %scale=1_node %device_type=cpu /e4bf9965 @BotBuildTests:x86-64-zen3+default
P: latency: 0.8 us (r:0, l:None, u:None)
[ OK ] (3/5) EESSI_OSU_coll %benchmark_info=mpi.collective.osu_alltoall %module_name=OSU-Micro-Benchmarks/7.5-gompi-2025a %scale=1_node %device_type=cpu /3da4890b @BotBuildTests:x86-64-zen3+default
P: latency: 1.12 us (r:0, l:None, u:None)
[ OK ] (4/5) EESSI_OSU_pt2pt_CPU %benchmark_info=mpi.pt2pt.osu_latency %module_name=OSU-Micro-Benchmarks/7.5-gompi-2025a %scale=1_node /3255009a @BotBuildTests:x86-64-zen3+default
P: latency: 0.16 us (r:0, l:None, u:None)
[ OK ] (5/5) EESSI_OSU_pt2pt_CPU %benchmark_info=mpi.pt2pt.osu_bw %module_name=OSU-Micro-Benchmarks/7.5-gompi-2025a %scale=1_node /59f4b331 @BotBuildTests:x86-64-zen3+default
P: bandwidth: 14579.63 MB/s (r:0, l:None, u:None)
[ PASSED ] Ran 5/5 test case(s) from 5 check(s) (0 failure(s), 0 skipped, 0 aborted)
Details
✅ job output file slurm-150453.out
❌ found message matching ERROR:
✅ no message matching [\s*FAILED\s*].*Ran .* test case
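The build checks in the report above amount to searching the job output for a fixed set of patterns, some of which must be absent and some present. The following is a rough sketch of that idea, assuming the patterns are plain regular expressions applied to the whole log. The names `BUILD_CHECKS`, `evaluate`, and `overall`, the exact pattern strings, and the sample log are illustrative assumptions, not the bot's implementation.

```python
import re

# (pattern, must_be_present) pairs modelled on the checks listed in the report.
# The exact regular expressions are assumptions made for this sketch.
BUILD_CHECKS = [
    (r"FATAL:", False),
    (r"ERROR:", False),
    (r"FAILED:", False),
    (r"required modules missing:", False),
    (r"No missing installations", True),
    (r"\.tar.* created!", True),
]


def evaluate(log_text, checks):
    """Return (pattern, passed) for every check against the given log text."""
    results = []
    for pattern, must_be_present in checks:
        found = re.search(pattern, log_text) is not None
        results.append((pattern, found == must_be_present))
    return results


def overall(results):
    """A single failed check is enough to mark the whole job as a failure."""
    return "SUCCESS" if all(passed for _, passed in results) else "FAILURE"


# Made-up log excerpt that reproduces the pass/fail pattern seen for job 150453.
sample_log = (
    "ERROR: build step failed\n"
    "FAILED: installation\n"
    "required modules missing:\n"
    "/tmp/example.tar.gz created!\n"
)
results = evaluate(sample_log, BUILD_CHECKS)
print(overall(results))
```

Under these assumptions, a tarball being created is not sufficient for success: any error pattern in the log, or a missing "No missing installations" message, still yields a failure.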

@casparvl
Contributor Author

bot: build repo:eessi.io-2025.06-software instance:eessi-bot-mc-aws on:arch=zen3 for:arch=x86_64/amd/zen3,accel=amd/gfx90a

@eessi-bot-aws

eessi-bot-aws Bot commented Apr 22, 2026

New job on instance eessi-bot-mc-aws for repository eessi.io-2025.06-software
Building on: amd-zen3
Building for: x86_64/amd/zen3 and accelerator amd/gfx90a
Job dir: /project/def-users/SHARED/jobs/2026.04/pr_216/150456

date | job status | comment
Apr 22 21:54:27 UTC 2026 | submitted | job id 150456 awaits release by job manager
Apr 22 21:54:56 UTC 2026 | released | job awaits launch by Slurm scheduler
Apr 22 21:58:59 UTC 2026 | running | job 150456 is running
Apr 22 22:02:03 UTC 2026 | finished | 😢 FAILURE
Details
✅ job output file slurm-150456.out
✅ no message matching FATAL:
❌ found message matching ERROR:
❌ found message matching FAILED:
❌ found message matching required modules missing:
❌ no message matching No missing installations
✅ found message matching .tar.* created!
Artefacts
No artefacts were created or found.
Apr 22 22:02:03 UTC 2026 | test result | 😁 SUCCESS
ReFrame Summary
[ OK ] (1/5) EESSI_LAMMPS_lj %device_type=cpu %module_name=LAMMPS/22Jul2025-foss-2024a-kokkos %scale=1_node /ade8cad7 @BotBuildTests:x86-64-zen3+default
P: perf: 529.63 timesteps/s (r:0, l:None, u:None)
[ OK ] (2/5) EESSI_OSU_coll %benchmark_info=mpi.collective.osu_allreduce %module_name=OSU-Micro-Benchmarks/7.5-gompi-2025a %scale=1_node %device_type=cpu /e4bf9965 @BotBuildTests:x86-64-zen3+default
P: latency: 0.77 us (r:0, l:None, u:None)
[ OK ] (3/5) EESSI_OSU_coll %benchmark_info=mpi.collective.osu_alltoall %module_name=OSU-Micro-Benchmarks/7.5-gompi-2025a %scale=1_node %device_type=cpu /3da4890b @BotBuildTests:x86-64-zen3+default
P: latency: 1.11 us (r:0, l:None, u:None)
[ OK ] (4/5) EESSI_OSU_pt2pt_CPU %benchmark_info=mpi.pt2pt.osu_latency %module_name=OSU-Micro-Benchmarks/7.5-gompi-2025a %scale=1_node /3255009a @BotBuildTests:x86-64-zen3+default
P: latency: 0.15 us (r:0, l:None, u:None)
[ OK ] (5/5) EESSI_OSU_pt2pt_CPU %benchmark_info=mpi.pt2pt.osu_bw %module_name=OSU-Micro-Benchmarks/7.5-gompi-2025a %scale=1_node /59f4b331 @BotBuildTests:x86-64-zen3+default
P: bandwidth: 15129.96 MB/s (r:0, l:None, u:None)
[ PASSED ] Ran 5/5 test case(s) from 5 check(s) (0 failure(s), 0 skipped, 0 aborted)
Details
✅ job output file slurm-150456.out
❌ found message matching ERROR:
✅ no message matching [\s*FAILED\s*].*Ran .* test case

@casparvl
Contributor Author

EESSI/software-layer#1483 was deployed in the meantime (the issue was a CPU package, elfutils, being installed into a GPU prefix)
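To illustrate the kind of mix-up described here, the sketch below picks an installation prefix depending on whether a package is accelerator-specific. Everything in it is hypothetical: the function `install_prefix`, the `accel` subdirectory layout, and the `/sw` root are assumptions for illustration and are not taken from the hooks in EESSI/software-layer.

```python
from pathlib import PurePosixPath


def install_prefix(software_root, cpu_target, accel_target, is_accel_specific):
    """Choose where a package should be installed.

    Hypothetical layout: accelerator-specific packages go into an 'accel'
    subdirectory below the CPU target; everything else stays in the CPU prefix.
    """
    cpu_prefix = PurePosixPath(software_root) / cpu_target
    if accel_target and is_accel_specific:
        return str(cpu_prefix / "accel" / accel_target)
    return str(cpu_prefix)


# A CPU-only package such as elfutils should land in the CPU prefix,
# even when the build targets an accelerator.
cpu_only = install_prefix("/sw", "x86_64/amd/zen3", "amd/gfx90a", False)
gpu_pkg = install_prefix("/sw", "x86_64/amd/zen3", "amd/gfx90a", True)
print(cpu_only)
print(gpu_pkg)
```

In this picture, the bug corresponds to a CPU-only package being treated as accelerator-specific and therefore ending up under the accelerator prefix.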

@casparvl
Contributor Author

Closing; the hooks are deployed in a separate PR. With that, everything needed to make EESSI/software-layer#1473 work should be in place.
