Skip to content

Conversation

@bedroge
Copy link
Contributor

@bedroge bedroge commented Nov 14, 2025

The builds in EESSI/software-layer#1302 were causing node failures, so let's try reducing the number of cores.

@bedroge
Copy link
Contributor Author

bedroge commented Nov 14, 2025

bot: build repo:eessi.io-2023.06-software instance:eessi-bot-deucalion for:arch=aarch64/a64fx

@eessi-bot-deucalion
Copy link

eessi-bot-deucalion bot commented Nov 14, 2025

New job on instance eessi-bot-deucalion for repository eessi.io-2023.06-software
Building on: a64fx
Building for: aarch64/a64fx
Job dir: /home/eessibot/new-bot/jobs/2025.11/pr_129/682852

date job status comment
Nov 14 08:04:42 UTC 2025 submitted job id 682852 awaits release by job manager
Nov 14 08:05:06 UTC 2025 released job awaits launch by Slurm scheduler
Nov 14 08:06:09 UTC 2025 running job 682852 is running
Nov 14 08:14:28 UTC 2025 finished
😢 FAILURE (click triangle for details)
Details
✅ job output file slurm-682852.out
✅ no message matching FATAL:
❌ found message matching ERROR:
✅ no message matching FAILED:
✅ no message matching required modules missing:
❌ no message matching No missing installations
✅ found message matching .tar.* created!
Artefacts
eessi-2023.06-software-linux-aarch64-a64fx-17631077080.tar.zstsize: 0 MiB (22965 bytes)
entries: 1
modules under 2023.06/software/linux/aarch64/a64fx/modules/all
no module files in tarball
software under 2023.06/software/linux/aarch64/a64fx/software
no software packages in tarball
reprod directories under 2023.06/software/linux/aarch64/a64fx/reprod
no reprod directories in tarball
other under 2023.06/software/linux/aarch64/a64fx
2023.06/init/easybuild/eb_hooks.py
Nov 14 08:14:28 UTC 2025 test result
😁 SUCCESS (click triangle for details)
ReFrame Summary
[ SKIP ] ( 1/10) Skipping test: nodes in this partition only have 30720 MiB memory available (per node) according to the current ReFrame configuration, but 49152 MiB is needed
[ SKIP ] ( 2/10) Skipping test: nodes in this partition only have 30720 MiB memory available (per node) according to the current ReFrame configuration, but 49152 MiB is needed
[ SKIP ] ( 3/10) Skipping test: nodes in this partition only have 30720 MiB memory available (per node) according to the current ReFrame configuration, but 49152 MiB is needed
[ SKIP ] ( 4/10) Skipping test: nodes in this partition only have 30720 MiB memory available (per node) according to the current ReFrame configuration, but 49152 MiB is needed
[ OK ] ( 5/10) EESSI_OSU_pt2pt_CPU %benchmark_info=mpi.pt2pt.osu_latency %module_name=OSU-Micro-Benchmarks/7.2-gompi-2023b %scale=1_node /15cad6c4 @BotBuildTests:aarch64_a64fx+default
P: latency: 1.69 us (r:0, l:None, u:None)
[ OK ] ( 6/10) EESSI_OSU_pt2pt_CPU %benchmark_info=mpi.pt2pt.osu_latency %module_name=OSU-Micro-Benchmarks/7.1-1-gompi-2023a %scale=1_node /6672deda @BotBuildTests:aarch64_a64fx+default
P: latency: 1.78 us (r:0, l:None, u:None)
[ OK ] ( 7/10) EESSI_OSU_pt2pt_CPU %benchmark_info=mpi.pt2pt.osu_bw %module_name=OSU-Micro-Benchmarks/7.2-gompi-2023b %scale=1_node /2a9a47b1 @BotBuildTests:aarch64_a64fx+default
P: bandwidth: 8655.28 MB/s (r:0, l:None, u:None)
[ OK ] ( 8/10) EESSI_OSU_pt2pt_CPU %benchmark_info=mpi.pt2pt.osu_bw %module_name=OSU-Micro-Benchmarks/7.1-1-gompi-2023a %scale=1_node /1b24ab8e @BotBuildTests:aarch64_a64fx+default
P: bandwidth: 8767.81 MB/s (r:0, l:None, u:None)
[ OK ] ( 9/10) EESSI_LAMMPS_lj %device_type=cpu %module_name=LAMMPS/29Aug2024-foss-2023b-kokkos %scale=1_node /aeb2d9df @BotBuildTests:aarch64_a64fx+default
P: perf: 582.064 timesteps/s (r:0, l:None, u:None)
[ OK ] (10/10) EESSI_LAMMPS_lj %device_type=cpu %module_name=LAMMPS/2Aug2023_update2-foss-2023a-kokkos %scale=1_node /04ff9ece @BotBuildTests:aarch64_a64fx+default
P: perf: 583.032 timesteps/s (r:0, l:None, u:None)
[ PASSED ] Ran 6/10 test case(s) from 10 check(s) (0 failure(s), 4 skipped, 0 aborted)
Details
✅ job output file slurm-682852.out
❌ found message matching ERROR:
✅ no message matching [\s*FAILED\s*].*Ran .* test case

@bedroge
Copy link
Contributor Author

bedroge commented Nov 14, 2025

bot: build repo:eessi.io-2023.06-software instance:eessi-bot-deucalion for:arch=aarch64/a64fx

@eessi-bot-deucalion
Copy link

eessi-bot-deucalion bot commented Nov 14, 2025

New job on instance eessi-bot-deucalion for repository eessi.io-2023.06-software
Building on: a64fx
Building for: aarch64/a64fx
Job dir: /home/eessibot/new-bot/jobs/2025.11/pr_129/682857

date job status comment
Nov 14 08:11:39 UTC 2025 submitted job id 682857 awaits release by job manager
Nov 14 08:12:19 UTC 2025 released job awaits launch by Slurm scheduler
Nov 14 08:13:23 UTC 2025 running job 682857 is running
Nov 14 10:50:18 UTC 2025 finished
😢 FAILURE (click triangle for details)
Details
✅ job output file slurm-682857.out
✅ no message matching FATAL:
❌ found message matching ERROR:
❌ found message matching FAILED:
❌ found message matching required modules missing:
❌ no message matching No missing installations
✅ found message matching .tar.* created!
Artefacts
eessi-2023.06-software-linux-aarch64-a64fx-17631170140.tar.zstsize: 0 MiB (22966 bytes)
entries: 1
modules under 2023.06/software/linux/aarch64/a64fx/modules/all
no module files in tarball
software under 2023.06/software/linux/aarch64/a64fx/software
no software packages in tarball
reprod directories under 2023.06/software/linux/aarch64/a64fx/reprod
no reprod directories in tarball
other under 2023.06/software/linux/aarch64/a64fx
2023.06/init/easybuild/eb_hooks.py
Nov 14 10:50:18 UTC 2025 test result
😁 SUCCESS (click triangle for details)
ReFrame Summary
[ SKIP ] ( 1/10) Skipping test: nodes in this partition only have 30720 MiB memory available (per node) according to the current ReFrame configuration, but 49152 MiB is needed
[ SKIP ] ( 2/10) Skipping test: nodes in this partition only have 30720 MiB memory available (per node) according to the current ReFrame configuration, but 49152 MiB is needed
[ SKIP ] ( 3/10) Skipping test: nodes in this partition only have 30720 MiB memory available (per node) according to the current ReFrame configuration, but 49152 MiB is needed
[ SKIP ] ( 4/10) Skipping test: nodes in this partition only have 30720 MiB memory available (per node) according to the current ReFrame configuration, but 49152 MiB is needed
[ OK ] ( 5/10) EESSI_OSU_pt2pt_CPU %benchmark_info=mpi.pt2pt.osu_latency %module_name=OSU-Micro-Benchmarks/7.2-gompi-2023b %scale=1_node /15cad6c4 @BotBuildTests:aarch64_a64fx+default
P: latency: 1.74 us (r:0, l:None, u:None)
[ OK ] ( 6/10) EESSI_OSU_pt2pt_CPU %benchmark_info=mpi.pt2pt.osu_latency %module_name=OSU-Micro-Benchmarks/7.1-1-gompi-2023a %scale=1_node /6672deda @BotBuildTests:aarch64_a64fx+default
P: latency: 1.7 us (r:0, l:None, u:None)
[ OK ] ( 7/10) EESSI_OSU_pt2pt_CPU %benchmark_info=mpi.pt2pt.osu_bw %module_name=OSU-Micro-Benchmarks/7.2-gompi-2023b %scale=1_node /2a9a47b1 @BotBuildTests:aarch64_a64fx+default
P: bandwidth: 8836.89 MB/s (r:0, l:None, u:None)
[ OK ] ( 8/10) EESSI_OSU_pt2pt_CPU %benchmark_info=mpi.pt2pt.osu_bw %module_name=OSU-Micro-Benchmarks/7.1-1-gompi-2023a %scale=1_node /1b24ab8e @BotBuildTests:aarch64_a64fx+default
P: bandwidth: 8472.6 MB/s (r:0, l:None, u:None)
[ OK ] ( 9/10) EESSI_LAMMPS_lj %device_type=cpu %module_name=LAMMPS/29Aug2024-foss-2023b-kokkos %scale=1_node /aeb2d9df @BotBuildTests:aarch64_a64fx+default
P: perf: 582.261 timesteps/s (r:0, l:None, u:None)
[ OK ] (10/10) EESSI_LAMMPS_lj %device_type=cpu %module_name=LAMMPS/2Aug2023_update2-foss-2023a-kokkos %scale=1_node /04ff9ece @BotBuildTests:aarch64_a64fx+default
P: perf: 582.164 timesteps/s (r:0, l:None, u:None)
[ PASSED ] Ran 6/10 test case(s) from 10 check(s) (0 failure(s), 4 skipped, 0 aborted)
Details
✅ job output file slurm-682857.out
❌ found message matching ERROR:
✅ no message matching [\s*FAILED\s*].*Ran .* test case

@boegel boegel marked this pull request as draft November 14, 2025 10:25
@bedroge
Copy link
Contributor Author

bedroge commented Nov 14, 2025

Getting lots of test failures now in iteration 2 of 2024.1. Let's try 2024.4 first.

bot: build repo:eessi.io-2023.06-software instance:eessi-bot-deucalion for:arch=aarch64/a64fx

@eessi-bot-deucalion
Copy link

eessi-bot-deucalion bot commented Nov 14, 2025

New job on instance eessi-bot-deucalion for repository eessi.io-2023.06-software
Building on: a64fx
Building for: aarch64/a64fx
Job dir: /home/eessibot/new-bot/jobs/2025.11/pr_129/682993

date job status comment
Nov 14 11:09:43 UTC 2025 submitted job id 682993 awaits release by job manager
Nov 14 11:09:53 UTC 2025 released job awaits launch by Slurm scheduler
Nov 14 11:10:57 UTC 2025 running job 682993 is running
Nov 14 14:37:51 UTC 2025 finished
😢 FAILURE (click triangle for details)
Details
✅ job output file slurm-682993.out
✅ no message matching FATAL:
❌ found message matching ERROR:
❌ found message matching FAILED:
❌ found message matching required modules missing:
❌ no message matching No missing installations
✅ found message matching .tar.* created!
Artefacts
eessi-2023.06-software-linux-aarch64-a64fx-17631306950.tar.zstsize: 0 MiB (22966 bytes)
entries: 1
modules under 2023.06/software/linux/aarch64/a64fx/modules/all
no module files in tarball
software under 2023.06/software/linux/aarch64/a64fx/software
no software packages in tarball
reprod directories under 2023.06/software/linux/aarch64/a64fx/reprod
no reprod directories in tarball
other under 2023.06/software/linux/aarch64/a64fx
2023.06/init/easybuild/eb_hooks.py
Nov 14 14:37:51 UTC 2025 test result
😁 SUCCESS (click triangle for details)
ReFrame Summary
[ SKIP ] ( 1/10) Skipping test: nodes in this partition only have 30720 MiB memory available (per node) according to the current ReFrame configuration, but 49152 MiB is needed
[ SKIP ] ( 2/10) Skipping test: nodes in this partition only have 30720 MiB memory available (per node) according to the current ReFrame configuration, but 49152 MiB is needed
[ SKIP ] ( 3/10) Skipping test: nodes in this partition only have 30720 MiB memory available (per node) according to the current ReFrame configuration, but 49152 MiB is needed
[ SKIP ] ( 4/10) Skipping test: nodes in this partition only have 30720 MiB memory available (per node) according to the current ReFrame configuration, but 49152 MiB is needed
[ OK ] ( 5/10) EESSI_OSU_pt2pt_CPU %benchmark_info=mpi.pt2pt.osu_latency %module_name=OSU-Micro-Benchmarks/7.2-gompi-2023b %scale=1_node /15cad6c4 @BotBuildTests:aarch64_a64fx+default
P: latency: 1.72 us (r:0, l:None, u:None)
[ OK ] ( 6/10) EESSI_OSU_pt2pt_CPU %benchmark_info=mpi.pt2pt.osu_latency %module_name=OSU-Micro-Benchmarks/7.1-1-gompi-2023a %scale=1_node /6672deda @BotBuildTests:aarch64_a64fx+default
P: latency: 1.7 us (r:0, l:None, u:None)
[ OK ] ( 7/10) EESSI_OSU_pt2pt_CPU %benchmark_info=mpi.pt2pt.osu_bw %module_name=OSU-Micro-Benchmarks/7.2-gompi-2023b %scale=1_node /2a9a47b1 @BotBuildTests:aarch64_a64fx+default
P: bandwidth: 7855.16 MB/s (r:0, l:None, u:None)
[ OK ] ( 8/10) EESSI_OSU_pt2pt_CPU %benchmark_info=mpi.pt2pt.osu_bw %module_name=OSU-Micro-Benchmarks/7.1-1-gompi-2023a %scale=1_node /1b24ab8e @BotBuildTests:aarch64_a64fx+default
P: bandwidth: 8708.95 MB/s (r:0, l:None, u:None)
[ OK ] ( 9/10) EESSI_LAMMPS_lj %device_type=cpu %module_name=LAMMPS/29Aug2024-foss-2023b-kokkos %scale=1_node /aeb2d9df @BotBuildTests:aarch64_a64fx+default
P: perf: 581.177 timesteps/s (r:0, l:None, u:None)
[ OK ] (10/10) EESSI_LAMMPS_lj %device_type=cpu %module_name=LAMMPS/2Aug2023_update2-foss-2023a-kokkos %scale=1_node /04ff9ece @BotBuildTests:aarch64_a64fx+default
P: perf: 582.113 timesteps/s (r:0, l:None, u:None)
[ PASSED ] Ran 6/10 test case(s) from 10 check(s) (0 failure(s), 4 skipped, 0 aborted)
Details
✅ job output file slurm-682993.out
❌ found message matching ERROR:
✅ no message matching [\s*FAILED\s*].*Ran .* test case

@bedroge
Copy link
Contributor Author

bedroge commented Nov 14, 2025

bot: build repo:eessi.io-2023.06-software instance:eessi-bot-deucalion for:arch=aarch64/a64fx

@eessi-bot-deucalion
Copy link

eessi-bot-deucalion bot commented Nov 14, 2025

New job on instance eessi-bot-deucalion for repository eessi.io-2023.06-software
Building on: a64fx
Building for: aarch64/a64fx
Job dir: /home/eessibot/new-bot/jobs/2025.11/pr_129/683338

date job status comment
Nov 14 14:43:37 UTC 2025 submitted job id 683338 awaits release by job manager
Nov 14 14:43:57 UTC 2025 released job awaits launch by Slurm scheduler
Nov 14 14:45:00 UTC 2025 running job 683338 is running
Nov 15 00:13:11 UTC 2025 finished
😢 FAILURE (click triangle for details)
Details
✅ job output file slurm-683338.out
✅ no message matching FATAL:
❌ found message matching ERROR:
❌ found message matching FAILED:
❌ found message matching required modules missing:
❌ no message matching No missing installations
✅ found message matching .tar.* created!
Artefacts
eessi-2023.06-software-linux-aarch64-a64fx-17631651470.tar.zstsize: 87 MiB (91493542 bytes)
entries: 1539
modules under 2023.06/software/linux/aarch64/a64fx/modules/all
GROMACS/2024.3-foss-2023b.lua
GROMACS/2024.4-foss-2023b.lua
software under 2023.06/software/linux/aarch64/a64fx/software
GROMACS/2024.3-foss-2023b
GROMACS/2024.4-foss-2023b
reprod directories under 2023.06/software/linux/aarch64/a64fx/reprod
no reprod directories in tarball
other under 2023.06/software/linux/aarch64/a64fx
2023.06/init/easybuild/eb_hooks.py
Nov 15 00:13:11 UTC 2025 test result
😁 SUCCESS (click triangle for details)
ReFrame Summary
[ SKIP ] ( 1/10) Skipping test: nodes in this partition only have 30720 MiB memory available (per node) according to the current ReFrame configuration, but 49152 MiB is needed
[ SKIP ] ( 2/10) Skipping test: nodes in this partition only have 30720 MiB memory available (per node) according to the current ReFrame configuration, but 49152 MiB is needed
[ SKIP ] ( 3/10) Skipping test: nodes in this partition only have 30720 MiB memory available (per node) according to the current ReFrame configuration, but 49152 MiB is needed
[ SKIP ] ( 4/10) Skipping test: nodes in this partition only have 30720 MiB memory available (per node) according to the current ReFrame configuration, but 49152 MiB is needed
[ OK ] ( 5/10) EESSI_OSU_pt2pt_CPU %benchmark_info=mpi.pt2pt.osu_latency %module_name=OSU-Micro-Benchmarks/7.2-gompi-2023b %scale=1_node /15cad6c4 @BotBuildTests:aarch64_a64fx+default
P: latency: 1.64 us (r:0, l:None, u:None)
[ OK ] ( 6/10) EESSI_OSU_pt2pt_CPU %benchmark_info=mpi.pt2pt.osu_latency %module_name=OSU-Micro-Benchmarks/7.1-1-gompi-2023a %scale=1_node /6672deda @BotBuildTests:aarch64_a64fx+default
P: latency: 1.72 us (r:0, l:None, u:None)
[ OK ] ( 7/10) EESSI_OSU_pt2pt_CPU %benchmark_info=mpi.pt2pt.osu_bw %module_name=OSU-Micro-Benchmarks/7.2-gompi-2023b %scale=1_node /2a9a47b1 @BotBuildTests:aarch64_a64fx+default
P: bandwidth: 8774.37 MB/s (r:0, l:None, u:None)
[ OK ] ( 8/10) EESSI_OSU_pt2pt_CPU %benchmark_info=mpi.pt2pt.osu_bw %module_name=OSU-Micro-Benchmarks/7.1-1-gompi-2023a %scale=1_node /1b24ab8e @BotBuildTests:aarch64_a64fx+default
P: bandwidth: 8755.99 MB/s (r:0, l:None, u:None)
[ OK ] ( 9/10) EESSI_LAMMPS_lj %device_type=cpu %module_name=LAMMPS/29Aug2024-foss-2023b-kokkos %scale=1_node /aeb2d9df @BotBuildTests:aarch64_a64fx+default
P: perf: 12.423 timesteps/s (r:0, l:None, u:None)
[ OK ] (10/10) EESSI_LAMMPS_lj %device_type=cpu %module_name=LAMMPS/2Aug2023_update2-foss-2023a-kokkos %scale=1_node /04ff9ece @BotBuildTests:aarch64_a64fx+default
P: perf: 28.391 timesteps/s (r:0, l:None, u:None)
[ PASSED ] Ran 6/10 test case(s) from 10 check(s) (0 failure(s), 4 skipped, 0 aborted)
Details
✅ job output file slurm-683338.out
❌ found message matching ERROR:
✅ no message matching [\s*FAILED\s*].*Ran .* test case

@boegel
Copy link
Contributor

boegel commented Nov 16, 2025

@bedroge So do we need this deployed?

@bedroge
Copy link
Contributor Author

bedroge commented Nov 16, 2025

@bedroge So do we need this deployed?

Not sure anymore, looks like it's not needed for 2024.4. Maybe it was just an issue for 2024.1, which has failing tests anyway. Let's first wait for the final 2024.3+4 builds to be completed in the software-layer PR.

@bedroge
Copy link
Contributor Author

bedroge commented Nov 17, 2025

Trying to build 2024.2 now, to find out if the issue was solved in 2024.2 or 2024.3.

bot: build repo:eessi.io-2023.06-software instance:eessi-bot-deucalion for:arch=aarch64/a64fx

@eessi-bot-deucalion
Copy link

eessi-bot-deucalion bot commented Nov 17, 2025

New job on instance eessi-bot-deucalion for repository eessi.io-2023.06-software
Building on: a64fx
Building for: aarch64/a64fx
Job dir: /home/eessibot/new-bot/jobs/2025.11/pr_129/685801

date job status comment
Nov 17 12:11:00 UTC 2025 submitted job id 685801 awaits release by job manager
Nov 17 12:11:59 UTC 2025 released job awaits launch by Slurm scheduler
Nov 17 12:13:03 UTC 2025 running job 685801 is running
Nov 17 12:22:31 UTC 2025 finished
😢 FAILURE (click triangle for details)
Details
✅ job output file slurm-685801.out
✅ no message matching FATAL:
❌ found message matching ERROR:
❌ found message matching FAILED:
❌ found message matching required modules missing:
❌ no message matching No missing installations
✅ found message matching .tar.* created!
Artefacts
eessi-2023.06-software-linux-aarch64-a64fx-17633817880.tar.zstsize: 0 MiB (22965 bytes)
entries: 1
modules under 2023.06/software/linux/aarch64/a64fx/modules/all
no module files in tarball
software under 2023.06/software/linux/aarch64/a64fx/software
no software packages in tarball
reprod directories under 2023.06/software/linux/aarch64/a64fx/reprod
no reprod directories in tarball
other under 2023.06/software/linux/aarch64/a64fx
2023.06/init/easybuild/eb_hooks.py
Nov 17 12:22:31 UTC 2025 test result
😁 SUCCESS (click triangle for details)
ReFrame Summary
[ SKIP ] ( 1/10) Skipping test: nodes in this partition only have 30720 MiB memory available (per node) according to the current ReFrame configuration, but 49152 MiB is needed
[ SKIP ] ( 2/10) Skipping test: nodes in this partition only have 30720 MiB memory available (per node) according to the current ReFrame configuration, but 49152 MiB is needed
[ SKIP ] ( 3/10) Skipping test: nodes in this partition only have 30720 MiB memory available (per node) according to the current ReFrame configuration, but 49152 MiB is needed
[ SKIP ] ( 4/10) Skipping test: nodes in this partition only have 30720 MiB memory available (per node) according to the current ReFrame configuration, but 49152 MiB is needed
[ OK ] ( 5/10) EESSI_OSU_pt2pt_CPU %benchmark_info=mpi.pt2pt.osu_latency %module_name=OSU-Micro-Benchmarks/7.2-gompi-2023b %scale=1_node /15cad6c4 @BotBuildTests:aarch64_a64fx+default
P: latency: 1.7 us (r:0, l:None, u:None)
[ OK ] ( 6/10) EESSI_OSU_pt2pt_CPU %benchmark_info=mpi.pt2pt.osu_latency %module_name=OSU-Micro-Benchmarks/7.1-1-gompi-2023a %scale=1_node /6672deda @BotBuildTests:aarch64_a64fx+default
P: latency: 1.75 us (r:0, l:None, u:None)
[ OK ] ( 7/10) EESSI_OSU_pt2pt_CPU %benchmark_info=mpi.pt2pt.osu_bw %module_name=OSU-Micro-Benchmarks/7.2-gompi-2023b %scale=1_node /2a9a47b1 @BotBuildTests:aarch64_a64fx+default
P: bandwidth: 8151.67 MB/s (r:0, l:None, u:None)
[ OK ] ( 8/10) EESSI_OSU_pt2pt_CPU %benchmark_info=mpi.pt2pt.osu_bw %module_name=OSU-Micro-Benchmarks/7.1-1-gompi-2023a %scale=1_node /1b24ab8e @BotBuildTests:aarch64_a64fx+default
P: bandwidth: 8737.14 MB/s (r:0, l:None, u:None)
[ OK ] ( 9/10) EESSI_LAMMPS_lj %device_type=cpu %module_name=LAMMPS/29Aug2024-foss-2023b-kokkos %scale=1_node /aeb2d9df @BotBuildTests:aarch64_a64fx+default
P: perf: 581.931 timesteps/s (r:0, l:None, u:None)
[ OK ] (10/10) EESSI_LAMMPS_lj %device_type=cpu %module_name=LAMMPS/2Aug2023_update2-foss-2023a-kokkos %scale=1_node /04ff9ece @BotBuildTests:aarch64_a64fx+default
P: perf: 581.211 timesteps/s (r:0, l:None, u:None)
[ PASSED ] Ran 6/10 test case(s) from 10 check(s) (0 failure(s), 4 skipped, 0 aborted)
Details
✅ job output file slurm-685801.out
❌ found message matching ERROR:
✅ no message matching [\s*FAILED\s*].*Ran .* test case

@bedroge
Copy link
Contributor Author

bedroge commented Nov 17, 2025

The failures that we've seen for Neoverse V1 seem very similar to the ones seen on A64FX, so let's try to apply the same workaround.

bot: build repo:eessi.io-2023.06-software instance:eessi-bot-deucalion for:arch=aarch64/a64fx

@eessi-bot-deucalion
Copy link

eessi-bot-deucalion bot commented Nov 17, 2025

New job on instance eessi-bot-deucalion for repository eessi.io-2023.06-software
Building on: a64fx
Building for: aarch64/a64fx
Job dir: /home/eessibot/new-bot/jobs/2025.11/pr_129/685988

date job status comment
Nov 17 15:11:26 UTC 2025 submitted job id 685988 awaits release by job manager
Nov 17 15:12:06 UTC 2025 released job awaits launch by Slurm scheduler
Nov 17 15:13:09 UTC 2025 running job 685988 is running
Nov 17 18:44:02 UTC 2025 finished
😁 SUCCESS (click triangle for details)
Details
✅ job output file slurm-685988.out
✅ no message matching FATAL:
✅ no message matching ERROR:
✅ no message matching FAILED:
✅ no message matching required modules missing:
✅ found message(s) matching No missing installations
✅ found message matching .tar.* created!
Artefacts
eessi-2023.06-software-linux-aarch64-a64fx-17634046630.tar.zstsize: 40 MiB (42278629 bytes)
entries: 771
modules under 2023.06/software/linux/aarch64/a64fx/modules/all
GROMACS/2024.1-foss-2023b.lua
software under 2023.06/software/linux/aarch64/a64fx/software
GROMACS/2024.1-foss-2023b
reprod directories under 2023.06/software/linux/aarch64/a64fx/reprod
no reprod directories in tarball
other under 2023.06/software/linux/aarch64/a64fx
2023.06/init/easybuild/eb_hooks.py
Nov 17 18:44:02 UTC 2025 test result
😁 SUCCESS (click triangle for details)
ReFrame Summary
[ SKIP ] ( 1/12) Skipping test: nodes in this partition only have 30720 MiB memory available (per node) according to the current ReFrame configuration, but 49152 MiB is needed
[ SKIP ] ( 2/12) Skipping test: nodes in this partition only have 30720 MiB memory available (per node) according to the current ReFrame configuration, but 49152 MiB is needed
[ SKIP ] ( 3/12) Skipping test: nodes in this partition only have 30720 MiB memory available (per node) according to the current ReFrame configuration, but 49152 MiB is needed
[ SKIP ] ( 4/12) Skipping test: nodes in this partition only have 30720 MiB memory available (per node) according to the current ReFrame configuration, but 49152 MiB is needed
[ SKIP ] ( 5/12) Skipping test: nodes in this partition only have 30720 MiB memory available (per node) according to the current ReFrame configuration, but 49152 MiB is needed
[ SKIP ] ( 6/12) Skipping test: nodes in this partition only have 30720 MiB memory available (per node) according to the current ReFrame configuration, but 49152 MiB is needed
[ OK ] ( 7/12) EESSI_OSU_pt2pt_CPU %benchmark_info=mpi.pt2pt.osu_latency %module_name=OSU-Micro-Benchmarks/7.2-gompi-2023b %scale=1_node /15cad6c4 @BotBuildTests:aarch64_a64fx+default
P: latency: 1.68 us (r:0, l:None, u:None)
[ OK ] ( 8/12) EESSI_OSU_pt2pt_CPU %benchmark_info=mpi.pt2pt.osu_latency %module_name=OSU-Micro-Benchmarks/7.1-1-gompi-2023a %scale=1_node /6672deda @BotBuildTests:aarch64_a64fx+default
P: latency: 1.71 us (r:0, l:None, u:None)
[ OK ] ( 9/12) EESSI_OSU_pt2pt_CPU %benchmark_info=mpi.pt2pt.osu_bw %module_name=OSU-Micro-Benchmarks/7.2-gompi-2023b %scale=1_node /2a9a47b1 @BotBuildTests:aarch64_a64fx+default
P: bandwidth: 8712.37 MB/s (r:0, l:None, u:None)
[ OK ] (10/12) EESSI_OSU_pt2pt_CPU %benchmark_info=mpi.pt2pt.osu_bw %module_name=OSU-Micro-Benchmarks/7.1-1-gompi-2023a %scale=1_node /1b24ab8e @BotBuildTests:aarch64_a64fx+default
P: bandwidth: 8756.58 MB/s (r:0, l:None, u:None)
[ OK ] (11/12) EESSI_LAMMPS_lj %device_type=cpu %module_name=LAMMPS/29Aug2024-foss-2023b-kokkos %scale=1_node /aeb2d9df @BotBuildTests:aarch64_a64fx+default
P: perf: 43.587 timesteps/s (r:0, l:None, u:None)
[ OK ] (12/12) EESSI_LAMMPS_lj %device_type=cpu %module_name=LAMMPS/2Aug2023_update2-foss-2023a-kokkos %scale=1_node /04ff9ece @BotBuildTests:aarch64_a64fx+default
P: perf: 25.45 timesteps/s (r:0, l:None, u:None)
[ PASSED ] Ran 6/12 test case(s) from 12 check(s) (0 failure(s), 6 skipped, 0 aborted)
Details
✅ job output file slurm-685988.out
✅ no message matching ERROR:
✅ no message matching [\s*FAILED\s*].*Ran .* test case

Removed GROMACS CPU target limit from eb_hooks.py.
@bedroge
Copy link
Contributor Author

bedroge commented Nov 17, 2025

bot: build repo:eessi.io-2023.06-software instance:eessi-bot-deucalion for:arch=aarch64/a64fx

@eessi-bot-deucalion
Copy link

eessi-bot-deucalion bot commented Nov 17, 2025

New job on instance eessi-bot-deucalion for repository eessi.io-2023.06-software
Building on: a64fx
Building for: aarch64/a64fx
Job dir: /home/eessibot/new-bot/jobs/2025.11/pr_129/686410

date job status comment
Nov 17 18:58:14 UTC 2025 submitted job id 686410 awaits release by job manager
Nov 17 18:59:10 UTC 2025 released job awaits launch by Slurm scheduler
Nov 17 19:00:12 UTC 2025 running job 686410 is running
Nov 17 21:50:02 UTC 2025 finished
😁 SUCCESS (click triangle for details)
Details
✅ job output file slurm-686410.out
✅ no message matching FATAL:
✅ no message matching ERROR:
✅ no message matching FAILED:
✅ no message matching required modules missing:
✅ found message(s) matching No missing installations
✅ found message matching .tar.* created!
Artefacts
eessi-2023.06-software-linux-aarch64-a64fx-17634158150.tar.zstsize: 40 MiB (42274408 bytes)
entries: 771
modules under 2023.06/software/linux/aarch64/a64fx/modules/all
GROMACS/2024.1-foss-2023b.lua
software under 2023.06/software/linux/aarch64/a64fx/software
GROMACS/2024.1-foss-2023b
reprod directories under 2023.06/software/linux/aarch64/a64fx/reprod
no reprod directories in tarball
other under 2023.06/software/linux/aarch64/a64fx
2023.06/init/easybuild/eb_hooks.py
Nov 17 21:50:02 UTC 2025 test result
😁 SUCCESS (click triangle for details)
ReFrame Summary
[ SKIP ] ( 1/12) Skipping test: nodes in this partition only have 30720 MiB memory available (per node) according to the current ReFrame configuration, but 49152 MiB is needed
[ SKIP ] ( 2/12) Skipping test: nodes in this partition only have 30720 MiB memory available (per node) according to the current ReFrame configuration, but 49152 MiB is needed
[ SKIP ] ( 3/12) Skipping test: nodes in this partition only have 30720 MiB memory available (per node) according to the current ReFrame configuration, but 49152 MiB is needed
[ SKIP ] ( 4/12) Skipping test: nodes in this partition only have 30720 MiB memory available (per node) according to the current ReFrame configuration, but 49152 MiB is needed
[ SKIP ] ( 5/12) Skipping test: nodes in this partition only have 30720 MiB memory available (per node) according to the current ReFrame configuration, but 49152 MiB is needed
[ SKIP ] ( 6/12) Skipping test: nodes in this partition only have 30720 MiB memory available (per node) according to the current ReFrame configuration, but 49152 MiB is needed
[ OK ] ( 7/12) EESSI_OSU_pt2pt_CPU %benchmark_info=mpi.pt2pt.osu_latency %module_name=OSU-Micro-Benchmarks/7.2-gompi-2023b %scale=1_node /15cad6c4 @BotBuildTests:aarch64_a64fx+default
P: latency: 1.7 us (r:0, l:None, u:None)
[ OK ] ( 8/12) EESSI_OSU_pt2pt_CPU %benchmark_info=mpi.pt2pt.osu_latency %module_name=OSU-Micro-Benchmarks/7.1-1-gompi-2023a %scale=1_node /6672deda @BotBuildTests:aarch64_a64fx+default
P: latency: 1.71 us (r:0, l:None, u:None)
[ OK ] ( 9/12) EESSI_OSU_pt2pt_CPU %benchmark_info=mpi.pt2pt.osu_bw %module_name=OSU-Micro-Benchmarks/7.2-gompi-2023b %scale=1_node /2a9a47b1 @BotBuildTests:aarch64_a64fx+default
P: bandwidth: 8823.51 MB/s (r:0, l:None, u:None)
[ OK ] (10/12) EESSI_OSU_pt2pt_CPU %benchmark_info=mpi.pt2pt.osu_bw %module_name=OSU-Micro-Benchmarks/7.1-1-gompi-2023a %scale=1_node /1b24ab8e @BotBuildTests:aarch64_a64fx+default
P: bandwidth: 8709.94 MB/s (r:0, l:None, u:None)
[ OK ] (11/12) EESSI_LAMMPS_lj %device_type=cpu %module_name=LAMMPS/29Aug2024-foss-2023b-kokkos %scale=1_node /aeb2d9df @BotBuildTests:aarch64_a64fx+default
P: perf: 582.572 timesteps/s (r:0, l:None, u:None)
[ OK ] (12/12) EESSI_LAMMPS_lj %device_type=cpu %module_name=LAMMPS/2Aug2023_update2-foss-2023a-kokkos %scale=1_node /04ff9ece @BotBuildTests:aarch64_a64fx+default
P: perf: 582.033 timesteps/s (r:0, l:None, u:None)
[ PASSED ] Ran 6/12 test case(s) from 12 check(s) (0 failure(s), 6 skipped, 0 aborted)
Details
✅ job output file slurm-686410.out
✅ no message matching ERROR:
✅ no message matching [\s*FAILED\s*].*Ran .* test case

@bedroge
Copy link
Contributor Author

bedroge commented Nov 18, 2025

These workarounds were not necessary in the end, GROMACS 2024.x versions for A64FX have been added in EESSI/software-layer#1308 and EESSI/software-layer#1302. So, closing this.

@bedroge bedroge closed this Nov 18, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants