{2023.06}[foss/2023a] GROMACS v2023.4 #463

Closed

Conversation

bedroge
Collaborator

@bedroge bedroge commented Jan 26, 2024

No description provided.

eessi-bot-aws bot commented Jan 26, 2024

Instance eessi-bot-mc-aws is configured to build:

  • arch x86_64/generic for repo eessi-hpc.org-2023.06-compat
  • arch x86_64/generic for repo eessi-hpc.org-2023.06-software
  • arch x86_64/generic for repo eessi.io-2023.06-compat
  • arch x86_64/generic for repo eessi.io-2023.06-software
  • arch x86_64/intel/haswell for repo eessi-hpc.org-2023.06-compat
  • arch x86_64/intel/haswell for repo eessi-hpc.org-2023.06-software
  • arch x86_64/intel/haswell for repo eessi.io-2023.06-compat
  • arch x86_64/intel/haswell for repo eessi.io-2023.06-software
  • arch x86_64/intel/skylake_avx512 for repo eessi-hpc.org-2023.06-compat
  • arch x86_64/intel/skylake_avx512 for repo eessi-hpc.org-2023.06-software
  • arch x86_64/intel/skylake_avx512 for repo eessi.io-2023.06-compat
  • arch x86_64/intel/skylake_avx512 for repo eessi.io-2023.06-software
  • arch x86_64/amd/zen2 for repo eessi-hpc.org-2023.06-compat
  • arch x86_64/amd/zen2 for repo eessi-hpc.org-2023.06-software
  • arch x86_64/amd/zen2 for repo eessi.io-2023.06-compat
  • arch x86_64/amd/zen2 for repo eessi.io-2023.06-software
  • arch x86_64/amd/zen3 for repo eessi-hpc.org-2023.06-compat
  • arch x86_64/amd/zen3 for repo eessi-hpc.org-2023.06-software
  • arch x86_64/amd/zen3 for repo eessi.io-2023.06-compat
  • arch x86_64/amd/zen3 for repo eessi.io-2023.06-software
  • arch aarch64/generic for repo eessi-hpc.org-2023.06-compat
  • arch aarch64/generic for repo eessi-hpc.org-2023.06-software
  • arch aarch64/generic for repo eessi.io-2023.06-compat
  • arch aarch64/generic for repo eessi.io-2023.06-software
  • arch aarch64/neoverse_n1 for repo eessi-hpc.org-2023.06-compat
  • arch aarch64/neoverse_n1 for repo eessi-hpc.org-2023.06-software
  • arch aarch64/neoverse_n1 for repo eessi.io-2023.06-compat
  • arch aarch64/neoverse_n1 for repo eessi.io-2023.06-software
  • arch aarch64/neoverse_v1 for repo eessi-hpc.org-2023.06-compat
  • arch aarch64/neoverse_v1 for repo eessi-hpc.org-2023.06-software
  • arch aarch64/neoverse_v1 for repo eessi.io-2023.06-compat
  • arch aarch64/neoverse_v1 for repo eessi.io-2023.06-software

@bedroge
Collaborator Author

bedroge commented Jan 26, 2024

bot: build repo:eessi.io-2023.06-software arch:x86_64/amd/zen3

eessi-bot-aws bot commented Jan 26, 2024

Updates by the bot instance eessi-bot-mc-aws (click for details)
  • received bot command build repo:eessi.io-2023.06-software arch:x86_64/amd/zen3 from bedroge

    • expanded format: build repository:eessi.io-2023.06-software architecture:x86_64/amd/zen3
  • handling command build repository:eessi.io-2023.06-software architecture:x86_64/amd/zen3 resulted in:
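The bot reports an "expanded format" for each command, in which the short keys `repo` and `arch` become `repository` and `architecture`. A minimal sketch of that expansion follows, assuming the command is always `bot: <action> key:value ...`. The function name `expand_bot_command` and the alias table are hypothetical and are not taken from the bot's source.

```python
import re

# Assumed short-to-long key mapping, inferred from the "expanded format" lines
# in the bot comment; the real bot may support more keys.
KEY_ALIASES = {"repo": "repository", "arch": "architecture"}


def expand_bot_command(line: str) -> str:
    """Expand 'bot: build repo:X arch:Y' into the long form (hypothetical helper)."""
    match = re.match(r"^bot:\s*(\S+)\s*(.*)$", line.strip())
    if match is None:
        raise ValueError(f"not a bot command: {line!r}")
    action, rest = match.groups()
    expanded = []
    for token in rest.split():
        # Split on the first colon only, so values may contain further colons.
        key, sep, value = token.partition(":")
        if not sep:
            raise ValueError(f"argument without key: {token!r}")
        expanded.append(f"{KEY_ALIASES.get(key, key)}:{value}")
    return " ".join([action, *expanded])


print(expand_bot_command("bot: build repo:eessi.io-2023.06-software arch:x86_64/amd/zen3"))
# → build repository:eessi.io-2023.06-software architecture:x86_64/amd/zen3
```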

eessi-bot-aws bot commented Jan 26, 2024

New job on instance eessi-bot-mc-aws for architecture x86_64-amd-zen3 for repository eessi.io-2023.06-software in job dir /project/def-users/SHARED/jobs/2024.01/pr_463/4815

date job status comment
Jan 26 12:26:15 UTC 2024 submitted job id 4815 awaits release by job manager
Jan 26 12:27:07 UTC 2024 released job awaits launch by Slurm scheduler
Jan 26 12:32:09 UTC 2024 running job 4815 is running
Jan 26 13:07:40 UTC 2024 finished
😢 FAILURE (click triangle for details)
Details
✅ job output file slurm-4815.out
❌ found message matching ERROR:
❌ found message matching FAILED:
❌ found message matching required modules missing:
❌ no message matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
No artefacts were created or found.
Jan 26 13:07:40 UTC 2024 test result (no tests yet)
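The "Details" checklist in the report above amounts to scanning the Slurm output for a few fixed messages, some of which must be absent and some present. The sketch below reproduces that idea using the five patterns listed in the report. The function names and the rule that every check must pass are assumptions for illustration, not the bot's actual implementation.

```python
import re

# (pattern, must_be_present) pairs mirroring the checklist in the bot report.
CHECKS = [
    (r"ERROR:", False),
    (r"FAILED:", False),
    (r"required modules missing:", False),
    (r"No missing installations", True),
    (r"\.tar\.gz created!", True),
]


def check_job_output(text: str):
    """Return (pattern, found, ok) for each check (hypothetical helper)."""
    results = []
    for pattern, must_be_present in CHECKS:
        found = re.search(pattern, text) is not None
        results.append((pattern, found, found == must_be_present))
    return results


def is_success(text: str) -> bool:
    """Assumed rule: the job counts as successful only if every check passes."""
    return all(ok for _, _, ok in check_job_output(text))


sample = "== Build succeeded for 1 out of 1\nERROR: One or more files not found\nfoo.tar.gz created!\n"
for pattern, found, ok in check_job_output(sample):
    print("OK  " if ok else "FAIL", "found" if found else "no", "message matching", pattern)
```

With a rule like this, a build that installs correctly but prints an `ERROR:` line in a later step would still be reported as a failure.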

@bedroge
Collaborator Author

bedroge commented Jan 26, 2024

While I was debugging #401 interactively, I also tried installing 2023.4, and then that version did work fine. But now the build with the bot does show the same test failure again 😞

-------------------------------------------------------
Program:     gmxapi-mpi-test, version 2023.4-EasyBuild_4.9.0
Source file: src/gromacs/utility/futil.cpp (line 454)
MPI rank:    1 (out of 2)

File input/output error:
/tmp/bot/easybuild/build/GROMACS/2023.4/foss-2023a/easybuild_obj/api/gmxapi/cpp/tests/Testing/Temporary/GmxApiTest_RunnerChainedMD.trr

For more information and tips for troubleshooting, please check the GROMACS
website at https://manual.gromacs.org/current/user-guide/run-time-errors.html
-------------------------------------------------------

-------------------------------------------------------
Program:     gmxapi-mpi-test, version 2023.4-EasyBuild_4.9.0
Source file: src/gromacs/utility/futil.cpp (line 454)
MPI rank:    0 (out of 2)

File input/output error:
/tmp/bot/easybuild/build/GROMACS/2023.4/foss-2023a/easybuild_obj/api/gmxapi/cpp/tests/Testing/Temporary/GmxApiTest_RunnerChainedMD.trr

For more information and tips for troubleshooting, please check the GROMACS
website at https://manual.gromacs.org/current/user-guide/run-time-errors.html
-------------------------------------------------------

@boegel
Contributor

boegel commented Jan 26, 2024

@bedroge Some time to blame fuse-overlayfs? Or doesn't that come into play here yet (since this is pre-install test suite)?

@bedroge
Collaborator Author

bedroge commented Jan 26, 2024

@bedroge Some time to blame fuse-overlayfs? Or doesn't that come into play here yet (since this is pre-install test suite)?

It's using /tmp inside the container, which shouldn't use fuse-overlayfs. I'm not entirely sure which host directory is bind mounted as /tmp, but if that's a shared filesystem (not entirely sure, but don't think that's the case?), it could explain things.

@boegel
Contributor

boegel commented Jan 26, 2024

@bedroge Some time to blame fuse-overlayfs? Or doesn't that come into play here yet (since this is pre-install test suite)?

It's using /tmp inside the container, which shouldn't use fuse-overlayfs. I'm not entirely sure which host directory is bind mounted as /tmp, but if that's a shared filesystem (not entirely sure, but don't think that's the case?), it could explain things.

A unique subdirectory of /tmp is bind-mounted as /tmp in the build container I think, definitely not a shared filesystem (we would've seen problems already in that case).

@bedroge
Collaborator Author

bedroge commented Jan 26, 2024

I've tried building this version a few more times now, and it's failing all the time on that same test. For some reason, it worked the very first time, though 😕

@TopRichard
Collaborator

TopRichard commented Feb 7, 2024

@bedroge @boegel GROMACS v2023.4 works for NESSI (NorESSI#273), but with the deb10 container on certain architectures.

@bedroge
Collaborator Author

bedroge commented Feb 9, 2024

@bedroge @boegel GROMACS v2023.4 works for NESSI (NorESSI#273), but with the deb10 container on certain architectures.

Oh, that's really interesting... Now we did have weird behaviour for both 2023.3 and 2023.4, where it occasionally does work, but for e.g. haswell it's failing (almost) all the time, I think. Doesn't look like NESSI builds for haswell (?), so maybe I can try doing that interactively with our old debian10 container.

@bedroge
Collaborator Author

bedroge commented Feb 9, 2024

@bedroge @boegel GROMACS v2023.4 works for NESSI (NorESSI#273), but with the deb10 container on certain architectures.

Oh, that's really interesting... Now we did have weird behaviour for both 2023.3 and 2023.4, where it occasionally does work, but for e.g. haswell it's failing (almost) all the time, I think. Doesn't look like NESSI builds for haswell (?), so maybe I can try doing that interactively with our old debian10 container.

@TopRichard I tried it on a haswell node with our old Debian 10 build container (https://github.com/EESSI/filesystem-layer/pkgs/container/build-node/40951724?tag=debian10), but it has been stuck in the test step for several hours now, so I think it hangs (and we saw the same thing before). Is there any other significant difference in your Debian 10 container compared to the Debian 11 container?

@TopRichard
Collaborator

@bedroge we are using the same containers as EESSI, and our stack doesn't have haswell, so I will try building GROMACS for haswell within the NESSI stack and figure out a way to help further with this.

@bedroge
Collaborator Author

bedroge commented Mar 27, 2024

bot: build repo:eessi.io-2023.06-software arch:x86_64/amd/zen3

eessi-bot-aws bot commented Mar 27, 2024

Updates by the bot instance eessi-bot-mc-aws (click for details)
  • received bot command build repo:eessi.io-2023.06-software arch:x86_64/amd/zen3 from bedroge
    • expanded format: build repository:eessi.io-2023.06-software architecture:x86_64/amd/zen3

eessi-bot-aws bot commented Mar 27, 2024

error: patch failed: easystacks/software.eessi.io/2023.06/eessi-2023.06-eb-4.9.0-2023a.yml:11 error: easystacks/software.eessi.io/2023.06/eessi-2023.06-eb-4.9.0-2023a.yml: patch does not apply Unable to download or merge changes between the source branch and the destination branch. Tip: This can usually be resolved by syncing your branch and resolving any merge conflicts.

Alternative formattings (edited manually):

error: patch failed: easystacks/software.eessi.io/2023.06/eessi-2023.06-eb-4.9.0-2023a.yml:11
error: easystacks/software.eessi.io/2023.06/eessi-2023.06-eb-4.9.0-2023a.yml: patch does not apply

Unable to download or merge changes between the source branch and the destination branch.
Tip: This can usually be resolved by syncing your branch and resolving any merge conflicts.

or

Unable to download or merge changes between the source branch and the destination branch.

error: patch failed: easystacks/software.eessi.io/2023.06/eessi-2023.06-eb-4.9.0-2023a.yml:11
error: easystacks/software.eessi.io/2023.06/eessi-2023.06-eb-4.9.0-2023a.yml: patch does not apply

Tip: This can usually be resolved by syncing your branch and resolving any merge conflicts.

@bedroge
Collaborator Author

bedroge commented Mar 27, 2024

bot: build repo:eessi.io-2023.06-software arch:x86_64/amd/zen3

eessi-bot-aws bot commented Mar 27, 2024

Updates by the bot instance eessi-bot-mc-aws (click for details)
  • received bot command build repo:eessi.io-2023.06-software arch:x86_64/amd/zen3 from bedroge

    • expanded format: build repository:eessi.io-2023.06-software architecture:x86_64/amd/zen3
  • handling command build repository:eessi.io-2023.06-software architecture:x86_64/amd/zen3 resulted in:

eessi-bot-aws bot commented Mar 27, 2024

New job on instance eessi-bot-mc-aws for architecture x86_64-amd-zen3 for repository eessi.io-2023.06-software in job dir /project/def-users/SHARED/jobs/2024.03/pr_463/8659

date job status comment
Mar 27 07:35:50 UTC 2024 submitted job id 8659 awaits release by job manager
Mar 27 07:36:00 UTC 2024 released job awaits launch by Slurm scheduler
Mar 27 07:40:01 UTC 2024 finished
😢 FAILURE (click triangle for details)
Details
✅ job output file slurm-8659.out
✅ no message matching ERROR:
✅ no message matching FAILED:
✅ no message matching required modules missing:
❌ no message matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
No artefacts were created or found.
Mar 27 07:40:01 UTC 2024 test result
😁 FAILURE (click triangle for details)
Reason
Failed for unknown reason
Details
✅ job output file slurm-8659.out
✅ no message matching ERROR:
✅ no message matching [\s*FAILED\s*].*Ran .* test case

@bedroge
Collaborator Author

bedroge commented Mar 27, 2024

bot: build repo:eessi.io-2023.06-software arch:x86_64/amd/zen3

eessi-bot-aws bot commented Mar 27, 2024

Updates by the bot instance eessi-bot-mc-aws (click for details)
  • received bot command build repo:eessi.io-2023.06-software arch:x86_64/amd/zen3 from bedroge

    • expanded format: build repository:eessi.io-2023.06-software architecture:x86_64/amd/zen3
  • handling command build repository:eessi.io-2023.06-software architecture:x86_64/amd/zen3 resulted in:

eessi-bot-aws bot commented Mar 27, 2024

New job on instance eessi-bot-mc-aws for architecture x86_64-amd-zen3 for repository eessi.io-2023.06-software in job dir /project/def-users/SHARED/jobs/2024.03/pr_463/8677

date job status comment
Mar 27 15:46:15 UTC 2024 submitted job id 8677 awaits release by job manager
Mar 27 15:46:57 UTC 2024 released job awaits launch by Slurm scheduler
Mar 27 15:51:59 UTC 2024 running job 8677 is running
Mar 27 16:43:57 UTC 2024 finished
😢 FAILURE (click triangle for details)
Reason
EasyConfig not found during missing installation check. Are you sure all PRs referenced have been merged in EasyBuild?
Details
✅ job output file slurm-8677.out
❌ found message matching ERROR:
✅ no message matching FAILED:
✅ no message matching required modules missing:
❌ no message matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
eessi-2023.06-software-linux-x86_64-amd-zen3-1711557102.tar.gz
size: 41 MiB (43225941 bytes)
entries: 765
modules under 2023.06/software/linux/x86_64/amd/zen3/modules/all
GROMACS/2023.4-foss-2023a.lua
software under 2023.06/software/linux/x86_64/amd/zen3/software
GROMACS/2023.4-foss-2023a
other under 2023.06/software/linux/x86_64/amd/zen3
no other files in tarball
Mar 27 16:43:57 UTC 2024 test result
😁 SUCCESS (click triangle for details)
ReFrame Summary
[ PASSED ] Ran 9/9 test case(s) from 9 check(s) (0 failure(s), 0 skipped, 0 aborted)
Details
✅ job output file slurm-8677.out
❌ found message matching ERROR:
✅ no message matching [\s*FAILED\s*].*Ran .* test case

@bedroge
Collaborator Author

bedroge commented Mar 28, 2024

This actually completed:

== COMPLETED: Installation ended successfully (took 36 mins 48 secs)
== Results of the build can be found in the log file(s) /cvmfs/software.eessi.io/versions/2023.06/software/linux/x86_64/amd/zen3/software/GROMACS/2023.4-foss-2023a/easybuild/easybuild-GROMACS-2023.4-20240327.162926.log.bz2
== Build succeeded for 1 out of 1

But then failed in the missing software check (I guess because the easyconfig hasn't been merged yet?):

ERROR: One or more files not found: GROMACS-2023.4-foss-2023a.eb (search paths: /tmp/tmp.LSEuf7NgOB/easyconfigs/easybuild/easyconfigs)

So, let's try building for the other architectures as well.

@bedroge
Collaborator Author

bedroge commented Mar 28, 2024

bot: build repo:eessi.io-2023.06-software arch:aarch64/generic
bot: build repo:eessi.io-2023.06-software arch:aarch64/neoverse_n1
bot: build repo:eessi.io-2023.06-software arch:aarch64/neoverse_v1
bot: build repo:eessi.io-2023.06-software arch:x86_64/generic
bot: build repo:eessi.io-2023.06-software arch:x86_64/amd/zen2
bot: build repo:eessi.io-2023.06-software arch:x86_64/intel/haswell
bot: build repo:eessi.io-2023.06-software arch:x86_64/intel/skylake_avx512

eessi-bot-aws bot commented Mar 28, 2024

Updates by the bot instance eessi-bot-mc-aws (click for details)
  • received bot command build repo:eessi.io-2023.06-software arch:aarch64/generic from bedroge

    • expanded format: build repository:eessi.io-2023.06-software architecture:aarch64/generic
  • received bot command build repo:eessi.io-2023.06-software arch:aarch64/neoverse_n1 from bedroge

    • expanded format: build repository:eessi.io-2023.06-software architecture:aarch64/neoverse_n1
  • received bot command build repo:eessi.io-2023.06-software arch:aarch64/neoverse_v1 from bedroge

    • expanded format: build repository:eessi.io-2023.06-software architecture:aarch64/neoverse_v1
  • received bot command build repo:eessi.io-2023.06-software arch:x86_64/generic from bedroge

    • expanded format: build repository:eessi.io-2023.06-software architecture:x86_64/generic
  • received bot command build repo:eessi.io-2023.06-software arch:x86_64/amd/zen2 from bedroge

    • expanded format: build repository:eessi.io-2023.06-software architecture:x86_64/amd/zen2
  • received bot command build repo:eessi.io-2023.06-software arch:x86_64/intel/haswell from bedroge

    • expanded format: build repository:eessi.io-2023.06-software architecture:x86_64/intel/haswell
  • received bot command build repo:eessi.io-2023.06-software arch:x86_64/intel/skylake_avx512 from bedroge

    • expanded format: build repository:eessi.io-2023.06-software architecture:x86_64/intel/skylake_avx512
  • handling command build repository:eessi.io-2023.06-software architecture:aarch64/generic resulted in:

  • handling command build repository:eessi.io-2023.06-software architecture:aarch64/neoverse_n1 resulted in:

  • handling command build repository:eessi.io-2023.06-software architecture:aarch64/neoverse_v1 resulted in:

  • handling command build repository:eessi.io-2023.06-software architecture:x86_64/generic resulted in:

  • handling command build repository:eessi.io-2023.06-software architecture:x86_64/amd/zen2 resulted in:

  • handling command build repository:eessi.io-2023.06-software architecture:x86_64/intel/haswell resulted in:

  • handling command build repository:eessi.io-2023.06-software architecture:x86_64/intel/skylake_avx512 resulted in:

eessi-bot-aws bot commented Mar 28, 2024

New job on instance eessi-bot-mc-aws for architecture aarch64-generic for repository eessi.io-2023.06-software in job dir /project/def-users/SHARED/jobs/2024.03/pr_463/8719

date job status comment
Mar 28 09:11:48 UTC 2024 submitted job id 8719 awaits release by job manager
Mar 28 09:12:31 UTC 2024 released job awaits launch by Slurm scheduler
Mar 28 09:16:11 UTC 2024 running job 8719 is running
Mar 28 10:13:47 UTC 2024 finished
😢 FAILURE (click triangle for details)
Reason
EasyConfig not found during missing installation check. Are you sure all PRs referenced have been merged in EasyBuild?
Details
✅ job output file slurm-8719.out
❌ found message matching ERROR:
✅ no message matching FAILED:
✅ no message matching required modules missing:
❌ no message matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
eessi-2023.06-software-linux-aarch64-generic-1711620187.tar.gz
size: 32 MiB (33811112 bytes)
entries: 765
modules under 2023.06/software/linux/aarch64/generic/modules/all
GROMACS/2023.4-foss-2023a.lua
software under 2023.06/software/linux/aarch64/generic/software
GROMACS/2023.4-foss-2023a
other under 2023.06/software/linux/aarch64/generic
no other files in tarball
Mar 28 10:13:47 UTC 2024 test result
😁 SUCCESS (click triangle for details)
ReFrame Summary
[ PASSED ] Ran 9/9 test case(s) from 9 check(s) (0 failure(s), 0 skipped, 0 aborted)
Details
✅ job output file slurm-8719.out
❌ found message matching ERROR:
✅ no message matching [\s*FAILED\s*].*Ran .* test case

eessi-bot-aws bot commented Mar 28, 2024

New job on instance eessi-bot-mc-aws for architecture aarch64-neoverse_n1 for repository eessi.io-2023.06-software in job dir /project/def-users/SHARED/jobs/2024.03/pr_463/8720

date job status comment
Mar 28 09:11:55 UTC 2024 submitted job id 8720 awaits release by job manager
Mar 28 09:12:33 UTC 2024 released job awaits launch by Slurm scheduler
Mar 28 09:17:21 UTC 2024 running job 8720 is running
Mar 28 10:16:14 UTC 2024 finished
😢 FAILURE (click triangle for details)
Reason
EasyConfig not found during missing installation check. Are you sure all PRs referenced have been merged in EasyBuild?
Details
✅ job output file slurm-8720.out
❌ found message matching ERROR:
✅ no message matching FAILED:
✅ no message matching required modules missing:
❌ no message matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
eessi-2023.06-software-linux-aarch64-neoverse_n1-1711620306.tar.gz
size: 36 MiB (38656276 bytes)
entries: 765
modules under 2023.06/software/linux/aarch64/neoverse_n1/modules/all
GROMACS/2023.4-foss-2023a.lua
software under 2023.06/software/linux/aarch64/neoverse_n1/software
GROMACS/2023.4-foss-2023a
other under 2023.06/software/linux/aarch64/neoverse_n1
no other files in tarball
Mar 28 10:16:14 UTC 2024 test result
😁 SUCCESS (click triangle for details)
ReFrame Summary
[ PASSED ] Ran 9/9 test case(s) from 9 check(s) (0 failure(s), 0 skipped, 0 aborted)
Details
✅ job output file slurm-8720.out
❌ found message matching ERROR:
✅ no message matching [\s*FAILED\s*].*Ran .* test case

eessi-bot-aws bot commented Mar 28, 2024

New job on instance eessi-bot-mc-aws for architecture aarch64-neoverse_v1 for repository eessi.io-2023.06-software in job dir /project/def-users/SHARED/jobs/2024.03/pr_463/8721

date job status comment
Mar 28 09:12:05 UTC 2024 submitted job id 8721 awaits release by job manager
Mar 28 09:12:35 UTC 2024 released job awaits launch by Slurm scheduler
Mar 28 09:17:23 UTC 2024 running job 8721 is running
Mar 28 10:10:05 UTC 2024 finished
😢 FAILURE (click triangle for details)
Reason
EasyConfig not found during missing installation check. Are you sure all PRs referenced have been merged in EasyBuild?
Details
✅ job output file slurm-8721.out
❌ found message matching ERROR:
❌ found message matching FAILED:
❌ found message matching required modules missing:
❌ no message matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
No artefacts were created or found.
Mar 28 10:10:05 UTC 2024 test result
😁 SUCCESS (click triangle for details)
ReFrame Summary
[ PASSED ] Ran 9/9 test case(s) from 9 check(s) (0 failure(s), 0 skipped, 0 aborted)
Details
✅ job output file slurm-8721.out
❌ found message matching ERROR:
✅ no message matching [\s*FAILED\s*].*Ran .* test case

eessi-bot-aws bot commented Mar 28, 2024

New job on instance eessi-bot-mc-aws for architecture x86_64-generic for repository eessi.io-2023.06-software in job dir /project/def-users/SHARED/jobs/2024.03/pr_463/8722

date job status comment
Mar 28 09:12:13 UTC 2024 submitted job id 8722 awaits release by job manager
Mar 28 09:12:39 UTC 2024 released job awaits launch by Slurm scheduler
Mar 28 09:18:39 UTC 2024 running job 8722 is running
Mar 28 10:23:22 UTC 2024 finished
😢 FAILURE (click triangle for details)
Reason
EasyConfig not found during missing installation check. Are you sure all PRs referenced have been merged in EasyBuild?
Details
✅ job output file slurm-8722.out
❌ found message matching ERROR:
✅ no message matching FAILED:
✅ no message matching required modules missing:
❌ no message matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
eessi-2023.06-software-linux-x86_64-generic-1711620355.tar.gz
size: 38 MiB (40576204 bytes)
entries: 765
modules under 2023.06/software/linux/x86_64/generic/modules/all
GROMACS/2023.4-foss-2023a.lua
software under 2023.06/software/linux/x86_64/generic/software
GROMACS/2023.4-foss-2023a
other under 2023.06/software/linux/x86_64/generic
no other files in tarball
Mar 28 10:23:22 UTC 2024 test result
😁 SUCCESS (click triangle for details)
ReFrame Summary
[ PASSED ] Ran 9/9 test case(s) from 9 check(s) (0 failure(s), 0 skipped, 0 aborted)
Details
✅ job output file slurm-8722.out
❌ found message matching ERROR:
✅ no message matching [\s*FAILED\s*].*Ran .* test case

eessi-bot-aws bot commented Mar 28, 2024

New job on instance eessi-bot-mc-aws for architecture x86_64-amd-zen2 for repository eessi.io-2023.06-software in job dir /project/def-users/SHARED/jobs/2024.03/pr_463/8723

date job status comment
Mar 28 09:12:22 UTC 2024 submitted job id 8723 awaits release by job manager
Mar 28 09:12:37 UTC 2024 released job awaits launch by Slurm scheduler
Mar 28 09:17:25 UTC 2024 running job 8723 is running
Mar 28 10:24:32 UTC 2024 finished
😢 FAILURE (click triangle for details)
Reason
EasyConfig not found during missing installation check. Are you sure all PRs referenced have been merged in EasyBuild?
Details
✅ job output file slurm-8723.out
❌ found message matching ERROR:
❌ found message matching FAILED:
❌ found message matching required modules missing:
❌ no message matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
No artefacts were created or found.
Mar 28 10:24:32 UTC 2024 test result
😁 SUCCESS (click triangle for details)
ReFrame Summary
[ PASSED ] Ran 9/9 test case(s) from 9 check(s) (0 failure(s), 0 skipped, 0 aborted)
Details
✅ job output file slurm-8723.out
❌ found message matching ERROR:
✅ no message matching [\s*FAILED\s*].*Ran .* test case

eessi-bot-aws bot commented Mar 28, 2024

New job on instance eessi-bot-mc-aws for architecture x86_64-intel-haswell for repository eessi.io-2023.06-software in job dir /project/def-users/SHARED/jobs/2024.03/pr_463/8724

date job status comment
Mar 28 09:12:31 UTC 2024 submitted job id 8724 awaits release by job manager
Mar 28 09:13:51 UTC 2024 released job awaits launch by Slurm scheduler
Mar 28 09:19:53 UTC 2024 running job 8724 is running
Mar 28 11:51:30 UTC 2024 finished
😢 FAILURE (click triangle for details)
Reason
EasyConfig not found during missing installation check. Are you sure all PRs referenced have been merged in EasyBuild?
Details
✅ job output file slurm-8724.out
❌ found message matching ERROR:
❌ found message matching FAILED:
❌ found message matching required modules missing:
❌ no message matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
No artefacts were created or found.
Mar 28 11:51:30 UTC 2024 test result
😁 SUCCESS (click triangle for details)
ReFrame Summary
[ PASSED ] Ran 9/9 test case(s) from 9 check(s) (0 failure(s), 0 skipped, 0 aborted)
Details
✅ job output file slurm-8724.out
❌ found message matching ERROR:
✅ no message matching [\s*FAILED\s*].*Ran .* test case

eessi-bot-aws bot commented Mar 28, 2024

New job on instance eessi-bot-mc-aws for architecture x86_64-intel-skylake_avx512 for repository eessi.io-2023.06-software in job dir /project/def-users/SHARED/jobs/2024.03/pr_463/8725

date job status comment
Mar 28 09:12:39 UTC 2024 submitted job id 8725 awaits release by job manager
Mar 28 09:13:54 UTC 2024 released job awaits launch by Slurm scheduler
Mar 28 09:19:56 UTC 2024 running job 8725 is running
Mar 28 10:17:26 UTC 2024 finished
😢 FAILURE (click triangle for details)
Reason
EasyConfig not found during missing installation check. Are you sure all PRs referenced have been merged in EasyBuild?
Details
✅ job output file slurm-8725.out
❌ found message matching ERROR:
✅ no message matching FAILED:
✅ no message matching required modules missing:
❌ no message matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
eessi-2023.06-software-linux-x86_64-intel-skylake_avx512-1711620130.tar.gz
size: 40 MiB (42389834 bytes)
entries: 765
modules under 2023.06/software/linux/x86_64/intel/skylake_avx512/modules/all
GROMACS/2023.4-foss-2023a.lua
software under 2023.06/software/linux/x86_64/intel/skylake_avx512/software
GROMACS/2023.4-foss-2023a
other under 2023.06/software/linux/x86_64/intel/skylake_avx512
no other files in tarball
Mar 28 10:17:26 UTC 2024 test result
😁 SUCCESS (click triangle for details)
ReFrame Summary
[ PASSED ] Ran 9/9 test case(s) from 9 check(s) (0 failure(s), 0 skipped, 0 aborted)
Details
✅ job output file slurm-8725.out
❌ found message matching ERROR:
✅ no message matching [\s*FAILED\s*].*Ran .* test case

@bedroge
Collaborator Author

bedroge commented Mar 28, 2024

Quick summary of the status of the builds:

# Job 8721 on neoverse v1:
The following tests FAILED:
         15 - GmxlibTests (Subprocess aborted)
         45 - SimdUnitTests (Failed)
         80 - MdrunFEPTests (Failed)

# Job 8723 on zen2:
The following tests FAILED:
          2 - GmxapiMpiTests (Failed)

# Job 8724 on haswell:
Stuck, killed it manually

All the other ones actually completed successfully, but failed in the check for missing installations.

Regarding the haswell build, I saw the following processes running on the node:

bot        31437   31427  0 09:26 ?        00:00:00 orted --hnp --set-sid --report-uri 14 --singleton-died-pipe 15 -mca state_novm_select 1 -mca ess hnp -mca pmix ^s1,s2,cray,isolated
bot        31438   31436  0 09:26 ?        00:00:00 orted --hnp --set-sid --report-uri 14 --singleton-died-pipe 15 -mca state_novm_select 1 -mca ess hnp -mca pmix ^s1,s2,cray,isolated

I'm guessing this may be an issue similar to what we are seeing in #508 and #456.
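The two lines above look like `ps -ef` output (UID, PID, PPID, C, STIME, TTY, TIME, CMD). A minimal sketch for picking leftover `orted` daemons out of such a listing follows, assuming that column layout. The helper name `find_orted` is hypothetical.

```python
def find_orted(ps_lines):
    """Return orted processes from ps -ef style lines (hypothetical helper).

    Assumes eight whitespace-separated columns:
    UID PID PPID C STIME TTY TIME CMD (CMD may itself contain spaces).
    """
    hits = []
    for line in ps_lines:
        fields = line.split(None, 7)
        if len(fields) < 8:
            continue  # skip blank or truncated lines
        uid, pid, ppid, _c, _stime, _tty, _time, cmd = fields
        if cmd.split()[0] == "orted":
            hits.append({"user": uid, "pid": int(pid), "ppid": int(ppid), "cmd": cmd})
    return hits


listing = [
    "UID          PID    PPID  C STIME TTY          TIME CMD",
    "bot        31437   31427  0 09:26 ?        00:00:00 orted --hnp --set-sid --report-uri 14",
    "bot        31438   31436  0 09:26 ?        00:00:00 orted --hnp --set-sid --report-uri 14",
]
for proc in find_orted(listing):
    print(proc["pid"], proc["ppid"], proc["user"])
```

The parent PIDs (31427 and 31436 here) are the processes to inspect next, since they are the ones waiting on the daemons.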

@marconetto

Hi folks, I would like to use GROMACS from the EESSI prod repo. I understand this is the PR to generate GROMACS in the prod repo, right?

@bedroge
Collaborator Author

bedroge commented Mar 30, 2024

Hi folks, I would like to use GROMACS from the EESSI prod repo. I understand this is the PR to generate GROMACS in the prod repo, right?

Hi @marconetto, yes, that's correct. Unfortunately, we're seeing weird test failures on a few CPU types, and these are quite hard to figure out. We'd really like to make GROMACS available, but we also really want to be sure that the builds are fine before we actually add them to the repository. So we'll keep working on them (we also have open PRs for version 2023.3 in #401 and 2024.1 in #499, but all with similar issues), and hopefully we will be able to resolve the issues soon.

@marconetto

thanks @bedroge for the clarification!

@boegel boegel added the 2023.06-software.eessi.io 2023.06 version of software.eessi.io label Apr 2, 2024
@bedroge
Collaborator Author

bedroge commented Apr 4, 2024

bot: build repo:eessi.io-2023.06-software arch:x86_64/intel/haswell

eessi-bot-aws bot commented Apr 4, 2024

Updates by the bot instance eessi-bot-mc-aws (click for details)
  • received bot command build repo:eessi.io-2023.06-software arch:x86_64/intel/haswell from bedroge

    • expanded format: build repository:eessi.io-2023.06-software architecture:x86_64/intel/haswell
  • handling command build repository:eessi.io-2023.06-software architecture:x86_64/intel/haswell resulted in:

eessi-bot-aws bot commented Apr 4, 2024

New job on instance eessi-bot-mc-aws for architecture x86_64-intel-haswell for repository eessi.io-2023.06-software in job dir /project/def-users/SHARED/jobs/2024.04/pr_463/9013

date job status comment
Apr 04 11:22:46 UTC 2024 submitted job id 9013 awaits release by job manager
Apr 04 11:23:47 UTC 2024 released job awaits launch by Slurm scheduler
Apr 04 11:27:49 UTC 2024 running job 9013 is running
Apr 04 11:35:01 UTC 2024 finished
😢 FAILURE (click triangle for details)
Details
✅ job output file slurm-9013.out
❌ found message matching ERROR:
✅ no message matching FAILED:
✅ no message matching required modules missing:
❌ no message matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
No artefacts were created or found.
Apr 04 11:35:01 UTC 2024 test result
🤷 UNKNOWN (click triangle for detailed information)
  • Job test file _bot_job9013.test does not exist in job directory or reading it failed.

@bedroge
Collaborator Author

bedroge commented Apr 4, 2024

bot: build repo:eessi.io-2023.06-software arch:x86_64/intel/haswell

eessi-bot-aws bot commented Apr 4, 2024

Updates by the bot instance eessi-bot-mc-aws (click for details)
  • received bot command build repo:eessi.io-2023.06-software arch:x86_64/intel/haswell from bedroge

    • expanded format: build repository:eessi.io-2023.06-software architecture:x86_64/intel/haswell
  • handling command build repository:eessi.io-2023.06-software architecture:x86_64/intel/haswell resulted in:

eessi-bot-aws bot commented Apr 4, 2024

New job on instance eessi-bot-mc-aws for architecture x86_64-intel-haswell for repository eessi.io-2023.06-software in job dir /project/def-users/SHARED/jobs/2024.04/pr_463/9014

date job status comment
Apr 04 11:34:47 UTC 2024 submitted job id 9014 awaits release by job manager
Apr 04 11:35:00 UTC 2024 released job awaits launch by Slurm scheduler
Apr 04 11:36:04 UTC 2024 running job 9014 is running
Apr 04 12:17:47 UTC 2024 finished
😢 FAILURE (click triangle for details)
Reason
EasyConfig not found during missing installation check. Are you sure all PRs referenced have been merged in EasyBuild?
Details
✅ job output file slurm-9014.out
❌ found message matching ERROR:
❌ found message matching FAILED:
❌ found message matching required modules missing:
❌ no message matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
No artefacts were created or found.
Apr 04 12:17:47 UTC 2024 test result
😁 SUCCESS (click triangle for details)
ReFrame Summary
[ PASSED ] Ran 9/9 test case(s) from 9 check(s) (0 failure(s), 0 skipped, 0 aborted)
Details
✅ job output file slurm-9014.out
❌ found message matching ERROR:
✅ no message matching [\s*FAILED\s*].*Ran .* test case

@bedroge
Collaborator Author

bedroge commented Apr 4, 2024

The haswell build is not getting stuck anymore (because of the Lmod hook that we added to the bot environment), but it does still fail due to one failing test:

          2 - GmxapiMpiTests (Failed)

This is the actual error message for the failing test:

Test failures from rank 0:
/tmp/bot/easybuild/build/GROMACS/2023.4/foss-2023a/gromacs-2023.4/api/gmxapi/cpp/tests/stopsignaler.cpp:206: Failure
Expected equality of these values:
  nsteps
    Which is: 4
  gmx::roundToInt(restraint->timeElapsedSinceStart() / getTestStepSize())
    Which is: 0
Test failures from rank 1:
/tmp/bot/easybuild/build/GROMACS/2023.4/foss-2023a/gromacs-2023.4/api/gmxapi/cpp/tests/stopsignaler.cpp:206: Failure
Expected equality of these values:
  nsteps
    Which is: 4
  gmx::roundToInt(restraint->timeElapsedSinceStart() / getTestStepSize())
    Which is: 0
[  FAILED  ] GmxApiTest.ApiRunnerStopSignalClient (203 ms)
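The assertion compares the requested step count with the elapsed simulation time divided by the step size, rounded to the nearest integer. The arithmetic can be sketched as below. The step size of 0.002 is a hypothetical value chosen for illustration (the log does not show what `getTestStepSize()` returns), and `steps_elapsed` is not a GROMACS function.

```python
import math


def steps_elapsed(time_elapsed: float, step_size: float) -> int:
    """Round elapsed_time / step_size to the nearest integer (illustrative only)."""
    return math.floor(time_elapsed / step_size + 0.5)


STEP_SIZE = 0.002  # hypothetical value, not taken from the GROMACS test
NSTEPS = 4

# What the test expects: four full steps have elapsed.
print(steps_elapsed(NSTEPS * STEP_SIZE, STEP_SIZE) == NSTEPS)

# What the log shows: the rounded quotient is 0, which happens whenever the
# reported elapsed time is less than half a step, for example exactly 0.0.
print(steps_elapsed(0.0, STEP_SIZE))
```

Under these assumptions, a result of 0 on both ranks suggests the restraint reported (almost) no elapsed time, rather than the run stopping a step early or late.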

@boegel
Contributor

boegel commented May 1, 2024

superseded by #499

@boegel boegel closed this May 1, 2024
@bedroge bedroge deleted the gromacs_2023.4 branch May 1, 2024 07:04
@bedroge
Collaborator Author

bedroge commented May 1, 2024

Hi folks, I would like to use GROMACS from the EESSI prod repo. I understand this is the PR to generate GROMACS in the prod repo, right?

Hi @marconetto, GROMACS 2024.1 is now available in the production repo. Does that version suit your needs, or did you have a specific need for version 2023.4?

@marconetto

Hi @bedroge, this is great news! Thanks to you and your team for putting this together. This version is fine, thanks a lot! I will test it next week and let you know if something is not working as expected.

Labels
2023.06-software.eessi.io 2023.06 version of software.eessi.io

4 participants