{phys}[foss/2022a] HOOMD-blue v4.0.1 w/ Python 3.10.4 + CUDA 11.7.0 #18218

Merged

akesandgren
Contributor

(created using eb --new-pr)

@akesandgren
Contributor Author

Test report by @akesandgren
SUCCESS
Build succeeded for 1 out of 1 (1 easyconfigs in total)
b-cn1603.hpc2n.umu.se - Linux Ubuntu 22.04, x86_64, AMD EPYC 7313 16-Core Processor, 1 x NVIDIA NVIDIA A100 80GB PCIe, 525.116.04, Python 3.10.6
See https://gist.github.com/akesandgren/d94edb05b1d85de66270788d2853631e for a full test report.

@verdurin
Member

@boegelbot please test @ generoso

@boegelbot
Collaborator

@verdurin: Request for testing this PR well received on login1

PR test command 'EB_PR=18218 EB_ARGS= EB_CONTAINER= /opt/software/slurm/bin/sbatch --job-name test_PR_18218 --ntasks=4 ~/boegelbot/eb_from_pr_upload_generoso.sh' executed!

  • exit code: 0
  • output:
Submitted batch job 11151

Test results coming soon (I hope)...

@verdurin
Member

Test report by @verdurin
FAILED
Build succeeded for 1 out of 2 (1 easyconfigs in total)
easybuild-c7.novalocal - Linux CentOS Linux 7.9.2009, x86_64, Intel Xeon Processor (Skylake, IBRS), Python 3.6.8
See https://gist.github.com/verdurin/ab1b2a9f3efaba427f38e5ff96ad3c5a for a full test report.

@verdurin
Member

45% tests passed, 36 tests failed out of 65

Total Test time (real) = 175.81 sec

The following tests FAILED:
          1 - test_cell_list (Failed)
          2 - test_cell_list_stencil (Failed)
          3 - test_gpu_array (Failed)
          4 - test_global_array (Failed)
         18 - test_warp_tools (Failed)
         19 - mpi-test_load_balancer (Failed)
         20 - test_warp_tools-synccheck (Failed)
         21 - test_bondtable_bond_force (Failed)
         22 - test_external_periodic (Failed)
         23 - test_fire_energy_minimizer (Failed)
         24 - test_cosinesq_angle_force (Failed)
         25 - test_harmonic_angle_force (Failed)
         26 - test_harmonic_bond_force (Failed)
         27 - test_harmonic_dihedral_force (Failed)
         28 - test_harmonic_improper_force (Failed)
         29 - test_MolecularForceCompute (Failed)
         30 - test_neighborlist (Failed)
         31 - test_opls_dihedral_force (Failed)
         32 - test_pppm_force (Failed)
         33 - test_table_angle_force (Failed)
         34 - test_table_dihedral_force (Failed)
         37 - mpi-test_communication (Failed)
         38 - mpi-test_communicator_grid (Failed)
         52 - mpcd-core-at_collision_method (Failed)
         53 - mpcd-core-cell_list (Failed)
         54 - mpcd-core-cell_thermo_compute (Failed)
         55 - mpcd-core-slit_geometry_filler (Failed)
         56 - mpcd-core-slit_pore_geometry_filler (Failed)
         57 - mpcd-core-sorter (Failed)
         58 - mpcd-core-srd_collision_method (Failed)
         59 - mpcd-core-streaming_method (Failed)
         61 - mpcd-core-cell_communicator-mpi (Failed)
         62 - mpcd-core-cell_list-mpi (Failed)
         63 - mpcd-core-cell_thermo_compute-mpi (Failed)
         64 - mpcd-core-slit_geometry_filler-mpi (Failed)
         65 - mpcd-core-slit_pore_geometry_filler-mpi (Failed)
Errors while running CTest
make: *** [test] Error 8
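
The percentage in the CTest summary above can be cross-checked against the failure count. The following is a minimal sketch, not part of the original report: the helper names `parse_ctest_summary` and `failed_tests` are hypothetical, and the embedded excerpt contains only a handful of the 36 failing tests listed above.

```python
import re

# Excerpt of the CTest output quoted above (only 4 of the 36 failures).
CTEST_EXCERPT = """\
45% tests passed, 36 tests failed out of 65

The following tests FAILED:
          1 - test_cell_list (Failed)
         19 - mpi-test_load_balancer (Failed)
         52 - mpcd-core-at_collision_method (Failed)
         65 - mpcd-core-slit_pore_geometry_filler-mpi (Failed)
"""

SUMMARY_RE = re.compile(r"(\d+)% tests passed, (\d+) tests failed out of (\d+)")
FAILED_RE = re.compile(r"^[ \t]*(\d+) - (\S+) \(Failed\)[ \t]*$", re.MULTILINE)


def parse_ctest_summary(text):
    """Return (reported_percent, failed, total) from the CTest summary line."""
    match = SUMMARY_RE.search(text)
    if match is None:
        raise ValueError("no CTest summary line found")
    percent, failed, total = (int(group) for group in match.groups())
    return percent, failed, total


def failed_tests(text):
    """Return a list of (test_number, test_name) for every '(Failed)' line."""
    return [(int(num), name) for num, name in FAILED_RE.findall(text)]


percent, failed, total = parse_ctest_summary(CTEST_EXCERPT)
passed = total - failed                    # 65 - 36 = 29
recomputed = round(100 * passed / total)   # 29 / 65 is roughly 44.6 %

print(f"{passed} of {total} passed; reported {percent}%, recomputed {recomputed}%")
for number, name in failed_tests(CTEST_EXCERPT):
    print(f"{number:3d}  {name}")
```

Run against the full log rather than the excerpt, `failed_tests` would be expected to return 36 entries, matching the count in the summary line.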

Example test failure:

--------------------------------------------------------------------------
Primary job  terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
==== backtrace (tid:  29689) ====
 0 0x0000000000036400 killpg()  ???:0
 1 0x00000000002ec59c hoomd::ExecutionConfiguration::ExecutionConfiguration()  ???:0
 2 0x00000000004082ab slit_pore_fill_mpi_gpu::run()  ???:0
 3 0x000000000040ac12 upp11::TestInvokerTrivial<slit_pore_fill_mpi_gpu>::invoke()  ???:0
 4 0x000000000040f5e9 upp11::TestCollection::runAllTests()  ???:0
 5 0x0000000000406aae main()  ???:0
 6 0x0000000000022555 __libc_start_main()  ???:0
 7 0x0000000000406da1 _start()  ???:0
=================================
[easybuild-c7:29689] *** Process received signal ***
[easybuild-c7:29689] Signal: Floating point exception (8)
[easybuild-c7:29689] Signal code:  (-6)
[easybuild-c7:29689] Failing at address: 0x3e9000073f9
[easybuild-c7:29689] [ 0] /lib64/libpthread.so.0(+0xf630)[0x7fb2e76ba630]
[easybuild-c7:29689] [ 1] /dev/shm/HOOMDblue/4.0.1/foss-2022a-CUDA-11.7.0/easybuild_obj/hoomd/_hoomd.cpython-310-x86_64-linux-gnu.so(_ZN5hoomd22ExecutionConfigurationC1ENS0_13executionModeESt6vectorIiSaIiEESt10shared_ptrINS_16MPIConfigurationEES5_INS_9MessengerEE+0xd9c)[0x7fb30fcba59c]
[easybuild-c7:29689] [ 2] /dev/shm/HOOMDblue/4.0.1/foss-2022a-CUDA-11.7.0/easybuild_obj/hoomd/mpcd/test/slit_pore_geometry_filler_mpi_test[0x4082ab]
[easybuild-c7:29689] [ 3] /dev/shm/HOOMDblue/4.0.1/foss-2022a-CUDA-11.7.0/easybuild_obj/hoomd/mpcd/test/slit_pore_geometry_filler_mpi_test[0x40ac12]
[easybuild-c7:29689] [ 4] /dev/shm/HOOMDblue/4.0.1/foss-2022a-CUDA-11.7.0/easybuild_obj/hoomd/mpcd/test/slit_pore_geometry_filler_mpi_test[0x40f5e9]
[easybuild-c7:29689] [ 5] /dev/shm/HOOMDblue/4.0.1/foss-2022a-CUDA-11.7.0/easybuild_obj/hoomd/mpcd/test/slit_pore_geometry_filler_mpi_test[0x406aae]
[easybuild-c7:29689] [ 6] /lib64/libc.so.6(__libc_start_main+0xf5)[0x7fb2e7af1555]
[easybuild-c7:29689] [ 7] /dev/shm/HOOMDblue/4.0.1/foss-2022a-CUDA-11.7.0/easybuild_obj/hoomd/mpcd/test/slit_pore_geometry_filler_mpi_test[0x406da1]
[easybuild-c7:29689] *** End of error message ***

@boegelbot
Collaborator

Test report by @boegelbot
FAILED
Build succeeded for 0 out of 1 (1 easyconfigs in total)
cns1 - Linux Rocky Linux 8.5, x86_64, Intel(R) Xeon(R) CPU E5-2667 v3 @ 3.20GHz (haswell), Python 3.6.8
See https://gist.github.com/boegelbot/9e39c4aa47ea3d88892d346694a25220 for a full test report.

@branfosj
Member

Test report by @branfosj
FAILED
Build succeeded for 0 out of 1 (1 easyconfigs in total)
bear-pg0203u29a.bear.cluster - Linux RHEL 8.6, x86_64, Intel(R) Xeon(R) Platinum 8360Y CPU @ 2.40GHz (icelake), 1 x NVIDIA NVIDIA A100-SXM4-80GB, 520.61.05, Python 3.6.8
See https://gist.github.com/branfosj/b18b6131c2e043db3ce03b41030b6d03 for a full test report.

@branfosj
Member

== 2023-06-28 16:41:39,062 build_log.py:171 ERROR EasyBuild crashed with an error (at easybuild/src/easybuild-framework/easybuild/base/exceptions.py:126 in __init__): Sanity check failed: sanity check command python -c 'import hoomd' exited with code 1 (output: Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/rds/projects/2017/branfosj-rse/easybuild/test/software/HOOMD-blue/4.0.1-foss-2022a-CUDA-11.7.0/hoomd/__init__.py", line 60, in <module>
    from hoomd import version
  File "/rds/projects/2017/branfosj-rse/easybuild/test/software/HOOMD-blue/4.0.1-foss-2022a-CUDA-11.7.0/hoomd/version.py", line 64, in <module>
    from hoomd import _hoomd
ImportError: cannot import name '_hoomd' from partially initialized module 'hoomd' (most likely due to a circular import) (/rds/projects/2017/branfosj-rse/easybuild/test/software/HOOMD-blue/4.0.1-foss-2022a-CUDA-11.7.0/hoomd/__init__.py)
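
The log above does not say why `_hoomd` could not be imported, and the "circular import" hint in the message is only CPython's guess. One way the same wording can arise, assuming CPython 3.8 or later, is when a package's `__init__.py` is still executing and a `from package import name` statement asks for a submodule that cannot be found at all (for example, a compiled extension that is absent from the installation directory). The sketch below reproduces that situation with a throwaway package; `demo_pkg`, `_ext`, and the function name are hypothetical and unrelated to HOOMD-blue, and nothing here establishes that this is what happened in the failed build.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def import_error_for_missing_submodule():
    """Create a throwaway package whose __init__ indirectly requests a
    submodule that does not exist, import it, and return the error text."""
    with tempfile.TemporaryDirectory() as tmp:
        pkg = Path(tmp) / "demo_pkg"  # hypothetical package name
        pkg.mkdir()
        # Mirrors the shape of the traceback: __init__ imports version,
        # and version imports a private submodule that is not on disk.
        (pkg / "__init__.py").write_text("from demo_pkg import version\n")
        (pkg / "version.py").write_text("from demo_pkg import _ext\n")
        sys.path.insert(0, tmp)
        try:
            importlib.invalidate_caches()
            importlib.import_module("demo_pkg")
        except ImportError as exc:
            return str(exc)
        finally:
            # Leave the interpreter state as we found it.
            sys.path.remove(tmp)
            stale = [m for m in sys.modules if m.split(".")[0] == "demo_pkg"]
            for name in stale:
                del sys.modules[name]
    return None


message = import_error_for_missing_submodule()
print(message)
```

If the message printed here matches the shape of the one in the sanity check output, a reasonable first diagnostic step is to check whether the compiled extension file is actually present in the installed package directory before looking for a genuine import cycle.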

@branfosj
Member

Test report by @branfosj
FAILED
Build succeeded for 2 out of 3 (1 easyconfigs in total)
bear-pg0105u03a.bear.cluster - Linux RHEL 8.6, x86_64, Intel(R) Xeon(R) Platinum 8360Y CPU @ 2.40GHz (icelake), Python 3.6.8
See https://gist.github.com/branfosj/f42fd5a7a688e172e05cf5f271f0e9c3 for a full test report.

…-CUDA-11.7.0.eb

Co-authored-by: Simon Branford <4967+branfosj@users.noreply.github.com>
@akesandgren
Contributor Author

@verdurin @branfosj
OK, to help pinpoint where the problem is, could you try PR #18224, which is the non-CUDA version of this one?

@boegel boegel changed the title from "{phys}[foss/2022a] HOOMD-blue v4.0.1 w/ Python 3.10.4" to "{phys}[foss/2022a] HOOMD-blue v4.0.1 w/ Python 3.10.4 + CUDA 11.7.9" on Jun 29, 2023
@boegel boegel changed the title from "{phys}[foss/2022a] HOOMD-blue v4.0.1 w/ Python 3.10.4 + CUDA 11.7.9" to "{phys}[foss/2022a] HOOMD-blue v4.0.1 w/ Python 3.10.4 + CUDA 11.7.0" on Jun 29, 2023
@SebastianAchilles
Member

Test report by @SebastianAchilles
SUCCESS
Build succeeded for 1 out of 1 (1 easyconfigs in total)
skl-rockylinux-88 - Linux Rocky Linux 8.8, x86_64, Intel(R) Core(TM) i7-10700 CPU @ 2.90GHz (skylake), 1 x NVIDIA NVIDIA RTX A4000, 530.30.02, Python 3.6.8
See https://gist.github.com/SebastianAchilles/1b234bb0c9082881781ce636fcdf8aff for a full test report.

@branfosj
Member

branfosj commented Jul 1, 2023

Test report by @branfosj
SUCCESS
Build succeeded for 1 out of 1 (1 easyconfigs in total)
bear-pg0203u29a.bear.cluster - Linux RHEL 8.6, x86_64, Intel(R) Xeon(R) Platinum 8360Y CPU @ 2.40GHz (icelake), 1 x NVIDIA NVIDIA A100-SXM4-80GB, 520.61.05, Python 3.6.8
See https://gist.github.com/branfosj/5ec005718096106da152da74a20ad755 for a full test report.

@akesandgren
Contributor Author

@branfosj ping?

@branfosj branfosj modified the milestones: 4.x, next release (4.8.1?) Jul 8, 2023
@branfosj
Member

branfosj commented Jul 8, 2023

Going in, thanks @akesandgren!

@branfosj branfosj merged commit 373e807 into easybuilders:develop Jul 8, 2023
5 checks passed
@akesandgren akesandgren deleted the 20230628160416_new_pr_HOOMD-blue401 branch July 9, 2023 09:27
5 participants