rml assertion failures on v2.0.0 #1506

@jsquyres

On Cisco MTT, with both the TCP and usNIC BTLs, I'm getting thousands of RML assertion failures that look like this (this particular test is the hello_c program):

 Warning :: opal_list_remove_item - the item 0x7373a0 is not on the list 0x2aaaab3675c0 
c_hello: base/rml_base_msg_handlers.c:83: orte_rml_base_post_recv: Assertion `((0xdeafbeedULL << 32)
+ 0xdeafbeedULL) == ((opal_object_t *) (recv))->obj_magic_id' failed.
[mpi028:21351] *** Process received signal ***
[mpi028:21351] Signal: Aborted (6)
[mpi028:21351] Signal code:  (-6)
[mpi028:21351] [ 0] /lib64/libpthread.so.0[0x3e2fe0f710]
[mpi028:21351] [ 1] /lib64/libc.so.6(gsignal+0x35)[0x3e2fa32925]
[mpi028:21351] [ 2] /lib64/libc.so.6(abort+0x175)[0x3e2fa34105]
[mpi028:21351] [ 3] /lib64/libc.so.6[0x3e2fa2ba4e]
[mpi028:21351] [ 4] /lib64/libc.so.6(__assert_perror_fail+0x0)[0x3e2fa2bb10]
[mpi028:21351] [ 5] /home/mpiteam/scratches/community/2016-03-28cron/RLAY/installs/bFje/install/lib/libopen-rte.so.20(orte_rml_base_post_recv+0x1d4)[0x2aaaab0d067f]
[mpi028:21351] [ 6] /home/mpiteam/scratches/community/2016-03-28cron/RLAY/installs/bFje/install/lib/libopen-pal.so.20(+0xef745)[0x2aaaab458745]
[mpi028:21351] [ 7] /home/mpiteam/scratches/community/2016-03-28cron/RLAY/installs/bFje/install/lib/libopen-pal.so.20(+0xef9b7)[0x2aaaab4589b7]
[mpi028:21351] [ 8] /home/mpiteam/scratches/community/2016-03-28cron/RLAY/installs/bFje/install/lib/libopen-pal.so.20(opal_libevent2022_event_base_loop+0x298)[0x2aaaab45900a]
[mpi028:21351] [ 9] /home/mpiteam/scratches/community/2016-03-28cron/RLAY/installs/bFje/install/lib/libopen-pal.so.20(+0x4c3d4)[0x2aaaab3b53d4]
[mpi028:21351] [10] /lib64/libpthread.so.0[0x3e2fe079d1]
[mpi028:21351] [11] /lib64/libc.so.6(clone+0x6d)[0x3e2fae8b6d]
[mpi028:21351] *** End of error message ***
--------------------------------------------------------------------------
mpirun noticed that process rank 16 with PID 21351 on node mpi028 exited on signal 6 (Aborted).
--------------------------------------------------------------------------
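For anyone reading the assertion text: it compares the object's `obj_magic_id` field against a fixed 64-bit constant. Below is a minimal sketch of that kind of debug check, assuming a simplified object layout. The names `demo_object_t`, `demo_obj_new`, `demo_obj_destruct`, and `demo_obj_is_valid` are hypothetical and are not Open MPI's actual definitions; only the constant and the field name are taken from the message above.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Same constant that appears in the assertion message above. */
#define DEMO_MAGIC_ID ((0xdeafbeedULL << 32) + 0xdeafbeedULL)

/* Hypothetical stand-in for an object header; NOT the real opal_object_t. */
typedef struct demo_object {
    uint64_t obj_magic_id; /* stamped at construction, cleared at destruction */
    int      payload;
} demo_object_t;

/* Allocate an object and stamp it with the magic id. */
static demo_object_t *demo_obj_new(int payload)
{
    demo_object_t *obj = malloc(sizeof(*obj));
    if (obj == NULL) {
        return NULL;
    }
    obj->obj_magic_id = DEMO_MAGIC_ID;
    obj->payload = payload;
    return obj;
}

/* Mark the object as destructed. The memory is deliberately left allocated
 * so the stale object can be inspected safely in this demo; the caller
 * still has to free() it. */
static void demo_obj_destruct(demo_object_t *obj)
{
    obj->obj_magic_id = 0;
}

/* The check itself: a non-NULL pointer whose magic id is intact. */
static int demo_obj_is_valid(const demo_object_t *obj)
{
    return obj != NULL && obj->obj_magic_id == DEMO_MAGIC_ID;
}
```

In this sketch, a failed check means the pointer refers to something that was never constructed or has already been destructed. Whether that is what happens in `orte_rml_base_post_recv` here is not established by the log alone.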

I don't think that this is related to the BTL, but here are separate links, anyway:
