Description
Random errors occur on MPI_COMPARE_AND_SWAP when using pt2pt OSC.
Run my cswap.c at Gist with:
mpiexec -n 2 --mca osc pt2pt --mca btl self,vader ./cswap
You will see one of the following errors, or a different one, on rank 0.
cswap: ompi-src/ompi/mca/pml/ob1/pml_ob1_sendreq.h:251: send_request_pml_complete: Assertion `0 == sendreq->req_send.req_base.req_pml_complete' failed.
[mymachine:22183] *** Process received signal ***
[mymachine:22183] Signal: Aborted (6)
[mymachine:22183] Signal code: (-6)
[mymachine:22183] [ 0] /lib/x86_64-linux-gnu/libpthread.so.0(+0xf8d0)[0x7f66eeb2b8d0]
[mymachine:22183] [ 1] /lib/x86_64-linux-gnu/libc.so.6(gsignal+0x37)[0x7f66ee7a8107]
[mymachine:22183] [ 2] /lib/x86_64-linux-gnu/libc.so.6(abort+0x148)[0x7f66ee7a94e8]
[mymachine:22183] [ 3] /lib/x86_64-linux-gnu/libc.so.6(+0x2e226)[0x7f66ee7a1226]
[mymachine:22183] [ 4] /lib/x86_64-linux-gnu/libc.so.6(+0x2e2d2)[0x7f66ee7a12d2]
[mymachine:22183] [ 5] ompi/lib/openmpi/mca_pml_ob1.so(+0x19dcf)[0x7f66e5faddcf]
[mymachine:22183] [ 6] ompi/lib/openmpi/mca_pml_ob1.so(+0x1a7d4)[0x7f66e5fae7d4]
[mymachine:22183] [ 7] ompi/lib/openmpi/mca_pml_ob1.so(+0x1a8df)[0x7f66e5fae8df]
[mymachine:22183] [ 8] ompi/lib/openmpi/mca_btl_vader.so(+0x3ec2)[0x7f66e65d2ec2]
[mymachine:22183] [ 9] ompi/lib/openmpi/mca_btl_vader.so(mca_btl_vader_poll_handle_frag+0x68)[0x7f66e65d537d]
[mymachine:22183] [10] ompi/lib/openmpi/mca_btl_vader.so(+0x6598)[0x7f66e65d5598]
[mymachine:22183] [11] ompi/lib/openmpi/mca_btl_vader.so(+0x6753)[0x7f66e65d5753]
[mymachine:22183] [12] ompi/lib/libopen-pal.so.0(opal_progress+0xa9)[0x7f66ee18a0eb]
[mymachine:22183] [13] ompi/lib/openmpi/mca_pml_ob1.so(+0xd3ca)[0x7f66e5fa13ca]
[mymachine:22183] [14] ompi/lib/openmpi/mca_pml_ob1.so(+0xd5ab)[0x7f66e5fa15ab]
[mymachine:22183] [15] ompi/lib/openmpi/mca_pml_ob1.so(mca_pml_ob1_send+0x4ee)[0x7f66e5fa3554]
[mymachine:22183] [16] ompi/lib/libmpi.so.0(PMPI_Send+0x2a7)[0x7f66eeddc039]
[mymachine:22183] [17] ./cswap[0x400ad2]
[mymachine:22183] [18] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5)[0x7f66ee794b45]
[mymachine:22183] [19] ./cswap[0x400929]
[mymachine:22183] *** End of error message ***
[warn] opal_libevent2022_event_base_loop: reentrant invocation. Only one event_base_loop can run on each event_base at once.
*** Error in `./cswap': free(): invalid pointer: 0x00007fb506e2a240 ***
[mymachine:20230] *** Process received signal ***
[mymachine:20230] Signal: Aborted (6)
[mymachine:20230] Signal code: (-6)
[mymachine:20230] [ 0] /lib/x86_64-linux-gnu/libpthread.so.0(+0xf8d0)[0x7fb5068b68d0]
[mymachine:20230] [ 1] /lib/x86_64-linux-gnu/libc.so.6(gsignal+0x37)[0x7fb506533107]
[mymachine:20230] [ 2] /lib/x86_64-linux-gnu/libc.so.6(abort+0x148)[0x7fb5065344e8]
[mymachine:20230] [ 3] /lib/x86_64-linux-gnu/libc.so.6(+0x73204)[0x7fb506571204]
[mymachine:20230] [ 4] /lib/x86_64-linux-gnu/libc.so.6(+0x789de)[0x7fb5065769de]
[mymachine:20230] [ 5] /lib/x86_64-linux-gnu/libc.so.6(+0x796e6)[0x7fb5065776e6]
[mymachine:20230] [ 6] ompi/lib/openmpi/mca_pml_ob1.so(+0xe260)[0x7fb5011c4260]
[mymachine:20230] [ 7] ompi/lib/openmpi/mca_pml_ob1.so(mca_pml_ob1_send+0x504)[0x7fb5011c556a]
[mymachine:20230] [ 8] ompi/lib/libmpi.so.0(PMPI_Send+0x2a7)[0x7fb506b67039]
[mymachine:20230] [ 9] ./cswap[0x400ad2]
[mymachine:20230] [10] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5)[0x7fb50651fb45]
[mymachine:20230] [11] ./cswap[0x400929]
[mymachine:20230] *** End of error message ***
These errors occur on both Open MPI master (pt2pt OSC) and the v1.8 branch (rdma OSC). I have not confirmed it, but the v2.x branch (pt2pt OSC) and the v1.10 branch (rdma OSC) probably have the same problem.
(The rdma OSC was renamed to pt2pt OSC on master and v2.0, and a new rdma OSC was introduced on master.)
The cause is related to the ob1 PML blocking send optimization and the recursive send operation via the request completion callback (ompi_request_t::req_complete_cb).
In my cswap.c, the following steps are taken.
1. On rank 0 (and rank 1), MPI_Win_create is called and the callback function ompi_osc_pt2pt_callback is registered for a request returned by the mca_pml_ob1_irecv_init function, which is called by the ompi_osc_pt2pt_frag_start_receive function.
2. On rank 1, MPI_Compare_and_swap is called and this function sends a control message of type OMPI_OSC_PT2PT_HDR_TYPE_CSWAP to rank 0.
3. On rank 0, MPI_Send is called and the special request mca_pml_ob1_sendreq is used for this call in the mca_pml_ob1_send function.
4. On rank 0, the ompi_request_wait_completion function is called for the request if the mca_pml_ob1_send_inline function cannot send the message immediately. This function blocks until the send operation completes.
5. On rank 0, the ompi_osc_pt2pt_callback function registered in step 1 is called when the control message from step 2 arrives.
6. On rank 0, the mca_pml_ob1_send function is called again (recursively) to send back a control message in the ompi_osc_pt2pt_cswap_start function.
7. On rank 0, the special request mca_pml_ob1_sendreq is used again even though it is still in use.
8. On rank 0, something goes wrong, producing errors like those shown above.
A stack trace at step 7 will look something like this:
MPI_Send
mca_pml_ob1_send
ompi_request_wait_completion // wait for the send operation
opal_condition_wait
opal_progress
(BTL progress function)
mca_pml_ob1_recv_frag_callback_match // control message for CSWAP
recv_request_pml_complete // completion of irecv_init
ompi_request_complete
ompi_osc_pt2pt_callback // callback of irecv_init
process_frag
process_cswap
ompi_osc_pt2pt_cswap_start
mca_pml_ob1_send // recursive send operation
I confirmed that the error doesn't occur if I replace MCA_PML_CALL(send(...)) with MCA_PML_CALL(isend(...)) in the ompi_osc_pt2pt_cswap_start function. But I don't think that is the real fix. If we allow recursive calls of the mca_pml_ob1_send function, we should change how mca_pml_ob1_sendreq is managed. For example, the mca_pml_ob1_send function could check the request state of mca_pml_ob1_sendreq and not use it if it is already in use.
Although I used the vader BTL above, this error is not specific to it. On my machine it occurs very often with the vader BTL, and sometimes with the openib BTL as well.