-
Notifications
You must be signed in to change notification settings - Fork 68
OSC portals4 : do not generate an EVENT_SEND to avoid to filter it #903
OSC portals4 : do not generate an EVENT_SEND to avoid to filter it #903
Conversation
|
Test PASSed. |
|
@francois-wellenreiter please assign a reviewer. Also is this an enhancement or a bug fix? |
|
@regrant could you please review this enhancement |
|
@francois-wellenreiter @regrant What version is this targeted for? Please see https://github.com/open-mpi/ompi/wiki/OmpiReleaseBotCommands for how to assign milestones, labels, and reviewers. Thanks! |
|
@francois-wellenreiter while there's no explicit use of the PTL_EVENT_SEND event, it's the only way that you can catch initiator side errors since send events are not counted (CT) either. This could lead to infinite polling on your CTWait call if you're waiting for something that will never complete. |
|
@regrant, I am really surprised, my uderstanding of the portals4 specification is that PTL_MD_EVENT_CT_ACK allows to count sucessfull AND unsuccessfull PTL_EVENT_ACK events. |
|
@francois-wellenreiter I wasn't referring to how the spec says things should work, you're right, that should happen exactly as described. I think I was unclear about what I was describing in the OMPI implementation. So the only time we call EQ_Get is in the progress callback, some of the functions depend on the op_count completion here, and so some of the functions rely on this for request completion. If you want to use CTs exclusively you can in places like the passive target ops (they already use CTs exclusively). It is the active target operations that use event counts to signal completion (and call the progress_callback function), so it is here where you need to know that there are no failures (you're not checking the ct.failures value). I misspoke here in my earlier comment it was not the CTWait, but this event count waiting that could be perturbed (also you should get an event back regardless of the disable flag, if it is a failure). If this is changed I expect there to be changes that will need to be made to the places where the progress_callback function is relied upon. Are you testing these functions with the code changes and everything is working? |
|
This checks out and should be merged. @hppritcha there should be no danger in including this in 2.0 if possible. |
|
given that this is highly portals specific and unlikely to cause additional bug fixes prior to 2.0.0 release I'm okay with this. @jsquyres lets discuss on Monday. |
|
I'm assuming @regrant meant: 👍 |
No description provided.