Skip to content

Conversation

@hjelmn
Copy link
Member

@hjelmn hjelmn commented Oct 7, 2015

This commit add the opal_progress_run_once function. This function
creates a new event that is put on the opal_sync_event_base that runs
the requested function as part of opal_progress().

Signed-off-by: Nathan Hjelm hjelmn@lanl.gov

…ress

This commit add the opal_progress_run_once function. This function
creates a new event that is put on the opal_sync_event_base that runs
the requested function as part of opal_progress().

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
@hjelmn
Copy link
Member Author

hjelmn commented Oct 7, 2015

@jsquyres @rhc54 @bosilca This will replace the run_in_main stuff in the openib btl. I can move this function down there if there if this doesn't seem like good functionality to have up in opal/runtime.

@hjelmn
Copy link
Member Author

hjelmn commented Oct 7, 2015

If this goes into master it should be rolled up with the rest of the commits associated with #805.

@jsquyres
Copy link
Member

jsquyres commented Oct 7, 2015

I'd have to check, but I have a dim recollection that the run_in_main stuff was only because of the lack of thread safety with mpool... if that's correct, then there's no real need for this run_in_main stuff any more.

@hjelmn
Copy link
Member Author

hjelmn commented Oct 7, 2015

I think some of it had to do with the timing in the thread servicing rdmacm events.

@hjelmn
Copy link
Member Author

hjelmn commented Oct 7, 2015

I should add this also helps with the case when opal_using_threads() is 0. If there is an error the error needs to be invoked on the application thread and not the opal progress thread since it may cause an upcall into the pml (only bfo did this).

@jsquyres
Copy link
Member

jsquyres commented Oct 7, 2015

Are you sure?

The thread was just to ensure that RDMACM events could be processed in a timely fashion, but the progress thread couldn't actually do anything once the incoming connection request was fully processed (and therefore connected). So it threw the connection back up to the main thread.

If mpool is made thread safe, the need for "do something in the main thread as a proxy for my progress thread" functionality goes away. (I think / disclaimer: I haven't looked at the code)

@jsquyres
Copy link
Member

jsquyres commented Oct 7, 2015

@hjelmn and I chatted on the phone and figured it all out.

@hjelmn
Copy link
Member Author

hjelmn commented Oct 7, 2015

Sounds like this is not a function worth exposing. Will put it back down into the btl.

@hjelmn hjelmn closed this Oct 7, 2015
@bosilca
Copy link
Member

bosilca commented Oct 7, 2015

👍 on the last @hjelmn comment. I don't think we want to encourage developers to start using single shot events on the main thread.

jsquyres pushed a commit to jsquyres/ompi that referenced this pull request Sep 19, 2016
…po_base_dist_graph_neighbors

topo/base: correctly support MPI_UNWEIGHTED in mca_topo_base_dist_gra…
@hjelmn hjelmn deleted the opal_progress_run_once branch July 12, 2018 16:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants