Fix a segfault occurring in IMB Alltoall if tuned is unselected #1015
module_data to hold one with the largest necessary size. This array is only allocated when needed, and it is released upon communicator destruction. (cherry picked from commit a324602)
number of requests. Remove unnecessary fields. We don't need these fields. (cherry picked from commit 01b32ca)
in the base). Correctly deal with persistent requests (they must be always freed when they are stored in the request array associated with the communicator). Always use MPI_STATUS_IGNORE for single request waiting functions. (cherry picked from commit 88492a1)
|
Test PASSed. |
|
@derbeyn Please mark a milestone and appropriate labels for this issue. See https://github.com/open-mpi/ompi/wiki/OmpiReleaseBotCommands. If this issue is a must-fix for v2.0.0, please add the "blocker" label to this issue. @derbeyn These commits look like they came from @bosilca; are you saying that you have reviewed them? (per our requirement that all PRs to release branches must be reviewed) |
|
For the labels, I didn't know I had to add them myself. |
|
@derbeyn Thanks! |
|
Jeff, I just finished reviewing the 3 commits, and I have some remarks. |
|
@derbeyn Not a stupid question at all! Click on the "Files changed" tab on this PR, and click on the "+" sign on any line number where you want to ask a question. It'll pop open a text box where you can type your comment / question / etc. Feel free to put "@bosilca" in there and use like 1st person address -- it'll guarantee to send him a mail with any comment that contains @bosilca. All your comments will also show up here on the main PR tab, too, with some code around it for context. It's a very handy way to comment on exactly the code you're talking about. Make sense? |
|
BTW, I see that Github just recently added some changes to the "Files changed" tab -- you can view the changes per commit (e.g., there are 3 commits on this PR), and/or by file (vs. looking at the union of all changes / all diffs on this PR). So if it helps you break up your review to look by commit and/or by individual file, you can do so. |
|
Thx a lot! |
| ompi_request_t** requests; | ||
|
|
||
| requests = (ompi_request_t**)malloc( size * sizeof(ompi_request_t*) ); | ||
| requests = coll_base_comm_get_reqs(module->base_data, size); |
@bosilca (minor issue) Missing check on the pointer returned by coll_base_comm_get_reqs()
Yes, we don't have checks regarding the return from the coll_base_comm_get_reqs anywhere.
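The concern in this thread can be illustrated with a minimal sketch of a request array that is cached per communicator, grown on demand, and checked by the caller before use. All `demo_*` names and the `-1` error value are hypothetical stand-ins; this is not the Open MPI implementation of `coll_base_comm_get_reqs()`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; these are not the Open MPI types. */
typedef struct demo_request demo_request_t;

typedef struct {
    demo_request_t **reqs;   /* cached array, NULL until first use */
    int              nreqs;  /* current capacity, in entries */
} demo_comm_data_t;

/* Return an array with room for at least nreqs entries, or NULL on failure. */
static demo_request_t **demo_get_reqs(demo_comm_data_t *data, int nreqs)
{
    assert(NULL != data);
    if (nreqs <= 0) {
        return NULL;
    }
    if (data->nreqs >= nreqs) {
        return data->reqs;               /* large enough already: reuse it */
    }
    demo_request_t **grown = realloc(data->reqs, (size_t)nreqs * sizeof(*grown));
    if (NULL == grown) {
        return NULL;                     /* the old array is still owned by data */
    }
    for (int i = data->nreqs; i < nreqs; i++) {
        grown[i] = NULL;                 /* mark the new slots as unused */
    }
    data->reqs  = grown;
    data->nreqs = nreqs;
    return grown;
}

/* Release the cached array; intended to run once, at communicator teardown. */
static void demo_release_reqs(demo_comm_data_t *data)
{
    free(data->reqs);
    data->reqs  = NULL;
    data->nreqs = 0;
}

/* Caller side: check the returned pointer before posting anything. */
static int demo_collective(demo_comm_data_t *data, int size)
{
    demo_request_t **reqs = demo_get_reqs(data, size);
    if (NULL == reqs) {
        return -1;                       /* stand-in for OMPI_ERR_OUT_OF_RESOURCE */
    }
    /* ... post and wait on up to size requests here ... */
    return 0;
}
```

Because the array lives in the per-communicator data, a collective that needs fewer entries than a previous call reuses the existing allocation instead of calling `malloc` each time.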
| preq = coll_base_comm_get_reqs(base_module->base_data, size * 2); | ||
|
|
||
| if (i == rank) { | ||
| /* Copy the data into the temporary buffer */ |
@bosilca (line 77 above): no need to allocate size*2 requests, we are only posting 2 requests in each turn of the loop
|
@derbeyn you might want to cherry-pick open-mpi/ompi@4b38b6bd. It addresses most of your concerns, except the one where you suggest to move the check for NULL in the free_reqs. |
|
@bosilca OK, Thanks |
|
And also open-mpi/ompi@57eadb0d to fix an issue identified by Coverity. |
|
Thanks! |
|
@derbeyn Could you add those 2 commits to this PR in the very near future? We're trying to release the next rc for v2.0.0. Do you know how to add these 2 commits to this PR? You should be able to cherry-pick them from master (see the wiki) and then push them to your branch here. The PR will automatically update (because a PR always reflects the current state of a branch -- so if you add/change commits on that branch, they will automatically be reflected here on the PR). |
|
@derbeyn Any update, perchance? |
This patch addresses most (if not all) of @derbeyn's concerns expressed on open-mpi#1015. I added checks for the requests allocation in all functions, ompi_coll_base_free_reqs is called with the right number of requests, I removed the unnecessary basic_module_comm_t and use the base_module_comm_t instead, I removed all uses of the COLL_BASE_BCAST_USE_BLOCKING define, and made other minor fixes. (cherry picked from commit 4b38b6b)
|
So sorry for the late answer! We had a long weekend for Easter, so I didn't connect. |
|
Test PASSed. |
|
|
||
| /* Initiate all send/recv to/from others. */ | ||
| reqs = coll_base_comm_get_reqs(base_module->base_data, 2); | ||
| if( NULL == reqs ) { err = OMPI_ERR_OUT_OF_RESOURCE; line = __LINE__; goto error_hndl; } |
@bosilca
Why not just return OMPI_ERR_OUT_OF_RESOURCE here?
If you don't, you should check reqs!=NULL before calling free_reqs() at error_hndl
(or insert a check inside the routine free_reqs() as proposed in the first review)
2 reasons:
- I prefer to have a single return statement for errors
- to get access to the error message.
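The trade-off discussed here can be sketched as follows: every failure jumps to one labelled exit that reports the recorded line, and the cleanup helper accepts NULL so the error path can call it unconditionally. The `demo_*` names, the `DEMO_*` codes, and the `fail_step` parameter are hypothetical and exist only to make the sketch self-contained; none of them is Open MPI code.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define DEMO_SUCCESS               0
#define DEMO_ERROR               (-1)   /* hypothetical error codes */
#define DEMO_ERR_OUT_OF_RESOURCE (-2)

/* Cleanup helper that tolerates NULL, so the error path needs no extra check. */
static void demo_free_reqs(void **reqs, int count)
{
    assert(count >= 0);
    if (NULL == reqs) {
        return;
    }
    for (int i = 0; i < count; i++) {
        free(reqs[i]);                   /* free(NULL) is a no-op */
        reqs[i] = NULL;
    }
}

/* fail_step: 0 = no failure, 1 = array allocation fails, 2 = later step fails. */
static int demo_exchange(int fail_step)
{
    int err = DEMO_SUCCESS, line = -1;
    void **reqs = NULL;

    reqs = (1 == fail_step) ? NULL : malloc(2 * sizeof(*reqs));
    if (NULL == reqs) { err = DEMO_ERR_OUT_OF_RESOURCE; line = __LINE__; goto error_hndl; }
    reqs[0] = NULL;
    reqs[1] = NULL;

    reqs[0] = malloc(16);                /* stand-in for posting a request */
    if (NULL == reqs[0] || 2 == fail_step) { err = DEMO_ERROR; line = __LINE__; goto error_hndl; }

    demo_free_reqs(reqs, 2);
    free(reqs);
    return DEMO_SUCCESS;

 error_hndl:
    /* Single exit for all failures: the message carries the failing line. */
    fprintf(stderr, "%s:%4d\tError occurred %d\n", __FILE__, line, err);
    demo_free_reqs(reqs, 2);             /* safe even when reqs is NULL */
    free(reqs);
    return err;
}
```

An early `return` would skip the diagnostic message. The single exit keeps it, at the cost of requiring the cleanup code to cope with a NULL `reqs`.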
Fix CID 1325868 (#1 of 1): Dereference after null check (FORWARD_NULL): Fix CID 1325869 (#1-2 of 2): Dereference after null check (FORWARD_NULL): Here reqs can indeed be NULL. Added a check to ompi_coll_base_free_reqs to prevent dereferencing NULL pointer. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov> (cherry picked from commit 2f4e532)
(cherry picked from commit 004c0cc)
|
Test PASSed. |
|
@jsquyres can you please tell me 2 things:
|
|
@derbeyn Here's all the labels on the ompi-release repo: https://github.com/open-mpi/ompi-release/labels (i.e., go to the github repo, go into the pull requests, and click on the "labels" button). We probably don't have a wiki page describing the process (probably should add that...). Generally, a PR on ompi-release needs to be reviewed and RM approved, and then we'll merge it in. The "thumbs up" emoticon adds "reviewed" and removes "pushed-back". The thumbs down is the opposite (it adds "pushed-back" and removes "reviewed"). These labels are just a simple visual way for someone to scan the list of PRs in https://github.com/open-mpi/ompi-release/pulls and know what still needs to be done. Loosely speaking:
Does that help? |
|
Thanks a lot! |
|
I'm OK with the review |
|
@hppritcha Good to go |

If tuned is unselected, it's up to the basic module to handle the collectives.
The problem is that the request arrays are not consistently used in that module.
Cherry picking three commits fixes the issue.