Some libfabric transports in the OFI MTL do not handle "arbitrarily-sized" messages #7058

Closed
jsquyres opened this issue Oct 8, 2019 · 1 comment

Comments

@jsquyres
jsquyres commented Oct 8, 2019

Per #6976, the PSM2 transport in the OFI MTL does not handle messages larger than 4 GB (other OFI providers do). The verbs libfabric provider has a similar issue: it is limited to 2 GB messages.

PRs on master/v4.0.x/v3.1.x/v3.0.x were added to make the OFI MTL obey the libfabric property specifying the maximum message size (ep_attr->max_msg_size): #7003, #7004, #7005, #7006. This at least makes Open MPI fail noisily when asked to send large messages in this case (vs. failing silently, which is what it was doing before).
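
For readers who want a picture of what "obeying max_msg_size" amounts to, here is a minimal sketch of such a check. It is not the code from the PRs listed above: the function name `sketch_check_send_size` and the error constants are hypothetical, and the limit is passed in as a plain integer rather than read from libfabric's `ep_attr->max_msg_size`, so the snippet compiles without libfabric headers.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical return codes for this sketch; not Open MPI's real constants. */
enum { SKETCH_SUCCESS = 0, SKETCH_ERR_TOO_LARGE = -1 };

/*
 * Refuse a send whose length exceeds the provider's advertised limit.
 * In a real libfabric program, max_msg_size would presumably come from
 * the ep_attr->max_msg_size field mentioned above; here it is a parameter.
 */
static int sketch_check_send_size(uint64_t length, uint64_t max_msg_size)
{
    if (length > max_msg_size) {
        /* Fail noisily instead of silently sending a truncated message. */
        fprintf(stderr,
                "message of %" PRIu64 " bytes exceeds provider limit of %"
                PRIu64 " bytes\n",
                length, max_msg_size);
        return SKETCH_ERR_TOO_LARGE;
    }
    return SKETCH_SUCCESS;
}
```

With a 2 GB limit, as described for the verbs provider, a 4 GB send would be rejected with an error message rather than being passed down to the provider.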

This is a good stop-gap solution, but the proper solution is to allow MPI applications to send "arbitrarily-sized" messages (i.e., messages only limited by resources such as memory space).
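
To illustrate what handling "arbitrarily-sized" messages could involve, the sketch below computes how a large message might be split into fragments that each fit within a provider's limit. This is only one possible approach and an assumption on my part; it is not something Open MPI or libfabric is stated to do in this thread, and the helper names are hypothetical. It also ignores the hard part (preserving MPI matching and ordering semantics across fragments).

```c
#include <stdint.h>

/*
 * Number of fragments needed to carry `length` bytes when no fragment
 * may exceed `max_msg_size` bytes. Assumes max_msg_size > 0.
 * A zero-length message still needs one (empty) fragment.
 */
static uint64_t sketch_fragment_count(uint64_t length, uint64_t max_msg_size)
{
    if (length == 0) {
        return 1;
    }
    return length / max_msg_size + (length % max_msg_size != 0 ? 1 : 0);
}

/*
 * Size in bytes of fragment number `index` (0-based).
 * Assumes index < sketch_fragment_count(length, max_msg_size).
 */
static uint64_t sketch_fragment_size(uint64_t length, uint64_t max_msg_size,
                                     uint64_t index)
{
    uint64_t offset = index * max_msg_size;   /* bytes already covered */
    uint64_t remaining = length - offset;     /* bytes still to send */
    return remaining < max_msg_size ? remaining : max_msg_size;
}
```

For example, under these assumptions a 5 GiB message with a 2 GiB limit would be carried as two full 2 GiB fragments followed by one 1 GiB fragment.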

@mwheinz mwheinz changed the title PSM2 transport in the OFI MTL needs to handle messages >2B in size PSM2 transport in the OFI MTL needs to handle messages >4GB in size Oct 8, 2019
@jsquyres jsquyres changed the title PSM2 transport in the OFI MTL needs to handle messages >4GB in size Some transports in the OFI MTL do not handle "arbitrarily-sized" messages Oct 8, 2019
@jsquyres jsquyres changed the title Some transports in the OFI MTL do not handle "arbitrarily-sized" messages Some libfabric transports in the OFI MTL do not handle "arbitrarily-sized" messages Oct 8, 2019
bwbarrett added a commit to bwbarrett/ompi that referenced this issue Mar 17, 2021
Following up on discussion around Issue open-mpi#7058, add a note to the
README about the OFI MTL's handling of max_msg_size.

Signed-off-by: Brian Barrett <bbarrett@amazon.com>
bwbarrett added a commit to bwbarrett/ompi that referenced this issue Mar 17, 2021
Following up on discussion around Issue open-mpi#7058, add a note to the
README about the OFI MTL's handling of max_msg_size.

Signed-off-by: Brian Barrett <bbarrett@amazon.com>
(cherry picked from commit 51c0053)
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
@bwbarrett
As we discussed in the dev meeting yesterday, I added a README note saying that we're not going to fix this in Open MPI, and that customers should approach their provider developers to fix it there.
