Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable collective MPI IO. #623

Merged
merged 7 commits into from Oct 28, 2022
Merged

Enable collective MPI IO. #623

merged 7 commits into from Oct 28, 2022

Conversation

1uc
Copy link
Collaborator

@1uc 1uc commented Oct 27, 2022

This PR adds the ability to pass a data transfer property list to read and write operations. This is needed to enable collective MPI-IO.

In essence this is #621

roblatham00 and others added 5 commits October 27, 2022 13:15
* Enable collective I/O when reading/writing slices

* Make test datatypes match in memory and in file

While it's perfectly legal for memory and file to differ, HDF5 will
"break collectives" and do the transfer independently.  See
H5Pget_mpio_no_collective_cause() documentation for other reasons
collectives might break.

* debugging:  report why collective I/O didn't happen
@1uc 1uc marked this pull request as ready for review October 27, 2022 13:12
@1uc 1uc requested a review from alkino October 27, 2022 13:12
matz-e
matz-e previously approved these changes Oct 27, 2022
Copy link
Member

@matz-e matz-e left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks decent to me.

@1uc
Copy link
Collaborator Author

1uc commented Oct 27, 2022

@roblatham00 would this variation work for your applications?

<< std::endl;
throw std::runtime_error("IO wasn't collective.");
} else {
std::cout << "Success! The operation was collective.\n";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adding an example of how to obtain the property list and query it. In a parallel test, this output will clutter the screen so can we delete the "Success" and emit text only if an error?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know this is an example, and clients can do whatever they want with highfive, but I think throwing a runtime error is overkill for this condition. The user requested collective i/o, HDF5 library was unable to perform I/o collectively (for whatever reason), but did perform I/O independently.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. The intention behind throwing is that we run the examples a part of our CI. So what you've spotted is that I'm abusing the examples as a test that the request for collective I/O was forwarded correctly to HDF5.

It doesn't cover collective reads, nor all of the entry points. I'll add proper test coverage for this; which frees up the example to be purely instructional. Thank you!

We now clearly state what it means that the operation failed to
use MPI collectives. Additionally, noisy output has been removed.
Add unit-tests for writing/reading collectively & independently.
@1uc
Copy link
Collaborator Author

1uc commented Oct 28, 2022

By merging, I'd like to free up other PRs which are blocked by this one. If there's any changes anyone would like to propose, please do so.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants