MPI_File_read_at_all doesn't read all values when certain datatypes are used #10546
Comments
Forgot to add: if all ranks read the entire file, then the program passes. This implies that the complicated datatype on rank 1 is interfering with I/O on the other ranks (but not on the rank with the complicated datatype).
@edgargabriel Can you have a look?
@fortnern thanks for the report and the reproducer! Meanwhile, you can force using the ROMIO component instead of the (native) ompio component:
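A sketch of the likely command line, assuming the reproducer binary is `./reproducer` (the component name `romio321` is the ROMIO version shipped with the 4.1 series):

```shell
# Exclude ompio so the io framework falls back to ROMIO
mpirun --mca io ^ompio -np 3 ./reproducer
# or select the ROMIO component explicitly
mpirun --mca io romio321 -np 3 ./reproducer
```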
I will have a look, but it might take a couple of days until I get to it.
Test passes with ROMIO, looks like the problem is with OMPIO. Thanks!
This commit fixes the calculation of the buffer length that needs to be read when using data sieving. The original code implicitly assumed that the ub of the iov at index j+1 is larger than the ub of the iov at index j. This is not necessarily the case for read operations. Hence, the code needs to keep track of the max. ub found. Fixes issue open-mpi#10546 Signed-off-by: Edgar Gabriel <edgar.gabriel1@outlook.com>
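A hedged sketch of the logic the commit message describes; the names below are illustrative, not the actual ompio source. Each iovec entry covers file bytes `[offset, offset + len)`, and the sieved read must span up to the maximum upper bound over all entries, because for read operations entry j+1 may end before entry j:

```c
#include <stddef.h>

struct file_iov { size_t offset; size_t len; };

/* Compute how many bytes a single sieved read must cover. */
size_t sieving_read_length(const struct file_iov *iov, int count)
{
    size_t lb = iov[0].offset;
    size_t max_ub = iov[0].offset + iov[0].len;
    for (int j = 1; j < count; j++) {
        size_t ub = iov[j].offset + iov[j].len;
        if (ub > max_ub) max_ub = ub;   /* track the max ub, not the last one */
        if (iov[j].offset < lb) lb = iov[j].offset;
    }
    return max_ub - lb;                 /* length covering every entry */
}
```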
@fortnern thank you for the bug report, I can confirm that it was a bug in ompio, introduced with the data sieving feature in the 4.1 release series. I will file a PR for the 4.1 and 5.0 releases once the code has been committed to the main repository. The 4.0 release and earlier ompi releases did not have this feature. Just for documentation purposes, it is possible to circumvent this issue in the 4.1 release by setting an MCA parameter that disables data sieving for reads.
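A sketch of the likely workaround command; the parameter name below is an assumption based on the data-sieving code in the posix fbtl component that the commit above touches:

```shell
# Disable data sieving for reads in the posix fbtl (parameter name assumed)
mpirun --mca fbtl_posix_read_datasieving 0 -np 3 ./reproducer
```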
This commit fixes the calculation of the buffer length that needs to be read when using data sieving. The original code implicitly assumed that the ub of the iov at index j+1 is larger than the ub of the iov at index j. This is not necessarily the case for read operations. Hence, the code needs to keep track of the max. ub found. Fixes issue open-mpi#10546 Signed-off-by: Edgar Gabriel <edgar.gabriel1@outlook.com> (cherry picked from commit 6891cee)
@fortnern Looks like the fix was merged into the v4.1.x branch -- could you test the latest v4.1.x snapshot to see if the issue is fixed for you? https://www.open-mpi.org/nightly/v4.1.x/
Hi all, I just tested @fortnern's test program here, as well as some internal test programs, with the latest v4.1.x snapshot.
We've confirmed the test program works on the latest snapshot. Thanks! We'll test the HDF5 branch in question with it soon to verify. |
The multi dataset branch of HDF5 that was failing with 4.1.2 passes with the latest master. Thanks again! |
Background information
While testing a new feature in HDF5, we noticed that Open MPI sometimes fails a test that generates random I/O patterns in HDF5. I've tried to create a minimal pure-MPI example that reproduces the failure.
What version of Open MPI are you using? (e.g., v3.0.5, v4.0.2, git branch name and hash, etc.)
Tested on 4.1.1, 4.1.2, 4.1.3, 4.1.4
Describe how Open MPI was installed (e.g., from a source/distribution tarball, from a git clone, from an operating system distribution package, etc.)
From source
If you are building/installing from a git clone, please copy-n-paste the output from
git submodule status
N/A
Please describe the system on which you are running
Tested on multiple Linux machines, running on a single node.
Details of the problem
The test program here should be run with 3 ranks. It consists of 2 phases: first, all 3 ranks participate in a collective write, with rank 0 writing 5 integers to the file and ranks 1 and 2 writing nothing. Next, the file is closed and reopened, and all 3 ranks participate in a collective read. Rank 0 reads all 5 integers without constructing a datatype. Rank 2 reads nothing. Rank 1 constructs a datatype using MPI_Type_contiguous, MPI_Type_vector, MPI_Type_create_hindexed, and then MPI_Type_create_resized to select the middle 3 integers (this series of calls mimics how HDF5 constructs a datatype for hyperslab selections; a sketch reconstructing the program appears below). Ranks 1 and 2 see the expected data, but rank 0 does not read the last integer. This test passes with MPICH.
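A minimal sketch reconstructing the attached reproducer from the description above; the exact file name, buffer contents, and datatype parameters are assumptions:

```c
/* Run with 3 ranks, e.g. mpirun -np 3 ./reproducer */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, i, buf[5];
    int counts[3] = {5, 3, 0};                 /* read counts per rank */
    MPI_File fh;
    MPI_Datatype filetype = MPI_INT;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Phase 1: rank 0 writes 5 integers collectively, ranks 1-2 write 0. */
    for (i = 0; i < 5; i++) buf[i] = i;
    MPI_File_open(MPI_COMM_WORLD, "testfile",
                  MPI_MODE_CREATE | MPI_MODE_RDWR, MPI_INFO_NULL, &fh);
    MPI_File_write_at_all(fh, 0, buf, rank == 0 ? 5 : 0, MPI_INT, &status);
    MPI_File_close(&fh);

    /* Rank 1 selects the middle 3 integers (file bytes 4..15) with the
     * same chain of calls HDF5 uses for hyperslab selections. */
    if (rank == 1) {
        MPI_Datatype contig, vec, hidx;
        int blocklen = 1;
        MPI_Aint disp = sizeof(int);               /* skip the first int */

        MPI_Type_contiguous(3, MPI_INT, &contig);  /* 3 contiguous ints   */
        MPI_Type_vector(1, 1, 1, contig, &vec);    /* single-block vector */
        MPI_Type_create_hindexed(1, &blocklen, &disp, vec, &hidx);
        MPI_Type_create_resized(hidx, 0, (MPI_Aint)(5 * sizeof(int)),
                                &filetype);        /* extent = whole file */
        MPI_Type_commit(&filetype);
        MPI_Type_free(&contig);
        MPI_Type_free(&vec);
        MPI_Type_free(&hidx);
    }

    /* Phase 2: reopen and read collectively; rank 0 reads all 5 ints
     * without a constructed datatype, rank 2 reads nothing. */
    for (i = 0; i < 5; i++) buf[i] = -1;
    MPI_File_open(MPI_COMM_WORLD, "testfile", MPI_MODE_RDONLY,
                  MPI_INFO_NULL, &fh);
    MPI_File_set_view(fh, 0, MPI_INT, filetype, "native", MPI_INFO_NULL);
    MPI_File_read_at_all(fh, 0, buf, counts[rank], MPI_INT, &status);
    MPI_File_close(&fh);

    /* With the bug, rank 0's buf[4] stays -1 instead of becoming 4. */
    for (i = 0; i < counts[rank]; i++) {
        int expected = (rank == 1) ? i + 1 : i;
        if (buf[i] != expected)
            printf("rank %d: buf[%d] = %d, expected %d\n",
                   rank, i, buf[i], expected);
    }

    if (rank == 1) MPI_Type_free(&filetype);
    MPI_Finalize();
    return 0;
}
```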
I have verified that the file contains the correct data. The test also fails if you run it once to create the file (check it with a hex editor if you want), then comment out the write section and run again. If the program is run with 2 ranks it passes, even though the excluded rank (rank 2) does nothing but participate in collective calls. If rank 2 is modified to read the entire file like rank 0, then both ranks 0 and 2 fail to read the last integer.