MPI_File_Write* bug due to uncatched write(v) calls for iov_len > 2,147,479,552 bytes #2399
Comments
Christoph, thank you for your bug report and analysis. I think the second mca_io_ompio_cycle_buffer_size to something (e.g. 1 GB or similar, value has to be given in bytes This would not solve the issue for non-blocking operations. We do have Also as a side note, I am fairly confident that the problem should not Thanks Edgar On 11/11/2016 3:58 AM, cniethammer wrote:
Hello Edgar, Thanks for pointing out this option. I can confirm that setting Best
Christoph,
@edgargabriel @cniethammer If you guys want this in v2.0.2, please come to a decision ASAP (i.e., today/tomorrow). Otherwise we're going to have to push this to v2.0.3.
this is the bug fix that was merged a couple of days back with the default value for the parameter. So it is done for now, I will work on the second solution that we discussed over the break.
Ah! Ok, my bad -- I should have read closer / remembered better. 😊
@edgargabriel @cniethammer It's been about 2 months since the last update -- any progress? The door for v2.1.0 is just about closed. I'm going to pre-emptively move this to v2.1.1; feel free to move to v3.0.0 if you think that is more realistic.
feel free to move it to 3.0, I will not get to it. The fix that we committed for the 2.0 series (and 2.1, but will double check that), should do the trick for now.
@edgargabriel Done; thanks.
@edgargabriel should we just move this to future or do you think you can get a fix in to 3.0.1?
@hppritcha it is on my to do list for this summer, but I don't think it will make it for 3.0.1, I hope to have it ready for 3.1 (or 4.0 whatever the next release is).
There is a bug with MPI_File_write* operations in Open MPI which truncates files or writes incorrect data to files.
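I traced the problem to the POSIX writev call used in ompi/mca/fbtl/posix/fbtl_posix_pwritev.c, which will not write more than 2,147,479,552 bytes to disk in a single call. From man 2 write NOTES:
"On Linux, write() (and similar system calls) will transfer at most 0x7ffff000 (2,147,479,552) bytes, returning the number of bytes actually transferred. (This is true on both 32-bit and 64-bit systems.)"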
My suggestions:
Either (1) check that each write completes with the expected number of bytes written and, if it does not, issue another write for the remainder, or (2) modify the convertor used in mca_io_ompio_build_io_array/ompi_io_ompio_decode_datatype to build an iov struct whose elements are no larger than 2,147,479,552 bytes when using POSIX I/O on Linux systems.
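To illustrate suggestion (1), here is a minimal sketch of a retry loop that keeps writing until the whole buffer has reached the file. It is not the fix that went into Open MPI. The names pwrite_full and MAX_RW_CHUNK are hypothetical, and it uses plain pwrite on one contiguous buffer rather than pwritev on an iovec array to keep the example short. Capping each request at 0x7ffff000 bytes is optional, since the loop already handles short writes, but it makes the per-call limit explicit.

```c
#define _XOPEN_SOURCE 700   /* for pwrite() when compiling in strict ISO C mode */

#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical name: largest request passed to a single pwrite() call.
 * 0x7ffff000 = 2,147,479,552 bytes, the per-call limit on Linux. */
#define MAX_RW_CHUNK ((size_t)0x7ffff000)

/* Hypothetical helper: write exactly `len` bytes from `buf` to `fd`,
 * starting at file position `offset`.
 * Returns 0 once everything is written, -1 on error (errno is set). */
static int pwrite_full(int fd, const void *buf, size_t len, off_t offset)
{
    const char *p = buf;

    while (len > 0) {
        size_t chunk = len < MAX_RW_CHUNK ? len : MAX_RW_CHUNK;
        ssize_t n = pwrite(fd, p, chunk, offset);

        if (n < 0) {
            if (errno == EINTR)
                continue;        /* interrupted before any data moved: retry */
            return -1;           /* real error, errno set by pwrite() */
        }
        if (n == 0) {
            errno = EIO;         /* no progress: fail instead of looping forever */
            return -1;
        }

        /* Short write: advance past what was written and go around again. */
        p      += n;
        len    -= (size_t)n;
        offset += n;
    }
    return 0;
}
```

A caller would check the return value instead of assuming that one call transferred everything, for example `if (pwrite_full(fd, data, nbytes, file_offset) != 0) { /* report errno */ }`. Applying the same idea to pwritev would additionally require advancing the iovec array past the bytes already written, which is omitted here.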