Upgrade write_vtu_in_parallel based on mpi large IO update #13673
This is WIP and depends on #13611.
I will remove all the extra comments and whitespace afterwards, so don't worry!
d5cb7de to 0848d2c
2ae3559 to e4b1579
source/base/data_out_base.cc
Outdated
ierr = MPI_File_write_at(fh,
                         footer_offset,
                         ss.str().c_str(),
                         footer_size,
                         MPI_CHAR,
                         MPI_STATUS_IGNORE);
Is this the right function call, or don't you need one from the big_mpi library?
To the best of my knowledge, the header and footer are very small, so it will not make a difference whether we use write_at or write_at_c (the one in the big_mpi library) :)
I agree with Pengfei.
Could we do it anyway, just for consistency and so that people like me don't have to think about the question in the future? :-)
Sure!
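For readers wondering why the choice of call matters at all: as far as I know, `MPI_File_write_at` takes its element count as a plain `int`, while the MPI 4.0 large-count variant `MPI_File_write_at_c` takes an `MPI_Count`. The sketch below has no MPI dependency and only illustrates the size check that decides whether the `int` interface is sufficient. The helper name `fits_in_int_count` is made up for this illustration and is not part of deal.II or MPI.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper (an assumption for illustration only): returns true
// if a buffer of n_bytes can be described by an `int` count argument, as
// used by the classic MPI I/O calls. Larger buffers would need a
// large-count (`_c`) variant or a split into several writes.
inline bool fits_in_int_count(const std::uint64_t n_bytes)
{
  return n_bytes <=
         static_cast<std::uint64_t>(std::numeric_limits<int>::max());
}
```

A header or footer of a few hundred bytes passes this check easily, which is why both calls behave the same there. Using the `_c` variant everywhere simply means nobody has to redo this reasoning later.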
2e6e346 to 28f1513
Looks good. Can you please squash your commits?
Let me test this some more before merging...
@pengfej We forgot to test one important case: if you run the code with
My suggestion would be to create a new block before
You will need to delete the other
@tjhei I see, thanks a lot, let me give it a try.
@tjhei The rank=1 tests passed, but the rank=3 test freezes for a very long time. I'm not sure whether it's a code error or my machine is not working properly.
source/base/data_out_base.cc
Outdated
// Sending the offset for writing the footer to rank 0.
if (myrank == n_ranks - 1)
  {
    const std::uint64_t footer_offset = size_on_proc + offset;
    AssertThrowMPI(ierr);
  }
- // Sending the offset for writing the footer to rank 0.
- if (myrank == n_ranks - 1)
-   {
-     const std::uint64_t footer_offset = size_on_proc + offset;
-     AssertThrowMPI(ierr);
-   }
+ if (myrank == n_ranks - 1)
+   footer_offset = size_on_proc + offset;
You never set the footer_offset correctly (you ended up creating a local variable here and setting it).
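To make the problem concrete, here is a minimal, MPI-free sketch of the shadowing issue. The two function names are hypothetical and exist only for this illustration. The variable names mirror the snippet above, and the outer `footer_offset` initialized to 0 is an assumption about the surrounding code.

```cpp
#include <cstdint>

// Buggy pattern: the `const std::uint64_t footer_offset` inside the block
// is a NEW variable that hides the outer one, so the outer variable is
// never updated and the function returns 0.
inline std::uint64_t footer_offset_shadowed(const bool          is_last_rank,
                                            const std::uint64_t offset,
                                            const std::uint64_t size_on_proc)
{
  std::uint64_t footer_offset = 0;
  if (is_last_rank)
    {
      const std::uint64_t footer_offset = size_on_proc + offset;
      (void)footer_offset; // only the inner variable was set
    }
  return footer_offset;
}

// Fixed pattern: assign to the existing outer variable.
inline std::uint64_t footer_offset_assigned(const bool          is_last_rank,
                                            const std::uint64_t offset,
                                            const std::uint64_t size_on_proc)
{
  std::uint64_t footer_offset = 0;
  if (is_last_rank)
    footer_offset = size_on_proc + offset;
  return footer_offset;
}
```

Compiling with a shadowing warning enabled (for example `-Wshadow` in GCC or Clang) would flag the first variant.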
I see. Let me change that and test it again. Thanks for pointing it out!
The mpirun=3 test still freezes. I'm wondering if there's a way to find out where exactly it freezes.
There is, of course. I will show you next time.
Sounds good! I just read about the sendrecv_replace function, and it seems I might have misordered the sender and the receiver. I'm testing that fix now.
source/base/data_out_base.cc
Outdated
0,
comm,
MPI_STATUS_IGNORE);
}
Add the AssertThrow here.
Can you switch source and dest of MPI_Sendrecv_replace?
Just did! The tests still freeze, unfortunately.
The only part missing from the test output is the last chunk, but I can't post that vtu code in this comment section for some reason.
I have a simpler suggestion: Just write the footer from the last rank instead of rank 0. This way you don't need to do any communication and you can use the footer_offset directly.
source/base/data_out_base.cc
Outdated
MPI_STATUS_IGNORE);
AssertThrowMPI(ierr);
}

// write footer
if (myrank == 0)
Just write the footer on rank n_ranks-1 instead of 0.
That fixed version passed the test on my machine so I pushed it! Thank you so much!!
Updating author name fixing work update indent Remove the broadcast and fix some comment and fix author name updates updates! last fix fixing indent fixing unmatched int type fixing indent trying to fix the broadcast error Trying to fix broadcast error II adding _c to write_at functions fixing rank issue fixing footer_offset last fix on rank indent removing some unused line
I can confirm that things work correctly now.
This is ready to go from our side. Any final comments?
Looks good. It tests alright, so I'll merge, but if you wouldn't mind, please make the comments I have about comments into a separate PR.
// shared file pointer to the location after header;
ierr = MPI_File_write_shared(
  fh, ss.str().c_str(), header_size, MPI_CHAR, MPI_STATUS_IGNORE);
// Write the header on rank 0 at the starting of a file, i.e., offset 0.
- // Write the header on rank 0 at the starting of a file, i.e., offset 0.
+ // Write the header on rank 0 at the start of a file, i.e., offset 0.
@@ -7741,21 +7746,47 @@ DataOutInterface<dim, spacedim>::write_vtu_in_parallel(
vtk_flags,
ss);

ierr = MPI_File_write_ordered(
  fh, ss.str().c_str(), ss.str().size(), MPI_CHAR, MPI_STATUS_IGNORE);
// using prefix sum to find specific offset to write at.
- // using prefix sum to find specific offset to write at.
+ // Use prefix sum to find specific offset to write at.
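For readers unfamiliar with the prefix-sum approach, the following is a minimal serial sketch of the offset computation, not the code in this PR. It assumes the file consists of a header, one chunk per rank, and a footer, and that the header size is added on top of the chunk sizes. The names `FileLayout` and `compute_layout` are hypothetical. In an actual MPI program each rank would only obtain its own offset, for example via `MPI_Exscan`, rather than looping over all sizes.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical container for the computed layout (illustration only).
struct FileLayout
{
  std::vector<std::uint64_t> chunk_offsets; // where each rank starts writing
  std::uint64_t              footer_offset; // first byte after the last chunk
};

// Exclusive prefix sum over the per-rank chunk sizes, shifted by the header
// size: rank i starts at header_size + sum of the sizes of ranks 0..i-1.
inline FileLayout compute_layout(const std::uint64_t               header_size,
                                 const std::vector<std::uint64_t> &chunk_sizes)
{
  FileLayout layout;
  layout.chunk_offsets.reserve(chunk_sizes.size());

  std::uint64_t running = header_size;
  for (const std::uint64_t size : chunk_sizes)
    {
      layout.chunk_offsets.push_back(running);
      running += size;
    }

  // Equals offset + size_on_proc of the last rank.
  layout.footer_offset = running;
  return layout;
}
```

With a 10-byte header and chunk sizes 5, 7, and 3, the chunks start at offsets 10, 15, and 22, and the footer starts at 25. Since the last rank already knows its own offset and size, it can compute the footer offset locally, which is why writing the footer from rank n_ranks-1 avoids any extra communication.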
DataOutBase::write_vtu_footer(ss);
const unsigned int footer_size = ss.str().size();

// Writing Footer.
- // Writing Footer.
+ // Writing footer:
Upgrade write_vtu_in_parallel based on mpi large IO update
Replace the MPI I/O routines for DataOut::write_vtu_in_parallel: