Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Upgrade write_vtu_in_parallel based on mpi large IO update #13673

Merged
merged 1 commit into from May 30, 2022

Conversation

pengfej
Copy link
Contributor

@pengfej pengfej commented May 5, 2022

Replace the MPI I/O routines for DataOut::write_vtu_in_parallel:

  • Use the large count facility to support >2GB+ individual writes
  • Use write_at_all instead of write_shared (this is up to 10x faster in testing on Frontera)

@tjhei
Copy link
Member

tjhei commented May 5, 2022

This is WIP and depends on #13611

@pengfej
Copy link
Contributor Author

pengfej commented May 7, 2022

I will remove all the extra comments and space after all so don't worry!

@pengfej pengfej force-pushed the myowndealii branch 2 times, most recently from d5cb7de to 0848d2c Compare May 8, 2022 21:40
@tamiko tamiko added this to the Release 9.4 milestone May 25, 2022
@pengfej pengfej force-pushed the myowndealii branch 4 times, most recently from 2ae3559 to e4b1579 Compare May 25, 2022 22:48
source/base/data_out_base.cc Outdated Show resolved Hide resolved
source/base/data_out_base.cc Show resolved Hide resolved
Comment on lines 7795 to 7798
ierr = MPI_File_write_at(fh,
footer_offset,
ss.str().c_str(),
footer_size,
MPI_CHAR,
MPI_STATUS_IGNORE);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this the right function call, or don't you need one from the big_mpi library?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To the best of my knowledge, the header and footer are very small in terms of size, so using write_at and write_at_c (The one in big_mpi library) will not make a difference :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with Pengfei.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we do it just out of consistency and so that people like me don't have to think about the question in the future :-)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure!

@pengfej pengfej force-pushed the myowndealii branch 2 times, most recently from 2e6e346 to 28f1513 Compare May 26, 2022 00:48
source/base/data_out_base.cc Outdated Show resolved Hide resolved
@tjhei tjhei removed the WIP ⚠️ label May 26, 2022
Copy link
Member

@tjhei tjhei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Can you please squash your commits?

@tjhei
Copy link
Member

tjhei commented May 26, 2022

Let me test this some more before merging...

@tjhei
Copy link
Member

tjhei commented May 26, 2022

@pengfej We forgot to test one important case: if you run the code with n_ranks == 1, then rank 0 will try to send to itself, which causes a hang. You can see this with tests/mpi/parallel_vtu_01.mpirun=1.debug, which does not complete.

@tjhei
Copy link
Member

tjhei commented May 26, 2022

My suggestion would be to create a new block before if (myrank == 0) that does something like:

if (myrank ==0 || myrank == n_ranks -1)
{
ierr = MPI_Sendrecv_replace(&footer_offset, 1, MPI_UINT64_T, n_ranks - 1, 0, 0, 0, comm, MPI_STATUS_IGNORE);
}

You will need to delete the other MPI_Send and MPI_Recv but keep the assignment of footer_offset on rank 0 (and move the declaration of this variable up and outside the if).

MPI_Sendrecv_replace is a combination of MPI_Send and MPI_Recv and replaces the contents in the buffer given, which is exactly what we need.

@pengfej
Copy link
Contributor Author

pengfej commented May 26, 2022

@tjhei I see, thanks a lot, let me give it a try.

@pengfej
Copy link
Contributor Author

pengfej commented May 26, 2022

@tjhei rank=1 tests passed, but rank=3 test freezes for a very long time, I'm not very sure if it's a code error or my machine is not working properly.

Comment on lines 7768 to 7786
// Sending the offset for writing the footer to rank 0.
if (myrank == n_ranks - 1)
{
const std::uint64_t footer_offset = size_on_proc + offset;
AssertThrowMPI(ierr);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Sending the offset for writing the footer to rank 0.
if (myrank == n_ranks - 1)
{
const std::uint64_t footer_offset = size_on_proc + offset;
AssertThrowMPI(ierr);
}
if (myrank == n_ranks - 1)
footer_offset = size_on_proc + offset;

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You never set the footer_offset correctly (you ended up creating a local variable here and setting it).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. Let me changed that and test it again, thanks for pointing out!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test mpirun=3 still freezes, I'm wondering if there's a way to know where exactly it freezes.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is, of course. I will show you next time.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sounds good! I just read about the sendrecv_replace function and it seems like I might misordered the sender and the receiver, I'm testing on that fix now.

0,
comm,
MPI_STATUS_IGNORE);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add the AssertThrow here

@tjhei
Copy link
Member

tjhei commented May 26, 2022

Can you switch source and dest of MPI_SendRecv_replace?

@pengfej
Copy link
Contributor Author

pengfej commented May 26, 2022

Can you switch source and dest of MPI_SendRecv_replace?

Just did! The tests are still frozen unfortunately..

@pengfej
Copy link
Contributor Author

pengfej commented May 26, 2022

The only part missing on the testing output is the last chunk, but I can't post that vtu code in this comment section for some reason.

Copy link
Member

@tjhei tjhei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a simpler suggestion: Just write the footer from the last rank instead of rank 0. This way you don't need to do any communication and you can use the footer_offset direcly.

source/base/data_out_base.cc Outdated Show resolved Hide resolved
MPI_STATUS_IGNORE);
AssertThrowMPI(ierr);
}

// write footer
if (myrank == 0)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just write the footer on rank n_ranks-1 instead of 0.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That fixed version passed the test on my machine so I pushed it! Thank you so much!!

Updating author name

fixing work

update indent

Remove the broadcast and fix some comment and fix author name

updates

updates!

last fix

fixing indent

fixing unmatched int type

fixing indent

trying to fix the broadcast error

Trying to fix broadcast error II

adding _c to write_at functions

fixing rank issue

fixing footer_offset

last fix on rank

indent

removing some unused line
Copy link
Member

@tjhei tjhei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can confirm that things work correctly now.

@tjhei
Copy link
Member

tjhei commented May 29, 2022

This is ready to go from our side. Any final comments?

Copy link
Member

@bangerth bangerth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. It tests alright, so I'll merge, but if you wouldn't mind, please make the comments I have about comments into a separate PR.

// shared file pointer to the location after header;
ierr = MPI_File_write_shared(
fh, ss.str().c_str(), header_size, MPI_CHAR, MPI_STATUS_IGNORE);
// Write the header on rank 0 at the starting of a file, i.e., offset 0.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Write the header on rank 0 at the starting of a file, i.e., offset 0.
// Write the header on rank 0 at the start of a file, i.e., offset 0.

@@ -7741,21 +7746,47 @@ DataOutInterface<dim, spacedim>::write_vtu_in_parallel(
vtk_flags,
ss);

ierr = MPI_File_write_ordered(
fh, ss.str().c_str(), ss.str().size(), MPI_CHAR, MPI_STATUS_IGNORE);
// using prefix sum to find specific offset to write at.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// using prefix sum to find specific offset to write at.
// Use prefix sum to find specific offset to write at.

DataOutBase::write_vtu_footer(ss);
const unsigned int footer_size = ss.str().size();

// Writing Footer.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Writing Footer.
// Writing footer:

@bangerth bangerth merged commit 802b3e0 into dealii:master May 30, 2022
@tjhei
Copy link
Member

tjhei commented May 30, 2022

@pengfej Please create a follow-up with the fixes suggested by @bangerth , ok?

@pengfej
Copy link
Contributor Author

pengfej commented May 30, 2022

@pengfej Please create a follow-up with the fixes suggested by @bangerth , ok?

Ohh sure! Sorry I completely missed it. I only saw the merged notification.

mkghadban pushed a commit to OpenFCST/dealii that referenced this pull request Sep 8, 2022
Upgrade write_vtu_in_parallel based on mpi large IO update
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants