
step-40: update IO section #13721

Merged: 2 commits merged into dealii:master from step-40-no-limit-io on May 22, 2022
Conversation

tjhei (Member) commented May 11, 2022

Update the documentation on how we generate output using MPI I/O (we
originally wrote one vtu per process).
Remove the limit of 32 MPI ranks, because it is confusing according to
my students. I think this was put in place to avoid generating 1000+
output files, which is no longer the case (we use 8 here). The output is
a few MB in total, so I don't see a reason to limit things.

tjhei (Member, Author) commented May 11, 2022

FYI @zjiaqi2018 @pengfej @singima

Comment on lines 545 to 558
// bottleneck in parallelizing. In step-18, we have moved this step out of
// the actual computation but shifted it into a separate program that later
// combined the output from various processors into a single file. But this
// doesn't scale: if the number of processors is large, this may mean that
// the step of combining data on a single processor later becomes the
// longest running part of the program, or it may produce a file that's so
// large that it can't be visualized any more. We here follow a more
// sensible approach, namely creating individual files for each MPI process
// and leaving it to the visualization program to make sense of that.
// doesn't scale for several reasons: First, creating a single file per
// processor will overwhelm the filesystem with a large number of processors.
// Second, the step of combining all output files later can become the longest
// running part of the program, or it may produce a file that's so large that
// it can't be visualized easily any more.
//
// We here follow a more sophisticated approach that uses
// high-performance, parallel IO routines using MPI I/O to write to
// a small, fixed number of visualization files (here 8). We also
// generate a .pvtu record referencing these .vtu files, which can
// be opened directly in visualization tools like ParaView and VisIt.
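For reference, the scheme the updated comment describes (a small, fixed number of .vtu files written via MPI I/O plus a .pvtu record) corresponds to a deal.II call along these lines. This is a sketch only: the variable names (`data_out`, `cycle`, `mpi_communicator`) and the exact argument values are assumptions, not quoted from this pull request.

```cpp
// Sketch (assumed names): write the output held by `data_out` using
// MPI I/O, grouping the data of all ranks into at most 8 .vtu files,
// and emit a .pvtu record referencing them so that ParaView or VisIt
// can open the result directly.
data_out.write_vtu_with_pvtu_record(
  "./",              // output directory
  "solution",        // base filename
  cycle,             // counter appended to the filename
  mpi_communicator,  // communicator used for the parallel write
  2,                 // number of digits used for the counter
  8);                // number of .vtu files to group output into
```

Passing a nonzero group count is what keeps the file count fixed regardless of how many MPI ranks are running, which is the point of removing the 32-rank limit.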
Member commented:
Actually, the text in lines 545-547 is no longer true. In step-18 we now just create one file per process. Can you replace the sentence "In step-18, we have moved this step out of the actual computation but shifted it into a separate program that later combined the output from various processors into a single file." by the following:

In those two programs, we simply generate one output file per process. That worked because the parallel::shared::Triangulation cannot be used with large numbers of MPI processes anyway.

Then also remove the "Second, ..." text since it's no longer what we do.

tjhei (Member, Author) commented May 17, 2022

updated.

@bangerth bangerth merged commit 6d02bd1 into dealii:master May 22, 2022
@tjhei tjhei deleted the step-40-no-limit-io branch May 22, 2022 10:20
mkghadban pushed a commit to OpenFCST/dealii that referenced this pull request Sep 8, 2022