I/O errors when MPI-IO is not available #3396
The crash is not a bug in p4est, but a routine inside deal.II that requires MPI I/O: the call to MPI_File_open fails, unsurprisingly. The function is used to store solution vectors. One could work around this, but it certainly requires some extra work. HDF5 serial support is something we can look into; it might be easier to do.
Regarding HDF5 support:
The HDF5 issues are less critical; checkpointing is really essential in order to do any kind of production run. I think that it's actually fairly straightforward to implement a workaround in deal.II if MPI_File commands are unavailable. I think that it's probably easiest to just write a separate file for each MPI process. Each process should write its piece to the local scratch (if available) or /tmp and then copy it to the output directory in a background thread.

Re the HDF5 issue, I agree that it's not obvious how to deal with this problem, but either of your ideas could work. I don't know enough about XDMF files to know whether it's possible to provide the information to load multiple subdomains at a timestep, similar to what's done using pvtu files. If so, having each rank write a separate file and providing the necessary metadata to load it in the XDMF file seems like the right approach.

I am still not sure whether these are problems worth investing the possibly significant amount of time to solve, though it is important that they be documented in case others encounter them. I share your sentiment that having broken I/O on clusters with NFS storage would exclude a significant group of users, but these problems seem to be occurring selectively on our cluster due to the decision not to compile MPI-IO support into OpenMPI. I think that other sysadmins must be including MPI-IO and hoping for the best, and users are not encountering incorrect output, though it's certainly possible that it could occur. I think that the more pragmatic approach might be for our sysadmin to include MPI-IO support but to enable the most conservative (but slow) file locking behavior by default. On our similar cluster with NFS storage at Portland State, I never had problems, though there we were using the Intel MPI library.
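A minimal sketch of the per-rank workaround described above (each rank writes its piece to node-local storage, then copies it to the shared output directory in a background thread) might look like the following. The function, file names, and paths are hypothetical and only illustrate the idea; this is not existing deal.II code.

```cpp
// Sketch: write this rank's checkpoint piece to node-local storage first,
// then copy it to the (NFS-mounted) output directory in a background thread.
#include <mpi.h>
#include <filesystem>
#include <fstream>
#include <string>
#include <thread>
#include <vector>

void write_checkpoint_piece(const std::vector<char> &piece,
                            const std::string &output_dir)
{
  int rank = 0;
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);

  // Write this rank's data to local scratch or /tmp (example path only).
  const std::string local_file =
    "/tmp/checkpoint." + std::to_string(rank) + ".bin";
  {
    std::ofstream out(local_file, std::ios::binary);
    out.write(piece.data(), static_cast<std::streamsize>(piece.size()));
  }

  // Copy to the shared output directory in a background thread so the solver
  // can continue. Real code would track these threads and join them before
  // program exit; this is purely illustrative.
  std::thread([local_file, output_dir, rank]() {
    const std::string target =
      output_dir + "/checkpoint." + std::to_string(rank) + ".bin";
    std::filesystem::copy_file(
      local_file, target, std::filesystem::copy_options::overwrite_existing);
  }).detach();
}
```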
On 2/27/20 7:56 PM, Timo Heister wrote:
> The crash is not a bug in p4est, but a routine inside deal.II that requires MPI I/O:
> https://github.com/dealii/dealii/blob/47870e28657efe76b1bbf1207bea5d6d634aa534/source/distributed/tria.cc#L1801-L1805
> The call to MPI_File_open fails, unsurprisingly. The function is used to store solution vectors. One could work around this but it certainly requires some extra work.

That shouldn't be too much work to rewrite. But is there a standard way of finding out whether MPI I/O is available?
I was wondering the same thing. Would it be difficult to write a test for the deal.II CMake step to see whether a simple test program that calls MPI_File_open fails?
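Such a configure-time probe could be as simple as the following program, which a CMake try_run() (or similar) check would compile and execute, treating a non-zero exit status as "MPI-IO unavailable". This is only a sketch of the idea being discussed, not an existing deal.II check; the probe file name is arbitrary.

```cpp
// Probe whether MPI-IO actually works: exit with status 0 only if a
// collective MPI_File_open succeeds. Intended to be run under mpirun by a
// configure-time check.
#include <mpi.h>
#include <cstdio>

int main(int argc, char **argv)
{
  MPI_Init(&argc, &argv);

  // Ask that MPI-IO failures come back as error codes instead of aborting.
  MPI_File_set_errhandler(MPI_FILE_NULL, MPI_ERRORS_RETURN);

  MPI_File fh;
  const int ierr = MPI_File_open(MPI_COMM_WORLD,
                                 const_cast<char *>("mpi_io_probe.tmp"),
                                 MPI_MODE_CREATE | MPI_MODE_WRONLY,
                                 MPI_INFO_NULL,
                                 &fh);
  if (ierr == MPI_SUCCESS)
    MPI_File_close(&fh);

  MPI_Finalize();

  std::remove("mpi_io_probe.tmp"); // clean up the probe file if it was created
  return (ierr == MPI_SUCCESS) ? 0 : 1;
}
```

(As the next comments note, a run test like this has its own problems on clusters where the login node cannot run MPI jobs.)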
This would be the last resort (a different check is likely much faster/easier). I don't know of a way to check it by inspecting …
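One possible alternative to a configure-time run test, offered here only as a sketch and not as something deal.II currently does, would be a runtime guard at the point of use: attempt the MPI-IO open once and let the caller fall back to a non-MPI-IO code path (e.g. one plain file per rank) if it fails. The function name and fallback are hypothetical, and this assumes the MPI implementation reports the failure via an error code.

```cpp
// Sketch of a runtime guard around MPI_File_open; illustrative only.
#include <mpi.h>
#include <string>

bool open_with_mpi_io(const std::string &filename, MPI_File &fh)
{
  // File operations use the error handler attached to MPI_FILE_NULL;
  // request error codes instead of aborting on failure.
  MPI_File_set_errhandler(MPI_FILE_NULL, MPI_ERRORS_RETURN);

  const int ierr = MPI_File_open(MPI_COMM_WORLD,
                                 const_cast<char *>(filename.c_str()),
                                 MPI_MODE_CREATE | MPI_MODE_WRONLY,
                                 MPI_INFO_NULL,
                                 &fh);
  return ierr == MPI_SUCCESS; // false -> caller uses a non-MPI-IO fallback
}
```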
On 2/28/20 1:21 PM, Max Rudolph wrote:
> I was wondering the same thing. Would it be difficult to write a test for the deal.II CMake step to see whether a simple test program that calls MPI_File_open fails?

That's going to fail on clusters on which the head node is different from the compute nodes :-( There may in fact be no way to run `mpirun` on the head node to begin with.

How does p4est check whether MPI IO is available? Through a ./configure argument?
Yes, p4est would be configured with `--disable-mpiio`.
We currently have no plans to make checkpointing work without MPI I/O.
On the UC Davis cluster 'peloton', our scratch filesystem is provided as an NFS mount. Because of potential issues with:
(1) correctness of output due to file locking behaviors over NFS, discussed here:
open-mpi/ompi#4446
(2) generally poor performance of mpi-io over NFS
OpenMPI has been compiled without MPI-IO support on peloton. ASPECT and its dependencies seem to rely heavily on MPI-IO for output of visualization files and for checkpointing.
I have created a GitHub repository LINKED HERE that builds a Docker container containing OpenMPI 4.0.2 without MPI-IO that can reproduce these issues. Please note that the container takes a long time to build; you may want to run on a machine with the ability to build on a couple of dozen threads.
If the number of grouped files in the input file subsection `Postprocess/visualization` is set to >0, there will be an MPI error. If the number of grouped files is set to 0, it appears that vtu output can be written successfully.
Checkpoints cannot be written successfully, and we encounter an MPI error: