Finalize T8codeMesh before MPI is finalized #1585

Closed
jmark opened this issue Jul 24, 2023 · 9 comments
Labels: discussion · help wanted (Extra attention is needed) · parallelization (Related to MPI, threading, tasks etc.)

Comments

@jmark
Contributor

jmark commented Jul 24, 2023

When the cmesh object in t8code is finalized, some MPI calls are made because shared-memory arrays must be freed.
Naturally, this has to happen before MPI itself shuts down. Otherwise you get an error message such as:

Attempting to use an MPI routine (MPII_Comm_get_attr) before initializing or after finalizing MPICH

Julia uses a garbage collector, so the order in which objects are finalized is not determined by the program flow.

Is there a way in Julia or Trixi to deterministically finalize modules/objects before MPI shuts down?
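
For illustration, a minimal sketch of the pattern that goes wrong (the CmeshHolder struct and cleanup_cmesh! are placeholders, not the actual Trixi.jl types):

    using MPI

    mutable struct CmeshHolder
        cmesh::Ptr{Cvoid}
    end

    function cleanup_cmesh!(holder::CmeshHolder)
        # In the real code this frees shared-memory arrays via MPI calls.
        # If the garbage collector runs it after MPI.Finalize(), MPICH aborts with
        # "Attempting to use an MPI routine ... after finalizing MPICH".
    end

    MPI.Init()
    holder = CmeshHolder(C_NULL)
    finalizer(cleanup_cmesh!, holder)  # the GC alone decides *when* this runs
    MPI.Finalize()                     # ... possibly before the finalizer fires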

@jmark added the discussion, help wanted, and parallelization labels on Jul 24, 2023
@sloede
Member

sloede commented Jul 24, 2023

Could this be helpful? https://juliaparallel.org/MPI.jl/stable/reference/environment/#MPI.add_finalize_hook!
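
A minimal usage sketch of that hook (cleanup is just a placeholder for the actual t8code cleanup):

    using MPI

    MPI.Init()

    cleanup() = @info "freeing t8code shared-memory arrays (placeholder)"

    MPI.add_finalize_hook!() do
        # Runs inside MPI.Finalize(), just before MPI itself shuts down,
        # so any MPI calls made here are still legal.
        cleanup()
    end

    MPI.Finalize()   # triggers the hook, then finalizes MPI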

@ranocha
Member

ranocha commented Jul 24, 2023

There seem to be two cases where clean-up code could be called:

  • by a finalizer as implemented in a previous version of the code
  • by a finalizer hook of MPI.jl

It would be nice to have the first one run in a long-running (interactive) Julia session that uses multiple meshes. However, only the second one is guaranteed to be called before MPI is finalized: finalize hooks are run in MPI.Finalize just before MPI itself is finalized, see
https://github.com/JuliaParallel/MPI.jl/blob/2a2bd68a6dc53660108bbc16690e4972de6318d5/src/environment.jl#L250-L256

How would you weigh the requirement to avoid piling up garbage without a classical finalizer, @sloede? Would an approach like adding both a classical finalizer and an MPI finalize hook work, where both use the same implementation under the hood and check whether they need to clean up at all?
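
A rough sketch of what such a combined approach could look like (all names are illustrative, not the actual implementation; note that the hook's closure keeps a reference to the mesh, which is exactly the point raised further down in this thread):

    using MPI

    mutable struct MeshHandle
        forest::Ptr{Cvoid}
        cleaned_up::Bool
    end

    function cleanup!(mesh::MeshHandle)
        mesh.cleaned_up && return nothing   # idempotent: only the first call does work
        # ... free the t8code/cmesh resources via MPI here ...
        mesh.cleaned_up = true
        return nothing
    end

    function make_mesh(forest)
        mesh = MeshHandle(forest, false)
        finalizer(cleanup!, mesh)                     # covers long-running sessions
        MPI.add_finalize_hook!(() -> cleanup!(mesh))  # covers program shutdown
        return mesh
    end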

@jmark
Contributor Author

jmark commented Jul 24, 2023

I just implemented a version with the MPI finalize hook. It works as intended. But indeed, this does not solve the problem with long-running sessions. For that case I have not yet found a satisfying solution for how the two finalizers would know about each other.

@sloede
Member

sloede commented Jul 25, 2023

I would probably continue to use the normal, per-object finalizer function, but guard the actual calls to any MPI functions behind if !MPI.Finalized() ... end. That way you get complete finalization/cleanup during long-running sessions. And if something is not fully cleaned up after MPI has been finalized, you usually do not care anymore anyway.
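
In code, such a guard could look roughly like this (finalize_t8code_mesh! is a placeholder name; trixi_t8_unref_forest is the cleanup call used elsewhere in this thread):

    using MPI

    function finalize_t8code_mesh!(mesh)
        if !MPI.Finalized()
            # MPI is still alive, so freeing the forest/cmesh resources is safe.
            trixi_t8_unref_forest(mesh.forest)
        end
        # After MPI.Finalize() the process is shutting down anyway,
        # so skipping the cleanup here is acceptable.
        return nothing
    end

    # attached when the mesh object is constructed:
    # finalizer(finalize_t8code_mesh!, mesh)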

@jmark
Contributor Author

jmark commented Jul 25, 2023

Thanks for the thought, @sloede!

This is my proposed solution: dde2802

I still want to make sure that T8codeMesh is properly cleaned up before shutdown, since sc_finalize is called by the MPI finalize hook, a feature I added in a previous commit: 6074e87

sc_finalize checks whether all references and memory allocated by t8code have been properly freed and throws an exception if this is not the case. This helps assure code quality and might be a sensible addition for P4estMesh, too.

@ranocha
Member

ranocha commented Jul 25, 2023

sc_finalize checks whether all references and memory allocated by t8code have been properly freed and throws an exception if this is not the case. This helps assure code quality and might be a sensible addition for P4estMesh, too.

This sounds really good. Would you be willing to make another PR for the P4estMesh?

@jmark
Contributor Author

jmark commented Jul 25, 2023

There might still be one issue that I haven't considered yet. When creating a closure like this

    MPI.add_finalize_hook!(function ()
        trixi_t8_unref_forest(mesh.forest)
    end)

then there is still a reference to mesh dangling around even when the mesh instance is no longer needed.
This would mean that the finalizer actually never gets called, even during long-running sessions. Did I miss anything?
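
A stripped-down illustration of that captured reference (BigMesh and make_mesh are stand-ins):

    using MPI

    MPI.Init()

    mutable struct BigMesh
        data::Vector{Float64}
    end

    function make_mesh()
        mesh = BigMesh(zeros(10^6))
        MPI.add_finalize_hook!(function ()
            @info "would free" length(mesh.data)   # the closure captures the local `mesh`
        end)
        return mesh
    end

    m = make_mesh()
    m = nothing   # the last external reference is gone, but the hook's closure
                  # still holds the object, so the GC can never collect it and a
                  # per-object finalizer attached to it would never run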

@ranocha
Member

ranocha commented Jul 26, 2023

Good point! So we should remove the finalizer hook again and just keep the ordinary finalizer with the check whether MPI has already been finalized in place?

@jmark
Contributor Author

jmark commented Jul 26, 2023

Yes, we should do that. I made the sc_finalize logic optional. It can be activated by setting an environment variable. During production runs sc_finalize is not mandatory, but it will be helpful during development and debugging.

12a20e5
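
Conceptually, the opt-in check could look like this sketch (the environment variable name and the sc_finalize binding shown here are illustrative, not necessarily what the commit uses):

    using MPI

    # Hypothetical opt-in: only register the strict leak check when requested,
    # e.g. during development or CI runs.
    if get(ENV, "TRIXI_T8CODE_SC_FINALIZE", "no") == "yes"
        MPI.add_finalize_hook!() do
            # sc_finalize errors out if any memory allocated through t8code/libsc
            # has not been freed, turning resource leaks into hard failures.
            sc_finalize()
        end
    end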

@jmark closed this as completed on Sep 1, 2023