Document that the lifetime of the DistributedSearchTree should be within MPI initialization region #346

aprokop · 2020-07-29T16:30:38Z

In #135 we introduced duplicating of user-provided communicators to cleanly separate the library's communication from that of the user. We store the duplicated communicator inside ArborX::DistributedSearchTree. However, what we did not realize was that it can lead to the problem in the following situation:

MPI_Initialize(argc, argv);
ArborX::DistributedSearchTree<DeviceType> tree(comm, primitives);
tree.query(...);
MPI_Finalize();

The issue is that the destructor for the distributed tree, which contains MPI_comm_free(&_comm); will be called after MPI_Finalize().

There is currently nothing in the interface to indicate that it cannot/should not be used this way. After searching online, I cannot say definitively what MPI_Finalize() does with non-freed communicators. But as it cleans up all MPI context, I think the proper fix would be to check if MPI is still initialized before freeing the communicator.

Reported-by: @mattbement

The text was updated successfully, but these errors were encountered:

dalg24 · 2020-07-30T19:27:07Z

This is not a defect. This is expected. The MPI execution environment, just like the Kokkos one, needs to be properly initialized before you create our distributed tree data structure and these are not be finalized as long as any object is in scope.

aprokop added bug Something isn't working API User visible interface modifications labels Jul 29, 2020

aprokop mentioned this issue Jul 29, 2020

Check that MPI is not finalized before freeing the communicator #347

Closed

dalg24 added documentation and removed bug Something isn't working labels Jul 30, 2020

aprokop changed the title ~~Distributed search tree may try to free communicator after MPI_Finalize()~~ Document the that lifetime of the DistributedSearchTree should be within MPI initialization region Jul 30, 2020

aprokop changed the title ~~Document the that lifetime of the DistributedSearchTree should be within MPI initialization region~~ Document that the lifetime of the DistributedSearchTree should be within MPI initialization region Sep 5, 2020

aprokop closed this as completed Dec 12, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Document that the lifetime of the DistributedSearchTree should be within MPI initialization region #346

Document that the lifetime of the DistributedSearchTree should be within MPI initialization region #346

aprokop commented Jul 29, 2020

dalg24 commented Jul 30, 2020

Document that the lifetime of the DistributedSearchTree should be within MPI initialization region #346

Document that the lifetime of the DistributedSearchTree should be within MPI initialization region #346

Comments

aprokop commented Jul 29, 2020

dalg24 commented Jul 30, 2020