What to do if we cannot track a communicator? #29

Open
rfvander opened this issue Mar 30, 2017 · 0 comments

rfvander commented Mar 30, 2017

When we create a new communicator from a resilient communicator (e.g. through split or dup), we add it to the list of tracked communicators so that it can be deleted in case of a failure. We need to decide what happens when the communicator is created successfully and we can attach the error handler, but we cannot add the communicator to the track list (adding it requires a malloc, which may fail due to insufficient memory).
I think it is valid to say the call has failed: we remove the communicator and return an MPI error code to the application. This would need to be documented in the specification. Allowing the call to succeed would lead to an error after a later failure, because we would not be able to remove the damaged communicator.
If we take this approach, we need to determine which MPI error code to return. I recommend MPI_ERR_OTHER or MPI_ERR_INTERN.
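
To make the proposal concrete, here is a minimal sketch of the "treat it as a failed call" path. It is not taken from the library. Every name in it (`comm_t`, `comm_dup`, `comm_free`, `track_list`, `track_alloc`, `resilient_comm_dup`, the `RC_*` codes) is a hypothetical stand-in so that the snippet compiles without an MPI installation. In the real code the stand-ins would presumably map to `MPI_Comm`, `MPI_Comm_dup`, `MPI_Comm_free`, `MPI_SUCCESS` and an error class such as the standard's `MPI_ERR_OTHER`.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for MPI_Comm, MPI_COMM_NULL and MPI error codes. */
typedef int comm_t;
#define COMM_NULL (-1)
enum { RC_SUCCESS = 0, RC_ERR_OTHER = 1 };

/* Number of communicators created and not yet freed (for illustration). */
static int live_comms = 0;

/* Stand-in for MPI_Comm_dup: always succeeds in this sketch. */
static int comm_dup(comm_t parent, comm_t *out)
{
    *out = parent + 1;
    live_comms++;
    return RC_SUCCESS;
}

/* Stand-in for MPI_Comm_free: resets the handle to the null value. */
static int comm_free(comm_t *comm)
{
    *comm = COMM_NULL;
    live_comms--;
    return RC_SUCCESS;
}

/* Assumed layout of the track list: a singly linked list of handles. */
typedef struct track_node {
    comm_t comm;
    struct track_node *next;
} track_node;

static track_node *track_list = NULL;

/* Allocator hook, so an out-of-memory condition can be simulated. */
static void *(*track_alloc)(size_t) = malloc;

/* Allocator that always fails, used only to exercise the error path. */
static void *failing_alloc(size_t size)
{
    (void)size;
    return NULL;
}

/* Duplicate a communicator and track it. If tracking fails, undo the
 * creation and report an error, so no untracked communicator escapes. */
int resilient_comm_dup(comm_t parent, comm_t *newcomm)
{
    assert(newcomm != NULL);

    int rc = comm_dup(parent, newcomm);
    if (rc != RC_SUCCESS)
        return rc;

    /* The error handler would be attached here in the real library. */

    track_node *node = track_alloc(sizeof *node);
    if (node == NULL) {
        comm_free(newcomm);      /* remove the communicator again */
        return RC_ERR_OTHER;     /* the call as a whole has failed */
    }

    node->comm = *newcomm;
    node->next = track_list;     /* push onto the front of the list */
    track_list = node;
    return RC_SUCCESS;
}
```

Setting `track_alloc = failing_alloc` before calling `resilient_comm_dup` exercises the failure path: the caller gets an error code and a null handle, and the track list is left unchanged. One open point the sketch glosses over is that freeing the communicator is itself a collective operation in MPI, so the real implementation would need all ranks to agree that the call failed.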
