Fix dangling pointer to custom operation in collectives #52
This is tricky, I realize I never looked into that corner... But if we look at the underlying C API of MPI_Op_create, we see that this is not supported:
The root of the problem raised by that mismatch is actually in the

That is sooooo bad. If we do have a state in the function object, multi-threading will kill us: If we have two different states for the same couple

I don't think your modification actually fixes the problem, it just makes the compiler happy for no good reason. Right now, the only real solution I can think of would be to state clearly in the documentation that we require stateless function objects (in line with the MPI requirements) and use a fresh instance of the op object in perform:
But that's a change that would break code that just happens to work so far.
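The "fresh instance in perform" idea can be sketched without MPI. This is only an illustration under stated assumptions: `user_function` is a hypothetical stand-in for the MPI callback type (the real `MPI_User_function` also receives an `MPI_Datatype*`), and `stateless_user_op` and `reduce_example` are made-up names, not the Boost.MPI classes. The point is that the callback has no user-data argument, so a default-constructed, stateless `Op` is the only thing it can safely use.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical stand-in for the MPI callback type. Like the real one, it
// has no user-data argument through which per-call state could be passed.
using user_function = void(void* invec, void* inoutvec, int* len);

// Hypothetical wrapper (not the Boost.MPI class). Op must be
// default-constructible and stateless.
template <typename T, typename Op>
struct stateless_user_op {
    static void perform(void* vin, void* vinout, int* len) {
        T* in = static_cast<T*>(vin);
        T* inout = static_cast<T*>(vinout);
        Op op;  // fresh instance per call: no shared pointer, no shared state
        // inout[i] = op(in[i], inout[i]) for every element
        std::transform(in, in + *len, inout, inout, op);
    }
};

// Calls the callback the way a C library would: through a plain function pointer.
inline std::vector<int> reduce_example() {
    std::vector<int> in{1, 2, 3};
    std::vector<int> inout{10, 20, 30};
    int len = static_cast<int>(in.size());
    user_function* fn = &stateless_user_op<int, std::plus<int>>::perform;
    fn(in.data(), inout.data(), &len);
    return inout;
}
```

Because nothing is stored between calls, two threads running `perform` for the same `T` and `Op` cannot interfere with each other. The cost is the one noted above: an operation that carries state would silently lose it.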
@aminiussi, thanks for the explanation. I have to admit that I only briefly looked at the Boost code and noticed I needed to add these

Stateless operations work correctly, but previously produced the warnings that my pull request silences. Stateful operations never really worked if I understand you correctly, so that would need to be documented. If the change you suggest to enforce statelessness can't be made without breaking code, then I suppose documenting the lack of support for stateful operations is all we can do. We should still suppress these warnings for stateless operations though.
Documentation change in d7b35ef
We still have the variable lifetime issue though that this pull request fixes; right now we are still calling member functions on a dangling pointer. This happens to work because the function doesn't access member variables, but is probably undefined behavior. I suppose we could do
Moving discussion to #68
Found with Clang-Tidy in Boost 1.58.
user_op takes a reference to op, and all_reduce_impl/reduce_impl/scan_impl take it by value, so this seems to be the only place where this can be fixed.
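The lifetime rule behind this can be sketched as follows. It is a minimal illustration, not the Boost.MPI code: `op_holder` and `combine_impl` are hypothetical names. The assumption is that the wrapper keeps a static pointer to the operation, so the pointed-to object must outlive every use of the wrapper. Taking the operation by reference and pointing at the caller's by-value parameter satisfies that.

```cpp
#include <cassert>
#include <functional>

// Hypothetical wrapper illustrating the lifetime rule (not the Boost.MPI class).
template <typename T, typename Op>
class op_holder {
public:
    // Taking Op& rather than Op by value means op_ptr points at the
    // caller's object, which outlives this constructor call.
    explicit op_holder(Op& op) { op_ptr = &op; }
    ~op_holder() { op_ptr = nullptr; }  // do not leave a stale pointer behind
    op_holder(const op_holder&) = delete;
    op_holder& operator=(const op_holder&) = delete;

    static T apply(T a, T b) { return (*op_ptr)(a, b); }
    static bool active() { return op_ptr != nullptr; }

private:
    static Op* op_ptr;
};

template <typename T, typename Op>
Op* op_holder<T, Op>::op_ptr = nullptr;

// Mirrors the by-value parameter of the *_impl functions: `op` lives until
// this function returns, so the holder is valid for the whole call.
template <typename T, typename Op>
T combine_impl(T a, T b, Op op) {
    op_holder<T, Op> holder(op);
    return op_holder<T, Op>::apply(a, b);
}
```

This only addresses the dangling pointer. The static pointer is still shared by every holder with the same `T` and `Op`, so the multi-threading concern raised earlier in the thread is not solved by it.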