Allow using dynamic reference to communicator #157
Thanks for this, and thanks for your patience (I'm finally into summer mode and have more capacity to do Rust things). Does this need to be
I made this a draft because, while it does work, there are things left that do not (as outlined in the initial post). If #155 is merged, I have another commit that changes the If you are happy to merge this in its incomplete state to at least partially support collective communication on trait objects, then it doesn't need to be a draft.
This is green and untouched. Should I add a call to a collective function to it to make it more adequate?
Sorry about the delays on my end. Would you like to pick this up now that #155 has merged?
I will gladly have another look at this in the following weeks.
898b35f to 110b5c4
…to use trait-objects to make them object-safe
- changed constructors for Process type
- changed AnyProcess analogous
- implemented Communicator for Process, such that AsCommunicator can be implemented without violating borrow checker
- changed functions in Communicator trait that return processes to be object-safe
Changes so far:
Progress:
Questions
Obviously, all of these are breaking changes.
This looks good so far. I agree with sealing, though might have just moved the sealed stuff to a different file instead of adding nesting. For #32, we need to be able to instantiate a communicator from a raw As a design question, should Attributes allow callers to attach data to an You've perhaps come across
Done.
This was already solved by #155, because the traits Currently, the implementation asserts that the passed raw handle isn't a system handle (like
I agree that this design is confusing. This confusion is amplified by this Pull Request, because
I am not sure how to make
fn get_attr<A: Attribute>(&self, key: A) -> Option<&<A as Attribute>::Target>
pub trait Attribute: AsRaw<Raw = c_int> {
    type Target;
    /// Allocate memory for an attribute value.
    unsafe fn allocate() -> *mut Self::Target;
}
This way, the allocation inside However,
Hmm, I hadn't thought about it carefully. Could we make a method in
The return type still isn't statically computable, as far as I understand. The function
If we want to use a typed If in addition, we want to work with a Feel free to leave this for a separate PR -- I think I can do it quickly and add a test.
In that case I'd like to defer this to a separate PR. Same with Groups, if you want them to work as trait-objects as well.
Marked as ready for review
Looks good, thanks. Are you okay with squash-merging or do you want to make the history atomic (at least squash the sealing with moving it to a separate file, and the legacy extern syntax into that example)?
I don't really mind. Do it as you want it in the history.
Thanks for your contribution!
impl<'a> AsCommunicator for Process<'a> {
    type Out = Self;

    fn as_communicator(&self) -> &Self::Out {
        self
    }
}
This turns out to be a bug, in that process.as_communicator().rank() now returns the rank of process, not the rank of the current process on the communicator. One can call Communicator::rank(&process), but I think process.as_communicator() should behave like a communicator, not silently use the wrong rank() method.
Do you have thoughts about how to fix it? We could put an actual communicator back into Process? It could be comm: &'a dyn Communicator to avoid making it generic over Communicator or any other trivial wrapper around a CommunicatorHandle. AnyProcess<'a> could be abused for that purpose, but it seems decidedly unclean.
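As a rough illustration of the `comm: &'a dyn Communicator` idea floated above, here is a minimal sketch. Everything in it is a hypothetical stand-in rather than the rsmpi API: `MockComm` replaces a real communicator, and `AsCommunicator` is simplified to return `&dyn Communicator` instead of using an associated `Out` type.

```rust
pub type Rank = i32;

/// Simplified stand-in for the communicator trait (hypothetical).
pub trait Communicator {
    /// Rank of the calling process in this communicator.
    fn rank(&self) -> Rank;
    /// Number of processes in this communicator.
    fn size(&self) -> Rank;
}

/// Simplified stand-in: returns a trait object instead of an associated type.
pub trait AsCommunicator {
    fn as_communicator(&self) -> &dyn Communicator;
}

/// Hypothetical communicator that just stores its answers.
pub struct MockComm {
    pub rank: Rank,
    pub size: Rank,
}

impl Communicator for MockComm {
    fn rank(&self) -> Rank {
        self.rank
    }
    fn size(&self) -> Rank {
        self.size
    }
}

/// A process that borrows an actual communicator as a trait object,
/// so `Process` itself does not need to be generic.
pub struct Process<'a> {
    pub comm: &'a dyn Communicator,
    pub rank: Rank,
}

impl<'a> Process<'a> {
    /// Rank of the designated process.
    pub fn rank(&self) -> Rank {
        self.rank
    }
}

impl<'a> AsCommunicator for Process<'a> {
    fn as_communicator(&self) -> &dyn Communicator {
        // Hand back the real communicator, not the process.
        self.comm
    }
}
```

With this shape, `process.as_communicator().rank()` can only reach the communicator's `rank()`, because the returned value is no longer a `Process`. The cost is a borrowed trait object inside every `Process`.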
I am pretty convinced that the PR didn't change any behavior regarding this.
Previously Process looked like this:
pub struct Process<'a, C>
where
    C: 'a + Communicator,
{
    comm: &'a C,
    rank: Rank,
}
and the AsCommunicator trait was implemented to return comm. Now it returns self, because comm is no longer a Communicator but a CommunicatorHandle. Instead, Process implements Communicator itself, which in turn delegates all calls to the internal CommunicatorHandle. So if this bug exists, I am pretty sure it existed before this PR (I haven't tested this though, so maybe I am missing something).
Not sure what even causes this, because I am not sure how MPI differentiates between a Process handle and a Communicator handle. Is the differentiation only done by choosing different rank functions?
If so, maybe we could change the Communicator trait impl of Process by explicitly changing its rank implementation:
impl<'a> Communicator for Process<'a> {
    // ...
    fn rank(&self) -> Rank {
        Process::rank(self)
    }
}
Thanks for your reply and sorry I'm slow following up. The issue is that impl Process has a method called rank() referring to the rank of the designated process, while the rank() method on the Communicator trait returns the rank of self.
In the old code, process.as_communicator() was just a Communicator (no longer a Process) so there was only one rank() method. After your change here, it returned a Process (which implements Communicator, but prefers the method on Process instead of the trait method).
I resolved the issue in #177, though I'm not entirely happy with the solution. Part of me wants to remove/rename the Process::rank() method. There are also experimental systems in which a rank is not a process, so I'm not wild about the Process name. But these are design questions for another time/place.
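The lookup rule behind this bug is general Rust behavior: with method-call syntax, an inherent method takes priority over a trait method of the same name. The sketch below reproduces it with hypothetical stand-in types (the fields `own_rank` and `designated_rank` and the helper `rank_via_trait` are made up for illustration and are not part of rsmpi).

```rust
pub type Rank = i32;

/// Simplified stand-in for the communicator trait (hypothetical).
pub trait Communicator {
    /// Rank of the calling process in this communicator.
    fn rank(&self) -> Rank;
}

/// Hypothetical process: stores both ranks so the two methods can be told apart.
pub struct Process {
    pub own_rank: Rank,
    pub designated_rank: Rank,
}

impl Process {
    /// Inherent method: rank of the designated process.
    pub fn rank(&self) -> Rank {
        self.designated_rank
    }
}

impl Communicator for Process {
    /// Trait method: rank of the calling process.
    fn rank(&self) -> Rank {
        self.own_rank
    }
}

/// Generic code only knows about the trait, so it calls the trait method.
pub fn rank_via_trait<C: Communicator + ?Sized>(comm: &C) -> Rank {
    comm.rank()
}
```

Calling `p.rank()` on a `Process` picks the inherent method, while `Communicator::rank(&p)` and `rank_via_trait(&p)` pick the trait method. That is why returning `self` from `as_communicator()` silently changes which `rank()` a caller gets.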
Oh, I see the problem now. Would it be confusing for users if the rank methods on both traits simply had different names (e.g. communicator_rank() and process_rank())? That way, it'd be clear which rank is called at any given point.
Yeah, though comm.comm_rank() and process.process_rank() are somewhat tedious and I'm not sure it's better than the current use of AnyProcess, at least until we get around to making Rank stronger (e.g., a newtype instead of an alias to c_int). It's necessary to have a type that can be put in an array and sorted, though it would be nice for it to retain a branding associating it with the communicator on which it's valid.
I think the current Process is pure ergonomics and I have mixed feelings about its existence -- many/most real-world patterns will be calling comm.process_at_rank() and using it immediately in exactly one point-to-point operation, but then we need run-time assertions like you'll see in send_receive_with_tags to check that source and destination communicators match. In a sense, it is more error-prone (weaker compile-time guarantees) to use source_process, dest_process than comm, source_rank, dest_rank.
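One possible shape for a stronger `Rank` is sketched below, purely as an assumption about what "a newtype with branding" could look like. `Comm`, `rank_at`, and `value` are hypothetical names. Note the limits of this approach: a lifetime brand only ties a rank to the borrow of its communicator, so it stops a rank from outliving the communicator but does not distinguish two communicators that are alive at the same time. That case would still need a run-time check or a stronger branding scheme.

```rust
use std::marker::PhantomData;

/// Hypothetical communicator that only knows its size.
pub struct Comm {
    size: i32,
}

/// Newtype rank: sortable and copyable, and tied to the lifetime of the
/// communicator borrow that produced it.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
pub struct Rank<'c> {
    value: i32,
    _comm: PhantomData<&'c Comm>,
}

impl Comm {
    pub fn new(size: i32) -> Self {
        Comm { size }
    }

    /// Validate the raw value once, at construction time.
    pub fn rank_at(&self, value: i32) -> Option<Rank<'_>> {
        if (0..self.size).contains(&value) {
            Some(Rank { value, _comm: PhantomData })
        } else {
            None
        }
    }
}

impl Rank<'_> {
    /// Raw integer value, e.g. for passing to a C API.
    pub fn value(&self) -> i32 {
        self.value
    }
}
```

Because the derive only compares `value` (the `PhantomData` field is always equal), such ranks can be stored in a `Vec` and sorted like plain integers.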
This should fix #148.
It is independent of #155; it should work (in its current unfinished state) without change if #155 is merged.
process_at_rank is currently not callable on dyn Communicator for reasons outlined in #148. I have an experimental commit in here that fixes this (building upon #155).
The same is true for split_by_subgroup_collective, and probably a few others that require Sized, for the same reason. I haven't gotten to that yet.
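The `Sized` restriction mentioned above can be shown in isolation. The sketch below uses hypothetical stand-ins (`CommunicatorHandle`, `World`, `GenericProcess`, and the method name `process_at_rank_generic` are made up for this example). A method whose return type mentions `Self` needs a `where Self: Sized` bound to keep the trait usable as `dyn Communicator`, and that bound makes the method unavailable on the trait object. Returning a non-generic `Process<'_>` that stores a copied handle avoids the problem.

```rust
use std::marker::PhantomData;

pub type Rank = i32;

/// Hypothetical copyable raw handle.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct CommunicatorHandle(pub u32);

/// Old shape: generic over the concrete communicator type.
pub struct GenericProcess<'a, C: Communicator> {
    pub comm: &'a C,
    pub rank: Rank,
}

/// New shape: stores a handle, so no type parameter is needed.
pub struct Process<'a> {
    pub comm: CommunicatorHandle,
    pub rank: Rank,
    _marker: PhantomData<&'a ()>,
}

pub trait Communicator {
    fn handle(&self) -> CommunicatorHandle;
    fn size(&self) -> Rank;

    /// Mentions `Self` in the return type, so it must be restricted to
    /// `Self: Sized` and cannot be called on `dyn Communicator`.
    fn process_at_rank_generic(&self, rank: Rank) -> GenericProcess<'_, Self>
    where
        Self: Sized,
    {
        assert!((0..self.size()).contains(&rank));
        GenericProcess { comm: self, rank }
    }

    /// No `Self` in the signature: callable through a trait object.
    fn process_at_rank(&self, rank: Rank) -> Process<'_> {
        assert!((0..self.size()).contains(&rank));
        Process { comm: self.handle(), rank, _marker: PhantomData }
    }
}

/// Hypothetical communicator with four processes.
pub struct World;

impl Communicator for World {
    fn handle(&self) -> CommunicatorHandle {
        CommunicatorHandle(0)
    }
    fn size(&self) -> Rank {
        4
    }
}
```

Given `let comm: &dyn Communicator = &World;`, the call `comm.process_at_rank(2)` compiles, whereas `comm.process_at_rank_generic(2)` is rejected by the compiler because `dyn Communicator` is not `Sized`.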