Added local_shapes as a new parameter to DistributedArray #61
Conversation
Good stuff!
This PR looks very good, and I am happy to see how local_shapes can help streamline various parts of the library.
I have left a few minor comments, and only one major one: what is the reasoning for passing DistributedArray a list of tuples containing the local shapes for all ranks, instead of passing only that of the current rank?
In other words, does it make sense for users to always have to write this pattern:
local_shape = local_split(global_shape, MPI.COMM_WORLD, Partition.SCATTER, 0)
local_shapes = MPI.COMM_WORLD.allgather(local_shape)[::-1]
arr = pylops_mpi.DistributedArray(global_shape=global_shape, local_shapes=local_shapes, axis=0)
instead of
local_shape = local_split(global_shape, MPI.COMM_WORLD, Partition.SCATTER, 0)
arr = pylops_mpi.DistributedArray(global_shape=global_shape, local_shape=local_shape, axis=0)
and have the gathering part inside the init method?
I am sure you have probably thought about this, but at first sight I cannot see why the latter would not work 🤔
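The reviewer's suggestion can be sketched in a few lines. This is a hypothetical simplification, not the real pylops_mpi class: FakeComm is a single-process stand-in for MPI.COMM_WORLD (a real run would use mpi4py's allgather across ranks), and DistributedArraySketch only illustrates moving the gather into the constructor.

```python
class FakeComm:
    """Minimal stand-in for MPI.COMM_WORLD; the per-rank values are
    pre-collected in memory instead of exchanged between processes."""
    def __init__(self, world):
        self.world = world

    def allgather(self, value):
        # Real MPI would exchange `value` across all ranks; here we
        # simply return the pre-collected list.
        return self.world


class DistributedArraySketch:
    """Hypothetical sketch: accept only the current rank's local_shape
    and gather the full list of shapes inside __init__."""
    def __init__(self, global_shape, local_shape, base_comm):
        self.global_shape = global_shape
        # The gathering happens here, so the caller never needs to
        # allgather the shapes before constructing the array.
        self.local_shapes = base_comm.allgather(local_shape)


comm = FakeComm(world=[(34,), (33,), (33,)])
arr = DistributedArraySketch(global_shape=(100,), local_shape=(34,), base_comm=comm)
print(arr.local_shapes)  # [(34,), (33,), (33,)]
```

With this pattern, the user-side code shrinks to a single constructor call per rank, at the cost of one collective communication inside `__init__`.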
raise ValueError(f"Dimension mismatch: x shape-{x.local_shape} does not match operator shape "
                 f"{self.localop_shape}; {x.local_shape[0]} != {self.mops} (dim1) at rank={self.rank}")
y = DistributedArray(global_shape=self.shape[0], dtype=x.dtype)
local_shapes = self.base_comm.allgather((self.nops, ))
Can this not be moved into the init method? It seems to me this can be done once: local_shapes can be stored as a member of the class and reused every time matvec is called (same for rmatvec). I would use local_shapes_n and local_shapes_m to distinguish between the two :)
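A minimal sketch of this caching suggestion, with illustrative names only (CachedOp is hypothetical, and StubComm stands in for the MPI communicator): the allgather runs once in the constructor rather than on every apply.

```python
class StubComm:
    """Single-process stand-in for an MPI communicator."""
    def __init__(self, world):
        self.world = world

    def allgather(self, value):
        return self.world


class CachedOp:
    """Hypothetical operator sketch: gather the per-rank shapes once in
    __init__ and reuse them in matvec/rmatvec."""
    def __init__(self, nops, mops, base_comm):
        self.nops, self.mops = nops, mops
        # Done once here, not on every matvec/rmatvec call.
        self.local_shapes_n = base_comm.allgather((nops,))
        self.local_shapes_m = base_comm.allgather((mops,))

    def matvec_shapes(self):
        # Reuses the cached list; no communication on the hot path.
        return self.local_shapes_n


comm = StubComm(world=[(30,), (40,), (30,)])
op = CachedOp(nops=30, mops=30, base_comm=comm)
print(op.matvec_shapes())  # [(30,), (40,), (30,)]
```

The design benefit is that collective communication, which is relatively expensive, moves out of the per-call path of matvec and rmatvec.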
Done in BlockDiag as well as VStack.py.
Hi @mrava87, let's take the example of global_shape=100 with 3 processes, where the user wants to split it into (30,), (40,) and (30,).
When it comes to using the
That is why I preferred the
Oh, good point. I was mostly thinking about the case where a user would use local_split, but your point is valid: sometimes users may want to choose custom splits themselves, and the code with if statements would be much less clean and elegant than providing all local shapes as you did. So nothing to be changed here :)
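The custom-split case discussed above can be sketched without MPI. Both helpers below are hypothetical, not part of pylops_mpi: default_split mimics the spirit of a near-even local_split along one axis, and check_local_shapes verifies that user-chosen splits such as (30,), (40,), (30,) are consistent with global_shape=100.

```python
def default_split(global_size, nranks):
    """Near-even 1D split, similar in spirit to local_split: the first
    `extra` ranks get one extra element each."""
    base, extra = divmod(global_size, nranks)
    return [(base + (1 if r < extra else 0),) for r in range(nranks)]


def check_local_shapes(global_size, local_shapes):
    """Custom splits are valid as long as their sizes sum to the
    global size."""
    return sum(s[0] for s in local_shapes) == global_size


print(default_split(100, 3))                           # [(34,), (33,), (33,)]
print(check_local_shapes(100, [(30,), (40,), (30,)]))  # True
```

This shows why the list-of-shapes API matters: the default split would give (34,), (33,), (33,), and there is no way to express the user's (30,), (40,), (30,) choice from a single rule without per-rank if statements.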
Great.
Closes #59
- Implemented the local_shapes parameter and changed all instances of DistributedArray to handle local_shapes. I see that local_shapes is a better option, which makes local_shape=local_shapes[rank] when provided, otherwise the default is used.
- In DistributedArray.py (to_dist method): ravel, as it is now being handled by local_shapes.
- In FirstDerivative.py, SecondDerivative.py, cls_basic.py and plotting.py.
- In decorators.py: updated the reshaped method by adding forward and stacking to handle reshaping/redistributing of stacking operators. (I did it like that because the code for them and the derivatives was quite similar; if another decorator needs to be made to keep them separate, let me know.)
- In VStack.py and BlockDiag.py.
- Tests and example.