
Added local_shapes as a new parameter to DistributedArray #61

Merged: 6 commits from local_shapes into main, Aug 13, 2023

Conversation

@rohanbabbar04 (Collaborator) commented Aug 11, 2023

Closes #59
Implemented the local_shapes parameter and changed all instances of DistributedArray to handle it.
I see local_shapes as the better option: when it is provided, each rank uses local_shape = local_shapes[rank]; otherwise the default split is used.

# Example
from pylops_mpi import DistributedArray

arr = DistributedArray(global_shape=(100, ), local_shapes=[(30, ), (40, ), (30, )])
print(arr.rank, arr)

# Output
0 <DistributedArray with global shape=(100,), local shape=(30,), dtype=<class 'numpy.float64'>, processes=[0, 1, 2])> 
1 <DistributedArray with global shape=(100,), local shape=(40,), dtype=<class 'numpy.float64'>, processes=[0, 1, 2])> 
2 <DistributedArray with global shape=(100,), local shape=(30,), dtype=<class 'numpy.float64'>, processes=[0, 1, 2])> 

In DistributedArray.py

  • Added local_shapes as a List[Tuple] parameter
  • Added local_shapes to the to_dist method
  • Added a check that the local_shapes align with the global shape
  • Changed all instances to use local_shape
  • Removed send/recv from ravel, as this is now handled by local_shapes
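The alignment check in the bullet above could look roughly like the following; the function name and signature are hypothetical sketches, not the actual pylops_mpi implementation:

```python
# Hypothetical sketch of the local_shapes-vs-global_shape check; names are
# illustrative, not the actual pylops_mpi code.
def check_local_shapes(local_shapes, global_shape, axis=0):
    """Raise if the given local shapes do not tile global_shape along axis."""
    for ls in local_shapes:
        if len(ls) != len(global_shape):
            raise ValueError(f"local shape {ls} has wrong number of dimensions")
        for d, (loc, glob) in enumerate(zip(ls, global_shape)):
            # every dimension except the split axis must match the global shape
            if d != axis and loc != glob:
                raise ValueError(f"local shape {ls} mismatches {global_shape} at dim {d}")
    # sizes along the split axis must add up to the global size
    if sum(ls[axis] for ls in local_shapes) != global_shape[axis]:
        raise ValueError(f"local shapes {local_shapes} do not sum to {global_shape}")

check_local_shapes([(30,), (40,), (30,)], (100,))  # passes silently
```

Running this once in __init__ means an inconsistent split fails immediately on construction, on every rank, rather than surfacing later as a communication error.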

In FirstDerivative.py, SecondDerivative.py, cls_basic.py and plotting.py

  • Minor: added local_shapes as a parameter to DistributedArray

In decorators.py

  • Updated the reshaped method by adding forward and stacking to handle reshaping/redistributing for stacking operators (I did it that way because the code for them and for the derivatives was quite similar; if a separate decorator should be made to keep them apart, let me know)
  • Also made the code a little cleaner and easier to understand

In VStack.py and BlockDiag.py

  • We now give the option to handle this under the hood, and added local_shapes to y.
  • Removed the ValueError, as that can now be handled by local_shapes and the decorator.

Tests and example

  • Updated the Laplacian tests to check that it works when the size is not evenly divisible
  • Added a test for local_shapes
  • Added an example for local_shapes (since we are working with different ranks, hard-coding raw numbers wasn't possible, so the list of local_split shapes is simply reversed)
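The reversal trick in the last bullet can be illustrated in isolation; the shapes below are hypothetical stand-ins for what local_split might return:

```python
# Illustrative sketch of the example's trick: reversing the default
# per-rank shapes yields a valid custom split (same total size), so the
# reversed list can be passed as local_shapes. Shapes are hypothetical.
default_shapes = [(34,), (33,), (33,)]   # e.g. a default split of 100 over 3 ranks
custom_shapes = default_shapes[::-1]     # reversed list, still sums to 100

assert sum(s[0] for s in custom_shapes) == sum(s[0] for s in default_shapes)
print(custom_shapes)  # [(33,), (33,), (34,)]
```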

@mrava87 (Contributor) left a comment

Good stuff!

This PR looks very good, and I am happy to see how local_shapes helps streamline various parts of the library.

I have left a few minor comments and only one major one: what is the reasoning for passing DistributedArray a list of tuples containing the local shapes of all ranks, instead of passing only that of the current rank?

In other words, does it make sense to always have this code pattern on the user side:

local_shape = local_split(global_shape, MPI.COMM_WORLD, Partition.SCATTER, 0)
local_shapes = MPI.COMM_WORLD.allgather(local_shape)[::-1]
arr = pylops_mpi.DistributedArray(global_shape=global_shape, local_shapes=local_shapes, axis=0)

instead of

local_shape = local_split(global_shape, MPI.COMM_WORLD, Partition.SCATTER, 0)
arr = pylops_mpi.DistributedArray(global_shape=global_shape, local_shape=local_shape, axis=0)

and have the gathering part inside the init method?

I am sure you have probably thought about this, but at first sight I cannot find a reason why the latter would not work 🤔

Resolved review threads: examples/plot_distributed_array.py, pylops_mpi/DistributedArray.py
raise ValueError(f"Dimension mismatch: x shape-{x.local_shape} does not match operator shape "
f"{self.localop_shape}; {x.local_shape[0]} != {self.mops} (dim1) at rank={self.rank}")
y = DistributedArray(global_shape=self.shape[0], dtype=x.dtype)
local_shapes = self.base_comm.allgather((self.nops, ))
@mrava87 (Contributor) commented:

Can this not be moved into the init method? It seems this can be done once: local_shapes can be stored as a member of the class and used every time matvec is called (same for rmatvec). I would use local_shapes_n and local_shapes_m to distinguish between the two :)

@rohanbabbar04 (Collaborator, Author) replied:

Done in BlockDiag as well as VStack.py

Resolved review thread: tests/test_distributedarray.py
@rohanbabbar04 (Collaborator, Author) commented Aug 12, 2023

Hi @mrava87, let's take the example of global_shape=100 with 3 processes, where the user wants to split it into (30,), (40,) and (30,).
Using local_shapes:

arr = DistributedArray(global_shape=100, local_shapes=[(30,), (40,), (30,)])

When it comes to using local_shape, here is what I see (correct me if I am wrong):

if rank == 0:
   arr = DistributedArray(global_shape=100, local_shape=(30, ))
elif rank == 1:
   arr = DistributedArray(global_shape=100, local_shape=(40, ))
elif rank == 2:
   arr = DistributedArray(global_shape=100, local_shape=(30, ))

That is why I preferred the local_shapes parameter. It also means users can write out the local_shapes themselves, which lets us check up front that all the local_shapes match the global shape.
What do you think?
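The preference above amounts to resolving the per-rank shape inside the constructor; a minimal sketch, assuming a hypothetical helper (1-D case only, names illustrative, not the pylops_mpi API):

```python
# Hypothetical sketch of how __init__ could resolve this rank's shape from
# the full local_shapes list, falling back to a default even split when no
# custom split is given (1-D case only).
def pick_local_shape(global_shape, rank, size, local_shapes=None):
    if local_shapes is not None:
        # custom split: each rank simply indexes its own entry
        return local_shapes[rank]
    base, extra = divmod(global_shape[0], size)
    # default split: the first `extra` ranks get one extra element
    return (base + 1,) if rank < extra else (base,)

print([pick_local_shape((100,), r, 3, [(30,), (40,), (30,)]) for r in range(3)])
# [(30,), (40,), (30,)]
print([pick_local_shape((100,), r, 3) for r in range(3)])
# [(34,), (33,), (33,)]
```

With this pattern, the rank-specific if/elif ladder disappears: every rank runs the same line of user code and picks out its own shape.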

@mrava87 (Contributor) commented Aug 12, 2023

Oh, good point. I was thinking more of the case where a user would use local_split, but your point is valid: sometimes users may want to choose custom splits themselves, and the code with if statements is much less clean and elegant than providing all local shapes as you did. So nothing to be changed here :)

@rohanbabbar04 (Collaborator, Author) commented Aug 12, 2023

> Oh, good point. I was thinking more of the case where a user would use local_split, but your point is valid: sometimes users may want to choose custom splits themselves, and the code with if statements is much less clean and elegant than providing all local shapes as you did. So nothing to be changed here :)

Great!
I will make all the necessary changes you asked for to make local_shapes work, and commit soon.

@rohanbabbar04 rohanbabbar04 merged commit 6cae759 into main Aug 13, 2023
15 checks passed
@rohanbabbar04 rohanbabbar04 deleted the local_shapes branch August 14, 2023 14:36
Merging this pull request closed: Add local_shape to DistributedArray (#59)