Add MPIBlockDiag and MPILinearOperator #26
Conversation
@rohanbabbar04, thanks for this PR. So far I struggle a bit to understand the rationale behind this operator. I had a few comments in the closed PR that I put here again, hopefully your answers (and example!) will help clarify my doubts.
- [Taken from closed PR] Looking at your example, I am not sure I understand how you envision/designed MPIBlockDiag… my idea is that if you have a block-diagonal operator with N operators (N being the number of ranks), each rank will build one pylops operator and pass it to the MPIBlockDiag, so that once matvec/rmatvec is called each rank will compute its own portion… so if you were to compare your new implementation with an equivalent pylops one, the pylops operator would have to take all of those operators and give the same result… For simplicity take this example: you have 3 ranks and 3 5x5 matrices which you wrap into 3 MatrixMult operators. The first is filled with 1s, the second with 2s, the third with 3s… now for pylops-MPI each rank makes one MatrixMult operator and passes it to MPIBlockDiag… the x is also distributed and can be just full of 1s for simplicity (see the sketch after this list). Unless I don't understand your example, it doesn't seem to be built like this; rather each rank seems to make all of the ops…
- Based on the above, it seems like your `x` and `y` are not `DistributedArray`, instead just numpy arrays. Why? Again, I would think we should have distributed arrays passed to MPI operators, so that internally it is easy to get the portion needed.
- In the PR you closed you had created a `LinearOperator` class and subclassed `MPIBlockDiag`. Was there any reason/benefit from doing that, or right now you do not see any use wrt `MPIBlockDiag`?
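A minimal sketch of the per-rank pattern described in the comment above, assuming the `MPIBlockDiag(ops=...)` and `DistributedArray(global_shape=...)` signatures that appear later in this review; this is illustrative only, not the PR's actual code, and would be launched with something like `mpiexec -n 3 python example.py`.

```python
import numpy as np
from mpi4py import MPI
from pylops import MatrixMult
from pylops_mpi import DistributedArray, MPIBlockDiag  # names as used in this PR

rank = MPI.COMM_WORLD.Get_rank()
nranks = MPI.COMM_WORLD.Get_size()

# Each rank wraps only its own 5x5 block (rank 0 -> 1s, rank 1 -> 2s, rank 2 -> 3s)
A = np.full((5, 5), rank + 1, dtype=np.float64)
BDiag = MPIBlockDiag(ops=[MatrixMult(A)])

# x is distributed: 5 * nranks elements globally, each rank holding 5 of them
x = DistributedArray(global_shape=5 * nranks, dtype=np.float64)
x[:] = 1.0  # assumed setter acting on the local portion

# matvec applies each rank's block to its local portion; y is again a DistributedArray
y = BDiag.matvec(x)
```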
Force-pushed from 081bfd4 to 2809909.
Added * in pylops_mpi/__init__; updated pylops_mpi/basicoperators/__init__.py
Removed prev version of _get_dtype
Force-pushed from 20fbcbb to e7c967d.
@mrava87 Done with the changes; if the code and tests are good to go, we can merge this.
Good job! Still quite a few comments but I think we are getting closer to having a good MPIBlockDiag operator.
For now I think we can stick to this implementation. Later we can consider also adding what you had originally, allowing users to choose (depending on the scenario) whether they want to create and pass one operator per rank, or all of them from all ranks. The second may be more flexible for complex scenarios, but I suspect the former will be the most used one, as in most cases the operators will be the same for all ranks, just filled with different 'content' (or just doing the same thing to different parts of the model and data vectors).
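For reference, a sketch of the single-process pylops equivalent that the distributed result should match (not part of the PR; block sizes follow the 3-rank example discussed above):

```python
import numpy as np
from pylops import BlockDiag, MatrixMult

# All three 5x5 blocks built on a single process; applying this operator to a
# vector of ones should match the concatenated per-rank outputs of MPIBlockDiag.
ops = [MatrixMult(np.full((5, 5), k, dtype=np.float64)) for k in (1, 2, 3)]
BDiag = BlockDiag(ops)
y = BDiag @ np.ones(15)
```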
Hi @mrava87,
This looks great!
I made just a few comments on the LinearOperator. Most should be easy to handle, but there are one or two where I cannot understand some of your code choices; it would be good to discuss/agree on those before moving ahead.
After that, for me this is ready to be merged :)
pylops_mpi/LinearOperator.py (outdated)

```python
def matvec(self, x: DistributedArray) -> DistributedArray:
    if self.Op:
        y = DistributedArray(global_shape=self.base_comm.allreduce(self.shape[1]), dtype=x.dtype)
```
Why `self.base_comm.allreduce(self.shape[1])`? You ask the user to pass `shape`, and the doc says "Shape of the MPI Linear Operator", which makes me think it is the global shape. Also, in `MPIBlockDiag` the shape is computed as `shape = (base_comm.allreduce(self.nops), base_comm.allreduce(self.mops))`, which means it is the global shape… so why this?
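To spell out the concern (an assumed illustration, not code from the PR): with the default SUM operation, `allreduce` turns per-rank local sizes into a global size, so applying it to a value that is already global multiplies it by the number of ranks.

```python
from mpi4py import MPI

comm = MPI.COMM_WORLD
nops_local, mops_local = 5, 5  # each rank's local block size, as in the 3-rank example

# How MPIBlockDiag obtains its global shape: sum the local sizes across ranks
global_shape = (comm.allreduce(nops_local), comm.allreduce(mops_local))  # (15, 15) on 3 ranks

# If self.shape were already global, allreduce(self.shape[1]) would instead
# return 15 * comm.Get_size() = 45 on 3 ranks, hence the question above.
if comm.Get_rank() == 0:
    print(global_shape)
```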
pylops_mpi/LinearOperator.py (outdated)

```python
def rmatvec(self, x: DistributedArray) -> DistributedArray:
    if self.Op:
        y = DistributedArray(global_shape=self.base_comm.allreduce(self.shape[1]), dtype=x.dtype)
```
Same as above
Force-pushed from 5d27015 to 7ac4821.
Minor: Remove global shape
Force-pushed from 7ac4821 to b692cba.
Great, I think this is ready to go. We will see if things like `DistributedArray` …
… 1.8 and python to 3.8