[MPS] Add lu_factor #99269
Conversation
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/99269
Note: Links to docs will display an error until the docs builds have been completed.
❌ 1 New Failure, 1 Unrelated Failure as of commit 006ea34 with merge base 6e43897.
NEW FAILURE - The following job has failed:
FLAKY - The following job failed but was likely due to flakiness present on trunk:
This comment was automatically generated by Dr. CI and updates every 15 minutes.
Left a few comments.
I'll let the MPS guys review the main implementation though.
```yaml
  variants: function
  dispatch:
    CompositeImplicitAutograd: linalg_lu_factor_out
    MPS: linalg_lu_factor_out_mps
```
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you implement linalg_lu_factor_ex rather than this one? That way you wouldn't need to add any new backward rule, and all the other goodies that are implemented for linalg.lu_factor will also extend to MPS.
The reason I implemented linalg_lu_factor rather than linalg_lu_factor_ex is that the _ex variant has to return an info tensor, which is not applicable to MPS (on CPU, for example, the info tensor is computed by LAPACK).
I guess I cannot directly return an undefined Tensor() as the info, since there is some post-check logic applied to the info tensor.
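One option I could try (just a sketch, assuming the MPS kernel's per-matrix status can be mapped onto LAPACK-style info codes) is to allocate a concrete, zero-initialized info tensor instead of an undefined one, so the post-checks have something valid to look at:

```cpp
// Hypothetical sketch: satisfy the linalg_lu_factor_ex contract by returning
// a real info tensor. Zero means "no error" in the LAPACK convention, so the
// post-check logic passes whenever the MPS kernel succeeds.
// `A` and `batchSize` are assumed to be in scope as in this PR.
Tensor info = at::zeros({batchSize}, A.options().dtype(at::kInt));
// If the MPS decomposition reports a failure for matrix i, a nonzero code
// could be written into info[i] before returning it.
```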
```cpp
  std::vector<Tensor> status_tensors;
  std::vector<Tensor> pivots_list;

  status_tensors.reserve(batchSize);
  pivots_list.reserve(batchSize);
  for (C10_UNUSED const auto i : c10::irange(batchSize)) {
    status_tensors.push_back(at::zeros(1, kInt, c10::nullopt, kMPS, c10::nullopt));
    pivots_list.push_back(at::zeros(numPivots, kInt, c10::nullopt, kMPS, c10::nullopt));
  }
```
Why not just a single tensor with batchSize elements in the case of status_tensors? Same for pivots_list.
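Something along these lines (a sketch only, reusing the names from the snippet above):

```cpp
// Sketch: single batched allocations instead of one tensor per matrix.
// `batchSize` and `numPivots` are the same variables used in the loop above.
Tensor status = at::zeros({batchSize},
                          at::TensorOptions().dtype(at::kInt).device(at::kMPS));
Tensor pivots = at::zeros({batchSize, numPivots},
                          at::TensorOptions().dtype(at::kInt).device(at::kMPS));
```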
For pivots_list:
For a reason I don't know (the MPS kernel is closed source), if we use the same MTLBuffer (the underlying storage of an MPS tensor) with an offset for each matrix's pivots, the resulting pivot values are incorrect. This is probably because, per the docs, MPSMatrixDecompositionLU operates in-place when the result matrix completely aliases the source matrix.
For status_tensors:
For each system, the kernel requires the status to be encoded as an MTLBuffer input, and there is no option for specifying an offset into that buffer. So the way I came up with was splitting the status into multiple tensors, each with its own MTLBuffer. Maybe there is a better approach.
I do not know enough about MPS, but perhaps @kulinseth can comment on what's the best way to do this. It'd be good if pivots could be a Tensor, that way we wouldn't need to first create a vector and then copy it into the output tensor.
For status_tensors, I don't know why we can't simply return the status tensors, the same as we do for the pivots. This would allow us to implement the _ex variant, which is the expected way of implementing this function.
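To be concrete, once the _ex variant exists, linalg.lu_factor itself can stay composite. A rough sketch of that pattern (not the exact upstream code):

```cpp
// Sketch: lu_factor implemented on top of lu_factor_ex, so a backend only
// needs to provide the _ex kernel. Error checking happens on the info tensor.
std::tuple<Tensor, Tensor> linalg_lu_factor(const Tensor& A, bool pivot) {
  auto [LU, pivots, info] = at::linalg_lu_factor_ex(A, pivot, /*check_errors=*/false);
  TORCH_CHECK((info == 0).all().item<bool>(),
              "linalg.lu_factor: the factorization could not be completed.");
  return std::make_tuple(std::move(LU), std::move(pivots));
}
```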
@qqaatw, you can use the aliasing strategy MPSAliasingStrategyShallNotAlias and provide an offset using the arrayView.
So start with using MPSNDArray to create the object:

```objc
arrayView = [ndArray arrayViewWithCommandBuffer:commandBuffer
                                     descriptor:desc
                                       aliasing:MPSAliasingStrategyShallNotAlias];
```

And then convert the MPSNDArray to an MPSMatrix to be passed to LU solve:

```objc
[[MPSMatrix alloc] initWithBuffer:[ndArray buffer]
                           offset:offset
                       descriptor:matDesc];
```
Sorry, should it be initialized with [ndArray buffer] or [arrayView buffer]? I guess the latter is what you were suggesting.
With the pivot matrix initialized from [arrayView buffer] using MPSAliasingStrategyShallNotAlias, the pivots output remains the same as the pivots before the LU decomposition, which makes it look like the array view is not writable under this strategy. On the other hand, if I specify MPSAliasingStrategyShallAlias, the output is correct for unbatched inputs.
I also tried initializing with [ndArray buffer]; the outputs were incorrect when the inputs were batched.
@kulinseth I can provide the code if you need repro.
https://gist.github.com/qqaatw/3b3cb633c60fcd6abab3fc5f0e468b88#file-repro-mm
```cpp
void linalg_lu_factor_out_mps_impl(const Tensor& A, bool pivot, Tensor& LU, Tensor& pivots) {
  using namespace mps;

  TORCH_CHECK(pivot, "linalg_lu_factor(): MPS doesn't allow pivot == False.");
```
Nit: the error message should use the public name, linalg.lu_factor.
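i.e., presumably something like:

```cpp
TORCH_CHECK(pivot, "linalg.lu_factor(): MPS doesn't allow pivot == False.");
```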
I agree with @lezcano's earlier comments.
There is a minor correctness issue. That being said, the PR looks mostly right. @kulinseth already approved, so feel free to merge after either fixing the issue properly or simply reverting to the previous flatten (not entirely efficient, but at least correct).
```cpp
    return;
  }

  Tensor A_ = A_t.dim() > 3 ? A_t.view({-1, A_t.size(-2), A_t.size(-1)}) : A_t;
```
This will fail if the view cannot be performed. My point is that you do not need to iterate the strided dimension in contiguous order, nor do you need it to be contiguous. You just need to iterate over every matrix inside it, so you never really need to copy.
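For reference, the simplest correct (if not maximally efficient) fix would be along these lines, a sketch matching the "revert to the previous flatten" option mentioned above:

```cpp
// reshape() returns a view when the strides allow it and silently copies
// otherwise, so it never throws the way view() does on non-viewable inputs.
Tensor A_ = A_t.dim() > 3 ? A_t.reshape({-1, A_t.size(-2), A_t.size(-1)}) : A_t;
```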
@pytorchbot merge
Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
Merge failed. Reason: 1 job has failed, first few of them are: trunk / linux-focal-cuda12.4-py3.10-gcc9-sm86 / test (default, 2, 5, linux.g5.4xlarge.nvidia.gpu). Details for Dev Infra team: raised by workflow job.
@pytorchbot merge -ic
@pytorchbot merge -i
Merge started. Your change will be merged while ignoring the following 2 checks: trunk / linux-focal-cuda12.4-py3.10-gcc9-sm86 / test (default, 2, 5, linux.g5.4xlarge.nvidia.gpu), pull / linux-jammy-py3.8-gcc11 / test (distributed, 2, 2, linux.2xlarge). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
…u_factor (#165871)
Fixes #165870. Follow-up from #165254. This PR [a] removes the MPS-specific version of `lu_factor` in favor of the version in BatchedLinearAlgebra.cpp which uses `lu_factor_ex`, and [b] updates `lu_factor_ex` error codes to match expectations. When `lu_factor` was first implemented for MPS (#99269), it bypassed the implementation in BatchedLinearAlgebra.cpp since we did not have `lu_factor_ex`. Since #144651 implements `lu_factor_ex`, we can now remove the MPS-specific wrapper.
Pull Request resolved: #165871
Approved by: https://github.com/kulinseth, https://github.com/albanD
Stack from ghstack (oldest at bottom):
🤖 Generated by Copilot at d75cde1
Added MPS support and autograd formulas for LU factorization of tensors. Implemented the `linalg_lu_factor` and `linalg_lu_factor.out` functions for the MPS backend in `LinearAlgebra.mm` and added tests in `test_mps.py`. Added the corresponding dispatch entries in `native_functions.yaml` and the backward and forward formulas in `derivatives.yaml`.
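For reference, a hypothetical usage sketch of the new op through the ATen C++ API (shapes and device placement chosen arbitrarily; not code from this PR):

```cpp
#include <ATen/ATen.h>

int main() {
  // A batch of four 5x5 matrices on the MPS device.
  at::Tensor A = at::randn({4, 5, 5},
                           at::TensorOptions().dtype(at::kFloat).device(at::kMPS));
  // Returns the packed LU factors and the pivot indices.
  auto [LU, pivots] = at::linalg_lu_factor(A, /*pivot=*/true);
  return 0;
}
```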