Add a note on the stability of linalg functions. #88313
Conversation
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/88313
Note: Links to docs will display an error until the docs builds have been completed.
⏳ No Failures, 1 Pending as of commit b5723ec. This comment was automatically generated by Dr. CI and updates every 15 minutes.
Linear algebra (``torch.linalg``)
---------------------------------

Functions within ``torch.linalg`` are particularly problematic when dealing with extremal values.
"Particularly problematic" is a vague phrase. In general, the handling of extremal values is not necessarily a (numerical) "stability" issue, either.
it is left to the user to check that the inputs they provide to these functions do not contain any
special values.

A similar issue happens with extremal values. Some ``linalg`` functions are not defined for all inputs.
The previous paragraph already started the discussion of extremal values
@@ -51,6 +51,44 @@ datatype. E.g.:

    a.norm() # produces tensor(inf)
    a.double().norm() # produces tensor(1.4142e+20, dtype=torch.float64), representable in fp32
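The same overflow can be reproduced with NumPy, which may be useful as an illustration since the mechanism is identical (squaring ``1e20`` exceeds the float32 maximum of roughly ``3.4e38``); this is a sketch, not part of the PR's diff:

```python
import numpy as np

# Squaring 1e20 gives 1e40, which overflows float32, so the norm is inf;
# the same values are representable once the intermediate runs in float64.
a = np.full(2, 1e20, dtype=np.float32)
print(np.linalg.norm(a))                     # inf
print(np.linalg.norm(a.astype(np.float64)))  # ~1.4142e+20
```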
.. _Linear Algebra Stability:

Linear algebra (``torch.linalg``)
Want to workshop this a little over Slack, @lezcano?
when the inputs have non-finite values like ``inf`` or ``NaN``. As such, neither does PyTorch.
The operations may return a tensor with non-finite values, or raise an exception, or even segfault.

Consider using :func:`torch.isfinite` before calling these functions if your inputs may contain these values.
This seems good, might be nice to elaborate slightly on what the user would do with the results of this call
Consider using :func:`torch.isfinite` before calling these functions if your inputs may contain these values.
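Addressing the reviewer's point about what a user would do with the result of such a check, one common pattern is to fail fast with a clear error before the linalg call. A hypothetical sketch using NumPy (``safe_solve`` is an invented helper name, not a PyTorch API):

```python
import numpy as np

# Hypothetical guard: validate inputs before a linalg call so that
# non-finite values raise a clear error instead of producing undefined results.
def safe_solve(A, b):
    if not (np.isfinite(A).all() and np.isfinite(b).all()):
        raise ValueError("inputs contain inf/NaN; linalg results would be undefined")
    return np.linalg.solve(A, b)

A = np.array([[2.0, 0.0], [0.0, 4.0]])
b = np.array([2.0, 8.0])
print(safe_solve(A, b))  # [1. 2.]
```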

Extremal values
"Ill-conditioned matrices"?
I prefer extremal values because, as you've seen, different operations may have different kinds of extremal values.
ah, right, I will have separate headings
If provided with ill-conditioned inputs, the results of these functions may vary when using the same input on different devices
or when using different backends via the keyword ``driver``.

Similarly, spectral operations like ``svd``, ``eig``, and ``eigh`` may return incorrect results (and their gradients may be infinite)
What about a separate heading: "Matrices with close eigenvalues"?
Do we have a suggestion for what the user can do in this case?
In the end, rather than making these their own heading, I made sure that the first words in the sentence describe the important set of operations that are described in the paragraph. Making these into their own heading did not look great when rendered.
being non-invertible (for example, a matrix that has a very small singular value), then these algorithms may silently return
incorrect results. These matrices are said to be `ill-conditioned <https://nhigham.com/2020/03/19/what-is-a-condition-number/>`_.
If provided with ill-conditioned inputs, the results of these functions may vary when using the same input on different devices
or when using different backends via the keyword ``driver``.
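To make "ill-conditioned" concrete, a short NumPy illustration (not part of the diff): a matrix that is nearly singular has a huge condition number, which signals that ``solve``/``inv`` results cannot be trusted, especially in float32.

```python
import numpy as np

# An almost-singular matrix: its rows differ by only 1e-10, so the
# condition number is enormous and linear solves amplify rounding error.
A = np.array([[1.0, 1.0],
              [1.0, 1.0 + 1e-10]])
print(np.linalg.cond(A))  # on the order of 4e10: severely ill-conditioned
```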
Do we have a suggestion for how users can check for ill-conditioning and what to do if they encounter it?
I don't think there's a good general suggestion for what to do, as that may have to be tackled on a per-case basis (discard that input, regularise it, analyse the model to figure out whether getting those values makes sense at all...)
when their inputs have eigenvalues that are too close to each other as the approximate algorithms used to compute these decompositions
struggle to converge for these inputs.
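One way a user could detect this situation ahead of time is to inspect the eigenvalue gaps; a NumPy sketch (an illustration of the idea, not an API recommendation from the diff):

```python
import numpy as np

# A symmetric matrix with two nearly degenerate eigenvalues: the first
# gap in the sorted spectrum is tiny, a warning sign for eigh gradients.
A = np.diag([1.0, 1.0 + 1e-13, 5.0])
w = np.linalg.eigvalsh(A)   # eigenvalues in ascending order
gaps = np.diff(w)
print(gaps)                 # first gap is ~1e-13, second is ~4
```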

Running the computation in ``float64`` (as NumPy does by default) often helps, but it does not solve the issue in all cases.
New section: "Float32 vs. Float64 precision"?
Running the computation in ``float64`` (as NumPy does by default) often helps, but it does not solve the issue in all cases.
Let's be more precise than "helps." What this is trying to say is that matrices that are effectively "ill-conditioned" w.r.t. an operation in float32 may not be in double, and same for close eigenvalues, right?
When would we recommend users trying running in double to address an issue?
Every time, unless the input matrix is singular in exact precision (e.g. torch.ones(3, 3)).
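The point being discussed (a matrix that is effectively ill-conditioned in float32 may be perfectly workable in float64) can be demonstrated with NumPy and a Hilbert matrix, a classic ill-conditioned example; this is an illustrative sketch, not code from the PR:

```python
import numpy as np

# Solve H x = b for a known x on an ill-conditioned Hilbert matrix.
# cond(H) is ~1e10, so float32 (eps ~1e-7) loses essentially all accuracy,
# while float64 (eps ~2e-16) still recovers x to several digits.
n = 8
H = np.array([[1.0 / (i + j + 1) for j in range(n)] for i in range(n)])
x_true = np.ones(n)
b = H @ x_true

err32 = np.abs(np.linalg.solve(H.astype(np.float32), b.astype(np.float32)) - x_true).max()
err64 = np.abs(np.linalg.solve(H, b) - x_true).max()
print(err32, err64)  # float64 error is many orders of magnitude smaller
```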
Running the computation in ``float64`` (as NumPy does by default) often helps, but it does not solve the issue in all cases.
Otherwise, analyzing the spectrum of the inputs via :func:`torch.linalg.svdvals` or their condition number via :func:`torch.linalg.cond`
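As a concrete version of this suggestion, one thing a user could do with the spectrum is compare the singular-value ratio against the working precision; a NumPy sketch (``np.linalg.svd(..., compute_uv=False)`` plays the role of ``torch.linalg.svdvals`` here):

```python
import numpy as np

# Flag a matrix whose condition number (largest/smallest singular value)
# is too large relative to float32 machine epsilon to solve reliably.
A = np.array([[1.0, 2.0], [3.0, 4.0]])
s = np.linalg.svd(A, compute_uv=False)      # singular values, descending
cond = s[0] / s[-1]
ok_in_float32 = cond < 1.0 / np.finfo(np.float32).eps
print(cond, ok_in_float32)                  # ~15, safe in float32
```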
This recommendation is too vague -- what would users do with the results of these operations?
As above, I don't think there's a good general suggestion we can give here...
Maybe not worth mentioning, then
@pytorchbot merge

Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
This was long-due, as it keeps coming up in issues. Fixes pytorch#85950. Fixes pytorch#59720. Fixes pytorch#59782. Pull Request resolved: pytorch#88313. Approved by: https://github.com/soumith, https://github.com/mruberry
cc @svekars @carljparker @jianyuh @nikitaved @pearu @mruberry @walterddr @IvanYashchuk @xwang233 @lezcano