
[feature request] Spectral norm support in torch.norm or factor out Power Iteration from spectralnormalization in some other place and orthogonalization from PowerSGD hook #12760

Open
vadimkantorov opened this issue Oct 17, 2018 · 7 comments
Labels
module: norms and normalization triaged This issue has been looked at by a team member, and triaged and prioritized into an appropriate module

Comments

@vadimkantorov
Contributor

Currently PyTorch has a spectral matrix norm implementation based on power iteration, hidden inside SpectralNorm. Since the new torch.norm supports matrix norms (e.g. the nuclear norm), it may be good to move it there so that it is usable in other contexts too (e.g. for debugging/tracing the spectral properties of neural-net weights, and because PyTorch doesn't support truncated randomized SVD yet).
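For context, the power iteration scheme hidden inside SpectralNorm can be sketched roughly as follows (a minimal standalone version; the function name and defaults here are illustrative, not PyTorch API):

```python
import torch

def spectral_norm_power_iteration(W, n_iter=50, eps=1e-12):
    """Estimate the largest singular value of a 2-D tensor W by power iteration.

    Mirrors the general scheme used by torch.nn.utils.spectral_norm:
    alternately apply W^T and W to unit vectors u and v.
    """
    u = torch.nn.functional.normalize(torch.randn(W.shape[0]), dim=0, eps=eps)
    for _ in range(n_iter):
        v = torch.nn.functional.normalize(W.t() @ u, dim=0, eps=eps)
        u = torch.nn.functional.normalize(W @ v, dim=0, eps=eps)
    # Rayleigh-quotient-style estimate of the top singular value
    return torch.dot(u, W @ v)

torch.manual_seed(0)
W = torch.randn(64, 32)
sigma = spectral_norm_power_iteration(W)
```

For a random Gaussian matrix like this, the estimate should closely match the exact spectral norm from `torch.linalg.svdvals(W)[0]`.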

@vadimkantorov vadimkantorov changed the title Spectral norm support in torch.norm [feature request] Spectral norm support in torch.norm Oct 17, 2018
@crcrpar
Collaborator

crcrpar commented Oct 18, 2018

I like the idea.

What API do you have in mind once torch.norm supports the spectral norm? The power iteration method requires two vectors.

@vadimkantorov
Contributor Author

vadimkantorov commented Oct 18, 2018

NumPy does this by supporting additional ord= arguments to numpy.linalg.norm:

ord | norm returned
 2  | 2-norm (largest singular value)
-2  | smallest singular value
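A quick illustration of the NumPy behavior; for a diagonal matrix the singular values are just the absolute diagonal entries, so both norms are easy to read off:

```python
import numpy as np

A = np.diag([3.0, 2.0, 1.0])

largest = np.linalg.norm(A, ord=2)    # spectral norm: largest singular value
smallest = np.linalg.norm(A, ord=-2)  # smallest singular value
```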

@crcrpar
Collaborator

crcrpar commented Oct 18, 2018

Ah, I'm concerned with how to handle u and v.
Will the norm function take u as an input (along with dim), or initialize and maintain it internally?

@heitorschueroff heitorschueroff added module: norms and normalization triaged This issue has been looked at by a team member, and triaged and prioritized into an appropriate module labels Apr 20, 2021
@vadimkantorov vadimkantorov changed the title [feature request] Spectral norm support in torch.norm [feature request] Spectral norm support in torch.norm or factor out Power Iteration from spectralnormalization in some other place Jun 12, 2021
@tvogels

tvogels commented Mar 30, 2022

@vadimkantorov, factoring out orthogonalization from PowerSGD is a good idea, but the orthogonalization operation there is currently much slower than it could be. Based on recommendations in the appendix of this paper (and with help from one of the authors), @younik is currently looking into replacing it with a custom CUDA kernel for the Householder method. It is work in progress, but we are seeing significant speedups. We hope to push a PR for this very soon.

Speedups
Matrix shape   Custom kernel (μs)   torch.qr (μs)   Speedup
(2, 512)       67.4                 11176           165.9x
(2, 4096)      96.5                 11774           122.0x
(4, 512)       96.3                 10497           109.0x
(4, 4096)      71.2                 12448           174.8x
(8, 512)       70.8                 12177           172.1x
(8, 4096)      88.1                 12500           141.9x
(16, 512)      80.2                 11780           147.0x
(16, 4096)     164.9                13296           80.6x
(32, 512)      148.1                11872           80.1x
(32, 4096)     319.1                18132           56.8x
(64, 512)      291.0                11562           39.7x
(64, 4096)     678.6                12940           19.1x
(128, 512)     573.4                16406           28.6x
(128, 4096)    1299.5               27551           21.2x
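For reference, the baseline orthogonalization being replaced can be expressed with PyTorch's built-in QR decomposition; the sketch below is a generic baseline under that assumption, not the PowerSGD hook's exact code, and `orthogonalize` is an illustrative name:

```python
import torch

def orthogonalize(M):
    """Return a matrix with orthonormal columns spanning range(M).

    This is the torch.linalg.qr baseline that a custom Householder
    kernel would aim to speed up for tall-and-skinny low-rank factors
    (the shapes in the table above, transposed).
    """
    Q, _ = torch.linalg.qr(M, mode="reduced")
    return Q

torch.manual_seed(0)
M = torch.randn(512, 4)  # tall-and-skinny, like a rank-4 PowerSGD factor
Q = orthogonalize(M)
```

In "reduced" mode Q has the same shape as M, which is all PowerSGD-style compression needs.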

@vadimkantorov vadimkantorov changed the title [feature request] Spectral norm support in torch.norm or factor out Power Iteration from spectralnormalization in some other place [feature request] Spectral norm support in torch.norm or factor out Power Iteration from spectralnormalization in some other place and orthogonalization from PowerSGD hook Apr 19, 2022
@vadimkantorov
Contributor Author

PR on orthogonalize: #76673

@redwrasse
Contributor

Another consideration in broadening the applicability of spectral norm, whether in documentation or in the implementation, is control of convergence properties, for spectral norm and sister operations. For spectral norm computed by power iteration, I believe convergence is tied to the ratio of the top two eigenvalues. I found today some torch documentation discussing how to handle such concerns in the linalg package: https://pytorch.org/docs/stable/notes/numerical_accuracy.html#extremal-values-in-linalg

Spectral operations like svd, eig, and eigh may also return incorrect results (and their gradients may be infinite) when their inputs have singular values that are close to each other. This is because the algorithms used to compute these decompositions struggle to converge for these inputs.
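The gap-dependence can be seen directly: the power iteration error decays roughly like (σ₂/σ₁)^(2k) after k steps, so a nearly degenerate top of the spectrum slows convergence. A small sketch under that assumption (function and variable names are illustrative):

```python
import torch

def power_iteration_estimate(W, n_iter):
    """Rayleigh-quotient estimate of the top singular value after n_iter steps."""
    u = torch.nn.functional.normalize(torch.randn(W.shape[0], dtype=W.dtype), dim=0)
    for _ in range(n_iter):
        v = torch.nn.functional.normalize(W.t() @ u, dim=0)
        u = torch.nn.functional.normalize(W @ v, dim=0)
    return torch.dot(u, W @ v)

torch.manual_seed(0)
# Well-separated spectrum (ratio 0.1): converges in a handful of steps.
gap = torch.diag(torch.tensor([10.0, 1.0, 0.5], dtype=torch.float64))
# Nearly degenerate top two singular values (ratio 0.999): slow convergence.
no_gap = torch.diag(torch.tensor([10.0, 9.99, 0.5], dtype=torch.float64))

err_gap = abs(power_iteration_estimate(gap, 20).item() - 10.0)
err_no_gap = abs(power_iteration_estimate(no_gap, 20).item() - 10.0)
# err_gap is essentially zero after 20 steps; err_no_gap is typically
# orders of magnitude larger, bounded only by the 0.01 gap itself.
```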
