Added linalg.eigh, linalg.eigvalsh #45526

Status: Closed. Wants to merge 53 commits.

Commits (53)
Note: the file changes shown below are from 13 of the 53 commits.
018604f
Started torch.linalg.eigh and torch.linalg.eigvalsh
IvanYashchuk Sep 29, 2020
4662eb7
Fixed syevd cpu
IvanYashchuk Sep 29, 2020
bdcda38
Added test for linalg.eigh
IvanYashchuk Sep 29, 2020
b6f814c
Expose linalg_eigh, linalg_eigvalsh functions in torch.linalg
IvanYashchuk Sep 29, 2020
781fb2c
Use smaller shape
IvanYashchuk Sep 29, 2020
8f6325b
Added linalg_eigh documentation
IvanYashchuk Oct 5, 2020
6654f4c
Added linalg_eigvalsh documentation
IvanYashchuk Oct 5, 2020
ecd0c80
Remove code duplication due to rebasing
IvanYashchuk Oct 5, 2020
c871eb6
Updated test_eigh
IvanYashchuk Oct 5, 2020
7a15c47
Added test_eigvalsh
IvanYashchuk Oct 5, 2020
432743f
Remove c10::optional uplo
IvanYashchuk Oct 5, 2020
5a94008
Added _syevd_helper_cuda
IvanYashchuk Oct 5, 2020
e8180d8
Use (void) to mark unused arguments.
IvanYashchuk Oct 5, 2020
4f8542c
Renamed compute_v -> compute_eigenvectors
IvanYashchuk Oct 7, 2020
629d934
Added test check with sign flipping of eigenvectors
IvanYashchuk Oct 8, 2020
e4d5a23
Merge remote-tracking branch 'upstream/master' into numpy-eig
IvanYashchuk Nov 2, 2020
62f3e29
Remove skip if numpy not available
IvanYashchuk Nov 2, 2020
f151bfa
Update input argument description
IvanYashchuk Nov 2, 2020
9e94c8b
Added non-contiguous input tests
IvanYashchuk Nov 2, 2020
3ce51ca
Updated docs
IvanYashchuk Nov 2, 2020
ec8f824
Make eigh return named tensors, add default value for uplo
IvanYashchuk Nov 2, 2020
0c0dbac
Updated UPLO argument implementation
IvanYashchuk Nov 2, 2020
b55f824
Added out= variant
IvanYashchuk Nov 2, 2020
3c08f0f
Added wrappers to torch/linalg.h
IvanYashchuk Nov 2, 2020
8fdf8b2
Added overrides.py entry
IvanYashchuk Nov 2, 2020
4477796
Added derivative rules for linalg_eigh
IvanYashchuk Nov 2, 2020
87b7c22
Check dtypes for out= variants
IvanYashchuk Nov 2, 2020
794d7cd
Merge remote-tracking branch 'upstream/master' into numpy-eig
IvanYashchuk Nov 2, 2020
4e405bb
flake8
IvanYashchuk Nov 2, 2020
a3adeb4
Merge remote-tracking branch 'upstream/master' into numpy-eig
IvanYashchuk Nov 3, 2020
93b68c4
Add namedtuple entry
IvanYashchuk Nov 3, 2020
685f383
Fix typo
IvanYashchuk Nov 3, 2020
f880538
Merge remote-tracking branch 'upstream/master' into numpy-eig
IvanYashchuk Nov 4, 2020
0be9cac
Added entry for linalg_eigh namedtuple test
IvanYashchuk Nov 4, 2020
efdbb61
Merge branch 'master' into numpy-eig
IvanYashchuk Nov 6, 2020
0ce7224
Typed std::max; added a few comments on the implementation
IvanYashchuk Nov 12, 2020
a8f8366
Remove method variant
IvanYashchuk Nov 12, 2020
434aa9a
Test torch.linalg.eigh namedtuple
IvanYashchuk Nov 12, 2020
9ca046a
Added test with lower case uplo
IvanYashchuk Nov 12, 2020
d9df1b5
Added a few comments describing new functions
IvanYashchuk Nov 12, 2020
890478d
Updated documentation
IvanYashchuk Nov 12, 2020
9abfb24
Merge remote-tracking branch 'upstream/master' into numpy-eig
IvanYashchuk Nov 12, 2020
8a775fd
Doc fixes
IvanYashchuk Nov 12, 2020
0a8669d
Remove unused import
IvanYashchuk Nov 12, 2020
2603839
Added a note on non-uniqueness of eigenvectors
IvanYashchuk Nov 12, 2020
c65451b
Merge remote-tracking branch 'upstream/master' into numpy-eig
IvanYashchuk Nov 17, 2020
78f6b4e
Use same error message for syevd as for symeig when failed to converge
IvanYashchuk Nov 17, 2020
26971ec
Updated documentation
IvanYashchuk Nov 17, 2020
5ed0159
Complex batched matmul now works on cuda
IvanYashchuk Nov 17, 2020
dbc7a72
, -> and
IvanYashchuk Nov 17, 2020
2911420
Safe use of std::toupper with plain chars
IvanYashchuk Nov 18, 2020
85c6c80
Merge remote-tracking branch 'upstream/master' into numpy-eig
IvanYashchuk Nov 20, 2020
bd41343
Fixed backward; can't in-place multiply real and complex tensors
IvanYashchuk Nov 20, 2020
124 changes: 124 additions & 0 deletions aten/src/ATen/native/BatchLinearAlgebra.cpp
@@ -71,6 +71,12 @@ extern "C" void cheev_(char *jobz, char *uplo, int *n, std::complex<float> *a, i
extern "C" void dsyev_(char *jobz, char *uplo, int *n, double *a, int *lda, double *w, double *work, int *lwork, int *info);
extern "C" void ssyev_(char *jobz, char *uplo, int *n, float *a, int *lda, float *w, float *work, int *lwork, int *info);

// syevd
extern "C" void zheevd_(char *jobz, char *uplo, int *n, std::complex<double> *a, int *lda, double *w, std::complex<double> *work, int *lwork, double *rwork, int *lrwork, int *iwork, int *liwork, int *info);
extern "C" void cheevd_(char *jobz, char *uplo, int *n, std::complex<float> *a, int *lda, float *w, std::complex<float> *work, int *lwork, float *rwork, int *lrwork, int *iwork, int *liwork, int *info);
extern "C" void dsyevd_(char *jobz, char *uplo, int *n, double *a, int *lda, double *w, double *work, int *lwork, int *iwork, int *liwork, int *info);
extern "C" void ssyevd_(char *jobz, char *uplo, int *n, float *a, int *lda, float *w, float *work, int *lwork, int *iwork, int *liwork, int *info);

// gesdd
extern "C" void zgesdd_(char *jobz, int *m, int *n, std::complex<double> *a, int *lda,
double *s, std::complex<double> *u, int *ldu, std::complex<double> *vt, int *ldvt, std::complex<double> *work, int *lwork, double *rwork, int *iwork, int *info);
@@ -121,6 +127,9 @@ void lapackOrgqr(int m, int n, int k, scalar_t *a, int lda, scalar_t *tau, scala
template<class scalar_t, class value_t=scalar_t>
void lapackSymeig(char jobz, char uplo, int n, scalar_t *a, int lda, value_t *w, scalar_t *work, int lwork, value_t *rwork, int *info);

template<class scalar_t, class value_t=scalar_t>
void lapackSyevd(char jobz, char uplo, int n, scalar_t *a, int lda, value_t *w, scalar_t *work, int lwork, value_t *rwork, int lrwork, int *iwork, int liwork, int *info);

template<class scalar_t, class value_t=scalar_t>
void lapackSvd(char jobz, int m, int n, scalar_t *a, int lda,
value_t *s, scalar_t *u, int ldu, scalar_t *vt, int ldvt, scalar_t *work, int lwork, value_t *rwork, int *iwork, int *info);
@@ -275,6 +284,26 @@ template<> void lapackSymeig<float>(char jobz, char uplo, int n, float *a, int l
ssyev_(&jobz, &uplo, &n, a, &lda, w, work, &lwork, info);
}

template<> void lapackSyevd<c10::complex<double>, double>(char jobz, char uplo, int n, c10::complex<double> *a, int lda, double *w, c10::complex<double> *work, int lwork, double *rwork, int lrwork, int *iwork, int liwork, int *info) {
zheevd_(&jobz, &uplo, &n, reinterpret_cast<std::complex<double>*>(a), &lda, w, reinterpret_cast<std::complex<double>*>(work), &lwork, rwork, &lrwork, iwork, &liwork, info);
}

template<> void lapackSyevd<c10::complex<float>, float>(char jobz, char uplo, int n, c10::complex<float> *a, int lda, float *w, c10::complex<float> *work, int lwork, float *rwork, int lrwork, int *iwork, int liwork, int *info) {
cheevd_(&jobz, &uplo, &n, reinterpret_cast<std::complex<float>*>(a), &lda, w, reinterpret_cast<std::complex<float>*>(work), &lwork, rwork, &lrwork, iwork, &liwork, info);
}

template<> void lapackSyevd<double>(char jobz, char uplo, int n, double *a, int lda, double *w, double *work, int lwork, double *rwork, int lrwork, int *iwork, int liwork, int *info) {
(void)rwork; // unused
(void)lrwork; // unused
dsyevd_(&jobz, &uplo, &n, a, &lda, w, work, &lwork, iwork, &liwork, info);
}

template<> void lapackSyevd<float>(char jobz, char uplo, int n, float *a, int lda, float *w, float *work, int lwork, float *rwork, int lrwork, int *iwork, int liwork, int *info) {
(void)rwork; // unused
(void)lrwork; // unused
ssyevd_(&jobz, &uplo, &n, a, &lda, w, work, &lwork, iwork, &liwork, info);
}

template<> void lapackSvd<c10::complex<double>, double>(char jobz, int m, int n, c10::complex<double> *a, int lda,
double *s, c10::complex<double> *u, int ldu, c10::complex<double> *vt, int ldvt, c10::complex<double> *work, int lwork, double *rwork, int *iwork, int *info) {
zgesdd_(&jobz, &m, &n, reinterpret_cast<std::complex<double>*>(a), &lda, s, reinterpret_cast<std::complex<double>*>(u), &ldu,
@@ -862,6 +891,101 @@ std::tuple<Tensor&,Tensor&> qr_out(Tensor& Q, Tensor& R, const Tensor& self, boo
return std::tuple<Tensor&, Tensor&>(Q, R);
}

// ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ syevd ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

template <typename scalar_t>
static void apply_syevd(Tensor& w, Tensor& v, bool compute_v, std::string uplo_str, std::vector<int64_t>& infos) {
[Collaborator, resolved]: Add a comment describing the function and how it's intended to be called.

#ifndef USE_LAPACK
AT_ERROR("syevd: LAPACK library not found in compilation");
#else
using value_t = typename c10::scalar_value_type<scalar_t>::type;

auto v_data = v.data_ptr<scalar_t>();
auto w_data = w.data_ptr<value_t>();
auto v_matrix_stride = matrixStride(v);
auto w_stride = w.size(-1);
auto batch_size = batchCount(v);
auto n = v.size(-1);
auto lda = std::max(int64_t{1}, n);

char uplo = uplo_str == "U" ? 'U' : 'L';
char jobz = compute_v ? 'V' : 'N';

int info;
[Collaborator]: Add a comment here explaining that the use of imprecise types, like int, is consistent with the LAPACK interface (otherwise we'd use a precise type, like int32_t).

[Collaborator]: Nit: if we were being extremely correct, we would add a using declaration to make it easy to change this int type later.

[Author]: I think this kind of type change should happen at once for all functions that call into LAPACK. There should be a lapack_int type that depends on the type of the linked LAPACK, or only 64-bit indexing should be used by default and only the 64-bit version of LAPACK allowed.

[Collaborator]: I agree with you, and the comment looks great.

// Run once, first to get the optimum work size.
// Since we deal with batches of matrices with the same dimensions, doing this outside
// the loop saves (batch_size - 1) workspace queries which would provide the same result
// and (batch_size - 1) calls to allocate and deallocate workspace using at::empty()
int lwork = -1;
int lrwork = -1;
int liwork = -1;
scalar_t work_query;
value_t rwork_query;
int iwork_query;

lapackSyevd<scalar_t, value_t>(jobz, uplo, n, v_data, lda, w_data, &work_query, lwork, &rwork_query, lrwork, &iwork_query, liwork, &info);
lwork = std::max(1, static_cast<int>(real_impl<scalar_t, value_t>(work_query)));
Tensor work = at::empty({lwork}, v.options());
liwork = std::max(1, static_cast<int>(iwork_query));
Tensor iwork = at::empty({liwork}, at::kInt);

Tensor rwork;
value_t* rwork_data = nullptr;
if (isComplexType(at::typeMetaToScalarType(v.dtype()))) {
lrwork = std::max(1, static_cast<int>(rwork_query));
rwork = at::empty({lrwork}, w.options());
rwork_data = rwork.data_ptr<value_t>();
}

for (int64_t i = 0; i < batch_size; i++) {
scalar_t* v_working_ptr = &v_data[i * v_matrix_stride];
value_t* w_working_ptr = &w_data[i * w_stride];
lapackSyevd<scalar_t, value_t>(jobz, uplo, n, v_working_ptr, lda, w_working_ptr, work.data_ptr<scalar_t>(), lwork, rwork_data, lrwork, iwork.data_ptr<int>(), liwork, &info);
infos[i] = info;
if (info != 0) {
return;
[Collaborator]: Add a comment here explaining what triggers this return, and what callers should do if this is triggered.

[Collaborator]: Follow-up question:

In the future we expect to add _ex variants of functions in linear algebra that carry additional error information. For example, say we have torch.linalg.foo, which computes the foo operation on a non-singular matrix. If the matrix is singular, however, then the math library we're using returns error information in a device tensor explaining that the matrix is singular.

On the CPU, torch.linalg.foo(cpu_matrix) can throw an error when a singular tensor is encountered. On CUDA, however, translating the device tensor containing the error information requires a cross-device sync, which we try to avoid. torch.linalg.foo(cuda_matrix) should perform and document this sync, however.

torch.linalg.foo_ex, on the other hand, will return both device tensors and not perform the error check. This allows the user to check the error themselves and avoid an immediate cross-device sync on CUDA if they want.

[Author]: The current behavior for linalg (LAPACK/MAGMA-based) functions is to raise an error if something goes wrong or the input doesn't satisfy some requirement; therefore all batched CPU functions return early, since further computations would be wasted anyway. In the future, this early return should be removed.

What does the _ex suffix mean?

I agree that this kind of change will improve performance. I think the default behavior should be the same on CPU and CUDA, so CPU shouldn't raise an error, even though it's cheap to check for it, if CUDA doesn't.

What do you think about, instead of introducing new functions with an _ex suffix in the Python interface, modifying the existing ones to accept an optional info= argument, similar to out=, that is filled with the returned LAPACK/MAGMA/cuSOLVER error codes for the user to check if they want to?

eigenvalues, eigenvectors = torch.linalg.eigh(input, info=torch.empty(batch_size))
# instead of
eigenvalues, eigenvectors, info = torch.linalg.eigh_ex(input)

[Collaborator]:

> The current behavior ... all batched CPU functions return early ... In the future, this early return should be removed.

Sounds good for now.

> What does the _ex suffix mean?

"Extra."

> I think the default behavior should be the same on CPU and CUDA ...

Agreed. Both the CPU and CUDA _ex variants will not perform their own error checking and will return the info tensor for callers to evaluate.

> What do you think about ... an optional info= argument, similar to out= ...

This is a really good idea and we did consider an approach like this, but there are technical issues with this approach that make it prohibitive.

}
}
#endif
}
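The workspace-query pattern in the comment inside apply_syevd above ("run once with lwork = -1, then reuse the workspace across the batch") can be summarized in a short Python sketch. `fake_syevd` below is a purely hypothetical stand-in for the real LAPACK ?syevd binding; only the calling pattern is meant to carry over.

import numpy as np

def fake_syevd(a, jobz="V", lwork=-1, work=None):
    n = a.shape[-1]
    if lwork == -1:
        # Workspace query: report a suitable workspace size, compute nothing.
        # (Roughly LAPACK's dsyevd requirement of 1 + 6n + 2n**2 for jobz='V'.)
        return 1 + 6 * n + 2 * n * n
    # "Real" call: np.linalg.eigh stands in for the LAPACK kernel.
    return np.linalg.eigh(a) if jobz == "V" else (np.linalg.eigvalsh(a), None)

def batched_syevd(batch):
    # Query once: every matrix in the batch has the same shape, so the
    # answer is identical for all of them.
    lwork = fake_syevd(batch[0], lwork=-1)
    work = np.empty(lwork)  # allocate the workspace a single time
    # Reuse it for each matrix, saving (batch_size - 1) queries and
    # (batch_size - 1) allocate/deallocate round trips, as in apply_syevd.
    return [fake_syevd(a, lwork=lwork, work=work) for a in batch]

batch = np.random.randn(4, 3, 3)
batch = batch + batch.transpose(0, 2, 1)  # make every matrix symmetric
eigenpairs = batched_syevd(batch)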

std::tuple<Tensor, Tensor> _syevd_helper_cpu(const Tensor& self, bool compute_v, std::string uplo) {
std::vector<int64_t> infos(batchCount(self), 0);

auto self_sizes = self.sizes().vec();
self_sizes.pop_back();
ScalarType dtype = toValueType(typeMetaToScalarType(self.dtype()));
auto eigvals = at::empty(self_sizes, self.options().dtype(dtype));

auto eigvecs = cloneBatchedColumnMajor(self);
AT_DISPATCH_FLOATING_AND_COMPLEX_TYPES(self.scalar_type(), "syevd_cpu", [&]{
apply_syevd<scalar_t>(eigvals, eigvecs, compute_v, uplo, infos);
});

if (self.dim() > 2) {
batchCheckErrors(infos, "syevd_cpu");
} else {
singleCheckErrors(infos[0], "syevd_cpu");
}
if (compute_v) {
return std::tuple<Tensor, Tensor>(eigvals, eigvecs);
} else {
return std::tuple<Tensor, Tensor>(eigvals, at::empty({0}, self.options()));
}
}
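For the review discussion above about _ex variants and an info= argument, here is a hedged Python illustration of the two calling conventions. Neither exists in this PR; the info= keyword and the eigh_ex name are hypothetical and shown only as comments.

import torch

a = torch.randn(4, 3, 3, dtype=torch.float64)
a = a + a.transpose(-2, -1)  # batch of symmetric matrices

# Current behavior (this PR): LAPACK/MAGMA error codes are checked eagerly
# inside the call and a RuntimeError is raised on failure.
w, v = torch.linalg.eigh(a)

# Proposal A (hypothetical): an out=-like info= argument filled with the raw
# LAPACK/MAGMA/cuSOLVER error codes for the caller to inspect:
#   info = torch.empty(4, dtype=torch.int32)
#   w, v = torch.linalg.eigh(a, info=info)
#
# Proposal B (hypothetical): an _ex variant that returns the codes and skips
# the check, letting CUDA callers defer the cross-device synchronization:
#   w, v, info = torch.linalg.eigh_ex(a)
#   if int(info.max()) > 0:
#       raise RuntimeError("syevd failed to converge for some batch elements")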

std::tuple<Tensor, Tensor> linalg_eigh(const Tensor& self, std::string uplo) {
squareCheckInputs(self);
return at::_syevd_helper(self, /*compute_v=*/true, uplo);
}

Tensor linalg_eigvalsh(const Tensor& self, std::string uplo) {
squareCheckInputs(self);
Tensor eigvals, eigvecs;
std::tie(eigvals, eigvecs) = at::_syevd_helper(self, /*compute_v=*/false, uplo);
return eigvals;
}

// ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ symeig ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

template <typename scalar_t>
7 changes: 7 additions & 0 deletions aten/src/ATen/native/cuda/BatchLinearAlgebra.cu
@@ -1389,6 +1389,13 @@ std::tuple<Tensor, Tensor> _symeig_helper_cuda(const Tensor& self, bool eigenvec
}
}

// ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ syevd ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

std::tuple<Tensor, Tensor> _syevd_helper_cuda(const Tensor& self, bool compute_v, std::string uplo) {
bool upper = uplo == "U";
return _symeig_helper_cuda(self, compute_v, upper);
}

// ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ svd ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

template<typename scalar_t>
17 changes: 17 additions & 0 deletions aten/src/ATen/native/native_functions.yaml
@@ -8224,6 +8224,23 @@
use_c10_dispatcher: full
variants: function, method

- func: _syevd_helper(Tensor self, bool compute_v, str uplo) -> (Tensor, Tensor)
use_c10_dispatcher: full
variants: function
dispatch:
CPU: _syevd_helper_cpu
CUDA: _syevd_helper_cuda

- func: linalg_eigh(Tensor self, str uplo) -> (Tensor, Tensor)
python_module: linalg
use_c10_dispatcher: full
variants: function

- func: linalg_eigvalsh(Tensor self, str uplo) -> Tensor
python_module: linalg
use_c10_dispatcher: full
variants: function

# torch.outer, alias for torch.ger
- func: outer(Tensor self, Tensor vec2) -> Tensor
use_c10_dispatcher: full
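The entries above register a CPU kernel and a CUDA kernel for _syevd_helper behind the same torch.linalg.eigh entry point. A small sanity-check sketch (not part of the PR) that both backends agree on the eigenvalues:

import torch

a = torch.randn(2, 4, 4, dtype=torch.float64)
a = a + a.transpose(-2, -1)          # make the batch symmetric

w_cpu, v_cpu = torch.linalg.eigh(a)  # dispatches to _syevd_helper_cpu

if torch.cuda.is_available():
    # The CUDA path currently delegates to the MAGMA-based symeig helper,
    # so the eigenvalues should still match the CPU result.
    w_cuda, v_cuda = torch.linalg.eigh(a.cuda())
    assert torch.allclose(w_cpu, w_cuda.cpu(), atol=1e-10)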
2 changes: 2 additions & 0 deletions docs/source/linalg.rst
@@ -13,4 +13,6 @@ Functions
---------

.. autofunction:: det
.. autofunction:: eigh
.. autofunction:: eigvalsh
.. autofunction:: norm
48 changes: 48 additions & 0 deletions test/test_linalg.py
@@ -180,6 +180,54 @@ def test_det(self, device, dtype):
with self.assertRaises(RuntimeError):
op(t)

@skipCUDAIfNoMagma
@skipCPUIfNoLapack
@unittest.skipIf(not TEST_NUMPY, "NumPy not found")
@dtypes(torch.float32, torch.float64, torch.complex64, torch.complex128)
def test_eigh(self, device, dtype):
from torch.testing._internal.common_utils import random_hermitian_matrix

def run_test(shape, batch):
matrix = random_hermitian_matrix(shape, *batch, dtype=dtype, device=device)
expected_w, expected_v = np.linalg.eigh(matrix.cpu().numpy())
actual_w, actual_v = torch.linalg.eigh(matrix)
self.assertEqual(actual_w, expected_w)
# sign of eigenvectors is not unique and therefore absolute values are compared
self.assertEqual(abs(actual_v), abs(expected_v))

shapes = (0, 3, 5)
batches = ((), (3, ), (2, 2))
for shape, batch in itertools.product(shapes, batches):
run_test(shape, batch)

# eigh requires a square matrix
t = torch.randn(2, 3, device=device, dtype=dtype)
with self.assertRaises(RuntimeError):
torch.linalg.eigh(t)
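The absolute-value comparison in test_eigh above is needed because eigenvectors are unique only up to a sign (or a complex phase). An alternative phase-invariant check, sketched here for illustration only (it is not part of the PR's tests), is to reconstruct the input from the factorization, as the docstring example also does:

import torch

a = torch.randn(5, 5, dtype=torch.complex128)
a = a + a.conj().transpose(-2, -1)   # make it Hermitian
w, v = torch.linalg.eigh(a)

# Reconstructing input = V diag(w) V^H is insensitive to per-column
# sign/phase flips, unlike comparing eigenvectors entry by entry.
recon = v @ torch.diag_embed(w.to(v.dtype)) @ v.conj().transpose(-2, -1)
assert torch.allclose(recon, a)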

@skipCUDAIfNoMagma
@skipCPUIfNoLapack
@unittest.skipIf(not TEST_NUMPY, "NumPy not found")
@dtypes(torch.float32, torch.float64, torch.complex64, torch.complex128)
def test_eigvalsh(self, device, dtype):
from torch.testing._internal.common_utils import random_hermitian_matrix

def run_test(shape, batch):
matrix = random_hermitian_matrix(shape, *batch, dtype=dtype, device=device)
expected_w = np.linalg.eigvalsh(matrix.cpu().numpy())
actual_w = torch.linalg.eigvalsh(matrix)
self.assertEqual(actual_w, expected_w)

shapes = (0, 3, 5)
batches = ((), (3, ), (2, 2))
for shape, batch in itertools.product(shapes, batches):
run_test(shape, batch)

# eigvalsh requires a square matrix
t = torch.randn(2, 3, device=device, dtype=dtype)
with self.assertRaises(RuntimeError):
torch.linalg.eigvalsh(t)

# This test confirms that torch.linalg.norm's dtype argument works
# as expected, according to the function's documentation
@skipCUDAIfNoMagma
92 changes: 92 additions & 0 deletions torch/linalg/__init__.py
@@ -3,6 +3,8 @@
import torch
from torch._C import _add_docstr, _linalg # type: ignore

import functools

Tensor = torch.Tensor

# Note: This not only adds doc strings for functions in the linalg namespace, but
@@ -140,3 +142,93 @@
>>> LA.norm(m[0, :, :]), LA.norm(m[1, :, :])
(tensor(3.7417), tensor(11.2250))
""")

_add_docstr(_linalg.linalg_eigh, r"""
linalg.eigh(input, UPLO='L') -> tuple(Tensor, Tensor)

This function returns the eigenvalues and eigenvectors
of a complex Hermitian (conjugate symmetric) or real symmetric matrix :attr:`input`,
represented as a namedtuple (eigenvalues, eigenvectors).

This function calculates all eigenvalues (and vectors) of :attr:`input`
such that :math:`\text{input} = V \text{diag}(e) V^H`.

Since the input matrix :attr:`input` is assumed to be Hermitian,
only the lower triangular portion is used by default
and the imaginary part of the diagonal will always be treated as zero.

.. note:: The eigenvalues of real symmetric or complex Hermitian matrices are always real.

.. note:: The eigenvalues/eigenvectors are computed using LAPACK routines ``_syevd``, ``_heevd``.

[Collaborator]: This function causes a cross-device sync on CUDA, right? If so, this PR should add a warning here explaining that.

[Collaborator]: That warning can be combined with a warning that this function will throw an error if a tensor does not meet the requirements.

[Author]: I added a note on cross-device synchronization for CUDA.

This function doesn't throw errors if the input is not Hermitian or symmetric, because of:

> Since :attr:`input` is assumed to consist of Hermitian matrices, only the lower triangular portion of these matrices is used by default (UPLO = 'L') in computations, and the imaginary part of the diagonal will always be treated as zero.

The documentation says the error code can be > 0, indicating that the algorithm didn't converge, but I don't know what kind of input is needed to make it fail.

[Collaborator]: Interesting... is our failure behavior consistent with NumPy's, then?

[Author]: Yes. NumPy raises LinAlgError("Eigenvalues did not converge") for info > 0.

Args:
input (Tensor): the input tensor of size :math:`(*, n, n)` where `*` is zero or more
batch dimensions consisting of Hermitian matrices.
UPLO ('L', 'U', optional): controls whether to consider upper-triangular or lower-triangular part.
[Collaborator]: The UPLO argument needs to be mentioned in the description of the function, too. This should be more specific, too; "considers" is too vague a term.

[Author]: The main text now says "only the lower triangular portion of these matrices is used by default (UPLO = 'L') in computations".
I changed "considers" to "controls whether to use the upper-triangular or lower-triangular part of :attr:`input` in the computations."

Default: ``'L'``

Returns:
(Tensor, Tensor): A namedtuple (eigenvalues, eigenvectors) containing

- **eigenvalues** (*Tensor*): Shape :math:`(*, n)`.
The eigenvalues in ascending order, each repeated according to its multiplicity.
- **eigenvectors** (*Tensor*): Shape :math:`(*, n, n)`.
The orthonormal eigenvectors of the ``input``.

Examples::

>>> import torch
>>> a = torch.randn(2, 2, dtype=torch.complex128)
>>> a = a + a.t().conj() # To make a Hermitian
>>> a
tensor([[2.9228+0.0000j, 0.2029-0.0862j],
[0.2029+0.0862j, 0.3464+0.0000j]], dtype=torch.complex128)
>>> w, v = torch.linalg.eigh(a)
>>> w
tensor([0.3277, 2.9415], dtype=torch.float64)
>>> v
tensor([[-0.0846+-0.0000j, -0.9964+0.0000j],
[ 0.9170+0.3898j, -0.0779-0.0331j]], dtype=torch.complex128)
>>> torch.allclose(torch.matmul(v, torch.matmul(w.to(v.dtype).diag_embed(), v.transpose(-2, -1).conj())), a)
True
""")

_add_docstr(_linalg.linalg_eigvalsh, r"""
linalg.eigvalsh(input, UPLO='L') -> Tensor

This function returns eigenvalues of a complex Hermitian (conjugate symmetric)
or real symmetric matrix :attr:`input`.

.. note:: The eigenvalues of real symmetric or complex Hermitian matrices are always real.

.. note:: The eigenvalues are computed using LAPACK routines ``_syevd``, ``_heevd``.

Args:
input (Tensor): the input tensor of size :math:`(*, n, n)` where `*` is zero or more
batch dimensions consisting of Hermitian matrices.
UPLO ('L', 'U', optional): controls whether to consider upper-triangular or lower-triangular part.
Default: ``'L'``

Returns:
(Tensor): Shape :math:`(*, n)`. The eigenvalues in ascending order, each repeated according to its multiplicity.

Examples::

>>> import torch
>>> a = torch.randn(2, 2, dtype=torch.complex128)
>>> a = a + a.t().conj() # To make a Hermitian
>>> a
tensor([[2.9228+0.0000j, 0.2029-0.0862j],
[0.2029+0.0862j, 0.3464+0.0000j]], dtype=torch.complex128)
>>> w = torch.linalg.eigvalsh(a)
>>> w
tensor([0.3277, 2.9415], dtype=torch.float64)
""")

@functools.wraps(_linalg.linalg_eigh)
def eigh(a, UPLO="L"):
return _linalg.linalg_eigh(a, UPLO)

@functools.wraps(_linalg.linalg_eigh)
def eigvalsh(a, UPLO="L"):
return _linalg.linalg_eigvalsh(a, UPLO)
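Finally, a usage sketch for the wrappers above, assuming the namedtuple fields are exposed as eigenvalues/eigenvectors as described in the docstring, and that the derivative rules added in this PR allow a backward pass through the eigenvalues:

import torch

a = torch.randn(2, 3, 4, 4, dtype=torch.float64)
a = a + a.transpose(-2, -1)                 # batch of symmetric matrices

res = torch.linalg.eigh(a)                  # namedtuple (eigenvalues, eigenvectors)
print(res.eigenvalues.shape)                # torch.Size([2, 3, 4])
print(res.eigenvectors.shape)               # torch.Size([2, 3, 4, 4])

# Eigenvalue derivatives assume distinct eigenvalues, which holds almost
# surely for random symmetric matrices.
a.requires_grad_(True)
w, v = torch.linalg.eigh(a)
w.sum().backward()
print(a.grad.shape)                         # torch.Size([2, 3, 4, 4])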