svd_compressed() fails for complex input #7639

nicrie · 2021-05-10T11:07:32Z

What happened:
Complex singular values and vectors of svd_compressed() are different from standard np.linalg.svd.

What you expected to happen:
Complex singular values and vectors of svd_compressed() are similar (within some uncertainties) to standard np.linalg.svd.

Minimal Complete Verifiable Example:

import numpy as np
import xarray as xr
import dask.array as da
from matplotlib.pyplot import imshow

data = xr.tutorial.open_dataset(
    'air_temperature',
    chunks={'lat': 25, 'lon': 25, 'time': -1}
)
temp = data.air
temp = temp.stack(x=('lat', 'lon')).compute()
temp -= temp.mean('time')
# artificial complexification of data
temp = temp + (1j * 0.1 * temp**2)
kernel = np.dot(temp.conj().T, temp) / temp.shape[0]

# standard SVD
u, s, vt = np.linalg.svd(kernel, full_matrices=False)
# dask SVD 
dask_kernel = da.from_array(kernel)
k = 100
dsvd = da.linalg.svd_compressed(dask_kernel, k)
u2, s2, vt2 = (x.compute() for x in dsvd)

np.allclose(s[:100], s2)  # False
np.allclose(vt[:100], vt2)  # False

# visual check: look at first singular vector
v = vt.reshape((temp.shape[1],) + data.air.shape[1:])
v2 = vt2.reshape((k,) + data.air.shape[1:])

# imaginary parts of first singular vector are clearly different
imshow(v[0,:,:].imag, vmin=-1e-3, vmax=1e-3)
imshow(-v2[0,:,:].imag, vmin=-1e-3, vmax=1e-3)

Anything else we need to know?:

For real input, both methods yield the same results.
When running svd_compressed() a warning is thrown complaining about complex being casted to real dtypes:

/home/nrieger/anaconda3/envs/xmca/lib/python3.7/site-packages/dask/array/utils.py:108: ComplexWarning: Casting complex values to real discards the imaginary part
  meta = meta.astype(dtype)

Environment:

Dask version: 2021.04.0
Python version: 3.7.10
Operating System: Ubuntu 20.04
Install method: conda

The text was updated successfully, but these errors were encountered:

jsignell · 2021-05-14T13:29:17Z

Ping @eric-czech and @RogerMoens since you have both looked at the svd implementation recently.

eric-czech · 2021-05-15T09:08:50Z

For real input, both methods yield the same results.

What makes you say that? If I remove the line in your example that casts temp to imaginary, the allclose still fails and afiak singular vectors from np.lingalg.svd aren't directly comparable to those from da.lingalg.svd_compressed like that.

You could compare to da.linalg.svd instead, but even then the results will only be the same up to possible sign flips. You would have to pass the numpy results through sign coercion as dask does by default now after #6613.

nicrie · 2021-05-16T16:10:09Z

Hi thanks for getting back! You're completely correct with your points, I added svd_flip to guarantee consistent signs in the solution and, as you suggested, compared da.linalg.svd_compressed to da.linalg.svd.

I'm aware that da.linalg.svd_compressed is an approximation and I'm not expecting it to yield the exact results as da.linalg.svd. However, I would expect that they yield similar solutions (what would be the benefit of having such a method otherwise?). I rewrote the MWE a bit to make things clear.

Imports

import xarray as xr
import dask.array as da
from dask.array.utils import svd_flip

Make two functions for (i) getting the kernel (real and complex) and (ii) performing svd both in standard and compressed form. perform_svd only returns the first 5 singular values/vectors for both SVD algorithms.

Functions

def get_kernel():
    data = xr.tutorial.open_dataset(
        'air_temperature',
        chunks={'lat': 25, 'lon': 25, 'time': -1}
    )
    temp = data.air
    temp = temp.stack(x=('lat', 'lon')).compute()
    temp -= temp.mean('time')
    # artificial complexification of data
    temp = temp + (1j * 0.1 * temp**2)
    kernel = np.dot(temp.conj().T, temp) / temp.shape[0]
    dask_kernel = da.from_array(kernel)
    return dask_kernel


def perform_dask_svd(kernel):
    # Dask SVD
    dsvd1 = da.linalg.svd(kernel)
    u1, s1, vt1 = (x.compute() for x in dsvd1)
    u1, vt1 = svd_flip(u1, vt1)

    # Dask SVD Compressed
    k = 100
    dsvd2 = da.linalg.svd_compressed(kernel, k)
    u2, s2, vt2 = (x.compute() for x in dsvd2)
    u2, vt2 = svd_flip(u2, vt2)

    # compare only first n singular values/vectors
    n = 5
    result = {
        # standard SVD
        'svd' : {'u': u1[:, :n], 's': s1[:n], 'vt': vt1[:n]},
        # compressed SVD
        'com_svd' : {'u': u2[:, :n], 's': s2[:n], 'vt': vt2[:n]}
    }
    return result

Perform SVD

complex_kernel = get_kernel()
real_kernel = get_kernel().real

real = perform_dask_svd(real_kernel)
cplx = perform_dask_svd(complex_kernel)

real contains the solution for da.linalg.svd_compressed and da.linalg.svd for real input, cplx contains the same but for complex input.

Check results

# REAL
# ----------------------
# Get an idea of the magnitude of the solution
print(np.max(real['svd']['s']))  # 3.4e6
print(np.max(real['svd']['vt']))  # around 0.1

np.allclose(real['svd']['s'], real['com_svd']['s'], rtol=1e-4)  # True
np.allclose(real['svd']['vt'], real['com_svd']['vt'], rtol=1e-4, atol=1e-4)  # True


# Maximal absolute deviation
print(np.max(real['svd']['s'] - real['com_svd']['s']))  # 0.25 for singular values; that's fine
print(np.max(real['svd']['vt'] - real['com_svd']['vt']))  # 2e-5 for singular vector; that's fine


# COMPLEX
# ----------------------
# Get an idea of the magnitude of the solution
print(np.max(cplx['svd']['s']))  # 3.4e6
print(np.max(cplx['svd']['vt'].real))  # around 0.1
print(np.max(cplx['svd']['vt'].imag))  # around 0.1

np.allclose(cplx['svd']['s'], cplx['com_svd']['s'], rtol=1e-4)  # False
np.allclose(cplx['svd']['vt'], cplx['com_svd']['vt'], rtol=1e-4, atol=1e-4)  # False

# Maximal absolute deviation
print(np.max(cplx['svd']['s'] - cplx['com_svd']['s']))  # around 17, OK if compared to max absolute value of 3.4e6 
print(np.max(cplx['svd']['vt'].real - cplx['com_svd']['vt'].real))  # 0.06  really bad if compared to max absolute value of 0.1
print(np.max(cplx['svd']['vt'].imag - cplx['com_svd']['vt'].imag))  # 0.16j even worse, that translates to a relative error > 100%!

From the comparison below it is evident, that while for real input the solutions are approximately the same, the difference are substantial when taking complex input.

Example: Image of imaginary part of 2nd singular vector

Both figures are using the same colorbar range

For `svd`

For `svd_compressed`

RogerMoens · 2021-05-16T19:25:47Z

@nicrie: You have not set an iterator for the randomized svd, setting it properly might improve the quality of the fit. Default is the no iterator, as the number of iterations is set to 0. The power iterator or the qr iterator are more precise depending on your singular value spectrum. Secondly, due to its sampling nature you should not compare the left and right singular vectors separately, but rather compare the whole reconstruction A_k w.r.t. the original A, e.g. by a Frobenius or 2-norm. Your maximal absolute deviation therefore doesn't seem appropriate to me. I guess you use it due to the dimensionality of your vectors?

Edit 1: it might as well be that there is a problem with complex valued matrices, I will try some mock-up problem tomorrow. First I need to get acquainted with the complex type svd.

Edit 2: for np.max(cplx['svd']['s'] - cplx['com_svd']['s']) you inherently assume that the singular vectors of both solutions are properly aligned and ordered w.r.t. each other (standard svd vs. randomized svd), I don't think such statement is generally true.

RogerMoens · 2021-05-17T09:08:09Z

I tried to review the reconstruction and I got:

recon = (cplx['svd']['u']*cplx['svd']['s'])@cplx['svd']['vt']
recon_com = (cplx['com_svd']['u']*cplx['com_svd']['s'])@cplx['com_svd']['vt']
print(np.linalg.norm(recon.imag - recon_com.imag, 2)/np.linalg.norm(recon.imag,2)) # 1.4710661
print(np.linalg.norm(recon.real - recon_com.real, 2)/np.linalg.norm(recon.real,2)) # 0.022141397

which is relatively large: 147% and 2.2% relative difference. It seems that the real part is well reconstructed to some extent, but the complex part not. It might be that the sampling of the imaginary part is not appropriate as only the real part is sampled, i.e. we use a real sampling matrix.

I think we have to review lines 720 to 726 in linalg.py for it:

    datatype = np.float64
    if (data.dtype).type in {np.float32, np.complex64}:
        datatype = np.float32
    omega = state.standard_normal(
        size=(n, comp_level), chunks=(data.chunks[1], (comp_level,))
    ).astype(datatype, copy=False)
    mat_h = data.dot(omega)

eric-czech · 2021-05-17T11:48:52Z

However, I would expect that they yield similar solutions (what would be the benefit of having such a method otherwise?).

I was trying to understand that myself recently and came to the conclusion that I don't think this is true. Afaik, SVD is not unique for non-square matrices so comparison of reconstruction quality is necessary instead of comparing singular vectors.

RogerMoens · 2021-05-17T11:58:29Z

@eric-czech: The SVD is unique (up to signs) for square, non-square, and complex matrices as long as no singular value pairs (two or more singular values with the same value exist). See Trefethen, L. N. & Bau III, D. Numerical linear algebra, vol. 50 (Siam, 1997).

I am not sure whether randomized methods such as svd_compressed provide unique solutions, as they severely depend on the sampling of the matrix. In my experience they are not unique.

RogerMoens · 2021-05-19T08:52:59Z

Just checked, the TSQR gives same results for complex matrices as numpy QR. I am not sure whether the sampling for complex matrices should also be a complex matrix.

RogerMoens · 2021-05-21T08:15:44Z

I think the sampling step is appropriate as is for complex matrices. I have planned to check the tsqr including svd next week:

v, s, u = tsqr(a_compressed.T, compute_svd=True)

@nicrie: you could try to compare the outputs with sklearn randomized/compressed/truncated SVD: https://scikit-learn.org/stable/modules/generated/sklearn.decomposition.TruncatedSVD.html

@eric-czech: how does the svd flip function handles complex inputs?

eric-czech · 2021-05-25T15:36:01Z

how does the svd flip function handles complex inputs?

It should be fine AFAIK since the operations in that function only operate on the real term, so this invariant should be true regardless of the input:

x = np.random.rand(1000, 100) + np.random.rand(1000, 100) * 1j
u, s, v = np.linalg.svd(x, full_matrices=False)
uf1, vf1 = svd_flip(u, vt)
uf2, vf2 = svd_flip(np.real(u), np.real(vt))
np.testing.assert_array_equal(np.real(uf1), uf2)
np.testing.assert_array_equal(np.real(vf1), vf2)

It will preserve the complex dtypes too since it's only using addition, multiplication and axis summation.

RogerMoens · 2021-05-27T11:47:19Z

Small update: I just checked the tsqr with svd, I think there is a casting in there somewhere that is not appropriate:

import xarray as xr
import dask.array as da
from dask.array.utils import svd_flip
import numpy as np
import matplotlib.pyplot as plt

a = np.random.rand(500,100)+np.random.rand(500,100)*1j
a_da = da.asarray(a)
u_da, s_da, vt_da = da.linalg.tsqr(a_da, compute_svd=True)

gives me

dask\array\utils.py:108: ComplexWarning: Casting complex values to real discards the imaginary part
  meta = meta.astype(dtype)

The singular values (s_da) seem to be correct, but the u_da and vt_da are not.

nicrie · 2021-05-28T11:40:42Z

Thanks for the updates @RogerMoens. I tried to compare with the truncated SVD of sklearn, but their implementation does not support complex data (perhaps for a reason?)

I tried to understand where the ComplexWarning is coming from, but I realize I don't understand the code sufficiently well.
Nevertheless, I narrowed down the thrown warning to final lines of code in TSQR, lines 502 - 517, where meta_from_array is called and causes the casting from complex to real. However, not sure if its really a problem since it affects only the meta information (?):

graph = HighLevelGraph(layers, dependencies)
u_meta = meta_from_array(data, len((m_u, n_u)), uu.dtype)
s_meta = meta_from_array(data, len((n_s,)), ss.dtype)
vh_meta = meta_from_array(data, len((d_vh, d_vh)), vvh.dtype)
u = Array(
    graph,
    name_u_st4,
    shape=(m_u, n_u),
    chunks=(data.chunks[0], (n_u,)),
    meta=u_meta,
)
s = Array(graph, name_s_st2, shape=(n_s,), chunks=((n_s,),), meta=s_meta)
vh = Array(
    graph, name_v_st2, shape=(d_vh, d_vh), chunks=((n,), (n,)), meta=vh_meta
)
return u, s, vh

jsignell · 2021-06-03T19:39:53Z

If the error arises before you call .compute then the meta might be the issue, but I think what you are seeing is not an error, but incorrect results, so I think the meta is not the root of the issue.

LUOXIAO92 · 2022-08-22T04:24:17Z

The same issue. But on my case, I reconstructed the matrix using M'=UΣV^† . The real matrix input yields reasonable answer, and it fails for complex matrix input. Here is my code

import os
import sys
import time
import numpy as np
from opt_einsum import contract

from dask.distributed import Client
from dask_cuda import LocalCUDACluster
from dask_cuda.initialize import initialize
from dask.utils import parse_bytes
from dask.distributed import performance_report
from dask.distributed import wait
from dask.distributed import get_task_stream

import cupy
import rmm
import cudf
import dask.array as da


def setup_rmm_pool(client):
    client.run(
        cudf.set_allocator,
        pool=False,
        #initial_pool_size= parse_bytes("1GB"),
        allocator="default"
    )
    client.run(
        cupy.cuda.set_allocator,
        #rmm.rmm_cupy_allocator,
        rmm.mr.set_current_device_resource(rmm.mr.ManagedMemoryResource())
    )

if __name__ == "__main__":

    initialize(create_cuda_context=True)
    
    cluster = LocalCUDACluster(local_directory="./tmp/",
                                memory_limit=None)
       
    client = Client(cluster)
    setup_rmm_pool(client)

    nprs = np.random.RandomState(seed=1234)
    rs = da.random.RandomState(seed=1234,RandomState=cupy.random.RandomState)

    SIZE = 15000
    k = 32

    b = nprs.rand(SIZE) + 1j * nprs.rand(SIZE)
    b = da.from_array(b, chunks=(5000))
    b = b.map_blocks(cupy.asarray)
    #a = contract("i,j->ij",b,b) * 10
    a = da.einsum("i,j->ij",b,b) * 10
    #a = a.persist()

    a = da.exp(1.2*a)
    
    t0=time.time()
    u,s,vh=da.linalg.svd_compressed(a,k=k, seed=rs)
    u,s,vh=da.compute(u,s,vh)
    t1=time.time()
    u=da.from_array(u,chunks=(5000,k))
    vh=da.from_array(vh,chunks=(k,5000))
        
    #b = contract("ij,j,jk->ik",u,s,vh)
    b = da.einsum("ij,j,jk->ik",u,s,vh)
    a = a - b #<-a:M, b:M'
    tr = da.sum(da.diagonal(a)).compute()
    print("trace:{:}".format(tr))
    norm=da.linalg.norm(a).compute()
    print("norm:{:}".format(norm))

    sys.exit(0)

I calculated the trace and norm of M-M' (M:original matrix; M':M'~UΣV^†) which should be small. On the complex case, difference between M and M' is large.

trace:(768972.9893344672+7563949.776901031j)
norm:54801713.60352144

But on the real case, it yields a reasonable answer.

trace:-2.470109161656353e-08
norm:1.8832952585877038e-07

LUOXIAO92 · 2022-09-04T07:33:45Z

I think I have found the bugs in svd_compressed(). The svd calculation for complex matrix is $A=U\Sigma V^\dagger$, where $U$ and $V^\dagger$ are complex matrices. The " ${}^\dagger$ " means "Hermitian transpose", which on the case of real matrix reduces to a transpose matrix. While calculating complex matrix svd, we should use Hermitian transpose but not transpose matrix.
At the source code for dask.array.linalg, I found some bugs, and I will figure out the bugs by comments. First, at the "compression_matrix()" function,

    if iterator == "power":
        for i in range(n_power_iter):
            if compute:
                mat_h = mat_h.persist()
                wait(mat_h)
            tmp = data.T.dot(mat_h) #<-should be Hermitian transpose da.conj(data.T) on the complex case
            if compute:
                tmp = tmp.persist()
                wait(tmp)
            mat_h = data.dot(tmp)
        q, _ = tsqr(mat_h)
    else:
        q, _ = tsqr(mat_h)
        for i in range(n_power_iter):
            if compute:
                q = q.persist()
                wait(q)
            q, _ = tsqr(data.T.dot(q)) #<-should be Hermitian transpose da.conj(data.T) on the complex case
            if compute:
                q = q.persist()
                wait(q)
            q, _ = tsqr(data.dot(q))

    return q.T #<-should be Hermitian transpose da.conj(q.T) on the complex case

Then, at the "svd_compressed()" function,

    a_compressed = comp.dot(a)
    v, s, u = tsqr(a_compressed.T, compute_svd=True)
    u = comp.T.dot(u.T) #<-should be  da.conj(comp.T).dot(u.T) on the complex case
    v = v.T
    u = u[:, :k]
    s = s[:k]
    v = v[:k, :]
    if coerce_signs:
        u, v = svd_flip(u, v)
    return u, s, v

Here is my code of randomized svd by the use of power iteration

randomized_svd.py

def rsvd(A:da.Array, k:int, n_oversamples=10, n_power_iter=0, seed=da.random.RandomState(seed=1234,RandomState=np.random.RandomState)):
    """
    A: MxN dask array\\
    k:rank\\
    n_oversamples+k=l << min{M,N} must be satisfied\\
    seed:default is numpy.random.RandomState(seed=1234)
    """
    
    M, N = (da.shape(A)[0], da.shape(A)[1])

    l = k + n_oversamples
    Omega = seed.normal(size=(N, l))
    Q = __power_iteration__(A, Omega, n_power_iter)
    del Omega

    if A.dtype == "complex16" or A.dtype == "complex32" or A.dtype == "complex64" \
        or A.dtype == "complex128" or  A.dtype == "complex256" or A.dtype == "complex512":
        B = da.conj(da.transpose(Q)) @ A
    else:
        B = da.transpose(Q) @ A

    u_tilde, s, vh = da.linalg.svd(B)
    del B
    u = Q @ u_tilde
    del u_tilde
    
    return u[:,:k], s[:k], vh[:k,:]

def __power_iteration__(A, Omega, n_power_iter:int):
    
    Y = A @ Omega
    if A.dtype == "complex16" or A.dtype == "complex32" or A.dtype == "complex64" \
        or A.dtype == "complex128" or  A.dtype == "complex256" or A.dtype == "complex512":
        for q in range(n_power_iter):
            Y = A @ (da.conj(da.transpose(A)) @ Y)
    else:
        for q in range(n_power_iter):
            Y = A @ (da.transpose(A) @ Y)
    Q, _ = da.linalg.tsqr(Y)

    return Q

Here is the test code, which compares to the full svd "cupy.linalg.svd" and "dask.array.linalg.svd_compressed"

test.py

import sys
import time
import numpy as np
from opt_einsum import contract

from dask.distributed import Client
from dask_cuda import LocalCUDACluster
from dask_cuda.initialize import initialize
from dask.utils import parse_bytes
from dask.distributed import performance_report
from dask.distributed import wait
from dask.distributed import get_task_stream

import cupy
import rmm
import cudf
import dask.array as da


def setup_rmm_pool(client):
    client.run(
        cudf.set_allocator,
        pool=False,
        #initial_pool_size= parse_bytes("1GB"),
        allocator="default"
    )
    client.run(
        cupy.cuda.set_allocator,
        #rmm.rmm_cupy_allocator,
        rmm.mr.set_current_device_resource(rmm.mr.ManagedMemoryResource())
    )

if __name__ == "__main__":

    initialize(create_cuda_context=True)
    
    cluster = LocalCUDACluster(local_directory="./tmp/",
                                memory_limit=None)
       
    client = Client(cluster)
    setup_rmm_pool(client)

    nprs = np.random.RandomState(seed=1234)
    rs = da.random.RandomState(seed=1234,RandomState=cupy.random.RandomState)

    SIZE = 10000
    k = 32

    b = nprs.rand(SIZE) + 1j * nprs.rand(SIZE)
    b = cupy.asarray(b, dtype=cupy.complex128)
    a = contract("i,j->ij",b,b) * 10
    a = cupy.exp(1.2*a)

    #full svd
    t0 = time.time()
    u,s,vh = cupy.linalg.svd(a)
    t1=time.time()
    del u,vh
    print("full svd time: {:.2f} s".format(t1-t0))
    print(s[:k])

    a = da.from_array(a, chunks=(5000,5000))

    #rsvd
    from randomized_svd import rsvd
    t0 = time.time()
    u,s,vh = rsvd(a, k=k, seed=rs)
    u,s,vh = da.compute(u,s,vh) 
    t1 = time.time()
    print("rsvd time: {:.2f} s".format(t1-t0))
    print(s)

    u  = da.from_array(u,chunks=(SIZE,k))
    vh = da.from_array(vh,chunks=(k,SIZE))
    b = da.einsum("ij,j,jk->ik",u,s,vh)
    err = a - b
    tr = da.trace(err).compute()
    print("trace:{:19.12e}{:+19.12e}".format(tr.real,tr.imag))
    materr = da.linalg.norm(err).compute()
    materr = float(materr / da.max(da.abs(a)))
    print("err:{:18.12e}".format(materr))


    #dask.array.linalg.svd_compress
    t0 = time.time()
    u,s,vh = da.linalg.svd_compressed(a, k=k, seed=rs)
    u,s,vh = da.compute(u,s,vh) 
    t1 = time.time()
    print("da.linalg.svd_compress time: {:.2f} s".format(t1-t0))
    print(s)

    u  = da.from_array(u,chunks=(SIZE,k))
    vh = da.from_array(vh,chunks=(k,SIZE))
    b = da.einsum("ij,j,jk->ik",u,s,vh)
    err = a - b
    tr = da.trace(err).compute()
    print("trace:{:19.12e}{:+19.12e}".format(tr.real,tr.imag))
    materr = da.linalg.norm(err).compute()
    materr = float(materr / da.max(da.abs(a)))
    print("err:{:18.12e}".format(materr))

    sys.exit(0)

The results are:

full svd by using cupy.linalg.svd()

full svd time: 86.01 s
[2.58296767e+07 1.05558179e+07 3.38844060e+06 9.88898912e+05
 2.59627169e+05 6.74255883e+04 1.89209066e+04 5.94213692e+03
 2.20050461e+03 8.92638939e+02 3.50820653e+02 1.26915979e+02
 4.22639368e+01 1.30184354e+01 3.72038540e+00 9.90751139e-01
 2.47328075e-01 5.81214650e-02 1.29043303e-02 2.71591129e-03
 5.43455898e-04 1.03658501e-04 1.88871552e-05 3.29431051e-06
 5.51039526e-07 8.85671784e-08 1.37849358e-08 6.26629286e-09
 5.37837639e-09 5.31998482e-09 5.17301750e-09 5.07648171e-09]

ramdomized svd by using my function "rsvd()"

rsvd time: 4.72 s
[2.58296767e+07 1.05558179e+07 3.38844060e+06 9.88898912e+05
 2.59627169e+05 6.74255883e+04 1.89209066e+04 5.94213692e+03
 2.20050461e+03 8.92638939e+02 3.50820653e+02 1.26915979e+02
 4.22639368e+01 1.30184354e+01 3.72038540e+00 9.90751139e-01
 2.47328075e-01 5.81214650e-02 1.29043303e-02 2.71591127e-03
 5.43455901e-04 1.03658517e-04 1.88872026e-05 3.29426812e-06
 5.50886046e-07 8.78125741e-08 1.03905719e-08 2.97395585e-09
 2.87520604e-09 2.74762063e-09 2.67777793e-09 2.61763907e-09]
trace:-3.837440116878e-09-2.231108248151e-09
err:4.611527374705e-13

ramdomized svd by using dask.array.linalg.svd_compressed()

da.linalg.svd_compress time: 3.34 s
[1.22753454e+07 3.51378639e+06 1.41504285e+06 2.65017653e+05
 6.59603978e+04 2.11401367e+04 5.50606266e+03 1.51125671e+03
 6.00777209e+02 2.37258157e+02 7.97001757e+01 2.92310333e+01
 8.82265661e+00 2.17949256e+00 5.76741777e-01 1.48938971e-01
 3.01250371e-02 7.56122991e-03 1.67292376e-03 3.35416451e-04
 7.11668163e-05 1.36569683e-05 2.32605468e-06 3.61926695e-07
 5.14757978e-08 6.07435046e-09 2.95743600e-09 2.66944135e-09
 2.56615446e-09 2.42686096e-09 2.35138792e-09 2.28905663e-09]
trace: 1.585523226585e+06+4.090091981828e+06
err:2.223140103552e+02

I calculated the trace and error of matrix $M-M'$, in which $M$ is the original matrix, and
$$M_{ik} \simeq M^\prime_{ik}=\sum_{k=1}^{32} U_{ij} \sigma_{j} V^\dagger_{jk}.$$
The error is defined by $\text{err}=||M-M^\prime||/||M||$.
The results show that my code yields a reasonable answer because the trace and error are quite small, and the large singular values are the same as full svd.
By the way, the skinny matrix svd and tsqr seem to be currect because I used them in my code. But I'm not sure if there are the same bugs (at where should use Hermitian transpose, but have used tranpose matrix for complex matrix) in the code.

ncclementi · 2022-09-06T14:50:01Z

@LUOXIAO92 Thank you for the detailed explanation. By looking at the dask code, we are clearly missing the case for complex numbers. I'm not sure what's the best approach to include this case here, but would you be interested in opening a PR?

Maybe @ian-r-rose or @jakirkham might have some input here.

nicrie · 2022-10-28T15:57:55Z

Perhaps I am wrong, but it seems to me that an easy fix such as replacing all expression in the form of A.T to A.conj().T should do the trick, no? conj creates the conjugates of a matrix which is the identity matrix for real matrices. Therefore, only complex matrices are affected by the change.

I'd still like to get this work so I will try to make a PR if that's really all what's needed.

nicrie mentioned this issue May 11, 2021

dask support nicrie/xmca#6

Open

GenevieveBuckley added the array label Oct 14, 2021

ian-r-rose mentioned this issue Aug 25, 2022

svd_compressed() fails at calculating svd for complex matrix #9421

Closed

ncclementi added the bug Something is broken label Sep 6, 2022

LUOXIAO92 mentioned this issue Sep 7, 2022

bug of svd_compressed needs to fix #9470

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

svd_compressed() fails for complex input #7639

svd_compressed() fails for complex input #7639

nicrie commented May 10, 2021

jsignell commented May 14, 2021

eric-czech commented May 15, 2021

nicrie commented May 16, 2021 •

edited

RogerMoens commented May 16, 2021 •

edited

RogerMoens commented May 17, 2021 •

edited

eric-czech commented May 17, 2021

RogerMoens commented May 17, 2021 •

edited

RogerMoens commented May 19, 2021

RogerMoens commented May 21, 2021 •

edited

eric-czech commented May 25, 2021

RogerMoens commented May 27, 2021

nicrie commented May 28, 2021

jsignell commented Jun 3, 2021

LUOXIAO92 commented Aug 22, 2022

LUOXIAO92 commented Sep 4, 2022 •

edited

ncclementi commented Sep 6, 2022

nicrie commented Oct 28, 2022

svd_compressed() fails for complex input #7639

svd_compressed() fails for complex input #7639

Comments

nicrie commented May 10, 2021

jsignell commented May 14, 2021

eric-czech commented May 15, 2021

nicrie commented May 16, 2021 • edited

Example: Image of imaginary part of 2nd singular vector

For svd

For svd_compressed

RogerMoens commented May 16, 2021 • edited

RogerMoens commented May 17, 2021 • edited

eric-czech commented May 17, 2021

RogerMoens commented May 17, 2021 • edited

RogerMoens commented May 19, 2021

RogerMoens commented May 21, 2021 • edited

eric-czech commented May 25, 2021

RogerMoens commented May 27, 2021

nicrie commented May 28, 2021

jsignell commented Jun 3, 2021

LUOXIAO92 commented Aug 22, 2022

LUOXIAO92 commented Sep 4, 2022 • edited

ncclementi commented Sep 6, 2022

nicrie commented Oct 28, 2022

nicrie commented May 16, 2021 •

edited

For `svd`

For `svd_compressed`

RogerMoens commented May 16, 2021 •

edited

RogerMoens commented May 17, 2021 •

edited

RogerMoens commented May 17, 2021 •

edited

RogerMoens commented May 21, 2021 •

edited

LUOXIAO92 commented Sep 4, 2022 •

edited