# Singular Value Decomposition - Algorithms and Error Analysis

---

We study only algorithms for real matrices, which are most commonly used in the applications described in this course. 


For more details, see 
[A. Kaylor Cline and I. Dhillon, Computation of the Singular Value Decomposition][Hog14] and the references therein.

[Hog14]: #1 "L. Hogben, ed., 'Handbook of Linear Algebra', pp. 58.1-58.13, CRC Press, Boca Raton, 2014."


## Prerequisites

The reader should be familiar with the basic facts about the singular value decomposition and its perturbation theory, and with algorithms for the symmetric eigenvalue decomposition.

 
## Competences 

The reader should be able to apply an adequate algorithm to a given problem, and to assess the accuracy of the solution.

---

## Basics

### Definitions 

The __singular value decomposition__ (SVD) of $A\in\mathbb{R}^{m\times n}$ is
$A=U\Sigma V^T$, where $U\in\mathbb{R}^{m\times m}$ is orthogonal, $U^TU=UU^T=I_m$, 
$V\in\mathbb{R}^{n\times n}$ is orthogonal, $V^TV=VV^T=I_n$, and 
$\Sigma \in\mathbb{R}^{m\times n}$ is diagonal with singular values 
$\sigma_1,\ldots,\sigma_{\min\{m,n\}}$ on the diagonal. 

If $m>n$, the __thin SVD__ of $A$ is $A=U_{1:m,1:n} \Sigma_{1:n,1:n} V^T$.
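
The difference between the full and the thin SVD can be checked on a small example. The following is a minimal sketch, assuming Julia 1.x with the `LinearAlgebra` standard library (newer syntax than the cells in these notes); the $3\times 2$ matrix is chosen only for illustration.

```julia
using LinearAlgebra

A = [1.0 2.0; 3.0 4.0; 5.0 6.0]     # m = 3, n = 2

F_thin = svd(A)                     # thin SVD: U is m×n
F_full = svd(A, full=true)          # full SVD: U is m×m

# In the full SVD, Σ is m×n with the singular values on its diagonal
Σ = zeros(size(A))
for i in 1:length(F_full.S)
    Σ[i, i] = F_full.S[i]
end

@show size(F_thin.U) size(F_full.U)
@show norm(A - F_thin.U * Diagonal(F_thin.S) * F_thin.Vt)
@show norm(A - F_full.U * Σ * F_full.Vt)
```

Both residuals should be of the order of machine precision times $\|A\|_2$.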

### Facts

1. Algorithms for computing the SVD of $A$ are modifications of algorithms for the symmetric eigenvalue decomposition of the matrices $AA^T$, $A^TA$ and 
$\begin{bmatrix} 0 & A\\ A^T & 0 \end{bmatrix}$.

2. The most commonly used approach is the three-step algorithm:
    1. Reduce $A$ to bidiagonal matrix $B$ by orthogonal transformations, $X^TAY=B$.
    2. Compute the SVD of $B$ with QR iterations, $B=W\Sigma Z^T$.
    3. Multiply $U=XW$ and $V=YZ$.

3. If $m\geq n$, the overall operation count for this algorithm is $O(mn^2)$.

4. __Error bounds__: Let $U\Sigma V^T$ and $\tilde U \tilde \Sigma \tilde V^T$ be the
exact and the computed SVDs of $A$, respectively. The algorithms generally
compute the SVD with errors bounded by
$$
|\sigma_i-\tilde \sigma_i|\leq \phi \epsilon\|A\|_2,
\qquad
\|u_i-\tilde u_i\|_2, \| v_i-\tilde v_i\|_2 \leq \psi\epsilon \frac{\|A\|_2}
{\min_{j\neq i} 
|\sigma_i-\tilde \sigma_j|},
$$
where $\epsilon$ is machine precision and $\phi$ and $\psi$
are slowly growing polynomial functions of
$n$ which depend upon the algorithm used (typically $O(n)$ or $O(n^2)$).
These bounds are obtained by combining perturbation bounds with the floating-point error analysis of the algorithms.
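
The connection from Fact 1 between the SVD of $A$ and the three symmetric eigenvalue problems can be verified numerically. The sketch below assumes Julia 1.x with `LinearAlgebra`; the test matrix is arbitrary and the variable names are illustrative only.

```julia
using LinearAlgebra

A = [2.0 0.0 1.0; -1.0 3.0 0.0; 0.0 1.0 4.0; 1.0 1.0 1.0]   # 4×3
m, n = size(A)
σ = svdvals(A)                       # sorted in decreasing order

# The eigenvalues of A'A are σ_i^2
λ = eigvals(Symmetric(A' * A))
err_gram = norm(sort(sqrt.(abs.(λ)), rev=true) - σ)

# The eigenvalues of [0 A; A' 0] are ±σ_i together with m-n zeros
J = [zeros(m, m) A; A' zeros(n, n)]
μ = eigvals(Symmetric(J))
err_jordan = norm(sort(μ, rev=true)[1:n] - σ)

@show err_gram err_jordan
```

Practical algorithms avoid forming $A^TA$ or $AA^T$ explicitly and work on $A$ directly, as in the three-step algorithm of Fact 2.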

##  Bidiagonalization


### Facts

1. The reduction of $A$ to a bidiagonal matrix can be performed by applying 
$\min\{m-1,n\}$ Householder reflections $H_L$ from the left and $n-2$ Householder reflections $H_R$ from the right. In the first step, $H_L$ is chosen to annihilate all elements of the first column below the diagonal, and $H_R$ is chosen to annihilate all elements of the first row to the right of the first super-diagonal. Applying this procedure recursively yields the bidiagonal matrix.

2. $H_L$ and $H_R$ do not depend on the normalization of the respective Householder 
vectors $v_L$ and $v_R$. With the normalization $[v_L]_1=[v_R]_1=1$, the vectors $v_L$ are stored in the strictly lower-triangular part of $A$, and the vectors $v_R$ are stored in the upper-triangular part of $A$ above the super-diagonal. 

3. The matrices $H_L$ and $H_R$ are not formed explicitly. Given $v_L$ and $v_R$, $A$ is overwritten with $H_L A H_R$ in $O(mn)$ operations by using matrix-vector multiplication and rank-one updates.

4. Instead of performing rank-one updates, $p$ transformations can be accumulated, and then applied. This __block algorithm__ is rich in matrix-matrix multiplications (roughly one
half of the operations are performed using BLAS 3 routines), but
it requires extra workspace.

5. If the matrices $X$ and $Y$ are needed explicitly, they can be computed from the 
stored Householder vectors.
In order to minimize the operation count, the computation
starts from the smallest matrix and the size is gradually
increased.

6. The backward error bounds for the bidiagonalization are as follows: 
The computed matrix $\tilde B$ is equal to the matrix which
would be obtained by exact bidiagonalization of some perturbed matrix $A+\Delta A$, 
where $\|\Delta A\|_2 \leq \psi \varepsilon \|A\|_2$ and $\psi$ is a
slowly increasing function of $n$.
The computed matrices $\tilde X$ and $\tilde Y$ satisfy $\tilde X=X+\Delta X$ and 
$\tilde Y=Y+\Delta Y$, where
$\|\Delta X \|_2,\|\Delta Y\|_2\leq \phi \varepsilon$ 
and $\phi$ is a slowly increasing function of $n$.

7. The bidiagonal reduction is implemented in the 
[LAPACK](http://www.netlib.org/lapack) subroutine 
[DGEBRD](http://www.netlib.org/lapack/explore-html/dd/d9a/group__double_g_ecomputational.html#ga9c735b94f840f927f8085fd23f3ee2e6).
The computation of $X$ and $Y$ is implemented in the subroutine
[DORGBR](http://www.netlib.org/lapack/lapack-3.1.1/html/dorgtr.f.html), which is not yet wrapped in Julia.

8. Bidiagonalization can also be performed using Givens rotations.
Givens rotations act more selectively than Householder reflectors, and
are useful if $A$ has some special structure, for example, if $A$ is a banded matrix. 
Error bounds for function `myBidiagG()` are the same as above, 
but with slightly different functions $\psi$ and $\phi$.
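
The Householder reduction described in the facts above can be sketched in a few lines. This is a minimal illustration, assuming Julia 1.x with `LinearAlgebra` and $m\geq n$; the names `house` and `bidiag_householder` are hypothetical, and, unlike `DGEBRD`, the sketch accumulates $X$ and $Y$ explicitly instead of storing the Householder vectors in $A$.

```julia
using LinearAlgebra

# Householder vector v and scalar β such that (I - β v v') x is a multiple of e_1
function house(x::Vector{Float64})
    v = copy(x)
    nx = norm(x)
    nx == 0 && return v, 0.0         # nothing to annihilate
    α = -copysign(nx, x[1])          # sign chosen to avoid cancellation
    v[1] -= α
    return v, 2 / dot(v, v)
end

# Returns X, B, Y with X' * A * Y = B upper bidiagonal (assumes m ≥ n)
function bidiag_householder(A::Matrix{Float64})
    m, n = size(A)
    B = copy(A)
    X = Matrix{Float64}(I, m, m)
    Y = Matrix{Float64}(I, n, n)
    for j in 1:min(m - 1, n)
        # Left reflection: annihilate B[j+1:m, j]
        v, β = house(B[j:m, j])
        B[j:m, :] -= β * v * (v' * B[j:m, :])
        X[:, j:m] -= β * (X[:, j:m] * v) * v'
        if j <= n - 2
            # Right reflection: annihilate B[j, j+2:n]
            v, β = house(B[j, j+1:n])
            B[:, j+1:n] -= β * (B[:, j+1:n] * v) * v'
            Y[:, j+1:n] -= β * (Y[:, j+1:n] * v) * v'
        end
    end
    return X, B, Y
end

A = [7.0 7.0 9.0; 6.0 -7.0 -4.0; -5.0 0.0 5.0; 5.0 -3.0 8.0; -2.0 -7.0 5.0]
X, B, Y = bidiag_householder(A)
@show norm(X' * A * Y - B)
@show norm(svdvals(B) - svdvals(A))
```

The annihilated entries of `B` are left as rounding-level residue rather than being set to exact zeros; a production code would overwrite them with the Householder vectors, as described above.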

In [9]:
m=8
n=5
A=map(Float64,rand(-9:9,m,n))

8x5 Array{Float64,2}:
  7.0   7.0   9.0  -8.0  -1.0
  6.0  -7.0  -4.0  -4.0  -5.0
 -5.0   0.0   5.0   5.0  -3.0
  5.0  -3.0   8.0  -3.0   0.0
 -2.0  -7.0   5.0   3.0  -4.0
 -2.0  -1.0  -2.0   3.0  -1.0
 -4.0   1.0  -4.0   3.0  -1.0
  9.0   9.0  -9.0  -4.0   9.0

In [10]:
?LAPACK.gebrd!

```
gebrd!(A) -> (A, d, e, tauq, taup)
```

Reduce `A` in-place to bidiagonal form `A = QBP'`. Returns `A`, containing the bidiagonal matrix `B`; `d`, containing the diagonal elements of `B`; `e`, containing the off-diagonal elements of `B`; `tauq`, containing the elementary reflectors representing `Q`; and `taup`, containing the elementary reflectors representing `P`.


In [11]:
# We need copy()
Out=LAPACK.gebrd!(copy(A))

(
8x5 Array{Float64,2}:
 -15.4919     13.73       -0.0571037   -0.604627    0.24521  
   0.266762   -9.9183     13.2337      -0.767195   -0.130139 
  -0.222302    0.0430786  -8.99108    -11.1333     -0.0998221
   0.222302    0.254667   -0.0422467   11.7769     -3.09596  
  -0.0889208   0.308034   -0.0278571    0.343363    5.18788  
  -0.0889208   0.068709    0.238191     0.024516    0.149822 
  -0.177842   -0.107552    0.0565483   -0.0934766   0.376151 
   0.400143   -0.11361     0.905872    -0.502951   -0.839261 ,

[-15.491933384829668,-9.918301150405899,-8.991081772164891,11.776853680722558,5.187877171664298],[13.729985433349887,13.233735713960527,-11.133291343294982,-3.0959589159153476,1.081982666e-314],[1.451848057057532,1.679555042977175,1.0620801714083317,1.449069014372053,1.070494599071799],[1.3996163312599017,1.2456988308632695,1.9802677228897878,0.0,0.0])

In [13]:
B=Bidiagonal(Out[2],Out[3][1:end-1],true)

5x5 Bidiagonal{Float64}:
 -15.4919  13.73     0.0        0.0      0.0    
   0.0     -9.9183  13.2337     0.0      0.0    
   0.0      0.0     -8.99108  -11.1333   0.0    
   0.0      0.0      0.0       11.7769  -3.09596
   0.0      0.0      0.0        0.0      5.18788

In [14]:
svdvals(A), svdvals(B)

([22.861021785090372,18.518793933255157,13.330442065692964,5.62694647147077,2.657958227599012],[22.861021785090372,18.518793933255157,13.330442065692964,5.62694647147077,2.657958227599012])

In [15]:
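# Extract X: accumulate the left Householder reflectors stored below the diagonal of H
function myBidiagX{T}(H::Array{T})
    m,n=size(H)
    X = eye(T,m,n)
    v=Array(T,m)
    # Backward accumulation: start from the smallest block and enlarge it
    for j = n : -1 : 1
        # Householder vector with the normalization v[j]=1
        v[j] = one(T)
        v[j+1:m] = H[j+1:m, j]
        γ = -2 / (v[j:m]⋅v[j:m])
        # Apply I-2vv'/(v'v) to X[j:m,j:n] as a rank-one update
        w = γ * X[j:m, j:n]'*v[j:m]
        X[j:m, j:n] = X[j:m, j:n] + v[j:m]*w'
    end
    X
end

# Extract Y: H is the transposed output of gebrd!, so the right Householder
# vectors are stored in its columns, below the first sub-diagonal
function myBidiagY{T}(H::Array{T})
    n,m=size(H)
    Y = eye(T,n)
    v=Array(T,n)
    for j = n-2 : -1 : 1
        # Householder vector with the normalization v[j+1]=1
        v[j+1] = one(T)
        v[j+2:n] = H[j+2:n, j]
        γ = -2 / (v[j+1:n]⋅v[j+1:n])
        # Apply I-2vv'/(v'v) to Y[j+1:n,j+1:n] as a rank-one update
        w = γ * Y[j+1:n, j+1:n]'*v[j+1:n]
        Y[j+1:n, j+1:n] = Y[j+1:n, j+1:n] + v[j+1:n]*w'
    end
    Y
end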

myBidiagY (generic function with 1 method)

In [16]:
X=myBidiagX(Out[1])

8x5 Array{Float64,2}:
 -0.451848   0.231986   0.552061   -0.399587  -0.459447 
 -0.387298  -0.61767   -0.0479767   0.425464  -0.48393  
  0.322749  -0.123924  -0.193215   -0.623463  -0.370064 
 -0.322749  -0.376156   0.117871   -0.37134    0.466137 
  0.129099  -0.537989  -0.0796456  -0.277652   0.0918663
  0.129099  -0.136029  -0.315482   -0.138557  -0.237409 
  0.258199   0.139382  -0.137239    0.107718  -0.367835 
 -0.580948   0.283641  -0.719023   -0.157538   0.0195867

In [17]:
Y=myBidiagY(Out[1]')

5x5 Array{Float64,2}:
 1.0   0.0         0.0        0.0         0.0     
 0.0  -0.399616    0.733478  -0.422693   -0.351636
 0.0   0.0799233  -0.287583  -0.880653    0.367912
 0.0   0.846246    0.512213  -0.0305457   0.143428
 0.0  -0.3432      0.341971   0.211775    0.848776

In [18]:
# Orthogonality
norm(X'*X-I), norm(Y'*Y-I)

(9.636975638173114e-16,6.864563874634175e-16)

In [19]:
# Error
X'*A*Y-B

5x5 Array{Float64,2}:
  0.0           0.0           9.30077e-16   2.13023e-15  -1.13406e-16
 -6.10623e-16  -7.10543e-15   8.88178e-15  -4.73302e-15  -2.54077e-16
 -6.66134e-16  -4.95338e-16  -1.77636e-15  -1.77636e-15   9.5386e-16 
  8.32667e-16   1.14072e-15  -1.53194e-15  -1.77636e-15  -8.88178e-16
  1.05471e-15  -2.79425e-15   4.00919e-15   1.57854e-15   0.0        

In [40]:
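# Bidiagonalization using Givens rotations
function myBidiagG{T}(A::Array{T})
    m,n=size(A)
    X=eye(T,m,m)
    Y=eye(T,n,n)
    for j = 1 : n        
        # Rotate rows j and i to annihilate A[i,j] below the diagonal
        for i = j+1 : m
            G,r=givens(A,j,i,j)
            A=G*A
            X=G*X
        end
        # Rotate columns j+1 and i to annihilate A[j,i] right of the super-diagonal
        for i=j+2:n
            G,r=givens(A',j+1,i,j)
            A=A*G'
            Y*=G'
        end
    end
    # X accumulates the left rotations, so its transpose is returned
    X',Bidiagonal(diag(A),diag(A,1),true), Y
end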

myBidiagG (generic function with 1 method)

In [41]:
X1, B1, Y1 = myBidiagG(A)

(
8x8 Array{Float64,2}:
  0.451848  -0.231986   0.552061   …   0.245879   0.0246806   0.0736   
  0.387298   0.61767   -0.0479767     -0.217959  -0.044084   -0.0390443
 -0.322749   0.123924  -0.193215      -0.531605  -0.183217   -0.0363024
  0.322749   0.376156   0.117871      -0.189759   0.497483   -0.319017 
 -0.129099   0.537989  -0.0796456      0.459139  -0.0214966   0.625106 
 -0.129099   0.136029  -0.315482   …   0.602177  -0.0146213  -0.653383 
 -0.258199  -0.139382  -0.137239       0.0        0.845999    0.180143 
  0.580948  -0.283641  -0.719023       0.0        0.0         0.199628 ,

5x5 Bidiagonal{Float64}:
 diag: 15.4919  -9.9183  8.99108  11.7769  5.18788
 super: 13.73  13.2337  11.1333  3.09596,

5x5 Array{Float64,2}:
 1.0   0.0         0.0        0.0         0.0     
 0.0   0.399616   -0.733478   0.422693   -0.351636
 0.0  -0.0799233   0.287583   0.880653    0.367912
 0.0  -0.846246   -0.512213   0.0305457   0.143428
 0.0   0.3432     -0.341971  -0.211775    0.848776)

In [42]:
# Orthogonality
norm(X1'*X1-I), norm(Y1'*Y1-I)

(8.327929118379496e-16,5.27722152990908e-16)

In [43]:
# X1'*A*Y1 should reproduce B1 (padded with zero rows) up to rounding errors
X1'*A*Y1

8x5 Array{Float64,2}:
 15.4919       13.73          9.69608e-16  -1.02408e-16   1.13406e-16
 -1.66533e-15  -9.9183       13.2337        1.78395e-15   1.76181e-15
  1.11022e-16   7.20644e-16   8.99108      11.1333       -6.86318e-17
  1.77636e-15   2.00954e-15  -1.40278e-15  11.7769        3.09596    
  5.23886e-16  -1.14997e-15  -2.47701e-16   1.30528e-15   5.18788    
  0.0           1.5857e-15   -6.69309e-16   3.64122e-15  -2.3305e-18 
 -4.44089e-16  -3.83116e-16   7.91956e-17  -2.24563e-16   3.92395e-16
 -9.15934e-16   7.97811e-16  -1.56461e-15   1.8307e-15   -4.96202e-16

In [44]:
# X, Y and B are not unique
B

5x5 Bidiagonal{Float64}:
 -15.4919  13.73     0.0        0.0      0.0    
   0.0     -9.9183  13.2337     0.0      0.0    
   0.0      0.0     -8.99108  -11.1333   0.0    
   0.0      0.0      0.0       11.7769  -3.09596
   0.0      0.0      0.0        0.0      5.18788

In [45]:
B1

5x5 Bidiagonal{Float64}:
 15.4919  13.73     0.0       0.0     0.0    
  0.0     -9.9183  13.2337    0.0     0.0    
  0.0      0.0      8.99108  11.1333  0.0    
  0.0      0.0      0.0      11.7769  3.09596
  0.0      0.0      0.0       0.0     5.18788

## Bidiagonal QR method

Let $B$ be a real upper-bidiagonal matrix of order $n$ and let 
$B=W\Sigma Z^T$ be its SVD.

All methods for computing the SVD of a bidiagonal matrix are derived from the methods 
for computing the EVD of the tridiagonal matrix $T=B^T B$.
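
To make the connection explicit, denote the diagonal entries of $B$ by $d_1,\ldots,d_n$ and its super-diagonal entries by $e_1,\ldots,e_{n-1}$ (this notation is introduced here only for illustration). Multiplying out $B^TB$ gives
$$
T=B^TB=\begin{bmatrix}
d_1^2 & d_1e_1 & & \\
d_1e_1 & d_2^2+e_1^2 & \ddots & \\
 & \ddots & \ddots & d_{n-1}e_{n-1} \\
 & & d_{n-1}e_{n-1} & d_n^2+e_{n-1}^2
\end{bmatrix},
$$
and from $B=W\Sigma Z^T$ it follows that $T=Z\Sigma^2Z^T$, so the eigenvalues of $T$ are $\sigma_i^2$ and its eigenvectors are the right singular vectors of $B$.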


### Facts

1. The shift $\mu$ is the eigenvalue of the $2\times 2$ matrix $T_{n-1:n,n-1:n}$ which is closer to $T_{n,n}$. The first Givens rotation from the right is the one which annihilates 
the element $(1,2)$ of the shifted $2\times 2$ matrix $T_{1:2,1:2}-\mu I$. Applying this rotation to $B$ creates the bulge at the element $B_{2,1}$. This bulge is subsequently chased out by applying appropriate Givens rotations alternately from the left and from the right.
This is the __Golub-Kahan algorithm__.

2. The computed SVD satisfies the error bounds from Fact 4 of the Basics section.

3. A special zero-shift variant of the QR algorithm (the __Demmel-Kahan algorithm__) computes the singular values with high relative accuracy. 

4. The tridiagonal divide-and-conquer method, bisection and inverse iteration, and the MRRR method can also be adapted for bidiagonal matrices. 

5. The zero-shift QR algorithm for bidiagonal matrices is implemented in the LAPACK routine 
[DBDSQR](http://www.netlib.org/lapack/explore-html/db/dcc/dbdsqr_8f.html). It is also used in the function `svdvals()`. The divide-and-conquer algorithm for bidiagonal matrices is implemented in the LAPACK routine 
[DBDSDC](http://www.netlib.org/lapack/explore-html/d9/d08/dbdsdc_8f.html). However, this algorithm also calls zero-shift QR to compute singular values.
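
One step of the shifted iteration from Fact 1 can be sketched on a dense copy of $B$. This is an illustration only, assuming Julia 1.x with `LinearAlgebra`; the names `rot` and `golub_kahan_step` are hypothetical, $T=B^TB$ is formed explicitly just to obtain the shift, and the singular vectors are not accumulated. LAPACK works on the two diagonals directly and adds deflation and convergence tests.

```julia
using LinearAlgebra

# c, s such that [c s; -s c] * [f; g] = [r; 0]
function rot(f, g)
    r = hypot(f, g)
    return r == 0 ? (1.0, 0.0) : (f / r, g / r)
end

# One implicit shifted QR step on an upper-bidiagonal matrix stored as dense
function golub_kahan_step(B::Matrix{Float64})
    n = size(B, 1)
    B = copy(B)
    T = B' * B
    # Shift: eigenvalue of T[n-1:n, n-1:n] closer to T[n, n]
    λ = eigvals(Symmetric(T[n-1:n, n-1:n]))
    μ = abs(λ[1] - T[n, n]) < abs(λ[2] - T[n, n]) ? λ[1] : λ[2]
    y, z = T[1, 1] - μ, T[1, 2]
    for k in 1:n-1
        # Rotation from the right on columns k, k+1 (creates a bulge at B[k+1, k])
        c, s = rot(y, z)
        ck, ck1 = B[:, k], B[:, k+1]
        B[:, k]   =  c * ck + s * ck1
        B[:, k+1] = -s * ck + c * ck1
        # Rotation from the left on rows k, k+1 (moves the bulge to B[k, k+2])
        c, s = rot(B[k, k], B[k+1, k])
        rk, rk1 = B[k, :], B[k+1, :]
        B[k, :]   =  c * rk + s * rk1
        B[k+1, :] = -s * rk + c * rk1
        if k < n - 1
            y, z = B[k, k+1], B[k, k+2]
        end
    end
    return B
end

# The bidiagonal matrix B from the example above, rounded to the displayed digits
d = [-15.4919, -9.9183, -8.99108, 11.7769, 5.18788]
e = [13.73, 13.2337, -11.1333, -3.09596]
B0 = Matrix(Bidiagonal(d, e, :U))
B1 = golub_kahan_step(B0)
B10 = foldl((M, _) -> golub_kahan_step(M), 1:10; init = B0)   # ten steps
@show norm(svdvals(B1) - svdvals(B0))
@show abs(B0[end-1, end]) abs(B10[end-1, end])
```

Each step is an orthogonal equivalence, so the singular values are preserved, and the last super-diagonal entry is expected to shrink rapidly until the last diagonal entry can be deflated as a singular value (up to sign).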

### Examples

In [46]:
W,σ,Z=svd(B)

(
5x5 Array{Float64,2}:
  0.795457     0.459995   -0.375278    0.0708052  -0.0990277
 -0.549747     0.289035   -0.691184    0.20634    -0.306474 
  0.240618    -0.695144   -0.127019    0.338357   -0.572938 
 -0.0843939    0.470177    0.600999    0.259522   -0.585883 
  0.00273443  -0.0238955  -0.0640171  -0.877823   -0.474078 ,

[22.86102178509037,18.51879393325515,13.330442065692965,5.626946471470772,2.6579582275990132],
5x5 Array{Float64,2}:
 -0.539047   -0.384809   0.436128  -0.194939    0.577184
  0.716248    0.186242   0.127738  -0.190937    0.632083
 -0.41287     0.544048  -0.600498  -0.0553663   0.412173
 -0.160656    0.716918   0.637039  -0.126298   -0.196082
  0.0120496  -0.085298  -0.164494  -0.952116   -0.242889)

In [47]:
@which svd(B)

In [48]:
σ1=svdvals(B)

5-element Array{Float64,1}:
 22.861  
 18.5188 
 13.3304 
  5.62695
  2.65796

In [19]:
@which svdvals(B)

In [49]:
σ-σ1

5-element Array{Float64,1}:
 -3.55271e-15
 -7.10543e-15
  1.77636e-15
  1.77636e-15
  1.33227e-15

In [50]:
?LAPACK.bdsqr!

```
bdsqr!(uplo, d, e_, Vt, U, C) -> (d, Vt, U, C)
```

Computes the singular value decomposition of a bidiagonal matrix with `d` on the diagonal and `e_` on the off-diagonal. If `uplo = U`, `e_` is the superdiagonal. If `uplo = L`, `e_` is the subdiagonal. Can optionally also compute the product `Q' * C`.

Returns the singular values in `d`, and the matrix `C` overwritten with `Q' * C`.


In [51]:
BV=eye(n)
BU=eye(n)
BC=eye(n)
σ2,Z2,W2,C = LAPACK.bdsqr!('U',copy(B.dv),copy(B.ev),BV,BU,BC)

([22.86102178509037,18.51879393325515,13.330442065692965,5.626946471470772,2.6579582275990132],
5x5 Array{Float64,2}:
 -0.539047   0.716248  -0.41287    -0.160656   0.0120496
 -0.384809   0.186242   0.544048    0.716918  -0.085298 
  0.436128   0.127738  -0.600498    0.637039  -0.164494 
 -0.194939  -0.190937  -0.0553663  -0.126298  -0.952116 
  0.577184   0.632083   0.412173   -0.196082  -0.242889 ,

5x5 Array{Float64,2}:
  0.795457     0.459995   -0.375278    0.0708052  -0.0990277
 -0.549747     0.289035   -0.691184    0.20634    -0.306474 
  0.240618    -0.695144   -0.127019    0.338357   -0.572938 
 -0.0843939    0.470177    0.600999    0.259522   -0.585883 
  0.00273443  -0.0238955  -0.0640171  -0.877823   -0.474078 ,

5x5 Array{Float64,2}:
  0.795457   -0.549747   0.240618  -0.0843939   0.00273443
  0.459995    0.289035  -0.695144   0.470177   -0.0238955 
 -0.375278   -0.691184  -0.127019   0.600999   -0.0640171 
  0.0708052   0.20634    0.338357   0.259522   -0.877823  
 -0.0990

In [52]:
W2'*full(B)*Z2'

5x5 Array{Float64,2}:
 22.861        -2.80024e-15   4.65493e-16   1.14779e-15  -1.87874e-15
 -2.0624e-15   18.5188       -3.85608e-15  -7.42469e-16  -1.30903e-15
  4.42721e-15  -7.11548e-15  13.3304       -3.87516e-14   1.17858e-15
  7.40835e-16  -2.37207e-15  -1.96404e-15   5.62695      -4.17962e-16
 -8.59094e-16  -1.05379e-17   4.70719e-16  -8.70534e-16   2.65796    

In [53]:
?LAPACK.bdsdc!

```
bdsdc!(uplo, compq, d, e_) -> (d, e, u, vt, q, iq)
```

Computes the singular value decomposition of a bidiagonal matrix with `d` on the diagonal and `e_` on the off-diagonal using a divide and conquer method. If `uplo = U`, `e_` is the superdiagonal. If `uplo = L`, `e_` is the subdiagonal. If `compq = N`, only the singular values are found. If `compq = I`, the singular values and vectors are found. If `compq = P`, the singular values and vectors are found in compact form. Only works for real types.

Returns the singular values in `d`, and if `compq = P`, the compact singular vectors in `iq`.


In [55]:
σ3,ee,W3,Z3,rest=LAPACK.bdsdc!('U','I',copy(B.dv),copy(B.ev))

([22.86102178509037,18.51879393325515,13.330442065692965,5.626946471470772,2.6579582275990132],e = 2.7182818284590...,
5x5 Array{Float64,2}:
  0.795457     0.459995   -0.375278    0.0708052  -0.0990277
 -0.549747     0.289035   -0.691184    0.20634    -0.306474 
  0.240618    -0.695144   -0.127019    0.338357   -0.572938 
 -0.0843939    0.470177    0.600999    0.259522   -0.585883 
  0.00273443  -0.0238955  -0.0640171  -0.877823   -0.474078 ,

5x5 Array{Float64,2}:
 -0.539047   0.716248  -0.41287    -0.160656   0.0120496
 -0.384809   0.186242   0.544048    0.716918  -0.085298 
  0.436128   0.127738  -0.600498    0.637039  -0.164494 
 -0.194939  -0.190937  -0.0553663  -0.126298  -0.952116 
  0.577184   0.632083   0.412173   -0.196082  -0.242889 ,

[4.07674394e-316],[82514216])

In [56]:
W3'*full(B)*Z3'

5x5 Array{Float64,2}:
 22.861        -2.80024e-15   4.65493e-16   1.14779e-15  -1.87874e-15
 -2.0624e-15   18.5188       -3.85608e-15  -7.42469e-16  -1.30903e-15
  4.42721e-15  -7.11548e-15  13.3304       -3.87516e-14   1.17858e-15
  7.40835e-16  -2.37207e-15  -1.96404e-15   5.62695      -4.17962e-16
 -8.59094e-16  -1.05379e-17   4.70719e-16  -8.70534e-16   2.65796    

The functions `svd()`, `LAPACK.bdsqr!()` and `LAPACK.bdsdc!()` return identical singular values for this matrix, which is consistent with all three using the same algorithm to compute them.

In [29]:
[σ3-σ2 σ3-σ]

5x2 Array{Float64,2}:
 0.0  0.0
 0.0  0.0
 0.0  0.0
 0.0  0.0
 0.0  0.0

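Let us compute some timings. Doubling $n$ from 1000 to 2000 increases the time for `svdvals()` by a factor of about four, which is consistent with $O(n^2)$ operations.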

In [57]:
n=1000
Abig=Bidiagonal(rand(n),rand(n-1),true)
Bbig=Bidiagonal(rand(2*n),rand(2*n-1),true)
@time svdvals(Abig);
@time svdvals(Bbig);
@time LAPACK.bdsdc!('U','N',copy(Abig.dv),copy(Abig.ev));
@time svd(Abig);
@time svd(Bbig);

  0.024740 seconds (168 allocations: 151.667 KB)
  0.096291 seconds (24 allocations: 282.297 KB)
  0.025593 seconds (23 allocations: 141.641 KB)
  0.142365 seconds (33 allocations: 45.884 MB, 5.09% gc time)
  0.394746 seconds (33 allocations: 183.320 MB, 4.55% gc time)


## QR method

The final algorithm is obtained by combining bidiagonalization and bidiagonal SVD methods.
The standard method is implemented in the LAPACK routine 
[DGESVD](http://www.netlib.org/lapack/explore-html/d8/d2d/dgesvd_8f.html).
The divide-and-conquer method is implemented in the LAPACK routine 
[DGESDD](http://www.netlib.org/lapack/explore-html/db/db4/dgesdd_8f.html).

The functions `svd()`, `svdvals()`, and `svdvecs()` use `DGESDD`.
Wrappers for `DGESVD` and `DGESDD` give more control over the output of singular vectors.
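
In more recent Julia versions the two LAPACK drivers can also be selected without calling the wrappers directly. The sketch below assumes Julia 1.3 or later, where `svd` accepts an `alg` keyword with the values `LinearAlgebra.DivideAndConquer()` and `LinearAlgebra.QRIteration()`; this keyword is not available in the Julia version used for the cells in these notes.

```julia
using LinearAlgebra

A = [7.0 7.0 9.0; 6.0 -7.0 -4.0; -5.0 0.0 5.0; 5.0 -3.0 8.0]

F_dc = svd(A; alg = LinearAlgebra.DivideAndConquer())   # divide-and-conquer driver
F_qr = svd(A; alg = LinearAlgebra.QRIteration())        # QR iteration driver

# Both drivers should agree to roughly machine precision
@show norm(F_dc.S - F_qr.S)
@show norm(A - F_qr.U * Diagonal(F_qr.S) * F_qr.Vt)
```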

In [58]:
# The built-in algorithm
U,σA,V=svd(A)

(
8x5 Array{Float64,2}:
 -0.321657   -0.701456   -0.271638    0.50228    0.109277 
 -0.017291   -0.111725    0.865045    0.364116   0.235289 
  0.329973   -0.0373387  -0.361935    0.094954   0.657435 
  0.0110333  -0.524856    0.113126   -0.566143   0.0762864
  0.40297    -0.173488    0.160768   -0.281516   0.316847 
  0.112608    0.1799      0.0175702   0.0467709  0.403385 
  0.0856422   0.313894   -0.0875169   0.351455   0.121616 
 -0.777711    0.240036    0.0173638  -0.283972   0.46557  ,

[22.86102178509037,18.51879393325515,13.330442065692965,5.626946471470772,2.6579582275990132],
5x5 Array{Float64,2}:
 -0.539047  -0.384809    0.436128  -0.194939   0.577184 
 -0.525384   0.0515796  -0.702928   0.423875   0.218021 
  0.321895  -0.804312   -0.438627  -0.238408   0.0153023
  0.401281   0.402143   -0.242537  -0.322642   0.71717  
 -0.410801   0.201556   -0.253903  -0.788285  -0.323664 )

In [60]:
# With our building blocks
U1=X*W
V1=Y*Z
U1'*A*V1

5x5 Array{Float64,2}:
 22.861        -3.06152e-15   6.2966e-15   -1.55787e-15  -2.54711e-15
 -5.42459e-15  18.5188       -4.60822e-15   2.08862e-15   2.20327e-17
  1.35696e-14  -9.11368e-15  13.3304       -3.9798e-14    7.53501e-16
  1.10926e-15  -6.56605e-15  -2.42205e-15   5.62695      -2.45295e-16
  7.87305e-16   1.7859e-16    3.99238e-15  -2.24218e-15   2.65796    

In [61]:
?LAPACK.gesvd!

```
gesvd!(jobu, jobvt, A) -> (U, S, VT)
```

Finds the singular value decomposition of `A`, `A = U * S * V'`. If `jobu = A`, all the columns of `U` are computed. If `jobvt = A` all the rows of `V'` are computed. If `jobu = N`, no columns of `U` are computed. If `jobvt = N` no rows of `V'` are computed. If `jobu = O`, `A` is overwritten with the columns of (thin) `U`. If `jobvt = O`, `A` is overwritten with the rows of (thin) `V'`. If `jobu = S`, the columns of (thin) `U` are computed and returned separately. If `jobvt = S` the rows of (thin) `V'` are computed and returned separately. `jobu` and `jobvt` can't both be `O`.

Returns `U`, `S`, and `Vt`, where `S` are the singular values of `A`.


In [62]:
# DGESVD
LAPACK.gesvd!('A','A',copy(A))

(
8x8 Array{Float64,2}:
 -0.321657    0.701456    0.271638   …   0.216588    0.122342    0.0678585
 -0.017291    0.111725   -0.865045      -0.193242   -0.113766   -0.0262469
  0.329973    0.0373387   0.361935      -0.46934    -0.310703    0.0259525
  0.0110333   0.524856   -0.113126      -0.209352    0.201851   -0.548358 
  0.40297     0.173488   -0.160768       0.316964    0.415069    0.573829 
  0.112608   -0.1799     -0.0175702  …   0.715398   -0.176802   -0.496675 
  0.0856422  -0.313894    0.0875169     -0.194169    0.788301   -0.298404 
 -0.777711   -0.240036   -0.0173638     -0.0409508   0.0992416   0.168302 ,

[22.861021785090376,18.51879393325515,13.33044206569296,5.626946471470769,2.6579582275990123],
5x5 Array{Float64,2}:
 -0.539047  -0.525384   0.321895    0.401281  -0.410801
  0.384809  -0.0515796  0.804312   -0.402143  -0.201556
 -0.436128   0.702928   0.438627    0.242537   0.253903
  0.194939  -0.423875   0.238408    0.322642   0.788285
  0.577184   0.218021   0.0153023

In [63]:
?LAPACK.gesdd!

```
gesdd!(job, A) -> (U, S, VT)
```

Finds the singular value decomposition of `A`, `A = U * S * V'`, using a divide and conquer approach. If `job = A`, all the columns of `U` and the rows of `V'` are computed. If `job = N`, no columns of `U` or rows of `V'` are computed. If `job = O`, `A` is overwritten with the columns of (thin) `U` and the rows of (thin) `V'`. If `job = S`, the columns of (thin) `U` and the rows of (thin) `V'` are computed and returned separately.


In [64]:
LAPACK.gesdd!('N',copy(A))

(8x0 Array{Float64,2},[22.861021785090372,18.518793933255157,13.330442065692964,5.62694647147077,2.657958227599012],5x0 Array{Float64,2})

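Let us perform some timings. Doubling $n$ from 1000 to 2000 increases the running time by a factor of about seven, which is consistent with $O(n^3)$ operations.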

In [65]:
n=1000
Abig=rand(n,n)
Bbig=rand(2*n,2*n)
@time Ubig,σbig,Vbig=svd(Abig);
@time svd(Bbig);
@time LAPACK.gesvd!('A','A',copy(Abig));
@time LAPACK.gesdd!('A',copy(Abig));
@time LAPACK.gesdd!('A',copy(Bbig));

  1.135546 seconds (41 allocations: 53.529 MB, 0.29% gc time)
  7.513030 seconds (35 allocations: 213.868 MB, 1.40% gc time)
  5.626581 seconds (24 allocations: 23.408 MB)
  0.962845 seconds (26 allocations: 45.899 MB)
  7.384837 seconds (26 allocations: 183.350 MB, 1.34% gc time)
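
The growth rates can be estimated from the measured times: if $t(n)\approx Cn^p$, then $p\approx\log_2\big(t(2n)/t(n)\big)$. The sketch below applies this to the `svdvals()` timings for bidiagonal matrices and to the `LAPACK.gesdd!` timings for dense matrices, both copied from the outputs above; the name `growth_exponent` is hypothetical, and single timings are noisy, so the result is only a rough indication.

```julia
# Timings in seconds for n = 1000 and n = 2000, copied from the outputs above
t_svdvals = (0.024740, 0.096291)   # svdvals() on bidiagonal matrices
t_gesdd   = (0.962845, 7.384837)   # LAPACK.gesdd!('A', ...) on dense matrices

# If t(n) ≈ C n^p, then p ≈ log2(t(2n) / t(n))
growth_exponent(t) = log2(t[2] / t[1])

p_bidiagonal = growth_exponent(t_svdvals)
p_dense      = growth_exponent(t_gesdd)
@show p_bidiagonal p_dense
```

The two estimates come out close to 2 and 3, respectively.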


In [66]:
# Residual
norm(Abig*Vbig-Ubig*diagm(σbig))

4.319644411408229e-13