# Updating the SVD

In many SVD-based applications, the arrival of new data requires the SVD of the enlarged matrix. Instead of recomputing it from scratch, the existing SVD can be updated.

## Prerequisites

The reader should be familiar with the concepts of singular values and singular vectors, the related perturbation theory, and algorithms for computing the SVD.
 
## Competences 

The reader should be able to recognise applications where SVD updating can be successfully applied, and to apply it.

## Facts

For more details see
[M. Gu and S. C. Eisenstat, A Stable and Fast Algorithm for Updating the Singular Value Decomposition][GE93]
and [M. Brand, Fast low-rank modifications of the thin singular value decomposition][Bra06]
and the references therein.

[GE93]: http://www.cs.yale.edu/publications/techreports/tr966.pdf "M. Gu and S. C. Eisenstat, 'A Stable and Fast Algorithm for Updating the Singular Value Decomposition', Technical Report, Yale University, 1993."

[Bra06]: http://www.sciencedirect.com/science/article/pii/S0024379505003812 "M. Brand, 'Fast low-rank modifications of the thin singular value decomposition', Linear Algebra and its Applications, 415 (2006) 20-30."

1. Let $A\in\mathbb{R}^{m\times n}$ with $m\geq n$ and $\mathop{\mathrm{rank}}(A)=n$, and  let $A=U\Sigma V^T$ be its SVD.
   Let $a\in\mathbb{R}^{n}$ be a vector, and let $\tilde A=\begin{bmatrix} A \\ a^T\end{bmatrix}$. Then
   $$\begin{bmatrix} A \\ a^T\end{bmatrix} =\begin{bmatrix} U & \\ & 1 \end{bmatrix} 
   \begin{bmatrix} \Sigma \\ a^TV \end{bmatrix}  V^T.
   $$
   Let $\begin{bmatrix} \Sigma \\ a^T V \end{bmatrix} = \bar U \bar \Sigma \bar V^T$ be the SVD of the half-arrowhead matrix. _This SVD can be computed in $O(n^2)$ operations._ Then 
   $$\begin{bmatrix} A \\ a^T\end{bmatrix} =
   \begin{bmatrix} U & \\ & 1 \end{bmatrix} \bar U \bar\Sigma \bar V^T V^T \equiv
   \tilde U \bar \Sigma \tilde V^T
   $$
   is the SVD of $\tilde A$. 
   
2. Direct computation of $\tilde U$ and $\tilde V$ requires $O(mn^2)$ and $O(n^3)$ operations, respectively. However, these multiplications can be performed faster using the Fast Multipole Method. This is not (yet) implemented in Julia and is "not for the timid" (quote by Steven G. Johnson).

3. If $m<n$ and $\mathop{\mathrm{rank}}(A)=m$, then
   $$
   \begin{bmatrix} A \\ a^T\end{bmatrix} =\begin{bmatrix} U & \\ & 1 \end{bmatrix} 
   \begin{bmatrix} \Sigma & 0 \\ a^T V & \beta\end{bmatrix} \begin{bmatrix} V^T \\ v^T \end{bmatrix},
   $$
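   where $\beta=\sqrt{\|a\|_2^2-\|V^T a\|_2^2}$ and $v=\frac{1}{\beta}(I-VV^T)a$. Notice that $V^Tv=0$ and $\|v\|_2=1$ by construction (assuming $\beta\neq 0$, that is, $a$ does not lie in the row space of $A$).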
   Let $\begin{bmatrix} \Sigma & 0 \\ a^T V &  \beta\end{bmatrix} = \bar U \bar \Sigma \bar V^T$ be the SVD of 
   the half-arrowhead matrix. Then 
   $$\begin{bmatrix} A \\ a^T\end{bmatrix} =
   \begin{bmatrix} U & \\ & 1 \end{bmatrix} \bar U \bar\Sigma \bar V^T \begin{bmatrix} V^T \\ v^T \end{bmatrix}
   \equiv \tilde U \bar \Sigma \tilde V^T
   $$
   is the SVD of $\tilde A$.
   
4. Adding a column $a$ to $A$ is equivalent to adding a row $a^T$ to $A^T$.

5. If $\mathop{\mathrm{rank}}(A)<\min\{m,n\}$, or if we are using a rank-$r$ SVD approximation and want to keep the rank of the approximation (this is the common case in practice), then the formulas in Fact 1 hold only approximately. More precisely, the updated rank-$r$ approximation is __not__ what we would get by computing the best rank-$r$ approximation of the updated matrix, but it is sufficient in many applications.
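
Fact 1 can be tried out with dense linear algebra alone. The sketch below is only an illustration: the helper name `addrow_dense` is hypothetical, it assumes $m\geq n$ and full column rank, and it computes the SVD of the half-arrowhead matrix with the general-purpose `svd`, which costs $O(n^3)$ operations rather than the $O(n^2)$ achievable with a specialized solver.

```julia
using LinearAlgebra

# Hypothetical helper (not from any package): append the row a' to A,
# given the thin SVD A = U*Diagonal(S)*V'. Assumes m >= n and rank(A) = n.
function addrow_dense(U, S, V, a)
    m, n = size(U)
    # Half-arrowhead matrix [Σ; aᵀV] of size (n+1)×n, stored densely
    M = [Matrix(Diagonal(S)); permutedims(V' * a)]
    F = svd(M)                               # M = Ū Σ̄ V̄ᵀ
    Ublock = [U zeros(m); zeros(1, n) 1.0]   # blockdiag(U, 1)
    return Ublock * F.U, F.S, V * F.V        # Ũ, Σ̄, Ṽ
end

# Small arbitrary test data
A = [1.0 2.0; 3.0 4.0; 5.0 6.0; 7.0 8.0]
a = [0.5, -1.0]
F = svd(A)
Unew, Snew, Vnew = addrow_dense(F.U, F.S, F.V, a)

# Compare with the SVD of the updated matrix computed from scratch
Anew = [A; transpose(a)]
resid = norm(Unew * Diagonal(Snew) * Vnew' - Anew)
orthU = norm(Unew' * Unew - I)
orthV = norm(Vnew' * Vnew - I)
```

Here `resid`, `orthU` and `orthV` should all be of the order of machine precision, and `Snew` should agree with `svdvals(Anew)`.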

### Example - Adding a row to a tall matrix

If $m\geq n$, adding a row does not increase the size of $\Sigma$.

In [1]:
# pkg> add Arrowhead#master
using Arrowhead, LinearAlgebra

In [2]:
function mySVDaddrow(svdA::LinearAlgebra.SVD,a::Vector)
    # Build the transposed half-arrowhead matrix from Σ and Vᵀa
    m,r,n=size(svdA.U,1),length(svdA.S),size(svdA.V,1)
    T=eltype(a)
    b=svdA.Vt*a
    if m>=n || r<m
        M=HalfArrow(svdA.S,b)
    else
        β=sqrt(norm(a)^2-norm(b)^2)
        M=HalfArrow(svdA.S,[b;β])
    end
    # From Arrowhead package
    svdM,info=svd(M)
    println(norm(M*svdM.Vt-svdM.U*Diagonal(svdM.S)))
    # Return the updated SVD
    if m>=n || r<m
        return SVD([svdA.U zeros(T,m); zeros(T,1,r) one(T)]*svdM.Vt, 
            svdM.S, adjoint(svdA.V*svdM.U))
    else
        # Need one more column of svdA.V - v is the normalized orthogonal projection of a
        v=a-svdA.V*b
        normalize!(v)
        return SVD([svdA.U zeros(T,m); zeros(T,1,r) one(T)]*svdM.Vt, 
            svdM.S, adjoint([svdA.V v]*svdM.U))
    end
end

mySVDaddrow (generic function with 1 method)
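
The flat branch relies on the quantities $\beta$ and $v$ from Fact 3. The following sketch, on a small arbitrary matrix chosen only for illustration, checks that $\beta$ can equivalently be computed as $\|(I-VV^T)a\|_2$, which avoids subtracting two squared norms, and that appending the normalized $v$ to $V$ keeps the columns orthonormal. It assumes that $a$ is not in the row space of $A$, so that $\beta>0$.

```julia
using LinearAlgebra

# Small flat example (m < n) with full row rank; the data are arbitrary
A = [1.0 2.0 0.0 1.0;
     0.0 1.0 3.0 2.0]
a = [1.0, -1.0, 2.0, 0.5]

F = svd(A)          # thin SVD: F.V is 4×2 with orthonormal columns
V = F.V
b = V' * a          # coordinates of a in the row space of A
w = a - V * b       # (I - V*V')*a, the part of a outside the row space

β1 = sqrt(norm(a)^2 - norm(b)^2)   # formula from Fact 3
β2 = norm(w)                       # same quantity, no subtraction of squares
v = w / β2                         # unit vector orthogonal to the columns of V

orthVv = norm([V v]' * [V v] - I)  # [V v] should have orthonormal columns
```

In exact arithmetic `β1` and `β2` coincide. When $a$ lies almost in the row space of $A$, the second form is the safer choice in floating point.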

In [3]:
import Random
Random.seed!(421)
A=rand(10,6)
a=rand(6)

6-element Array{Float64,1}:
 0.33696435480910214
 0.916644781291106  
 0.83277664059846   
 0.8448238239288268 
 0.8866516008033594 
 0.3443212111724143 

In [4]:
svdA=svd(A)

SVD{Float64,Float64,Array{Float64,2}}([-0.303731 -0.160825 … -0.0983823 0.0874599; -0.308381 0.324931 … -0.212191 0.131507; … ; -0.391551 0.24166 … -0.0586923 -0.55629; -0.20966 0.140166 … -0.110358 -0.431799], [4.04432, 1.33392, 1.04232, 0.91928, 0.557717, 0.317478], [-0.430615 -0.536704 … -0.288732 -0.366357; -0.189618 0.125828 … -0.222344 -0.671129; … ; -0.715287 -0.218191 … 0.459078 0.343027; 0.105415 -0.502889 … -0.581862 0.366137])

In [5]:
norm(A*svdA.V-svdA.U*Diagonal(svdA.S))

7.973965710771358e-15

In [6]:
typeof(svdA)

SVD{Float64,Float64,Array{Float64,2}}

In [7]:
svdAa=mySVDaddrow(svdA,a)

1.068855750423739e-15


SVD{Float64,Float64,Array{Float64,2}}([-0.276987 0.203469 … 0.145536 0.065527; -0.285338 -0.271956 … 0.262804 0.0867921; … ; -0.19529 -0.119094 … 0.180764 -0.354866; -0.391301 -0.234263 … -0.467766 -0.288235], [4.38661, 1.36971, 1.0722, 0.945333, 0.674942, 0.357373], [-0.394539 -0.536976 … -0.324469 -0.339258; 0.260977 -0.0788592 … 0.112295 0.70978; … ; 0.678824 0.115075 … -0.51228 -0.352948; -0.072745 -0.535394 … -0.440423 0.414466])

In [8]:
Aa=[A;transpose(a)]
println(size(Aa),size(svdAa.U),size(svdA.V))
[svdvals(Aa) svdAa.S]

(11, 6)(11, 6)(6, 6)


6×2 Array{Float64,2}:
 4.38661   4.38661 
 1.36971   1.36971 
 1.0722    1.0722  
 0.945333  0.945333
 0.674942  0.674942
 0.357373  0.357373

In [9]:
# Check the residual and orthogonality
norm(Aa*svdAa.V-svdAa.U*Diagonal(svdAa.S)),
norm(svdAa.U'*svdAa.U-I), norm(svdAa.Vt*svdAa.V-I)

(8.180658602331912e-15, 1.8545366665040274e-15, 2.450949471062762e-15)

### Example - Adding a row to a flat matrix

In [10]:
# Now flat matrix
Random.seed!(421)
A=rand(6,10)
a=rand(10)
svdA=svd(A)

SVD{Float64,Float64,Array{Float64,2}}([-0.322695 -0.445735 … 0.468456 0.567051; -0.477735 0.704939 … -0.0949152 0.480236; … ; -0.394485 -0.252949 … -0.412626 -0.174191; -0.330335 -0.428706 … -0.545783 0.157113], [3.99564, 1.29725, 1.2146, 0.95763, 0.491987, 0.443677], [-0.366339 -0.321222 … -0.244904 -0.295685; 0.0854379 -0.186212 … -0.534621 0.497777; … ; -0.0667873 -0.492411 … 0.261408 0.322671; -0.231519 -0.240959 … 0.540268 -0.189819])

In [11]:
svdAa=mySVDaddrow(svdA,a)

3.906017505299942e-16


SVD{Float64,Float64,Array{Float64,2}}([-0.297258 -0.341679 … -0.464921 0.394001; -0.416981 0.70696 … -0.475186 0.105191; … ; -0.30061 -0.216062 … -0.108189 0.222934; -0.442351 -0.371776 … -0.22281 -0.761855], [4.44188, 1.41235, 1.2192, 0.985345, 0.49206, 0.456045, 0.26585], [-0.327393 -0.352355 … -0.278481 -0.274767; 0.272567 -0.259338 … -0.452184 0.484463; … ; 0.253177 0.254949 … -0.511131 0.124985; 0.207497 -0.03837 … 0.0967877 -0.386992])

In [12]:
Aa=[A;transpose(a)]
println(size(Aa),size(svdAa.U),size(svdA.V))
[svdvals(Aa) svdAa.S]

(7, 10)(7, 7)(10, 6)


7×2 Array{Float64,2}:
 4.44188   4.44188 
 1.41235   1.41235 
 1.2192    1.2192  
 0.985345  0.985345
 0.49206   0.49206 
 0.456045  0.456045
 0.26585   0.26585 

In [13]:
# Check the residual and orthogonality
norm(Aa*svdAa.V-svdAa.U*Diagonal(svdAa.S)),
norm(svdAa.U'*svdAa.U-I), norm(svdAa.Vt*svdAa.V-I)

(9.742173835407067e-15, 1.6733904812196066e-15, 4.75290908271683e-15)

### Example - Adding columns

Adding a column can be viewed as adding a row to the transposed matrix, which takes only two lines in Julia.

In [14]:
function mySVDaddcol(svdA::SVD,a::Vector)
    X=mySVDaddrow(SVD(svdA.V,svdA.S,adjoint(svdA.U)),a)
    SVD(X.V,X.S,adjoint(X.U))
end 

mySVDaddcol (generic function with 1 method)
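
Written out for a tall, full-rank $A=U\Sigma V^T$, the transposition argument gives $\begin{bmatrix} A & a\end{bmatrix}=\begin{bmatrix} U & u\end{bmatrix}\begin{bmatrix}\Sigma & U^Ta\\ 0 & \beta\end{bmatrix}\begin{bmatrix}V & \\ & 1\end{bmatrix}^T$ with $\beta u=(I-UU^T)a$ and $\|u\|_2=1$. The following is a dense sketch of this formula under the assumption $\beta>0$; `addcol_dense` is a hypothetical name, and the small SVD is computed with the general `svd` rather than with the Arrowhead package.

```julia
using LinearAlgebra

# Hypothetical helper: append the column a to a tall full-rank A,
# given the thin SVD A = U*Diagonal(S)*V'.
function addcol_dense(U, S, V, a)
    n = length(S)
    b = U' * a                  # coordinates of a in the column space of A
    u = a - U * b               # component of a orthogonal to the column space
    β = norm(u)
    u = u / β                   # assumes β > 0, i.e. a is not in range(A)
    # (n+1)×(n+1) matrix [Σ Uᵀa; 0 β], stored densely
    M = [Matrix(Diagonal(S)) b; zeros(1, n) β]
    F = svd(M)
    Vblock = [V zeros(n); zeros(1, n) 1.0]   # blockdiag(V, 1)
    return [U u] * F.U, F.S, Vblock * F.V
end

# Small arbitrary test data; a = e1 is not in the column space of A
A = [1.0 2.0; 3.0 4.0; 5.0 6.0; 7.0 8.0]
a = [1.0, 0.0, 0.0, 0.0]
F = svd(A)
Uc, Sc, Vc = addcol_dense(F.U, F.S, F.V, a)

Ac = [A a]
residc = norm(Uc * Diagonal(Sc) * Vc' - Ac)
```

The number of singular values grows from 2 to 3 here, mirroring the flat case of row addition.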

In [15]:
# Tall matrix
Random.seed!(897)
A=rand(10,6)
a=rand(10)
svdA=svd(A)
svdAa=mySVDaddcol(svdA,a)
# The residual and orthogonality are checked in the next cell

1.806498804423111e-15


SVD{Float64,Float64,Adjoint{Float64,Array{Float64,2}}}([-0.250319 -0.182314 … -0.366543 -0.208073; -0.304909 -0.462519 … 0.204587 -0.0600162; … ; -0.255968 0.630217 … 0.00576628 -0.325822; -0.276224 -0.306128 … -0.360096 0.341993], [4.3384, 1.3159, 1.10466, 0.887522, 0.784881, 0.54766, 0.339043], [-0.331631 -0.432442 … -0.264222 -0.384453; 0.606311 -0.123091 … -0.206021 -0.0440809; … ; 0.694092 0.072713 … 0.12282 -0.332087; -0.0401745 0.570454 … -0.0300197 -0.0278254])

In [16]:
# Check the residual and orthogonality
Aa=[A a]
norm(Aa*svdAa.V-svdAa.U*Diagonal(svdAa.S)),
norm(svdAa.U'*svdAa.U-I), norm(svdAa.Vt*svdAa.V-I)

(2.886820704538929e-15, 2.4966936580709457e-15, 1.614461233948329e-15)

In [17]:
# Flat matrix
Random.seed!(332)
A=rand(6,10)
a=rand(6)
svdA=svd(A)
svdAa=mySVDaddcol(svdA,a)

9.429902480489434e-16


SVD{Float64,Float64,Adjoint{Float64,Array{Float64,2}}}([-0.455255 -0.166233 … -0.681802 0.0523952; -0.378797 -0.379463 … 0.394109 0.183865; … ; -0.422557 0.520279 … -0.258266 0.400265; -0.429658 0.246082 … 0.0926024 -0.8562], [4.44247, 1.43903, 1.05304, 0.853501, 0.576431, 0.282941], [-0.416635 -0.270978 … -0.272799 -0.272176; 0.146067 0.277967 … -0.481469 -0.0350067; … ; 0.249312 0.344526 … 8.07118e-6 -0.573328; -0.421238 -0.384621 … 0.00443857 -0.166896])

In [18]:
# Check the residual and orthogonality
Aa=[A a]
norm(Aa*svdAa.V-svdAa.U*Diagonal(svdAa.S)),
norm(svdAa.U'*svdAa.U-I), norm(svdAa.Vt*svdAa.V-I)

(3.4111439403910404e-15, 1.9656809242159845e-15, 1.8600904856505075e-15)

In [19]:
# Square matrix
A=rand(10,10)
a=rand(10);
svdA=svd(A);

In [20]:
svdAa=mySVDaddrow(svdA,a)
Aa=[A;transpose(a)]
norm(Aa*svdAa.V-svdAa.U*Diagonal(svdAa.S)),
norm(svdAa.U'*svdAa.U-I), norm(svdAa.Vt*svdAa.V-I)

1.0530590333361355e-15


(5.621740314055746e-14, 2.6522212249707117e-15, 3.253109881119632e-15)

In [21]:
svdAa=mySVDaddcol(svdA,a)
Aa=[A a]
norm(Aa*svdAa.V-svdAa.U*Diagonal(svdAa.S)),
 norm(svdAa.U'*svdAa.U-I), norm(svdAa.Vt*svdAa.V-I)

9.752691313885945e-16


(5.677080032041437e-14, 2.95139164746805e-14, 4.6901837546295234e-15)

### Example - Updating a low rank approximation


In [22]:
# Adding row to a tall matrix
A=rand(10,6)
svdA=svd(A)
a=rand(6)
# Rank of the approximation
r=4

4

In [23]:
svdAr=SVD(svdA.U[:,1:r], svdA.S[1:r],adjoint(svdA.V[:,1:r]))

SVD{Float64,Float64,Array{Float64,2}}([-0.271169 0.0534105 -0.29403 0.565083; -0.232537 -0.0843103 0.380893 -0.423132; … ; -0.346856 0.375002 0.0867381 -0.0532533; -0.152853 0.460534 0.0105176 0.266006], [3.84685, 1.16069, 1.14914, 0.845909], [-0.375963 -0.445045 … -0.324985 -0.53538; 0.800598 -0.294843 … -0.382492 -0.0610056; -0.284809 -0.490563 … 0.323003 0.366834; 0.149562 0.255861 … 0.553767 -0.687543])

In [24]:
# Mirsky: the spectral-norm error of the rank-r truncation equals the (r+1)-st singular value
Ar=svdAr.U*Diagonal(svdAr.S)*svdAr.Vt
Δ=Ar-A
opnorm(Δ),svdvals(A)[5]

(0.7079785575104238, 0.7079785575104239)

In [25]:
svdAa=mySVDaddrow(svdAr,a)

1.0411355404852968e-15


SVD{Float64,Float64,Array{Float64,2}}([-0.263625 0.135927 0.14768 -0.566604; -0.225241 -0.159809 -0.206121 0.515098; … ; -0.151323 -0.184947 0.42606 -0.156069; -0.253978 -0.456474 -0.015248 -0.312509], [3.97304, 1.27599, 1.15937, 0.930943], [-0.380249 -0.433608 … -0.349092 -0.523377; -0.0477667 0.472924 … -0.317698 -0.0653647; 0.847962 -0.107104 … -0.478667 -0.175512; -0.140994 -0.409842 … -0.459331 0.767226])

In [26]:
Aa=[A; transpose(a)]

11×6 Array{Float64,2}:
 0.496024   0.812065    0.374092   0.650515  0.284429   0.0323014
 0.0154666  0.207978    0.145923   0.561336  0.157788   0.873543 
 0.283984   0.270589    0.302796   0.638842  0.852921   0.693089 
 0.10062    0.974098    0.867051   0.510718  0.611038   0.988092 
 0.145471   0.678997    0.151663   0.418343  0.861964   0.494143 
 0.636556   0.170216    0.0909539  0.886356  0.312655   0.876372 
 0.8296     0.96157     0.647838   0.362638  0.0688865  0.717597 
 0.701214   0.359699    0.770393   0.081159  0.35015    0.604647 
 0.790144   0.548257    0.0982269  0.640503  0.210458   0.80308  
 0.748652   0.033805    0.0413225  0.305608  0.242771   0.13753  
 0.492707   0.00284153  0.190767   0.793571  0.814449   0.258851 

In [27]:
svdvals(Aa),svdvals([Ar;transpose(a)]),svdAa.S

([3.97469, 1.29896, 1.1594, 0.951601, 0.736308, 0.43137], [3.97465, 1.29366, 1.15939, 0.942137, 0.34285, 3.28723e-16], [3.97304, 1.27599, 1.15937, 0.930943])

In [28]:
# Adding row to a flat matrix
A=rand(6,10)
svdA=svd(A)
a=rand(10)
# Rank of the approximation
r=4

4

In [29]:
svdAr=SVD(svdA.U[:,1:r], svdA.S[1:r],adjoint(svdA.V[:,1:r]))
svdAa=mySVDaddrow(svdAr,a)

3.174449165989653e-16


SVD{Float64,Float64,Array{Float64,2}}([-0.425511 0.534229 -0.330469 -0.290139; -0.44797 -0.218816 -0.441887 0.640743; … ; -0.372781 -0.363505 0.447105 -0.41223; -0.240356 -0.111 -0.125965 -0.220419], [4.54696, 1.43779, 1.22723, 0.89426], [-0.200752 -0.406021 … -0.297836 -0.334136; -0.17726 -0.223875 … 0.133706 0.501956; 0.418761 0.314601 … -0.0155627 -0.0935488; -0.25802 0.319106 … -0.64908 0.364995])

In [30]:
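Ar=svdAr.U*Diagonal(svdAr.S)*svdAr.Vt
# Note: Aa has not been rebuilt for the flat matrix, so svdvals(Aa) still refers to
# the 11×6 matrix from In [26]; use Aa=[A;transpose(a)] to compare with the flat case
svdvals(Aa),svdvals([Ar;transpose(a)]),svdAa.S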

([3.97469, 1.29896, 1.1594, 0.951601, 0.736308, 0.43137], [4.54968, 1.44004, 1.23088, 0.91368, 0.586697, 3.59193e-16, 1.70207e-16], [4.54696, 1.43779, 1.22723, 0.89426])