# Updating the SVD

---

In many applications based on the SVD, the arrival of new data requires the SVD of the enlarged matrix. Instead of computing it from scratch, the existing SVD can be updated.

## Prerequisites

The reader should be familiar with the concepts of singular values and singular vectors, the related perturbation theory, and algorithms for computing them.
 
## Competences 

The reader should be able to recognise applications where SVD updating can be successfully applied, and to apply it.

---

## Facts

For more details see
[M. Gu and S. C. Eisenstat, A Stable and Fast Algorithm for Updating the Singular Value Decomposition][GE93]
and [M. Brand, Fast low-rank modifications of the thin singular value decomposition][Bra06]
and the references therein.

[GE93]: http://www.cs.yale.edu/publications/techreports/tr966.pdf "M. Gu and S. C. Eisenstat, 'A Stable and Fast Algorithm for Updating the Singular Value Decomposition', Tech. report, Yale University, 1993."

[Bra06]: http://www.sciencedirect.com/science/article/pii/S0024379505003812 "M. Brand, 'Fast low-rank modifications of the thin singular value decomposition', Linear Algebra and its Applications, 415 (2006) 20-30."

1. Let $A\in\mathbb{R}^{m\times n}$ with $m\geq n$ and $\mathop{\mathrm{rank}}(A)=n$, and  let $A=U\Sigma V^T$ be its SVD.
   Let $a\in\mathbb{R}^{n}$ be a vector, and let $\tilde A=\begin{bmatrix} A \\ a^T\end{bmatrix}$. Then
   $$\begin{bmatrix} A \\ a^T\end{bmatrix} =\begin{bmatrix} U & \\ & 1 \end{bmatrix} 
   \begin{bmatrix} \Sigma \\ a^TV \end{bmatrix}  V^T.
   $$
   Let $\begin{bmatrix} \Sigma \\ a^T V \end{bmatrix} = \bar U \bar \Sigma \bar V^T$ be the SVD of the half-arrowhead matrix. _This SVD can be computed in $O(n^2)$ operations._ Then 
   $$\begin{bmatrix} A \\ a^T\end{bmatrix} =
   \begin{bmatrix} U & \\ & 1 \end{bmatrix} \bar U \bar\Sigma \bar V^T V^T \equiv
   \tilde U \bar \Sigma \tilde V^T
   $$
   is the SVD of $\tilde A$. 
   
2. Direct computation of $\tilde U$ and $\tilde V$ requires $O(mn^2)$ and $O(n^3)$ operations, respectively. However, these multiplications can be performed faster using the Fast Multipole Method. This is not (yet) implemented in Julia and is "not for the timid" (quote by Steven G. Johnson).

3. If $m<n$ and $\mathop{\mathrm{rank}}(A)=m$, then
   $$
   \begin{bmatrix} A \\ a^T\end{bmatrix} =\begin{bmatrix} U & \\ & 1 \end{bmatrix} 
   \begin{bmatrix} \Sigma & 0 \\ a^T V & \beta\end{bmatrix} \begin{bmatrix} V^T \\ v^T \end{bmatrix},
   $$
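   where $\beta=\sqrt{\|a\|_2^2-\|V^T a\|_2^2}$ and $v=(I-VV^T)a/\beta$ (assuming $\beta\neq 0$). Notice that $V^Tv=0$ and $\|v\|_2=1$ by construction, and that $a^T=a^TVV^T+\beta v^T$, which gives the last row of the product.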
   Let $\begin{bmatrix} \Sigma & 0 \\ a^T V &  \beta\end{bmatrix} = \bar U \bar \Sigma \bar V^T$ be the SVD of 
   the half-arrowhead matrix. Then 
   $$\begin{bmatrix} A \\ a^T\end{bmatrix} =
   \begin{bmatrix} U & \\ & 1 \end{bmatrix} \bar U \bar\Sigma \bar V^T \begin{bmatrix} V^T \\ v^T \end{bmatrix}
   \equiv \tilde U \bar \Sigma \tilde V^T
   $$
   is the SVD of $\tilde A$.
   
4. Adding a column $a$ to $A$ is equivalent to adding a row $a^T$ to $A^T$.

5. If $\mathop{\mathrm{rank}}(A)<\min\{m,n\}$, or if we are using an SVD approximation of rank $r$ and want to keep the rank of the approximation (the common case in practice), then the formulas in Fact 1 hold approximately. More precisely, the updated rank-$r$ approximation is __not__ what we would get by computing the rank-$r$ approximation of the updated matrix, but it is sufficient in many applications.
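
Fact 1 can be checked without any structured solver. The following is a minimal sketch, assuming Julia 1.x with the standard `LinearAlgebra` library (the notebook cells in the examples use older syntax). The helper name `addrow_dense` is hypothetical, and the half-arrowhead matrix is handed to a dense `svd`, so this step costs $O(n^3)$ here rather than the $O(n^2)$ of a structured algorithm.

```julia
using LinearAlgebra

# Hypothetical helper (not part of the notebook): append the row a' to a
# full-rank tall matrix with thin SVD A = U * Diagonal(σ) * V'.
function addrow_dense(U::AbstractMatrix, σ::AbstractVector,
                      V::AbstractMatrix, a::AbstractVector)
    # Half-arrowhead matrix [Σ; a'V] of size (n+1) x n
    M = [Matrix(Diagonal(σ)); (V' * a)']
    # Dense SVD of the small matrix (assumption: no structured solver is used)
    F = svd(M)
    # Ũ = blockdiag(U, 1) * Ū, written without forming the block matrix
    Unew = [U * F.U[1:end-1, :]; F.U[end:end, :]]
    # Ṽ = V * V̄
    Vnew = V * F.V
    return Unew, F.S, Vnew
end

A = rand(10, 6)
a = rand(6)
F = svd(A)
U, σ, V = addrow_dense(F.U, F.S, F.V, a)

B = [A; a']                                  # the updated matrix
residual = norm(B * V - U * Diagonal(σ))     # should be at rounding level
```

The singular values returned this way should agree with `svdvals(B)` up to rounding, and `U` and `V` should have orthonormal columns.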

### Example - Adding a row to a tall matrix

If $m\geq n$, adding a row does not increase the size of $\Sigma$.

In [1]:
using Arrowhead

In [3]:
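function mySVDaddrow{T}(svdA::Tuple,a::Vector{T})
    # svdA=(U,σ,V) is the (possibly truncated) SVD of an m x n matrix, a is the new row
    m,r,n=size(svdA[1],1),length(svdA[2]),size(svdA[3],1)
    # Coordinates of a in the basis of the right singular vectors
    b=svdA[3]'*a
    # Create the transposed half-arrowhead
    if m>=n || r<m
        M=HalfArrow(svdA[2],b)
    else
        # Flat matrix of full row rank: β is the norm of the part of a orthogonal to the columns of V
        β=sqrt(vecnorm(a)^2-vecnorm(b)^2)
        M=HalfArrow(svdA[2],[b;β])
    end
    # Tolerances passed to the half-arrowhead SVD solver
    tols=[1e2,1e2,1e2,1e2]
    # M is transposed, so its left and right singular vectors swap roles below
    U,σ,V=svd(M,tols)
    # Return the updated SVD
    if m>=n || r<m
        return [svdA[1] zeros(T,m); zeros(T,1,r) one(T)]*V, σ, svdA[3]*U
    else
        # Need one more column next to svdA[3] - v is the normalized orthogonal projection
        v=a-svdA[3]*b
        v=v/norm(v)
        return [svdA[1] zeros(T,m); zeros(T,1,r) one(T)]*V, σ, [svdA[3] v]*U
    end
end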

mySVDaddrow (generic function with 1 method)

In [4]:
A=rand(10,6)
a=rand(6)

6-element Array{Float64,1}:
 0.0928924
 0.887951 
 0.517968 
 0.231056 
 0.0385352
 0.486347 

In [5]:
svdA=svd(A)

(
10x6 Array{Float64,2}:
 -0.202045  -0.325981   0.00157307   0.386508    0.0610235   0.201287
 -0.324758   0.124974  -0.118993    -0.426752    0.343555    0.344056
 -0.372259   0.12883    0.155709    -0.150447   -0.24869     0.298472
 -0.409947   0.2291    -0.291099    -0.0997363   0.172445    0.109963
 -0.346031   0.263648   0.228842     0.478055   -0.203468   -0.327291
 -0.188645  -0.281908   0.698134    -0.447241    0.145768   -0.345321
 -0.362971   0.465553   0.0731078    0.0208722  -0.184808   -0.161569
 -0.30738   -0.286415   0.144817     0.42627     0.540872    0.142238
 -0.316467  -0.551669  -0.139874    -0.0924748  -0.614296    0.17053 
 -0.255676  -0.238619  -0.539641    -0.129064    0.141275   -0.661928,

[4.193023886977698,1.0915281566750403,0.9259518180563089,0.7708240555874193,0.5429722182454295,0.4129406032560103],
6x6 Array{Float64,2}:
 -0.572198   0.121833   -0.137336   0.343882     -0.651924    0.309233
 -0.446899   0.0844174   0.715274  -0.276517     -0.0982127  -0.

In [6]:
typeof(svdA)

Tuple{Array{Float64,2},Array{Float64,1},Array{Float64,2}}

In [7]:
U,σ,V=mySVDaddrow(svdA,a)

(
11x6 Array{Float64,2}:
 -0.195368   0.347282   -0.0394979    0.314876   -0.179525    0.110963 
 -0.317362  -0.0782289   0.132243    -0.340432    0.254652    0.488695 
 -0.363323  -0.090404   -0.0126725    0.066102    0.409111   -0.0451198
 -0.400238  -0.117329    0.346802    -0.136063    0.0185272   0.209642 
 -0.339248  -0.240282   -0.0224223    0.510629   -0.158703   -0.313213 
 -0.187459   0.0586492  -0.770337    -0.20182     0.251241    0.0497799
 -0.356116  -0.405107    0.143492     0.131199    0.131106   -0.195956 
 -0.299685   0.263915   -0.172756     0.341325   -0.344359    0.517208 
 -0.305757   0.589747   -0.00859284  -0.0688653   0.257672   -0.476095 
 -0.249773   0.279439    0.271842    -0.459137   -0.435785   -0.199245 
 -0.214905  -0.361262   -0.377081    -0.333674   -0.5048     -0.173293 ,

[4.290276921124515,1.131444628665424,0.9846430307431189,0.8029106839944139,0.691819889340859,0.5168322391993393],
6x6 Array{Float64,2}:
 -0.550486   0.10484    0.405135    0.500263 

In [8]:
# Check the residual and orthogonality
norm([A;a']*V-U*diagm(σ)), norm(U'*U-I), norm(V'*V-I)

(1.21331381799717e-15,1.5869069979326949e-15,1.1629953422117905e-15)

### Example - Adding a row to a flat matrix

In [9]:
# Now flat matrix
A=rand(6,10)
a=rand(10)
svdA=svd(A)

(
6x6 Array{Float64,2}:
 -0.432393   0.160235    -0.211182   0.16489   -0.719376    0.445054
 -0.418288  -0.717937    -0.385621  -0.285087  -0.0339432  -0.280129
 -0.374601  -0.00685766   0.389024  -0.542459   0.327683    0.55376 
 -0.39868   -0.00783435  -0.22494    0.653772   0.571588    0.190431
 -0.400647   0.672849    -0.275713  -0.309954   0.124477   -0.446288
 -0.422236  -0.0778302    0.727148   0.271633  -0.17822    -0.425874,

[4.16666391152648,1.0724827324932222,0.9376591441893033,0.8943078623035339,0.6594546699656325,0.4047875523733787],
10x6 Array{Float64,2}:
 -0.160776  -0.0985226  -0.180098    0.233496    0.20507    -0.394763
 -0.37738    0.159758   -0.285629    0.550176   -0.160696    0.104352
 -0.243722  -0.48046    -0.0545974  -0.286548   -0.446603   -0.444187
 -0.360217  -0.107562    0.484051   -0.370104   -0.0646561   0.193055
 -0.361811   0.581003    0.433407    0.0521544   0.137587   -0.240379
 -0.391142  -0.465751    0.122489    0.211198    0.252305    0.555006
 -

In [10]:
U,σ,V=mySVDaddrow(svdA,a)
norm([A;a']*V-U*diagm(σ)), norm(U'*U-I), norm(V'*V-I)

(1.7366599135313465e-14,1.1570016931296238e-15,5.116683537954051e-15)
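
The bordered factorization used for a flat matrix of full row rank can also be cross-checked with dense routines. This is a sketch only, assuming Julia 1.x with the standard `LinearAlgebra` library and assuming $\beta>0$ (true with probability one for random data); it computes $\beta$ as the norm of the projected vector instead of the square-root formula, and it does not use the `Arrowhead` package.

```julia
using LinearAlgebra

A = rand(6, 10)                  # flat matrix, m < n
a = rand(10)                     # new row
F = svd(A)                       # thin SVD: U is 6 x 6, V is 10 x 6
U, σ, V = F.U, F.S, Matrix(F.V)
m = length(σ)

b = V' * a                       # coordinates of a in the row space of A
w = a - V * b                    # part of a orthogonal to the row space
β = norm(w)                      # equals sqrt(norm(a)^2 - norm(b)^2)
v = w / β                        # unit vector with V'v = 0

# Bordered matrix [Σ 0; a'V β] of size (m+1) x (m+1)
M = zeros(m + 1, m + 1)
M[1:m, 1:m] = Diagonal(σ)
M[m+1, 1:m] = b
M[m+1, m+1] = β
G = svd(M)

Unew = [U * G.U[1:end-1, :]; G.U[end:end, :]]   # blockdiag(U, 1) * Ū
Vnew = [V v] * G.V                               # [V v] * V̄

B = [A; a']
residual = norm(B * Vnew - Unew * Diagonal(G.S))
```

Here the number of singular values grows from $m$ to $m+1$, in contrast to the tall case.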

### Example - Adding columns

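Adding a column to $A$ can be viewed as adding a row to $A^T$. Since $A^T=V\Sigma U^T$, it suffices to reverse the triple $(U,\sigma,V)$ before and after calling `mySVDaddrow`, an elegant one-liner in Julia.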

In [11]:
function mySVDaddcol{T}(svdA::Tuple,a::Vector{T})
    reverse(mySVDaddrow(reverse(svdA),a))
end 

mySVDaddcol (generic function with 1 method)
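
The two identities behind the `reverse` trick can be checked numerically. A small sketch, assuming Julia 1.x with the standard `LinearAlgebra` library; the variable names `At_ok` and `Bt_ok` are illustrative only.

```julia
using LinearAlgebra

A = rand(10, 6)
a = rand(10)                     # new column
F = svd(A)

# (V, σ, U) is an SVD of A'
At_ok = Matrix(A') ≈ F.V * Diagonal(F.S) * F.U'

# Appending the column a to A is the same as appending the row a' to A'
Bt_ok = Matrix([A a]') == [Matrix(A'); a']
```

Both flags should evaluate to `true`.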

In [13]:
# Tall matrix
A=rand(10,6)
a=rand(10)
svdA=svd(A)
U,σ,V=mySVDaddcol(svdA,a)
norm([A a]*V-U*diagm(σ)), norm(U'*U-I), norm(V'*V-I)

(3.703741162685095e-15,2.2316564630420207e-15,1.17813516321025e-15)

In [14]:
# Flat matrix
A=rand(6,10)
a=rand(6)
svdA=svd(A)
U,σ,V=mySVDaddcol(svdA,a)
norm([A a]*V-U*diagm(σ)), norm(U'*U-I), norm(V'*V-I)

(1.3442104169809738e-15,7.843690968221566e-16,9.455190826283757e-16)

In [15]:
# Square matrix
A=rand(10,10)
a=rand(10);
svdA=svd(A);

In [16]:
U,σ,V=mySVDaddrow(svdA,a)
norm([A;a']*V-U*diagm(σ)), norm(U'*U-I), norm(V'*V-I)

(2.5285138092863163e-14,1.6743478769024218e-15,2.2951296678890638e-15)

In [17]:
U,σ,V=mySVDaddcol(svdA,a)
norm([A a]*V-U*diagm(σ)), norm(U'*U-I), norm(V'*V-I)

(2.5203322475718537e-14,1.6476458351487867e-15,1.5529553982312772e-15)

### Example - Updating a low-rank approximation


In [18]:
# Adding row to a tall matrix
A=rand(10,6)
svdA=svd(A)
a=rand(6)
# Rank of the approximation
r=4

4

In [19]:
svdAr=(svdA[1][:,1:r], svdA[2][1:r],svdA[3][:,1:r])
U,σ,V=mySVDaddrow(svdAr,a)
norm([A;a']-U*diagm(σ)*V'), svdvals([A;a']), σ

(0.5767244490378295,[3.983282775265407,1.2368584864567675,1.015595396856338,0.8588949224399546,0.5573396035697008,0.28908522265724285],[3.980775555676989,1.2365416309545627,1.0155536057891885,0.8582051673927036])
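
The gap between the updated rank-$r$ approximation and the best rank-$r$ approximation of the updated matrix can be measured directly. The sketch below assumes Julia 1.x with the standard `LinearAlgebra` library and a dense `svd` in place of the `Arrowhead` solver; `opnorm` is used because in Julia 1.x it is the function that returns the matrix 2-norm.

```julia
using LinearAlgebra

A = rand(10, 6)
a = rand(6)
r = 4                            # rank of the approximation
F = svd(A)
Ur, σr, Vr = F.U[:, 1:r], F.S[1:r], F.V[:, 1:r]

# Update the truncated factors as in Fact 1. The part of a outside the
# span of Vr is discarded, which is the source of the approximation.
M = [Matrix(Diagonal(σr)); (Vr' * a)']       # (r+1) x r
G = svd(M)
U = [Ur * G.U[1:end-1, :]; G.U[end:end, :]]
V = Vr * G.V

B = [A; a']
σB = svdvals(B)
err_update = opnorm(B - U * Diagonal(G.S) * V')
err_best = σB[r+1]               # 2-norm error of the best rank-r approximation
```

By the Eckart-Young theorem `err_update` cannot be smaller than `err_best`; how close the two are depends on the data.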

In [21]:
# Adding row to a flat matrix
A=rand(6,10)
svdA=svd(A)
a=rand(10)
# Rank of the approximation
r=4

4

In [22]:
svdAr=(svdA[1][:,1:r], svdA[2][1:r],svdA[3][:,1:r])
U,σ,V=mySVDaddrow(svdAr,a)
norm([A;a']-U*diagm(σ)*V'), svdvals([A;a']), σ

(0.9977575093994425,[4.362678939640087,1.3425579717495806,1.0347188983555768,0.9194988272619319,0.7297484437624062,0.5077973623027092,0.2516233402582735],[4.350124937773744,1.2862214879485432,1.0038802337142887,0.8252865332963021])

In both examples the residual ($0.5767$ and $0.9978$) exceeds the fifth singular value of the updated matrix ($0.5573$ and $0.7297$), which is the 2-norm error of the best rank-4 approximation, and the updated singular values are slightly smaller than the four largest exact ones. This is the behaviour described in the last of the Facts above: the updated approximation is not the optimal one, but it is close.