# Updating the SVD

---

In many applications based on the SVD, the arrival of new data requires the SVD of the new matrix. Instead of computing it from scratch, the existing SVD can be updated.

## Prerequisites

The reader should be familiar with the concepts of singular values and singular vectors, the related perturbation theory, and algorithms.
 
## Competences 

The reader should be able to recognise applications where SVD updating can be successfully applied, and to apply it.

---

## Facts

For more details see
[M. Gu and S. C. Eisenstat, A Stable and Fast Algorithm for Updating the Singular Value Decomposition][GE93]
and [M. Brand, Fast low-rank modifications of the thin singular value decomposition][Bra06]
and the references therein.

[GE93]: http://www.cs.yale.edu/publications/techreports/tr966.pdf "M. Gu and S. C. Eisenstat, 'A Stable and Fast Algorithm for Updating the Singular Value Decomposition', Technical Report, Yale University, 1993."

[Bra06]: http://www.sciencedirect.com/science/article/pii/S0024379505003812 "M. Brand, 'Fast low-rank modifications of the thin singular value decomposition', Linear Algebra and its Applications, 415 (2006) 20-30."

1. Let $A\in\mathbb{R}^{m\times n}$ with $m\geq n$ and $\mathop{\mathrm{rank}}(A)=n$, and  let $A=U\Sigma V^T$ be its SVD.
   Let $a\in\mathbb{R}^{n}$ be a vector, and let $\tilde A=\begin{bmatrix} A \\ a^T\end{bmatrix}$. Then
   $$\begin{bmatrix} A \\ a^T\end{bmatrix} =\begin{bmatrix} U & \\ & 1 \end{bmatrix} 
   \begin{bmatrix} \Sigma \\ a^TV \end{bmatrix}  V^T.
   $$
   Let $\begin{bmatrix} \Sigma \\ a^T V \end{bmatrix} = \bar U \bar \Sigma \bar V^T$ be the SVD of the half-arrowhead matrix. _This SVD can be computed in $O(n^2)$ operations._ Then 
   $$\begin{bmatrix} A \\ a^T\end{bmatrix} =
   \begin{bmatrix} U & \\ & 1 \end{bmatrix} \bar U \bar\Sigma \bar V^T V^T \equiv
   \tilde U \bar \Sigma \tilde V^T
   $$
   is the SVD of $\tilde A$. 
   
2. Direct computation of $\tilde U$ and $\tilde V$ requires $O(mn^2)$ and $O(n^3)$ operations, respectively. However, these multiplications can be performed faster using the Fast Multipole Method. This is not (yet) implemented in Julia and is "not for the timid" (quote by Steven G. Johnson).

3. If $m<n$ and $\mathop{\mathrm{rank}}(A)=m$, then
   $$
   \begin{bmatrix} A \\ a^T\end{bmatrix} =\begin{bmatrix} U & \\ & 1 \end{bmatrix} 
   \begin{bmatrix} \Sigma & 0 \\ a^T V & \beta\end{bmatrix} \begin{bmatrix} V^T \\ v^T \end{bmatrix},
   $$
   where $\beta=\sqrt{\|a\|_2^2-\|V^T a\|_2^2}$ and $v=(I-VV^T)a/\beta$ (assuming $\beta\neq 0$). Notice that $\|v\|_2=1$ and $V^Tv=0$ by construction.
   Let $\begin{bmatrix} \Sigma & 0 \\ a^T V &  \beta\end{bmatrix} = \bar U \bar \Sigma \bar V^T$ be the SVD of 
   the half-arrowhead matrix. Then 
   $$\begin{bmatrix} A \\ a^T\end{bmatrix} =
   \begin{bmatrix} U & \\ & 1 \end{bmatrix} \bar U \bar\Sigma \bar V^T \begin{bmatrix} V^T \\ v^T \end{bmatrix}
   \equiv \tilde U \bar \Sigma \tilde V^T
   $$
   is the SVD of $\tilde A$.
   
4. Adding a column $a$ to $A$ is equivalent to adding a row $a^T$ to $A^T$.

5. If $\mathop{\mathrm{rank}}(A)<\min\{m,n\}$, or if we are using an SVD approximation of rank $r$ and want to keep the rank of the approximation (the common case in practice), then the formulas in Fact 1 hold only approximately. More precisely, the updated rank-$r$ approximation is __not__ what we would get by computing the rank-$r$ approximation of the updated matrix, but it is sufficient in many applications.
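
The factorization for the flat case ($m<n$) rests on splitting $a$ into its component in the row space of $A$ and the orthogonal remainder. A short derivation, assuming $\beta\neq 0$ and taking $v$ normalized to unit length:
$$
a = V(V^Ta) + (I-VV^T)a = V(V^Ta) + \beta v, \qquad
\beta = \|(I-VV^T)a\|_2, \quad v = \frac{(I-VV^T)a}{\beta}.
$$
The two components are orthogonal and $V$ has orthonormal columns, so $\|a\|_2^2 = \|V^Ta\|_2^2 + \beta^2$. Transposing gives
$$
a^T = (a^TV)\,V^T + \beta\, v^T = \begin{bmatrix} a^TV & \beta \end{bmatrix}\begin{bmatrix} V^T \\ v^T\end{bmatrix},
$$
which is the last row of the factorization.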

### Example - Adding row to a tall matrix

If $m\geq n$, adding a row does not increase the size of $\Sigma$.
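
Before turning to the specialised solver, the factorization in Fact 1 can be checked with dense routines only. The following is a minimal sketch, assuming Julia 1.x with the standard `LinearAlgebra` library (the notebook cells use an older Julia syntax). The helper `addrow_dense` is a hypothetical name introduced for illustration, and it uses a dense $O(n^3)$ SVD of the small matrix in place of the $O(n^2)$ half-arrowhead solver.

```julia
using LinearAlgebra

# Hypothetical helper (not part of the notebook): row update for a tall,
# full-column-rank matrix, using a dense SVD of the small matrix.
function addrow_dense(U::AbstractMatrix, σ::AbstractVector,
                      V::AbstractMatrix, a::AbstractVector)
    m, n = size(U, 1), length(σ)
    # Small matrix [Σ; aᵀV] of size (n+1)×n
    M = [Matrix(Diagonal(σ)); (V' * a)']
    F = svd(M)                       # dense SVD, an assumption of this sketch
    # Updated factors: blockdiag(U, 1) * Ū and V * V̄
    Unew = [U zeros(m); zeros(1, n) 1.0] * F.U
    return Unew, F.S, V * F.V
end

A = rand(10, 6)
a = rand(6)
F = svd(A)
Unew, σnew, Vnew = addrow_dense(F.U, F.S, F.V, a)

# Residual of the updated factorization in the spectral norm
resid = opnorm([A; a'] - Unew * Diagonal(σnew) * Vnew')
```

The residual should be of the order of machine precision, and `σnew` still has six entries, since the number of columns has not changed.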

In [1]:
using Arrowhead

In [2]:
function mySVDaddrow{T}(svdA::Tuple,a::Vector{T})
    # Create the transposed half-arrowhead
    m,r,n=size(svdA[1],1),length(svdA[2]),size(svdA[3],1)
    b=svdA[3]'*a
    if m>=n || r<m
        M=HalfArrow(svdA[2],b)
    else
        β=sqrt(vecnorm(a)^2-vecnorm(b)^2)
        M=HalfArrow(svdA[2],[b;β])
    end
    tols=[1e2,1e2,1e2,1e2]
    U,σ,V=svd(M,tols)
    # Return the updated SVD (M is transposed, so U and V swap roles)
    if m>=n || r<m
        return [svdA[1] zeros(T,m); zeros(T,1,r) one(T)]*V, σ, svdA[3]*U
    else
        # Need one more column of svdA[3]: v is the normalized part of a
        # orthogonal to the columns of svdA[3]
        v=a-svdA[3]*b
        v=v/norm(v)
        return [svdA[1] zeros(T,m); zeros(T,1,r) one(T)]*V, σ, [svdA[3] v]*U
    end
end

mySVDaddrow (generic function with 1 method)

In [3]:
A=rand(10,6)
a=rand(6)

6-element Array{Float64,1}:
 0.85332 
 0.903412
 0.827819
 0.408869
 0.583093
 0.736354

In [4]:
svdA=svd(A)

(
[-0.285378 0.16357 … 0.199572 0.276073; -0.300254 0.248522 … -0.381776 -0.0350621; … ; -0.429127 -0.0207417 … -0.182095 0.000543409; -0.349413 0.41547 … 0.349614 0.15118],

[3.47339,1.24835,1.07237,0.706608,0.627259,0.200231],
[-0.348914 0.301398 … 0.512509 -0.62273; -0.391877 -0.230145 … 0.396827 0.0286701; … ; -0.511098 0.475454 … -0.193685 0.164429; -0.483959 -0.389085 … -0.652782 -0.39171])

In [5]:
typeof(svdA)

Tuple{Array{Float64,2},Array{Float64,1},Array{Float64,2}}

In [6]:
U,σ,V=mySVDaddrow(svdA,a)

(
[-0.255632 0.160349 … -0.337601 0.336241; -0.263237 0.277911 … 0.558511 -0.0681366; … ; -0.309791 0.4366 … 0.130764 0.257087; -0.454025 -0.156387 … -0.188337 -0.314866],

[3.89136,1.26562,1.07855,0.73691,0.696342,0.221121],
[-0.377134 0.252546 … -0.464391 -0.577474; -0.417981 -0.232042 … 0.330988 0.0741107; … ; -0.472015 0.542261 … -0.179912 0.138866; -0.471308 -0.344215 … 0.459776 -0.433116])

In [7]:
# Check the residual and orthogonality
norm([A;a']*V-U*diagm(σ)), norm(U'*U-I), norm(V'*V-I)

(2.4963221545544757e-15,1.5385509384231767e-15,1.4023654896907511e-15)

### Example - Adding row to a flat matrix
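
For a flat matrix of full row rank, the new row generally has a component outside the row space of $A$, so $\Sigma$ grows by one. The following sketch spells out that construction with dense routines only, assuming Julia 1.x and the standard `LinearAlgebra` library. The variable names are chosen for illustration, and the dense `svd` of the small bordered matrix stands in for the specialised solver.

```julia
using LinearAlgebra

A = rand(6, 10)            # flat matrix, full row rank with probability 1
a = rand(10)
F = svd(A)                 # thin SVD: U is 6×6, V is 10×6
U, σ, V = F.U, F.S, F.V
m = size(A, 1)

b = V' * a                 # coordinates of a in the row space of A
w = a - V * b              # part of a orthogonal to the row space
β = norm(w)                # equals sqrt(‖a‖² - ‖Vᵀa‖²)
v = w / β                  # unit vector with Vᵀv = 0

# Bordered matrix [Σ 0; aᵀV β] of size (m+1)×(m+1)
M = [Matrix(Diagonal(σ)) zeros(m); b' β]
G = svd(M)

# Updated factors: blockdiag(U, 1) * Ū and [V v] * V̄
Unew = [U zeros(m); zeros(1, m) 1.0] * G.U
Vnew = [V v] * G.V
resid = opnorm([A; a'] - Unew * Diagonal(G.S) * Vnew')
```

Computing `β` as the norm of the orthogonal remainder avoids the cancellation that the square-root formula can suffer when `a` lies almost entirely in the row space.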

In [8]:
# Now flat matrix
A=rand(6,10)
a=rand(10)
svdA=svd(A)

(
[-0.316032 -0.813425 … -0.20229 0.111578; -0.338335 0.100809 … 0.599685 0.50262; … ; -0.332441 -0.159702 … -0.370551 0.287277; -0.366914 -0.142901 … 0.558991 -0.169037],

[3.90411,1.42263,0.989402,0.701988,0.512857,0.254624],
[-0.322587 0.174371 … -0.439598 0.063015; -0.395624 0.0938582 … -0.273813 -0.487974; … ; -0.387256 0.232612 … 0.682705 0.106128; -0.300052 -0.39502 … -0.00377504 -0.143848])

In [9]:
U,σ,V=mySVDaddrow(svdA,a)
norm([A;a']*V-U*diagm(σ)), norm(U'*U-I), norm(V'*V-I)

(2.3074104782530735e-15,1.709420891407407e-15,1.6488268397277191e-15)

### Example - Adding columns

This can be viewed as adding rows to the transposed matrix, an elegant one-liner in Julia.
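
The reason reversing the factors works is that $A=U\Sigma V^T$ implies $A^T=V\Sigma U^T$, so the triple $(V,\sigma,U)$ is an SVD of $A^T$. A small check of this identity, assuming Julia 1.x and the standard `LinearAlgebra` library, with the factors stored in a tuple as in the notebook:

```julia
using LinearAlgebra

A = rand(6, 4)
F = svd(A)
svdA = (F.U, F.S, F.V)     # tuple layout (U, σ, V)

# Reversing the tuple gives (V, σ, U), the factors of Aᵀ = V Σ Uᵀ
W, σ, Z = reverse(svdA)
resid = opnorm(A' - W * Diagonal(σ) * Z')
```

Applying the row update to the reversed tuple and reversing the result again therefore yields the column update.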

In [10]:
function mySVDaddcol{T}(svdA::Tuple,a::Vector{T})
    reverse(mySVDaddrow(reverse(svdA),a))
end 

mySVDaddcol (generic function with 1 method)

In [11]:
# Tall matrix
A=rand(10,6)
a=rand(10)
svdA=svd(A)
U,σ,V=mySVDaddcol(svdA,a)
norm([A a]*V-U*diagm(σ)), norm(U'*U-I), norm(V'*V-I)

(5.3190732135377304e-15,7.390458116569099e-15,9.513145304928979e-16)

In [12]:
# Flat matrix
A=rand(6,10)
a=rand(6)
svdA=svd(A)
U,σ,V=mySVDaddcol(svdA,a)
norm([A a]*V-U*diagm(σ)), norm(U'*U-I), norm(V'*V-I)

(1.2967084780045917e-15,1.5141032328978982e-15,1.407905031422773e-15)

In [13]:
# Square matrix
A=rand(10,10)
a=rand(10);
svdA=svd(A);

In [14]:
U,σ,V=mySVDaddrow(svdA,a)
norm([A;a']*V-U*diagm(σ)), norm(U'*U-I), norm(V'*V-I)

Remedy 3 


(2.5542172008400272e-14,1.841965491224396e-15,1.1950750703510105e-15)

In [15]:
U,σ,V=mySVDaddcol(svdA,a)
norm([A a]*V-U*diagm(σ)), norm(U'*U-I), norm(V'*V-I)

(2.5588376140842295e-14,1.8808117490108662e-15,1.103971768414233e-15)

### Example - Updating a low rank approximation
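
When only a rank-$r$ approximation is kept, the update no longer reproduces the updated matrix, and it is in general not the best rank-$r$ approximation of it either. The sketch below makes this concrete with dense routines, assuming Julia 1.x and the standard `LinearAlgebra` library; it compares the spectral-norm error of the updated factors with the smallest error any rank-$r$ approximation can achieve, which by the Eckart-Young theorem is the $(r+1)$-st singular value of the updated matrix.

```julia
using LinearAlgebra

A = rand(10, 6)
a = rand(6)
r = 4                                   # rank of the approximation
F = svd(A)
Ur, σr, Vr = F.U[:, 1:r], F.S[1:r], F.V[:, 1:r]

# Update the rank-r factors as in Fact 1, using a dense SVD of the
# small (r+1)×r matrix [Σr; aᵀVr]
G = svd([Matrix(Diagonal(σr)); (Vr' * a)'])
Unew = [Ur zeros(size(Ur, 1)); zeros(1, r) 1.0] * G.U
Vnew = Vr * G.V

Anew = [A; a']
# Error of the updated rank-r factors in the spectral norm
err_update = opnorm(Anew - Unew * Diagonal(G.S) * Vnew')
# Smallest possible rank-r error (Eckart-Young)
err_best = svdvals(Anew)[r + 1]
```

One should observe `err_update >= err_best`, with the gap depending on how much of `a` lies outside the span of the retained right singular vectors.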


In [16]:
# Adding row to a tall matrix
A=rand(10,6)
svdA=svd(A)
a=rand(6)
# Rank of the approximation
r=4

4

In [17]:
svdAr=(svdA[1][:,1:r], svdA[2][1:r],svdA[3][:,1:r])
U,σ,V=mySVDaddrow(svdAr,a)
norm([A;a']-U*diagm(σ)*V'), svdvals([A;a']), σ

(0.7562770594874182,[4.38054,1.43714,1.3023,0.748024,0.742989,0.472363],[4.38033,1.43174,1.30226,0.74412])

In [18]:
# Adding row to a flat matrix
A=rand(6,10)
svdA=svd(A)
a=rand(10)
# Rank of the approximation
r=4

4

In [19]:
svdAr=(svdA[1][:,1:r], svdA[2][1:r],svdA[3][:,1:r])
U,σ,V=mySVDaddrow(svdAr,a)
norm([A;a']-U*diagm(σ)*V'), svdvals([A;a']), σ

(0.8953845533124976,[4.20618,1.18811,0.951222,0.823968,0.626708,0.418073,0.18691],[4.1928,1.17104,0.937317,0.667047])