# Updating the SVD

In many SVD-based applications, the arrival of new data requires the SVD of the updated matrix. Instead of recomputing it from scratch, the existing SVD can be updated.

## Prerequisites

The reader should be familiar with the concepts of singular values and singular vectors, the related perturbation theory, and the corresponding algorithms.
 
## Competences 

The reader should be able to recognise applications where SVD updating can be successfully applied, and to apply it.

## Facts

For more details see
[M. Gu and S. C. Eisenstat, A Stable and Fast Algorithm for Updating the Singular Value Decomposition][GE93]
and [M. Brand, Fast low-rank modifications of the thin singular value decomposition][Bra06]
and the references therein.

[GE93]: http://www.cs.yale.edu/publications/techreports/tr966.pdf "M. Gu and S. C. Eisenstat, 'A Stable and Fast Algorithm for Updating the Singular Value Decomposition', Technical report, Yale University, 1993."

[Bra06]: http://www.sciencedirect.com/science/article/pii/S0024379505003812 "M. Brand, 'Fast low-rank modifications of the thin singular value decomposition', Linear Algebra and its Applications, 415 (2006) 20-30."

1. Let $A\in\mathbb{R}^{m\times n}$ with $m\geq n$ and $\mathop{\mathrm{rank}}(A)=n$, and  let $A=U\Sigma V^T$ be its SVD.
   Let $a\in\mathbb{R}^{n}$ be a vector, and let $\tilde A=\begin{bmatrix} A \\ a^T\end{bmatrix}$. Then
   $$\begin{bmatrix} A \\ a^T\end{bmatrix} =\begin{bmatrix} U & \\ & 1 \end{bmatrix} 
   \begin{bmatrix} \Sigma \\ a^TV \end{bmatrix}  V^T.
   $$
   Let $\begin{bmatrix} \Sigma \\ a^T V \end{bmatrix} = \bar U \bar \Sigma \bar V^T$ be the SVD of the half-arrowhead matrix. _This SVD can be computed in $O(n^2)$ operations._ Then 
   $$\begin{bmatrix} A \\ a^T\end{bmatrix} =
   \begin{bmatrix} U & \\ & 1 \end{bmatrix} \bar U \bar\Sigma \bar V^T V^T \equiv
   \tilde U \bar \Sigma \tilde V^T
   $$
   is the SVD of $\tilde A$. 
   
2. Direct computation of $\tilde U$ and $\tilde V$ requires $O(mn^2)$ and $O(n^3)$ operations, respectively. However, these multiplications can be accelerated using the Fast Multipole Method. This is not (yet) implemented in Julia and is "not for the timid" (quote by Steven G. Johnson).

3. If $m<n$ and $\mathop{\mathrm{rank}}(A)=m$, then
   $$
   \begin{bmatrix} A \\ a^T\end{bmatrix} =\begin{bmatrix} U & \\ & 1 \end{bmatrix} 
   \begin{bmatrix} \Sigma & 0 \\ a^T V & \beta\end{bmatrix} \begin{bmatrix} V^T \\ v^T \end{bmatrix},
   $$
   where $\beta=\sqrt{\|a\|_2^2-\|V^T a\|_2^2}$ and $v=\frac{1}{\beta}(I-VV^T)a$. Notice that $V^Tv=0$ and $\|v\|_2=1$ by construction.
   Let $\begin{bmatrix} \Sigma & 0 \\ a^T V &  \beta\end{bmatrix} = \bar U \bar \Sigma \bar V^T$ be the SVD of 
   the half-arrowhead matrix. Then 
   $$\begin{bmatrix} A \\ a^T\end{bmatrix} =
   \begin{bmatrix} U & \\ & 1 \end{bmatrix} \bar U \bar\Sigma \bar V^T \begin{bmatrix} V^T \\ v^T \end{bmatrix}
   \equiv \tilde U \bar \Sigma \tilde V^T
   $$
   is the SVD of $\tilde A$.
   
4. Adding a column $a$ to $A$ is equivalent to adding a row $a^T$ to $A^T$.

5. If $\mathop{\mathrm{rank}}(A)<\min\{m,n\}$, or if we are using a rank-$r$ SVD approximation and want to keep the rank of the approximation (the common case in practice), then the formulas in Fact 1 hold only approximately. More precisely, the updated rank-$r$ approximation is __not__ the rank-$r$ approximation we would obtain from the SVD of the updated matrix, but it is sufficient in many applications.
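
The formulas for adding a row to a tall and to a flat matrix can be checked numerically with a minimal sketch like the one below. It assumes Julia 1.x with the standard `LinearAlgebra` library. The helper `addrow_dense` is a hypothetical name introduced only for illustration, and it factors the small middle matrix with the dense `svd` instead of a fast half-arrowhead solver, so it does not reach the $O(n^2)$ operation count. It also assumes that $\beta>0$ in the flat case.

```julia
using LinearAlgebra

# Hypothetical helper (not part of the Arrowhead package): append the row a'
# to A, given the thin SVD A = U*Diagonal(σ)*V'. The small middle matrix is
# factored with the dense `svd`, which costs O(n^3) instead of O(n^2).
function addrow_dense(U::AbstractMatrix, σ::AbstractVector,
                      V::AbstractMatrix, a::AbstractVector)
    m, r = size(U)
    n = size(V, 1)
    b = V' * a                            # coordinates of a in the basis V
    if r == n
        # Tall case: V is square, so a lies in its column space
        M = [Matrix(Diagonal(σ)); b']     # (n+1) x n half-arrowhead
        W = V
    else
        # Flat case: extend V by the normalized part of a orthogonal to V
        # (assumes β > 0, that is, a is not in the column space of V)
        w = a - V * b
        β = norm(w)
        v = w / β
        M = [Matrix(Diagonal(σ)) zeros(r); b' β]
        W = [V v]
    end
    F = svd(M)
    Unew = [U zeros(m); zeros(1, r) 1.0] * F.U
    return Unew, F.S, W * F.V
end

# Residual and orthogonality for a tall and for a flat matrix
for (m, n) in ((10, 6), (6, 10))
    A, a = rand(m, n), rand(n)
    F = svd(A)
    Unew, σnew, Vnew = addrow_dense(F.U, F.S, F.V, a)
    B = [A; a']
    println((norm(B * Vnew - Unew * Diagonal(σnew)),
             norm(Unew' * Unew - I), norm(Vnew' * Vnew - I)))
end
```

All three printed quantities should be of the order of machine precision. In the flat case the number of singular values grows by one, while in the tall case it stays the same.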

### Example - Adding row to a tall matrix

If $m\geq n$, adding a row does not increase the size of $\Sigma$.

In [1]:
using Arrowhead

In [7]:
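function mySVDaddrow(svdA::Tuple,a::Vector)
    # svdA=(U,σ,V) is the thin (or truncated) SVD of A, a is the new row
    m,r,n=size(svdA[1],1),length(svdA[2]),size(svdA[3],1)
    T=eltype(a)
    # Coordinates of a in the basis of the right singular vectors
    b=svdA[3]'*a
    # Create the transposed half-arrowhead
    if m>=n || r<m
        # Tall matrix or rank-r approximation: the size of Σ does not change
        M=HalfArrow(svdA[2],b)
    else
        # Flat matrix of full rank: β is the norm of the part of a orthogonal to V
        β=sqrt(vecnorm(a)^2-vecnorm(b)^2)
        M=HalfArrow(svdA[2],[b;β])
    end
    tols=[1e2,1e2,1e2,1e2]
    # M is transposed, so its singular vectors U and V swap roles below
    U,σ,V=svd(M,tols)
    # Return the updated SVD
    if m>=n || r<m
        return [svdA[1] zeros(T,m); zeros(T,1,r) one(T)]*V, σ, svdA[3]*U
    else
        # Need one more column of svdA[3]: v is the normalized part of a
        # orthogonal to the columns of svdA[3]
        v=a-svdA[3]*b
        v=v/norm(v)
        return [svdA[1] zeros(T,m); zeros(T,1,r) one(T)]*V, σ, [svdA[3] v]*U
    end
end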

mySVDaddrow (generic function with 1 method)

In [3]:
A=rand(10,6)
a=rand(6)

6-element Array{Float64,1}:
 0.600138
 0.494396
 0.637202
 0.255038
 0.747546
 0.857846

In [4]:
svdA=svd(A)

([-0.253518 0.180114 … 0.309896 -0.0691524; -0.226062 -0.121541 … -0.00695264 0.433157; … ; -0.26309 -0.239412 … -0.819333 -0.0752609; -0.154135 0.280222 … 0.0267833 -0.326761], [3.37392, 1.04084, 1.00092, 0.868372, 0.488725, 0.124233], [-0.38682 0.136315 … -0.88927 -0.0626369; -0.313079 -0.143102 … -0.0296047 0.830579; … ; -0.407181 -0.385655 … 0.0242662 -0.541095; -0.509891 0.516766 … 0.359554 -0.0959959])

In [5]:
typeof(svdA)

Tuple{Array{Float64,2},Array{Float64,1},Array{Float64,2}}

In [8]:
U,σ,V=mySVDaddrow(svdA,a)

([-0.229911 -0.0342483 … 0.309037 -0.057797; -0.209717 0.0196049 … 0.0169131 0.477908; … ; -0.142252 -0.227439 … 0.0318075 -0.245091; -0.40644 -0.253101 … -0.0724165 -0.426307], [3.68841, 1.07552, 1.00261, 0.905266, 0.490583, 0.144028], [-0.389427 -0.083517 … -0.890335 -0.049393; -0.316661 0.0699576 … -0.00884346 0.842396; … ; -0.423026 0.227259 … 0.0280775 -0.527231; -0.521655 -0.415074 … 0.365689 -0.0901238])

In [9]:
# Check the residual and orthogonality
vecnorm([A;a']*V-U*diagm(σ)), vecnorm(U'*U-I), vecnorm(V'*V-I)

(1.9587048094424453e-15, 1.885257853121155e-15, 1.8237184231469905e-15)

### Example - Adding row to a flat matrix

In [10]:
# Now flat matrix
A=rand(6,10)
a=rand(10)
svdA=svd(A)

([-0.476057 -0.211646 … 0.0703447 -0.112251; -0.444378 0.352095 … 0.0651245 -0.533882; … ; -0.278694 -0.596163 … -0.600797 0.219102; -0.426618 0.608495 … -0.237625 0.618014], [4.55289, 1.29194, 0.967594, 0.813792, 0.656833, 0.288334], [-0.265053 0.0929198 … -0.500067 -0.141792; -0.413149 -0.122492 … 0.368838 0.00607877; … ; -0.34046 0.253964 … 0.164106 -0.725205; -0.277363 -0.158561 … -0.482752 0.251285])

In [11]:
U,σ,V=mySVDaddrow(svdA,a)
norm([A;a']*V-U*diagm(σ)), norm(U'*U-I), norm(V'*V-I)

(2.7603535729310404e-15, 1.8219426142087213e-15, 2.6764436302052887e-15)

### Example - Adding columns

Adding a column can be viewed as adding a row to the transposed matrix, which is an elegant one-liner in Julia.

In [12]:
function mySVDaddcol(svdA::Tuple,a::Vector)
    reverse(mySVDaddrow(reverse(svdA),a))
end 

mySVDaddcol (generic function with 1 method)
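
The one-liner above rests on a short transposition argument, sketched here. The symbols $P$ and $Q$ are introduced only for illustration and denote the first and the third output of `mySVDaddrow` when it is called with the reversed tuple $(V,\sigma,U)$. Since $A^T=V\Sigma U^T$, the reversed tuple is an SVD of $A^T$, so
$$
\begin{bmatrix} A^T \\ a^T \end{bmatrix} = P\, \bar\Sigma\, Q^T
\quad \Longrightarrow \quad
\begin{bmatrix} A & a \end{bmatrix} =
\begin{bmatrix} A^T \\ a^T \end{bmatrix}^T = Q\, \bar\Sigma\, P^T .
$$
Reversing the output $(P,\bar\sigma,Q)$ therefore gives $(Q,\bar\sigma,P)$, which is the SVD of $\begin{bmatrix} A & a \end{bmatrix}$.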

In [13]:
# Tall matrix
A=rand(10,6)
a=rand(10)
svdA=svd(A)
U,σ,V=mySVDaddcol(svdA,a)
vecnorm([A a]*V-U*diagm(σ)), vecnorm(U'*U-I), vecnorm(V'*V-I)

(2.7427595495755484e-15, 2.5744709723782386e-15, 1.6305681323770733e-15)

In [14]:
# Flat matrix
A=rand(6,10)
a=rand(6)
svdA=svd(A)
U,σ,V=mySVDaddcol(svdA,a)
vecnorm([A a]*V-U*diagm(σ)), vecnorm(U'*U-I), vecnorm(V'*V-I)

(2.5405639765851526e-15, 2.8743103525474136e-15, 2.023822046289524e-15)

In [15]:
# Square matrix
A=rand(10,10)
a=rand(10);
svdA=svd(A);

In [16]:
U,σ,V=mySVDaddrow(svdA,a)
vecnorm([A;a']*V-U*diagm(σ)), vecnorm(U'*U-I), vecnorm(V'*V-I)

(4.467605417315042e-14, 2.594231655800166e-15, 4.016986118189482e-15)

In [17]:
U,σ,V=mySVDaddcol(svdA,a)
vecnorm([A a]*V-U*diagm(σ)), vecnorm(U'*U-I), vecnorm(V'*V-I)

(4.430284503846282e-14, 3.1499710014674485e-15, 3.78046877817903e-15)

### Example - Updating a low rank approximation


In [18]:
# Adding row to a tall matrix
A=rand(10,6)
svdA=svd(A)
a=rand(6)
# Rank of the approximation
r=4

In [19]:
svdAr=(svdA[1][:,1:r], svdA[2][1:r],svdA[3][:,1:r])
U,σ,V=mySVDaddrow(svdAr,a)
vecnorm([A;a']-U*diagm(σ)*V'), svdvals([A;a']), σ

(0.7263342977519102, [4.20073, 1.20188, 1.05707, 0.917689, 0.494643, 0.460649], [4.19701, 1.1962, 1.05602, 0.904695])

In [20]:
# Adding row to a flat matrix
A=rand(6,10)
svdA=svd(A)
a=rand(10)
# Rank of the approximation
r=4

In [21]:
svdAr=(svdA[1][:,1:r], svdA[2][1:r],svdA[3][:,1:r])
U,σ,V=mySVDaddrow(svdAr,a)
vecnorm([A;a']-U*diagm(σ)*V'), svdvals([A;a']), σ

(1.0860732648102345, [4.40671, 1.25549, 1.10951, 0.8289, 0.770095, 0.599756, 0.341589], [4.40174, 1.25257, 1.085, 0.825713])
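
The gap between the updated rank-$r$ approximation and the best rank-$r$ approximation of the updated matrix can be illustrated with the following minimal sketch. It assumes Julia 1.x with the standard `LinearAlgebra` library and uses the dense `svd` in place of the half-arrowhead solver. The helper `bestrank` and all variable names are hypothetical and serve only this illustration.

```julia
using LinearAlgebra

# Hypothetical helper: best rank-r approximation of B from a dense SVD
# (optimal in the Frobenius norm by the Eckart-Young theorem)
function bestrank(B::AbstractMatrix, r::Integer)
    F = svd(B)
    return F.U[:, 1:r] * Diagonal(F.S[1:r]) * F.Vt[1:r, :]
end

A, a, r = rand(10, 6), rand(6), 4
F = svd(A)
U, σ, V = F.U[:, 1:r], F.S[1:r], F.V[:, 1:r]

# Rank-preserving update: only the part of a inside the span of V is kept
b = V' * a
G = svd([Matrix(Diagonal(σ)); b'])        # (r+1) x r middle matrix
Unew = [U zeros(size(U, 1)); zeros(1, r) 1.0] * G.U
Vnew = V * G.V

# Frobenius-norm errors of the updated and of the best rank-r approximation
B = [A; a']
err_updated = norm(B - Unew * Diagonal(G.S) * Vnew')
err_best = norm(B - bestrank(B, r))
println((err_updated, err_best))
```

The second error can never exceed the first. The updated factors reproduce the rank-$r$ approximation of $A$ with the projected row $a^TVV^T$ appended, which is why the update is only approximate.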