# Instability of Gram–Schmidt

Both the classical (CGS) and modified (MGS) forms of Gram–Schmidt orthogonalization suffer from instability when used in finite precision. Here are our reference implementations again.

In [1]:
using LinearAlgebra

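function cgs(A)
    m,n = size(A)
    Q = zeros(m,n)
    R = zeros(n,n)
    for j = 1:n
        v = A[:,j]
        for k = 1:j-1
            # Classical GS: project the original column A[:,j] onto each q_k.
            R[k,j] = Q[:,k]⋅A[:,j]
            v -= R[k,j]*Q[:,k]
        end
        # What remains of v is normalized to give the next column of Q.
        R[j,j] = norm(v)
        Q[:,j] = v/R[j,j]
    end
    return Q,R
end;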

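function mgs(A)
    m,n = size(A)
    Q = zeros(m,n)
    R = zeros(n,n)
    B = copy(A)    # working copy whose columns are updated at every step
    for k = 1:n
        R[k,k] = norm(B[:,k])
        Q[:,k] = B[:,k]/R[k,k]
        for j = k+1:n
            # Modified GS: project the already-updated column B[:,j] onto q_k.
            R[k,j] = Q[:,k]⋅B[:,j]
        end
        # Remove the q_k component from all columns at once (rank-one update).
        B -= Q[:,k]*R[k,:]'
    end
    return Q,R
end;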

We are going to try these out on *Vandermonde* matrices. Given a vector $x$ of $m$ points, the columns of a Vandermonde matrix are evaluations of the monomials $1,x,\ldots,x^{m-1}$, each at all of the points in $x$. For convenience we write a function that makes these matrices using the $m$ equally spaced points $i/m-1$ for $i=0,\ldots,m-1$, which lie in $[-1,0)$.

In [2]:
vander(m) = [ (i/m-1)^j for i=0:m-1, j=0:m-1 ];
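
As a quick check of the definition, the following sketch builds the $3\times 3$ case and confirms that each column holds one monomial evaluated at every point. The names `nodes` and `A` are introduced here only for illustration.

```julia
vander(m) = [ (i/m-1)^j for i=0:m-1, j=0:m-1 ];   # same definition as above

m = 3
nodes = [ i/m-1 for i=0:m-1 ]   # the points used by vander(m); i/m-1 parses as (i/m)-1
A = vander(m)

# Column j+1 holds the monomial x^j evaluated at every node.
for j = 0:m-1
    @assert A[:,j+1] ≈ nodes.^j
end

println(nodes)
println(A[1,:])   # powers of the first node
```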

The test we will use is solving square linear systems of the form $Ax=b$ for the vector $x$. If $A=QR$ is a full factorization, then $Rx=Q^*b$ and $x=R^{-1}Q^*b$. In practice we don't compute inverse matrices. Instead, since $R$ is upper triangular, we can use backward substitution.
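
The step from $Ax=b$ to $Rx=Q^*b$ can be spelled out in one line, under the assumption that $Q^*Q=I$ holds exactly:

$$
Ax=b \;\Longrightarrow\; QRx=b \;\Longrightarrow\; Q^*QRx=Q^*b \;\Longrightarrow\; Rx=Q^*b.
$$

In finite precision a computed $Q$ satisfies $Q^*Q=I$ only approximately, so the last implication is only as good as the orthogonality of $Q$.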

In [3]:
function backsub(R,v)
    x = zero(v)
    n = length(x)
    x[n] = v[n]/R[n,n]
    for i = n-1:-1:1
        x[i] = (v[i] - sum(R[i,j]*x[j] for j=i+1:n))/R[i,i]
    end
    return x
end

backsub (generic function with 1 method)
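
A small hand-checkable system is a reasonable sanity test for `backsub`. The sketch below repeats the definition so that it runs on its own. The $2\times 2$ matrix and right-hand side are made-up illustration values, and the result is compared with Julia's backslash solve.

```julia
using LinearAlgebra

# Same definition as above, repeated so this cell is self-contained.
function backsub(R,v)
    x = zero(v)
    n = length(x)
    x[n] = v[n]/R[n,n]
    for i = n-1:-1:1
        x[i] = (v[i] - sum(R[i,j]*x[j] for j=i+1:n))/R[i,i]
    end
    return x
end

R = [2.0 1.0; 0.0 4.0]   # illustrative upper triangular matrix
v = [4.0, 8.0]           # illustrative right-hand side

x = backsub(R,v)
# By hand: x[2] = 8/4 = 2, then x[1] = (4 - 1*2)/2 = 1.
@assert x ≈ [1.0, 2.0]
@assert x ≈ R\v          # agrees with the built-in solve
```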

Finally, we run our experiment with MGS. For each $m$, we define a linear system whose solution we know exactly, and evaluate the accuracy of the result obtained by solving the system with an MGS factorization. 

In [4]:
for m = 3:12
    A = vander(m);
    Q,R = mgs(A);
    xact = ones(m);
    b = A*xact;
    x = backsub(R,Q'*b);
    println("m = $m: relative error in x is $(norm(x-xact)/norm(xact))")
end

m = 3: relative error in x is 2.0643780405813746e-14
m = 4: relative error in x is 2.765084163240567e-13
m = 5: relative error in x is 4.3853264259829185e-11
m = 6: relative error in x is 5.853934111692478e-10
m = 7: relative error in x is 2.168939797200798e-8
m = 8: relative error in x is 2.133216055674811e-6
m = 9: relative error in x is 6.950940446701876e-5
m = 10: relative error in x is 0.009033706217227188
m = 11: relative error in x is 0.6422646160775008
m = 12: relative error in x is 18.82690667221913


These results deteriorate rapidly as $m$ increases. However, as we will see, we cannot always expect accurate solutions to this type of problem. Let's take the built-in QR factorization as the best we can do, and see how it performs.

In [5]:
for m = 3:12
    A = vander(m);
    Q,R = qr(A);
    xact = ones(m);
    b = A*xact;
    x = backsub(R,Q'*b);
    println("m = $m: relative error in x is $(norm(x-xact)/norm(xact))")
end

m = 3: relative error in x is 1.1083711083656976e-15
m = 4: relative error in x is 1.0413039739334374e-14
m = 5: relative error in x is 2.539272045339701e-14
m = 6: relative error in x is 4.3645828251288443e-13
m = 7: relative error in x is 3.150971136357195e-12
m = 8: relative error in x is 4.531953832679162e-12
m = 9: relative error in x is 1.1372841335400557e-10
m = 10: relative error in x is 5.227856787216244e-10
m = 11: relative error in x is 1.0736373728650449e-8
m = 12: relative error in x is 1.6887232227742025e-8

The built-in factorization loses accuracy far more slowly than MGS does. Here is the same experiment using CGS.


In [6]:
for m = 3:12
    A = vander(m);
    Q,R = cgs(A);
    xact = ones(m);
    b = A*xact;
    x = backsub(R,Q'*b);
    println("m = $m: relative error in x is $(norm(x-xact)/norm(xact))")
end

m = 3: relative error in x is 2.8148916062398038e-14
m = 4: relative error in x is 3.779868218405056e-13
m = 5: relative error in x is 3.851582990433591e-10
m = 6: relative error in x is 3.0706220658225446e-8
m = 7: relative error in x is 2.1840474997942476e-5
m = 8: relative error in x is 0.005523422161571493
m = 9: relative error in x is 1.1591594671943921
m = 10: relative error in x is 117.72592248679791
m = 11: relative error in x is 424.11213131466843
m = 12: relative error in x is 1260.9828870880735


If we dig a little deeper, we find more detail about what is happening. 

In [7]:
for m = 3:12
    A = vander(m);
    Q,R = mgs(A);
    println("m = $m")
    println("    MGS: norm(A-QR) = $(norm(A-Q*R)), norm(Q'Q-I) = $(norm(Q'*Q-I))")
    Q,R = cgs(A);
    println("    CGS: norm(A-QR) = $(norm(A-Q*R)), norm(Q'Q-I) = $(norm(Q'*Q-I))")
end

m = 3
    MGS: norm(A-QR) = 6.206335383118183e-17, norm(Q'Q-I) = 2.8294355608193063e-15
    CGS: norm(A-QR) = 6.206335383118183e-17, norm(Q'Q-I) = 9.795241106856997e-15
m = 4
    MGS: norm(A-QR) = 3.98609107286759e-17, norm(Q'Q-I) = 1.1805770268321427e-14
    CGS: norm(A-QR) = 1.195827321860277e-16, norm(Q'Q-I) = 3.151179182222787e-14
m = 5
    MGS: norm(A-QR) = 2.3637248526677933e-16, norm(Q'Q-I) = 1.1677475425255558e-13
    CGS: norm(A-QR) = 8.820715996190249e-17, norm(Q'Q-I) = 2.302225631664689e-11
m = 6
    MGS: norm(A-QR) = 2.5528169419283355e-16, norm(Q'Q-I) = 4.3400513987301074e-13
    CGS: norm(A-QR) = 1.5771887037391902e-16, norm(Q'Q-I) = 1.1333992862760593e-9
m = 7
    MGS: norm(A-QR) = 1.876458360916472e-16, norm(Q'Q-I) = 1.2528518750336082e-12
    CGS: norm(A-QR) = 1.7558966738132046e-16, norm(Q'Q-I) = 4.3721846056562734e-7
m = 8
    MGS: norm(A-QR) = 3.7611837884610617e-16, norm(Q'Q-I) = 3.090494607388847e-11
    CGS: norm(A-QR) = 3.020416577293461e-16, norm(Q'Q-I) = 6.421

Both algorithms produce matrices such that $QR\approx A$ to nearly machine precision. However, they do a poor job of ensuring that $Q$ is orthogonal/unitary. (As it happens, this problem is more severe in CGS, which also gets a poor $R$, unlike MGS.)
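
For comparison, the same two diagnostics can be computed for the built-in factorization. This is only a sketch: it assumes that `Matrix(F.Q)` is an acceptable way to turn the factorization's `Q` into an ordinary dense matrix, and the name `F` is introduced here for illustration. No output is shown because the exact digits depend on the machine and library version.

```julia
using LinearAlgebra

vander(m) = [ (i/m-1)^j for i=0:m-1, j=0:m-1 ];   # same definition as above

for m = 3:12
    A = vander(m)
    F = qr(A)          # built-in factorization object
    Q = Matrix(F.Q)    # materialize Q as a dense matrix (assumed conversion)
    R = F.R
    println("m = $m")
    println("    built-in: norm(A-QR) = $(norm(A-Q*R)), norm(Q'Q-I) = $(norm(Q'*Q-I))")
end
```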

When an algorithm used in finite precision produces solutions with much greater error than can be obtained by a different method, we say the algorithm is *unstable*. We will have a lot more to say on this subject soon.