# Instability of Gram–Schmidt

Both forms of Gram–Schmidt orthogonalization, classical (CGS) and modified (MGS), suffer from instability when used in finite-precision arithmetic. Here are our reference implementations again.

In [1]:
type cgs
type mgs


function [Q,R] = cgs(A)
    [m,n] = size(A);
    Q = zeros(m,n);
    R = zeros(n,n);
    for j = 1:n
        v = A(:,j);
        for k = 1:j-1
            R(k,j) = Q(:,k)'*A(:,j);
            v = v - R(k,j)*Q(:,k);
        end
        R(j,j) = norm(v);
        Q(:,j) = v/R(j,j);
    end
end  

function [Q,R] = mgs(A)
    [m,n] = size(A);
    Q = zeros(m,n);
    R = zeros(n,n);
    for k = 1:n
        R(k,k) = norm(A(:,k));
        Q(:,k) = A(:,k)/R(k,k);
        for j = k+1:n
            R(k,j) = Q(:,k)'*A(:,j);
        end
        A = A - Q(:,k)*R(k,:);
    end
end


We are going to try these out on *Vandermonde* matrices. Given a vector $x$ of $m$ points, the columns of a Vandermonde matrix are the monomials $1,x,\ldots,x^{m-1}$, each evaluated at all of the points in $x$. For convenience we build these matrices with the built-in `vander`, using equally spaced points in $[0,1]$.
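To make the construction concrete, here is a minimal sketch that builds a small Vandermonde matrix both with `vander` and by hand. The variable names `t` and `B` and the choice $m=4$ are ours, chosen only for illustration. The hand-built version makes explicit that `vander` stores the monomials in order of decreasing power, so the constant column comes last.

```matlab
m = 4;
t = (0:m-1)'/(m-1);   % m equally spaced points in [0,1] (name t is illustrative)
A = vander(t);        % built-in Vandermonde constructor

% Rebuild the matrix column by column to make the ordering explicit:
% column j holds the monomial t.^(m-j), so powers decrease left to right.
B = zeros(m,m);
for j = 1:m
    B(:,j) = t.^(m-j);
end

% The two constructions should agree up to rounding.
fprintf("norm(A-B) = %.2e\n", norm(A-B))
```

The same set of columns appears in either ordering, so the matrix is still built from $1,x,\ldots,x^{m-1}$ as described above.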

The test we will use is solving square linear systems of the form $Ax=b$ for vector $x$. If $A=QR$ is a full factorization, then $Rx=Q^*b$ and $x=R^{-1}Q^*b$. In practice we don't compute inverse matrices. Instead, since $R$ is triangular, we can use backward substitution. 
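The step from $Ax=b$ to $Rx=Q^*b$ can be spelled out as follows. This is only a sketch in exact arithmetic, and it relies on $Q^*Q=I$:

$$
Ax=b
\;\Longrightarrow\;
QRx=b
\;\Longrightarrow\;
Q^*QRx=Q^*b
\;\Longrightarrow\;
Rx=Q^*b .
$$

The last implication is the only place where orthogonality of $Q$ is used, so a computed $Q$ that is far from orthogonal undermines exactly this step.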

In [2]:
type backsub


function x = backsub(R,v)
    x = zeros(size(v));
    n = length(x);
    for i = n:-1:1
        x(i) = (v(i) - R(i,i+1:n)*x(i+1:n))/R(i,i);
    end
end


Now we run our experiment with MGS. For each $m$, we define a linear system whose solution we know exactly, and evaluate the accuracy of the result obtained by solving the system with an MGS factorization.

In [3]:
for m = 3:12
    A = vander((0:m-1)/(m-1));
    [Q,R] = mgs(A);
    xact = ones(m,1);
    b = A*xact;
    x = backsub(R,Q'*b);
    fprintf("m = %d: relative error in x is %.3e\n",m,(norm(x-xact)/norm(xact)))
end

m = 3: relative error in x is 6.410e-16
m = 4: relative error in x is 1.010e-13
m = 5: relative error in x is 3.150e-12
m = 6: relative error in x is 3.963e-10
m = 7: relative error in x is 1.459e-08
m = 8: relative error in x is 2.619e-06
m = 9: relative error in x is 5.191e-05
m = 10: relative error in x is 1.114e-02
m = 11: relative error in x is 2.629e-02
m = 12: relative error in x is 1.748e+01


These errors grow rapidly with $m$: by $m=12$ the computed solution has no correct digits. But as we will see, we cannot always expect accurate solutions to this type of problem. Let's assume that the built-in QR factorization is as good as we can do, and see how it performs.

In [4]:
for m = 3:12
    A = vander((0:m-1)/(m-1));
    [Q,R] = qr(A);
    xact = ones(m,1);
    b = A*xact;
    x = backsub(R,Q'*b);
    fprintf("m = %d: relative error in x is %.3e\n",m,(norm(x-xact)/norm(xact)))
end

m = 3: relative error in x is 2.937e-16
m = 4: relative error in x is 9.989e-15
m = 5: relative error in x is 8.352e-15
m = 6: relative error in x is 2.258e-13
m = 7: relative error in x is 5.217e-13
m = 8: relative error in x is 1.608e-12
m = 9: relative error in x is 9.540e-12
m = 10: relative error in x is 4.794e-10
m = 11: relative error in x is 1.487e-09
m = 12: relative error in x is 3.415e-08


The built-in factorization keeps the relative error below $10^{-7}$ even at $m=12$, so MGS is clearly not as accurate as it could be. CGS is even worse.

In [5]:
for m = 3:12
    A = vander((0:m-1)/(m-1));
    [Q,R] = cgs(A);
    xact = ones(m,1);
    b = A*xact;
    x = backsub(R,Q'*b);
    fprintf("m = %d: relative error in x is %.3e\n",m,(norm(x-xact)/norm(xact)))
end

m = 3: relative error in x is 5.439e-16
m = 4: relative error in x is 7.792e-14
m = 5: relative error in x is 1.365e-11
m = 6: relative error in x is 1.851e-09
m = 7: relative error in x is 4.174e-06
m = 8: relative error in x is 6.338e-03
m = 9: relative error in x is 5.289e+00
m = 10: relative error in x is 4.211e+04
m = 11: relative error in x is 3.899e+05
m = 12: relative error in x is 9.623e+05


Looking at the factors themselves gives more detail about what is happening. We measure how well $QR$ reproduces $A$ and how far $Q$ is from having orthonormal columns.

In [6]:
for m = 3:12
    A = vander((0:m-1)/(m-1));
    [Q,R] = mgs(A);
    fprintf("m = %d\n",m)
    fprintf("    MGS: norm(A-QR) = %.2e, norm(Q'Q-I) = %.2e\n",(norm(A-Q*R)),(norm(Q'*Q-eye(m))))
    [Q,R] = cgs(A);
    fprintf("    CGS: norm(A-QR) = %.2e, norm(Q'Q-I) = %.2e\n",(norm(A-Q*R)),(norm(Q'*Q-eye(m))))
end

m = 3
    MGS: norm(A-QR) = 0.00e+00, norm(Q'Q-I) = 4.85e-17
    CGS: norm(A-QR) = 1.11e-16, norm(Q'Q-I) = 1.24e-16
m = 4
    MGS: norm(A-QR) = 1.11e-16, norm(Q'Q-I) = 1.25e-15
    CGS: norm(A-QR) = 0.00e+00, norm(Q'Q-I) = 2.26e-15
m = 5
    MGS: norm(A-QR) = 2.22e-16, norm(Q'Q-I) = 8.58e-15
    CGS: norm(A-QR) = 3.14e-16, norm(Q'Q-I) = 3.13e-13
m = 6
    MGS: norm(A-QR) = 2.25e-16, norm(Q'Q-I) = 9.00e-14
    CGS: norm(A-QR) = 2.02e-16, norm(Q'Q-I) = 1.20e-11
m = 7
    MGS: norm(A-QR) = 2.49e-16, norm(Q'Q-I) = 5.16e-13
    CGS: norm(A-QR) = 1.64e-16, norm(Q'Q-I) = 8.18e-09
m = 8
    MGS: norm(A-QR) = 2.78e-16, norm(Q'Q-I) = 1.11e-11
    CGS: norm(A-QR) = 2.66e-16, norm(Q'Q-I) = 2.86e-06
m = 9
    MGS: norm(A-QR) = 3.35e-16, norm(Q'Q-I) = 3.22e-11
    CGS: norm(A-QR) = 3.73e-16, norm(Q'Q-I) = 5.01e-04
m = 10
    MGS: norm(A-QR) = 4.63e-16, norm(Q'Q-I) = 8.67e-10
    CGS: norm(A-QR) = 5.93e-16, norm(Q'Q-I) = 4.47e-01
m = 11
    MGS: norm(A-QR) = 4.17e-16, norm(Q'Q-I) = 2.82e-10
    CGS: 

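Both algorithms produce factors with $QR\approx A$ to about machine precision. However, both do a poor job of keeping $Q$ orthogonal (unitary, in the complex case). The loss of orthogonality is much more severe in CGS: at $m=10$, $\|Q^*Q-I\|$ is about $0.45$ for CGS versus about $10^{-9}$ for MGS. CGS also produces a poor $R$, unlike MGS.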
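A rough way to see why loss of orthogonality damages the solve, even when $QR\approx A$, is the following sketch. It assumes that $QR=A$ holds exactly and that all other arithmetic is exact, and it introduces an error matrix $E$ defined by $Q^*Q=I+E$; both the symbol $E$ and the computed solution $\hat{x}$ are our notation, not part of the experiment. With $b=Ax$,

$$
Q^*b = Q^*QRx = (I+E)Rx,
\qquad
R\hat{x} = Q^*b
\;\Longrightarrow\;
\hat{x} = x + R^{-1}ERx .
$$

Under these assumptions the error $\hat{x}-x=R^{-1}ERx$ vanishes only when $E=0$, and the factors $R^{-1}$ and $R$ can amplify it well beyond the size of $E$ itself.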

When an algorithm run in finite precision produces solutions with much greater error than another method can achieve on the same problem, we say the algorithm is *unstable*. We will have a lot more to say on this subject soon.