# Gram-Schmidt

The Gram-Schmidt procedure constructs an orthogonal basis from an arbitrary basis.  In the context of the QR decomposition, the orthogonal basis is also normalized, so the result is an orthonormal basis.

In [1]:
A = rand(4,4)

4×4 Array{Float64,2}:
 0.633741  0.324408  0.00401967  0.148788
 0.572262  0.130239  0.014494    0.197192
 0.491111  0.892663  0.890411    0.334833
 0.76267   0.309708  0.345518    0.949567

In [2]:
using LinearAlgebra
q1 = A[:,1]/norm(A[:,1]) # first vector of Q, normalized

4-element Array{Float64,1}:
 0.5087106606683517
 0.4593604143077865
 0.3942198189591957
 0.6122027505893181

In [3]:
q2 = A[:,2] - (A[:,2]'*q1)*q1 # second orthogonal vector
q2 = q2/norm(q2)  # normalized

4-element Array{Float64,1}:
 -0.10008403441089209
 -0.3391678377594752 
  0.9030433315930022 
 -0.2438464786520667 

In [4]:
using LinearAlgebra
function gs(X) # classical Gram-Schmidt, returns Q, R with X ≈ Q*R
    p = size(X,2) # number of columns (vectors to orthogonalize)
    Q = zeros(size(X))
    R = zeros(p,p)
    for i = 1:p
        Q[:,i] = X[:,i] # copy next vector
        if i>1
            R[1:(i-1),i] = (Q[:,1:(i-1)])'*Q[:,i] # coefficients to remove prev vecs
            Q[:,i] = Q[:,i] - Q[:,1:(i-1)]*R[1:(i-1),i] # new vec
        end
        R[i,i] = norm(Q[:,i])   # normalizing constant for this vector
        Q[:,i] = Q[:,i]./R[i,i] # normalize this vector
    end
    return Q, R
end

gs (generic function with 1 method)

In [5]:
X, Y = gs(A)
display(X)  # first two vectors the same as above
display(Y)

4×4 Array{Float64,2}:
 0.508711  -0.100084  -0.782225    0.345428
 0.45936   -0.339168  -0.0202816  -0.820696
 0.39422    0.903043   0.0726765  -0.154343
 0.612203  -0.243846   0.61841     0.428154

4×4 Array{Float64,2}:
 1.24578  0.766366  0.571248   0.879598 
 0.0      0.653951  0.714508  -0.0109523
 0.0      0.0       0.274945   0.491171 
 0.0      0.0       0.0        0.244443 

In [6]:
display(X'*X) # check orthogonality, should be I
norm(X'*X-I) # Frobenius/Euclidean norm, should be close to 0

4×4 Array{Float64,2}:
  1.0           1.94711e-16  -2.56463e-16   3.99213e-16
  1.94711e-16   1.0          -2.55028e-16  -1.77641e-16
 -2.56463e-16  -2.55028e-16   1.0           1.08061e-15
  3.99213e-16  -1.77641e-16   1.08061e-15   1.0        

1.7792299781408315e-15

In [7]:
display(X*Y - A) # check reproducing A, should be zeros
norm(X*Y - A)

4×4 Array{Float64,2}:
 0.0  0.0   1.38778e-17  2.77556e-17
 0.0  0.0  -6.93889e-18  0.0        
 0.0  0.0   0.0          0.0        
 0.0  0.0   0.0          0.0        

3.179800655392251e-17

In [8]:
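# n×m test matrix: first row all ones, delta on the subdiagonal.
# For small delta the columns are nearly parallel.
function pertmat(n, m, delta)
    P = zeros(n,m)
    P[1,:] .= 1 # first row of ones
    for i = 1:(min(n,m)-1)
        P[i+1,i] = delta
    end
    return P
end
V = pertmat(10,10, 1e-8)
Q, R = gs(V)
display(norm(Q*R-V))  # reconstruction error
display(norm(Q'*Q-I)) # deviation from orthogonality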

2.8906658809266317e-25

4.690415759823429

In [9]:
# Failing Gram-Schmidt
δ = 1e-8
V = [1 1 1; δ 0 0; 0 δ 0]
display(V)
Q, R = gs(V)
display(norm(Q*R-V))
display(norm(Q'*Q-I))

3×3 Array{Float64,2}:
 1.0     1.0     1.0
 1.0e-8  0.0     0.0
 0.0     1.0e-8  0.0

0.0

1.0

In [10]:
# n×n Hilbert matrix, H[i,j] = 1/(i+j-1); badly conditioned even for small n
function hilbert(n)
    H = Matrix{Float64}(undef, n, n)
    for i = 1:n
        for j=1:n
            H[i,j] = 1/(i+j-1)
        end
    end
    return H
end

A = hilbert(4)
Q, R = gs(A)

([0.838116 -0.522648 0.153973 -0.0263067; 0.419058 0.441713 -0.727754 0.31568; 0.279372 0.528821 0.139506 -0.7892; 0.209529 0.502072 0.653609 0.526134], [1.19315 0.670493 0.474933 0.369835; 0.0 0.118533 0.125655 0.117542; 0.0 0.0 0.00622177 0.00956609; 0.0 0.0 0.0 0.000187905])

In [11]:
norm(Q'*Q-I) # Identity matrix?

6.236040661077094e-11

In [12]:
norm(Q*R-A) # Original matrix?

6.798699777552591e-17

# Modified Gram-Schmidt

The key distinction between Gram-Schmidt and modified Gram-Schmidt is how we carry out the orthogonalization step for each new vector.  In Gram-Schmidt, we take a vector, $v$, and construct its projection onto each of our already-determined orthogonal basis vectors.  All of these projections are computed from the original $v$ and subtracted from it, so that what is left is orthogonal to everything that preceded it.
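In symbols, this step can be sketched as follows. The notation is an assumption made here to match the `gs` code: $v_i$ is the current vector, $q_1, \dots, q_{i-1}$ are the already-normalized basis vectors, and $r_{ji}$ are the entries of `R`.

$$
r_{ji} = q_j^\top v_i \quad (j < i), \qquad
u_i = v_i - \sum_{j=1}^{i-1} r_{ji}\, q_j, \qquad
r_{ii} = \lVert u_i \rVert, \qquad
q_i = \frac{u_i}{r_{ii}}
$$

Every coefficient $r_{ji}$ is computed from the same original $v_i$.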

In modified Gram-Schmidt, $v_{i}$ is projected onto the first basis vector, and the projection is subtracted immediately to produce an intermediate vector, $u_{i,1}$.  That intermediate vector is then projected onto the next basis vector and the projection subtracted, giving $u_{i,2}$, and so on: each $u_{i,j}$ is obtained from $u_{i,j-1}$ by removing its component along the $j$-th basis vector.
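Written as a recursion, with $q_j$ denoting the $j$-th normalized basis vector and $r_{ji}$ the entries of `R` (notation assumed here to match the `mgs` code), a sketch of the update is:

$$
u_{i,0} = v_i, \qquad
r_{ji} = q_j^\top u_{i,j-1}, \qquad
u_{i,j} = u_{i,j-1} - r_{ji}\, q_j \quad (j = 1, \dots, i-1), \qquad
q_i = \frac{u_{i,i-1}}{\lVert u_{i,i-1} \rVert}
$$

In exact arithmetic $q_j^\top u_{i,j-1} = q_j^\top v_i$, because the components already removed are orthogonal to $q_j$.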

So where in Gram-Schmidt we make repeated use of the original vector we are currently working on, in modified Gram-Schmidt we make use of a repeatedly updated vector.

While the two procedures yield mathematically identical results, in a finite precision world the modified procedure produces much smaller deviations from orthogonality in $Q$.  In the examples here, both procedures reproduce the original matrix as $QR$ to roundoff.
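To see where the two procedures part ways, the following sketch works by hand through the third column of the 3×3 matrix from the failing example. The names `u3_cgs`, `q3_cgs`, `u31`, `u32`, and `q3_mgs` are introduced here for illustration only. The first two basis vectors are the same in both procedures; only the coefficient along `q2` differs.

```julia
using LinearAlgebra

δ = 1e-8
V = [1 1 1; δ 0 0; 0 δ 0]

# The first two orthonormal vectors are identical in both procedures.
q1 = V[:, 1] / norm(V[:, 1])
u2 = V[:, 2] - (q1' * V[:, 2]) * q1
q2 = u2 / norm(u2)

# Classical: both coefficients are computed from the original third column.
v3 = V[:, 3]
u3_cgs = v3 - (q1' * v3) * q1 - (q2' * v3) * q2
q3_cgs = u3_cgs / norm(u3_cgs)

# Modified: the second coefficient is computed from the updated vector.
u31 = v3 - (q1' * v3) * q1
u32 = u31 - (q2' * u31) * q2
q3_mgs = u32 / norm(u32)

println("classical q2'q3 = ", q2' * q3_cgs)
println("modified  q2'q3 = ", q2' * q3_mgs)
```

In the classical step, `q2' * v3` is exactly zero, so the rounding error left in the vector after removing `q1` is never corrected and the result stays far from orthogonal to `q2`. The modified step measures the component along `q2` in the updated vector and removes it.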

In [13]:
function mgs(X)
    n = size(X,1)
    p = size(X,2)
    Q = zeros(n, p)
    R = zeros(p,p)
    for i = 1:p # columns
        Q[:,i] = X[:,i] # copy next vector
        for j = 1:(i-1) # rows, use previous vectors one at a time
            R[j,i] = (Q[:,j])'*Q[:,i] # build next R value
            Q[:,i] = Q[:,i] - R[j,i]*Q[:,j]
#            display(Q)
        end
        R[i,i] = norm(Q[:,i]) # normalizing constant
        Q[:,i] = Q[:,i]./R[i,i] # normalize this vector
    end
    return Q, R
end

mgs (generic function with 1 method)

In [14]:
XM, YM = mgs(A)

([0.838116 -0.522648 0.153973 -0.0263067; 0.419058 0.441713 -0.727754 0.31568; 0.279372 0.528821 0.139506 -0.7892; 0.209529 0.502072 0.653609 0.526134], [1.19315 0.670493 0.474933 0.369835; 0.0 0.118533 0.125655 0.117542; 0.0 0.0 0.00622177 0.00956609; 0.0 0.0 0.0 0.000187905])

In [15]:
norm(XM*YM - A)

3.925231146709438e-17

In [16]:
norm(XM'*XM-I)

4.0501609238466034e-13

In [17]:
H = hilbert(10)
Q, R = gs(H)
Qm, Rm = mgs(H)
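display(norm(Q'*Q-I))   # orthogonality, classical
display(norm(Qm'*Qm-I)) # orthogonality, modified
display(norm(Q*R-H))    # reconstruction, classical
display(norm(Qm*Rm-H))  # reconstruction, modified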

3.4313291893377946

0.0002550167142523661

1.088322983216908e-16

1.1978388074229732e-16