# Pivoted QR

Here's another look at our modified Gram-Schmidt algorithm for QR.

In [1]:
type mgs


function [Q,R] = mgs(A)
    [m,n] = size(A);
    Q = zeros(m,n);
    R = zeros(n,n);
    for k = 1:n
        R(k,k) = norm(A(:,k));
        Q(:,k) = A(:,k)/R(k,k);
        for j = k+1:n
            R(k,j) = Q(:,k)'*A(:,j);
        end
        A = A - Q(:,k)*R(k,:);
    end
end


Something bad clearly happens if $R_{kk}=0$ for some $k$: the computation of $q_k$ divides by zero. In terms of $A=QR$, one interpretation of this event is that the $k$ columns $a_1,\ldots,a_k$ all lie in the span of $q_1,\ldots,q_{k-1}$, which has dimension at most $k-1$. This implies a defect in the dimension of the column space; i.e., the rank of $A$ is less than $n$.

In machine arithmetic, which is inexact for most numbers, it's rare for something to be exactly zero. But the behavior is often about as bad for "near zero" as for zero itself. Let's set out to avoid this situation by arranging, at each step, for the entry that plays the role of $R_{kk}$ to be as large as possible.
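
To make the failure concrete, here is a small sketch that runs the same MGS loop inline on a rank-deficient matrix. The test matrix and the names `W`, `jj`, and `d` are illustrative choices made here, not part of the original notes.

```matlab
% Rank-2 example: column 3 equals 2*(column 2) - (column 1).
A = [1 2 3; 4 5 6; 7 8 9; 10 11 12];
[m,n] = size(A);
Q = zeros(m,n);
R = zeros(n,n);
W = A;                          % working copy that gets deflated
for k = 1:n
    R(k,k) = norm(W(:,k));
    Q(:,k) = W(:,k)/R(k,k);     % at k = 3 this divides by a roundoff-sized (or zero) number
    for jj = k+1:n
        R(k,jj) = Q(:,k)'*W(:,jj);
    end
    W = W - Q(:,k)*R(k,:);
end
d = diag(R)'                    % the last entry is tiny relative to the first
```

The third column of `Q` is then either built from rounding noise or filled with NaN, so it should not be trusted as a basis vector.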

Again considering $A = \sum_{k=1}^n q_k r_k^T$, define $A_j=\sum_{k=j}^n q_k r_k^T$. The MGS algorithm starts off by noting 

$$Ae_1=A_1e_1 = \sum_{k=1}^n q_k (r_k^Te_1) = R_{11}q_1,$$

where to simplify the sum we applied the upper triangular structure of $R$, specifically, $R_{21}=\cdots=R_{n1}=0$ (recall that $R$ is $n\times n$). This formula is used immediately to compute $R_{11}$, then $q_1$, both from the first column $A_1e_1$.

But if $R_{11}$ is zero (or small), this creates a problem. So instead suppose that we use the column of $A$ that produces the largest possible value. That is, let 

$$j_1 = \text{argmax}_k \| A_1 e_k \|_2,$$

and then replace the triangular requirement with 

$$R_{2,j_1}=\cdots=R_{n,j_1}=0.$$

This gives $A_1 e_{j_1}=R_{1,j_1}q_{1}$, from which we get $R_{1,j_1}=\|A_1e_{j_1}\|_2$ and then $q_{1}$. Furthermore, $q_{1}^TA_1=r_{1}^T$, just as before, allowing us to fill out the rest of the first row of $R$.
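
To spell out why the row formula survives pivoting, here is the one-line computation, assuming (as in the unpivoted case) that the columns of $Q$ are orthonormal:

$$q_1^T A_1 = \sum_{k=1}^n (q_1^T q_k)\, r_k^T = r_1^T,$$

since $q_1^Tq_k$ equals 1 for $k=1$ and 0 otherwise. Pivoting changes which column of $A_1$ determines $q_1$, but it plays no role in this identity.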

For example,

In [2]:
Q = zeros(6,3);
R = zeros(3,3);
A = randi(9,6,3)


A =

     8     3     9
     9     5     5
     2     9     8
     9     9     2
     6     2     4
     1     9     9



In [3]:
A1 = A;
c = sqrt( sum(A1.^2,1) )


c =

   16.3401   16.7631   16.4621



In [4]:
j = zeros(3,1);
[cmax,j(1)] = max(c)


cmax =

   16.7631


j =

     2
     0
     0



In [5]:
R(1,j(1)) = cmax


R =

         0   16.7631         0
         0         0         0
         0         0         0



In [6]:
Q(:,1) = A1(:,j(1))/R(1,j(1));
cols = true(3,1);
cols(j(1)) = false;
R(1,cols) = Q(:,1)'*A1(:,cols)


R =

   11.2748   16.7631   13.7803
         0         0         0
         0         0         0



By our definitions, $A_2=A_1-q_1r_1^T$.

In [8]:
A2 = A1 - Q(:,1)*R(1,:)


A2 =

    5.9822    0.0000    6.5338
    5.6370         0    0.8897
   -4.0534         0    0.6014
    2.9466         0   -5.3986
    4.6548         0    2.3559
   -5.0534         0    1.6014



Note that we have zeroed out the second column, which is column $j_1$. In fact, we're not going to touch that column of $R$ any more (since doing so would wreck the quasi-triangularity we set up). Instead, we choose the next column like we did at the start.
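
As a standalone check of this first step, the sketch below repeats it on the matrix displayed above, entered by hand, and confirms two things: the pivot column of $A_2$ vanishes, and every column of $A_2$ is orthogonal to $q_1$. The names `j1`, `q1`, and `r1` are introduced here for illustration only.

```matlab
A = [8 3 9; 9 5 5; 2 9 8; 9 9 2; 6 2 4; 1 9 9];   % matrix from the walkthrough
[cmax, j1] = max(sqrt(sum(A.^2,1)));   % largest column norm and its index
q1 = A(:,j1)/cmax;                     % first column of Q
r1 = q1'*A;                            % first row of R; r1(j1) equals cmax up to roundoff
A2 = A - q1*r1;                        % deflated matrix
norm(A2(:,j1))                         % pivot column is annihilated (roundoff level)
norm(q1'*A2)                           % remaining columns are orthogonal to q1
```

Both norms should come out at roundoff level, which is why the later steps can ignore $q_1$ and column $j_1$ entirely.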

In [9]:
c = sqrt( sum(A2.^2,1) );
[cmax,j(2)] = max(c)


cmax =

   11.8270


j =

     2
     1
     0



Now the game is to set $R_{k,j_2}=0$ for all $k>2$. Doing so means that 

$$A_2 e_{j_2}=\sum_{k=2}^n q_k (r_k^Te_{j_2}) = R_{2,j_2}q_2.$$

And then also $q_2^T A_2=r_2^T$. 

In [10]:
R(2,j(2)) = cmax
Q(:,2) = A2(:,j(2))/R(2,j(2));
cols(j(2)) = false;
R(2,cols) = Q(:,2)'*A2(:,cols)


R =

   11.2748   16.7631   13.7803
   11.8270         0         0
         0         0         0


R =

   11.2748   16.7631   13.7803
   11.8270         0    2.4207
         0         0         0



So we have $A_3=A_2-q_2r_2^T$:

In [11]:
A3 = A2 - Q(:,2)*R(2,:)


A3 =

         0    0.0000    5.3094
         0         0   -0.2641
         0         0    1.4311
         0         0   -6.0017
         0         0    1.4031
         0         0    2.6357



Another column zeroed out! Etc.

In [12]:
c = sqrt( sum(A3.^2,1) );
[cmax,j(3)] = max(c)


cmax =

    8.6743


j =

     2
     1
     3



In [13]:
R(3,j(3)) = cmax
Q(:,3) = A3(:,j(3))/R(3,j(3));


R =

   11.2748   16.7631   13.7803
   11.8270         0    2.4207
         0         0    8.6743



So now we have a factorization...

In [14]:
norm(A-Q*R)


ans =

   1.8812e-15



...but $R$ is no longer upper triangular. However, it is only a column permutation away from being so. In column $j_1$ we enforced zero below row 1, in column $j_2$ we enforced zero below row 2, and so on. So if we take the columns in order $j_1,\ldots,j_n$, the triangularity is restored.

In [15]:
R(:,j)


ans =

   16.7631   11.2748   13.7803
         0   11.8270    2.4207
         0         0    8.6743



This reordering might be expressed using right-multiplication by a permutation matrix $P$. In fact, we might say $RP = U$ for a truly triangular $U$, or $A=QUP^{-1}$. However, it's customary to put the permutation next to $A$ and continue using $R$ to mean a truly upper triangular matrix: $AP=QR$. This is a **pivoted QR factorization**. 
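
A short sketch of the relation between the index vector and the permutation matrix, using the rounded values of $R$ printed above. The variable `I3` and the construction of `P` as reordered identity columns are choices made here for illustration.

```matlab
j = [2 1 3];                              % pivot order found in the walkthrough
R = [11.2748 16.7631 13.7803;             % quasi-triangular R, rounded as printed
     11.8270       0  2.4207;
           0       0  8.6743];
I3 = eye(3);
P = I3(:,j);                              % k-th column of P is e_{j(k)}
U = R*P;                                  % right-multiplication reorders the columns
isequal(U, R(:,j))                        % same as indexing the columns by j
isequal(U, triu(U))                       % U is truly upper triangular
isequal(P'*P, eye(3))                     % P is orthogonal, so inv(P) equals P'
```

Indexing with `j` and multiplying by `P` are interchangeable; the index form is cheaper because it avoids a matrix product.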

I'm not going to put a complete algorithm here. In practice one doesn't use MGS for this anyway, and there are some further efficiencies I've skipped. 
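
That said, the hand computation above is mechanical enough to loop. The following is only a rough sketch that strings those same steps together, written for clarity rather than efficiency or robustness; it is not what production software does. The names `Ak`, `free`, and `r` are introduced here, and the matrix is the one from the walkthrough, entered by hand.

```matlab
A = [8 3 9; 9 5 5; 2 9 8; 9 9 2; 6 2 4; 1 9 9];   % matrix from the walkthrough
[m,n] = size(A);
Q = zeros(m,n);
R = zeros(n,n);
j = zeros(1,n);                  % pivot order
free = true(1,n);                % columns not yet chosen as pivots
Ak = A;                          % plays the role of A_k
for k = 1:n
    c = sqrt(sum(Ak.^2,1));      % column norms of A_k
    c(~free) = -1;               % never re-select an earlier pivot column
    [cmax, j(k)] = max(c);
    Q(:,k) = Ak(:,j(k))/cmax;
    r = Q(:,k)'*Ak;              % inner products of q_k with every column of A_k
    r(~free) = 0;                % enforce exact zeros in earlier pivot columns
    r(j(k)) = cmax;              % pivot entry of row k
    R(k,:) = r;
    free(j(k)) = false;
    Ak = Ak - Q(:,k)*R(k,:);     % deflate to get A_{k+1}
end
j                                % pivot order
norm(A - Q*R)                    % residual of the factorization
norm(A(:,j) - Q*R(:,j))          % residual of the pivoted form AP = QR
```

Here `R` itself is quasi-triangular, while `R(:,j)` is upper triangular, matching the hand computation.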

The default in MATLAB is to get an *unpivoted* QR.

In [17]:
[Q,R] = qr(A,0);
R


R =

  -16.3401  -11.5666  -11.2606
         0   12.1332    8.3039
         0         0    8.6743



However, you can request the pivoted form.

In [18]:
[Q,R,P] = qr(A,0);
R


R =

  -16.7631  -11.2748  -13.7803
         0  -11.8270   -2.4207
         0         0    8.6743



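This is the same as ours (up to arbitrary signs). The permutation is the third output. With the economy-size syntax used here, it is returned as a vector of column indices rather than as a matrix, so that `A(:,P)` equals `Q*R`.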

In [19]:
P


P =

     2     1     3
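
As a quick check of what the three-output call returns, the sketch below verifies the factorization in index form on the matrix from the walkthrough, entered by hand. The lowercase name `p` is used here only to stress that the output is a vector.

```matlab
A = [8 3 9; 9 5 5; 2 9 8; 9 9 2; 6 2 4; 1 9 9];   % matrix from the walkthrough
[Q,R,p] = qr(A,0);               % economy size: p is a permutation vector
norm(A(:,p) - Q*R)               % residual of the pivoted factorization
abs(diag(R))'                    % pivoting orders these from largest to smallest
```

The nonincreasing diagonal magnitudes are the payoff of pivoting: a small trailing entry signals near rank deficiency instead of being hidden somewhere in the middle of $R$.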

