### Givens rotation

#### Triangularization using Givens rotation

1. Settings

In [1]:
import numpy as np

# 4-by-4 example
n = 4

# create a quick, full-rank matrix without typing it out
tmp = np.arange(1, n+1, dtype=np.float64)

# Create A and keep a copy (A is overwritten in the steps below)
A = tmp.reshape(-1,1) ** tmp
A_ini = A.copy()

print(A)


[[  1.   1.   1.   1.]
 [  2.   4.   8.  16.]
 [  3.   9.  27.  81.]
 [  4.  16.  64. 256.]]


2. Zero out (2,1) entry to begin with


2.A. Submatrix

Let $R = \begin{bmatrix} c & - s \\ s & c \end{bmatrix}$, where $c=\cos(\theta)$ and $s=\sin(\theta)$ for an angle $\theta$ to be determined. Partition the 4-by-4 matrices into 2-by-2 blocks.

$$
\left[\begin{array}{ll}
R & O \\
O & I
\end{array}\right]\left[\begin{array}{ll}
A_{11} & A_{12} \\
A_{21} & A_{22}
\end{array}\right]=\left[\begin{array}{ll}
R A_{11} & R A_{12} \\
A_{21} & A_{22}
\end{array}\right]
$$



We want $RA_{11}=\begin{bmatrix} * & * \\ 0 & * \end{bmatrix}$.

Let $A_{11}=\begin{bmatrix} e & f \\ g & h \end{bmatrix}$. Then, 

$$
RA_{11}=\begin{bmatrix} ce - sg & cf - sh \\ se+cg & sf+ch\end{bmatrix}=\begin{bmatrix} * & * \\ 0 & * \end{bmatrix}
$$

We need to find $c$ and $s$ that satisfy the (2,1)-entry equation, $se+cg=0$, together with $c^2 + s^2 = 1$. The following choice satisfies both whenever $e$ and $g$ are not both zero.

$$
c = \frac{e}{\sqrt{e^2 + g^2}}, \qquad s = -\frac{g}{\sqrt{e^2 + g^2}}
$$
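
As a quick check of this choice (a worked substitution, assuming $e$ and $g$ are not both zero), plugging $c$ and $s$ into the first column of $RA_{11}$ gives

$$
se + cg = \frac{-ge + eg}{\sqrt{e^2+g^2}} = 0,
\qquad
ce - sg = \frac{e^2 + g^2}{\sqrt{e^2+g^2}} = \sqrt{e^2+g^2}.
$$

So the rotation moves the whole length of the vector $(e, g)$ into the first entry. For $e=1$, $g=2$ this is $\sqrt{5}\approx 2.236$.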

In [2]:
# Specify rows involved in the Givens rotation: (2,1)-entry
i, j = 0, 1

# This creates indices for submatrix
# c.f. A[[i,j], [i,j]] returns [A[i,i], A[j,j]] by fancy indexing rule
ind = np.ix_([i,j], [i,j])
A_ = A[ind]
print(A_)

# computing cosine and sine: no need to find the angle.
# c.f.: a starred expression cannot be the whole right-hand side
# (e, g = *A_[:,0] --> SyntaxError); plain e, g = A_[:,0] would work
e, g = A_[0,0], A_[1,0]
den = np.sqrt(e*e + g*g)
c, s = e/den, -g/den

G_ = np.array([[c, -s], [s, c]])
print(G_@A_)


[[1. 1.]
 [2. 4.]]
[[2.23606798 4.02492236]
 [0.         0.89442719]]


2.B Full matrix

In [3]:
# Only rows i, j are affected (See the block multiplication above)
A[[i,j], :] = G_@A[[i,j], :]

print(A)

[[  2.23606798   4.02492236   7.60263112  14.75804865]
 [  0.           0.89442719   2.68328157   6.26099034]
 [  3.           9.          27.          81.        ]
 [  4.          16.          64.         256.        ]]


3. Repeat for the rest of the first column

3.A Modularize the procedure

In [4]:
def givens(A, ind):
    """
    Return 2-by-2 Givens rotation that zeros an element of given matrix

    Input:
        A (array): matrix whose single element is to be zeroed.
        ind (tuple/array of int): Two row indices involved. 
    Output:
        2-by-2 rotation matrix
            If ind = (i, j) is given (i < j), left-multiplying rows i and j
            of A by the returned matrix zeroes the entry A[j, i].
    """
    i, j = ind
    assert i < j, "Index must be increasing"
    
    # extract submatrix; only the i-th column matters
    A_ = A[[i, j], i]
    den = np.sqrt(A_[0]*A_[0] + A_[1]*A_[1])
    c, s = A_[0]/den, -A_[1]/den
    return np.array([[c, -s], [s, c]])

# Test the function
R1 = givens(A, (0, 1))
print(R1@A[[0,1], :])
print(R1.T @ R1)

[[ 2.23606798  4.02492236  7.60263112 14.75804865]
 [ 0.          0.89442719  2.68328157  6.26099034]]
[[1. 0.]
 [0. 1.]]


3.B Apply Givens rotation to the column

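Note: The pivot row must stay fixed at $i=0$ while we work on the first column. Pairing two lower rows instead (e.g. rows 1 and 2) would mix a row whose first entry is already zero with one whose first entry is not, and the zero would be destroyed.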
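
To make the note concrete, here is a small self-contained sketch that compares the two choices of pivot row on the same example matrix. The helper `rot` is a hypothetical stand-in introduced only for this illustration (it uses the step-2 formula directly on two scalars); it is not the `givens` function above.

```python
import numpy as np

def rot(a, b):
    """Hypothetical helper: 2x2 rotation [[c, -s], [s, c]] sending (a, b) to (r, 0)."""
    den = np.sqrt(a * a + b * b)
    c, s = a / den, -b / den
    return np.array([[c, -s], [s, c]])

n = 4
t = np.arange(1, n + 1, dtype=np.float64)
A = t.reshape(-1, 1) ** t

# Step 2: zero the (2,1) entry using rows 0 and 1.
A[[0, 1], :] = rot(A[0, 0], A[1, 0]) @ A[[0, 1], :]

# Wrong pivot: pair rows 1 and 2 to zero A[2, 0].
B = A.copy()
B[[1, 2], :] = rot(B[1, 0], B[2, 0]) @ B[[1, 2], :]

# Right pivot: pair rows 0 and 2 to zero A[2, 0].
C = A.copy()
C[[0, 2], :] = rot(C[0, 0], C[2, 0]) @ C[[0, 2], :]

print(B[:, 0])  # the zero has only moved from row 1 to row 2
print(C[:, 0])  # rows 1 and 2 both start with (numerically) zero
```

With the wrong pivot the nonzero first entry of row 2 is simply rotated into row 1, so no progress is made.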

In [5]:
for i in range(1):
    for j in range(2, n):
        ind = (i, j)
        A[ind, :] = givens(A, ind) @ A[ind, :]

print(A)
    
    

[[  5.47722558  18.25741858  64.63126179 237.34644159]
 [  0.           0.89442719   2.68328157   6.26099034]
 [  0.           2.1514115   10.03992032  36.57399545]
 [  0.           3.90360029  24.59268184 121.01160905]]


4. Constructing Q

Review the first three steps in block form.

$$
G_3 G_2 G_1 A 
= 
\left[\begin{array}{llll}
* & * & * & * \\
0 & * & * & * \\
0 & * & * & * \\
0 & * & * & * 
\end{array}\right],
$$

where

$$
G_1 = \left[\begin{array}{cccc}
\cos \theta_1 & -\sin \theta_1 & 0 & 0 \\
\sin \theta_1 & \cos \theta_1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1 
\end{array}\right], 
\quad
G_2 = \left[\begin{array}{cccc}
\cos \theta_2 & 0 & -\sin \theta_2 & 0 \\
0 & 1 & 0 & 0 \\
\sin \theta_2 & 0 & \cos \theta_2 & 0 \\
0 & 0 & 0 & 1 
\end{array}\right], 
\quad
G_3 = \left[\begin{array}{cccc}
\cos \theta_3 & 0 & 0 & -\sin \theta_3 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
\sin \theta_3 & 0 & 0 & \cos \theta_3 
\end{array}\right].
$$

The next three rotations, which act on the second and third columns, are

$$
G_4 = \left[\begin{array}{cccc}
1 & 0 & 0 & 0 \\
0 & \cos \theta_4 & -\sin \theta_4 & 0 \\
0 & \sin \theta_4 & \cos \theta_4 & 0 \\
0 & 0 & 0 & 1 
\end{array}\right], 
\quad
G_5 = \left[\begin{array}{cccc}
1 & 0 & 0 & 0 \\
0 & \cos \theta_5 & 0 & -\sin \theta_5 \\
0 & 0 & 1 & 0 \\
0 & \sin \theta_5 & 0 & \cos \theta_5 
\end{array}\right], 
\quad
G_6 = \left[\begin{array}{cccc}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & \cos \theta_6 & -\sin \theta_6 \\
0 & 0 & \sin \theta_6 & \cos \theta_6 
\end{array}\right].
$$


After three more steps, which zero out the remaining subdiagonal entries, we have
$$
G_6 G_5 G_4 G_3 G_2 G_1 A 
= 
\left[\begin{array}{llll}
* & * & * & * \\
0 & * & * & * \\
0 & 0 & * & * \\
0 & 0 & 0 & * 
\end{array}\right]
= R.
$$

**Fact**

- Each Givens rotation is orthogonal: $G^{-1}=G^T$.
- A Givens rotation is not symmetric in general ($G^T \neq G$ unless $\sin\theta = 0$), so the transposes below cannot be dropped.

Therefore, we have $A = G_1^T G_2^T G_3^T G_4^T G_5^T G_6^T R$, and conclude

$$
Q = G_1^T G_2^T G_3^T G_4^T G_5^T G_6^T
$$

Thus, starting from $Q=I$, we can accumulate the right multiplications by $G_k^T$ inside the loop, e.g. `Q = Q @ G.T`.

In practice there is no need to form the full $G_k$: right-multiplying by $G_k^T$ only changes columns $i$ and $j$ of the running product, so it suffices to right-multiply that $m\times 2$ block of columns by the $2$-by-$2$ matrix $\hat G_k^T = \left[\begin{array}{cc}
\cos \theta_k & \sin \theta_k \\
-\sin \theta_k & \cos \theta_k\end{array}\right]$.
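
The equivalence of the two updates can be checked numerically. The following is a minimal sketch under arbitrary assumptions: the matrix `Q`, the index pair `(i, j)` and the angle `theta` are made up for illustration and do not come from the example above.

```python
import numpy as np

rng = np.random.default_rng(0)
m = 4
Q = rng.standard_normal((m, m))   # stands in for the running product

i, j = 1, 3                       # arbitrary pair of indices, i < j
theta = 0.7                       # arbitrary angle
c, s = np.cos(theta), np.sin(theta)

# Full m-by-m Givens rotation acting in the (i, j) plane
G = np.eye(m)
G[i, i], G[i, j] = c, -s
G[j, i], G[j, j] = s, c

# Its 2-by-2 core
G_hat = np.array([[c, -s], [s, c]])

# Update 1: full matrix product
Q_full = Q @ G.T

# Update 2: touch only columns i and j
Q_cols = Q.copy()
Q_cols[:, [i, j]] = Q_cols[:, [i, j]] @ G_hat.T

print(np.allclose(Q_full, Q_cols))
```

The second update costs $O(m)$ operations per rotation instead of the $O(m^3)$ of a dense matrix product, which is the reason for preferring it.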

5. Apply Givens rotations to the whole lower triangular part

In [1]:
R = A_ini.copy()
Q = np.eye(n)

for i in range(n-1): # the last column has no subdiagonal entries to zero
    for j in range(i+1, n): # sweep every row below the diagonal
        ind = (i, j)
        G_ = givens(R, ind)
        R[ind, :] = G_ @ R[ind, :]  # reuse G_ instead of recomputing it
        Q[:, ind] = Q[:, ind] @ G_.T

print(R)
print(Q.T @ Q)
print(A_ini)
print(Q@R)
print(np.allclose(Q@R, A_ini))

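NameError: name 'A_ini' is not defined

Note: This error most likely comes from running the cell in a fresh kernel (hence `In [1]`). The cell relies on `n`, `A_ini` and `givens` from the earlier cells, so those must be run first.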

6. Improvement

In step 2, we used the following formula for sine and cosine:

$$
c = \frac{e}{\sqrt{e^2 + g^2}}, \qquad s = -\frac{g}{\sqrt{e^2 + g^2}}.
$$



**Issue**: If both $e$ and $g$ are tiny, $e^2$ and $g^2$ can *underflow* to 0 while the denominator is being computed, which leads to a division by zero. Conversely, if $e$ or $g$ is huge, say $\vert e \vert \gg 1$, then $e^2$ can *overflow*.



**Improvement** (Golub, Van Loan (1997) Matrix Computations 3ed. p. 216)






- **if** $g=0$
  - $c=1$ 
  - $s=0$
- **else**
  - **if** $\vert g \vert > \vert e \vert $
    - $\tau=-e / g$
    - $s=1 / \sqrt{1+\tau^2}$
    - $c=s \tau$
  - **else**
    - $\tau=-g / e$ 
    - $c=1 / \sqrt{1+\tau^2}$ 
    - $s=c \tau$
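
The branches above translate almost line by line into Python. The sketch below is one possible transcription, not a reference implementation: the function name `givens_cs` and the test values are assumptions chosen for illustration, and the inputs are plain Python floats.

```python
import numpy as np

def givens_cs(e, g):
    """Return (c, s) such that s*e + c*g = 0 and c^2 + s^2 = 1, up to rounding."""
    if g == 0:
        return 1.0, 0.0
    if abs(g) > abs(e):
        tau = -e / g                      # |tau| < 1, so tau*tau cannot overflow
        s = 1.0 / np.sqrt(1.0 + tau * tau)
        c = s * tau
    else:
        tau = -g / e                      # |tau| <= 1
        c = 1.0 / np.sqrt(1.0 + tau * tau)
        s = c * tau
    return c, s

# Moderate, tiny and huge inputs (illustrative values)
pairs = [(1.0, 2.0), (3e-200, 4e-200), (3e200, 4e200)]
results = []
for e, g in pairs:
    naive_den = np.sqrt(e * e + g * g)    # 0.0 or inf in the extreme cases
    c, s = givens_cs(e, g)
    results.append((e, g, c, s))
    print(f"e={e:g}, g={g:g}: naive denominator={naive_den}, c={c}, s={s}")
```

Note that the signs of $c$ and $s$ can differ from those of the step-2 formula (for $e=1$, $g=2$ both are flipped). This only changes the sign of the resulting diagonal entry; the (2,1) entry is still zeroed.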
