## Question 1
Given
\begin{equation}
A =
\begin{bmatrix}
| & |\\
a_1 & a_2 \\
| & |
\end{bmatrix}
\end{equation}

To obtain $A=QR$ with the Gram-Schmidt process, we proceed as follows:

\begin{equation}
\hat{q}_1 = a_1 \implies q_1 = \frac{1}{\|\hat{q}_1\|}\hat{q}_1
\end{equation}

\begin{equation}
\hat{q}_2 = a_2 - \langle q_1, a_2 \rangle q_1 \implies q_2 = \frac{1}{\|\hat{q}_2\|}\hat{q}_2
\end{equation}

\begin{equation}
Q =
\begin{bmatrix}
| & |\\
q_1 & q_2 \\
| & |
\end{bmatrix}
\end{equation}

\begin{equation}
R = 
\begin{bmatrix}
\|a_1\| & \langle q_1, a_2 \rangle\\
0 & \|\hat{q}_2\|
\end{bmatrix}
\end{equation}
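
The two-column recipe above extends to any number of linearly independent columns. The following is a minimal sketch of classical Gram-Schmidt under that assumption; the function name `gram_schmidt_qr` is illustrative and not part of the original notebook.

```python
import numpy as np

def gram_schmidt_qr(A):
    """Classical Gram-Schmidt QR; assumes the columns of A are linearly independent."""
    A = np.asarray(A, dtype=float)
    m, n = A.shape
    Q = np.zeros((m, n))
    R = np.zeros((n, n))
    for j in range(n):
        q_hat = A[:, j].copy()               # start from the j-th column a_j
        for i in range(j):
            R[i, j] = Q[:, i] @ A[:, j]      # <q_i, a_j>
            q_hat -= R[i, j] * Q[:, i]       # remove the component along q_i
        R[j, j] = np.linalg.norm(q_hat)      # ||q_hat_j|| goes on the diagonal
        Q[:, j] = q_hat / R[j, j]            # normalize to get q_j
    return Q, R

A = np.array([[1, 5],
              [2, 3],
              [4, 3]])
Q, R = gram_schmidt_qr(A)
print(np.allclose(Q @ R, A))             # True
print(np.allclose(Q.T @ Q, np.eye(2)))   # True
```

Classical Gram-Schmidt can lose orthogonality on ill-conditioned inputs, so this is meant as an illustration of the formulas rather than a robust routine.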

In [1]:
import numpy as np

In [2]:
A = np.array([[1,5],
             [2,3],
             [4,3]])

In [3]:
a1 = A[:,0].reshape(3,1)
a2 = A[:,1].reshape(3,1)
q1_hat = a1                          # unnormalized first direction
q1 = q1_hat/np.linalg.norm(q1_hat)   # normalize to unit length (shape stays (3,1))
print(q1)

[[0.21821789]
 [0.43643578]
 [0.87287156]]


In [4]:
q1_dot_a2 = (q1.T@a2).item()         # scalar <q1, a2>
q2_hat = a2 - q1_dot_a2*q1           # remove the component of a2 along q1
q2 = q2_hat/np.linalg.norm(q2_hat)   # normalize to unit length
print(q2)

[[ 0.92526984]
 [ 0.19182423]
 [-0.32722958]]


In [5]:
Q = np.concatenate([q1,q2],axis=1)
print(Q)

[[ 0.21821789  0.92526984]
 [ 0.43643578  0.19182423]
 [ 0.87287156 -0.32722958]]


In [6]:
# We see that Q has orthonormal columns, so Q^T.Q = I (identity)
QT_Q = Q.T@Q
print(np.round(QT_Q,4))

[[ 1. -0.]
 [-0.  1.]]


In [7]:
# We build R from the formula above: [[||a1||, <q1,a2>], [0, ||q2_hat||]]
R = np.array([[np.linalg.norm(q1_hat), q1_dot_a2],
             [0, np.linalg.norm(q2_hat)]])
print(R)

[[4.58257569 5.01901148]
 [0.         4.22013315]]


In [8]:
# We compute QR and get the original matrix A
QR = Q@R
print(QR)

[[1. 5.]
 [2. 3.]
 [4. 3.]]


In [9]:
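# Computing QR using the built-in numpy method.
# In the output below, the first column of Q and the first row of R have the
# opposite sign to ours; the two sign flips cancel, so the product is the same.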
Q_np , R_np = np.linalg.qr(A)

print("Q=\n",Q_np)
print()
print("R=\n",R_np)

Q=
 [[-0.21821789  0.92526984]
 [-0.43643578  0.19182423]
 [-0.87287156 -0.32722958]]

R=
 [[-4.58257569 -5.01901148]
 [ 0.          4.22013315]]


In [10]:
Q_np @ R_np

array([[1., 5.],
       [2., 3.],
       [4., 3.]])
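
The hand-computed and NumPy factorizations above differ only in sign. One way to compare two QR factorizations directly is to flip signs so that the diagonal of $R$ is non-negative. The helper below is a sketch of that idea; the name `normalize_qr_signs` is hypothetical and not a NumPy function.

```python
import numpy as np

def normalize_qr_signs(Q, R):
    """Flip column signs of Q and row signs of R so that diag(R) >= 0."""
    signs = np.sign(np.diag(R))
    signs[signs == 0] = 1.0            # leave zero diagonal entries alone
    # (Q S)(S R) = Q R because S is diagonal with entries +1 or -1, so S S = I
    return Q * signs, signs[:, None] * R

A = np.array([[1, 5],
              [2, 3],
              [4, 3]], dtype=float)
Q_np, R_np = np.linalg.qr(A)
Q_pos, R_pos = normalize_qr_signs(Q_np, R_np)

print(np.round(R_pos, 4))
print(np.allclose(Q_pos @ R_pos, A))   # True
```

After this normalization, `R_pos` should agree with the `R` obtained from Gram-Schmidt, since that construction puts norms on the diagonal.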

## Question 3

We compute $Q$, $R$ and $H_i$ such that

\begin{equation}
Q = H_1 H_2 \cdots H_n \, , \quad \text{and} \quad R^{(k)} = H_k R^{(k-1)}
\end{equation}

and

\begin{equation}
H_i = I - 2v_iv_i^T
\end{equation}

Here $v_i$ must be a unit vector. It is computed using

\begin{equation}
\hat{v}_i = \|u_i\|e_i - u_i \, \implies v_i = \frac{1}{\|\hat{v}_i\|}\hat{v}_i
\end{equation}

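$v_i$ is sometimes computed as
\begin{equation}
\hat{v}_i = u_i + \mathrm{sign}(u_i^{(i)}) \|u_i\|e_i \, \implies v_i = \frac{1}{\|\hat{v}_i\|}\hat{v}_i
\end{equation}

where $\mathrm{sign}(u_i^{(i)})$ is the sign of the $i^{\text{th}}$ entry of the vector $u_i$, and $e_i = [0,0,\cdots,0,1,0,\cdots,0,0]^T$ has a $1$ only at index $i$. Both choices give a valid factorization $A = QR$; they can differ in the sign of a row of $R$ and of the corresponding column of $Q$, which leaves the product $QR$ unchanged (you can check this for yourself). The code cells in this section use the sign-matched form, which avoids cancellation when $u_i$ is nearly parallel to $e_i$.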

Also, after the first iteration, for subsequent iterations $i$, we set $u_i^{(1)}, u_i^{(2)}, \cdots , u_{i}^{(i-1)} = 0$. Note that the superscript is the index within the vector. These zeros are set in a copy of the column, so that $R^{(i-1)}$ itself is left unchanged.

Finally, $u_i$ is the $i^{\text{th}}$ column of $R^{(i-1)}$, and for the first iteration, $R^{(0)} = A$.
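
The procedure described above can be collected into a single loop. The following is a minimal sketch, assuming the sign-matched form of $\hat{v}_i$ and taking $u_i$ from the current $R^{(i-1)}$; the function name `householder_qr` is illustrative, and the explicit `.copy()` is there so that zeroing the leading entries of $u_i$ does not modify $R$.

```python
import numpy as np

def householder_qr(A):
    """Householder QR sketch: returns Q (m x m) and R (m x n) with A = Q R."""
    A = np.asarray(A, dtype=float)
    m, n = A.shape
    Q = np.eye(m)
    R = A.copy()                           # R^(0) = A
    for i in range(min(m - 1, n)):
        u = R[:, i].copy()                 # copy: a plain slice would be a view into R
        u[:i] = 0.0                        # zero the entries above the diagonal
        norm_u = np.linalg.norm(u)
        if norm_u == 0.0:
            continue                       # nothing to eliminate in this column
        sign = 1.0 if u[i] >= 0 else -1.0  # sign of the i-th entry of u
        v_hat = u.copy()
        v_hat[i] += sign * norm_u          # v_hat = u + sign * ||u|| * e_i
        v = v_hat / np.linalg.norm(v_hat)  # unit vector v_i
        H = np.eye(m) - 2.0 * np.outer(v, v)
        R = H @ R                          # R^(k) = H_k R^(k-1)
        Q = Q @ H                          # Q = H_1 H_2 ... H_k
    return Q, R

A = np.array([[2, 3, 4],
              [1, 1, 1],
              [4, 3, 2]])
Q, R = householder_qr(A)
print(np.allclose(Q @ R, A))               # True
print(np.allclose(np.tril(R, -1), 0.0))    # True: R is upper triangular
```

Forming each $H_i$ as a full matrix keeps the sketch close to the formulas; library implementations typically apply the reflection without building $H_i$ explicitly.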

In [11]:
A = np.array([[2,3,4],
             [1,1,1],
             [4,3,2]])

In [12]:
u1 = A[:,0].reshape(3,1)
print(u1)

[[2]
 [1]
 [4]]


In [13]:
e1 = np.array([1,0,0]).reshape(3,1)
v1_hat = u1 + (1)*np.linalg.norm(u1)*e1   # sign(u1[0]) = +1 since u1[0] = 2 > 0
print(v1_hat)

[[6.58257569]
 [1.        ]
 [4.        ]]


In [14]:
v1 = v1_hat/np.linalg.norm(v1_hat)
print(v1)

[[0.84747737]
 [0.12874556]
 [0.51498222]]


In [15]:
H1 = np.eye(3) - 2*v1@v1.T
print(H1)

[[-0.43643578 -0.21821789 -0.87287156]
 [-0.21821789  0.96684916 -0.13260335]
 [-0.87287156 -0.13260335  0.46958662]]


In [16]:
# We see that H1 is orthogonal, so H1.H1^T = I (identity)
H1_H1T = H1@H1.T
print(np.round(H1_H1T,4))

[[ 1.  0. -0.]
 [ 0.  1.  0.]
 [-0.  0.  1.]]


In [17]:
R0 = A
R1 = H1@R0
print(np.round(R1,4))

[[-4.5826 -4.1461 -3.7097]
 [-0.     -0.0856 -0.1712]
 [ 0.     -1.3425 -2.6849]]


In [18]:
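# Caution: R1[:,1] is a view into R1, so the assignment u2[0] = 0 below
# also sets R1[0,1] to 0. R1[:,1].copy() would leave R1 unchanged.
u2 = R1[:,1]
u2[0] = 0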
u2 = u2.reshape(3,1)
print(u2)

[[ 0.        ]
 [-0.08561454]
 [-1.34245818]]


In [19]:
e2 = np.array([0,1,0]).reshape(3,1)
v2_hat = u2 + (-1)*np.linalg.norm(u2)*e2   # sign(u2[1]) = -1 since u2[1] < 0
print(v2_hat)

[[ 0.        ]
 [-1.43079996]
 [-1.34245818]]


In [20]:
v2 = v2_hat/np.linalg.norm(v2_hat)
print(v2)

[[ 0.        ]
 [-0.72926167]
 [-0.68423491]]


In [21]:
H2 = np.eye(3) - 2*v2@v2.T
print(H2)

[[ 1.          0.          0.        ]
 [ 0.         -0.06364516 -0.99797259]
 [ 0.         -0.99797259  0.06364516]]


In [22]:
# We see that H2 is orthogonal, so H2.H2^T = I (identity)
H2_H2T = H2@H2.T
print(np.round(H2_H2T,4))

[[1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 1.]]


In [23]:
R2 = H2@R1
print(np.round(R2,4))

[[-4.5826  0.     -3.7097]
 [ 0.      1.3452  2.6904]
 [ 0.      0.     -0.    ]]


In [24]:
# Now we have that Q = H1.H2

Q = H1 @ H2
print(Q)

[[-0.43643578  0.88499041  0.16222142]
 [-0.21821789  0.07079923 -0.97332853]
 [-0.87287156 -0.46019501  0.16222142]]


In [25]:
# We check that Q is orthogonal: Q.Q^T = I (identity)
Q_QT = Q@Q.T
print(np.round(Q_QT,4))

[[ 1. -0. -0.]
 [-0.  1.  0.]
 [-0.  0.  1.]]


In [26]:
# Since we now have R^(2) as an upper triangular matrix, we stop the process
# And R = R2

R = R2

print(np.round(R,4))

[[-4.5826  0.     -3.7097]
 [ 0.      1.3452  2.6904]
 [ 0.      0.     -0.    ]]


In [27]:
# Finally we check that A = QR

print(Q@R)
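# Column 2 does not match A because R[0,1] is 0 instead of the -4.1461 seen
# in R1 after In [17]: the assignment u2[0] = 0 in In [18] acted on a view of R1.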

[[ 2.          1.19047619  4.        ]
 [ 1.          0.0952381   1.        ]
 [ 4.         -0.61904762  2.        ]]
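
The mismatch in the second column of `Q@R` is consistent with NumPy's view semantics: basic slicing such as `R1[:,1]` returns a view, so writing to it also writes to `R1`. The sketch below illustrates this with the rounded `R1` values printed earlier (hard-coded here as an assumption, so it runs on its own); `R1_aliased`, `u_view` and `u_copy` are illustrative names.

```python
import numpy as np

# Rounded values of R1 taken from the printed output above
R1 = np.array([[-4.5826, -4.1461, -3.7097],
               [ 0.0,    -0.0856, -0.1712],
               [ 0.0,    -1.3425, -2.6849]])

# Slicing gives a view: the write goes through to the parent array
R1_aliased = R1.copy()
u_view = R1_aliased[:, 1]
print(np.shares_memory(u_view, R1_aliased))   # True
u_view[0] = 0
print(R1_aliased[0, 1])                       # 0.0 (entry was overwritten)

# An explicit copy keeps R1 intact
u_copy = R1[:, 1].copy()
u_copy[0] = 0
print(R1[0, 1])                               # -4.1461 (unchanged)
```

With the copy, the first row of $R$ keeps its entry $-4.1461$, which matches `R_np` from `np.linalg.qr`.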


In [28]:
# Compute QR using numpy built-in function 
Q_np, R_np = np.linalg.qr(A)

In [29]:
print("Q_np=\n",np.round(Q_np,4))

Q_np=
 [[-0.4364  0.885   0.1622]
 [-0.2182  0.0708 -0.9733]
 [-0.8729 -0.4602  0.1622]]


In [30]:
print("Q=\n",np.round(Q,4))

Q=
 [[-0.4364  0.885   0.1622]
 [-0.2182  0.0708 -0.9733]
 [-0.8729 -0.4602  0.1622]]


In [31]:
print("R_np=\n",np.round(R_np,4))

R_np=
 [[-4.5826 -4.1461 -3.7097]
 [ 0.      1.3452  2.6904]
 [ 0.      0.      0.    ]]


In [32]:
print("R=\n",np.round(R,4))

R=
 [[-4.5826  0.     -3.7097]
 [ 0.      1.3452  2.6904]
 [ 0.      0.     -0.    ]]


In [33]:
Q_np@R_np

array([[2., 3., 4.],
       [1., 1., 1.],
       [4., 3., 2.]])