
### 1.3.1 QR Decomposition
*An orthogonal set of unit vectors is called an orthonormal set.* <br/>
*We only consider matrices of real numbers in this assignment.* <br/> <br/>
Any real matrix $\mathbf{A}$ with linearly independent columns
can be decomposed as
$\mathbf{A = QR}$, where:<br>
- $\mathbf{Q}$ has orthonormal columns, obtained by applying the Gram-Schmidt process to the columns of $\mathbf{A}$
- $\mathbf{R} = \mathbf{Q}^T\mathbf{A}$ is upper triangular; its entries are the inner products of the columns of $\mathbf{Q}$ with the columns of $\mathbf{A}$.
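
The construction described above can be sketched directly. This is a minimal classical Gram-Schmidt implementation, not what NumPy does internally; the function name `gram_schmidt_qr` and the small matrix `A_demo` are made up for illustration, and the columns are assumed to be linearly independent.

```python
import numpy as np

def gram_schmidt_qr(A):
    """Classical Gram-Schmidt QR (illustrative sketch).

    Assumes the columns of A are linearly independent.
    """
    A = np.asarray(A, dtype=float)
    m, n = A.shape
    Q = np.zeros((m, n))
    R = np.zeros((n, n))
    for j in range(n):
        v = A[:, j].copy()
        for i in range(j):
            # Component of column j of A along the i-th orthonormal vector
            R[i, j] = Q[:, i] @ A[:, j]
            v -= R[i, j] * Q[:, i]
        # What is left is orthogonal to q_0..q_{j-1}; normalize it
        R[j, j] = np.linalg.norm(v)
        Q[:, j] = v / R[j, j]
    return Q, R

# Hypothetical 3 x 2 example with independent columns
A_demo = np.array([[1.0, 1.0],
                   [1.0, 0.0],
                   [0.0, 1.0]])
Q_demo, R_demo = gram_schmidt_qr(A_demo)

print(np.allclose(Q_demo @ R_demo, A_demo))        # A = QR
print(np.allclose(Q_demo.T @ Q_demo, np.eye(2)))   # columns of Q are orthonormal
```

Gram-Schmidt produces an $\mathbf{R}$ with a positive diagonal, whereas `np.linalg.qr` may return negative diagonal entries, so the two results can differ by the sign of matching columns of $\mathbf{Q}$ and rows of $\mathbf{R}$.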







In [5]:
import numpy as np

# generated from http://www.bluetulip.org/2014/programs/basis.html
A = np.array([
    [2.2222222222222223, 12.555555555555555, 7.947368421052632, 1.3888888888888888],
    [26.166666666666668, 1, 18.4, 12.75],
    [3.923076923076923, 4.9375, 1.2857142857142858, 10.923076923076923],
    [26.6, 4.588235294117647, 16.22222222222222, 13.333333333333334] ])

# NumPy provides a built-in QR factorization.
# In "reduced" mode, an M x N input gives Q of shape M x K and R of shape K x N, with K = min(M, N).
(q, r) = np.linalg.qr(A, mode="reduced")
print("QR Decomposition: ")
print(f"A: \n{A}\n")
print(f"Q: \n{q}\n")
print(f"R: \n{r}\n")



QR Decomposition: 
A: 
[[ 2.22222222 12.55555556  7.94736842  1.38888889]
 [26.16666667  1.         18.4        12.75      ]
 [ 3.92307692  4.9375      1.28571429 10.92307692]
 [26.6         4.58823529 16.22222222 13.33333333]]

Q: 
[[-0.05912627  0.92056623  0.37801267  0.07853869]
 [-0.69621179 -0.19700763  0.47494335 -0.50090513]
 [-0.1043806   0.33029916 -0.68826237 -0.63741828]
 [-0.70774142  0.06817824 -0.39727843  0.58019278]]

R: 
[[-37.58434845  -5.20123829 -24.8955374  -19.53552975]
 [  0.          13.30488275   5.22181153   3.2836433 ]
 [  0.           0.           4.41351581  -6.23444328]
 [  0.           0.           0.          -5.50412396]]
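
As a sanity check on the output above, the defining properties of the factorization can be tested numerically. This is a small sketch that rebuilds the same matrix so it runs on its own; the tolerances are the defaults of `np.allclose`.

```python
import numpy as np

A = np.array([
    [2.2222222222222223, 12.555555555555555, 7.947368421052632, 1.3888888888888888],
    [26.166666666666668, 1, 18.4, 12.75],
    [3.923076923076923, 4.9375, 1.2857142857142858, 10.923076923076923],
    [26.6, 4.588235294117647, 16.22222222222222, 13.333333333333334] ])

q, r = np.linalg.qr(A, mode="reduced")

print(np.allclose(q @ r, A))             # the product reproduces A
print(np.allclose(q.T @ q, np.eye(4)))   # columns of Q are orthonormal
print(np.allclose(r, np.triu(r)))        # R is upper triangular
```

Each check should print `True`.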



### 1.3.2 Least-squares Problems

When a linear system is inconsistent, how can we get the best estimate?
<br> <br>
If the system $\mathbf{Ax = b}$
has no exact solution, we look for the best approximate solution instead. <br><br>
Goal: find the linear combination of the columns of $\mathbf{A}$ that minimizes the residual $\|\mathbf{b} - \mathbf{Ax}\|$.
<br>
We do so by projecting $\mathbf{b}$ onto the column space of $\mathbf{A}$, which gives $\hat{\mathbf{b}}$. <br>
The approximate solution $\hat{\mathbf{x}}$ then satisfies $\mathbf{A}\hat{\mathbf{x}} = \hat{\mathbf{b}}$, and can be found from the normal equation $\mathbf{A}^T\mathbf{A}\hat{\mathbf{x}} = \mathbf{A}^T\mathbf{b}$.
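
For reference, here is a short derivation sketch of the normal equation. It assumes the columns of $\mathbf{A}$ are linearly independent, so that $\mathbf{A}^T\mathbf{A}$ is invertible. The residual of the best approximation is orthogonal to every column of $\mathbf{A}$, which gives:

$$
\mathbf{A}^T(\mathbf{b} - \mathbf{A}\hat{\mathbf{x}}) = \mathbf{0}
\;\Longrightarrow\;
\mathbf{A}^T\mathbf{A}\hat{\mathbf{x}} = \mathbf{A}^T\mathbf{b}
\;\Longrightarrow\;
\hat{\mathbf{x}} = (\mathbf{A}^T\mathbf{A})^{-1}\mathbf{A}^T\mathbf{b}
$$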



In [10]:
A = np.array([
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
    [1, 0, 0]
])

b = np.array([5, 2, 2, 3])
# use NumPy's built-in least-squares solver; the second return value is the sum of squared residuals
(x, residual, _, _) = np.linalg.lstsq(A, b, rcond=None)
print(f"x: {x}")
print(f"residual: {residual}")

x: [4. 2. 2.]
residual: [2.]
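
The same result can be reproduced without `lstsq`. The sketch below solves the normal equation directly and, as an alternative, uses the QR factorization from section 1.3.1, where $\mathbf{A = QR}$ reduces the problem to $\mathbf{R}\hat{\mathbf{x}} = \mathbf{Q}^T\mathbf{b}$. The variable names `x_normal`, `x_qr`, and `rss` are chosen here for illustration.

```python
import numpy as np

A = np.array([
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
    [1, 0, 0]
], dtype=float)
b = np.array([5, 2, 2, 3], dtype=float)

# Normal equation: (A^T A) x = A^T b
x_normal = np.linalg.solve(A.T @ A, A.T @ b)

# QR route: A = QR, so R x = Q^T b
q, r = np.linalg.qr(A, mode="reduced")
x_qr = np.linalg.solve(r, q.T @ b)

# Sum of squared residuals, the quantity lstsq reports as its second return value
rss = np.sum((b - A @ x_normal) ** 2)

print(f"normal equation: {x_normal}")
print(f"QR:              {x_qr}")
print(f"sum of squared residuals: {rss}")
```

Both routes should agree with the `lstsq` solution `[4. 2. 2.]` and a squared residual of 2. In practice the QR route is usually preferred for ill-conditioned problems, since forming $\mathbf{A}^T\mathbf{A}$ squares the condition number.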


### 1.3.3 Linear Regression

The goal is to fit a linear function to a set of data points, so that we can predict the output for new inputs with reasonable accuracy.
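
As a minimal sketch of the idea, the example below fits a line to four noiseless points and then predicts a new value. The data (`x_demo`, `y_demo`) and the query point `x_new` are made up for illustration; they lie exactly on $y = 2x + 1$, so the fit should recover that line up to rounding.

```python
import numpy as np

# Hypothetical data lying exactly on y = 2x + 1
x_demo = np.array([0.0, 1.0, 2.0, 3.0])
y_demo = np.array([1.0, 3.0, 5.0, 7.0])

# Design matrix: a column of ones for the intercept c, then x for the slope m
A_demo = np.column_stack([np.ones_like(x_demo), x_demo])

# Least-squares estimate of (c, m)
(c, m), *_ = np.linalg.lstsq(A_demo, y_demo, rcond=None)

# Use the fitted line to predict the output for an unseen input
x_new = 10.0
y_pred = m * x_new + c
print(f"slope = {m}, intercept = {c}, prediction at x = {x_new}: {y_pred}")
```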

In [25]:
'''
To demonstrate, let's generate an array of (x, y) pairs,
where y = x + (random number between 0 and 0.25)
'''
x = np.empty(50)
y = np.empty(50)

for i in range(50):
  x[i] = i
  y[i] = i + np.random.uniform(0, 0.25)

# model: y = m*x + c
# solve A @ [c, m] ≈ y in the least-squares sense to estimate c and m
A = np.empty([50, 2])
for i in range(50):
  A[i][0] = 1
  A[i][1] = x[i]

(b, residual, _, _) = np.linalg.lstsq(A, y, rcond=None)
print(f"Estimated: y = {b[1]}x + {b[0]}")
print(f"A: \n{A}\n")
print(f"y: \n{y}\n")


Estimated: y = 1.000362737260959x + 0.11170649104572278
A: 
[[ 1.  0.]
 [ 1.  1.]
 [ 1.  2.]
 [ 1.  3.]
 [ 1.  4.]
 [ 1.  5.]
 [ 1.  6.]
 [ 1.  7.]
 [ 1.  8.]
 [ 1.  9.]
 [ 1. 10.]
 [ 1. 11.]
 [ 1. 12.]
 [ 1. 13.]
 [ 1. 14.]
 [ 1. 15.]
 [ 1. 16.]
 [ 1. 17.]
 [ 1. 18.]
 [ 1. 19.]
 [ 1. 20.]
 [ 1. 21.]
 [ 1. 22.]
 [ 1. 23.]
 [ 1. 24.]
 [ 1. 25.]
 [ 1. 26.]
 [ 1. 27.]
 [ 1. 28.]
 [ 1. 29.]
 [ 1. 30.]
 [ 1. 31.]
 [ 1. 32.]
 [ 1. 33.]
 [ 1. 34.]
 [ 1. 35.]
 [ 1. 36.]
 [ 1. 37.]
 [ 1. 38.]
 [ 1. 39.]
 [ 1. 40.]
 [ 1. 41.]
 [ 1. 42.]
 [ 1. 43.]
 [ 1. 44.]
 [ 1. 45.]
 [ 1. 46.]
 [ 1. 47.]
 [ 1. 48.]
 [ 1. 49.]]

y: 
[ 0.07650222  1.03360941  2.12852004  3.16438319  4.21817588  5.21600181
  6.04365121  7.14923635  8.01616519  9.03134593 10.09953431 11.05822086
 12.15442852 13.06124412 14.24863741 15.1031499  16.10099411 17.22962688
 18.24931158 19.00124462 20.22285796 21.1444963  22.01663275 23.16497307
 24.01846261 25.06745412 26.11264432 27.0256641  28.19287764 29.15654569
 30.22156514 31.021