# AM120 HW05
## Zachary Miller

### 2a

In [22]:
import numpy as np

In [23]:
A = np.array([[-4,-1,-3],[1,-4,5],[3,4,3],[-5,-1,2]]);
b = np.array([-1,4,-4,-2]).reshape(4,1)

The system $A\vec{x}=\vec{b}$ with $A$ and $\vec{b}$ as given above has no exact solution. Instead, we look for the $\vec{x}$ that minimizes the squared norm of the error $\vec{e}=A\vec{x}-\vec{b}$, which is given by $\vec{e}^T\vec{e}$. To do this, we take the derivative of $\vec{e}^T\vec{e}$ with respect to $\vec{x}$ and set it to zero. After some calculus, this comes out to be

$$2(A^TA\vec{x}-A^T\vec{b})=0$$ $$A^TA\vec{x}=A^T\vec{b}$$ $$\vec{x}=(A^TA)^{-1}A^T\vec{b}$$ 
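
For reference, the "some calculus" step can be filled in as follows. This is a sketch of the standard expansion, using the fact that $\vec{x}^TA^T\vec{b}$ is a scalar and therefore equals $\vec{b}^TA\vec{x}$, and that $A^TA$ is symmetric:

$$\begin{aligned}\vec{e}^T\vec{e} &= (A\vec{x}-\vec{b})^T(A\vec{x}-\vec{b}) \\ &= \vec{x}^TA^TA\vec{x} - 2\vec{b}^TA\vec{x} + \vec{b}^T\vec{b}\end{aligned}$$

$$\frac{\partial}{\partial \vec{x}}\left(\vec{e}^T\vec{e}\right) = 2A^TA\vec{x} - 2A^T\vec{b} = 2(A^TA\vec{x}-A^T\vec{b})$$

The final step, solving for $\vec{x}$, assumes that $A^TA$ is invertible, which holds when the columns of $A$ are linearly independent.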

Calculating below...

In [24]:
x = np.linalg.inv(A.T@A)@A.T@b
print("x:")
print(x)

x:
[[ 0.58732799]
 [-1.19370936]
 [-0.22879177]]
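
As an aside, the same normal equations can be solved without forming $(A^TA)^{-1}$ explicitly. The sketch below is an alternative to the cell above, not a replacement for it; the names `x_solve` and `x_inv` are introduced here only for illustration.

```python
import numpy as np

A = np.array([[-4, -1, -3], [1, -4, 5], [3, 4, 3], [-5, -1, 2]], dtype=float)
b = np.array([-1, 4, -4, -2], dtype=float).reshape(4, 1)

# Solve the normal equations (A^T A) x = A^T b as a linear system,
# rather than computing the inverse of A^T A and multiplying.
x_solve = np.linalg.solve(A.T @ A, A.T @ b)

# The explicit-inverse formula used above, for comparison.
x_inv = np.linalg.inv(A.T @ A) @ A.T @ b

# Check that the two approaches agree to floating-point tolerance.
print(np.allclose(x_solve, x_inv))
```

Solving the linear system is generally preferred over computing an explicit inverse, although for a well-conditioned 3x3 matrix like this one the two should agree closely.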


### 2b

Next we are asked to calculate the residual $\vec{r}=A\vec{x}-\vec{b}$. Calculating below...

In [25]:
r = A@x-b
print("r:")
print(r)

r:
[[ 0.53077272]
 [ 0.21820656]
 [ 0.30077121]
 [-0.20051414]]


The residual is non-zero. This makes sense because the system we were trying to solve has more equations than unknowns, and the equations are inconsistent. As a result, there is no exact solution, which is why we found the least squares solution instead.
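
One way to sanity-check a least squares residual is that it should be orthogonal to every column of $A$, since the normal equations $A^TA\vec{x}=A^T\vec{b}$ are equivalent to $A^T\vec{r}=0$. A minimal sketch of that check, recomputing $\vec{x}$ so the snippet is self-contained (the name `ortho` is introduced here for illustration):

```python
import numpy as np

A = np.array([[-4, -1, -3], [1, -4, 5], [3, 4, 3], [-5, -1, 2]], dtype=float)
b = np.array([-1, 4, -4, -2], dtype=float).reshape(4, 1)

# Least squares solution and its residual.
x, *_ = np.linalg.lstsq(A, b, rcond=None)
r = A @ x - b

# A^T r should vanish up to round-off, even though r itself does not.
ortho = A.T @ r
print(ortho)
print(np.allclose(ortho, 0.0, atol=1e-10))
```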

### 2c

In [26]:
r_norm = np.linalg.norm(r)
b_norm = np.linalg.norm(b)
print("r_norm = ", r_norm)
print("b_norm = ", b_norm)

r_norm =  0.6782352278863029
b_norm =  6.082762530298219


Comparing the two, the norm of the residual is approximately one tenth of the norm of $\vec{b}$ (a ratio of about 0.11). Whether this residual is big or small depends on the specific application, but for most applications it would probably be considered relatively small. In general, it is a good idea to compare the norm of the residual with the norm of $\vec{b}$, since the ratio of the two tells you something about the relative "fit" of the least squares solution. For example, if $||\vec{b}||$ is very small, then a residual that is also small in absolute terms but of the same order as $||\vec{b}||$ would likely not be considered a good fit, because it is large relative to the data being fitted.
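
The point about relative size can be made concrete with the ratio $||\vec{r}||/||\vec{b}||$. The sketch below computes it for this problem and then for a hypothetical rescaled right-hand side $10^{-3}\,\vec{b}$ (the scale factor is an assumption chosen purely for illustration): the residual shrinks by the same factor, so the ratio, and hence the quality of the fit, is unchanged.

```python
import numpy as np

A = np.array([[-4, -1, -3], [1, -4, 5], [3, 4, 3], [-5, -1, 2]], dtype=float)
b = np.array([-1, 4, -4, -2], dtype=float).reshape(4, 1)

def relative_residual(A, b):
    """Return ||A x - b|| / ||b|| for the least squares solution x."""
    x, *_ = np.linalg.lstsq(A, b, rcond=None)
    return np.linalg.norm(A @ x - b) / np.linalg.norm(b)

rel = relative_residual(A, b)

# Hypothetical rescaled problem: the absolute residual is 1000x smaller,
# but the relative residual stays the same.
rel_small = relative_residual(A, 1e-3 * b)

print("relative residual:", rel)
print("relative residual, rescaled b:", rel_small)
```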

### 2d

In [31]:
py_x = np.linalg.lstsq(A,b,rcond=None)[0]

In [32]:
print("Python's solution: \n", py_x)
print("My solution: \n", x)

Python's solution: 
 [[ 0.58732799]
 [-1.19370936]
 [-0.22879177]]
My solution: 
 [[ 0.58732799]
 [-1.19370936]
 [-0.22879177]]


As you can see, NumPy's solution and my solution agree to the printed precision.
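
Rather than comparing the printed digits by eye, the agreement can also be checked numerically. The sketch below uses `np.allclose` and looks at the other values returned by `np.linalg.lstsq` (the sum of squared residuals, the rank, and the singular values of $A$); the variable names are chosen here for illustration.

```python
import numpy as np

A = np.array([[-4, -1, -3], [1, -4, 5], [3, 4, 3], [-5, -1, 2]], dtype=float)
b = np.array([-1, 4, -4, -2], dtype=float).reshape(4, 1)

# Solution from the normal-equations formula used in 2a.
x_ne = np.linalg.inv(A.T @ A) @ A.T @ b

# lstsq returns the solution, the sum of squared residuals,
# the rank of A, and the singular values of A.
x_ls, res, rank, sv = np.linalg.lstsq(A, b, rcond=None)

print(np.allclose(x_ne, x_ls))  # do the two solutions agree?
print(rank)                     # rank of A
print(np.sqrt(res))             # residual norm, comparable to r_norm in 2c
```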

### 3a

In [113]:
import numpy as np
import scipy.linalg

In [100]:
A = np.array([[-0.9, 0.7, -0.1],[0.6, 0.3, -0.1]])
b = np.array([0.6, 0.8])

We are given the under-determined system above and told to solve it using the pseudo inverse of $A$, denoted $A^{\dagger}$. We can do this by first computing $U$, $\Sigma$, and $V^{T}$ from the standard SVD $A = U\Sigma V^T$. We then take the pseudo inverse of $\Sigma$, defined as the transpose of $\Sigma$ with each non-zero diagonal element replaced by its reciprocal, to obtain $\Sigma^{\dagger}$. Finally, we obtain $A^{\dagger}=V \Sigma^{\dagger} U^T$. Doing this below...

In [112]:
# Get the SVD of A
U, Sigma, V_t = np.linalg.svd(A, full_matrices=True)

# Make the Sigma array from the list of singular values
Sigma_diag = np.diag(Sigma)
Sigma_mat = np.zeros(A.shape)
Sigma_mat[:Sigma_diag.shape[0],:Sigma_diag.shape[1]] = Sigma_diag

# Calculate the pseudo inverse of Sigma_mat
Sigma_mat_pinv = np.copy(Sigma_mat)
Sigma_mat_pinv[Sigma_mat_pinv != 0] = 1/Sigma_mat_pinv[Sigma_mat_pinv != 0]
Sigma_mat_pinv = Sigma_mat_pinv.T

# Calculate the pseudo inverse of A using the formula above
A_pinv = V_t.T@Sigma_mat_pinv@U.T

print("Pseudo Inverse of A:\n", A_pinv)

Pseudo Inverse of A:
 [[-0.44382247  0.99560176]
 [ 0.83566573  1.2335066 ]
 [-0.15593762 -0.32586965]]
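
A more compact variant of the same construction uses the reduced SVD (`full_matrices=False`), so that $\Sigma$ is square and no zero padding is needed. This is a sketch under the assumption that singular values below a small tolerance should be treated as zero; the tolerance formula is a common convention chosen here, not something specified by the problem.

```python
import numpy as np

A = np.array([[-0.9, 0.7, -0.1], [0.6, 0.3, -0.1]])

# Reduced SVD: U is 2x2, s holds the 2 singular values, Vt is 2x3.
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Invert only the singular values that are safely above round-off.
tol = max(A.shape) * np.finfo(float).eps * s.max()
s_inv = np.zeros_like(s)
s_inv[s > tol] = 1.0 / s[s > tol]

# A^+ = V diag(1/s) U^T
A_pinv = Vt.T @ np.diag(s_inv) @ U.T

print(A_pinv)
print(np.allclose(A_pinv, np.linalg.pinv(A)))
```

Using a tolerance rather than an exact `!= 0` comparison is more robust when a singular value is zero in exact arithmetic but comes out as a tiny non-zero number in floating point.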


Now that we have $A^{\dagger}$, we can use it to find $x_1$ by calculating $x_1 = A^{\dagger}\vec{b}$. Doing this below and comparing to two other ways of solving this problem in Python...

In [125]:
# Calculate the equation above
x_1 = A_pinv@b

# Solve using two other python functions
x_2 = scipy.linalg.pinv(A)@b
x_3 = np.linalg.lstsq(A,b,rcond=None)[0]

print("My solution: \n", x_1)
print("\nSolution using scipy.linalg.pinv: \n", x_2)
print("\nSolution using np.linalg.lstsq: \n", x_3)

My solution: 
 [ 0.53018792  1.48820472 -0.3542583 ]

Solution using scipy.linalg.pinv: 
 [ 0.53018792  1.48820472 -0.3542583 ]

Solution using np.linalg.lstsq: 
 [ 0.53018792  1.48820472 -0.3542583 ]


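As you can see, my answer is the same as the one obtained using `scipy.linalg.pinv` or `np.linalg.lstsq`. `scipy.linalg.pinv` is the SciPy function for calculating the pseudo inverse, and `np.linalg.lstsq` is the NumPy function for finding the least squares solution to a system of linear equations. Because this system is under-determined, it has infinitely many exact solutions; the pseudo inverse picks out the one with the smallest norm, and `np.linalg.lstsq` likewise returns the minimum-norm solution when several minimizers exist, which is why the three answers agree.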
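
To illustrate what distinguishes the pseudo inverse solution from the other exact solutions of this under-determined system, the sketch below builds a second solution by adding a null-space vector of $A$ and compares the norms. The names `x_min`, `n`, and `x_other`, and the step size `0.5`, are arbitrary choices made for illustration.

```python
import numpy as np

A = np.array([[-0.9, 0.7, -0.1], [0.6, 0.3, -0.1]])
b = np.array([0.6, 0.8])

# Pseudo inverse solution.
x_min = np.linalg.pinv(A) @ b

# A has rank 2 and 3 columns, so the last row of V^T from the full SVD
# spans the null space of A.
_, _, Vt = np.linalg.svd(A, full_matrices=True)
n = Vt[-1]

# Adding any multiple of n gives another exact solution.
x_other = x_min + 0.5 * n

print(np.allclose(A @ x_min, b))    # x_min solves the system
print(np.allclose(A @ x_other, b))  # so does x_other
print(np.linalg.norm(x_min), np.linalg.norm(x_other))
```

Since `x_min` is orthogonal to the null space, any such `x_other` has a strictly larger norm than `x_min`.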