In today's class, you solved the exact same system of equations using LU as you did last week using row-reduction of an augmented matrix and back substitution. It certainly seems you had to do more work this time when you already had a way of solving equations. What is the advantage of using LU decomposition? This homework will explore this question.

> ## Make a copy of this notebook (File menu -> Make a Copy...)

Suppose that you are doing the same experiment on a number of different samples. You measure your outputs at the same time points, but get different results each time, depending on your sample. You want to fit polynomials to each of your data sets. As we saw when we fitted polynomials to data, this will involve solving an equation $Ax=b$ for different $b$'s, but always the same $A$.

Consider the following table, showing the results of three such experiments:

$t$ | 1 | 2 | 3 | 4 | 5 | 6 | 7
--- | :---: | :---: | :---: |:---: |:---: |:---: |:---: |
$y_1$ | 10 | 15 | -1 | 2 | -4 | 5 | 10
$y_2$ | 10 | 13 | 0 | 2 | -3 | 5 | 11
$y_3$ | 11 | 14 | -1 | 3 | -5 | 4 | 9

We will fit polynomials to each of these and compare using row-reduction and back-substitution to LU decomposition. To do so, we'll need to do the following:
* Understand the role of pivoting.<br><br>
* Solve the equations using row-reduction and back-substitution.<br><br>
* Lastly, compare this to solving them using LU decomposition, followed by forward- and back-substitution.

## Pivoting

Recall that LU decomposition with pivoting takes a matrix $A$ and returns matrices $P$, $L$, and $U$ so that $$PA=LU$$

We are trying to solve $Ax=v$. If we have a matrix $P$, then we can multiply both sides by it to get $$PAx=Pv$$ But $PA=LU$, so this is equivalent to $$LUx=Pv.$$

So all we need to do is multiply our $v$ by $P$ before we begin the forward- and back-substitutions! Remember that each row in an augmented matrix represents one of the equations in the system. So all we are really doing here is swapping around the equations.
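
As a quick check of this idea, here is a minimal sketch showing that swapping the equations with a permutation matrix $P$ does not change the solution. The small matrix, the vector, and the permutation are made up for illustration and are not part of the assignment.

```python
import numpy as np

# Hypothetical 3x3 system, chosen only for illustration.
A = np.array([[2.0, 1.0, 1.0],
              [4.0, 3.0, 3.0],
              [8.0, 7.0, 9.0]])
v = np.array([1.0, 2.0, 3.0])

# A permutation matrix is the identity with its rows reordered.
# Here row 0 of P @ A is row 2 of A, row 1 is row 0, and row 2 is row 1.
P = np.eye(3)[[2, 0, 1]]

x_original = np.linalg.solve(A, v)          # solves Ax = v
x_permuted = np.linalg.solve(P @ A, P @ v)  # solves (PA)x = Pv

print(P @ A)
print(np.allclose(x_original, x_permuted))
```

Multiplying by $P$ only reorders the rows, so both systems contain the same equations and have the same solution.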

### Homework Question 1 

Use your row-reduction code and your `backsub(U,v)` function to find the coefficients of a sixth degree polynomial that fits each of the above data sets. In each case, this will involve solving $Ax=b$. 
1. Explain why the matrix $A$ is the same in each case. What is it?<br><br>
1. Write down the sixth degree polynomials in each case. Write the coefficients of each power to two decimal places.

In [1]:
import numpy as np
from Qiureferencefunctions import backsub, fwdsub, LU
def LUSolve(L,U,P,v):
    v = P@v
    y = fwdsub(L,v)
    x = backsub(U,y)
    return x

t = np.array([1.,2,3,4,5,6,7])
y1 = np.array([10,15,-1,2,-4,5,10])
y2 = np.array([10,13,0,2,-3,5,11])
y3 = np.array([11.,14,-1,3,-5,4,9])
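rows = t.shape[0]
# Build the matrix of powers of t: column i holds t**i.
A = np.zeros((rows,rows))
for i in range(0,rows):
    A[:,i] = t**i
# Flip left-to-right so the columns run from t**6 down to t**0.
A = np.fliplr(A)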
#np.set_printoptions(precision=10)
np.set_printoptions(suppress=True)
print(A)

U,L,P = LU(A)
print(LUSolve(L,U,P,y1))
print(LUSolve(L,U,P,y2))
print(LUSolve(L,U,P,y3))

[[     1.      1.      1.      1.      1.      1.      1.]
 [    64.     32.     16.      8.      4.      2.      1.]
 [   729.    243.     81.     27.      9.      3.      1.]
 [  4096.   1024.    256.     64.     16.      4.      1.]
 [ 15625.   3125.    625.    125.     25.      5.      1.]
 [ 46656.   7776.   1296.    216.     36.      6.      1.]
 [117649.  16807.   2401.    343.     49.      7.      1.]]
[  -0.29861111    7.27083333  -70.09027778  339.47916667 -859.61111111
 1052.25       -459.        ]
[  -0.23888889    5.80833333  -55.88888889  270.125      -682.37222222
  832.56666667 -360.        ]
[  -0.33055556    8.00833333  -76.68055556  368.125      -921.98888889
 1114.86666667 -481.        ]


The matrix $A$ is the same in every case because it depends only on the measurement times $t = 1, 2, \dots, 7$, which are shared by all three experiments, and not on the measured outputs. Fitting a sixth-degree polynomial $c_6x^6 + c_5x^5 + \dots + c_1x + c_0$ through the seven data points, with the times $t_i$ playing the role of $x$, requires the polynomial to equal $y_i$ at each $t_i$. Row $i$ of $A$ is therefore $[t_i^6, t_i^5, \dots, t_i, 1]$, and the unknown vector holds the coefficients. Only the right-hand side $b$ changes between $y_1$, $y_2$, and $y_3$. Here, $A$ is the following matrix.

$$\begin{bmatrix} 
1&1&1&1&1&1&1\\
64&32&16&8&4&2&1\\
729&243&81&27&9&3&1\\
4096&1024&256&64&16&4&1\\
15625&3125&625&125&25&5&1\\
46656&7776&1296&216&36&6&1\\
117649&16807&2401&343&49&7&1\end{bmatrix}$$
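
The same matrix can also be built in one call. This is a sketch using NumPy's `np.vander`, which by default orders the columns from the highest power down to the constant term. It is only an alternative to the explicit loop, not the method the assignment requires.

```python
import numpy as np

t = np.array([1., 2, 3, 4, 5, 6, 7])

# Row i is [t_i**6, t_i**5, ..., t_i, 1].
A = np.vander(t)

# Same construction as an explicit loop over powers followed by np.fliplr.
A_loop = np.fliplr(np.column_stack([t**i for i in range(len(t))]))

np.set_printoptions(suppress=True)
print(A[1])                     # the row for t = 2
print(np.allclose(A, A_loop))
```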


$$\begin{align*} y_1 = -0.30x^6 + 7.27x^5 - 70.09x^4 + 339.48x^3 - 859.61x^2 + 1052.25x - 459 \\ y_2 = -0.24x^6 + 5.81x^5 - 55.89x^4 + 270.13x^3 - 682.37x^2 + 832.57x - 360 \\ y_3 = -0.33x^6 +8.01x^5 - 76.68x^4 + 368.13x^3 - 921.99x^2 + 1114.87x - 481\end{align*}$$

### Homework Question 2
Write a function called `LUSolve(L,U,P,v)` that does the following given an LU decomposition of a matrix $A$:
1. First, multiplies the vector $v$ by $P$, as we discussed was needed.<br><br>
1. Solves $Ly=Pv$ by forward substitution.<br><br>
1. Lastly, solves $Ux=y$ to find the solution of $Ax=v$.

Test your function on the data above. 

In [25]:
def LUSolve(L,U,P,v):
    v = P@v
    y = fwdsub(L,v)
    x = backsub(U,y)
    return x


print(LUSolve(L,U,P,y1))
print(LUSolve(L,U,P,y2))
print(LUSolve(L,U,P,y3))

[  -0.2986111111    7.2708333333  -70.0902777778  339.4791666666
 -859.6111111111 1052.25         -459.          ]
[  -0.2388888889    5.8083333333  -55.8888888889  270.125
 -682.3722222222  832.5666666666 -360.          ]
[  -0.3305555556    8.0083333333  -76.6805555556  368.125
 -921.9888888888 1114.8666666666 -481.          ]


Note that since the matrix $A$ is always the same, we only have to use our $LU$ decomposition code once! This is much faster than having to do the row-reduction over and over for each output vector. The LU decomposition encodes the process of row-reduction in the lower-triangular matrix $L$, thus avoiding the need to recompute it.
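
To make the reuse concrete, here is a self-contained sketch of the whole pipeline. The functions `lu_partial_pivot`, `forward_sub`, and `back_sub` are hypothetical stand-ins written for this example; they are not the course's `LU`, `fwdsub`, and `backsub`, and they return the factors in the order `P, L, U`, which may differ from the course code.

```python
import numpy as np

def lu_partial_pivot(A):
    """Return P, L, U with P @ A = L @ U, using partial pivoting."""
    n = A.shape[0]
    U = A.astype(float)   # astype returns a copy, so A is not modified
    L = np.eye(n)
    P = np.eye(n)
    for k in range(n - 1):
        # Choose the row with the largest entry (in magnitude) in column k.
        p = k + np.argmax(np.abs(U[k:, k]))
        if p != k:
            U[[k, p], :] = U[[p, k], :]
            P[[k, p], :] = P[[p, k], :]
            L[[k, p], :k] = L[[p, k], :k]   # swap the multipliers found so far
        for i in range(k + 1, n):
            L[i, k] = U[i, k] / U[k, k]
            U[i, k + 1:] -= L[i, k] * U[k, k + 1:]
            U[i, k] = 0.0
    return P, L, U

def forward_sub(L, b):
    """Solve Ly = b for lower-triangular L."""
    y = np.zeros(len(b))
    for i in range(len(b)):
        y[i] = (b[i] - L[i, :i] @ y[:i]) / L[i, i]
    return y

def back_sub(U, y):
    """Solve Ux = y for upper-triangular U."""
    x = np.zeros(len(y))
    for i in range(len(y) - 1, -1, -1):
        x[i] = (y[i] - U[i, i + 1:] @ x[i + 1:]) / U[i, i]
    return x

t = np.array([1., 2, 3, 4, 5, 6, 7])
Y = np.array([[10, 15, -1, 2, -4, 5, 10],
              [10, 13, 0, 2, -3, 5, 11],
              [11, 14, -1, 3, -5, 4, 9]], dtype=float)

A = np.vander(t)                 # columns t**6, ..., t, 1
P, L, U = lu_partial_pivot(A)    # the expensive step, done once

# Each data set needs only a permutation and two triangular solves.
coeffs = np.array([back_sub(U, forward_sub(L, P @ y)) for y in Y])
print(np.round(coeffs, 2))
```

The factorization runs a single time, outside the loop over data sets, which is where the saving comes from.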

Lastly, if you look at the data sets given above, you may notice that they are all quite similar to each other numerically. Yet the polynomials you generated are rather vastly different from each other. This is a serious problem. We say that the polynomial model has high *variance*. We will study this further in future labs.
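
A rough way to see this sensitivity is to compare how far apart two data sets are with how far apart their fitted coefficients are. The sketch below uses `np.linalg.solve` in place of the course's LU code, purely for brevity.

```python
import numpy as np

t = np.array([1., 2, 3, 4, 5, 6, 7])
y1 = np.array([10., 15, -1, 2, -4, 5, 10])
y2 = np.array([10., 13, 0, 2, -3, 5, 11])

A = np.vander(t)
c1 = np.linalg.solve(A, y1)
c2 = np.linalg.solve(A, y2)

# Largest difference between the two data sets and between the two fits.
data_gap = np.max(np.abs(y1 - y2))
coef_gap = np.max(np.abs(c1 - c2))

print("largest difference in the data:        ", data_gap)
print("largest difference in the coefficients:", round(coef_gap, 2))
```

The measurements never differ by more than a couple of units, while some coefficients differ by hundreds.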

### Homework Question 3

Write code that takes a set of *n* times (as a vector) and the outcomes of a number (say, *m*) of different experiments with measurements at those times (as an $m\times n$ array), and returns the coefficients of polynomials that fit each set of measurements. Your code should use LU decomposition and your `LUSolve(L,U,P,v)` function to make it as efficient as possible. Test your code on the above data.

In [31]:
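def nxm(n,m):
    # n: vector of measurement times; m: array with one experiment per row.
    rows = n.shape[0]
    # Build the matrix of powers of the times, highest power first.
    A = np.zeros((rows,rows))
    for i in range(0,rows):
        A[:,i] = n**i
    A = np.fliplr(A)
    # Factor A once, then reuse the factors for every experiment.
    U,L,P = LU(A)
    ret = np.zeros(m.shape)
    for i in range(m.shape[0]):
        ret[i] = LUSolve(L,U,P,m[i])
    return ret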
    
t = np.array([1.,2,3,4,5,6,7])
y1 = np.array([10,15,-1,2,-4,5,10])
y2 = np.array([10,13,0,2,-3,5,11])
y3 = np.array([11.,14,-1,3,-5,4,9])
y = np.array([[10,15,-1,2,-4,5,10],[10,13,0,2,-3,5,11],[11.,14,-1,3,-5,4,9]])

print(nxm(t,y))

[[  -0.2986111111    7.2708333333  -70.0902777778  339.4791666666
  -859.6111111111 1052.25         -459.          ]
 [  -0.2388888889    5.8083333333  -55.8888888889  270.125
  -682.3722222222  832.5666666666 -360.          ]
 [  -0.3305555556    8.0083333333  -76.6805555556  368.125
  -921.9888888888 1114.8666666666 -481.          ]]


### Homework Question 4 

Suppose you have a number of different output vectors $\vec{c}$ for the same set of equations. We have two different ways of solving $A\vec{x}=\vec{c}$:

* Row reduce the augmented matrix $[A|\vec{c}]$, then back substitute. Repeat for every different $\vec{c}$.<br><br>

* Find the $PA=LU$ decomposition of $A$, then use our `LUSolve(L,U,P,v)` function we wrote above.

Explain why we expect the second method to be far more efficient than the first if we have many different output vectors.

If we use Gaussian elimination for many different output vectors, we have to row-reduce the augmented matrix again for every single output vector, even though the left side of the matrix is exactly the same every time. Each of those reductions costs on the order of $n^3$ floating point operations for an $n\times n$ matrix. With the decomposition, we pay that $n^3$ cost only once, when we factor $A$ into $PA=LU$. After that, each output vector needs only a multiplication by $P$, one forward-substitution, and one back-substitution, which together cost on the order of $n^2$ floating point operations. We simply reuse the matrices found in the decomposition step, so for many output vectors the second method is much cheaper and faster.
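
The comparison can be made concrete with rough operation counts. The sketch below uses the standard leading-order estimates (about $\tfrac{2}{3}n^3$ operations for one elimination or factorization and about $2n^2$ for the work on one right-hand side). These are approximations, not exact counts for any particular implementation, and the function names are made up for this example.

```python
def repeated_elimination_flops(n, k):
    # Row-reduce [A|c] from scratch for each of the k right-hand sides:
    # about (2/3) n^3 for the elimination plus about 2 n^2 for updating
    # the right-hand side and back-substituting.
    return k * (2 * n**3 / 3 + 2 * n**2)

def lu_reuse_flops(n, k):
    # Factor once (about (2/3) n^3), then one forward- and one
    # back-substitution (about 2 n^2 together) per right-hand side.
    return 2 * n**3 / 3 + k * 2 * n**2

for n, k in [(7, 3), (100, 100), (1000, 1000)]:
    ratio = repeated_elimination_flops(n, k) / lu_reuse_flops(n, k)
    print(f"n={n:5d}, k={k:5d}: repeated elimination costs about {ratio:.1f}x more")
```

With a single right-hand side the two estimates agree, and the gap widens as either the matrix size or the number of right-hand sides grows.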

### Homework Question 5

Let's examine once again the *LU* decomposition of the matrix from the last homework: $$A=\begin{bmatrix} 10^{-4} & 0 & 10^4 \\ 10^4 & 10^{-4} & 0 \\ 0 & 10^4 & 1\end{bmatrix}.$$

As you saw in the lab, the code for *LU* decomposition without pivoting results in matrices *L* and *U* such that $A\neq LU$.

* By looking back at Question 4 from the lab and the work you did on floating point errors on Homework 3, explain exactly why you get the incorrect result you saw.

* Compute by hand the *PA=LU* decomposition for this matrix. Do you still expect a floating point error to occur? Explain why in this case, we still get the right answer using our `LU(A)` code.

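Without pivoting, the first pivot is $10^{-4}$, so the multiplier needed to eliminate the $10^4$ below it is $10^4/10^{-4}=10^8$. The second pivot is again $10^{-4}$, which gives another multiplier of $10^8$, and the last entry of $U$ becomes $1+10^{20}$. Because Python floats retain only about 15 to 16 significant digits, the $1$ is lost and this entry is stored as $10^{20}$. When we multiply $L$ and $U$ back together, the bottom-right entry comes out as $0$ instead of $1$, so $A\neq LU$. In general, a very small pivot produces very large multipliers, and adding the resulting very large numbers to much smaller ones destroys the information in the smaller ones.

With pivoting, the rows are reordered so that the largest entry in each column, here $10^4$, is used as the pivot. By hand, this gives $$PA=\begin{bmatrix} 10^4 & 10^{-4} & 0 \\ 0 & 10^4 & 1 \\ 10^{-4} & 0 & 10^4\end{bmatrix},\quad L=\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 10^{-8} & -10^{-16} & 1\end{bmatrix},\quad U=\begin{bmatrix} 10^4 & 10^{-4} & 0 \\ 0 & 10^4 & 1 \\ 0 & 0 & 10^4+10^{-16}\end{bmatrix}.$$ Every multiplier now has magnitude at most $1$, so no entry is blown up during elimination. Rounding still happens, since $10^4+10^{-16}$ is stored as $10^4$, but the part that is lost is far smaller than the precision of any entry of $A$. This is why `LU(A)` still returns factors with $PA=LU$ to working precision.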
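
The failure without pivoting can be reproduced with a minimal sketch. The function `lu_no_pivot` is a hypothetical stand-in written for this illustration; it is not the lab's code.

```python
import numpy as np

def lu_no_pivot(A):
    """LU decomposition with no row swaps (for illustration only)."""
    n = A.shape[0]
    U = A.astype(float)   # astype returns a copy
    L = np.eye(n)
    for k in range(n - 1):
        for i in range(k + 1, n):
            L[i, k] = U[i, k] / U[k, k]              # multiplier
            U[i, k + 1:] -= L[i, k] * U[k, k + 1:]   # eliminate below the pivot
            U[i, k] = 0.0
    return L, U

A = np.array([[1e-4, 0.0, 1e4],
              [1e4, 1e-4, 0.0],
              [0.0, 1e4, 1.0]])

L, U = lu_no_pivot(A)
print("multipliers:", L[1, 0], L[2, 1])
print("last entry of U:", U[2, 2])
print("bottom-right entry of L @ U:", (L @ U)[2, 2], "but A has", A[2, 2])

# The underlying rounding: adding 1 to 1e20 does not change the stored value.
print(1.0 + 1e20 == 1e20)
```

The tiny pivots force huge multipliers, and the original entry $1$ is swallowed when it is added to a number twenty orders of magnitude larger.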

In [35]:
a = np.array([[0.0001,0,10000],[10000,.0001,0],[0,10000,1]])
print(a)

U,L,P = LU(a)
print(P@a)
print(L@U)

[[    0.0001     0.     10000.    ]
 [10000.         0.0001     0.    ]
 [    0.     10000.         1.    ]]
[[10000.         0.0001     0.    ]
 [    0.     10000.         1.    ]
 [    0.0001     0.     10000.    ]]
[[10000.         0.0001     0.    ]
 [    0.     10000.         1.    ]
 [    0.0001     0.     10000.    ]]


### Optional Bonus Question

Write code that takes the coefficients of a polynomial and prints the polynomial with the coefficients printed to two decimal places. You should research Python functions that help you.

In [53]:
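def coefficients(poly,decimal):
    # Round each coefficient to `decimal` places by formatting it as text
    # and storing the result back into a float array.
    copy = np.zeros_like(poly)
    for i in range(poly.shape[0]):
        copy[i] = float('%.*f' % (decimal, poly[i]))
    return copy
example = nxm(t,y)
print(example[1])
print(coefficients(example[1],2))

# '%.*f' formats a number with `decimal` digits after the decimal point.
# The first line printed is the original coefficient vector, the second is the rounded one.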

[  -0.2388888889    5.8083333333  -55.8888888889  270.125
 -682.3722222222  832.5666666666 -360.          ]
[  -0.24    5.81  -55.89  270.12 -682.37  832.57 -360.  ]
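
The question asks for the polynomial itself to be printed, so here is one possible sketch that turns a coefficient vector into a readable string. The function name `format_polynomial` and its parameters are made up for this example, and the coefficients are copied from the output for $y_2$.

```python
def format_polynomial(coeffs, decimals=2, var="x"):
    """Return the polynomial as a string.

    coeffs are ordered from the highest power down to the constant term.
    """
    degree = len(coeffs) - 1
    pieces = []
    for k, c in enumerate(coeffs):
        power = degree - k
        magnitude = f"{abs(c):.{decimals}f}"
        if power == 0:
            term = magnitude
        elif power == 1:
            term = f"{magnitude}{var}"
        else:
            term = f"{magnitude}{var}^{power}"
        if not pieces:
            # The leading term carries its sign directly.
            pieces.append(("-" if c < 0 else "") + term)
        else:
            # Later terms are joined with an explicit + or - sign.
            pieces.append(("- " if c < 0 else "+ ") + term)
    return " ".join(pieces)

y2_coeffs = [-0.2388888889, 5.8083333333, -55.8888888889, 270.125,
             -682.3722222222, 832.5666666666, -360.0]
poly_text = format_polynomial(y2_coeffs)
print("y_2 =", poly_text)
```

Formatting the magnitude and the sign separately avoids output such as `+ -55.89x^4`.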
