# Homework set 2

Please **submit this Jupyter notebook through Canvas** no later than **Mon Nov. 13, 9:00**. **Submit the notebook file with your answers (as .ipynb file) and a pdf printout. The pdf version can be used by the teachers to provide feedback. A pdf version can be made using the save and export option in the Jupyter Lab file menu.**

Homework is in **groups of two**, and you are expected to hand in original work. Work that is copied from another group will not be accepted.

# Exercise 0
Write down the names + student ID of the people in your group.

**Pablo Alves** - 15310191
**Nitai Nijholt** - 12709018

## Importing packages
Execute the following statement to import the packages `numpy`, `math` and `scipy.sparse`. If additional packages are needed, import them yourself.

In [15]:
import math
import numpy as np
import scipy.sparse as sp
from scipy.linalg import lu_factor, lu_solve

# Sparse matrices

A matrix is called sparse if only a small fraction of the entries is nonzero. For such matrices, special data formats exist. `scipy.sparse` is the scipy package that implements such data formats and provides functionality such as the LU decomposition (in the subpackage `scipy.sparse.linalg`).

As an example, we create the matrix 
$$\begin{bmatrix}
1 & 0 & 2 & 0 \\ 
0 & 3 & 0 & 0 \\
0 & 0 & 4 & 5 \\
0 & 0 & 0 & 6 \end{bmatrix}$$ in the so-called compressed sparse row (CSR) format. As you can see, the arrays `row`, `col`, `data` contain the row coordinate, the column coordinate, and the value of each nonzero element, respectively.

In [14]:
# a sparse matrix with 6 nonzero entries
row = np.array([0, 0, 1, 2, 2, 3])
col = np.array([0, 2, 1, 2, 3, 3])
data = np.array([1.0, 2, 3, 4, 5, 6])
sparseA = sp.csr_array((data, (row, col)), shape=(4, 4))

# convert to a dense matrix. This allows us to print to screen in regular formatting
denseA = sparseA.toarray()
print(denseA)


[[1. 0. 2. 0.]
 [0. 3. 0. 0.]
 [0. 0. 4. 5.]
 [0. 0. 0. 6.]]


For sparse matrices, a sparse data format is much more efficient in terms of storage than the standard array format. Because of this efficient storage, very large matrices of size $n \times n$ with $n = 10^7$ or more can be stored in RAM for performing computations on regular computers. Often the number of nonzero elements per row is quite small, on the order of tens or hundreds. In a regular, dense format, such matrices would require a supercomputer or could not be stored at all.
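To make the storage argument concrete, the following minimal sketch compares the bytes used by a CSR matrix with those of its dense copy. The tridiagonal test matrix and the size `n = 2000` are assumptions chosen only for illustration; they are not part of the exercise.

```python
import numpy as np
import scipy.sparse as sp

n = 2000  # kept small so that the dense copy still fits comfortably in memory

# Tridiagonal test matrix: about 3 nonzeros per row
tri = sp.diags([-1.0, 2.0, -1.0], offsets=[-1, 0, 1], shape=(n, n), format="csr")

# CSR stores three arrays: the values, their column indices, and the row pointers
sparse_bytes = tri.data.nbytes + tri.indices.nbytes + tri.indptr.nbytes
dense_bytes = tri.toarray().nbytes  # n * n float64 entries

print(f"nonzeros: {tri.nnz} of {n * n} entries")
print(f"CSR storage:   {sparse_bytes / 1e6:.3f} MB")
print(f"dense storage: {dense_bytes / 1e6:.3f} MB")
```

The dense cost grows like $n^2$ while the CSR cost grows with the number of nonzeros, which is why the gap widens quickly as $n$ increases.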

In the second exercise you have to use the package `scipy.sparse`, please look up the functions you need (or ask during class).

# Heath computer exercise 2.1

## (a)
Show that the matrix
$$ A = \begin{bmatrix} 
0.1 & 0.2 & 0.3 \\
0.4 & 0.5 & 0.6 \\
0.7 & 0.8 & 0.9
\end{bmatrix}$$
is singular. Describe the set of solutions to the system $A x = b$ if
$$ b = \begin{bmatrix} 0.1 \\ 0.3 \\ 0.5 \end{bmatrix}. $$
(N.B. this is a pen-and-paper question.)


#### (a.i) Showing A is singular
It suffices to show that $\det(A) = 0$.


By inspection we see that $R_3 = 2 \cdot R_2 - R_1$.

Because the third row of $A$ is a linear combination of the first two rows, the rows are linearly dependent, which implies $\det(A) = 0$.

Hence $A$ is singular.

$\blacksquare$
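The pen-and-paper argument can be double-checked in exact rational arithmetic. The sketch below uses `fractions.Fraction` so that no floating-point rounding is involved; the helper name `det3` is introduced here for illustration only.

```python
from fractions import Fraction

# Entries of A as exact rationals: 1/10, 2/10, ..., 9/10
A = [[Fraction(3 * i + j + 1, 10) for j in range(3)] for i in range(3)]

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

# Check the row relation R3 = 2*R2 - R1 entry by entry
row_relation_holds = all(A[2][j] == 2 * A[1][j] - A[0][j] for j in range(3))

print("R3 == 2*R2 - R1:", row_relation_holds)
print("det(A) =", det3(A))
```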

#### (a.ii) Describing the set of solutions
Observing the $Ax=b$ system we want to solve, we notice that we can first simplify it.

Taking the common term $1/10$ out of both $A$ and $b$ and cancelling it, we are then left with:

$$
\begin{pmatrix}
1 & 2 & 3 \\
4 & 5 & 6 \\
7 & 8 & 9
\end{pmatrix}
\begin{pmatrix}
x \\
y \\
z
\end{pmatrix}
=
\begin{pmatrix}
1 \\
3 \\
5
\end{pmatrix}
$$

Giving rise to equations:

$x + 2y + 3z = 1$

$4x + 5y + 6z = 3$

$7x + 8y + 9z = 5$.

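Solving the first equation for $x$ yields $x = 1 - 2y - 3z$.

Substituting this into the second equation yields $3y + 6z = 1$,

which yields $y = -(6z-1)/3 = 1/3 - 2z$.

Substituting $y$ back into the expression for $x$ yields $x = 1 - 3z + 4z - 2/3$,

which simplifies to $x = z + 1/3$.

The third equation adds no new constraint: it equals twice the second equation minus the first, on both the left-hand and the right-hand side ($5 = 2 \cdot 3 - 1$), so the system is consistent.

Therefore the infinitely many solutions are of the form:

$x = z + 1/3$

$y = -(6z-1)/3$

$z = z$ (free parameter)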

$\blacksquare$
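As a sanity check, the solution set can be written as a particular solution plus a multiple of a null-space vector, $(1/3,\ 1/3,\ 0) + z\,(1,\ -2,\ 1)$, and verified exactly. This is a minimal sketch; the names `particular`, `null_vec`, `solution` and `matvec` are illustrative and not part of the assignment.

```python
from fractions import Fraction

A = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]  # simplified (scaled by 10) system
b = [1, 3, 5]

def matvec(m, v):
    """Exact matrix-vector product for lists of numbers."""
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in m]

particular = [Fraction(1, 3), Fraction(1, 3), Fraction(0)]  # the solution with z = 0
null_vec = [1, -2, 1]                                        # satisfies A @ null_vec = 0

def solution(z):
    """Point on the solution line for a given value of the free parameter z."""
    return [p + z * n for p, n in zip(particular, null_vec)]

for z in (Fraction(0), Fraction(1), Fraction(-7, 2)):
    print("z =", z, "->", [str(v) for v in solution(z)], "solves Ax=b:", matvec(A, solution(z)) == b)
```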

## (b)
If we were to use Gaussian elimination with partial pivoting to solve this system using exact arithmetic, at what point would the process fail?


#### (b) Answer 
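Let

$ A' = (A|b) =
\begin{pmatrix}
1 & 2 & 3 & 1 \\
4 & 5 & 6 & 3 \\
7 & 8 & 9 & 5
\end{pmatrix}  
$

be the augmented matrix of the simplified system from part (a).

Partial pivoting selects the entry of largest magnitude in the first column, $a'_{31} = 7$, as the pivot, so we first swap $R_1 \leftrightarrow R_3$.

We then create zeros below the pivot by computing $R_2 \leftarrow R_2 - \tfrac{4}{7} R_1$ and $R_3 \leftarrow R_3 - \tfrac{1}{7} R_1$

Yielding:

$ A' = \begin{pmatrix}
7 & 8 & 9 & 5 \\
0 & 3/7 & 6/7 & 1/7 \\
0 & 6/7 & 12/7 & 2/7
\end{pmatrix}  
$

In the second column, the entry of largest magnitude on or below the diagonal is $6/7$, so we swap $R_2 \leftrightarrow R_3$.

We then compute $R_3 \leftarrow R_3 - \tfrac{1}{2} R_2$

Yielding:

$ A' = \begin{pmatrix}
7 & 8 & 9 & 5 \\
0 & 6/7 & 12/7 & 2/7 \\
0 & 0 & 0 & 0
\end{pmatrix}  
$

Our next pivot is $a'_{33} = 0$.

The process fails here: the elimination itself completes, but back-substitution would require dividing by this zero pivot to obtain $z$. The zero in the last entry of that row shows that the system is consistent, with $z$ a free parameter.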

 $\blacksquare$
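The elimination can also be traced in exact rational arithmetic, which mimics the "exact arithmetic" assumption of the question. The following is a minimal sketch of forward elimination with partial pivoting; the function name `gepp_trace` and its return convention are assumptions made for this illustration.

```python
from fractions import Fraction

def gepp_trace(aug):
    """Forward elimination with partial pivoting on an augmented matrix.

    Returns the reduced matrix and the 0-based index of the first column
    whose pivot is exactly zero (None if every pivot is nonzero).
    """
    m = [[Fraction(v) for v in row] for row in aug]
    n = len(m)
    for k in range(n):
        # Partial pivoting: choose the row with the largest |entry| in column k
        p = max(range(k, n), key=lambda i: abs(m[i][k]))
        m[k], m[p] = m[p], m[k]
        if m[k][k] == 0:
            return m, k
        # Eliminate the entries below the pivot
        for i in range(k + 1, n):
            factor = m[i][k] / m[k][k]
            m[i] = [mij - factor * mkj for mij, mkj in zip(m[i], m[k])]
    return m, None

reduced, failed_col = gepp_trace([[1, 2, 3, 1], [4, 5, 6, 3], [7, 8, 9, 5]])
for row in reduced:
    print([str(v) for v in row])
print("zero pivot in column (0-based):", failed_col)
```

With exact arithmetic the zero pivot is detected reliably; in floating point the same entry may come out as a tiny nonzero number instead.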

## (c)
Because some of the entries of $A$ are not exactly representable in a binary floating point system, the matrix is no longer exactly singular when entered into a computer; thus, solving the system by Gaussian elimination will not necessarily fail. Solve this system on a computer using a library routine for Gaussian elimination. Compare the computed solution with your description of the solution set in part (a). What is the estimated value for $\text{cond}(A)$? How many digits of accuracy in the solution would this lead you to expect?

#### (c.i) Solve the system with a library routine for Gaussian elimination

In [34]:
# Define A and b
A = np.array([[0.1,0.2,0.3],[0.4,0.5,0.6],[0.7,0.8,0.9]])
b = np.array([0.1,0.3,0.5])

# Compute LU decomposition
lu, piv = lu_factor(A)

# Solve the system
x = lu_solve((lu, piv), b)

# Estimate cond(A)
cond_number = np.linalg.cond(A)

# Print results
print('A:',  A)
print('b',   b)
print('LU',  lu)
print('piv', piv)
print('x',   x)
print('Condition number:', cond_number)

A: [[0.1 0.2 0.3]
 [0.4 0.5 0.6]
 [0.7 0.8 0.9]]
b [0.1 0.3 0.5]
LU [[7.00000000e-01 8.00000000e-01 9.00000000e-01]
 [1.42857143e-01 8.57142857e-02 1.71428571e-01]
 [5.71428571e-01 5.00000000e-01 1.11022302e-16]]
piv [2 2 2]
x [ 0.16145833  0.67708333 -0.171875  ]
Condition number: 2.1118968335779856e+16


#### (c.ii) Compare the computed solution with your description of the solution set in part (a). 
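Unlike the analysis in part (a), which gives infinitely many solutions, the computation returns a single vector $x$. This vector does satisfy the equations to the printed precision (for example, $0.1 \cdot 0.16145833 + 0.2 \cdot 0.67708333 + 0.3 \cdot (-0.171875) \approx 0.1$), so it is one particular member of the infinite solution set. Presenting it as *the* unique solution is mathematically misleading, because which member is returned is determined by rounding errors rather than by the problem itself.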
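The comparison can be made quantitative by plugging the printed solution back into the three equations. This sketch only uses the rounded values copied from the output above, so the residuals are limited by the 8 printed digits rather than by the solver.

```python
# Values copied from the printed output above (rounded to 8 digits)
x, y, z = 0.16145833, 0.67708333, -0.171875

rows = [(0.1, 0.2, 0.3), (0.4, 0.5, 0.6), (0.7, 0.8, 0.9)]
rhs = (0.1, 0.3, 0.5)

# Residual of each equation: a1*x + a2*y + a3*z - b_i
residuals = [a1 * x + a2 * y + a3 * z - r for (a1, a2, a3), r in zip(rows, rhs)]

print("residuals:", residuals)
print("largest |residual|:", max(abs(r) for r in residuals))
```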

#### (c.iii) What is the estimated value for $\text{cond}(A)$? 

The estimated value is $\text{cond}(A) \approx 2.1118968335779856 \cdot 10^{16}$ (the 2-norm condition number, which is the default of `np.linalg.cond`).

#### (c.iv) How many digits of accuracy in the solution would this lead you to expect?

In our case, because the condition number is of order $10^{16}$, we expect to lose up to about 16 decimal digits of accuracy in our result [1].

Because double-precision floating point carries only about 16 significant decimal digits, this renders our result basically useless in terms of accuracy: no digit of the computed $x$ can be trusted.

Intuitively, our condition number reflects the fact that the output $x$ of the system varies greatly in response to a small change in the input matrix $A$,

which is unexpected behavior for a small system like this one when it is computed as if it had a unique solution for $x$.

Thus, the huge condition number serves as a quantitative proxy for the qualitative fact that our system does not actually have a unique solution.

In short, this example illustrates the importance of analyzing the results of our computations and how the condition number can be used as an indicator in systems of linear equations.

[1] For a detailed explanation of the underlying math, see https://math.stackexchange.com/questions/2392992/matrix-condition-number-and-loss-of-accuracy
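The rule of thumb used above (digits lost $\approx \log_{10} \text{cond}(A)$) can be turned into a small calculation. This is a rough sketch rather than a rigorous error bound; the condition number is copied from the printed output, and the variable names are illustrative.

```python
import math
import sys

cond_number = 2.1118968335779856e16  # value printed by np.linalg.cond above
eps = sys.float_info.epsilon         # machine epsilon for double precision

digits_available = -math.log10(eps)      # significant decimal digits in a double
digits_lost = math.log10(cond_number)    # rule-of-thumb loss due to conditioning
digits_expected = max(0.0, digits_available - digits_lost)

print(f"digits available: {digits_available:.1f}")
print(f"digits lost:      {digits_lost:.1f}")
print(f"digits expected:  {digits_expected:.1f}")
```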

#### (c.v) EXTRA: Solving the system after simplifying it first

We will illustrate this point further by repeating the previous computation on the equivalent system obtained by first cancelling the common factor $1/10$, as done in part (a).

In [32]:
print('Results when simplifying A and b first:')

# Define A and b
A = 10*np.array([[0.1,0.2,0.3],[0.4,0.5,0.6],[0.7,0.8,0.9]])
b = 10*np.array([0.1,0.3,0.5])

# Compute LU decomposition
lu, piv = lu_factor(A)

# Solve the system
x = lu_solve((lu, piv), b)

# Estimate cond(A)
cond_number = np.linalg.cond(A)

# Print results
print('A:',  A)
print('b',   b)
print('LU',  lu)
print('piv', piv)
print('x',   x)
print('Condition number:', cond_number)

Results when simplifying A and b first:
A: [[1. 2. 3.]
 [4. 5. 6.]
 [7. 8. 9.]]
b [1. 3. 5.]
LU [[7.         8.         9.        ]
 [0.14285714 0.85714286 1.71428571]
 [0.57142857 0.5        0.        ]]
piv [2 2 2]
x [ nan -inf  inf]
Condition number: 3.813147060626918e+16


  lu, piv = lu_factor(A)


In this case, and unlike in the previous computation, the library now issues a warning when computing the LU factorization,

indicating that our matrix $A$ is singular, which is consistent with our previous results.

In particular, the warning reports that the *diagonal number 3 [of the matrix] is exactly zero*,

because the elements of $A$ and $b$ in our equivalent system are now small integers, which are exactly representable in binary floating point and therefore no longer lose accuracy when stored in the computer.

This additional computation highlights that:

1. Quantitative inaccuracies in the storage of decimal numbers can give rise to qualitative inaccuracies in the results,

2. Mathematically equivalent systems can give rise to different computations,

3. Using a mathematically equivalent system can simplify the interpretation of results in limit cases, and that

4. Understanding the interplay between the mathematical model and its computation is important to properly evaluate the accuracy of its results.
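The first point can be checked directly: `Fraction(float)` converts the stored binary value exactly, so the determinant of the matrix as actually stored can be computed without further rounding. This is a minimal sketch with an illustrative helper `det3`; it assumes standard IEEE double precision.

```python
from fractions import Fraction

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

decimals = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6], [0.7, 0.8, 0.9]]

# Exact rational values of the floats as stored, and of the floats after scaling by 10
stored = [[Fraction(v) for v in row] for row in decimals]
scaled = [[Fraction(10 * v) for v in row] for row in decimals]

print("0.1 stored exactly as 1/10?", Fraction(0.1) == Fraction(1, 10))
print("exact det of stored A:   ", float(det3(stored)))
print("exact det of stored 10*A:", det3(scaled))
```

The stored matrix $A$ is slightly nonsingular, while the scaled matrix is exactly singular, which matches the difference between the two LU outputs.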

# Heath computer exercise 2.17

Consider a horizontal cantilevered beam that is clamped at one end but free along the remainder of its length. A discrete model of the forces on the beam yields a system of linear equations $A x = b$, where the $n \times n$ matrix $A$ has the banded form
$$
\begin{bmatrix}
 9 & -4     &  1 &  0 & \ldots & \ldots & 0 \\
-4 &  6     & -4 &  1 & \ddots && \vdots \\
 1 & -4     &  6 & -4 &  1 & \ddots & \vdots \\
 0 & \ddots & \ddots & \ddots & \ddots & \ddots & 0 \\
 \vdots & \ddots & 1 & -4 &  6 & -4 &  1 \\ 
 \vdots && \ddots    &  1 & -4 &  5 & -2 \\
 0 & \ldots & \ldots & 0 & 1 & -2 & 1 
\end{bmatrix}, $$
the $n$-vector $b$ is the known load on the bar (including its own weight), and the $n$-vector $x$ represents the resulting deflection of the bar that is to be determined. We will take the bar to be uniformly loaded, with $b_i = 1/n^4$ for each component of the load vector.


## (a)
Make a python function that creates the matrix $A$ given the size $n$.

In [4]:
# your code here
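One possible way to build the matrix is from its five diagonals with `scipy.sparse.diags`. This is a sketch, not the reference solution: the function name `beam_matrix`, the CSC output format and the assumption $n \geq 4$ are choices made here.

```python
import numpy as np
import scipy.sparse as sp

def beam_matrix(n):
    """Banded n x n beam matrix in CSC format (illustrative helper, assumes n >= 4)."""
    # Main diagonal: 9, 6, ..., 6, 5, 1
    main = np.full(n, 6.0)
    main[0], main[-2], main[-1] = 9.0, 5.0, 1.0
    # First off-diagonals: -4, ..., -4, -2
    off1 = np.full(n - 1, -4.0)
    off1[-1] = -2.0
    # Second off-diagonals: all ones
    off2 = np.ones(n - 2)
    return sp.diags([off2, off1, main, off1, off2],
                    offsets=[-2, -1, 0, 1, 2], format="csc")

print(beam_matrix(7).toarray())
```

Printing a small case such as $n = 7$ makes it easy to compare the corners of the result with the matrix shown above.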

## (b)

Solve this linear system using both a standard library routine for dense linear systems and a library routine designed for sparse linear systems. Take $n=100$ and $n=1000$. How do the two routines compare in the time required to compute the solution? And in the memory occupied by the LU decomposition? (Hint: as part of this assignment, look for the number of nonzero elements in the matrices $L$ and $U$ of the sparse LU decomposition.)

In [5]:
# your code here

your answer (text) here
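A possible setup for the comparison is sketched below, using `scipy.linalg.lu_factor`/`lu_solve` as the dense routine and `scipy.sparse.linalg.splu` as the sparse one. The helper names `beam_matrix` and `compare` are assumptions of this sketch, and wall-clock timings from `time.perf_counter` vary between machines and runs, so no timings are quoted here.

```python
import time
import numpy as np
import scipy.sparse as sp
from scipy.linalg import lu_factor, lu_solve
from scipy.sparse.linalg import splu

def beam_matrix(n):
    """Banded n x n beam matrix in CSC format (illustrative helper, assumes n >= 4)."""
    main = np.full(n, 6.0)
    main[0], main[-2], main[-1] = 9.0, 5.0, 1.0
    off1 = np.full(n - 1, -4.0)
    off1[-1] = -2.0
    off2 = np.ones(n - 2)
    return sp.diags([off2, off1, main, off1, off2],
                    offsets=[-2, -1, 0, 1, 2], format="csc")

def compare(n):
    """Solve A x = b with a dense and a sparse LU and collect time and storage figures."""
    A_sparse = beam_matrix(n)
    A_dense = A_sparse.toarray()
    b = np.full(n, 1.0 / n**4)

    t0 = time.perf_counter()
    x_dense = lu_solve(lu_factor(A_dense), b)
    t_dense = time.perf_counter() - t0

    t0 = time.perf_counter()
    lu = splu(A_sparse)
    x_sparse = lu.solve(b)
    t_sparse = time.perf_counter() - t0

    return {
        "t_dense": t_dense,
        "t_sparse": t_sparse,
        "dense_lu_entries": n * n,               # the dense LU is stored as a full n x n array
        "sparse_lu_nnz": lu.L.nnz + lu.U.nnz,    # nonzeros actually stored in L and U
        "x_dense": x_dense,
        "x_sparse": x_sparse,
    }

for n in (100, 1000):
    r = compare(n)
    print(n, r["t_dense"], r["t_sparse"], r["dense_lu_entries"], r["sparse_lu_nnz"])
```

Repeating each measurement several times and taking the minimum would give more stable timings than a single run.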

## (c)
For $n=100$, what is the condition number? What accuracy do you expect based on the condition number?

In [6]:
# your code here

your answer (text) here

## (d)
How well do the answers of (b) agree with each other (make an appropriate quantitative comparison)?

Should we be worried about the fact that the two answers are different?

In [7]:
# your code here

your answer (text) here
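One appropriate quantitative comparison is the relative difference of the two solutions in the 2-norm, set against the yardstick $\text{cond}(A) \cdot \epsilon_{\text{mach}}$. The sketch below does this for $n = 100$; the helper `beam_matrix` and the choice of norm are assumptions of this illustration.

```python
import numpy as np
import scipy.sparse as sp
from scipy.linalg import lu_factor, lu_solve
from scipy.sparse.linalg import splu

def beam_matrix(n):
    """Banded n x n beam matrix in CSC format (illustrative helper, assumes n >= 4)."""
    main = np.full(n, 6.0)
    main[0], main[-2], main[-1] = 9.0, 5.0, 1.0
    off1 = np.full(n - 1, -4.0)
    off1[-1] = -2.0
    off2 = np.ones(n - 2)
    return sp.diags([off2, off1, main, off1, off2],
                    offsets=[-2, -1, 0, 1, 2], format="csc")

n = 100
A = beam_matrix(n)
b = np.full(n, 1.0 / n**4)

x_dense = lu_solve(lu_factor(A.toarray()), b)
x_sparse = splu(A).solve(b)

# Relative difference between the two computed solutions
rel_diff = np.linalg.norm(x_dense - x_sparse) / np.linalg.norm(x_dense)

# Rough size of the rounding-induced error that the conditioning allows
cond = np.linalg.cond(A.toarray())
yardstick = cond * np.finfo(float).eps

print("relative difference:", rel_diff)
print("cond(A):            ", cond)
print("cond(A) * eps:      ", yardstick)
```

If the relative difference is of the order of the yardstick or smaller, the disagreement is what rounding errors alone can explain and is no cause for concern.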