In [None]:
using Pkg, Revise
gla_dir = "../GenLinAlgProblems"
Pkg.activate(gla_dir)

using GenLinAlgProblems, LinearAlgebra, Latexify, Printf, SymPy
;

<div style="float:center;width:100%;text-align:center;"><strong style="height:100px;color:darkred;font-size:40px;">Condition Number</strong>
</div>

The examples in this notebook were taken from the following [reference: julia.quantecon.org](https://julia.quantecon.org/tools_and_techniques/iterative_methods_sparsity.html)

# 1. Some Comments and Definition

Formally, the condition number of a function measures how much the output value of the function can change for a small change in the input argument.<br>
Rather than providing a detailed analysis, the following observation will serve as sufficient motivation.

Finite precision arithmetic loses information when numbers of very different magnitudes are combined.<br>
$\qquad$ E.g., consider the following addition carried out with 4 significant figures: the smaller term does not affect the result at all.

\begin{align}
&111.1       &   \\
&\,\,\;\;0.01231 &   \\
&\rule{2cm}{0.4pt} & \\
&111.1
\end{align}
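The same loss can be reproduced in Julia. The following is a minimal sketch that mimics 4-significant-figure arithmetic by rounding with `round(x, sigdigits=4)`; the variable names are illustrative only.

```julia
# Mimic arithmetic with 4 significant figures by rounding the sum.
a = 111.1
b = 0.01231
s4 = round(a + b, sigdigits=4)   # keep only 4 significant figures

println("sum rounded to 4 significant figures: ", s4)
println("contribution of b is lost: ", isapprox(s4, a))

# The same effect occurs in Float64, only at a much smaller relative size:
# a term below eps(1.0)/2 leaves 1.0 unchanged when added to it.
println("1.0 + 1e-17 == 1.0 : ", 1.0 + 1e-17 == 1.0)
```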

The SVD of a matrix $A = U \Sigma V^t$ allows us to estimate the relative sizes of numbers in a computation involving $A$
such as $A x$.

The orthogonal matrices $U$ and $V$ applied to a vector do not change the length of the vector, e.g., $\Vert U x \Vert = \Vert x \Vert$.<br>
Significant changes are due to the singular values in $\Sigma$. The condition number is the ratio of the largest to the smallest singular value:

$\qquad
C(A) = \frac{\sigma_{max}(A)}{\sigma_{min}(A)}
$
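
As a minimal sketch of this definition (the rotation angle and the test matrix below are arbitrary illustrative choices), one can check that an orthogonal matrix preserves lengths and that the ratio of the extreme singular values agrees with `cond`:

```julia
using LinearAlgebra

# An orthogonal matrix (rotation by an arbitrary angle) preserves vector length.
θ = 0.3
U = [cos(θ) -sin(θ); sin(θ) cos(θ)]
x = [3.0, 4.0]
println("‖x‖ = ", norm(x), ",  ‖Ux‖ = ", norm(U * x))

# Condition number as the ratio of the largest to the smallest singular value.
A = [1.0 2.0; 3.0 4.0]
σ = svdvals(A)                       # singular values in decreasing order
C = maximum(σ) / minimum(σ)
println("σ_max/σ_min = ", C, ",  cond(A) = ", cond(A))
```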

**Remarks:**
* For symmetric matrices, $\sigma_{max} = \max_{i}\vert \lambda_i \vert$ and $\sigma_{min} = \min_{i}\vert \lambda_i \vert$, where $\lambda_i, i=1,\dots,n$ are the eigenvalues of $A$, so that<br>
$\qquad
C(A) = \frac{\max_i \vert \lambda_i(A) \vert}{\min_i \vert \lambda_i(A) \vert }
\quad$
* The definition depends on the chosen matrix norm: the ratio above corresponds to the 2-norm, and other norms yield other condition numbers.
* **The larger the condition number,** the less well behaved computations with $A$ are.
* Computing the condition number is expensive: use it judiciously!
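
The first two remarks can be checked numerically. The sketch below uses an arbitrary symmetric test matrix (an assumption for illustration, not taken from the reference) and the optional norm argument of `cond`:

```julia
using LinearAlgebra

# Symmetric (indefinite) example matrix, chosen arbitrarily for illustration.
A = [2.0 1.0; 1.0 -3.0]
λ = eigvals(Symmetric(A))
C_eig = maximum(abs, λ) / minimum(abs, λ)
println("max|λ|/min|λ| = ", C_eig, ",  cond(A) = ", cond(A))

# Condition numbers with respect to other norms may differ from the 2-norm value.
println("cond(A, 1)   = ", cond(A, 1))
println("cond(A, Inf) = ", cond(A, Inf))
```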

# 2. Examples

In [2]:
@syms e::(positive)
@syms lambda::(real)

ϵ = 1e-6

1.0e-6

#### **2 x 2 Examples**

In [3]:
println("The condition number for an identity matrix")
cond(1I(2))

The condition number for an identity matrix


1.0

In [21]:
println( "The condition number of a matrix that is almost singular:")
A = [1 0
     1 e ]
display(latexify(A))

println("\nCondition number for   e = $ϵ")
@printf ".  C(A)   = %.1e\n" cond( N.(subs.(  A, e, ϵ )))

The condition number of a matrix that is almost singular:


L"\begin{equation}
\left[
\begin{array}{cc}
1 & 0 \\
1 & e \\
\end{array}
\right]
\end{equation}
"


Condition number for   e = 1.0e-6
.  C(A)   = 2.0e+06
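
A short derivation, added here as a sanity check on the printed value, suggests why $C(A) \approx 2/e$ for small $e$. The squared singular values of $A$ are the eigenvalues of $A^t A$, so their sum and product follow from the trace and the determinant:

\begin{align}
A^t A = \begin{pmatrix} 2 & e \\ e & e^2 \end{pmatrix}, \qquad
\sigma_{max}^2 + \sigma_{min}^2 = 2 + e^2, \qquad
\sigma_{max}^2 \, \sigma_{min}^2 = \det(A^t A) = e^2
\end{align}

For small $e$ this gives $\sigma_{max}^2 \approx 2$ and $\sigma_{max}\,\sigma_{min} = e$, hence

\begin{align}
C(A) = \frac{\sigma_{max}}{\sigma_{min}} = \frac{\sigma_{max}^2}{\sigma_{max}\,\sigma_{min}} \approx \frac{2}{e}
\end{align}

which is $2 \times 10^{6}$ for $e = 10^{-6}$.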


#### **3 x 4 Example, Product of Matrices**

In [5]:
println("The condition number of a product of matrices can change radically")
# n×(n+1) matrix [1 | ϵ·I]; the argument is named n so that it does not shadow SymPy's N()
läuchli(n, ϵ) = [ones(Int,n)'; ϵ * I(n)]'

L   = läuchli(3, e)
LtL = L'L

println( "\nconsider the following matrix L =")
display(latexify(L))
println( "\nand Lᵗ L =")
display(latexify(LtL))

println("\nCondition number for   e = $ϵ")
@printf ".  C(L)   = %.2e\n" cond( N.(subs.(   L, e, ϵ )))
@printf ".  C(L'L) = %.2e\n" cond( N.(subs.( LtL, e, ϵ )))

The condition number of a product of matrices can change radically

consider the following matrix L =


L"\begin{equation}
\left[
\begin{array}{cccc}
1 & e & 0 & 0 \\
1 & 0 & e & 0 \\
1 & 0 & 0 & e \\
\end{array}
\right]
\end{equation}
"


and Lᵗ L =


L"\begin{equation}
\left[
\begin{array}{cccc}
3 & e & e & e \\
e & e^{2} & 0 & 0 \\
e & 0 & e^{2} & 0 \\
e & 0 & 0 & e^{2} \\
\end{array}
\right]
\end{equation}
"


Condition number for   e = 1.0e-6
.  C(L)   = 1.73e+06
.  C(L'L) = 3.43e+28


____
Theoretical computation: $L^t L$ has an eigenvalue $0$ with multiplicity 1 and an eigenvalue $e^2$ with algebraic multiplicity 2.<br>
$\qquad$ The remaining eigenvalue is $trace(L^t L) - 2 e^2 = 3 + e^2$
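
The eigenvalues of $L^t L$ can also be obtained without a determinant. The following short derivation is a sketch that uses the fact that $L^t L$ and $L L^t$ share their nonzero eigenvalues; the symbol $\mathbf{1}$ for the vector of ones is introduced here for convenience.

\begin{align}
L L^t &= \mathbf{1}\mathbf{1}^t + e^2 I_3, \qquad \mathbf{1} = (1,1,1)^t \\
\lambda(\mathbf{1}\mathbf{1}^t) &= \{3,\ 0,\ 0\} \quad \Rightarrow \quad \lambda(L L^t) = \{3 + e^2,\ e^2,\ e^2\}
\end{align}

Since $L^t L$ has size $4 \times 4$ but rank 3, it has these three eigenvalues plus one additional eigenvalue $0$.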

In [6]:
evals_LtL = SymPy.solve(det(LtL-lambda*I), lambda)'
println("distinct eigenvalues: ", evals_LtL )

distinct eigenvalues: Sym{PyCall.PyObject}[0 e^2 e^2 + 3]


In [7]:
@printf ".  C(L)   = %.2e\n" subs( sqrt(evals_LtL[3]/evals_LtL[2]), e, ϵ)

.  C(L)   = 1.73e+06


In [8]:
@printf ".  C(LᵗL)   = %.2e\n" subs( evals_LtL[3]/evals_LtL[2], e, ϵ)

.  C(LᵗL)   = 3.00e+12


____
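Why the discrepancy? $L^t L$ is exactly singular (it has an eigenvalue $0$), so the value $3.00 \times 10^{12}$ above is the ratio of the largest to the smallest *nonzero* eigenvalue. `cond()` uses the SVD and divides by the smallest computed singular value, which here is the zero eigenvalue contaminated by roundoff: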

In [9]:
# The SVD of L  returns
svd( N.(subs.(L,e,ϵ))).S

3-element Vector{Float64}:
 1.7320508075691659
 1.0e-6
 1.0e-6

In [10]:
# The SVD of Lᵗ L  returns
svd( N.(subs.(LtL,e,ϵ))).S

4-element Vector{Float64}:
 3.0000000000009996
 1.0e-12
 1.0e-12
 8.744621874862984e-29

In [11]:
println("Theoretical values")
subs.(evals_LtL, e, 1e-6)

Theoretical values


1×3 Matrix{Sym{PyCall.PyObject}}:
 0  1.00000000000000e-12  3.00000000000100
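
One possible way to reconcile the two numbers in plain floating point (without SymPy) is to discard singular values that are numerically zero. The sketch below is an illustration under stated assumptions: the tolerance is modeled on the default used by `rank`, and the name `C_eff` is introduced here only for this example.

```julia
using LinearAlgebra

ϵ = 1e-6
L = hcat(ones(3), ϵ * Matrix(1.0I, 3, 3))   # 3×4 matrix [1 | ϵ·I], as in the example
LtL = L' * L                                # 4×4, but only of rank 3

σ = svdvals(LtL)
println("numerical rank of LᵗL: ", rank(LtL))

# Effective condition number: ignore singular values below a tolerance.
# The tolerance is an assumption, modeled on the default of rank().
tol = maximum(size(LtL)) * eps() * σ[1]
σ_nz = filter(>(tol), σ)
C_eff = σ_nz[1] / σ_nz[end]
println("effective C(LᵗL) = ", C_eff, ",  C(L)^2 = ", cond(L)^2)
```

With this tolerance the roundoff-sized singular value is dropped, and the effective condition number of $L^t L$ is close to $C(L)^2$.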

# 3. Take Away

* The value of the condition number is an indicator of numerical difficulties for computations involving a matrix $A$.
* Choose an algorithm that works with $A$ rather than $A^t A$, e.g., for linear regression computations.
* Damping: solve $\operatorname{argmin}_x \left( \Vert A x -b \Vert^2 + \Vert \lambda x \Vert^2 \right)$ rather than $\operatorname{argmin}_x \left( \Vert A x -b \Vert^2\right)$.
* Choose alternative representations of the data resulting in matrices with a lower condition number.
* Transform the problem using a suitable **preconditioner** $P$, e.g., solve $A x = b$ in two steps: $A P P^{-1} x = b \Leftrightarrow (A P) \tilde{x} = b, x = P \tilde{x}$,<br>
$\qquad$ where $A P$ has a much lower condition number than $A$.
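
The points above can be sketched in a few lines of Julia. Everything in this example is an assumption chosen for illustration: the test matrices `A` and `M`, the damping parameter `λ`, and the diagonal column scaling `P` used as a preconditioner.

```julia
using LinearAlgebra

# --- Work with A rather than AᵗA (least squares) ---
ϵ = 1e-6
A = [ones(3)'; ϵ * Matrix(1.0I, 3, 3)]    # 4×3 test matrix with full column rank
x_true = ones(3)
b = A * x_true

x_qr = A \ b                              # for rectangular A, `\` uses a QR factorization
println("cond(A)   = ", cond(A))
println("cond(AᵗA) = ", cond(A' * A))      # roughly cond(A)^2
println("error of QR solution: ", norm(x_qr - x_true))

# --- Damping: minimizing ‖Ax-b‖² + ‖λx‖² leads to (AᵗA + λ²I) x = Aᵗb ---
λ = 1e-3                                  # hypothetical damping parameter
println("cond(AᵗA + λ²I) = ", cond(A' * A + λ^2 * I))

# --- Preconditioning: solve (M P) x_tilde = c, then x = P x_tilde ---
M = [1.0 1e6; 1.0 -1e6]                   # badly scaled columns
P = Diagonal([1.0, 1e-6])                 # column scaling as a simple preconditioner
c = [2.0, 0.0]
x_tilde = (M * P) \ c
x = P * x_tilde
println("cond(M) = ", cond(M), ",  cond(M P) = ", cond(M * P))
println("residual ‖M x - c‖ = ", norm(M * x - c))
```

Note that damping changes the problem being solved: the damped solution is biased toward zero, so `λ` trades accuracy of the fit against conditioning.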