# Conditioning

## Condition of a Problem

In numerical computing, as we will soon see, we constantly make small errors in representing real numbers and in the operations on them. Consequently, we need to know whether the problems we want to solve are sensitive to such perturbations. The condition number measures this sensitivity.

We view a _problem_ as a function $f:X\to Y$ between the _data_ $X$ and the _solutions_ $Y$, where $X$ and $Y$ are both normed vector spaces.

A problem is _well-conditioned_ if small perturbations of $x$ lead to only small changes in $f(x)$.  Otherwise a problem is called _ill-conditioned_.

## Absolute Condition Number

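Let $\delta x$ denote a small perturbation of $x$, and write $\delta f = f(x+\delta x) - f(x)$.  We define the _absolute condition number_ as follows, where the form in parentheses is shorthand in which $\delta x$ and $\delta f$ are understood to be infinitesimal: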

$$
\hat{\kappa} = \lim_{\delta \to 0} \sup_{\|\delta x\| \leq \delta} \frac{\|\delta f\|}{\|\delta x\|} \quad \left(=\sup_{\delta x} \frac{\|\delta f\|}{\|\delta x\|}.\right)
$$

If $\mathbf{f}$ is differentiable, let $\mathbf{Df}(\mathbf{x}) = \left(\frac{\partial f_i}{\partial x_j}\right)$ be the _Jacobian_ or _total derivative_ of $\mathbf{f}$ at $\mathbf{x}$.  Since $\delta \mathbf{f} \approx \mathbf{Df}(\mathbf{x}) \delta \mathbf{x}$, we see:

$$
\hat{\kappa} = \|\mathbf{Df}(\mathbf{x})\|
$$

where $\|\mathbf{Df}(\mathbf{x})\|$ is the operator norm induced by the norms on $X$ and $Y$.
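The identity $\hat{\kappa} = \|\mathbf{Df}(\mathbf{x})\|$ can be checked numerically. The following is a minimal sketch, assuming Julia 1.x with the `LinearAlgebra` standard library and the 2-norm on both spaces. The test function `f`, its hand-derived Jacobian `Df`, and the helper `abscond_estimate` are hypothetical names chosen for illustration, not part of the notes.

```julia
using LinearAlgebra

# Hypothetical example problem (not from the text): f maps R^2 to R^2.
f(x) = [x[1]*x[2], x[1] + x[2]]
# Its Jacobian, worked out by hand.
Df(x) = [x[2] x[1]; 1.0 1.0]

# Crude estimate of the absolute condition number: the largest observed ratio
# norm(df)/norm(dx) over perturbations on a circle of small radius h.
function abscond_estimate(f, x; h=1e-6, ndirs=360)
    best = 0.0
    for k in 1:ndirs
        theta = 2pi*k/ndirs
        dx = h*[cos(theta), sin(theta)]
        best = max(best, norm(f(x + dx) - f(x)) / norm(dx))
    end
    return best
end

x = [3.0, 2.0]
est = abscond_estimate(f, x)
exact = opnorm(Df(x))          # induced 2-norm of the Jacobian
println("sampled estimate = ", est, ",  opnorm(Df(x)) = ", exact)
```

The sampled value should agree with `opnorm(Df(x))` to several digits; the remaining gap comes from the finite radius `h` and the finite number of directions.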

## Relative Condition Number

The _relative condition number_ is defined to be:
$$
\kappa = \lim_{\delta \to 0} \sup_{\|\delta x\|\leq \delta} \left(\frac{\|\delta f\|}{\|f(x)\|}\left/\frac{\|\delta x\|}{\|x\|}\right.\right)
$$

In the case where $f$ is differentiable:
$$\kappa = \frac{\|\mathbf{Df}(\mathbf{x})\|}{\|\mathbf{f}(\mathbf{x})\|/\|\mathbf{x}\|}.$$
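For a scalar function the formula reduces to $\kappa = |f'(x)|\,|x|/|f(x)|$. As a small sketch in plain Julia, the helper `relcond` and the two test functions below are assumptions made for illustration and do not appear in the notes.

```julia
# Relative condition number of a differentiable scalar function f at x,
# given its derivative df: kappa = |f'(x)| / (|f(x)| / |x|).
relcond(f, df, x) = abs(df(x)) / (abs(f(x)) / abs(x))

# Hypothetical test problems (not from the text):
k_exp = relcond(exp, exp, 10.0)               # f(x) = e^x has kappa = |x|
k_log = relcond(log, x -> 1/x, 1.0 + 1e-6)    # f(x) = log x has kappa = 1/|log x|, large near x = 1

println("kappa for exp at x = 10:       ", k_exp)
println("kappa for log at x = 1 + 1e-6: ", k_log)
```

The second case shows that a perfectly smooth function can still be ill-conditioned in the relative sense wherever $f(x)$ is close to zero.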

## Examples

- Consider the problem of obtaining the scalar $x/2$ from $x\in\mathbb{C}$.   The Jacobian of $f(x) = x/2$ is $1/2$.  Therefore: $$\kappa = \frac{\|\mathbf{D}f\|}{\|f(x)\|/\|x\|} = \frac{1/2}{|x/2|/|x|} = 1.$$

- Consider the problem of obtaining the square root $\sqrt{x}$ from $x>0$.   The Jacobian of $f(x) = \sqrt{x}$ is $1/(2\sqrt{x})$.  Therefore: $$\kappa = \frac{\|\mathbf{D}f\|}{\|f(x)\|/\|x\|} = \frac{1/(2\sqrt{x})}{\sqrt{x}/x} = \frac{1}{2}.$$

- Consider the problem of obtaining the scalar $f(\mathbf{x}) = x_1-x_2$ from $\mathbf{x} = (x_1,x_2)^* \in \mathbb{C}^2$.  The Jacobian is:
$$
\mathbf{D}f = \begin{bmatrix} f_{x_1} & f_{x_2}\end{bmatrix} = \begin{bmatrix} 1 & -1 \end{bmatrix}.
$$

Using the $\infty$-norm, $\|\mathbf{D}f\|_\infty=2$ (the absolute row sum), and therefore:
$$
\kappa = \frac{\|\mathbf{D}f\|}{\|f(\mathbf{x})\|/\|\mathbf{x}\|} = \frac{2}{|x_1-x_2|/\max\{|x_1|,|x_2|\}}.
$$

This is large if $x_1\approx x_2$.
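The subtraction example is easy to try out. This is a minimal sketch in plain Julia that evaluates the $\infty$-norm formula above and compares it with the effect of one actual perturbation. The helper name `kappa_sub` and the sample values are assumptions chosen for illustration.

```julia
# kappa for f(x) = x1 - x2 in the infinity-norm, following the formula above.
kappa_sub(x1, x2) = 2 / (abs(x1 - x2) / max(abs(x1), abs(x2)))

println(kappa_sub(1.0, 0.5))          # well separated inputs
println(kappa_sub(1.0, 1.0 - 1e-8))   # nearly equal inputs

# Effect of a relative perturbation of size 1e-12 applied to x1 only.
x1, x2 = 1.0, 1.0 - 1e-8
x1p = x1*(1 + 1e-12)
relerr = abs((x1p - x2) - (x1 - x2)) / abs(x1 - x2)
println("relative change in the difference: ", relerr)
```

A relative input change of about $10^{-12}$ produces a relative output change of about $10^{-4}$, an amplification of roughly $10^8$, which is within the bound given by $\kappa \approx 2\times 10^8$.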

## Polynomial roots

The roots of a polynomial become sensitive to the values of the coefficients in the monomial basis when roots are relatively close to one another. Consider, for example,

In [9]:
using Polynomials                              # install once with: import Pkg; Pkg.add("Polynomials")
p = fromroots([1,1,1,0.4,2.2]);                # polynomial with these as roots
q = Polynomial( coeffs(p) + 1e-9*randn(6) );   # small changes to its coefficients
println(roots(q))
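As a rough guide to the magnitudes to expect from this experiment, consider the simplest possible perturbation of a triple root: changing only the constant term, so that $(x-1)^3 = \epsilon$. The roots are then $1 + \epsilon^{1/3} e^{2\pi i k/3}$, $k=0,1,2$, so a change of size $\epsilon$ moves them by $\epsilon^{1/3}$. A minimal sketch in plain Julia, where `eps_c` and the restriction to the constant term are assumptions made for illustration:

```julia
eps_c = 1e-9                                   # perturbation of the constant coefficient
shift_triple = cbrt(eps_c)                     # |root - 1| for (x-1)^3 = eps_c
roots_triple = [1 + shift_triple*cis(2pi*k/3) for k in 0:2]

# For comparison, a simple root: (x - 0.4) = eps_c moves the root by eps_c itself.
shift_simple = eps_c

println(roots_triple)
println("triple-root shift = ", shift_triple, ",  simple-root shift = ", shift_simple)
```

With $\epsilon = 10^{-9}$ the triple root moves by about $10^{-3}$, six orders of magnitude more than the perturbation.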

Observe that the triple root at 1 changed a lot more than the size of the perturbation would suggest; the other two roots changed by an amount less than $10^{-9}$. The effect can be more dramatically shown using the Wilkinson polynomial. 

In [10]:
p = fromroots(collect(1.0:20));      # Wilkinson polynomial with roots 1, 2, ..., 20
using PyPlot
plot(collect(1:20),zeros(20),"ko")   # exact roots as black circles
for k = 1:500
    q = Polynomial(coeffs(p).*(1 .+ 1e-9*randn(21)));  # relative perturbations
    r = roots(q);
    plot(real.(r),imag.(r),".")
end

Clearly, having roots close together is not the only way to get sensitivity in the roots. In fact we can't accurately compute the roots even without perturbation to the data:

In [11]:
roots(p)

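The sensitivity of the well-separated Wilkinson roots can be quantified without any root finder. For a simple root $x_j$ of $p(x)=\sum_i a_i x^i$, a first-order expansion gives $\delta x_j \approx -\delta a_i\, x_j^i / p'(x_j)$, so the relative condition number of $x_j$ with respect to $a_i$ is $|a_i|\,|x_j|^{i-1}/|p'(x_j)|$. The sketch below evaluates this in plain Julia using exact integer arithmetic; the helper names `wilkinson_coeffs` and `root_sensitivity` are hypothetical and not part of the notes.

```julia
# Coefficients a_0, ..., a_n of w(x) = (x-1)(x-2)...(x-n), lowest degree first,
# built in exact integer arithmetic so that rounding plays no role here.
function wilkinson_coeffs(n)
    c = BigInt[1]
    for k in 1:n
        c = [0; c] - k*[c; 0]      # multiply the current polynomial by (x - k)
    end
    return c
end

# First-order relative sensitivity of the simple root x = j to the coefficient a_i:
# |a_i| * j^(i-1) / |w'(j)|, where w'(j) is the product of (j - k) over k != j.
function root_sensitivity(c, j, i)
    n = length(c) - 1
    dw = prod(big(j - k) for k in 1:n if k != j)
    return Float64(abs(c[i+1]) * big(j)^(i-1) / abs(dw))
end

c = wilkinson_coeffs(20)
for j in (1, 5, 10, 15, 20)
    println("root ", j, ": sensitivity to a_", j, " = ", root_sensitivity(c, j, j))
end
```

The large values for the roots in the middle of the range come from the size of the coefficients relative to $|w'(x_j)|$, not from any clustering of the roots.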

## Condition of Matrix-Vector Multiplication

Fix $A\in \mathbb{C}^{m\times n}$ and consider the problem of computing $Ax$ from $x$. Its relative condition number is:
$$
\kappa = \sup_{\delta x} \left(\frac{\|A(x + \delta x) - Ax\|}{\|Ax\|}\left/\frac{\|\delta x\|}{\|x\|}\right.\right) = \sup_{\delta x} \frac{\|A\delta x\|}{\|\delta x\|}\left/\frac{\|Ax\|}{\|x\|}\right. = \|A\| \frac{\|x\|}{\|Ax\|}
$$

When $A$ is square and nonsingular, writing $x = A^{-1}(Ax)$ shows that $\|x\|/\|Ax\|\leq \|A^{-1}\|$, which gives:

> **THEOREM.** Let $A\in\mathbb{C}^{m\times m}$ be nonsingular and consider the equation $A\mathbf{x} = \mathbf{b}$.  
- The problem of computing $\mathbf{b}$, given $\mathbf{x}$, has condition number (wrt $\mathbf{x}$): $$\kappa = \|A\|\frac{\|\mathbf{x}\|}{\|\mathbf{b}\|} \leq \|A\|\cdot \|A^{-1}\|.$$
- The problem of computing $\mathbf{x}$, given $\mathbf{b}$, has condition number (wrt $\mathbf{b}$): $$\kappa = \|A^{-1}\|\frac{\|\mathbf{b}\|}{\|\mathbf{x}\|} \leq \|A\|\cdot \|A^{-1}\|.$$
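Both bounds can be checked on a small example. The sketch below assumes Julia 1.x with `LinearAlgebra` and the 2-norm. The forward problem uses the matrix-vector formula derived above, and the inverse problem applies the same formula to $A^{-1}$, which gives $\|A^{-1}\|\,\|\mathbf{b}\|/\|\mathbf{x}\|$. The matrix and vector are hypothetical values chosen for illustration.

```julia
using LinearAlgebra

A = [1.0 2.0; 3.0 4.0]      # hypothetical nonsingular matrix
x = [1.0, -1.0]
b = A*x

kappa_forward = opnorm(A) * norm(x) / norm(b)        # computing b from x
kappa_inverse = opnorm(inv(A)) * norm(b) / norm(x)   # computing x from b
bound = opnorm(A) * opnorm(inv(A))                   # product of the two operator norms

println("forward: ", kappa_forward, "  inverse: ", kappa_inverse, "  bound: ", bound)
```

Each condition number depends on the particular $\mathbf{x}$ or $\mathbf{b}$, while the bound depends on $A$ alone.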

## Matrix condition number

We have particular interest in the condition number of the problem "given square matrix $A$ and vector $b$, find vector $x$ such that $Ax=b$." More simply: "map $b$ to $A^{-1}b$." The relative condition number of this problem is bounded above by the *matrix condition number* 

$$\kappa(A)=\|A\|\,\|A^{-1}\|.$$

Furthermore, in any particular case there exist perturbations to the data such that the upper bound is achieved. 

$A$ is said to be _well-conditioned_ if $\kappa(A)$ is small, and _ill-conditioned_ otherwise.

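In [12]:
using LinearAlgebra                        # provides cond, svdvals, norm
hilb(n) = [ 1.0/(i+j) for i=1:n, j=1:n];   # Hilbert-like test matrix
A = hilb(5);  kappa = cond(A);             # keep kappa a scalar so it can be reused later
kappa, A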



(1.5350438953289741e6,
[0.5 0.333333 … 0.2 0.166667; 0.333333 0.25 … 0.166667 0.142857; … ; 0.2 0.166667 … 0.125 0.111111; 0.166667 0.142857 … 0.111111 0.1])

Notice that if $\|\cdot \| = \|\cdot \|_2$ and $\sigma_1\geq\cdots\geq\sigma_m>0$ are the singular values of $A$, then $\|A\|= \sigma_1$ and $\|A^{-1}\| = \frac{1}{\sigma_m}$, thus:
$$\kappa(A) = \frac{\sigma_1}{\sigma_m}.$$

In [13]:
s = svdvals(A); s[1]/s[5]

1.5350438953289741e6

One can view this as the _eccentricity_ of the hyperellipse that is the image of the unit sphere under $A$: the ratio of its longest to its shortest semiaxis.

We can extend this to full-rank matrices $A\in \mathbb{C}^{m\times n}$, $m\geq n$, by setting $\kappa(A) = \|A\|\|A^+\|$, where $A^+$ is the pseudoinverse of $A$.  Then, in the 2-norm:
$$
\kappa (A) = \frac{\sigma_1}{\sigma_n},
$$
where $n$ is the rank of $A$.
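The rectangular case can be checked the same way as the square one. A minimal sketch, assuming Julia 1.x with `LinearAlgebra`; the $3\times 2$ matrix is a hypothetical example with nearly parallel columns.

```julia
using LinearAlgebra

A = [1.0 1.0; 1.0 1.001; 1.0 0.999]   # hypothetical 3x2 full-rank matrix
s = svdvals(A)                        # singular values, largest first
kappa_svd  = s[1] / s[end]
kappa_pinv = opnorm(A) * opnorm(pinv(A))

println("sigma_1/sigma_n = ", kappa_svd, ",  opnorm(A)*opnorm(pinv(A)) = ", kappa_pinv)
```

The two values agree up to rounding, and the nearly dependent columns show up as a large ratio.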

The importance of _relative_ condition numbers is that they explain accuracy in dimensionless terms, i.e. significant digits. For the matrix $A$ above, $\kappa(A)\approx 1.5\times 10^6$ says we could "lose" up to about 6 digits in the passage from data to result. To test this, we make relative perturbations of size $10^{-10}$ to $b$ and look at the relative effect on the result.

In [16]:
using LinearAlgebra, Printf     # norm and cond; @printf
# scale each entry of z by a random factor between 1-ep and 1+ep
perturb(z,ep) = z.*(1 .+ ep*(2*rand(size(z)...) .- 1));
x = 0.3 .+ (1:5);  b = A*x;
for k = 1:8
    bb = perturb(b,1e-10);
    @printf(" relative error = %.2e\n", norm( A\bb - x ) / norm( x ) )
end
@show bound = 1e-10*cond(A);    # first-order bound on the relative error

 relative error = 1.36e-05
 relative error = 6.24e-06
 relative error = 1.04e-05
 relative error = 2.10e-07
 relative error = 4.76e-06
 relative error = 5.36e-06
 relative error = 4.34e-05
 relative error = 1.30e-05

The same holds for perturbations to $A$, though the error has higher-order terms that vanish only in the limit of infinitesimal perturbations.

In [15]:
x = 0.3 .+ (1:5);  b = A*x;
for k = 1:8
    AA = perturb(A,1e-10);          # relative perturbations to the entries of A
    @printf(" relative error = %.2e\n", norm( AA\b - x ) / norm( x ) )
end
@show bound = 1e-10*cond(A);        # first-order bound on the relative error

 relative error = 8.03e-06
 relative error = 2.49e-05
 relative error = 1.81e-06
 relative error = 1.08e-06
 relative error = 1.08e-05
 relative error = 1.26e-05
 relative error = 1.51e-06
 relative error = 2.43e-06

## Condition of a System of Equations

What happens when we perturb the matrix $A$ itself?

$$
(A + \delta A)(\mathbf{x} + \delta \mathbf{x}) = \mathbf{b}
$$

Expanding, using $A\mathbf{x} = \mathbf{b}$, and dropping the second-order term $(\delta A)(\delta\mathbf{x})$ gives:
$$
(\delta A) \mathbf{x} + A(\delta\mathbf{x}) = \mathbf{0}, \quad \Longrightarrow \quad \delta \mathbf{x} = -A^{-1} (\delta A) \mathbf{x}.
$$

Thus $\|\delta \mathbf{x} \| \leq \|A^{-1}\| \|\delta A\| \|\mathbf{x} \|$, i.e.
$$
\frac{\|\delta \mathbf{x}\|}{\|\mathbf{x}\|}\left/\frac{\|\delta A\|}{\|A\|}\right.\leq \|A^{-1}\|\|A\| = \kappa(A).
$$

Equality holds whenever $\delta A$ is such that:
$$\|A^{-1}(\delta A)\mathbf{x} \| = \|A^{-1} \|\|\delta A\| \|\mathbf{x}\|,$$
and such a perturbation $\delta A$ always exists, so the bound can be attained.

> **THEOREM.** Let $\mathbf{b}$ be fixed and consider the problem of computing $\mathbf{x} = A^{-1}\mathbf{b}$, where $A$ is square and nonsingular.  The condition number of this problem with respect to perturbations in $A$ is:
$$\kappa = \|A\|\|A^{-1}\| = \kappa(A).$$
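The first-order relation $\delta \mathbf{x} \approx -A^{-1} (\delta A) \mathbf{x}$ and the resulting bound can be compared with an actual perturbed solve. The following is a sketch only, assuming Julia 1.x with `LinearAlgebra` and the 2-norm; the matrix `A`, the solution `x`, and the perturbation `dA` are hypothetical values chosen for illustration.

```julia
using LinearAlgebra

A  = [2.0 1.0; 1.0 3.0]                    # hypothetical nonsingular matrix
x  = [1.0, 2.0]
b  = A*x
dA = 1e-8 * [1.0 -2.0; 0.5 1.0]            # hypothetical small perturbation of A

dx_exact = (A + dA) \ b - x                # change in the solution of the perturbed system
dx_first = -(A \ (dA*x))                   # first-order prediction -inv(A)*dA*x

# Ratio of the relative change in x to the relative change in A.
amplification = (norm(dx_exact)/norm(x)) / (opnorm(dA)/opnorm(A))

println("difference between exact and first-order dx: ", norm(dx_exact - dx_first))
println("amplification = ", amplification, ",  cond(A) = ", cond(A))
```

For this particular `dA` the amplification stays below $\kappa(A)$; the theorem says that some perturbation attains it.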

If a problem $A\mathbf{x} =\mathbf{b}$ contains an ill-conditioned $A$, one must expect to "lose $\log_{10} \kappa(A)$ digits" of accuracy in computing the solution.
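This rule of thumb is easy to tabulate. A minimal sketch, assuming Julia 1.x with `LinearAlgebra`, that reuses the matrix family `hilb` from the examples above and reports $\log_{10}\kappa(A)$ for a few sizes; the loop bounds are an arbitrary choice.

```julia
using LinearAlgebra

hilb(n) = [1.0/(i+j) for i in 1:n, j in 1:n]   # same matrix family as in the text

for n in 2:2:10
    k = cond(hilb(n))
    println("n = ", n, "   cond = ", k, "   digits at risk = ", round(log10(k), digits=1))
end
```

Since double precision carries about 16 significant digits, a value of $\log_{10}\kappa(A)$ near 16 means that no digits of the computed solution can be trusted.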