# Gaussian elimination


## Overview

The system $Ax=b$ is solved in three steps (_without pivoting_):

1. $A=LU$ (LU factorization, approximately $\frac{2}{3}n^3$ operations),
2. $Ly=b$ (lower triangular system, approximately $n^2$ operations),
3. $Ux=y$ (upper triangular system, approximately $n^2$ operations).

With pivoting, the steps become

1. $PA=LU$,
2. $Ly=Pb$,
3. $Ux=y$.
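
The three steps with pivoting can be sketched with the standard `lu()` function from `LinearAlgebra`. This is only an illustration: the matrix and right-hand side are made up, and `F.p` is the permutation vector of the factorization, so that `b[F.p]` plays the role of $Pb$.

```julia
using LinearAlgebra

A = [2.0 1.0 1.0; 4.0 3.0 3.0; 8.0 7.0 9.0]   # made-up nonsingular matrix
b = [1.0, 2.0, 3.0]

F = lu(A)            # step 1: partial pivoting, A[F.p, :] == F.L * F.U
y = F.L \ b[F.p]     # step 2: L y = P b (b[F.p] applies the row permutation)
x = F.U \ y          # step 3: U x = y

@show norm(A * x - b)
```

In everyday use one would simply write `A \ b` or `F \ b`; the explicit steps are spelled out here only to mirror the list above.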

## Examples

The following examples illustrate two phenomena, one of which we have already seen and one we have not.
Throughout, $\epsilon$ is the value returned by the function `eps()`.

Consider the system of linear equations

\begin{eqnarray*}
\displaystyle\frac{\epsilon}{10} x_1 + x_2 = 1 \\
x_1 + x_2 = 2
\end{eqnarray*}

A good approximate solution is $x_1 = x_2 = 1$. We use the augmented matrix of the system:

\begin{align*}
&\left(\begin{array}{cc|c} \displaystyle\frac{\epsilon}{10} & 1 & 1 \\ 1 & 1 & 2 \end{array} \right) \sim
\left(\begin{array}{cc|c} \displaystyle\frac{\epsilon}{10} & 1 & 1 \\ 0 & 1-\displaystyle\frac{10}{\epsilon} & 2-\displaystyle\frac{10}{\epsilon}\end{array}\right)
\approx \left(\begin{array}{cc|c} \displaystyle\frac{\epsilon}{10} & 1 & 1 \\ 0 & -\displaystyle\frac{10}{\epsilon} & -\displaystyle\frac{10}{\epsilon}\end{array}\right).
\end{align*}

The last transformation is rounding to machine precision $\epsilon$.
The very significant entries "1" and "2" in the last row _were lost in rounding_!

Solving the last triangular system gives $x_1 = 0$, $x_2 = 1$. Substituting into the second equation of the original system gives $x_1+x_2 = 1$ instead of $2$, which is "very inaccurate".

In [1]:
[eps()/10 1;0 1-10/eps()]\[1; 2-10/eps()]

2-element Array{Float64,1}:
 0.0
 1.0
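
The loss of the "1" and the "2" can be checked directly. The following is a small sketch, assuming `Float64` arithmetic: near $10/\epsilon$ adjacent floating-point numbers are 8 apart, so subtracting $10/\epsilon$ from 1 or from 2 gives exactly $-10/\epsilon$.

```julia
ϵ = eps()                        # 2^-52 for Float64
@show eps(10/ϵ)                  # spacing of Float64 numbers near 10/ϵ: 8.0
@show 1 - 10/ϵ == -10/ϵ          # true: the "1" is lost
@show 2 - 10/ϵ == -10/ϵ          # true: the "2" is lost
```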

Here, too, there is a "remedy", called _partial pivoting_: move the entry of largest absolute value among those not yet eliminated in the current column to the pivot (that is, diagonal) position:

\begin{align*}
&\left(\begin{array}{cc|c}     1 & 1 & 2 \\\displaystyle\frac{\epsilon}{10} & 1 & 1 \end{array}\right) \sim
\left(\begin{array}{cc|c}     1 & 1 & 2 \\0                   & 1-\displaystyle\frac{\epsilon}{10}&1-\displaystyle\frac{\epsilon}{10}\end{array}\right)
\approx   \left(\begin{array}{cc|c}     1 & 1 & 2 \\0                   & 1                    &1                    \end{array}\right)
\end{align*}

In [2]:
[1 1;0 1-eps()/10]\[2; 1-eps()/10]

2-element Array{Float64,1}:
 1.0
 1.0

This is the correct solution up to machine precision.

Sometimes changing the algorithm _cannot help at all_! Consider the system

\begin{align*}
(1+2\epsilon)x_1 + (1+2\epsilon)x_2 = 2 \\
(1+\epsilon)x_1 + x_2 =2
\end{align*}

whose augmented matrix is

$$
\left(\begin{array}{cc|c}
(1+2\epsilon)&     (1+2\epsilon )&     2 \\
(1+\epsilon)&     1   &2 \end{array}\right).
$$

Multiply the first row by $\alpha = (1+\epsilon)/(1+2\epsilon)= 1-\epsilon + O(\epsilon^2)$
and subtract it from the second row:

$$
\left(\begin{array}{cc|c}
(1+2\epsilon)&     (1+2\epsilon )&     2 \\
0           &     -\epsilon & 2\epsilon \end{array}\right).
$$

The solution is $x_1 = 4$, $x_2 =-2$, which is correct up to machine precision.

However, a small change in the right-hand side gives

$$
\left(\begin{array}{cc|c}
(1+2\epsilon)&     (1+2\epsilon )&     2+4\epsilon \\
(1+\epsilon)&     1   &2 +\epsilon \end{array}\right).
$$

The exact solution is $x_1=x_2=1$, but because of rounding we obtain $x_1 =0$, $x_2 =2$. __Explain.__
None of the tricks we have seen gives the exact solution.

In [3]:
[1+2*eps() 1+2*eps(); 1+eps() 1]\[2+4*eps(); 2+eps()]

2-element Array{Float64,1}:
 0.0
 2.0

In [4]:
[BigFloat(1)+2*eps() 1+2*eps(); 1+eps() 1]\[BigFloat(2)+4*eps(); 2+eps()]

2-element Array{BigFloat,1}:
 0.0
 2.0

__Reason.__ In IEEE arithmetic $2+\epsilon$ is rounded to $2$, so the system that is actually stored is

 $$
\left(\begin{array}{cc|c}
(1+2\epsilon)&     (1+2\epsilon )&     2+4\epsilon \\
(1+\epsilon)&     1   &2           \end{array}\right)
$$

whose solution is $x_1=0$, $x_2=2$. This problem is very close to the singular system

$$
\left(\begin{array}{cc|c} 1 & 1 & 2 \\ 1 & 1 & 2\end{array}\right)
$$

which has the parametric solutions

$$
\mathbf{x} =\begin{pmatrix} x_1 \\ x_2\end{pmatrix} = \begin{pmatrix} 1\\1\end{pmatrix}+ 
\beta \begin{pmatrix}-1 \\1\end{pmatrix}, \quad \beta \in \mathbb{R}.
$$

Note that $\begin{pmatrix} x_1 \\ x_2\end{pmatrix}= \begin{pmatrix} 1\\1\end{pmatrix}$ and $\begin{pmatrix} x_1 \\ x_2\end{pmatrix}=\begin{pmatrix} 0\\2\end{pmatrix}$ are two of these solutions.
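
The rounding of the right-hand side can be checked directly. The following is a small sketch, assuming `Float64` arithmetic: the numbers $1+\epsilon$, $1+2\epsilon$ and $2+4\epsilon$ are representable, but $2+\epsilon$ lies exactly halfway between two adjacent floating-point numbers and is rounded to $2$.

```julia
ϵ = eps()               # 2^-52 for Float64
@show 1 + ϵ == 1        # false: 1+ϵ is the next Float64 after 1
@show 1 + 2ϵ == 1       # false
@show 2 + 4ϵ == 2       # false: 2+4ϵ is representable
@show 2 + ϵ == 2        # true: the perturbation of the second right-hand side is lost
```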

__Question.__ What is the geometric interpretation of this system?

## LU factorization

In [5]:
function mylu(A₁::Array{T}) where T # Strang, p. 100
    A=copy(A₁)
    n,m=size(A)
    # zero()/one() would also allow block entries, but map(Float64, ...) limits this version to real scalars
    U=map(Float64,[zero(A[1,1]) for i=1:n, j=1:n])
    L=map(Float64,[zero(A[1,1]) for i=1:n, j=1:n])
    for k=1:n
        L[k,k]=one(A[1,1])
        for i=k+1:n
            L[i,k]=A[i,k]/A[k,k]
            for j=k+1:n
                A[i,j]=A[i,j]-L[i,k]*A[k,j]
            end
        end
        for j=k:n
            U[k,j]=A[k,j]
        end
    end
    L,U
end

mylu (generic function with 1 method)

In [6]:
using LinearAlgebra
import Random
Random.seed!(123)
A=rand(6,6)

6×6 Array{Float64,2}:
 0.768448  0.586022   0.865412  0.582318  0.20923   0.48
 0.940515  0.0521332  0.617492  0.255981  0.918165  0.790201
 0.673959  0.26864    0.285698  0.70586   0.614255  0.356221
 0.395453  0.108871   0.463847  0.291978  0.802665  0.900925
 0.313244  0.163666   0.275819  0.281066  0.555668  0.529253
 0.662555  0.473017   0.446568  0.792931  0.940782  0.031831

In [7]:
L,U=mylu(A);

In [8]:
L

6×6 Array{Float64,2}:
 1.0       0.0         0.0         0.0        0.0     0.0
 1.22392   1.0         0.0         0.0        0.0     0.0
 0.877039  0.368849    1.0         0.0        0.0     0.0
 0.514613  0.289733   -0.471902    1.0        0.0     0.0
 0.407632  0.113088    0.0869892   0.215088   1.0     0.0
 0.862199  0.0484897   0.896224   -0.0434479  2.3274  1.0

In [9]:
U

6×6 Array{Float64,2}:
 0.768448   0.586022   0.865412   0.582318  0.20923    0.48
 0.0       -0.665108  -0.441699  -0.456726  0.662085   0.202722
 0.0        0.0       -0.310382   0.363608  0.186542  -0.139531
 0.0        0.0        0.0        0.296226  0.591194   0.52933
 0.0        0.0        0.0        0.0       0.25212    0.20895
 0.0        0.0        0.0        0.0       0.0       -0.730114

In [10]:
L*U-A

6×6 Array{Float64,2}:
 0.0  0.0          0.0          0.0  0.0   0.0
 0.0  0.0          0.0          0.0  0.0   0.0
 0.0  0.0          5.55112e-17  0.0  0.0   0.0
 0.0  1.38778e-17  0.0          0.0  0.0   0.0
 0.0  0.0          0.0          0.0  0.0  -1.11022e-16
 0.0  0.0          0.0          0.0  0.0   0.0

## Triangular systems

In [11]:
function myU(U::Array{T},b₁::Array{T}) where T
    b=copy(b₁)
    n=length(b)
    for i=n:-1:1
       for j=n:-1:i+1
            b[i]=b[i]-U[i,j]*b[j]
       end
        b[i]=b[i]/U[i,i]
    end
    b
end

function myL(L::Array{T},b₁::Array{T}) where T
    b=copy(b₁)
    n=length(b)
    for i=1:n
        for j=1:i-1
            b[i]=b[i]-L[i,j]*b[j]
        end
        b[i]=b[i]/L[i,i]
    end
    b
end

myL (generic function with 1 method)

In [12]:
b=rand(6)

6-element Array{Float64,1}:
 0.9006814789827005
 0.9402992421257947
 0.6213787149845327
 0.34817276542456277
 0.57061318926682
 0.20399662276026764

In [13]:
# Solve the system using the built-in function
x=A\b

6-element Array{Float64,1}:
  2.724177128392811
  4.635862756193173
 -3.511396551245912
 -2.6333845620867318
 -0.19586385967020775
  1.4663093725368699

In [14]:
# Solve the system using our functions
y=myL(L,b)

6-element Array{Float64,1}:
  0.9006814789827005
 -0.16205875724336183
 -0.10877893946554916
 -0.11970879824874248
  0.25700386570333467
 -1.0705725308278367

In [15]:
x₁=myU(U,y)

6-element Array{Float64,1}:
  2.724177128392812
  4.635862756193177
 -3.5113965512459138
 -2.6333845620867335
 -0.1958638596702071
  1.46630937253687

In [16]:
# Compare the solutions
x-x₁

6-element Array{Float64,1}:
 -8.881784197001252e-16
 -3.552713678800501e-15
  1.7763568394002505e-15
  1.7763568394002505e-15
 -6.38378239159465e-16
 -2.220446049250313e-16

## Speed

The program `mylu()` is slow. Among other things, it allocates three matrices unnecessarily and does not work with block matrices.

The program can be reformulated so that both $L$ and $U$ are stored in the array $A$; the diagonal of $L$ is not stored, since all its entries equal 1 (see [Introduction to Linear Algebra, p. 100][St09]):

[St09]: https://books.google.hr/books?id=M19gPgAACAAJ&dq=strang%20introduction&hl=hr&source=gbs_book_other_versions "Gilbert Strang, 'Introduction to Linear Algebra, 4th Edition', Wellesley-Cambridge Press, 2009"


In [17]:
function mylu₁(A₁::Array{T}) where T # Strang, p. 100
    A=copy(A₁)
    n,m=size(A)
    for k=1:n-1
        ρ=k+1:n
        A[ρ,k]=A[ρ,k]/A[k,k]
        A[ρ,ρ]=A[ρ,ρ]-A[ρ,k]*A[k,ρ]'
    end
    A
end

mylu₁ (generic function with 1 method)

In [18]:
mylu₁(A)

6×6 Array{Float64,2}:
 0.768448   0.586022    0.865412    0.582318   0.20923    0.48
 1.22392   -0.665108   -0.441699   -0.456726   0.662085   0.202722
 0.877039   0.368849   -0.310382    0.363608   0.186542  -0.139531
 0.514613   0.289733   -0.471902    0.296226   0.591194   0.52933
 0.407632   0.113088    0.0869892   0.215088   0.25212    0.20895
 0.862199   0.0484897   0.896224   -0.0434479  2.3274    -0.730114

In [19]:
L

6×6 Array{Float64,2}:
 1.0       0.0         0.0         0.0        0.0     0.0
 1.22392   1.0         0.0         0.0        0.0     0.0
 0.877039  0.368849    1.0         0.0        0.0     0.0
 0.514613  0.289733   -0.471902    1.0        0.0     0.0
 0.407632  0.113088    0.0869892   0.215088   1.0     0.0
 0.862199  0.0484897   0.896224   -0.0434479  2.3274  1.0

In [20]:
U

6×6 Array{Float64,2}:
 0.768448   0.586022   0.865412   0.582318  0.20923    0.48
 0.0       -0.665108  -0.441699  -0.456726  0.662085   0.202722
 0.0        0.0       -0.310382   0.363608  0.186542  -0.139531
 0.0        0.0        0.0        0.296226  0.591194   0.52933
 0.0        0.0        0.0        0.0       0.25212    0.20895
 0.0        0.0        0.0        0.0       0.0       -0.730114

Let us compare the speed of the LAPACK-based `lu()` with that of our naive `mylu₁()` on a larger matrix.

Run each cell a few times to get a more reliable timing, since the first run also includes compilation time.

In [21]:
n=512
A=rand(n,n);

In [23]:
@time lu(A);

  0.016983 seconds (4 allocations: 2.004 MiB)


In [25]:
@time mylu₁(A);

  0.627635 seconds (5.49 k allocations: 1.003 GiB, 14.65% gc time)


### Block variant

`mylu()` and `mylu₁()` are several tens of times slower than `lu()`.

Let us rewrite `mylu₁()` to work with blocks (still without pivoting!):

In [26]:
function mylu₂(A₁::Array{T}) where T # Strang, page 100
    A=copy(A₁)
    n,m=size(A)
    for k=1:n-1
        for ρ=k+1:n
            A[ρ,k]=A[ρ,k]/A[k,k]
            for l=k+1:n
                A[ρ,l]=A[ρ,l]-A[ρ,k]*A[k,l]
            end
        end
    end
    A
end

mylu₂ (generic function with 1 method)

First, a small test:

In [27]:
k,l=2,4
Ab=[rand(k,k) for i=1:l, j=1:l];

In [28]:
A₀=mylu₂(Ab)

4×4 Array{Array{Float64,2},2}:
 [0.19178 0.0976698; 0.234544 0.627093]  …  [0.295925 0.164099; 0.942843 0.830942]
 [5.80254 -0.70825; 3.25685 -0.331261]      [-0.339473 -0.265039; -0.232242 0.107999]
 [3.23452 -0.450395; 2.34086 0.764213]      [0.0241067 1.07794; -0.633465 0.531221]
 [0.838168 0.600568; 0.876648 0.608517]     [-0.124081 0.60866; -0.886185 0.597327]

In [29]:
# Verification
U=triu(A₀)
L=tril(A₀)
for i=1:maximum(size(L))
    L[i,i]=Matrix{Float64}(I,size(L[1,1])) # eye(L[1,1])
end

In [30]:
Rezidual=L*U-Ab

4×4 Array{Array{Float64,2},2}:
 [0.0 0.0; 0.0 0.0]                    …  [0.0 0.0; 0.0 0.0]
 [0.0 -1.11022e-16; 0.0 -2.77556e-17]     [0.0 0.0; 0.0 0.0]
 [0.0 0.0; 0.0 0.0]                       [0.0 0.0; 0.0 0.0]
 [0.0 -5.55112e-17; 0.0 -5.55112e-17]     [0.0 0.0; 0.0 0.0]

In [31]:
# Convert a block matrix to an ordinary matrix
unblock(A) = mapreduce(identity, hcat, [mapreduce(identity, vcat, A[:,i]) 
        for i = 1:size(A,2)])

unblock (generic function with 1 method)

In [32]:
norm(unblock(Rezidual))

3.444376352465766e-16

Let us try larger dimensions ($n=k\cdot l$).

In [33]:
k,l=32,16 # 64, 8
Ab=[rand(k,k) for i=1:l, j=1:l];

In [35]:
@time mylu₂(Ab);

  0.078140 seconds (3.32 k allocations: 22.582 MiB, 9.96% gc time)


We see that `mylu₂()` is almost as fast as `lu()` running on a single core (in the timings above it is within a factor of about five), keeping in mind that `mylu₂()` has no built-in pivoting.
The program is still not optimal because it allocates too much memory.
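
One way to avoid the extra allocations for ordinary (non-block) matrices is to overwrite the input and update it with scalar loops. The following is a minimal sketch under that assumption; the name `mylu₃!` is hypothetical and, like the versions above, it does no pivoting.

```julia
using LinearAlgebra

# Hypothetical in-place variant (not from the notebook): overwrites A with the
# packed factors, strictly lower part = L (unit diagonal implied), upper part = U.
function mylu₃!(A::Matrix{Float64})
    n = size(A, 1)
    for k = 1:n-1
        pivot = A[k, k]              # no pivoting: assumes pivot != 0
        for i = k+1:n
            A[i, k] /= pivot         # multipliers, stored below the diagonal
        end
        for j = k+1:n                # update column by column (column-major order)
            akj = A[k, j]
            for i = k+1:n
                A[i, j] -= A[i, k] * akj
            end
        end
    end
    A
end

A = rand(6, 6) + 6I                  # diagonally dominant, so no pivoting is needed
F = mylu₃!(copy(A))
L = tril(F, -1) + I                  # unpack the unit lower triangular factor
U = triu(F)
@show norm(L * U - A)
```

The only allocation left is the explicit `copy(A)` made by the caller; the loops themselves work on scalars.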

## Pivoting

Standard implementations always compute Gaussian elimination with _partial pivoting_:
at each step the rows are permuted so that the pivot has the largest absolute value in the current column. This ensures

$$|L_{ij}| \leq 1,$$

which in practice is sufficient to prevent element growth.
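
The row exchanges can be added to the packed LU loop in a few lines. The following is a minimal sketch, not the LAPACK implementation; the name `mylup` and the choice of returning the permutation as a vector `p` (so that `A[p, :] == L*U`) are assumptions made for illustration.

```julia
using LinearAlgebra

# Hypothetical helper (not part of the notebook): LU with partial pivoting.
function mylup(A₁::AbstractMatrix{<:Real})
    A = float(copy(A₁))
    n = size(A, 1)
    p = collect(1:n)                          # row permutation as a vector
    for k = 1:n-1
        # row index of the largest |entry| in column k, on or below the diagonal
        imax = k - 1 + argmax(abs.(A[k:n, k]))
        if imax != k
            A[[k, imax], :] = A[[imax, k], :] # swap whole rows (multipliers included)
            p[k], p[imax] = p[imax], p[k]
        end
        ρ = k+1:n
        A[ρ, k] = A[ρ, k] / A[k, k]           # multipliers, all of modulus <= 1
        A[ρ, ρ] = A[ρ, ρ] - A[ρ, k] * A[k, ρ]'
    end
    L = tril(A, -1) + I
    U = triu(A)
    L, U, p
end

A = [0.00003 1; 2 3]
L, U, p = mylup(A)
@show p
@show maximum(abs.(L))
@show norm(L * U - A[p, :])
```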

In [36]:
A=[0.00003 1;2 3]

2×2 Array{Float64,2}:
 3.0e-5  1.0
 2.0     3.0

In [37]:
L,U=mylu(A)
L

2×2 Array{Float64,2}:
     1.0  0.0
 66666.7  1.0

In [38]:
U

2×2 Array{Float64,2}:
 3.0e-5       1.0
 0.0     -66663.7

In [39]:
# with pivoting
P=[0 1;1 0]
L,U=mylu(P*A)
L

2×2 Array{Float64,2}:
 1.0     0.0
 1.5e-5  1.0

In [40]:
U

2×2 Array{Float64,2}:
 2.0  3.0
 0.0  0.999955

In [41]:
L*U-P*A

2×2 Array{Float64,2}:
 0.0  0.0
 0.0  0.0

In [42]:
# Random matrix, standard function lu()
Random.seed!(248)
A=rand(5,5)
L,U,P=lu(A);

In [43]:
P

5-element Array{Int64,1}:
 3
 4
 5
 2
 1

In [44]:
L

5×5 Array{Float64,2}:
 1.0        0.0         0.0       0.0       0.0
 0.0820339  1.0         0.0       0.0       0.0
 0.437335   0.0959387   1.0       0.0       0.0
 0.928548   0.0219166   0.150299  1.0       0.0
 0.386654   0.74805    -0.562755  0.803388  1.0

In [45]:
U

5×5 Array{Float64,2}:
 0.740619  0.105456  0.288167  0.0134151   0.810915
 0.0       0.939039  0.427963  0.871155    0.820178
 0.0       0.0       0.697849  0.776953    0.204272
 0.0       0.0       0.0       0.649058   -0.21677
 0.0       0.0       0.0       0.0         0.183107

In [46]:
L*U-A[P,:]

5×5 Array{Float64,2}:
 0.0  0.0           0.0           0.0          0.0
 0.0  0.0           0.0           0.0          0.0
 0.0  0.0          -1.11022e-16   0.0          0.0
 0.0  0.0           0.0           0.0          0.0
 0.0  1.11022e-16   6.93889e-18  -1.11022e-16  0.0

### Complete pivoting

The following program computes Gaussian elimination with _complete pivoting_: at each step,
rows and columns are swapped so that the entry of largest absolute value in the current
submatrix is brought to the pivot position.

In [47]:
eye(n,m)=Matrix{Float64}(I,n,m)
function gecp(A1::Array{T}) where T
    # Gaussian elimination with complete pivoting
    # Output: Pr*L*U*Pc'=A, or equivalently Pr'*A*Pc=L*U
    A=deepcopy(A1)
    n,m=size(A)
    Pr=eye(n,n)
    Pc=eye(n,n)
    D=zeros(n)
    for i=1:n-1
        amax,indm=findmax(abs.(A[i:n,i:n]))
        imax=indm[1]+i-1
        jmax=indm[2]+i-1
        # swap rows
        if (imax != i)
            temp = Pr[:,i]
            Pr[:,i] = Pr[:,imax]
            Pr[:,imax] = temp
            temp = A[i,:]
            A[i,:] = A[imax,:]
            A[imax,:] = temp
        end
        # swap columns
        if (jmax != i)
            temp = Pc[:,i]
            Pc[:,i] = Pc[:,jmax]
            Pc[:,jmax] = temp
            temp = A[:,i]
            A[:,i] = A[:,jmax]
            A[:,jmax] = temp
        end
        # elimination
        D[i]=A[i,i]
        A[i+1:n,i] = A[i+1:n,i]/D[i]
        A[i+1:n,i+1:n] = A[i+1:n,i+1:n] - A[i+1:n,i]*A[i,i+1:n]'
        A[i,i+1:n]=A[i,i+1:n]/D[i]
    end
    D[n]=A[n,n]
    L=eye(n,n)+tril(A,-1)
    U=eye(n,n)+triu(A,1)
    U=diagm(0=>D)*U
    L,U,Pr,Pc
end

gecp (generic function with 1 method)

In [48]:
n=5
A=rand(n,n)
b=rand(n)

5-element Array{Float64,1}:
 0.9665983607050483
 0.43239336629240754
 0.2991638766124043
 0.7502210485360699
 0.15514720539742077

In [49]:
L,U,Pr,Pc=gecp(A);

In [50]:
Pr

5×5 Array{Float64,2}:
 0.0  0.0  1.0  0.0  0.0
 1.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  1.0  0.0
 0.0  1.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  1.0

In [51]:
Pr*L*U*Pc'-A

5×5 Array{Float64,2}:
 0.0   5.20417e-18  0.0  0.0   0.0
 0.0   0.0          0.0  0.0   0.0
 0.0  -5.55112e-17  0.0  0.0   0.0
 0.0   0.0          0.0  0.0   0.0
 0.0   0.0          0.0  0.0  -1.11022e-16

In [52]:
y=myL(L,Pr'*b)

5-element Array{Float64,1}:
  0.43239336629240754
  0.4176544037107001
  0.7511943913730601
 -0.2752062137196889
 -0.23700743414336084

In [53]:
z=myU(U,y)

5-element Array{Float64,1}:
  1.664033297394591
  2.289212897580302
 -0.2195060430755223
 -0.4597332965577868
 -5.369713074556756

In [54]:
x=Pc*z

5-element Array{Float64,1}:
  1.664033297394591
  2.289212897580302
 -5.369713074556756
 -0.4597332965577868
 -0.2195060430755223

In [55]:
A*x-b

5-element Array{Float64,1}:
 -6.661338147750939e-16
 -3.885780586188048e-16
 -1.1102230246251565e-16
 -3.3306690738754696e-16
 -3.885780586188048e-16

## Accuracy

Consider the system $Ax=b$, where the matrix $A$ is nonsingular.

To apply the concepts from the notebook
[NA04 Pogreska unatrag i stabilni algoritmi](NA04%20Pogreska%20unatrag%20i%20stabilni%20algoritmi.ipynb) (backward error and stable algorithms), we need to:

1. develop the perturbation theory for the given problem,
2. analyze the errors of the algorithm (Gaussian elimination).

### Perturbation theory

Let

$$
(A+\delta A)\hat x=(b+\delta b)
$$

for some $\hat x=x+\delta x$.

We want to bound

$$
\frac{\| \hat x - x \|}{\| x\|} \equiv \frac{\| \delta x\|}{\| x\|}.
$$

Introduce the notation (see, e.g., [Matrix Computations, Section 2.6.2][GVL13])

$$
\delta A=\varepsilon F, \quad \delta b=\varepsilon f, \qquad \hat x=x(\varepsilon),
$$

for some (unknown) matrix $F$ and vector $f$, which gives the one-parameter problem

$$
(A+\varepsilon F)\,x(\varepsilon)=b+\varepsilon f.
$$

Differentiating with respect to $\varepsilon$ gives

$$
Fx(\varepsilon)+(A+\varepsilon F)\, \dot x(\varepsilon)=f.
$$

Setting $\varepsilon=0$ gives

$$
F x+A\dot x(0)=f,
$$

that is,

$$
\dot x(0)=A^{-1}(f-Fx).
$$

The Taylor expansion about $\varepsilon=0$ reads

$$
x(\varepsilon)=x(0)+\varepsilon \dot x(0) +O(\varepsilon^2),
$$

or, neglecting the $O(\varepsilon^2)$ term,

$$
\hat x-x=\varepsilon A^{-1}(f-Fx)=A^{-1} (\varepsilon f - \varepsilon F x) = A^{-1} (\delta b - \delta A\, x).
$$

The properties of the norm imply

$$
\| \hat x-x\|\leq \| A^{-1} \| (\| \delta b \|  + \| \delta A \| \cdot \|  x\| ).
$$

Finally, since $\| b\| \leq \| A\| \| x\|$, we have

$$
\frac{\| \hat x-x\|}{\| x\|}\leq \| A\|  \cdot \| A^{-1} \| \bigg(\frac{\| \delta b \|}{\|b\|}  + \frac{\| \delta A \|}{ \|  A\|} \bigg). \tag{1}
$$
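
The first-order formula $\hat x-x\approx A^{-1}(\delta b-\delta A\,x)$ behind this bound can be checked numerically. The following is a small sketch with made-up data (a well-conditioned $2\times 2$ matrix and perturbations of size $10^{-6}$); the discrepancy should be of second order, that is, around $10^{-12}$.

```julia
using LinearAlgebra

A  = [4.0 1.0; 2.0 3.0]                 # made-up, well-conditioned matrix
b  = [1.0, 2.0]
x  = A \ b

δA = 1e-6 * [1.0 -2.0; 0.5 1.0]         # made-up small perturbations
δb = 1e-6 * [1.0, -1.0]
x̂  = (A + δA) \ (b + δb)

firstorder = A \ (δb - δA * x)          # A^{-1}(δb - δA x)
@show norm(x̂ - x)
@show norm((x̂ - x) - firstorder)        # second-order small
```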

The number

$$
\kappa(A)\equiv \| A\|  \cdot \| A^{-1} \|
$$

is the __condition number__ of the matrix $A$. It tells us by how much relative perturbations in the input data (the matrix $A$ and the vector $b$) can be amplified in the relative error of the solution.
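
As a quick illustration of the definition, the condition number in the spectral norm can be computed from $\|A\|\cdot\|A^{-1}\|$ and compared with the built-in `cond()`. This is only a sketch: forming `inv(A)` explicitly is done here for illustration and is not how one would compute $\kappa(A)$ in practice.

```julia
using LinearAlgebra

A = [0.234 0.458; 0.383 0.750]
κ = opnorm(A) * opnorm(inv(A))   # opnorm is the induced 2-norm; norm(A) would be the Frobenius norm
@show κ
@show cond(A)                    # uses the same 2-norm by default
```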

Let us look at an example from [Numerička matematika, p. 42][RS04]:


[GVL13]: https://books.google.hr/books?id=X5YfsuCWpxMC&printsec=frontcover&hl=hr#v=onepage&q&f=false "G. Golub and C. F. Van Loan, 'Matrix Computations', 4th Edition, Johns Hopkins University Press, Baltimore, 2013"

[RS04]: http://www.mathos.unios.hr/pim/Materijali/Num.pdf "R. Scitovski, 'Numerička matematika', Sveučilište u Osijeku, Osijek, 2004."

In [56]:
A= [0.234 0.458; 0.383 0.750]

2×2 Array{Float64,2}:
 0.234  0.458
 0.383  0.75

In [57]:
b=[0.224;0.367]

2-element Array{Float64,1}:
 0.224
 0.367

In [58]:
x=A\b

2-element Array{Float64,1}:
 -1.0000000000002423
  1.0000000000001237

In [59]:
δb=[0.00009; 0.000005]
x1=A\(b+δb)

2-element Array{Float64,1}:
 -0.24174418604640305
  0.6127906976743631

In [60]:
cond(A), norm(δb)/norm(b), norm(x1-x)/norm(x)

(11322.197586092605, 0.0002096449170953002, 0.6020311134825742)

In [61]:
δA=[-0.001 0;0 0]
x2=(A+δA)\b

2-element Array{Float64,1}:
 0.12951807228916615
 0.42319277108433245

In [62]:
cond(A), norm(δA)/norm(A), norm(x2-x)/norm(x)

(11322.197586092605, 0.0010134105190591591, 0.896804787832142)
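
The two experiments can be compared with the bound (1) directly. The following is a sketch that repeats the data above and uses the 2-norm throughout (`opnorm` for matrices, since `norm(A)` is the Frobenius norm). Recall that (1) is a first-order bound, so for the relatively large $\delta A$ used here it is only indicative.

```julia
using LinearAlgebra

A  = [0.234 0.458; 0.383 0.750]
b  = [0.224, 0.367]
x  = A \ b
κ  = cond(A)

δb = [0.00009, 0.000005]
x1 = A \ (b + δb)
lhs_b = norm(x1 - x) / norm(x)          # observed relative error
rhs_b = κ * norm(δb) / norm(b)          # right-hand side of (1) with δA = 0

δA = [-0.001 0; 0 0]
x2 = (A + δA) \ b
lhs_A = norm(x2 - x) / norm(x)
rhs_A = κ * opnorm(δA) / opnorm(A)      # right-hand side of (1) with δb = 0

@show lhs_b, rhs_b
@show lhs_A, rhs_A
```

In both cases the observed relative error stays below the bound, but it is several orders of magnitude larger than the relative perturbation itself.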

### Error of Gaussian elimination

According to [Matrix Computations, Section 3.3][GVL13], the computed factors
$\hat L$ and $\hat U$ satisfy

$$
\hat L\cdot \hat U = A+\delta A
$$

where (the inequality is understood elementwise, and $\varepsilon$ now denotes the machine precision)

$$
| \delta A|\leq 3(n-1) \varepsilon (|A|+|\hat L| \cdot |\hat U|) +O(\varepsilon^2).
$$

Neglecting the $O(\varepsilon^2)$ term and taking norms gives

$$
\|\delta A \| \lesssim O(n)\varepsilon (\| A\| + \| \hat L\| \cdot \| \hat U\|),
$$

so

$$
 \frac{\|\delta A \|}{\|A\|} \lesssim O(n)\varepsilon \bigg(1+\frac{\| \hat L\| \cdot \| \hat U\|}{\|A\|}\bigg).
$$

If Gaussian elimination is performed with pivoting, the last quotient will most likely also be small
($\approx 1$). Moreover, the error in solving the triangular systems is no larger than the one above, so substituting into (1)
shows that the relative error of the computed solution satisfies

$$
\frac{\| \hat x-x\|}{\| x\|}\leq \kappa(A) O(n\varepsilon).
$$

We conclude:

> _If the condition number of the matrix is large, the solution may be inaccurate._
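
The quotient $\|\hat L\|\cdot\|\hat U\|/\|A\|$ from the bound above can be computed for the small matrix used in the pivoting section. The following is a sketch: `lunopivot` and `growth` are hypothetical helper names, and the 2-norm (`opnorm`) is an arbitrary choice of norm.

```julia
using LinearAlgebra

# Hypothetical helper: LU factorization without pivoting.
function lunopivot(A::Matrix{Float64})
    n = size(A, 1)
    L = Matrix{Float64}(I, n, n)
    U = copy(A)
    for k = 1:n-1, i = k+1:n
        L[i, k] = U[i, k] / U[k, k]
        U[i, k+1:n] -= L[i, k] * U[k, k+1:n]
        U[i, k] = 0
    end
    L, U
end

growth(L, U, A) = opnorm(L) * opnorm(U) / opnorm(A)

A = [0.00003 1; 2 3]
L₀, U₀ = lunopivot(A)
F = lu(A)                          # with partial pivoting
@show growth(L₀, U₀, A)            # very large without pivoting
@show growth(F.L, F.U, A)          # close to 1 with pivoting
```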


In [63]:
n=10
v=rand(n)

10-element Array{Float64,1}:
 0.6614350593639147
 0.48650890125687063
 0.41728889115919254
 0.38869327482214744
 0.596574907748231
 0.07592661490035035
 0.4860952120877726
 0.46900411658934393
 0.528264975117307
 0.7552505181107274

In [64]:
# Vandermonde matrices have large condition numbers.
V=Array{Float64}(undef,n,n)
for i=1:n
    V[:,i]=v.^(i-1)
end
V=V'

10×10 Adjoint{Float64,Array{Float64,2}}:
 1.0        1.0         1.0          …  1.0         1.0         1.0
 0.661435   0.486509    0.417289        0.469004    0.528265    0.755251
 0.437496   0.236691    0.17413         0.219965    0.279064    0.570403
 0.289375   0.115152    0.0726625       0.103164    0.14742     0.430797
 0.191403   0.0560226   0.0303213       0.0483845   0.0778767   0.32536
 0.126601   0.0272555   0.0126527    …  0.0226925   0.0411395   0.245728
 0.0837381  0.01326     0.00527984      0.0106429   0.0217326   0.185586
 0.0553873  0.00645113  0.00220322      0.00499156  0.0114806   0.140164
 0.0366351  0.00313853  0.000919379     0.00234106  0.00606477  0.105859
 0.0242318  0.00152692  0.000383647     0.00109797  0.00320381  0.0799502

In [65]:
bᵥ=rand(n)

10-element Array{Float64,1}:
 0.33104889180804165
 0.8560704147914053
 0.7138667824894915
 0.39559547158529984
 0.2037555212946418
 0.6627753767486095
 0.45341074074282517
 0.23021537565747785
 0.311920548800684
 0.31094083311233645

In [66]:
xᵥ=V\bᵥ

10-element Array{Float64,1}:
    -1.9505175985107422e6
     1.2442697758199965e11
    -1.6094317687911946e8
     3.420574394963029e7
     1.988657200013696e7
 -2127.8676386500338
    -1.2676305033862157e11
     2.7705212826177187e9
    -3.256952028159136e8
 50183.54666973968

In [67]:
cond(V)

1.804967496870472e13

In [68]:
Vbig=map(BigFloat,V)
bbig=map(BigFloat,bᵥ)
xbig=Vbig\bbig;

In [70]:
map(Float64,norm(xbig-xᵥ)/norm(xbig))

5.8453671489411944e-5

### Artificially bad conditioning

In [71]:
A=[1 1; 1 2]
b=[1;3]
x=A\b
@show x,cond(A)
A₁=[1e-4 1e-4;1 2]
b₁=[1e-4;3]
x₁=A₁\b₁
x,cond(A₁),x-x₁

(x, cond(A)) = ([-1.0, 2.0], 6.854101966249685)


([-1.0, 2.0], 50000.00017991671, [8.881784197001252e-16, -4.440892098500626e-16])

### Estimating the condition number

Computing the condition number from the definition $\kappa(A)=\|A\| \cdot \|A^{-1}\|$ requires the inverse matrix, which takes $O(n^3)$ operations. This is the same order of magnitude as solving the system itself. While solving the system we already have the triangular factors $L$ and $U$ at our disposal, and these can be used to estimate the condition number in $O(n^2)$ operations.
Details can be found in [Matrix Computations, Section 3.5.4][GVL13].
The LAPACK routine
[dtrcon.f](http://www.netlib.org/lapack/explore-html/d9/d84/dtrcon_8f_source.html) estimates the reciprocal of the condition number of a triangular matrix.

Let us estimate the condition number of the Vandermonde matrix from the previous example.



In [72]:
?LAPACK.trcon!

```
trcon!(norm, uplo, diag, A)
```

Finds the reciprocal condition number of (upper if `uplo = U`, lower if `uplo = L`) triangular matrix `A`. If `diag = N`, `A` has non-unit diagonal elements. If `diag = U`, all diagonal elements of `A` are one. If `norm = I`, the condition number is found in the infinity norm. If `norm = O` or `1`, the condition number is found in the one norm.


In [73]:
L,U=lu(V);

In [74]:
cond(V,1),cond(L,1),cond(U,1)

(1.6503694453980852e13, 37.48682914049708, 6.680315740239664e12)

In [75]:
1 ./LAPACK.trcon!('O','L','U',L),1 ./LAPACK.trcon!('O','U','N',U)

(37.48682914049708, 6.680315740239664e12)

## Residual


The computed solution $\hat x$ of the system $Ax=b$ is the exact solution of some nearby system (see [Afternotes on Numerical Analysis, p. 128][Ste96]):


$$ 
(A+\delta A)\,\hat x=b. \tag{1}
$$

The __residual__ is defined as

$$
r=b-A\hat x.
$$

Then

$$
0=b-(A+\delta A)\,\hat x=r- \delta A\,\hat x
$$

so

$$ 
\| r\| = \| \delta A\,\hat x \| \leq \| \delta A\| \cdot \|\hat x \|,
$$

that is,

$$
\frac{\|  \delta A\|}{\|A \|} \geq \frac{\|r\|}{\| A\| \cdot \|\hat x \|}.
$$

Therefore, if the _relative residual_

$$ \frac{r}{\| A\| \cdot \|\hat x \|}$$

has a large norm, then _the solution was not computed stably_.

On the other hand, if the relative residual has a small norm, then _the solution was computed stably_. Indeed, for

$$
\delta A=\frac{r\hat x^T}{\|\hat x\|^2}
$$

(1) holds:

$$
b-(A+\delta A)\hat x=(b-A\hat x)-\delta A \hat x = r-\frac{r\hat x^T \hat x}{\|\hat x\|^2}
= r-\frac{r \,\|\hat x\|^2}{\|\hat x\|^2}=r-r=0.
$$

We also have

$$
\frac{\|  \delta A\|}{\|A \|}  \leq  \frac{\|r\|\|\hat x \|}{\| A\| \cdot \|\hat x \|^2}=
\frac{\|r\|}{\| A\| \cdot \|\hat x \|}.
$$
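
The construction of $\delta A$ can be tried out numerically. The following is a sketch with a deliberately inexact, made-up "computed" solution $\hat x$: it checks that $\hat x$ solves the perturbed system exactly (up to rounding) and that $\|\delta A\|_2=\|r\|/\|\hat x\|$.

```julia
using LinearAlgebra

A = [0.234 0.458; 0.383 0.750]
b = [0.224, 0.367]
x̂ = [-0.9, 0.95]                 # made-up approximate solution

r  = b - A * x̂                   # residual
δA = r * x̂' / norm(x̂)^2          # rank-one perturbation from the text

@show norm((A + δA) * x̂ - b)     # zero up to rounding
@show opnorm(δA), norm(r) / norm(x̂)
@show norm(r) / (opnorm(A) * norm(x̂))   # norm of the relative residual
```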

Let us compute the residuals for the previous example of dimension $2$:


[Ste96]: https://books.google.hr/books?id=w-2PWh01kWcC&printsec=frontcover&hl=hr#v=onepage&q&f=false    "G. W. Stewart, 'Afternotes on Numerical Analysis', SIAM, Philadelphia, 1996"

In [76]:
r=b-A*x

2-element Array{Float64,1}:
 0.0
 0.0

In [77]:
norm(r)/(norm(A)*norm(x))

0.0

In [78]:
r₁=b₁-A₁*x₁

2-element Array{Float64,1}:
 4.0657581468206416e-20
 0.0

In [79]:
norm(r₁)/(norm(A₁)*norm(x₁))

8.131516277378246e-21

Let us compute the residual for the Vandermonde system:

In [80]:
rᵥ=bᵥ-V*xᵥ

10-element Array{Float64,1}:
 -3.2326922330128127e-6
  6.154785454404177e-6
 -1.954731272624244e-6
  4.324341323247438e-6
 -1.7407817809456105e-7
  2.618816065780294e-6
  8.780382501072381e-7
  1.6795204071939906e-6
 -5.394571205297183e-7
 -5.536515290671673e-8

In [81]:
norm(rᵥ)/(norm(V)*norm(xᵥ))

1.343381577803109e-17

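We conclude that the solution $x_v$ was computed stably, that is, with a very small backward error in the input data. This still does not mean that the solution has high relative accuracy: the comparison with the `BigFloat` solution above gave a relative error of about $5.8\cdot 10^{-5}$.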
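
The gap between backward and forward error can be reproduced with fixed data. The following is a sketch that assumes equally spaced nodes and a simple right-hand side instead of the random ones used above, so the numbers will differ from those in the notebook.

```julia
using LinearAlgebra

n = 10
v = collect(range(0.1, stop=1.0, length=n))   # fixed nodes instead of rand(n)
V = [v[j]^(i-1) for i = 1:n, j = 1:n]         # row i holds the powers v.^(i-1)
b = collect(1.0:n)
x = V \ b

backward = norm(b - V * x) / (opnorm(V) * norm(x))   # norm of the relative residual
xbig = big.(V) \ big.(b)                             # reference solution in higher precision
forward = Float64(norm(xbig - x) / norm(xbig))       # relative error of x

@show cond(V)
@show backward
@show forward
```

The backward error is expected to be at the level of the machine precision, while the forward error can be larger by a factor of up to roughly $\kappa(V)$.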