# LU and PLU Decompositions

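Gaussian elimination without pivoting can be interpreted as factorizing $A$ into a lower triangular matrix $L$ and an upper triangular matrix $U$: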
$$
A = L U = \begin{bmatrix} \times \\ \vdots & \ddots \\ \times & \cdots & \times
\end{bmatrix}  \begin{bmatrix} \times & \cdots & \times \\ & \ddots & \vdots \\ && \times
\end{bmatrix}
$$
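Gaussian elimination with row pivoting can be interpreted as the same factorization applied to a row-permuted matrix, $P A = L U$, where $P = P_\sigma$ is a permutation matrix: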
$$
A = P^\top L U = P_\sigma^\top\begin{bmatrix} \times \\ \vdots & \ddots \\ \times & \cdots & \times
\end{bmatrix}  \begin{bmatrix} \times & \cdots & \times \\ & \ddots & \vdots \\ && \times
\end{bmatrix}
$$
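The relation $A = P^\top L U$ can be checked numerically. The following is a minimal sketch, assuming a small matrix `M` chosen here purely for illustration; it uses the `L`, `U`, `p` and `P` fields of the factorization object returned by `lu`.

```julia
using LinearAlgebra

# Small illustrative matrix (an assumption for this sketch, not data from the text)
M = [2.0 1 1;
     2   4 9;
     3   2 3]

F = lu(M)                        # row pivoting is the default

# F.p stores the row permutation as a vector, F.P as a permutation matrix
@show F.p
@show F.L * F.U ≈ M[F.p, :]      # P*M = L*U
@show F.P' * F.L * F.U ≈ M       # M = Pᵀ*L*U
```

Storing the permutation as the vector `F.p` avoids forming an $n \times n$ matrix just to reorder rows.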

In [3]:
using LinearAlgebra, BenchmarkTools

In [6]:
n = 100
A = randn(n,n)

@btime qr(A);

  203.301 μs (7 allocations: 134.55 KiB)


In [7]:
@btime lu(A); # returns the PLU Decomposition

  55.969 μs (4 allocations: 79.08 KiB)


In [10]:
b = rand(n);

In [12]:
norm(lu(A) \ b - A \ b) # A \ b itself uses a pivoted LU for a general square matrix, so the results agree exactly

0.0

In [13]:
norm(qr(A) \ b - A \ b)

8.597325713189271e-14

In [14]:
norm(lu(A) \ b - big.(A) \ b) # compare against a BigFloat reference solution

1.425358539790752198650802673203131638232571083112364022139213085517791007107324e-13

In [16]:
norm(qr(A) \ b - big.(A) \ b)

7.927522338547764273693946389183225501504147654085037810246615786067216082179432e-14

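Conclusion:

On this example, the PLU decomposition is roughly 3.6x faster than QR (about 56 μs versus 203 μs) and only slightly less accurate (an error of about 1.4e-13 versus 7.9e-14, measured against the `BigFloat` solution).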

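**WARNING** There is an extremely small chance that PLU will give very inaccurate results: for rare matrices the entries of $U$ grow exponentially with $n$ during elimination, even with row pivoting. QR does not suffer from this problem.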
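One classical matrix that triggers this growth has ones on the diagonal, $-1$ below the diagonal and ones in the last column. The sketch below is illustrative only: `growth_matrix` is a hypothetical helper written for this example, and the size `m = 60` is an arbitrary choice. It compares the errors of the PLU and QR solves against a known solution `x`.

```julia
using LinearAlgebra

# Hypothetical helper (not part of LinearAlgebra): ones on the diagonal,
# -1 below the diagonal, ones in the last column.
function growth_matrix(m)
    W = Matrix(1.0I, m, m)
    for j in 1:m-1
        W[j+1:end, j] .= -1.0
    end
    W[:, end] .= 1.0
    return W
end

m = 60                           # arbitrary size for the illustration
W = growth_matrix(m)
x = randn(m)                     # known solution
c = W * x                        # right-hand side built from x

F = lu(W)
@show maximum(abs, F.U)          # largest entry of U; 2^(m-1) when no rows are swapped
@show norm(F \ c - x)            # error of the PLU solve
@show norm(qr(W) \ c - x)        # error of the QR solve
```

Matrices like this essentially never arise by accident, which is why PLU remains the default in practice.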

## LU Decomposition
$$
L_{n-1} \cdots L_1 A = U
$$
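where each $L_k$ is a lower triangular matrix that zeros the entries below the diagonal in column $k$. Hence $A = L U$ with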
$$
L = L_1^{-1}\cdots L_{n-1}^{-1}
$$
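A short derivation shows why this product is cheap to form. It assumes the standard form of the elimination matrices; the multiplier vector $\ell_k$ and the notation $a^{(k)}_{i,j}$ for the entries of $L_{k-1} \cdots L_1 A$ are introduced here for illustration and do not appear in the text above.
$$
L_k = I - \ell_k e_k^\top, \qquad \ell_k = \begin{bmatrix} 0 & \cdots & 0 & \frac{a^{(k)}_{k+1,k}}{a^{(k)}_{k,k}} & \cdots & \frac{a^{(k)}_{n,k}}{a^{(k)}_{k,k}} \end{bmatrix}^\top
$$
Since the first $k$ entries of $\ell_k$ are zero, $e_k^\top \ell_k = 0$ and $e_j^\top \ell_k = 0$ for $j < k$, which gives
$$
L_k^{-1} = I + \ell_k e_k^\top, \qquad L = L_1^{-1}\cdots L_{n-1}^{-1} = I + \sum_{k=1}^{n-1} \ell_k e_k^\top
$$
In other words, $L$ is obtained by writing the multipliers below the diagonal of the identity, with no matrix inversion or multiplication needed.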

In [23]:
L,U = lu(A, NoPivot()) # LU without pivoting; A is still the random 100×100 matrix
norm(L*U - A)

6.2001239763046875e-12

In [39]:
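# Worked 3×3 example of elimination without pivoting
A = [2 1 1;
     2 4 9;
     3 2 3]

n = size(A,1)

# L₁ subtracts multiples of row 1 to zero column 1 below the diagonal
L₁ = Matrix(1.0I, n, n)
L₁[2:end,1] = -A[2:end,1]/A[1,1]

A₁ = L₁*A

# L₂ does the same for column 2 of the reduced matrix A₁
L₂ = Matrix(1.0I, n, n)
L₂[3:end,2] = -A₁[3:end,2]/A₁[2,2]

L₂*L₁*A   # upper triangular: this is U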



3×3 Matrix{Float64}:
 2.0  1.0  1.0
 0.0  3.0  8.0
 0.0  0.0  0.166667
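To close the loop, $L = L_1^{-1} L_2^{-1}$ can be formed from the same elimination matrices and compared with the factors returned by `lu(A, NoPivot())`. This is a minimal sketch that repeats the construction above so it runs on its own; using `inv` is for illustration only, since in practice $L$ is read off from the multipliers.

```julia
using LinearAlgebra

A = [2 1 1;
     2 4 9;
     3 2 3]

n = size(A, 1)

# Elimination matrices, built exactly as in the cell above
L₁ = Matrix(1.0I, n, n)
L₁[2:end, 1] = -A[2:end, 1] / A[1, 1]
A₁ = L₁ * A
L₂ = Matrix(1.0I, n, n)
L₂[3:end, 2] = -A₁[3:end, 2] / A₁[2, 2]

U = L₂ * L₁ * A
L = inv(L₁) * inv(L₂)        # the multipliers with flipped sign, below a unit diagonal

@show L
@show L * U ≈ A

# The built-in factorization without pivoting gives the same factors
F = lu(A, NoPivot())
@show F.L ≈ L && F.U ≈ U
```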