# Gaussian elimination - introduction

Consider the following linear system:

$$\mathbf{A} \, \mathbf{x}  = \mathbf{y}\: ,$$

where $\mathbf{A}$ is an $N \times N$ unstructured matrix given by

$$\mathbf{A} = \left[
\begin{array}{cccccc}
a_{00} & a_{01} & a_{02} & a_{03} & \cdots & a_{0(N-1)} \\
a_{10} & a_{11} & a_{12} & a_{13} & \cdots & a_{1(N-1)} \\
a_{20} & a_{21} & a_{22} & a_{23} & \cdots & a_{2(N-1)} \\
a_{30} & a_{31} & a_{32} & a_{33} & \cdots & a_{3(N-1)} \\
\vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\
a_{(N-1)0} & a_{(N-1)1} & a_{(N-1)2} & a_{(N-1)3} & \cdots & a_{(N-1)(N-1)}
\end{array}
\right]_{\, N \times N} \: ,$$

$$\mathbf{y} = \left[
\begin{array}{c}
y_{0} \\
y_{1} \\
\vdots \\
y_{(N-1)}
\end{array}
\right]_{\, N \times 1}$$

and

$$\mathbf{x} = \left[
\begin{array}{c}
x_{0} \\
x_{1} \\
\vdots \\
x_{(N-1)}
\end{array}
\right]_{\, N \times 1} \: .$$

This is a **square and unstructured linear system** because $\mathbf{A}$ is a general square matrix that is not diagonal, triangular, banded or any other type of structured matrix. 

How can this linear system be solved? At this point in the course, we only know how to solve diagonal and triangular systems. It would therefore be useful to transform this system into an equivalent triangular system, that is, one having the same solution $\mathbf{x}$ as the unstructured system presented above. We say that the new system is equivalent because its solution is the same, and triangular because its matrix is upper triangular.

[**Gaussian elimination**](https://en.wikipedia.org/wiki/Gaussian_elimination) is a numerical procedure applied to transform a square and unstructured system into this equivalent triangular system, which can be represented as follows: 

$$\mathbf{B} \, \mathbf{x} = \mathbf{z} \: ,$$

where

$$\mathbf{B} = \left[
\begin{array}{cccccc}
b_{00} & b_{01} & b_{02} & b_{03} & \cdots & b_{0(N-1)} \\
0 & b_{11} & b_{12} & b_{13} & \cdots & b_{1(N-1)} \\
0 & 0 & b_{22} & b_{23} & \cdots & b_{2(N-1)} \\
0 & 0 & 0 & b_{33} & \cdots & b_{3(N-1)} \\
\vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & 0 & \cdots & b_{(N-1)(N-1)}
\end{array}
\right]_{\, N \times N}$$

and

$$\mathbf{z} = \left[
\begin{array}{c}
z_{0} \\
z_{1} \\
\vdots \\
z_{(N-1)}
\end{array}
\right]_{\, N \times 1} \: .$$

As pointed out before, this equivalent system has the same solution $\mathbf{x}$ as the unstructured system. The key property of [Gaussian elimination](https://en.wikipedia.org/wiki/Gaussian_elimination) is that it transforms an unstructured linear system into a triangular system, which can be solved easily by back substitution.
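To make concrete why an upper triangular system is easy to solve, the following is a minimal sketch of back substitution. The function name `back_substitution` and the small system `B_example`, `z_example` are hypothetical and chosen only for illustration; the sketch assumes that all diagonal elements of the matrix are nonzero.

```python
import numpy as np

def back_substitution(B, z):
    """Solve B x = z for an upper triangular B with nonzero diagonal (illustrative sketch)."""
    N = B.shape[0]
    x = np.zeros(N)
    # The last equation has a single unknown; each equation above it
    # introduces only one new unknown, so we solve from bottom to top.
    for i in range(N - 1, -1, -1):
        x[i] = (z[i] - np.dot(B[i, i + 1:], x[i + 1:])) / B[i, i]
    return x

# Hypothetical upper triangular system used only as an example
B_example = np.array([[2., 1., -1.],
                      [0., 3., 2.],
                      [0., 0., 4.]])
z_example = np.array([3., 13., 8.])

x_example = back_substitution(B_example, z_example)
print(x_example)
print(np.allclose(B_example @ x_example, z_example))
```

Each unknown is obtained with a single dot product and one division, with no need to invert or factorize the matrix.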

This transformation is based on three row operations that do not change the solution of the linear system: **swapping the positions of two rows**, **multiplying a row by a nonzero scalar** and **adding to one row a scalar multiple of another**. Each of them can be written as a multiplication of both sides of the system by a matrix $\mathbf{T}$:

$$
\underbrace{\mathbf{T} \mathbf{A}}_{\mathbf{A}^{\prime}} \mathbf{x} = \underbrace{\mathbf{T} \mathbf{y}}_{\mathbf{y}^{\prime}} \: ,
$$

where $\mathbf{T}$ is a square matrix representing the desired transformation.
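One way to see why such a transformation preserves the solution, under the assumption (not stated explicitly above) that $\mathbf{T}$ is invertible, is to multiply both sides of the transformed system by $\mathbf{T}^{-1}$:

$$
\mathbf{T}^{-1} \left( \mathbf{T} \mathbf{A} \mathbf{x} \right) = \mathbf{T}^{-1} \left( \mathbf{T} \mathbf{y} \right) \quad \Longrightarrow \quad \mathbf{A} \mathbf{x} = \mathbf{y} \: .
$$

Hence any $\mathbf{x}$ satisfying the transformed system also satisfies the original one, and vice versa. Each of the three row transformations listed above corresponds to an invertible $\mathbf{T}$.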

Let's consider a linear system $\mathbf{A} \mathbf{x} = \mathbf{y}$ given by:

In [1]:
import numpy as np

In [2]:
A = np.array([[1.,3.,2.],
              [7.,4.,9.],
              [8.,6.,5.]])

In [3]:
y = np.array([5.23, 6.45, 1.67])

The solution of this system is given by:

In [4]:
x = np.linalg.solve(A,y)

In [5]:
print(x)

[-1.70556701  1.34958763  1.44340206]


In [7]:
A@x

array([5.23, 6.45, 1.67])

### Solution obtained by **swapping the positions of two rows**

In this case, $\mathbf{T}$ is defined by swapping two specific rows of the identity matrix. Let's consider, for example, a transformation that swaps the first two rows of the original system. In this case, the matrix $\mathbf{T}$ is given by:

$$
\mathbf{T} =
\begin{bmatrix}
0 & 1 & 0 \\
1 & 0 & 0 \\
0 & 0 & 1
\end{bmatrix}
$$

In [6]:
# reorder the rows of the identity: rows 0 and 1 trade places, row 2 stays
T = np.identity(3)[[1, 0, 2]]

print(T)

[[0. 1. 0.]
 [1. 0. 0.]
 [0. 0. 1.]]


The matrix $\mathbf{A}^{\prime}$ and vector $\mathbf{y}^{\prime}$ can be obtained as follows:

In [8]:
A_prime = np.dot(T, A)
y_prime = np.dot(T, y)

In [9]:
print(A)
print('\n')
print(A_prime)

[[1. 3. 2.]
 [7. 4. 9.]
 [8. 6. 5.]]


[[7. 4. 9.]
 [1. 3. 2.]
 [8. 6. 5.]]


In [10]:
print(y)
print('\n')
print(y_prime)

[5.23 6.45 1.67]


[6.45 5.23 1.67]


Alternatively, $\mathbf{A}^{\prime}$ and $\mathbf{y}^{\prime}$ can be obtained by directly swapping the rows of $\mathbf{A}$ and $\mathbf{y}$, as follows:

In [11]:
print(A_prime)
print('\n')
print(A[[1, 0, 2]])

[[7. 4. 9.]
 [1. 3. 2.]
 [8. 6. 5.]]


[[7. 4. 9.]
 [1. 3. 2.]
 [8. 6. 5.]]


In [12]:
print(y_prime)
print('\n')
print(y[[1, 0, 2]])

[6.45 5.23 1.67]


[6.45 5.23 1.67]


The solution of the system $\mathbf{A}^{\prime} \mathbf{x} = \mathbf{y}^{\prime}$ is given by:

In [13]:
x1 = np.linalg.solve(A_prime, y_prime)

In [14]:
print(x1)

[-1.70556701  1.34958763  1.44340206]


which is equal to the solution of the original system:

In [15]:
print(x)

[-1.70556701  1.34958763  1.44340206]


In [16]:
np.allclose(x, x1)

True
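The row-swapping matrix can be built for any size and any pair of rows. The sketch below uses a hypothetical helper, `row_swap_matrix`, which is not part of NumPy, and also checks that applying the same swap twice restores the original ordering, so this $\mathbf{T}$ is its own inverse.

```python
import numpy as np

def row_swap_matrix(N, i, j):
    """Return the N x N identity matrix with rows i and j swapped (illustrative helper)."""
    T = np.identity(N)
    # The right-hand side is a copy, so the two rows are exchanged safely
    T[[i, j]] = T[[j, i]]
    return T

T_swap = row_swap_matrix(3, 0, 1)
print(T_swap)

# Swapping the same two rows twice gives back the identity
print(np.allclose(T_swap @ T_swap, np.identity(3)))
```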

### Solution obtained by **multiplying a row by a nonzero scalar**

In this case, $\mathbf{T}$ is defined by multiplying the $i$-th row of the identity matrix by a nonzero scalar $\lambda$. Let's consider, for example, a transformation that multiplies the second row of the original system by `3`. In this case, the matrix $\mathbf{T}$ is given by:

$$
\mathbf{T} =
\begin{bmatrix}
1 & 0 & 0 \\
0 & 3 & 0 \\
0 & 0 & 1
\end{bmatrix}
$$

In [17]:
T = np.diag([1., 3., 1.])

print(T)

[[1. 0. 0.]
 [0. 3. 0.]
 [0. 0. 1.]]


The matrix $\mathbf{A}^{\prime}$ and vector $\mathbf{y}^{\prime}$ can be obtained as follows:

In [18]:
A_prime = np.dot(T, A)
y_prime = np.dot(T, y)

In [19]:
print(A)
print('\n')
print(A_prime)

[[1. 3. 2.]
 [7. 4. 9.]
 [8. 6. 5.]]


[[ 1.  3.  2.]
 [21. 12. 27.]
 [ 8.  6.  5.]]


In [20]:
print(y)
print('\n')
print(y_prime)

[5.23 6.45 1.67]


[ 5.23 19.35  1.67]


The solution of the system $\mathbf{A}^{\prime} \mathbf{x} = \mathbf{y}^{\prime}$ is given by:

In [21]:
x2 = np.linalg.solve(A_prime, y_prime)

In [22]:
print(x2)

[-1.70556701  1.34958763  1.44340206]


which is equal to the solution of the original system:

In [23]:
print(x)

[-1.70556701  1.34958763  1.44340206]


In [24]:
np.allclose(x, x2)

True
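The requirement that the scalar be nonzero can be illustrated with a short check. The sketch below repeats the matrix `A` from the example and uses hypothetical names `T_zero` and `A_zero`. Multiplying a row by zero discards one equation, so the transformed matrix loses rank and the transformed system no longer determines a unique solution.

```python
import numpy as np

A = np.array([[1., 3., 2.],
              [7., 4., 9.],
              [8., 6., 5.]])

# Scaling the second row by zero erases the second equation
T_zero = np.diag([1., 0., 1.])
A_zero = T_zero @ A

rank_A = np.linalg.matrix_rank(A)
rank_A_zero = np.linalg.matrix_rank(A_zero)

print(rank_A)       # rank of the original matrix
print(rank_A_zero)  # rank after scaling a row by zero
```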

### Solution obtained by **adding to one row a scalar multiple of another**

In this case, $\mathbf{T}$ is defined by using an outer product. Let's consider, for example, a transformation that adds to the third row the first row multiplied by a constant $\alpha$.

In this case, the matrix $\mathbf{T}$ can be defined as follows:

$$
\mathbf{T} = \mathbf{I} + \mathbf{t} \cdot \mathbf{u}^{\top} \: ,
$$

where $\mathbf{I}$ is the identity matrix,

$$
\mathbf{u} =
\begin{bmatrix}
1 \\ 0 \\ 0
\end{bmatrix}
$$

and

$$
\mathbf{t} =
\begin{bmatrix}
0 \\ 0 \\ \alpha
\end{bmatrix} \: .
$$

Notice that the position of the nonzero element of $\mathbf{u}$ defines the row to be multiplied by the constant $\alpha$, whereas the position of the constant $\alpha$ in $\mathbf{t}$ defines the row to which the result is added.
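The same construction works for any pair of distinct rows. The sketch below wraps it in a hypothetical helper, `row_addition_matrix`, whose name and arguments are assumptions made for illustration. It also checks that the transformation is undone by the same construction with $-\alpha$, which holds when the two rows are different.

```python
import numpy as np

def row_addition_matrix(N, target, source, alpha):
    """Return T = I + t u^T, which adds alpha times row `source` to row `target`."""
    t = np.zeros(N)
    t[target] = alpha   # position of alpha: row that receives the addition
    u = np.zeros(N)
    u[source] = 1.      # position of the 1: row that is multiplied by alpha
    return np.identity(N) + np.outer(t, u)

T_add = row_addition_matrix(3, target=1, source=2, alpha=2.5)
T_undo = row_addition_matrix(3, target=1, source=2, alpha=-2.5)

print(T_add)
# Adding and then subtracting the same multiple leaves every row unchanged
print(np.allclose(T_undo @ T_add, np.identity(3)))
```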

Let's consider, for example, the transformation that adds to the third row the first row multiplied by `3`. In this case, the matrix $\mathbf{T}$ is defined as follows:

In [25]:
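alpha = 3.
# alpha in the last position: the third row receives the addition
t = np.array([0., 0., alpha])
# 1 in the first position: the first row is the one multiplied by alpha
u = np.array([1., 0., 0.])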

In [26]:
T = np.identity(3) + np.outer(t, u)

print(np.outer(t, u))
print(T)

[[0. 0. 0.]
 [0. 0. 0.]
 [3. 0. 0.]]
[[1. 0. 0.]
 [0. 1. 0.]
 [3. 0. 1.]]


The matrix $\mathbf{A}^{\prime}$ and vector $\mathbf{y}^{\prime}$ can be obtained as follows:

In [27]:
A_prime = np.dot(T, A)
y_prime = np.dot(T, y)

In [28]:
print(A)
print('\n')
print(A_prime)

[[1. 3. 2.]
 [7. 4. 9.]
 [8. 6. 5.]]


[[ 1.  3.  2.]
 [ 7.  4.  9.]
 [11. 15. 11.]]


In [29]:
print(y)
print('\n')
print(y_prime)

[5.23 6.45 1.67]


[ 5.23  6.45 17.36]


The solution of the system $\mathbf{A}^{\prime} \mathbf{x} = \mathbf{y}^{\prime}$ is given by:

In [30]:
x3 = np.linalg.solve(A_prime, y_prime)

In [31]:
print(x3)

[-1.70556701  1.34958763  1.44340206]


which is equal to the solution of the original system:

In [32]:
print(x)

[-1.70556701  1.34958763  1.44340206]


In [33]:
np.allclose(x, x3)

True

### Comparison between the solutions `x`, `x1`, `x2` and `x3`

In [34]:
print(x)

[-1.70556701  1.34958763  1.44340206]


In [35]:
print(x1)

[-1.70556701  1.34958763  1.44340206]


In [36]:
print(x2)

[-1.70556701  1.34958763  1.44340206]


In [37]:
print(x3)

[-1.70556701  1.34958763  1.44340206]
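To connect these row transformations back to the goal stated at the beginning, the following is a minimal sketch that applies only the third transformation (adding to one row a scalar multiple of another) repeatedly to the example system until its matrix becomes upper triangular. It assumes that every pivot `B[k, k]` is nonzero, which happens to be true for this example, and it performs no row swaps, so it is an illustration rather than a complete Gaussian elimination routine.

```python
import numpy as np

A = np.array([[1., 3., 2.],
              [7., 4., 9.],
              [8., 6., 5.]])
y = np.array([5.23, 6.45, 1.67])

N = A.shape[0]
B = A.copy()
z = y.copy()

# For each column k, add multiples of row k to the rows below it
# so that every element under the main diagonal becomes zero.
for k in range(N - 1):
    for i in range(k + 1, N):
        alpha = -B[i, k] / B[k, k]   # assumes the pivot B[k, k] is nonzero
        t = np.zeros(N)
        t[i] = alpha                 # row i receives the addition
        u = np.zeros(N)
        u[k] = 1.                    # row k is multiplied by alpha
        T = np.identity(N) + np.outer(t, u)
        B = T @ B
        z = T @ z

print(B)
# The triangular system has the same solution as the original one
print(np.allclose(np.linalg.solve(B, z), np.linalg.solve(A, y)))
```

Building a full matrix $\mathbf{T}$ at each step is done here only to mirror the notation used above; in practice the rows of `B` and the elements of `z` would be updated directly.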
