# Solving linear systems of equations using NumPy

Solving linear systems of equations is a foundational task in many areas of
engineering. NumPy, especially through its `numpy.linalg` submodule, offers
several methods to solve or compute approximate solutions to such systems.

## $ \S 1 $ Solving square systems

Consider a set of $ n $ linear equations in $ n $ variables $ x_1, \dots, x_n $:
\begin{equation*}
\begin{cases}
& a_{11} x_1 &+& a_{12}x_2 &+& \cdots &+& a_{1n}x_n &=& b_1 \\
& a_{21} x_1 &+& a_{22}x_2 &+& \cdots &+& a_{2n}x_n &=& b_2 \\
& \vdots &+& \vdots &+& \cdots &+& \vdots &=&\vdots \\
& a_{n1} x_1 &+& a_{n2}x_2 &+& \cdots &+& a_{nn}x_n &=& b_n
\end{cases}
\end{equation*}

Equivalently, using matrix notation, this can be rewritten as:
\begin{equation*}
A\,\mathbf x =
\begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{n1} & a_{n2} & \cdots & a_{nn}
\end{bmatrix}
\begin{bmatrix}
x_1 \\
x_2 \\
\vdots \\
x_n
\end{bmatrix} =
\begin{bmatrix}
b_1 \\
b_2 \\
\vdots \\
b_n
\end{bmatrix} = \mathbf b\,.
\end{equation*}

This matrix $ A $ is called the __coefficient matrix__ associated to the system
of equations.  Notice that in our case, this is a _square_ matrix. Systems of $
m $ linear equations in $ n $ variables where $ m \ne n $ will be considered
only in the next section, for the sake of simplicity.
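
As a minimal sketch of how this notation maps onto NumPy arrays, the snippet below stores a small system as a 2-D array `A` and a 1-D array `b`. The system is hypothetical: its coefficients are chosen only for illustration and do not come from the text. The snippet then checks that the matrix product `A @ x` reproduces the left-hand sides computed one equation at a time.

```python
import numpy as np

# Hypothetical system, chosen only for illustration:
#    x + 2y = 5
#   3x -  y = 1
A = np.array([[1, 2],
              [3, -1]])   # coefficient matrix: one row per equation
b = np.array([5, 1])      # right-hand sides

# A candidate vector of unknowns (x, y):
x = np.array([1, 2])

# Left-hand sides computed one equation at a time...
lhs_by_rows = np.array([1 * x[0] + 2 * x[1],
                        3 * x[0] - 1 * x[1]])

# ...coincide with the matrix-vector product A @ x:
print(np.array_equal(A @ x, lhs_by_rows))

# Since A @ x also equals b, this particular x solves the system:
print(np.array_equal(A @ x, b))
```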

__Exercise:__ Write down the coefficient matrix $ A $ and the vectors $ \mathbf x $ and $ \mathbf b $ for the systems of linear equations below:

(a) $$ \left\{
\begin{align*}
2x &+ 3y &\ =& \ 5 \\
4x &- \ y &\ =& \ 1
\end{align*} \right.$$

(b)
$$
\left\{\begin{align*}
x &\ \ +\ &2y & \ \ - & z & \ = & \ 4 \\
2x &\ \ - &y & \ \ + &3z & \ = & -6 \\
-3x &\ \ + &4y &\ \ + & z & \ = & 10
\end{align*} \right.
$$

__Example:__ 

Consider the system of equations:
$$
\left\{\begin{alignat*}{4}
3x\  &+\ 2y\ &&-\ &z\ &=&\ 1\\
2x\ &-\ 2y\ &&+\ 4&z\ &=&\ -2\\
-x\  &+\ \tfrac{1}{2}y\ &&- &z\ &=&\ 0
\end{alignat*}\right.
$$
or, in matrix form,
$$
\begin{align*}
\begin{bmatrix}
3 & 2 & -1 \\
2 & -2 & 4 \\
-1 & 0.5 & -1
\end{bmatrix}
\begin{bmatrix}
x \\
y \\
z
\end{bmatrix}
=
\begin{bmatrix}
\phantom{-}1 \\
-2 \\
\phantom{-}0
\end{bmatrix}
\end{align*}
$$

We can task NumPy with solving it by using the function `np.linalg.solve`:

In [None]:
import numpy as np

# Coefficient matrix A and constant vector b:
A = np.array([[3, 2, -1],
              [2, -2, 4],
              [-1, 0.5, -1]])
b = np.array([1, -2, 0])

# Solve the system of equations:
x = np.linalg.solve(A, b)
print(x)  # Solution vector x

[ 1. -2. -2.]
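
A computed solution can also be checked numerically by substituting it back into the system. The sketch below repeats the definitions of `A` and `b` so that it runs on its own, then compares `A @ x` with `b` using `np.allclose`, which tolerates the small rounding errors of floating-point arithmetic. The name `residual` is introduced here only for illustration.

```python
import numpy as np

A = np.array([[3, 2, -1],
              [2, -2, 4],
              [-1, 0.5, -1]])
b = np.array([1, -2, 0])
x = np.linalg.solve(A, b)

# The residual A x - b should vanish up to rounding errors:
residual = A @ x - b
print(residual)

# Comparison with a tolerance instead of exact equality:
print(np.allclose(A @ x, b))
```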


__Exercise:__ Verify by direct substitution that $ (1, -2, -2) $ is, in fact, the solution to the preceding equations.

📝 `np.linalg.solve` solves a (square) linear system $ A\mathbf x = \mathbf b $
by first finding the $ LU $ decomposition of $ A $, then rewriting the original
system as
$$
\left\{
\begin{alignat*}{4}
L\mathbf y &= \mathbf b \\
U\mathbf x &= \mathbf y
\end{alignat*}
\right.
$$
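The first of these equations can easily be solved by forward-substitution, and
the second by backward-substitution.  To find the $ LU $ decomposition itself,
Gaussian elimination is applied to $ A $. (In practice, the rows of $ A $ may be
reordered along the way for numerical stability, a technique known as partial
pivoting; the same reordering is then applied to $ \mathbf b $.) This is
discussed in any Linear Algebra course, where you will do many of these
computations by hand.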
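
The two-stage procedure just described can be sketched in a few lines of NumPy. This is only an illustration under simplifying assumptions, not the routine that `np.linalg.solve` actually calls: it performs no row interchanges and therefore assumes that every pivot encountered is nonzero. The function names `lu_no_pivot`, `forward_substitution` and `backward_substitution` are hypothetical helpers introduced here.

```python
import numpy as np

def lu_no_pivot(A):
    """Factor A = L U by Gaussian elimination WITHOUT row interchanges.

    Assumes every pivot U[k, k] is nonzero (illustrative only).
    """
    n = A.shape[0]
    L = np.eye(n)
    U = A.astype(float)  # astype returns a copy, so A is left untouched
    for k in range(n - 1):
        for i in range(k + 1, n):
            L[i, k] = U[i, k] / U[k, k]      # multiplier used to eliminate U[i, k]
            U[i, k:] -= L[i, k] * U[k, k:]   # row_i <- row_i - multiplier * row_k
    return L, U

def forward_substitution(L, b):
    """Solve L y = b for lower-triangular L, from the first row down."""
    y = np.zeros(len(b))
    for i in range(len(b)):
        y[i] = (b[i] - L[i, :i] @ y[:i]) / L[i, i]
    return y

def backward_substitution(U, y):
    """Solve U x = y for upper-triangular U, from the last row up."""
    n = len(y)
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        x[i] = (y[i] - U[i, i + 1:] @ x[i + 1:]) / U[i, i]
    return x

# The system from the example above:
A = np.array([[3, 2, -1],
              [2, -2, 4],
              [-1, 0.5, -1]])
b = np.array([1, -2, 0])

L, U = lu_no_pivot(A)
y = forward_substitution(L, b)    # first stage:  L y = b
x = backward_substitution(U, y)   # second stage: U x = y

print(np.allclose(L @ U, A))                   # the factorization reproduces A
print(np.allclose(x, np.linalg.solve(A, b)))   # agrees with NumPy's solver
```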

__Exercise:__ Consider an electrical network with three resistors and a
$ 12\,\text{V} $ battery, where the resistances are $ 3\,\Omega $, $ 2\,\Omega $,
and $ 4\,\Omega $. Using Kirchhoff's laws, we obtain the following system of
equations for the currents $ I_1 $, $ I_2 $, and $ I_3 $ flowing through each resistor:

$$
\left\{
\begin{alignat*}{4}
3I_1 &- I_2 &- I_3 &= 4,\\
-I_1 &+ 2I_2 &  &= -1,\\
I_1 &+ I_2 &+ 4I_3 &= 3.
\end{alignat*}\right.
$$

Find the currents $I_1$, $I_2$, and $I_3$.

__Exercise:__ Consider the following linear systems of equations. Try to solve them using NumPy and explain the output in each case.

(a)
$$
\begin{align*}
\begin{bmatrix}
1 & -2 \\
-3 & 6 \\
\end{bmatrix}
\begin{bmatrix}
x \\
y
\end{bmatrix}
=
\begin{bmatrix}
1 \\
3
\end{bmatrix}
\end{align*}
$$

(b) 
$$
\begin{align*}
\begin{bmatrix}
1 & -2 \\
-3 & 6 \\
\end{bmatrix}
\begin{bmatrix}
x \\
y
\end{bmatrix}
=
\begin{bmatrix}
\phantom{-}1 \\
-3
\end{bmatrix}
\end{align*}
$$

(c)
$$
\begin{align*}
\begin{bmatrix}
1 & -2 & 3 \\
-3 & 6 & 4 \\
\end{bmatrix}
\begin{bmatrix}
x \\
y \\
z
\end{bmatrix}
=
\begin{bmatrix}
1 \\
3
\end{bmatrix}
\end{align*}
$$

(d) Can you find (by hand) the solutions to the systems in items (a) and (b)?
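
When experimenting with systems like these, it may help to know that `np.linalg.solve` reports failure by raising `np.linalg.LinAlgError`. The sketch below uses a hypothetical matrix `M` and vector `c` that do not appear in the exercise. It catches the exception and then inspects the matrix with `np.linalg.matrix_rank`. Note that a matrix which is singular only up to rounding errors may not trigger the exception, so the rank check is a useful complement.

```python
import numpy as np

# Hypothetical singular matrix: the second row is half of the first.
M = np.array([[2.0, 4.0],
              [1.0, 2.0]])
c = np.array([1.0, 1.0])

try:
    sol = np.linalg.solve(M, c)
    print("Solution:", sol)
except np.linalg.LinAlgError as err:
    sol = None
    print("np.linalg.solve failed:", err)

# A rank smaller than the number of columns means that the columns
# are linearly dependent, so M is not invertible:
print("Rank of M:", np.linalg.matrix_rank(M))
```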

## $ \S 2 $ Least-squares approximate solution to overdetermined systems

The most general linear system of equations is $ A \mathbf x = \mathbf b $ where
$$
A \text{ is $ m \times n $}, \quad \mathbf x \in \mathbb R^n \quad \text{and} \quad \mathbf b \in \mathbb R^m\,.
$$
Geometrically, a solution $ \mathbf x $ corresponds to a choice of scalars $ x_k
$ that would express $ \mathbf b $ as a linear combination $ \mathbf b = x_1\,
\mathbf a_1 + \cdots + x_n\, \mathbf a_n $ of the $ n $ column-vectors of $ A $:
$$
\begin{bmatrix}
b_{1} \\
b_{2} \\
\vdots \\
b_{m}
\end{bmatrix}
=
x_1
\begin{bmatrix}
a_{11} \\
a_{21} \\
\vdots \\
a_{m1}
\end{bmatrix}
+
x_2
\begin{bmatrix}
a_{12} \\
a_{22} \\
\vdots \\
a_{m2}
\end{bmatrix}
+ \cdots +
x_n
\begin{bmatrix}
a_{1n} \\
a_{2n} \\
\vdots \\
a_{mn}
\end{bmatrix}\,.
$$
Thus, there will be a solution if and only if $ \mathbf b $ happens to lie in
the subspace $ W $ of $ \mathbb R^m $ spanned by the $ n $ column-vectors of
$ A $ (the _column space_ of $ A $).

Now the dimension of the subspace $ W $ is at most $ n $. Hence if the
system is __over-determined__, that is, if we have fewer variables than
equations ($ n < m $), then $ W $ cannot coincide with $ \mathbb R^m $.
Therefore, for most choices of $ \mathbf b \in \mathbb R^m $, no exact solution
exists. In this situation, _the best that we can do is to pick the vector $
\mathbf{\hat{b}} $ in $ W $ that minimizes the distance to $ \mathbf b $ and to
find the corresponding solution $ \mathbf{\hat{x}} $ to the linear system_
$$
A \mathbf{x} = \mathbf{\hat{b}}\,.
$$
This $ \mathbf{\hat{b}} $ is the __orthogonal projection__ of $ \mathbf b $ onto $ W $. It is the
closest vector to $ \mathbf b $ in $ W $, so that $ \mathbf{\hat x} $ is such that the distance
from $ A\mathbf{\hat x} $ to $ \mathbf b $ is as small as possible.
This method of obtaining an approximate solution $ \mathbf{\hat x} $
to the original system $ A\mathbf x = \mathbf b $ is known as the __least-squares method__.
In NumPy it is implemented through the `np.linalg.lstsq` function.
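
The geometric description above can be checked numerically. The sketch below uses a small hypothetical overdetermined system (three equations, two unknowns) whose data are chosen only for illustration. It computes $ \mathbf{\hat x} $ with `np.linalg.lstsq`, forms $ \mathbf{\hat b} = A\mathbf{\hat x} $, and verifies that the residual $ \mathbf b - \mathbf{\hat b} $ is orthogonal to every column of $ A $, as one expects of an orthogonal projection onto $ W $.

```python
import numpy as np

# Hypothetical overdetermined system: 3 equations, 2 unknowns.
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
b = np.array([1.0, 2.0, 2.0])

# Least-squares approximate solution:
x_hat, _, _, _ = np.linalg.lstsq(A, b, rcond=None)

b_hat = A @ x_hat   # closest vector to b within the column space W
r = b - b_hat       # residual vector

print(x_hat)        # approximately (7/6, 1/2)
print(A.T @ r)      # approximately zero: r is orthogonal to each column of A
```

For this particular system the normal equations $ A^T A\, \mathbf x = A^T \mathbf b $ can be solved by hand and give $ \mathbf{\hat x} = \big(\tfrac{7}{6}, \tfrac{1}{2}\big) $, in agreement with the printed result.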

📝 We have tacitly assumed above that the modified system $ A \mathbf{x} = \mathbf{\hat{b}} $
has a _unique_ solution $ \mathbf{\hat x} $. This will be the case if and only if the column-vectors
$ \mathbf a_1,\dots,\mathbf a_n $ of $ A $ which span $ W $ are _linearly independent_. If this
is not the case, then there will be an infinite number of solutions to the modified system. In
this situation, one usually picks the solution $ \mathbf{\hat x} $ that minimizes the Euclidean norm
$ \Vert \mathbf{\hat x} \Vert $. This is also the default behavior of `np.linalg.lstsq`.
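
To illustrate the rank-deficient case, here is a small sketch with a hypothetical matrix whose two columns are identical, so that infinitely many vectors attain the minimal distance. The variable `other` is one such alternative, picked by hand for comparison. The value returned by `np.linalg.lstsq` produces the same $ \mathbf{\hat b} $ but has a smaller Euclidean norm.

```python
import numpy as np

# Hypothetical matrix with linearly dependent (identical) columns:
A = np.array([[1.0, 1.0],
              [1.0, 1.0],
              [1.0, 1.0]])
b = np.array([1.0, 2.0, 3.0])

x_hat, _, rank, _ = np.linalg.lstsq(A, b, rcond=None)
print(rank)    # rank of A, smaller than the number of columns
print(x_hat)   # least-squares solution returned by NumPy

# Another vector producing the same projection A @ x ...
other = np.array([2.0, 0.0])
print(np.allclose(A @ other, A @ x_hat))

# ... but with a larger norm than the one chosen by lstsq:
print(np.linalg.norm(x_hat) < np.linalg.norm(other))
```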

__Exercise:__ For the linear system given below:
$$
\left\{\begin{align*}
    x + 2y &= 2 \\
    3x + 4y &= 5 \\
    5x + 6y &= 5
\end{align*}\right.
$$

(a) Show by hand that an exact solution does not exist.

(b) Show that the least-squares approximate solution to this
linear system is given by $ \mathbf{\hat x} = \big(-1, \frac{7}{4}\big) $.
_Hint:_ Set up the coefficient matrix $ A $, the vector $ \mathbf b $ and apply
`np.linalg.lstsq`.

In [None]:
# Define the matrix A and vector b:
# A = ...
# b = ...

# Compute the least squares solution:
x, _, _, _ = np.linalg.lstsq(A, b, rcond=None)
# Besides the approximate solution, lstsq also returns the sums of squared residuals,
# the rank of A and the singular values of A; these are discarded here, hence the '_'s.
print(x)

## $ \S 3 $ Other features

With the help of `np.linalg` we can also compute:
* The eigenvalues and eigenvectors of a square matrix (`eig`).
* The $ QR $-decomposition of a matrix (`qr`).
* The singular value decomposition (`svd`) of a matrix.
* The (Moore-Penrose) pseudo-inverse of a matrix (`pinv`).

In [None]:
A = np.array([[4, 2],
              [1, 3]])

# Compute the eigenvalues, eigenvectors, determinant and trace:
eigenvalues, eigenvectors = np.linalg.eig(A)
determinant = np.linalg.det(A)
trace = np.trace(A)

print(f"Eigenvalues: {eigenvalues}")
print(f"Eigenvectors:\n{eigenvectors}")
print(f"Determinant: {determinant:.2f}")
print(f"Trace: {trace:.2f}")

Eigenvalues: [5. 2.]
Eigenvectors:
[[ 0.89442719 -0.70710678]
 [ 0.4472136   0.70710678]]
Determinant: 10.00
Trace: 7.00
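
The remaining functions in the list above can be tried out in the same way. The following sketch applies `qr`, `svd` and `pinv` to a hypothetical $ 3 \times 2 $ matrix `M`, chosen only for illustration, and checks that each factorization reproduces `M`. The argument `full_matrices=False` requests the reduced form of the singular value decomposition, so that the three factors can be multiplied together directly.

```python
import numpy as np

# Hypothetical 3 x 2 matrix, chosen only for illustration:
M = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])

# QR decomposition: Q has orthonormal columns, R is upper triangular.
Q, R = np.linalg.qr(M)
print(np.allclose(Q @ R, M))
print(np.allclose(Q.T @ Q, np.eye(2)))

# Singular value decomposition: M = U diag(s) Vt.
U, s, Vt = np.linalg.svd(M, full_matrices=False)
print(np.allclose(U @ np.diag(s) @ Vt, M))

# Moore-Penrose pseudo-inverse: it satisfies M M^+ M = M.
M_pinv = np.linalg.pinv(M)
print(np.allclose(M @ M_pinv @ M, M))
```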


__Exercise:__ Recall that any real $ n \times n $ _symmetric_ matrix $ S $ can
be diagonalized over $ \mathbb R $, and that we can find a basis for
$ \mathbb R^n $ consisting of mutually orthogonal eigenvectors of $ S $. Check
that the matrix below has three real eigenvalues and that its eigenvectors are
indeed mutually orthogonal.

In [None]:
B = np.array([[2, -1, 0],
              [-1, 2, -1],
              [0, -1, 2]])