
##**14. Linear Algebra and Systems of Linear Equations**
---
**Textbook**: Python Programming and Numerical Methods

###**14.1 Basics of Linear Algebra**

####**14.1.1 Sets**

In mathematics, a **set** is a collection of objects. As defined earlier, sets are usually denoted by braces { }. For example, $S = \{orange, apple, banana\}$ means $S$ is the set containing "orange", "apple", and "banana".

The **empty set** is the set containing no objects and is typically denoted by empty braces, such as { }, or by $\emptyset$. Given two sets, $A$ and $B$, the **union** of $A$ and $B$ is denoted by $A ∪ B$ and is equal to the set containing all the elements of $A$ and $B$. The **intersection** of $A$ and $B$ is denoted by $A ∩ B$ and is equal to the set containing all the elements that belong to both $A$ and $B$.

**Try it!** Let $S$ be the set of all real $(x, y)$ pairs such that $x^{2} + y^{2} = 1$. Write $S$ using set notation.

$S = \{(x, y): x, y ∈ \mathbb{R}, x^{2} + y^{2} = 1\}$
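
As a small illustration, Python's built-in `set` type supports union and intersection directly. The sets `A` and `B` below are made-up examples for this sketch, not taken from the textbook.

```python
# Hypothetical example sets (assumed for illustration)
A = {"orange", "apple", "banana"}
B = {"banana", "kiwi"}

union = A | B          # elements that belong to A or B (or both)
intersection = A & B   # elements that belong to both A and B
empty = set()          # the empty set; note that {} creates a dict, not a set

print(sorted(union))
print(sorted(intersection))
print(len(empty))
```
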



####**14.1.2 Vectors**

The set $\mathbb{R}^{n}$ is the set of all $n$-tuples of real numbers. In set notation, this is $\mathbb{R}^{n} = \{(x_{1}, x_{2}, x_{3}, ..., x_{n}): x_{1}, x_{2}, x_{3}, ..., x_{n} ∈ \mathbb{R}\}$. For example, the set $\mathbb{R}^{3}$ represents the set of real triplets $(x, y, z)$ coordinates in 3D space.

A **vector** in $\mathbb{R}^{n}$ is an $n$-tuple, or point, in $\mathbb{R}^{n}$. Vectors can be written horizontally (i.e., with the elements of the vector next to each other) in a **row vector**, or vertically (i.e., with the elements of the vector on top of each other) in a **column vector**. If the context of a vector is ambiguous, it usually means the vector is a column vector. The $i$-th element of a vector $v$ is denoted by $v_{i}$. The transpose of a column vector is a row vector of the same length, and the transpose of a row vector is a column vector. In mathematics, the transpose is denoted by a superscript $T$, or $v^{T}$. The **zero vector** is the vector in $\mathbb{R}^{n}$ containing all zeros.

The **norm** of a vector is a measure of its length. There are many ways of defining the length of a vector depending on the metric used (i.e., the distance formula chosen). The most common is called the $L_{2}$ norm, which is computed according to the distance formula. The $L_{2}$ **norm** of a vector $v$ is denoted by $||v||_{2}$ and $||v||_{2} = \sqrt{\Sigma_{i}{v_{i}^{2}}}$. This is sometimes also called Euclidean distance and refers to the "physical" length of a vector in 1, 2, or 3D space. The $L_{1}$ norm or "Manhattan distance" is computed as $||v||_{1} = \Sigma_{i}{|v_{i}|}$, and is named after the grid-like road structure in New York City. In general, the $p$-**norm**, $L_{p}$, of a vector is $||v||_{p} = \sqrt[p]{\Sigma_{i}{|v_{i}|^{p}}}$. The $L_{\infty}$ **norm** is the $p$-norm, where $p = \infty$. The $L_{\infty}$ norm is written as $||v||_{\infty}$ and is equal to the maximum absolute value in $v$.

**Try it!**: Create a row vector and a column vector, and show their shape.

In [None]:
import numpy as np

vector_row = np.array([[1, -5, 3, 2, 4]])
vector_column = np.array([[1], [2], [3], [4]])
print(vector_row.shape)
print(vector_column.shape)

(1, 5)
(4, 1)


**Try it!**: Transpose the row vector defined above into a column vector and calculate its $L_{1}$, $L_{2}$, and $L_{\infty}$ norm. Verify that the $L_{\infty}$ norm of a vector is equivalent to the maximum absolute value of the elements in the vector.

In [None]:
from numpy.linalg import norm
new_vector = vector_row.T
print(new_vector)
norm_1 = norm(new_vector, 1)
norm_2 = norm(new_vector, 2)
norm_inf = norm(new_vector, np.inf)
print(f"L_1 is: {norm_1:.1f}")
print(f"L_2 is: {norm_2:.1f}")
print(f"L_inf is: {norm_inf:.1f}")


[[ 1]
 [-5]
 [ 3]
 [ 2]
 [ 4]]
L_1 is: 15.0
L_2 is: 7.4
L_inf is: 5.0
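
The same norms can also be computed directly from their definitions. The sketch below uses a flat (1-D) copy of the vector; the helper `p_norm` is a hypothetical name introduced only for this illustration. Note that the largest element of the vector is 4, while the $L_{\infty}$ norm is 5, because the norm uses absolute values.

```python
import numpy as np

v = np.array([1, -5, 3, 2, 4])

l1 = np.sum(np.abs(v))        # sum of absolute values
l2 = np.sqrt(np.sum(v**2))    # square root of the sum of squares
linf = np.max(np.abs(v))      # largest absolute value


def p_norm(vec, p):
    """General p-norm computed from the definition (illustrative helper)."""
    return np.sum(np.abs(vec)**p)**(1.0 / p)


print(l1, l2, linf)
print(p_norm(v, 1), p_norm(v, 2), p_norm(v, 3))
```
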


**Vector addition** is defined as the pairwise addition of the elements of the added vectors. For example, if $v$ and $w$ are vectors in $\mathbb{R}^{n}$, then $u = v + w$ is defined as having elements $u_{i} = v_{i} + w_{i}$.

**Vector multiplication** can be defined in several ways depending on the context. **Scalar multiplication** of a vector is the product of a vector and a **scalar** (i.e., a number in $\mathbb{R}$). Scalar multiplication is defined as the product of each element of the vector by the scalar. More specifically, if $\alpha$ is a scalar and $v$ is a vector, then $u = \alpha{v}$ is defined as having elements $u_{i} = \alpha{v_{i}}$.

The **dot-product** of two vectors is the sum of the products of the respective elements and is denoted by $\cdot$, and $v\cdot{w}$ is read "$v$ dot $w$". Therefore, for $v, w \in \mathbb{R}^{n}$, $d = v\cdot{w}$ is defined as $d = \Sigma^{n}_{i = 1}{v_{i}w_{i}}$. The **angle between two vectors**, $\theta$, is defined by the formula:

$$v\cdot{w} = ||v||_{2}||w||_{2}\cos{\theta}.$$

Finally, the **cross-product** between two vectors, $v$ and $w$, is written as $v× w$. It is defined by $v× w = ||v||_{2}||w||_{2}\sin{\theta}\space{n}$, where $\theta$ is the angle between $v$ and $w$ (which can be computed from the dot-product), and $n$ is a vector perpendicular to both $v$ and $w$ with unit length (i.e., its length is one). The geometric interpretation of the cross-product is a vector perpendicular to both $v$ and $w$, with the length equal to the area enclosed by the parallelogram created by the two vectors.

**Try it!**: Given the vectors $v = [0, 2, 0]$ and $w = [3, 0, 0]$, use the `NumPy` function `cross` to compute the cross-product of `v` and `w`.

In [None]:
v = np.array([[0, 2, 0]])
w = np.array([[3, 0, 0]])
print(np.cross(v, w))

[[ 0  0 -6]]
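
The dot-product and angle formulas can be checked on the same two vectors. This is a minimal sketch using flat (1-D) arrays; it confirms that the cross-product is perpendicular to both inputs and that its length equals $||v||_{2}||w||_{2}\sin{\theta}$.

```python
import numpy as np
from numpy.linalg import norm

v = np.array([0, 2, 0])
w = np.array([3, 0, 0])

d = np.dot(v, w)                             # dot-product
theta = np.arccos(d / (norm(v) * norm(w)))   # angle between v and w, in radians
c = np.cross(v, w)                           # cross-product

print(d, theta)
# Length of the cross-product versus ||v|| ||w|| sin(theta)
print(norm(c), norm(v) * norm(w) * np.sin(theta))
# Perpendicularity: both dot-products are zero
print(np.dot(c, v), np.dot(c, w))
```
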


Assuming that $S$ is a set in which addition and scalar multiplication are defined, a **linear combination** of $S$ is defined as

$$\Sigma{\alpha_{i}s_{i}},$$

where $\alpha_{i}$ is any real number, and $s_{i}$ is the $i$-th object in $S$. Sometimes the $\alpha_{i}$ values are called **coefficients** of $s_{i}$.

A set is called **linearly independent** if no object in the set can be written as a linear combination of other objects in the set. For the purposes of this book, we will only consider the linear independence of a set of vectors. A set of vectors that is not linearly independent is **linearly dependent**.

**Try it!**: Given the row vectors $v = [0, 3, 2], w = [4, 1, 1]$, and $u = [0, -2, 0]$, write the vector $x = [-8, -1, 4]$ as a linear combination of $v, w$, and $u$.   

In [None]:
import numpy as np
v = np.array([[0, 3, 2]])
w = np.array([[4, 1, 1]])
u = np.array([[0, -2, 0]])
x = 3*v - 2*w + 4*u
print(x)

[[-8 -1  4]]
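
The coefficients $3, -2, 4$ do not have to be guessed. One possible approach, sketched here under the assumption that $v$, $w$, and $u$ are linearly independent, is to place the vectors in the columns of a matrix and solve a small linear system with `numpy.linalg.solve` (covered later in this chapter). The name `B` is introduced only for this example.

```python
import numpy as np

v = np.array([0, 3, 2])
w = np.array([4, 1, 1])
u = np.array([0, -2, 0])
x = np.array([-8, -1, 4])

# Columns of B are v, w, u, so B @ coeffs = x expresses the linear combination
B = np.column_stack((v, w, u))
coeffs = np.linalg.solve(B, x)
print(coeffs)
```
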


####**14.1.3 Matrices**

An $m× n$ **matrix** is a rectangular table of numbers consisting of $m$ rows and $n$ columns. The norm of a matrix can be considered as a particular kind of vector norm. If we treat the $m× n$ elements of $M$ as the elements of an $mn$-dimensional vector, then the $p$-norm of this vector can be written as

$$||M||_{p} = \sqrt[p]{\sum_{i=1}^{m}\sum_{j=1}^{n}{|a_{ij}|^{p}}}.$$

It is possible to calculate the matrix norm using the same `norm` function in `NumPy` as that for a vector.
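
A short sketch of this idea for $p = 2$: flattening the matrix and taking the vector 2-norm gives the same value as `norm(M)` with its default setting for 2-D input, the Frobenius norm. One caveat worth knowing is that `norm(M, 2)` on a 2-D array returns the largest singular value instead, which is a different quantity. The matrix below is an arbitrary example.

```python
import numpy as np
from numpy.linalg import norm

M = np.array([[1, 7], [2, 3], [5, 0]])   # example matrix, chosen for illustration

entrywise = norm(M.flatten(), 2)   # treat the matrix as one long vector
frobenius = norm(M)                # default for 2-D input: Frobenius norm
spectral = norm(M, 2)              # largest singular value, generally different

print(entrywise, frobenius, spectral)
```
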

Matrix addition and scalar multiplication for matrices work the same way as for vectors. However, **matrix multiplication** between two matrices, $P$ and $Q$, is defined when $P$ is an $m× p$ matrix and $Q$ is a $p× n$ matrix. The result of $M = P\space{Q}$ is a matrix $M$ that is $m× n$. The dimension $p$ is called the **inner matrix dimension**, and the inner matrix dimensions must match (i.e., the number of columns in $P$ and the number of rows in $Q$ must be the same) for matrix multiplication to be defined. The dimensions $m$ and $n$ are called **outer matrix dimensions**. Formally, if $P$ is $m× p$ and $Q$ is $p× n$, then $M = P\space{Q}$ is defined as

$$M_{i\space{j}} = \sum_{k = 1}^{p}{P_{ik}Q_{kj}.}$$

The product of two matrices $P$ and $Q$ in Python is achieved by using the **dot** method in `NumPy`. The **transpose** of a matrix is a reversal of its rows with its columns. The transpose is denoted by a superscript, $T$, so that $M^{T}$ is the transpose of matrix $M$. In Python the method `T` for a `NumPy` array is used to get the transpose. For example, if `M` is a matrix, then `M.T` is its transpose.

**Try it!** Let the matrices `P` and `Q` be $[[1, 7], [2, 3], [5, 0]]$ and $[[2, 6, 3, 1], [1, 2, 3, 4]]$, respectively. Compute the Python matrix product of `P` and `Q`. Show that the product of `Q` and `P` will produce an error.

In [None]:
P = np.array([[1, 7], [2, 3], [5, 0]])
Q = np.array([[2, 6, 3, 1], [1, 2, 3, 4]])
print(P)
print(Q)
print(f"P * Q: {np.dot(P, Q)}")
np.dot(Q, P)

[[1 7]
 [2 3]
 [5 0]]
[[2 6 3 1]
 [1 2 3 4]]
P * Q: [[ 9 20 24 29]
 [ 7 18 15 14]
 [10 30 15  5]]


ValueError: shapes (2,4) and (3,2) not aligned: 4 (dim 1) != 3 (dim 0)
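
To connect the summation formula with `np.dot`, the following sketch implements matrix multiplication with explicit loops. The function name `matmul_loops` is hypothetical, and the loops are for illustration only; `np.dot` (or the `@` operator) is the practical choice.

```python
import numpy as np


def matmul_loops(P, Q):
    """Multiply P (m x p) by Q (p x n) using M_ij = sum_k P_ik Q_kj."""
    m, p = P.shape
    p2, n = Q.shape
    if p != p2:
        raise ValueError("inner matrix dimensions must match")
    M = np.zeros((m, n))
    for i in range(m):
        for j in range(n):
            for k in range(p):
                M[i, j] += P[i, k] * Q[k, j]
    return M


P = np.array([[1, 7], [2, 3], [5, 0]])
Q = np.array([[2, 6, 3, 1], [1, 2, 3, 4]])
print(matmul_loops(P, Q))
print(np.allclose(matmul_loops(P, Q), P @ Q))
```
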

A **square matrix** is an $n× n$ matrix, that is, it has the same number of rows as columns. The **determinant** is an important property of square matrices. It is a special number that can be calculated directly from a square matrix. The determinant is denoted by `det`, both in mathematics and in `NumPy`'s `linalg` package. In the case of a $2× 2$ matrix, the determinant is

$$|M| = \left|\begin{array}{cc}
a & b\\
c & d
\end{array}\right| = ad - bc.$$

We can use a similar approach to calculate the determinant for a higher-dimensional matrix but it is easier to calculate using Python.

The **identity matrix** is a square matrix with 1s on the diagonal and 0s elsewhere. The identity matrix is usually denoted by $I$ and is analogous to the real number identity, 1. That is, multiplying any matrix by $I$ (of compatible size) will produce the same matrix.

**Try it!**: Find the determinant of matrix $M = [[0, 2, 1, 3], [3, 2, 8, 1], [1, 0, 0, 3], [0, 3, 2, 1]]$. Use the `np.eye` function to produce a $4× 4$ identity matrix, $I$. Multiply $M$ by $I$ to show that the result is $M$.

In [None]:
import numpy as np
from numpy.linalg import det

M = np.array([[0, 2, 1, 3],
              [3, 2, 8, 1],
              [1, 0, 0, 3],
              [0, 3, 2, 1]])
print(f"M:\n {M}")
print("--------")
print(f"Determinant: {det(M):.1f}")
I = np.eye(4)
print(f"I: \n {I}")
print("--------")
print(f"M*I:\n {np.dot(M, I)}")

M:
 [[0 2 1 3]
 [3 2 8 1]
 [1 0 0 3]
 [0 3 2 1]]
--------
Determinant: -38.0
I: 
 [[1. 0. 0. 0.]
 [0. 1. 0. 0.]
 [0. 0. 1. 0.]
 [0. 0. 0. 1.]]
--------
M*I:
 [[0. 2. 1. 3.]
 [3. 2. 8. 1.]
 [1. 0. 0. 3.]
 [0. 3. 2. 1.]]


The **inverse** of a square matrix $M$ is a matrix of the same size, $N$, such that $M \cdot N = I$. The inverse of a matrix is analogous to the inverse of a real number. For example, the inverse of $3$ is $\frac{1}{3}$ because $(3)(\frac{1}{3}) = 1$. A matrix is said to be **invertible** if it has an inverse. The inverse of a matrix is unique, that is, for an invertible matrix, there is only one inverse for that matrix. If $M$ is a square matrix, its inverse is denoted by $M^{-1}$ in mathematics, and it can be computed in Python using the function `inv` from `NumPy`'s `linalg` package. For a $2× 2$ matrix, the analytical solution of the matrix inverse is

$$M^{-1} = \left[\begin{array}{cc}
a & b\\
c & d
\end{array}\right]^{-1} = \frac{1}{|M|} \left[\begin{array}{cc}
d & -b\\
-c & a
\end{array}\right].$$

Calculating the matrix inverse for the analytical solution becomes complicated as the dimension of the matrix increases. There are many other methods which can make things easier, such as Gaussian elimination, Newton's method, eigendecomposition, etc.
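
The $2× 2$ formulas for the determinant and the inverse can be checked numerically. The matrix `M2` below is a made-up example; the sketch compares the closed-form results with `det` and `inv` from `numpy.linalg`.

```python
import numpy as np

M2 = np.array([[4.0, 3.0], [6.0, 3.0]])   # hypothetical 2 x 2 example
(a, b), (c, d) = M2

det_M2 = a * d - b * c                            # ad - bc
inv_M2 = np.array([[d, -b], [-c, a]]) / det_M2    # closed-form 2 x 2 inverse

print(det_M2, np.linalg.det(M2))
print(inv_M2)
print(np.allclose(inv_M2, np.linalg.inv(M2)))
```
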

Recall that zero has no inverse for multiplication in the setting of real numbers. Similarly, there are matrices that do not have inverses. These matrices are called **singular**. Matrices that do have an inverse are called **nonsingular**.

**One way to determine if a matrix is singular is by computing its determinant. If the determinant is 0, then the matrix is singular; if not, the matrix is nonsingular**.


**Try it!**: The matrix $M$ (in the previous example) has a nonzero determinant. Compute the inverse of $M$. Show that the matrix $P = [[0, 1, 0], [0, 0, 0], [1, 0, 0]]$ has a determinant value of zero, and therefore has no inverse.



In [None]:
import numpy as np
from numpy.linalg import inv

print(f"Inv M:\n {inv(M)}")
P = np.array([[0, 1, 0],
              [0, 0, 0],
              [1, 0, 0]])
print(f"det (P):\n {det(P)}")

Inv M:
 [[-1.57894737 -0.07894737  1.23684211  1.10526316]
 [-0.63157895 -0.13157895  0.39473684  0.84210526]
 [ 0.68421053  0.18421053 -0.55263158 -0.57894737]
 [ 0.52631579  0.02631579 -0.07894737 -0.36842105]]
det (P):
 0.0


A matrix that is close to being singular (i.e., the determinant is close to zero) is called **ill-conditioned**. Although ill-conditioned matrices have inverses, they are problematic numerically in the same way that dividing a number by a very, very small number is problematic. That is, they can lead to computations that result in overflow, underflow, or numbers small enough to cause significant round-off errors. The **condition number** is a measure of how ill-conditioned a matrix is: it is defined as the norm of the matrix times the norm of the inverse of the matrix, that is, $||M||\,||M^{-1}||$. In Python, it can be computed using `NumPy`'s function `cond` from `linalg`. The higher the condition number, the closer the matrix is to being singular.
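
As a rough illustration, the sketch below compares a well-behaved matrix with one that is nearly singular. Both matrices are hypothetical examples. It also checks that the default `cond` (which uses the 2-norm) agrees with the product of the norms of the matrix and its inverse.

```python
import numpy as np
from numpy.linalg import cond, det, inv, norm

well = np.array([[2.0, 0.0], [0.0, 1.0]])
nearly_singular = np.array([[1.0, 1.0], [1.0, 1.0 + 1e-10]])   # rows almost identical

print(cond(well))
print(det(nearly_singular), cond(nearly_singular))

# Definition check with the 2-norm: ||M|| * ||M^-1||
print(np.isclose(cond(well), norm(well, 2) * norm(inv(well), 2)))
```
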

The **rank** of an $m × n$ matrix $A$ is the number of linearly independent columns or rows of $A$ and is denoted by rank($A$). It can be shown that the number of linearly independent rows is always equal to the number of linearly independent columns for any matrix. A matrix has **full rank** if $rank(A) = min(m, n)$. The matrix $A$ is also of full rank if all of its columns are linearly independent.

An **augmented matrix** is a matrix $A$ concatenated with a vector $y$ and is written $[A, y]$. This is commonly read as "$A$ augmented with $y$." You can use `np.concatenate` to concatenate. If $rank([A, y]) = rank(A) + 1$, then the vector $y$ is "new" information. That is, it cannot be created as a linear combination of the columns in $A$. Rank is an important characteristic of matrices because of its relationship to solutions of linear equations, which is discussed in the last section of this chapter.

**Try it!**: For the matrix $A = [[1, 1, 0], [0, 1, 0], [1, 0, 1]]$, compute the condition number and rank. If $y = [[1], [2], [1]]$, get the augmented matrix $[A, y]$.

In [None]:
import numpy as np
from numpy.linalg import cond, matrix_rank

A = np.array([[1, 1, 0],
              [0, 1, 0],
              [1, 0, 1]])

print(f"Condition number:\n {cond(A):.3f}")
print("------------")
print(f"Rank:\n {matrix_rank(A)}")
y = np.array([[1], [2], [1]])
A_y = np.concatenate((A, y), axis = 1)
print("------------")
print(f"Augmented Matrix:\n {A_y}")


Condition number:
 4.049
------------
Rank:
 3
------------
Augmented Matrix:
 [[1 1 0 1]
 [0 1 0 2]
 [1 0 1 1]]
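
The statement about "new" information can be seen with a rank-deficient matrix. In this sketch, `B`, `y_dep`, and `y_new` are made-up examples: augmenting with a vector that is a combination of the columns leaves the rank unchanged, while augmenting with a vector outside their span raises it by one.

```python
import numpy as np
from numpy.linalg import matrix_rank

B = np.array([[1, 2], [2, 4], [3, 6]])   # second column is twice the first
y_dep = np.array([[1], [2], [3]])        # equals the first column of B
y_new = np.array([[1], [0], [0]])        # not a combination of the columns of B

rank_B = matrix_rank(B)
rank_dep = matrix_rank(np.concatenate((B, y_dep), axis=1))
rank_new = matrix_rank(np.concatenate((B, y_new), axis=1))
print(rank_B, rank_dep, rank_new)
```
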


###**14.2 Linear Transformations**

For any vectors $x$ and $y$, and scalars $a$ and $b$, we say that a function $F$ is a **linear transformation** if

$$F(ax+by)=aF(x)+bF(y)$$

It can be shown that multiplying an $m×n$ matrix $A$ and an $n×1$ vector $v$ of compatible size is a linear transformation of $v$. Therefore from this point forward, a matrix will be synonymous with a linear transformation function.

###**Try it!**:
Let $x$ be a vector and let $F(x)$ be defined by $F(x) = Ax$, where $A$ is a rectangular matrix of appropriate size. Show that $F(x)$ is a linear transformation.

Proof: Since $F(x) = Ax$, then for vectors $v$ and $w$, and scalars $a$ and $b$, $F(av + bw) = A(av + bw)$ (by definition of $F$) $= aAv + bAw$ (by distributive property of matrix multiplication) $= aF(v) + bF(w)$ (by definition of $F$).
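
The proof can be complemented by a quick numerical check. This sketch draws an arbitrary matrix and two arbitrary vectors (the seed, sizes, and scalars are assumptions made for the example) and confirms the defining identity up to floating-point round-off.

```python
import numpy as np

rng = np.random.default_rng(0)     # fixed seed, chosen arbitrarily
A_lin = rng.normal(size=(3, 4))    # hypothetical 3 x 4 matrix
v = rng.normal(size=4)
w = rng.normal(size=4)
a, b = 2.5, -1.0


def F(x):
    """The transformation F(x) = A x."""
    return A_lin @ x


print(np.allclose(F(a * v + b * w), a * F(v) + b * F(w)))
```
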





###**14.3 System of Linear Equations**

A **linear equation** is an equality of the form

$$\sum_{i = 1}^{n}{a_{i}x_{i}} = y$$

where $a_{i}$ are scalars, $x_{i}$ are unknown variables in $\mathbb{R}$, and $y$ is a scalar.

A **system of linear equations** is a set of linear equations that share the same variables. The **matrix form** of a system of linear equations is $Ax = y$, where $A$ is an $m×n$ matrix, $A(i, j)=a_{i, j}$, $y$ is a vector in $\mathbb{R}^{m}$, and $x$ is an unknown vector in $\mathbb{R}^{n}$.

**Try it!** Put the following system of equations into matrix form:

$$4x + 3y - 5z = 2,$$
$$-2x - 4y + 5z = 5,$$
$$7x + 8y = -3,$$
$$x + 2z = 1,$$
$$9x + y - 6z = 6.$$

$$\left[\begin{array}{cc}
4 & 3 & -5\\
-2 & -4 & 5\\
7 & 8 & 0\\
1 & 0 & 2\\
9 & 1 & -6
\end{array}\right]\left[\begin{array}{cc}
x\\
y\\
z\\
\end{array}\right] = \left[\begin{array}{cc}
2\\
5\\
-3\\
1\\
6\\
\end{array}\right].$$

###**14.4 Solutions to Systems of Linear Equations**

Let us say we have $n$ equations with $n$ variables, $Ax = y$, as follows:

$$\left[\begin{array}{cc}
a_{1, 1} & a_{1, 2} & ... & a_{1, n}\\
a_{2, 1} & a_{2, 2} & ... & a_{2, n}\\
. & . & . & .\\
. & . & . & .\\
. & . & . & .\\
a_{n, 1} & a_{n, 2} & ... & a_{n, n}\\
\end{array}\right]\left[\begin{array}{cc}
x_{1}\\
x_{2}\\
.\\
.\\
.\\
x_{n}\\
\end{array}\right] = \left[\begin{array}{cc}
y_{1}\\
y_{2}\\
.\\
.\\
.\\
y_{n}\\
\end{array}\right].$$


####**14.4.1 Gauss Elimination Method**

The **Gauss elimination** method is a procedure that turns the matrix $A$ into an **upper-triangular** form to solve the system of equations. Let us use a system of four equations and four variables to illustrate the idea. Gauss elimination essentially turns the system of equations into

$$\left[\begin{array}{cc}
a_{1, 1} & a_{1, 2} & a_{1, 3} & a_{1, 4}\\
0 & a^{'}_{2, 2} & a^{'}_{2, 3} & a^{'}_{2, 4}\\
0 & 0 & a^{'}_{3, 3} & a^{'}_{3, 4}\\
0 & 0 & 0 & a^{'}_{4, 4}
\end{array}\right]\left[\begin{array}{cc}
x_{1}\\
x_{2}\\
x_{3}\\
x_{4}\\
\end{array}\right] = \left[\begin{array}{cc}
y_{1}\\
y^{'}_{2}\\
y^{'}_{3}\\
y^{'}_{4}\\
\end{array}\right].$$

Writing this matrix form back out as individual equations, we obtain:

$$a_{1, 1}x_{1} + a_{1, 2}x_{2} + a_{1, 3}x_{3} + a_{1, 4}x_{4} = y_{1},$$
$$a^{'}_{2, 2}x_{2} + a^{'}_{2, 3}x_{3} + a^{'}_{2, 4}x_{4} = y^{'}_{2},$$
$$a^{'}_{3, 3}x_{3} + a^{'}_{3, 4}x_{4} = y^{'}_{3},$$
$$a^{'}_{4, 4}x_{4} = y^{'}_{4}.$$

Now, $x_{4}$ can be easily solved for by dividing both sides by $a^{'}_{4, 4}$, and then by substituting the result into the third equation to solve for $x_{3}$. With $x_{3}$ and $x_{4}$, we can substitute them into the second equation to solve for $x_{2}$, and we are now able to solve for all $x$. We solved the system of equations bottom-up; this is called **backward substitution**.
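
The elimination and backward substitution steps can be sketched in a few lines of Python. This is a minimal illustration, not a robust solver: it does no row pivoting and therefore assumes every pivot is nonzero. The function name `gauss_elimination` is hypothetical, and the test system is the $3× 3$ system that appears in Section 14.5.

```python
import numpy as np


def gauss_elimination(A, y):
    """Solve Ax = y by reducing [A, y] to upper-triangular form and then
    applying backward substitution. No pivoting: assumes nonzero pivots."""
    n = len(y)
    # Work on a float copy of the augmented matrix [A, y]
    Ab = np.concatenate((A.astype(float), y.reshape(-1, 1).astype(float)), axis=1)

    # Forward elimination: zero out the entries below each pivot
    for i in range(n - 1):
        for j in range(i + 1, n):
            factor = Ab[j, i] / Ab[i, i]
            Ab[j, i:] -= factor * Ab[i, i:]

    # Backward substitution, from the last equation up
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        x[i] = (Ab[i, -1] - Ab[i, i + 1:n] @ x[i + 1:n]) / Ab[i, i]
    return x


A = np.array([[4, 3, -5], [-2, -4, 5], [8, 8, 0]])
y = np.array([2, 5, -3])
x = gauss_elimination(A, y)
print(x.round(4))
print(np.allclose(A @ x, y))
```
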





####**14.4.2 Gauss-Jordan Elimination Method**

Gauss-Jordan elimination solves systems of equations. It is a procedure to turn $A$ into a diagonal form such that the matrix form of the equations becomes

$$\left[\begin{array}{cc}
1 & 0 & 0 & 0\\
0 & 1 & 0 & 0\\
0 & 0 & 1 & 0\\
0 & 0 & 0 & 1\\
\end{array}\right]\left[\begin{array}{cc}
x_{1}\\
x_{2}\\
x_{3}\\
x_{4}\\
\end{array}\right] = \left[\begin{array}{cc}
y^{'}_{1}\\
y^{'}_{2}\\
y^{'}_{3}\\
y^{'}_{4}\\
\end{array}\right].$$

Essentially, the equations become:

$$x_{1} = y^{'}_{1}, $$

$$x_{2} = y^{'}_2,$$

$$x_{3} = y^{'}_3,$$

$$x_{4} = y^{'}_4.$$

In both cases (Gauss and Gauss-Jordan elimination) we need to construct the augmented matrix $[A, y]$ to make all the needed algebraic manipulations.
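
A comparable sketch for Gauss-Jordan elimination works on the augmented matrix until the left block becomes the identity, so the last column holds the solution. As before, this is an illustrative version without pivoting (nonzero pivots are assumed), and the function name `gauss_jordan` is hypothetical. The test system is the one from Section 14.5.

```python
import numpy as np


def gauss_jordan(A, y):
    """Reduce [A, y] until A becomes the identity; the last column is then x.
    No pivoting: assumes nonzero pivots."""
    n = len(y)
    Ab = np.concatenate((A.astype(float), y.reshape(-1, 1).astype(float)), axis=1)
    for i in range(n):
        Ab[i] = Ab[i] / Ab[i, i]            # scale the pivot row so the pivot is 1
        for j in range(n):
            if j != i:
                Ab[j] -= Ab[j, i] * Ab[i]   # clear column i in every other row
    return Ab[:, -1]


A = np.array([[4, 3, -5], [-2, -4, 5], [8, 8, 0]])
y = np.array([2, 5, -3])
x = gauss_jordan(A, y)
print(x.round(4))
print(np.allclose(A @ x, y))
```
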

####**14.4.3 LU Decomposition Method**

The two methods shown above involve changing both $A$ and $y$ at the same time while trying to turn $A$ into an upper triangular or diagonal matrix form. Sometimes we may have the same set of equations but different sets of $y$ for different experiments. This is actually quite common in the real world, where we have different experimental observations $y_a, y_b, y_c, ...$. Therefore, we must solve $Ax = y_a, Ax = y_b, ...$ many times, since the augmented matrix $[A, y]$ changes every time. Obviously, this is really inefficient. Is there a method by which we only change the left-hand side $A$ but not the right-hand side?

The $LU$ decomposition method changes the matrix $A$ only, instead of $y$. It is ideal for solving the system with the same coefficient matrices $A$ but different constant vectors $y$. The $LU$ decomposition method aims to turn $A$ into the product of two matrices $L$ and $U$, where $L$ is a lower triangular matrix while $U$ is an upper triangular matrix. With this decomposition, we convert the system of equations to the following form:

$$LUx = y → \left[\begin{array}{cc}
l_{1, 1} & 0 & 0 & 0\\
l_{2, 1} & l_{2, 2} & 0 & 0\\
l_{3, 1} & l_{3, 2} & l_{3,3} & 0\\
l_{4, 1} & l_{4, 2} & l_{4,3} & l_{4,4}\\
\end{array}\right]\left[\begin{array}{cc}
u_{1, 1} & u_{1, 2} & u_{1,3} & u_{1,4}\\
0 & u_{2, 2} & u_{2,3} & u_{2,4}\\
0 & 0 & u_{3,3} & u_{3,4}\\
0 & 0 & 0 & u_{4,4}\\
\end{array}\right]\left[\begin{array}{cc}
x_{1}\\
x_{2}\\
x_{3}\\
x_{4}\\
\end{array}\right] = \left[\begin{array}{cc}
y_{1}\\
y_{2}\\
y_{3}\\
y_{4}\\
\end{array}\right].$$

If we define $Ux = M$, then the above equations become:

$$\left[\begin{array}{cc}
l_{1, 1} & 0 & 0 & 0\\
l_{2, 1} & l_{2, 2} & 0 & 0\\
l_{3, 1} & l_{3, 2} & l_{3,3} & 0\\
l_{4, 1} & l_{4, 2} & l_{4,3} & l_{4,4}\\
\end{array}\right]M = \left[\begin{array}{cc}
y_{1}\\
y_{2}\\
y_{3}\\
y_{4}\\
\end{array}\right].$$

We can easily solve the above problem by forward substitution. After we solve for $M$, we can easily solve the rest of the problem using backward substitution:

$$\left[\begin{array}{cc}
u_{1, 1} & u_{1, 2} & u_{1,3} & u_{1,4}\\
0 & u_{2, 2} & u_{2,3} & u_{2,4}\\
0 & 0 & u_{3,3} & u_{3,4}\\
0 & 0 & 0 & u_{4,4}\\
\end{array}\right]\left[\begin{array}{cc}
x_{1}\\
x_{2}\\
x_{3}\\
x_{4}\\
\end{array}\right] = \left[\begin{array}{cc}
m_{1}\\
m_{2}\\
m_{3}\\
m_{4}\\
\end{array}\right].$$

But how do we obtain the $L$ and $U$ matrices? There are different ways to obtain the $LU$ decomposition.
Below is one example that uses the Gauss elimination method. From the above, we know that we obtain an upper triangular matrix after we conduct the Gauss elimination. At the same time, we also obtain the lower triangular matrix even though it is never explicitly written out. During the Gauss elimination procedure, the matrix $A$ actually turns into the product of two matrices as shown below. The right upper triangular matrix is the one we obtained earlier. The diagonal elements in the left lower triangular matrix are 1, and the elements below the diagonal elements are the multipliers that multiply the pivot equations to eliminate the elements during the calculation:

$$A = \left[\begin{array}{cc}
1 & 0 & 0 & 0\\
m_{2, 1} & 1 & 0 & 0\\
m_{3, 1} & m_{3, 2} & 1 & 0\\
m_{4, 1} & m_{4, 2} & m_{4, 3} & 1\\
\end{array}\right]\left[\begin{array}{cc}
u_{1, 1} & u_{1, 2} & u_{1, 3} & u_{1, 4}\\
0 & u_{2, 2} & u_{2, 3} & u_{2, 4}\\
0 & 0 & u_{3, 3} & u_{3, 4}\\
0 & 0 & 0 & u_{4, 4}\\
\end{array}\right].$$


Note that we obtain both $L$ and $U$ at the same time when we perform the Gauss elimination. For the coefficient matrix $A = [[4, 3, -5], [-2, -4, 5], [8, 8, 0]]$, where $U$ is the upper triangular matrix produced by the elimination and $L$ is composed of the multipliers, we obtain:


$$ L = \left[\begin{array}{ccc}
1 & 0 & 0\\
-0.5 & 1 & 0\\
2 & -0.8 & 1\\
\end{array}\right],$$

$$ U = \left[\begin{array}{ccc}
4 & 3 & -5\\
0 & -2.5 & 2.5\\
0 & 0 & 12\\
\end{array}\right].$$

**Try it!** Verify that the above $L$ and $U$ matrices are the $LU$ decomposition of matrix $A$. The result should be $A = LU$.

In [None]:
import numpy as np

u = np.array([[4, 3, -5],
              [0, -2.5, 2.5],
              [0, 0, 12]])
l = np.array([[1, 0, 0],
              [-0.5, 1, 0],
              [2, -0.8, 1]])
print(f"LU = \n {np.dot(l, u)}")

LU = 
 [[ 4.  3. -5.]
 [-2. -4.  5.]
 [ 8.  8.  0.]]
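
The procedure described in this section can be sketched end to end: factor $A$ once by Gauss elimination while recording the multipliers, then solve for several right-hand sides with forward and backward substitution. This is an illustrative version without pivoting (nonzero pivots are assumed). The function names are hypothetical, $A$ is the coefficient matrix from Section 14.5, and the second right-hand side is made up.

```python
import numpy as np


def lu_no_pivot(A):
    """LU decomposition via Gauss elimination; multipliers are stored in L."""
    n = A.shape[0]
    U = A.astype(float)   # astype returns a copy, so A is left untouched
    L = np.eye(n)
    for i in range(n - 1):
        for j in range(i + 1, n):
            L[j, i] = U[j, i] / U[i, i]
            U[j, i:] -= L[j, i] * U[i, i:]
    return L, U


def forward_substitution(L, y):
    """Solve L m = y from the first equation down."""
    m = np.zeros(len(y))
    for i in range(len(y)):
        m[i] = (y[i] - L[i, :i] @ m[:i]) / L[i, i]
    return m


def backward_substitution(U, m):
    """Solve U x = m from the last equation up."""
    n = len(m)
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        x[i] = (m[i] - U[i, i + 1:] @ x[i + 1:]) / U[i, i]
    return x


A = np.array([[4, 3, -5], [-2, -4, 5], [8, 8, 0]])
L, U = lu_no_pivot(A)
print(L)
print(U)

# Reuse the same factors for several right-hand sides
for y in (np.array([2, 5, -3]), np.array([1, 0, 0])):
    x = backward_substitution(U, forward_substitution(L, y))
    print(x.round(4), np.allclose(A @ x, y))
```
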


####**14.4.4 Iterative Methods - Gauss-Seidel Method**

The methods introduced above are all direct methods where the solution is computed using a finite number of operations. This section introduces a different class of methods, namely the **iterative methods**, or **indirect methods**. They start with an initial guess of the solution and then repeatedly improve the solution until the change of the solution is below a chosen threshold. In order to use this iterative process, we first need to write the explicit form of a system of equations. If we have a system of linear equations

$$\left[\begin{array}{cc}
a_{1, 1} & a_{1, 2} & ... & a_{1, n}\\
a_{2, 1} & a_{2, 2} & ... & a_{2, n}\\
. & . & . & .\\
. & . & . & .\\
. & . & . & .\\
a_{m, 1} & a_{m, 2} & ... & a_{m, n}\\
\end{array}\right]\left[\begin{array}{cc}
x_{1}\\
x_{2}\\
.\\
.\\
.\\
x_{n}\\
\end{array}\right] = \left[\begin{array}{cc}
y_{1}\\
y_{2}\\
.\\
.\\
.\\
y_{m}\\
\end{array}\right],$$

we can write its explicit form as

$$x_i = \frac{1}{a_{i, i}}[y_i-\sum^{j = n}_{j = 1, j\neq{i}}{a_{i, j}x_j}].$$

This is the basis of the iterative methods; we can assume initial values for all the $x$, and use them as $x^{(0)}$. In the first iteration, we can substitute $x^{(0)}$ into the right-hand side of the explicit equation above to obtain the first iteration solution $x^{(1)}$. By substituting $x^{(1)}$ into the equation, we obtain $x^{(2)}$, and the iterations continue until the difference between $x^{(k)}$ and $x^{(k-1)}$ is smaller than some predefined value.

Iterative methods require specific conditions for the solution to converge. A sufficient, but not necessary, condition for convergence is that the coefficient matrix $a$ is **diagonally dominant**. This means that in each row of the matrix of coefficients $a$, the absolute value of the diagonal element is greater than the sum of the absolute values of the off-diagonal elements. If the coefficient matrix satisfies this condition, the iterations will converge to the solution. Note that the solution process might still converge even when this condition is not satisfied.

####**14.4.4.1 Gauss-Seidel Method**

The **Gauss-Seidel method** is a specific iterative method that always uses the latest estimated value for each element in $x$. For example, first assume that the initial values for $x_2, x_3, ..., x_n$ (except for $x_1$) are given and calculate $x_1$. Using the calculated $x_1$ and the rest of the $x$ (except for $x_2$), we can calculate $x_2$. Continuing in the same manner and calculating all the elements in $x$ will conclude the first iteration. The unique part of the Gauss-Seidel method is the use of the latest value to calculate the next value in $x$. Such iterations are continued until the value converges. Let us use this method to solve the system below.

**Example**: Solve the following system of linear equations using the Gauss-Seidel method with a predefined threshold $\epsilon = 0.01$. Remember to check whether the convergence condition is satisfied.

$$8x_1 + 3x_2 - 3x_3 = 14,$$
$$-2x_1-8x_2+5x_3 = 5,$$
$$3x_1+5x_2+10x_3 = -8.$$

Let us first check if the coefficient matrix is diagonally dominant or not.



In [None]:
import numpy as np
a = np.array([[8, 3, -3], [-2, -8, 5], [3, 5, 10]])

# Find diagonal coefficients
diag = np.diag(np.abs(a))

# Find row sum without diagonal
off_diag = np.sum(np.abs(a), axis = 1) - diag

if np.all(diag > off_diag):
  print("Matrix is diagonally dominant")

else:
  print("Not diagonally dominant")

Matrix is diagonally dominant


Since it is guaranteed to converge, we can use the Gauss-Seidel method to solve the system.

In [None]:
# Set initial conditions:
import numpy as np
import pandas as pd

x1 = 0
x2 = 0
x3 = 0
epsilon = 0.01
converged = False

x_old = np.array([x1, x2, x3])

print("Iteration results")

results = []  # List to store iteration results

for k in range(1, 50):
  # Each update uses the most recent values of the other unknowns
  x1 = (14 - 3*x2 + 3*x3)/8
  x2 = (5 + 2*x1 - 5*x3)/(-8)
  x3 = (-8 - 3*x1 - 5*x2)/10
  x = np.array([x1, x2, x3])

  # Check if the change in x is smaller than the threshold
  dx = np.sqrt(np.dot(x - x_old, x - x_old))

  results.append({"k": k, "x_1": x1, "x_2": x2, "x_3": x3})

  if dx < epsilon:
    converged = True
    print("Converged!")
    break

  # Assign the latest x value to the old value
  x_old = x

df_results = pd.DataFrame(results)
print(df_results)

if not converged:
  print("Not converged, increase the # of iterations")

Iteration results
Converged!
   k       x_1       x_2       x_3
0  1  1.750000 -1.062500 -0.793750
1  2  1.850781 -1.583789 -0.563340
2  3  2.132668 -1.510255 -0.684673
3  4  2.059593 -1.567819 -0.633968
4  5  2.100194 -1.546279 -0.656919
5  6  2.083510 -1.556452 -0.646827
6  7  2.091109 -1.552044 -0.651311
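
The hand-written update equations can be generalized to any square system by applying the explicit form $x_i = \frac{1}{a_{i,i}}[y_i - \sum_{j \neq i}{a_{i,j}x_j}]$ row by row. The sketch below is one possible way to write this; the function name `gauss_seidel`, its default arguments, and the zero starting vector are assumptions made for the example. The direct solution from `numpy.linalg.solve` is printed for comparison.

```python
import numpy as np


def gauss_seidel(A, y, epsilon=0.01, max_iter=50):
    """Gauss-Seidel iteration starting from the zero vector.
    Returns the estimate, the number of sweeps used, and a convergence flag."""
    n = len(y)
    x = np.zeros(n)
    for k in range(1, max_iter + 1):
        x_old = x.copy()
        for i in range(n):
            # Sum over j != i; x[:i] already holds values updated in this sweep
            s = A[i, :i] @ x[:i] + A[i, i + 1:] @ x[i + 1:]
            x[i] = (y[i] - s) / A[i, i]
        if np.linalg.norm(x - x_old) < epsilon:
            return x, k, True
    return x, max_iter, False


a = np.array([[8, 3, -3], [-2, -8, 5], [3, 5, 10]])
y = np.array([14, 5, -8])
x, k, converged = gauss_seidel(a, y)
print(x, k, converged)
print(np.linalg.solve(a, y))   # direct solution, for comparison
```

Because the stopping rule looks only at the change between sweeps, the returned estimate agrees with the direct solution to roughly the size of the threshold, not to machine precision.
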


###**14.5 Solving System of Linear Equations in Python**

The examples presented above demonstrated the various methods we can use to solve systems of linear equations. This is also very easy to do in Python, as shown below. The easiest way to get the solution is via the `solve` function in `NumPy`.

**Try it!**: Use `numpy.linalg.solve` to solve the following equations:

$$4x_1 + 3x_2 - 5x_3 = 2,$$
$$-2x_1 - 4x_2 + 5x_3 = 5,$$
$$8x_1 + 8x_2 = -3.$$



In [None]:
import numpy as np

A = np.array([[4, 3, -5],
             [-2, -4, 5],
             [8, 8, 0]])
y = np.array([2, 5, -3])

x = np.linalg.solve(A, y)
print(x.round(4))

[ 2.2083 -2.5833 -0.1833]


We get the same results as those in the previous section when calculated by hand. Under the hood, the solver is actually doing an $LU$ decomposition to get the results.

**Try it!**: Try to solve the above equations using the matrix inversion approach.

In [None]:
A_inv = np.linalg.inv(A)
x = np.dot(A_inv, y)
print(x.round(4))

[ 2.2083 -2.5833 -0.1833]


We can also obtain the $L$ and $U$ matrices used in the $LU$ decomposition using the `SciPy` package.

In [None]:
from scipy.linalg import lu

P, L, U = lu(A)
print("P:\n", P)
print("L:\n", L)
print("U:\n", U)
print("LU:\n", np.dot(L, U))

P:
 [[0. 0. 1.]
 [0. 1. 0.]
 [1. 0. 0.]]
L:
 [[ 1.    0.    0.  ]
 [-0.25  1.    0.  ]
 [ 0.5   0.5   1.  ]]
U:
 [[ 8.   8.   0. ]
 [ 0.  -2.   5. ]
 [ 0.   0.  -7.5]]
LU:
 [[ 8.  8.  0.]
 [-2. -4.  5.]
 [ 4.  3. -5.]]
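
Note that the product $LU$ printed above is a row-permuted version of $A$: `scipy.linalg.lu` uses pivoting and returns a permutation matrix such that $A = PLU$. The sketch below checks this relation and also shows `lu_factor` and `lu_solve` from `SciPy`, which factor $A$ once and reuse the factors for several right-hand sides. The second right-hand side is a made-up example.

```python
import numpy as np
from scipy.linalg import lu, lu_factor, lu_solve

A = np.array([[4, 3, -5], [-2, -4, 5], [8, 8, 0]])
P, L, U = lu(A)
print(np.allclose(P @ L @ U, A))   # SciPy's convention: A = P L U

# Factor once, then solve for several right-hand sides
lu_piv = lu_factor(A)
solutions = []
for y in (np.array([2, 5, -3]), np.array([1, 0, 0])):
    x = lu_solve(lu_piv, y)
    solutions.append(x)
    print(x.round(4), np.allclose(A @ x, y))
```
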


###**14.6 Matrix Inversion**

We defined the inverse of a square matrix $M$ as a matrix of the same size, $M^{-1}$, such that $M \cdot M^{-1} = M^{-1} \cdot M = I$. If the dimension of the matrix is high, the analytical solution for the matrix inversion will be complicated. Therefore, we need some other efficient ways to obtain the inverse of the matrix.

Let us use a $4 ×  4$ matrix for illustration. Suppose we have

$$M = \left[\begin{array}{cccc}
m_{1, 1} & m_{1, 2} & m_{1, 3} & m_{1, 4}\\
m_{2, 1} & m_{2, 2} & m_{2, 3} & m_{2, 4}\\
m_{3, 1} & m_{3, 2} & m_{3, 3} & m_{3, 4}\\
m_{4, 1} & m_{4, 2} & m_{4, 3} & m_{4, 4}\\
\end{array}\right],$$

and the inverse of $M$ is

$$X = \left[\begin{array}{cc}
x_{1, 1} & x_{1, 2} & x_{1, 3} & x_{1, 4}\\
x_{2, 1} & x_{2, 2} & x_{2, 3} & x_{2, 4}\\
x_{3, 1} & x_{3, 2} & x_{3, 3} & x_{3, 4}\\
x_{4, 1} & x_{4, 2} & x_{4, 3} & x_{4, 4}\\
\end{array}\right],$$

Therefore, we will have

$$M \cdot X =\left[\begin{array}{cccc}
m_{1, 1} & m_{1, 2} & m_{1, 3} & m_{1, 4}\\
m_{2, 1} & m_{2, 2} & m_{2, 3} & m_{2, 4}\\
m_{3, 1} & m_{3, 2} & m_{3, 3} & m_{3, 4}\\
m_{4, 1} & m_{4, 2} & m_{4, 3} & m_{4, 4}\\
\end{array}\right] \left[\begin{array}{cc}
x_{1, 1} & x_{1, 2} & x_{1, 3} & x_{1, 4}\\
x_{2, 1} & x_{2, 2} & x_{2, 3} & x_{2, 4}\\
x_{3, 1} & x_{3, 2} & x_{3, 3} & x_{3, 4}\\
x_{4, 1} & x_{4, 2} & x_{4, 3} & x_{4, 4}\\
\end{array}\right] = \left[\begin{array}{cc}
1 & 0 & 0 & 0\\
0 & 1 & 0 & 0\\
0 & 0 & 1 & 0\\
0 & 0 & 0 & 1\\
\end{array}\right].$$

We can rewrite the above equation as four separate equations, i.e.,

$$\left[\begin{array}{cc}
m_{1, 1} & m_{1, 2} & m_{1, 3} & m_{1, 4}\\
m_{2, 1} & m_{2, 2} & m_{2, 3} & m_{2, 4}\\
m_{3, 1} & m_{3, 2} & m_{3, 3} & m_{3, 4}\\
m_{4, 1} & m_{4, 2} & m_{4, 3} & m_{4, 4}\\
\end{array}\right]\left[\begin{array}{cc}
x_{1, 1}\\
x_{2, 1}\\
x_{3, 1}\\
x_{4, 1}\\
\end{array}\right] = \left[\begin{array}{cc}
1\\
0\\
0\\
0\\
\end{array}\right],$$

$$\left[\begin{array}{cc}
m_{1, 1} & m_{1, 2} & m_{1, 3} & m_{1, 4}\\
m_{2, 1} & m_{2, 2} & m_{2, 3} & m_{2, 4}\\
m_{3, 1} & m_{3, 2} & m_{3, 3} & m_{3, 4}\\
m_{4, 1} & m_{4, 2} & m_{4, 3} & m_{4, 4}\\
\end{array}\right]\left[\begin{array}{cc}
x_{1, 2}\\
x_{2, 2}\\
x_{3, 2}\\
x_{4, 2}\\
\end{array}\right] = \left[\begin{array}{cc}
0\\
1\\
0\\
0\\
\end{array}\right],$$

$$\left[\begin{array}{cc}
m_{1, 1} & m_{1, 2} & m_{1, 3} & m_{1, 4}\\
m_{2, 1} & m_{2, 2} & m_{2, 3} & m_{2, 4}\\
m_{3, 1} & m_{3, 2} & m_{3, 3} & m_{3, 4}\\
m_{4, 1} & m_{4, 2} & m_{4, 3} & m_{4, 4}\\
\end{array}\right]\left[\begin{array}{cc}
x_{1, 3}\\
x_{2, 3}\\
x_{3, 3}\\
x_{4, 3}\\
\end{array}\right] = \left[\begin{array}{cc}
0\\
0\\
1\\
0\\
\end{array}\right],$$

$$\left[\begin{array}{cc}
m_{1, 1} & m_{1, 2} & m_{1, 3} & m_{1, 4}\\
m_{2, 1} & m_{2, 2} & m_{2, 3} & m_{2, 4}\\
m_{3, 1} & m_{3, 2} & m_{3, 3} & m_{3, 4}\\
m_{4, 1} & m_{4, 2} & m_{4, 3} & m_{4, 4}\\
\end{array}\right]\left[\begin{array}{cc}
x_{1, 4}\\
x_{2, 4}\\
x_{3, 4}\\
x_{4, 4}\\
\end{array}\right] = \left[\begin{array}{cc}
0\\
0\\
0\\
1\\
\end{array}\right].$$
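
Each of the four systems above has the same coefficient matrix and a different column of the identity matrix on the right-hand side. A minimal sketch of that idea follows; the matrix `M` is a made-up example (not taken from the text), and `np.linalg.solve` stands in for any of the solvers discussed earlier.

```python
import numpy as np

# Hypothetical invertible 4 x 4 matrix, chosen only for illustration
M = np.array([[4.0, 1.0, 0.0, 2.0],
              [1.0, 3.0, 1.0, 0.0],
              [0.0, 1.0, 5.0, 1.0],
              [2.0, 0.0, 1.0, 6.0]])
n = M.shape[0]
I = np.eye(n)

# Solve M x_j = e_j for each column e_j of the identity
X = np.zeros((n, n))
for j in range(n):
  X[:, j] = np.linalg.solve(M, I[:, j])

# The solutions, stacked column by column, form the inverse of M
print(np.allclose(M @ X, I))
```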

Solving the above four systems of equations provides the inverse of the matrix. We can use any method introduced previously to solve these equations (e.g., Gauss elimination, Gauss-Jordan, and LU decomposition). Below is an example of matrix inversion using the Gauss-Jordan method. Recall that in the Gauss-Jordan method, we convert our problem from

$$\left[\begin{array}{cc}
m_{1, 1} & m_{1, 2} & m_{1, 3} & m_{1, 4}\\
m_{2, 1} & m_{2, 2} & m_{2, 3} & m_{2, 4}\\
m_{3, 1} & m_{3, 2} & m_{3, 3} & m_{3, 4}\\
m_{4, 1} & m_{4, 2} & m_{4, 3} & m_{4, 4}\\
\end{array}\right]\left[\begin{array}{cc}
x_{1}\\
x_{2}\\
x_{3}\\
x_{4}\\
\end{array}\right] = \left[\begin{array}{cc}
y_1\\
y_2\\
y_3\\
y_4\\
\end{array}\right]$$

to

$$\left[\begin{array}{cc}
1 & 0 & 0 & 0\\
0 & 1 & 0 & 0\\
0 & 0 & 1 & 0\\
0 & 0 & 0 & 1\\
\end{array}\right]\left[\begin{array}{cc}
x_{1}\\
x_{2}\\
x_{3}\\
x_{4}\\
\end{array}\right] = \left[\begin{array}{cc}
y^{'}_1\\
y^{'}_2\\
y^{'}_3\\
y^{'}_4\\
\end{array}\right]$$

to obtain the solution. Essentially, we are converting


$$\left[\begin{array}{cc}
m_{1, 1} & m_{1, 2} & m_{1, 3} & m_{1, 4} & y_1\\
m_{2, 1} & m_{2, 2} & m_{2, 3} & m_{2, 4} & y_2\\
m_{3, 1} & m_{3, 2} & m_{3, 3} & m_{3, 4} & y_3\\
m_{4, 1} & m_{4, 2} & m_{4, 3} & m_{4, 4} & y_4\\
\end{array}\right]$$

to

$$\left[\begin{array}{cc}
1 & 0 & 0 & 0 & y^{'}_1\\
0 & 1 & 0 & 0 & y^{'}_2\\
0 & 0 & 1 & 0 & y^{'}_3\\
0 & 0 & 0 & 1 & y^{'}_4\\
\end{array}\right].$$
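
To invert $M$, the same reduction can be applied to the augmented matrix $[M \mid I]$: each column of $I$ plays the role of $y$ for one of the four systems, so reducing the left block to the identity leaves $M^{-1}$ in the right block. The following is a minimal sketch under stated assumptions; the helper name `gauss_jordan_inverse` and the example matrix are hypothetical, and partial pivoting (row swaps) is added as a safeguard that the text above does not discuss.

```python
import numpy as np

def gauss_jordan_inverse(M):
  """Invert M by reducing [M | I] to [I | M^-1] (illustrative helper)."""
  n = M.shape[0]
  # Work on a float copy of the augmented matrix [M | I]
  aug = np.hstack((M.astype(float), np.eye(n)))
  for i in range(n):
    # Partial pivoting: bring the largest entry of column i to the diagonal
    p = i + np.argmax(np.abs(aug[i:, i]))
    if np.isclose(aug[p, i], 0.0):
      raise ValueError("matrix is singular to working precision")
    aug[[i, p]] = aug[[p, i]]
    # Scale the pivot row so that the pivot equals 1
    aug[i] = aug[i] / aug[i, i]
    # Eliminate column i from all other rows
    for k in range(n):
      if k != i:
        aug[k] = aug[k] - aug[k, i] * aug[i]
  # The right half now holds the inverse
  return aug[:, n:]

# Hypothetical 4 x 4 example matrix
M = np.array([[4, 1, 0, 2],
              [1, 3, 1, 0],
              [0, 1, 5, 1],
              [2, 0, 1, 6]])
M_inv = gauss_jordan_inverse(M)
print(np.allclose(M @ M_inv, np.eye(4)))
```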


####**14.7.2 Problems**

1. Write a function `my_is_orthogonal(v1, v2, tol)` where `v1` and `v2` are column vectors of the same size, and `tol` is a scalar value strictly larger than zero. The output should be 1 if the angle $\theta$ between `v1` and `v2` is within `tol` of $\pi/2$, that is, $|\pi/2 - \theta| <$ `tol`, and 0 otherwise.

In [None]:
import math
math.pi

3.141592653589793

In [None]:
# Required libraries
import numpy as np
from numpy.linalg import norm
import math

def my_is_orthogonal(v1, v2, tol):
  """
  Function that evaluates whether v1 and v2
  are orthogonal vectors (within the tolerance tol)
  """
  # Calculate the angle between the two vectors based on dot product definition
  dot_product = np.dot(v1.T, v2).item()
  norms_product = norm(v1, 2) * norm(v2, 2)
  # Handle potential division by zero if either vector is the zero vector
  if norms_product == 0:
    return 0  # Or raise an error, depending on desired behavior
  # Clip to [-1, 1] so that round-off cannot push arccos outside its domain
  theta = np.arccos(np.clip(dot_product / norms_product, -1.0, 1.0))

  # Evaluate conditional
  if np.abs(math.pi/2 - theta) < tol:
    return 1
  else:
    return 0

In [None]:
# Test case 1
a = np.array([[1], [0.001]])
b = np.array([[0.001], [1]])
print(my_is_orthogonal(a, b, 0.01))
print(my_is_orthogonal(a, b, 0.001))

1
0


In [None]:
# Test case 2
a = np.array([[1], [0.001]])
b = np.array([[1], [1]])
my_is_orthogonal(a,b, 0.01)

0

In [None]:
# test case 3
a = np.array([[1], [1]])
b = np.array([[-1], [1]])
my_is_orthogonal(a,b, 1e-10)

1

2. Write a function `my_make_lin_ind(A)` where the input $A$ and the output $B$ are matrices. Let rank$(A) = n$. Then $B$ should be a matrix containing the first $n$ columns of $A$ that are all linearly independent. Note that this implies that $B$ has full rank.

In [None]:
# Required Libraries
import numpy as np
import sympy

def my_make_lin_ind(A):
  """
  Return the linearly independent columns of A,
  i.e., the pivot columns of its reduced row echelon form
  """
  # rref() returns the reduced matrix and the indices of the pivot columns
  _, inds = sympy.Matrix(A).rref()
  # Keep only the pivot columns of A, in their original order
  return A[:, list(inds)]



In [None]:
# Test case
A = np.array([[12,24,0,11,-24,18,15],
              [19,38,0,10,-31,25,9],
              [1,2,0,21,-5,3,20],
              [6,12,0,13,-10,8,5],
              [22,44,0,2,-12,17,23]])
B = my_make_lin_ind(A)
B

array([[ 12,  11, -24,  15],
       [ 19,  10, -31,   9],
       [  1,  21,  -5,  20],
       [  6,  13, -10,   5],
       [ 22,   2, -12,  23]])
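
The same columns can also be selected without `sympy`. The sketch below is one possible alternative, not the method used above: it scans the columns from left to right and keeps a column only if it raises the numerical rank of the columns kept so far. The helper name `make_lin_ind_greedy` is hypothetical, and the result depends on the default tolerance of `matrix_rank`.

```python
import numpy as np
from numpy.linalg import matrix_rank

def make_lin_ind_greedy(A):
  """Keep each column that raises the rank of the columns kept so far."""
  kept = []
  for j in range(A.shape[1]):
    candidate = kept + [j]
    # The candidate set is independent only if it has full column rank
    if matrix_rank(A[:, candidate]) == len(candidate):
      kept = candidate
  return A[:, kept]

A = np.array([[12, 24, 0, 11, -24, 18, 15],
              [19, 38, 0, 10, -31, 25, 9],
              [1, 2, 0, 21, -5, 3, 20],
              [6, 12, 0, 13, -10, 8, 5],
              [22, 44, 0, 2, -12, 17, 23]])
B = make_lin_ind_greedy(A)
print(B.shape, matrix_rank(B))
```

Because the rank is recomputed for every column, this approach is slower than a single row reduction, but it relies on floating-point NumPy routines only.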

**Testing some functionalities for problem 2.**

In [None]:
import numpy as np
from numpy.linalg import cond, matrix_rank
import sympy

A = np.array([[12,24,0,11,-24,18,15],
              [19,38,0,10,-31,25,9],
              [1,2,0,21,-5,3,20],
              [6,12,0,13,-10,8,5],
              [22,44,0,2,-12,17,23]])
n = matrix_rank(A)
_, inds = sympy.Matrix(A).rref()
inds

(0, 3, 4, 6)

In [None]:
A.shape[0]

5

3. Cramer's rule is a method of computing the determinant of a matrix (the expansion below is also widely known as the Laplace, or cofactor, expansion along the first row). Consider an $n \times n$ square matrix $M$. Let $M(i, j)$ be the element of $M$ in the $i$th row and $j$th column of $M$, and let $m_{i, j}$ be the minor of $M$ created by removing the $i$th row and $j$th column from $M$. Cramer's rule says that

$$\det(M) = \sum^n_{i = 1}{(-1)^{i-1}M(1, i)\,\det(m_{1, i})}.$$

Write a function `my_rec_det(M)` where the output is det$(M)$. Use Cramer's rule to compute the determinant, not `NumPy`'s function.

In [None]:
import numpy as np

def my_rec_det(M):
  """
  Cramer's rule to compute the determinant of a matrix
  """
  # Get the dimensions of M (square matrix)
  n = M.shape[0]

  # Initial conditionals to calculate the determinant
  if n == 1:
    return M[0][0]
  elif n == 2:
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

  else:
    det = 0
    for i in range(n):
      # Create the submatrix (minor)
      minor = []
      for row in range(1, n):
        new_row = []
        for col in range(n):
          if col != i:
            new_row.append(M[row][col])
        minor.append(new_row)

      # Apply Cramer's formula:
      cramer = ((-1)**i)*M[0][i]*my_rec_det(np.array(minor))
      det += cramer
    return det

In [None]:
M = np.array([[1, 2, 3],
             [-4, -5, 6],
             [13, 17, 7]])
det_value = my_rec_det(M)
print(f"The determinant of M is:\n{det_value}")

The determinant of M is:
66
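
As a cross-check, the same first-row expansion can be written more compactly by building each minor with `np.delete` and comparing the result with `np.linalg.det`. This is only a sketch; the helper name `cofactor_det` is hypothetical and is not part of the exercise.

```python
import numpy as np

def cofactor_det(M):
  """Determinant by first-row cofactor expansion (illustrative helper)."""
  n = M.shape[0]
  if n == 1:
    return M[0, 0]
  det = 0
  for i in range(n):
    # Remove row 0 and column i to obtain the minor
    minor = np.delete(np.delete(M, 0, axis=0), i, axis=1)
    det += (-1)**i * M[0, i] * cofactor_det(minor)
  return det

M = np.array([[1, 2, 3],
              [-4, -5, 6],
              [13, 17, 7]])
print(cofactor_det(M), np.linalg.det(M))
```

The recursive expansion needs on the order of $n!$ multiplications, so it is practical only for small matrices; `np.linalg.det` relies on an LU factorization instead.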


4. Let $p$ be a vector with length $L$ containing the coefficients of a polynomial of order $L - 1$. For example, the vector $p = [1, 0, 2]$ is a representation of the polynomial $f(x) = 1x^2 + 0x + 2$. Write a function `my_poly_der_mat(p)` where $p$ is the aforementioned vector, and the output $D$ is the matrix that will return the coefficients of the derivative of $p$ when $p$ is left multiplied by $D$. For example, the derivative of $f(x)$ is $f^{'}(x) = 2x$; therefore, $d = Dp$ should yield $d = [2, 0]$. Note this implies that the dimension of $D$ is $(L - 1) \times L$. The point of this problem is to show that differentiating polynomials is actually a linear transformation.



In [None]:
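def my_poly_der_mat(p):
  """
  Function that returns the matrix D representing the derivative
  of a polynomial p as a linear transformation, so that d = D @ p
  """
  L = len(p)
  # Row i carries the power L-1-i of the i-th coefficient on the
  # diagonal; the last column is zero because the constant term vanishes
  return np.hstack((np.diag(np.arange(L - 1, 0, -1)), np.zeros((L - 1, 1))))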


5. Use the Gauss elimination method to solve the following equations:

$$3x_1-x_2+4x_3 = 2,$$
$$17x_1+2x_2+x_3 = 14,$$
$$x_1+12x_2-7x_3 = 54.$$


In [None]:
import numpy as np
A = np.array([[3, -1, 4],
              [17, 2, 1],
              [1, 12, -7]])
y = np.array([[2], [14], [54]])
A_y = np.concatenate((A, y), axis = 1)
A_y

array([[ 3, -1,  4,  2],
       [17,  2,  1, 14],
       [ 1, 12, -7, 54]])

In [None]:
b = A_y[0]*(-17/3)

In [None]:
A_y[1] = A_y[1] + b
A_y

array([[  3,  -1,   4,   2],
       [  0,   7, -21,   2],
       [  1,  12,  -7,  54]])

In [None]:
a2 = A_y[0]*(-A_y[1][0]/A_y[0][0])
A_y[1] = A_y[1] + a2
A_y

array([[  3,  -1,   4,   2],
       [  0,   7, -21,   2],
       [  1,  12,  -7,  54]])

In [None]:
a3 = A_y[0]*(-A_y[2][0]/A_y[0][0])
A_y[2] = A_y[2] + a3
A_y

array([[  3,  -1,   4,   2],
       [  0,   7, -21,   2],
       [  0,  12,  -8,  53]])

In [None]:
a3_1 = A_y[1]*(-A_y[2][1]/A_y[1][1])
A_y[2] = A_y[2] + a3_1
A_y

array([[  3,  -1,   4,   2],
       [  0,   7, -21,   2],
       [  0,   0,  28,  49]])

In [None]:
A_y[2,2]

np.int64(28)
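
One caveat about the elimination steps above: `A_y` was created from integer arrays, so NumPy truncates every updated row toward zero when it is assigned back. The exact second row after the first step is $[0, 23/3, -65/3, 8/3]$, not $[0, 7, -21, 2]$. A minimal sketch of the difference, using the same system (the names `A_y_int` and `A_y_float` are introduced here only for illustration):

```python
import numpy as np

A_y_int = np.array([[3, -1, 4, 2],
                    [17, 2, 1, 14],
                    [1, 12, -7, 54]])
A_y_float = A_y_int.astype(float)

# Same multiplier as in the first elimination step
factor = -A_y_int[1, 0] / A_y_int[0, 0]

# Integer array: the float row is truncated on assignment
A_y_int[1] = A_y_int[1] + A_y_int[0] * factor
# Float array: the fractions are preserved
A_y_float[1] = A_y_float[1] + A_y_float[0] * factor

print(A_y_int[1])
print(A_y_float[1].round(4))
```

Casting the augmented matrix with `.astype(float)` before eliminating avoids this loss of precision.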

In [None]:
def gauss_el(A, y):
  """
  Solve a 3 x 3 system of linear equations
  employing the Gauss elimination methodology
  """
  # Step 1: Obtain the augmented matrix [A, y]. Cast it to float so that
  # the row operations below are not truncated back to integers
  A_y = np.concatenate((A, y), axis = 1).astype(float)

  # Step 2: Forward elimination (assumes nonzero pivots, so no row swaps)
  a2 = A_y[0]*(-A_y[1][0]/A_y[0][0])
  A_y[1] = A_y[1] + a2
  a3 = A_y[0]*(-A_y[2][0]/A_y[0][0])
  A_y[2] = A_y[2] + a3
  a3_1 = A_y[1]*(-A_y[2][1]/A_y[1][1])
  A_y[2] = A_y[2] + a3_1

  # Step 3: Back substitution
  x3 = A_y[2][3]/A_y[2][2]
  x2 = (A_y[1][3] - A_y[1][2]*x3)/A_y[1][1]
  x1 = (A_y[0][3] - A_y[0][2]*x3 - A_y[0][1]*x2)/A_y[0][0]
  results = [x1, x2, x3]
  return results

In [None]:
import numpy as np
A = np.array([[3, -1, 4],
              [17, 2, 1],
              [1, 12, -7]])
y = np.array([[2], [14], [54]])

print(np.round(gauss_el(A, y), 8))

[0.05901639 5.57377049 1.84918033]

In [None]:
y = np.array([2, 14, 54])
x = np.linalg.solve(A, y)
print(x)

[0.05901639 5.57377049 1.84918033]
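
The hard-coded steps in `gauss_el` only cover a $3 \times 3$ system with nonzero pivots. A more general version can be sketched as follows; it is one possible formulation rather than the textbook's reference solution, the helper name `gauss_elimination` is hypothetical, and partial pivoting is added as an assumption to guard against zero or very small pivots.

```python
import numpy as np

def gauss_elimination(A, y):
  """Solve A x = y by Gauss elimination with partial pivoting (sketch)."""
  n = A.shape[0]
  # Float augmented matrix [A, y]; y may be 1-D or a column vector
  aug = np.hstack((A.astype(float), np.reshape(y, (n, 1)).astype(float)))
  # Forward elimination
  for i in range(n):
    # Swap in the row with the largest entry in column i
    p = i + np.argmax(np.abs(aug[i:, i]))
    if np.isclose(aug[p, i], 0.0):
      raise ValueError("matrix is singular to working precision")
    aug[[i, p]] = aug[[p, i]]
    for k in range(i + 1, n):
      aug[k] = aug[k] - (aug[k, i] / aug[i, i]) * aug[i]
  # Back substitution, from the last equation upwards
  x = np.zeros(n)
  for i in range(n - 1, -1, -1):
    x[i] = (aug[i, -1] - aug[i, i + 1:n] @ x[i + 1:]) / aug[i, i]
  return x

A = np.array([[3, -1, 4],
              [17, 2, 1],
              [1, 12, -7]])
y = np.array([[2], [14], [54]])
x = gauss_elimination(A, y)
print(np.allclose(A @ x, y.ravel()))
```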


6. Use the Gauss-Jordan elimination method to solve the equations in Problem 5.

In [None]:
import numpy as np
A = np.array([[3, -1, 4],
              [17, 2, 1],
              [1, 12, -7]])
y = np.array([[2], [14], [54]])
A_y = np.concatenate((A, y), axis = 1)

In [None]:
a2 = A_y[0]*(-A_y[1][0]/A_y[0][0])
A_y[1] = A_y[1] + a2
a3 = A_y[0]*(-A_y[2][0]/A_y[0][0])
A_y[2] = A_y[2] + a3
a3_1 = A_y[1]*(-A_y[2][1]/A_y[1][1])
A_y[2] = A_y[2] + a3_1
A_y2 = A_y.copy()

In [None]:
A_y[0] + A_y[1]*(-A_y[0,1]/A_y[1, 1])

array([3.        , 0.        , 1.        , 2.28571429])

In [None]:
a = A_y2[1]*(-A_y2[0,1]/A_y2[1, 1])
A_y2[0] = A_y2[0] + a
b = A_y2[1]*(-A_y2[0,2]/A_y2[1, 2])
A_y2[0] = A_y2[0] + b
c = A_y2[2]*(-A_y2[1,2]/A_y2[2, 2])
A_y2[1] = A_y2[1] + c
A_y2

array([[ 3,  0,  0,  2],
       [ 0,  7,  0, 38],
       [ 0,  0, 28, 49]])

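In [None]:
# Divide each row by its pivot so that the left block becomes the identity.
# A float copy (A_y3) is used because integer arrays would truncate the quotients.
A_y3 = A_y2.astype(float)
n = A_y3.shape[0]
for i in range(n):
  A_y3[i] = A_y3[i]/A_y3[i, i]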

In [None]:
A_y

array([[  3,  -1,   4,   2],
       [  0,   7, -21,   2],
       [  0,   0,  28,  49]])

In [None]:
def gauss_jordan(A, y):
  """
  Gauss-Jordan elimination method to solve
  the above system of linear equations
  """
  # Build the augmented matrix [A, y] as floats so that the row
  # operations are not truncated to integers
  A_y = np.concatenate((A, y), axis = 1).astype(float)
  n = A_y.shape[0]
  for i in range(n):
    # Scale the pivot row so that the pivot equals 1
    # (assumes a nonzero pivot; no row swaps are performed)
    A_y[i] = A_y[i]/A_y[i, i]
    # Eliminate column i from every other row
    for k in range(n):
      if k != i:
        A_y[k] = A_y[k] - A_y[k, i]*A_y[i]
  # The left block is now the identity, so the last column is the solution
  return A_y[:, -1]