# Introduction to Computer Programming and Numerical Methods

> **Mohamad M. Hallal, PhD** <br> Teaching Professor, UC Berkeley

[![License](https://img.shields.io/badge/license-CC%20BY--NC--ND%204.0-blue)](https://creativecommons.org/licenses/by-nc-nd/4.0/)
***

# Linear Algebra

1. [**Array Basics**](#s1)
2. [**Array Properties**](#s2)
3. [**Array Operations**](#s3)
4. [**Systems of Linear Equations**](#s4)

***

# 0. Motivation

Linear algebra is undeniably one of the most essential mathematical tools across various fields and disciplines. Its significance lies in its ability to describe or approximate countless real-world phenomena through linear relationships. In finance, linear algebra aids in portfolio optimization, enabling investors to maximize returns while minimizing risk. Medical imaging relies on linear algebra to reconstruct detailed images from complex data obtained through MRI and CT scans. Signal processing harnesses linear algebra to enhance audio and image quality. In environmental science, researchers leverage linear algebra to model climate patterns and simulate ecological systems. Linear algebra even plays a pivotal role in determining the most relevant web pages in a Google search.

This section provides a brief overview of the linear algebra vocabulary and concepts that are important for later chapters. However, the information in this section is in no way comprehensive and should not be considered a substitute for a full linear algebra course. Rather, our goal is to focus on numerical algorithms that answer questions in linear algebra, specifically, solving a system of linear equations.

**Learning objectives:**

* Use Python and MATLAB tools to perform various linear algebra calculations
* Write a system of linear equations in matrix form
* Determine whether a system of linear equations has no solutions, infinite solutions, or a unique solution
* Select appropriate Python and MATLAB tools to solve a system of linear equations

# 1. Array Basics <a id="s1"></a>

An important part of linear algebra is the symbolic language (notation). We will use the following notation conventions:
* A lowercase bold mathematical symbol such as $\boldsymbol{x}$ or $\boldsymbol{v}$ represents a vector (1-D array)
* If $\boldsymbol{v}$ is a vector then $v_i$ is the $i^{th}$ element of the vector
* An uppercase mathematical symbol such as ${A}$ or ${M}$ represents a matrix (2-D array)
* If ${M}$ is a matrix then $M_{i,j}$ is the element in the $i^{th}$ row and $j^{th}$ column
* Two bold symbols separated by a centered dot such as $\boldsymbol{v} \cdot \boldsymbol{w}$ represent the dot product of two vectors
* The juxtaposition of two symbols such as $A \boldsymbol{x}$ represents matrix multiplication


## 1.1. Vectors (1-D Arrays)

A vector is a 1-D array that can be written either horizontally as a row vector (i.e., all elements next to each other), or vertically as a column vector (i.e., all elements on top of each other).

$$
\boldsymbol{a} = 
\begin{bmatrix} 
2.1 & 3.4 & 9.3
\end{bmatrix} \ \ \ \ \ \ \
\boldsymbol{b} = 
\begin{bmatrix} 
2.1\\ 
3.4\\
9.3
\end{bmatrix}
$$

$\mathbf{\color{midnightblue}{\text{Python:}}}$
```python
import numpy as np  # NumPy is required for all Python examples in this chapter

a = np.array([[2.1, 3.4, 9.3]]) # comma ',' inside the same square bracket '[]' to separate columns

b = np.array([[2.1], 
              [3.4], 
              [9.3]]) # separate square brackets '[]' for each row and a comma ',' between square brackets
```

$\mathbf{\color{orange}{\text{MATLAB:}}}$
```octave
a = [2.1, 3.4, 9.3]  % comma ',' to separate columns (optional)

b = [2.1; 3.4; 9.3]  % semicolon ';' to separate rows (not optional)
```

When the orientation of a vector is not specified, the vector is usually assumed to be a column vector.

## 1.2. Matrices (2-D Arrays)

A matrix is a 2-D array of numbers arranged in rows and columns. In general, an $m \times n$ matrix is a rectangular table of numbers consisting of $m$ rows and $n$ columns.

$$
C = 
\begin{bmatrix} 
2.1 & 3.4 & 9.3\\ 
9.2 & -2.7 & 4.0
\end{bmatrix}_{2 \times 3}  \ \ \ \ \ \ \
D = 
\begin{bmatrix} 
2.1 & 9.2\\ 
3.4 & -2.7\\
9.3 & 4.0
\end{bmatrix}_{3 \times 2} 
$$

$\mathbf{\color{midnightblue}{\text{Python:}}}$
```python
C = np.array([[2.1, 3.4, 9.3],
              [9.2, -2.7, 4.0]])

D = np.array([[2.1, 9.2], 
              [3.4, -2.7], 
              [9.3, 4.0]])
```

$\mathbf{\color{orange}{\text{MATLAB:}}}$
```octave
C = [2.1, 3.4, 9.3; 9.2, -2.7, 4.0]

D = [2.1, 9.2; 3.4, -2.7; 9.3, 4.0]

```

## 1.3. Basic Array Properties

In linear algebra, array properties or attributes refer to specific characteristics associated with an array that provide essential information about its size, structure, and behavior. These attributes help in understanding and manipulating arrays in various computational contexts. Below are some of the common array properties and the syntax that can be used to obtain them in Python and MATLAB. The results below are based on the following matrix:

$$
C = 
\begin{bmatrix} 
2.1 & 3.4 & 9.3\\ 
9.2 & -2.7 & 4.0
\end{bmatrix}_{2 \times 3}
$$

| Property                 | Python Example | Python Output | MATLAB Example | MATLAB Output |
| :----------------------- | :------------- | :------------ | :------------- | :------------ |
| Number of dimensions     | `C.ndim`       | `2`           | `ndims(C)`     | `2`           |
| Length in each dimension | `C.shape`      | `(2, 3)`      | `size(C)`      | `[2 3]`       | 
| Total number of elements | `C.size`       | `6`           | `numel(C)`     | `6`           |

where `C` is the array name
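The Python column of the table can be reproduced with the short sketch below. It assumes only that NumPy is imported as `np`; the variable names `n_dims`, `shape`, and `n_elements` are chosen here for illustration.

```python
import numpy as np

# The 2 x 3 matrix C used in the table above
C = np.array([[2.1, 3.4, 9.3],
              [9.2, -2.7, 4.0]])

n_dims = C.ndim       # number of dimensions
shape = C.shape       # length in each dimension, as a tuple (rows, columns)
n_elements = C.size   # total number of elements

print(n_dims, shape, n_elements)
```

Note that `ndim`, `shape`, and `size` are attributes of the array, so they are written without parentheses.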

# 2. Array Properties <a id="s2"></a>

In addition to the basic array properties like dimension and shape, there are other array properties essential in linear algebra. Next, we briefly introduce some of these properties and demonstrate how to obtain them using Python and MATLAB.

## 2.1. Determinant

The **determinant** is an important property that is defined only for square matrices. A **square matrix** is an ${m} \times {m}$ matrix; that is, it has the same number of rows as columns. The determinant of a square matrix $M$ is denoted by $\det(M)$ or $|M|$. Calculating the determinant of a $2 \times 2$ matrix is simple, but the process becomes more involved as the size of the matrix increases.

The determinant of a $2 \times 2$ matrix can be computed as follows:
1. Multiply the top-left element ($\color{red}a$) by the bottom-right element ($\color{red}d$).
2. Subtract the product of the top-right element ($\color{blue}b$) and the bottom-left element ($\color{blue}c$) from the result obtained in step 1.

<br>

<center><figure>
  <img src="https://www.chilimath.com/wp-content/uploads/2018/12/animated-gif-determinant-of-2x2-matrix.gif" style="width:35%">
    <figcaption style="text-align:center"><strong>Determinant of a matrix:</strong> <a href="https://www.chilimath.com/lessons/advanced-algebra/determinant-2x2-matrix/">https://www.chilimath.com/</a></figcaption>   
</figure></center>

<br>

Calculating the determinant of a $3 \times 3$ matrix is a bit more involved than for a $2 \times 2$ matrix. It involves breaking the $3 \times 3$ matrix down into smaller $2 \times 2$ matrices. The process involves finding the determinants of these smaller $2 \times 2$ matrices and performing multiplications and subtractions according to a specific pattern.

$$
\begin{eqnarray*}
|M| = \begin{vmatrix}
a & b & c \\
d & e & f \\
g & h & i
\end{vmatrix} & = & a\begin{vmatrix}
\Box &\Box  &\Box  \\
\Box & e & f \\
\Box & h & i
\end{vmatrix} - b\begin{vmatrix}
\Box &\Box  &\Box  \\
d & \Box & f \\
g & \Box & i
\end{vmatrix}+c\begin{vmatrix}
\Box &\Box  &\Box  \\
d & e & \Box \\
g & h & \Box
\end{vmatrix} \\
&&\\
& = & a\begin{vmatrix}
e & f \\
h & i
\end{vmatrix} - b\begin{vmatrix}
d & f \\
g & i
\end{vmatrix}+c\begin{vmatrix}
d & e \\
g & h
\end{vmatrix} \\ 
&&\\
& = & a(ei-fh) - b(di-fg) + c(dh-eg)
\end{eqnarray*}$$

We can use a similar approach to calculate the determinant of larger matrices, but it is much easier to calculate it using Python or MATLAB.

$\mathbf{\color{midnightblue}{\text{Python:}}}$
```python
np.linalg.det(M)  # returns the determinant of matrix M
```

$\mathbf{\color{orange}{\text{MATLAB:}}}$
```octave
det(M)            % returns the determinant of matrix M
```
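As a quick check of the hand formulas above, the sketch below implements the $2 \times 2$ rule and the $3 \times 3$ first-row expansion and compares them with `np.linalg.det()`. The helper names `det2` and `det3` and the matrices `P` and `Q` are made up for this illustration.

```python
import numpy as np

def det2(M):
    """Determinant of a 2 x 2 matrix: ad - bc."""
    return M[0, 0] * M[1, 1] - M[0, 1] * M[1, 0]

def det3(M):
    """Determinant of a 3 x 3 matrix by expanding along the first row."""
    a, b, c = M[0]
    d, e, f = M[1]
    g, h, i = M[2]
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

P = np.array([[1.0, 2.0],
              [3.0, 4.0]])
Q = np.array([[2.0, 0.0, 1.0],
              [1.0, 3.0, 2.0],
              [1.0, 1.0, 4.0]])

det_P = det2(P)   # 1*4 - 2*3 = -2
det_Q = det3(Q)   # 2*(12 - 2) - 0*(4 - 2) + 1*(1 - 3) = 18

# np.linalg.det works in floating point, so compare with a tolerance
print(np.isclose(det_P, np.linalg.det(P)))
print(np.isclose(det_Q, np.linalg.det(Q)))
```

Because `np.linalg.det()` uses floating-point arithmetic, its result may differ from the hand calculation in the last few digits, which is why `np.isclose()` is used instead of `==`.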

The determinant helps us understand key properties of a matrix, especially when dealing with systems of linear equations. For example, it can tell us whether a system has a unique solution or not. We will explore the application of the determinant in more detail later in this chapter.

## 2.2. Inverse

The **inverse** of a square matrix is another matrix of the same size that, when multiplied with the original matrix, results in the identity matrix. The inverse is also only defined for square matrices. The inverse of a matrix is analogous to the inverse of real numbers. For example:

$$\text{The inverse of } 3 \text{ is } 3^{-1} \text{ because } 3 \times 3^{-1} = 1$$

Likewise:

$$\text{The inverse of matrix } M \text{ is another matrix of the same shape } M^{-1} \text{, where } M \times M^{-1} = I$$

where $I$ is the identity matrix.

### 2.2.1. Identity Matrix

The **identity matrix** is a square matrix with ones on the diagonal and zeros elsewhere. The identity matrix is usually denoted by $I$, and is analogous to the real number identity, 1. For any matrix $M$, multiplying it by the identity matrix $I$ of compatible size produces the same matrix $M$ (refer to section [3.2. Matrix Multiplication](#s3)).

$$\begin{bmatrix}
\mathbf{\color{teal}1}   & 0    & \dots & 0 \\
                     0   & \mathbf{\color{teal}1}   & \dots & 0 \\
                  \vdots & \vdots & \ddots & \vdots \\
                     0   & 0   & \dots & \mathbf{\color{teal}1}
\end{bmatrix}$$

For any $m \times n$ matrix $M$:

$$I_{m \times m} M_{m \times n} = M_{m \times n}$$

$$M_{m \times n} I_{n \times n} = M_{m \times n}$$

You can create an identity matrix in both Python and MATLAB with the following methods:

$\mathbf{\color{midnightblue}{\text{Python:}}}$
```python
np.eye(m)       # returns an m x m identity matrix
np.identity(m)  # returns an m x m identity matrix
```

$\mathbf{\color{orange}{\text{MATLAB:}}}$
```octave
eye(m)          % returns an m x m identity matrix
```

### 2.2.2. Inverse Matrix

Just like 0 has no inverse, not all matrices have inverses. A square matrix $M$ has an inverse, denoted $M^{-1}$, if and only if its determinant is nonzero: $|M| \neq 0$. In this case, the matrix is said to be **invertible** or **non-singular**. Otherwise, if $|M| = 0$, the matrix does not have an inverse and is said to be **singular**. If it exists, the inverse of a matrix is *unique*; that is, for an invertible matrix, there is only one inverse.

The inverse of a $2 \times 2$ matrix can be computed as follows:

$$
M^{-1} = \begin{bmatrix}
a & b \\
c & d\\
\end{bmatrix}^{-1} = \frac{1}{|M|}\begin{bmatrix}
d & -b \\
-c & a\\
\end{bmatrix}$$

where $|M|$ is the determinant of the matrix.

For larger matrices, computing the inverse analytically is more complex, but Python and MATLAB provide straightforward methods for this.

$\mathbf{\color{midnightblue}{\text{Python:}}}$
```python
np.linalg.inv(M)   # returns the inverse of matrix M
```

$\mathbf{\color{orange}{\text{MATLAB:}}}$
```octave
inv(M)             % returns the inverse of matrix M
```
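The $2 \times 2$ formula above can be verified numerically. The following is a minimal sketch using an arbitrary example matrix `M` (not taken from the text); it builds the inverse from the formula, compares it with `np.linalg.inv()`, and confirms that $M M^{-1}$ gives the identity matrix.

```python
import numpy as np

M = np.array([[4.0, 7.0],
              [2.0, 6.0]])

a, b = M[0]
c, d = M[1]
det_M = a * d - b * c   # 4*6 - 7*2 = 10, nonzero, so M is invertible

# Inverse from the 2 x 2 formula: (1/|M|) * [[d, -b], [-c, a]]
M_inv_formula = (1.0 / det_M) * np.array([[d, -b],
                                          [-c, a]])

# Inverse from NumPy
M_inv_numpy = np.linalg.inv(M)

print(np.allclose(M_inv_formula, M_inv_numpy))    # the two inverses agree
print(np.allclose(M @ M_inv_numpy, np.eye(2)))    # M times its inverse is I
```

The `@` operator used in the last line performs matrix multiplication, which is discussed in section 3.2.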

***

As mentioned earlier, not all matrices have inverses. For a matrix to have an inverse, first, it must be a square matrix (having the same number of rows and columns) and second, it should have a nonzero determinant. A matrix that is close to being singular (i.e., the determinant is close to 0) is called **ill-conditioned**. While ill-conditioned matrices have inverses, they can cause numerical issues similar to dividing by a very small number, leading to overflow, underflow, or significant round-off errors.
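To make the idea of ill-conditioning concrete, here is a small sketch with a nearly singular matrix. The matrix `A` and the vectors `b1` and `b2` are invented for this illustration. The two right-hand sides differ only in the fourth decimal place, yet multiplying by the inverse gives very different results.

```python
import numpy as np

# The two rows are almost identical, so the determinant is close to 0
A = np.array([[1.0, 1.0],
              [1.0, 1.0001]])
det_A = np.linalg.det(A)      # about 0.0001

b1 = np.array([2.0, 2.0001])
b2 = np.array([2.0, 2.0002])  # differs from b1 by only 0.0001

A_inv = np.linalg.inv(A)      # entries of the inverse are on the order of 10^4
x1 = A_inv @ b1               # approximately [1, 1]
x2 = A_inv @ b2               # approximately [0, 2]

print(det_A)
print(x1)
print(x2)
```

A change of $0.0001$ in one input moves the result from roughly $[1, 1]$ to roughly $[0, 2]$, which is the kind of sensitivity that makes ill-conditioned matrices troublesome in numerical work.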

## 2.3. Transpose 

The **transpose** of a matrix is obtained by interchanging its rows and columns. The transpose is denoted by a superscript $^T$; for example, $M^T$ is the transpose of matrix $M$.

**Example:**

$$
C = 
\begin{bmatrix} 
2.1 & 3.4 & 9.3\\ 
9.2 & -2.7 & 4.0
\end{bmatrix}_{2 \times 3}
$$

$$
C^T = 
\begin{bmatrix} 
2.1 & 9.2\\ 
3.4 & -2.7\\
9.3 & 4.0
\end{bmatrix}_{3 \times 2} 
$$

In general: 

$$ 
M^T(j, i) = M(i, j)
$$

The transpose of a matrix can be obtained in Python and MATLAB in several ways.

$\mathbf{\color{midnightblue}{\text{Python:}}}$
```python
np.transpose(M)  # using np.transpose() function
M.T              # using the .T attribute
```

$\mathbf{\color{orange}{\text{MATLAB:}}}$
```octave
transpose(M)     % using transpose() function
M'               % using the ' operator (same as transpose(M) when M is real-valued)
```
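A short sketch of the transpose in Python, using the matrix $C$ from the example above. The variable `v` is an extra 1-D array added here to illustrate a common pitfall.

```python
import numpy as np

C = np.array([[2.1, 3.4, 9.3],
              [9.2, -2.7, 4.0]])
C_T = C.T

print(C.shape, C_T.shape)       # the shape is reversed: 2 x 3 becomes 3 x 2
print(C_T[2, 0] == C[0, 2])     # element (j, i) of C^T equals element (i, j) of C

# Pitfall: transposing a 1-D array has no effect, because it has only one axis
v = np.array([2.1, 3.4, 9.3])
print(v.T.shape)                # still (3,)
```

To obtain a true column vector from a 1-D array, one option is `v.reshape(-1, 1)`, which returns a 2-D array with a single column.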

## 2.4. Norm

The **norm** of a vector or matrix, denoted $\Vert M \Vert$,  is a measure of its size or magnitude. It is a scalar value that quantifies how large the array is in some sense. We will next focus on the norm of a vector.

There are many ways of defining the length of a vector depending on the distance formula that is used. Consider the following vector:

$$
\boldsymbol{v} = 
\begin{bmatrix} 
2.1 & 3.4 & 9.3
\end{bmatrix}
$$

The length of $\boldsymbol{v}$ can be obtained as:
$$\Vert \boldsymbol{v} \Vert_{2} = \sqrt{2.1^2 + 3.4^2 + 9.3^2} \approx 10.122$$

This is known as the **$L_2$ norm** or Euclidean norm. In general, the **$L_2$ norm** of a vector $\boldsymbol{v}$ is:

$$\Vert \boldsymbol{v} \Vert_{2} = \left(\sum_i \left|v_{i}\right|^2\right)^{1/2}$$

There are other definitions for the norm. One such definition is the **$L_1$ norm**, also known as the Manhattan norm. The $L_1$ norm of a vector $\boldsymbol{v}$ is:

$$\Vert \boldsymbol{v} \Vert_{1} = \sum_i \left|v_{i}\right|$$

In general, the **p-norm**, $L_p$ with $p \geq 1$, of a vector $\boldsymbol{v}$ is:

$$\Vert \boldsymbol{v} \Vert_{p} = \left(\sum_i \left|v_{i}\right|^p\right)^{1/p}$$

Note that the **$L_\infty$ norm**, obtained in the limit $p \to \infty$, is equal to the maximum absolute value of the elements in $\boldsymbol{v}$.

The norm can be obtained in Python and MATLAB, but there are differences.

$\mathbf{\color{midnightblue}{\text{Python:}}}$
```python
np.linalg.norm(v)  # returns the 2-norm of vector v (square root of the sum of the squares of all elements)
```

$\mathbf{\color{orange}{\text{MATLAB:}}}$
```octave
norm(v)            % returns the 2-norm of vector v (square root of the sum of the squares of all elements)
norm(v, 1)         % returns the 1-norm of vector v (sum of the absolute value of all elements)
norm(v, Inf)       % returns the maximum absolute value of vector v
```

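<div class="alert alert-block alert-warning"> <b>NOTE!</b> Similar to MATLAB, <code>np.linalg.norm()</code> takes an optional second argument (<code>ord</code>) to select the order of the norm. However, its behavior depends on the shape of the input: a 1-D array is treated as a vector, whereas a 2-D array (including a row or column vector written with double square brackets) is treated as a matrix, so the same <code>ord</code> value can return different results for a row vector and a column vector.</div>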
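The sketch below illustrates the note above using the vector $\boldsymbol{v}$ from this section. The names `row` and `col` are introduced here for the 2-D versions of the same data. For 2-D input, NumPy interprets `ord=1` as the maximum absolute column sum and `ord=np.inf` as the maximum absolute row sum, which are matrix norms.

```python
import numpy as np

v = np.array([2.1, 3.4, 9.3])   # 1-D array: treated as a vector
row = v.reshape(1, -1)          # 2-D array with shape (1, 3)
col = v.reshape(-1, 1)          # 2-D array with shape (3, 1)

# Vector norms of the 1-D array
l2 = np.linalg.norm(v)            # sqrt(2.1^2 + 3.4^2 + 9.3^2), about 10.122
l1 = np.linalg.norm(v, 1)         # 2.1 + 3.4 + 9.3 = 14.8
linf = np.linalg.norm(v, np.inf)  # largest absolute value, 9.3

# The same ord values on 2-D input return matrix norms instead
row_ord1 = np.linalg.norm(row, 1)         # max column sum of a 1 x 3 array: 9.3
col_ord1 = np.linalg.norm(col, 1)         # max column sum of a 3 x 1 array: 14.8
row_ordinf = np.linalg.norm(row, np.inf)  # max row sum of a 1 x 3 array: 14.8
col_ordinf = np.linalg.norm(col, np.inf)  # max row sum of a 3 x 1 array: 9.3

print(l2, l1, linf)
print(row_ord1, col_ord1, row_ordinf, col_ordinf)
```

One way to avoid the ambiguity is to flatten the input first, for example `np.linalg.norm(row.ravel(), 1)`, so that it is always treated as a vector.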

# 3. Array Operations <a id="s3"></a>

In previous sections, we have seen that basic arithmetic operations can be applied to arrays. Specifically, if we have two `ndarray` objects in Python of the same shape, `A * B` takes every element of $A$ and multiplies it by the corresponding element of $B$ at the same index. This is known as element-by-element multiplication, where $A$ and $B$ should be arrays of the same shape. However, this is *different* from the standard matrix multiplication in linear algebra. Below we introduce the dot product of two vectors and then discuss matrix multiplication.

## 3.1. Dot Product

The dot product of two vectors $\boldsymbol{v} = [v_1, v_2, \dots, v_n]$ and $\boldsymbol{w} = [w_1, w_2,\dots, w_n]$ is defined as:

$$ {\displaystyle \boldsymbol{v} \cdot \boldsymbol{w} =\sum _{i=1}^{n}{v}_{i}{w}_{i}={v}_{1}{w}_{1}+{v}_{2}{w}_{2}+\cdots +{v}_{n}{w}_{n}}$$

The dot product between two vectors can be computed in Python and MATLAB as follows:

$\mathbf{\color{midnightblue}{\text{Python:}}}$
```python
np.dot(v, w)     # dot product using the np.dot() function
```

$\mathbf{\color{orange}{\text{MATLAB:}}}$
```octave
dot(v, w)        % dot product using the dot() function

```

***

Geometrically, the dot product of two vectors is given by:

$$ \boldsymbol{v} \cdot \boldsymbol{w} = \Vert \boldsymbol{v} \Vert \Vert \boldsymbol{w} \Vert \cos(\theta)$$

where $\Vert \boldsymbol{v} \Vert$ and $\Vert \boldsymbol{w} \Vert$ are the **$L_2$ norms** (i.e., lengths) of $\boldsymbol{v}$ and $\boldsymbol{w}$, respectively, and $\theta$ is the angle between the two vectors.
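Rearranging the geometric formula gives $\cos(\theta) = \dfrac{\boldsymbol{v} \cdot \boldsymbol{w}}{\Vert \boldsymbol{v} \Vert \Vert \boldsymbol{w} \Vert}$, which can be used to compute the angle between two vectors. The sketch below uses two illustrative vectors chosen so that the angle is easy to check by hand.

```python
import numpy as np

v = np.array([1.0, 0.0])   # points along the x-axis
w = np.array([1.0, 1.0])   # points along the diagonal

# cos(theta) = (v . w) / (||v|| ||w||)
cos_theta = np.dot(v, w) / (np.linalg.norm(v) * np.linalg.norm(w))

# Guard against round-off pushing the value slightly outside [-1, 1]
cos_theta = np.clip(cos_theta, -1.0, 1.0)

theta_deg = np.degrees(np.arccos(cos_theta))
print(theta_deg)   # approximately 45 degrees
```

A dot product of zero corresponds to $\cos(\theta) = 0$, meaning the two vectors are perpendicular.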

## 3.2. Matrix Multiplication

Matrix multiplication between two matrices, $A$ and $B$, is only defined when $A$ is an ${m} \times {n}$ matrix and $B$ is an ${n} \times {p}$ matrix. That is, the number of columns in $A$ must equal the number of rows in $B$. The result of $A \times B$ is a matrix $C$ that is $m \times p$.

<br>

<center><figure>
  <img src="https://docs.google.com/drawings/d/e/2PACX-1vS9x1GcveTWaBX65SvjzVFCzNAd93_z2O7bYIZ0jnbvSiGN3SVomnK9RWl606LpfbrrlOtViWC4aucV/pub?w=3204&h=1951" style="width:35%">
    <figcaption style="text-align:center"><strong> <br> Matrix multiplication requirements</strong></figcaption>   
</figure></center>

If $A$ and $B$ satisfy the above condition, then $C = A \times B$ can be obtained as follows:


$$
C(i, j) = \sum_{k=1}^n A(i, k)B(k,j)
$$

For each row in the first matrix, take the dot product with each column in the second matrix. Place the results onto the corresponding row in a new matrix:

$$C(i, j) = (\text{Row } i \text{ of matrix } A) \cdot (\text{Column } j \text{ of matrix } B)$$


<center><figure>
  <img src="https://notesbylex.com/_media/matrix-multiplication.gif" style="width:75%">
    <figcaption style="text-align:center"><strong> <br> Matrix multiplication procedure</strong> <a href="https://notesbylex.com/matrix-multiplication">https://notesbylex.com/</a> </figcaption>   
</figure></center>


Both matrix multiplication and element-wise multiplication can be performed in Python and MATLAB, but the syntax differs. Specifically, `*` performs element-wise multiplication in Python, whereas it performs matrix multiplication in MATLAB. To perform matrix multiplication on `ndarray` in Python, the `@` operator or the `np.dot()` function can be used.

$\mathbf{\color{midnightblue}{\text{Python:}}}$
```python
np.dot(A, B) # matrix multiplication using the np.dot() function
A @ B        # matrix multiplication using the @ operator

A * B        # element-wise multiplication using the * operator
```

$\mathbf{\color{orange}{\text{MATLAB:}}}$
```octave
A * B        % matrix multiplication using the * operator

A .* B       % element-wise multiplication using the .* operator
```
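To connect the row-by-column definition with the built-in operators, the sketch below implements matrix multiplication with explicit loops and compares it with `@`. The function name `matmul_loops` and the matrices `A` and `B` are invented for this illustration; in practice the `@` operator should be preferred.

```python
import numpy as np

def matmul_loops(A, B):
    """Multiply A (m x n) by B (n x p) using C(i, j) = row i of A . column j of B."""
    m, n = A.shape
    n_b, p = B.shape
    if n != n_b:
        raise ValueError("number of columns in A must equal number of rows in B")
    C = np.zeros((m, p))
    for i in range(m):
        for j in range(p):
            C[i, j] = np.dot(A[i, :], B[:, j])
    return C

A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])   # 2 x 3
B = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [2.0, 2.0]])        # 3 x 2

C_loops = matmul_loops(A, B)      # 2 x 2 result
C_op = A @ B                      # same result using the @ operator

elementwise = A * A               # element-wise product needs matching shapes

print(C_loops)
print(np.allclose(C_loops, C_op))
print(elementwise)
```

Note that `A * B` would raise a `ValueError` for these particular arrays, because their shapes, `(2, 3)` and `(3, 2)`, are not compatible for element-wise multiplication.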

# 4. Systems of Linear Equations <a id="s4"></a>

## 4.1. Matrix Form

A **linear equation** is an equality of the form $\sum_{i = 1}^{n} (a_i x_i) = b$ where $a_i$ are scalars, $x_i$ are unknown variables, and $b$ is a scalar.

Examples of linear equations:
* $3x_1 + 4x_2 + 5x_3 = 3$
* $-x_1 + x_2 = 0$

Examples of nonlinear equations: 
* $3x_1^2 + 4x_2 + 5x_3 = 3$
* $x_1x_2 + x_3 = 5$

A **system of linear equations** is a set of linear equations that share the same variables. Consider the following system of linear equations:

\begin{eqnarray*}
\begin{array}{rcrcccccrcc}
 x_1 &+&  x_2   &=& 5\\
 x_1 &+&  2 x_2 &=& 8
\end{array}
\end{eqnarray*}

The **matrix form** of a system of linear equations is **$A\boldsymbol{x} = \boldsymbol{b}$**, where **$A$** is called the coefficient matrix, **$\boldsymbol{x}$** is the unknowns or solution vector, and **$\boldsymbol{b}$** is the constant vector. 

$$\begin{bmatrix}
1 & 1 \\
1 & 2 \\
\end{bmatrix}\left[\begin{array}{c} x_1 \\x_2  \end{array}\right] =
\left[\begin{array}{c} 5 \\8 \end{array}\right]$$

If you carry out the matrix multiplication, you will see that you arrive back at the original system of equations.

The general form of such a system with $m$  equations and $n$ unknowns is given by the following:

\begin{eqnarray*}
\begin{array}{rcrcccccrcc}
a_{1,1} x_1 &+& a_{1,2} x_2 &+& {\ldots}& +& a_{1,n-1} x_{n-1} &+&a_{1,n} x_n &=& b_1\\
a_{2,1} x_1 &+& a_{2,2} x_2 &+&{\ldots}& +& a_{2,n-1} x_{n-1} &+& a_{2,n} x_n &=& b_2 \\
&&&&{\ldots} &&{\ldots}&&&& \\
a_{m-1,1}x_1 &+& a_{m-1,2}x_2&+ &{\ldots}& +& a_{m-1,n-1} x_{n-1} &+& a_{m-1,n} x_n &=& b_{m-1}\\
a_{m,1} x_1 &+& a_{m,2}x_2 &+ &{\ldots}& +& a_{m,n-1} x_{n-1} &+& a_{m,n} x_n &=& b_{m}
\end{array}
\end{eqnarray*}

which has the following matrix form:

$$\begin{bmatrix}
a_{1,1} & a_{1,2} & ... & a_{1,n}\\
a_{2,1} & a_{2,2} & ... & a_{2,n}\\
... & ... & ... & ... \\
a_{m,1} & a_{m,2} & ... & a_{m,n}
\end{bmatrix}\left[\begin{array}{c} x_1 \\x_2 \\ ... \\x_n \end{array}\right] =
\left[\begin{array}{c} b_1 \\b_2 \\ ... \\b_m \end{array}\right]$$

where $a_{i,j}$ and $b_i$ are real numbers, $A$ is a ${m} \times {n}$ matrix, and $A(i,j) = a_{i,j}$.

Note that the number of rows always equals the number of equations so the vertical height of the coefficient matrix **$A$** always equals the length of the constant vector **$\boldsymbol{b}$**. Likewise, the horizontal length of the matrix (number of columns) equals the length of the unknowns vector **$\boldsymbol{x}$**.
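As a small sketch, the two-equation example from the start of this section can be written in matrix form in Python. The vector `x_candidate` holds the values $x_1 = 2$ and $x_2 = 3$, which were found by hand (subtracting the first equation from the second gives $x_2 = 3$) and are used here only to check the matrix form.

```python
import numpy as np

# Coefficient matrix and constant vector for:
#   x1 +   x2 = 5
#   x1 + 2 x2 = 8
A = np.array([[1.0, 1.0],
              [1.0, 2.0]])
b = np.array([5.0, 8.0])

# Candidate solution found by hand
x_candidate = np.array([2.0, 3.0])

# Carrying out the matrix multiplication recovers the left-hand sides
lhs = A @ x_candidate
print(lhs)
print(np.allclose(lhs, b))

# Shape check: rows of A match the length of b, columns match the length of x
print(A.shape[0] == b.size, A.shape[1] == x_candidate.size)
```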

## 4.2. Solving Systems of Linear Equations

While solving a system with two equations and two unknowns can be straightforward, solving linear systems with more equations and unknowns can become quite tedious. We will learn how to use Python and MATLAB to solve systems of linear equations.

Consider a system of $m$ linear equations and $n$ unknowns written in matrix form, $A\boldsymbol{x}=\boldsymbol{b}$. Depending on the values of $A$ and $\boldsymbol{b}$, there are three possibilities for $\boldsymbol{x}$:
1. No solution exists for $\boldsymbol{x}$
2. There is an infinite number of solutions for $\boldsymbol{x}$
3. There is one, unique solution for $\boldsymbol{x}$

### 4.2.1. Square Matrices

The discussion below only applies when $m = n$, that is, the number of equations is equal to the number of unknowns and $A$ is a square matrix. If $A$ is a square matrix, we can solve for the unknowns by multiplying each side of the equation by $A^{-1}$. This results in:

$$A^{-1}A\boldsymbol{x} = A^{-1}\boldsymbol{b}\rightarrow I\boldsymbol{x} = A^{-1}\boldsymbol{b}\rightarrow \boldsymbol{x} = A^{-1}\boldsymbol{b}$$

However, this requires that $A$ be an invertible matrix, which is not always the case.

If $|A| = 0$, then $A$ is said to be singular and it does not have an inverse. In this case, the system does not have a unique solution (it either has infinitely many solutions or no solutions). It is not possible to conclude whether the system has infinitely many solutions or no solutions solely based on the determinant. However, if $|A| \neq 0$, then $A$ is invertible. Since the inverse of a matrix is unique, then $\boldsymbol{x} = A^{-1}\boldsymbol{b}$ gives the unique solution to the system of equations.
 
In Python and MATLAB, we can obtain the inverse of matrix $A$ and perform matrix multiplication to solve a system of linear equations. In addition, Python has the `np.linalg.solve()` function and MATLAB has the `\` operator, both of which can also be used to solve a system of linear equations without explicitly computing the inverse.

$\mathbf{\color{midnightblue}{\text{Python:}}}$
```python
x = np.dot(np.linalg.inv(A), b) # Using matrix inversion (or x = np.linalg.inv(A) @ b) (for square matrices only)
x = np.linalg.solve(A, b)       # Using LU Decomposition (for square matrices only)
```

$\mathbf{\color{orange}{\text{MATLAB:}}}$
```octave
x = inv(A) * b                  % Using matrix inversion (for square matrices only)
x = A \ b                       % Using backslash operator. Note that this is NOT the same as the division operator /
```

<div class="alert alert-block alert-warning"> <b>NOTE!</b> Do not confuse the backslash operator <code>\</code>, which is used to solve a system of linear equations in MATLAB, with the forward slash operator <code>/</code>, which is the division operator. The illustration below may help you remember the difference. Assume a person is walking the same direction we write in English, from left to right. If the person leans backward, they look like a backslash. If the person leans forward, they look like a forward slash.
<center><figure>
  <img src="https://www.epm.org/static/uploads/images/blog/backslash-forward-flash-illustration.png" style="width:45%">
    <figcaption style="text-align:center"><strong>Back vs. forward slash:</strong> <a href="https://www.epm.org/resources/2014/Nov/21/backslashes-vs-forward-slashes/">https://www.epm.org/</a></figcaption>   
</figure></center></div>
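The following sketch applies both Python approaches to a small square system. The $3 \times 3$ system below is an illustrative example that is not taken from the text; its exact solution is $x_1 = 2$, $x_2 = 3$, $x_3 = -1$.

```python
import numpy as np

# Illustrative system:
#    2 x1 +   x2 -   x3 =   8
#   -3 x1 -   x2 + 2 x3 = -11
#   -2 x1 +   x2 + 2 x3 =  -3
A = np.array([[2.0, 1.0, -1.0],
              [-3.0, -1.0, 2.0],
              [-2.0, 1.0, 2.0]])
b = np.array([8.0, -11.0, -3.0])

# A nonzero determinant means A is invertible and the solution is unique
det_A = np.linalg.det(A)
print(det_A)   # approximately -1

x_inv = np.linalg.inv(A) @ b      # using the explicit inverse
x_solve = np.linalg.solve(A, b)   # without forming the inverse

print(x_inv)
print(x_solve)
print(np.allclose(A @ x_solve, b))   # the solution satisfies A x = b
```

Both approaches give the same answer here, but `np.linalg.solve()` avoids computing the inverse explicitly and is generally the preferred choice.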

### 4.2.2. Non-square Matrices

The inverse is only defined for square matrices, that is, when the number of unknowns is equal to the number of equations. However, it's rare that you have exactly the same number of equations as unknowns. In general, to determine whether a system of linear equations has no solution, infinite solutions, or a unique solution, it is helpful to evaluate the rank.

#### 4.2.2.1. Rank 

The rank is a property of matrices that indicates the number of linearly independent columns or rows. It can be shown that the number of linearly independent rows is always equal to the number of linearly independent columns for any matrix. So let's focus our attention on the rows for now. Consider the following matrix:

$$
M = 
\begin{bmatrix} 
1 & -2\\ 
2 & -1\\
3 & -3
\end{bmatrix}_{3 \times 2} 
$$

It should be evident that the third row can be written as the sum of the first two rows, and thus, it is "redundant". Because only two rows out of the three are considered linearly independent, the rank of the above matrix is 2: $rank(M) = 2$

There are mathematical procedures to obtain the rank of a matrix, such as Gaussian elimination. However, this is beyond the scope of our course, so we will focus on obtaining the rank using Python and MATLAB.

$\mathbf{\color{midnightblue}{\text{Python:}}}$
```python
np.linalg.matrix_rank(M)  # returns the rank of matrix M
```

$\mathbf{\color{orange}{\text{MATLAB:}}}$
```octave
rank(M)                   % returns the rank of matrix M
```
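For the $3 \times 2$ matrix $M$ shown above, a short check in Python confirms both the redundancy of the third row and the rank. The variable names are chosen for this illustration.

```python
import numpy as np

M = np.array([[1, -2],
              [2, -1],
              [3, -3]])

# The third row equals the sum of the first two rows, so it is redundant
third_is_sum = np.array_equal(M[0] + M[1], M[2])

rank_M = np.linalg.matrix_rank(M)

print(third_is_sum)
print(rank_M)   # 2
```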

#### 4.2.2.2. General Solution (Square and Non-square Matrices)

To determine the status of the linear system, we will evaluate and compare ${rank}([A,b])$ and ${rank}(A)$.

1. If ${rank}([A,b]) = {rank}(A) + 1 \rightarrow$ No solution 

2. If ${rank}([A, b]) = {rank}(A)$ and ${rank}(A)< n \rightarrow$ Infinite number of solutions

3. If ${rank}([A,b]) = {rank}(A)$ and ${rank}(A) = n \rightarrow$ Unique solution
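The three rank conditions above can be sketched as a small Python function. The function name `classify_system` and the example systems are hypothetical and are used only to illustrate the rules; the augmented matrix $[A, b]$ is built with `np.hstack()`.

```python
import numpy as np

def classify_system(A, b):
    """Classify A x = b by comparing rank(A) with rank([A, b])."""
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float).reshape(-1, 1)   # make b a column
    n = A.shape[1]                                  # number of unknowns

    rank_A = np.linalg.matrix_rank(A)
    rank_Ab = np.linalg.matrix_rank(np.hstack([A, b]))

    if rank_Ab > rank_A:
        return "no solution"
    if rank_A < n:
        return "infinite number of solutions"
    return "unique solution"

# x1 + x2 = 5 and x1 + 2 x2 = 8: independent equations
case_unique = classify_system([[1, 1], [1, 2]], [5, 8])

# x1 + x2 = 5 and 2 x1 + 2 x2 = 10: the second equation repeats the first
case_infinite = classify_system([[1, 1], [2, 2]], [5, 10])

# x1 + x2 = 5 and 2 x1 + 2 x2 = 11: the two equations contradict each other
case_none = classify_system([[1, 1], [2, 2]], [5, 11])

print(case_unique)
print(case_infinite)
print(case_none)
```

Keep in mind that `np.linalg.matrix_rank()` decides the rank numerically using a tolerance, so the classification of nearly singular systems should be interpreted with care.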
 
When the coefficient matrix is non-square, we cannot compute its inverse to solve the system of linear equations. In this case, we can use the backslash operator `\` in MATLAB to solve the system of linear equations. Note that `A \ b` works for square and non-square matrices and is more efficient than `inv(A) * b`.

$\mathbf{\color{orange}{\text{MATLAB:}}}$
```octave
x = inv(A) * b                  % Using matrix inversion (A has to be square)
x = A \ b                       % Using backslash operator; more efficient (A can be square or non-square)
```

Using the backslash operator is the recommended method in MATLAB, as it is the most efficient and numerically stable way.

### 4.2.3. Overdetermined Systems

An overdetermined system is a system of equations where there are more equations than unknown variables, which is typically the case in many real-world applications. Consider the following system of linear equations:

\begin{eqnarray*}
\begin{array}{rcrcccccrcc}
 2 x_1 &+&  3 x_2   &=& 5\\
 4 x_1 &+&  2 x_2 &=& 8\\
 3 x_1 &-&  2 x_2 &=& 1\\
\end{array}
\end{eqnarray*}

The matrix form **$A\boldsymbol{x} = \boldsymbol{b}$** is:

$$\begin{bmatrix}
2 & 3 \\
4 & 2 \\
3 & -2 \\
\end{bmatrix}\left[\begin{array}{c} x_1 \\ x_2 \end{array}\right] =
\left[\begin{array}{c} 5 \\8 \\1\end{array}\right]$$

This system has two unknowns ($x_1$ and $x_2$) but three equations. It's overdetermined because there are more equations than unknowns.

Generally, overdetermined systems do not have an exact solution because the additional equations introduce conflicting constraints. Specifically, the above system has $rank(A)=2$ and $rank([A,b])=3$, which, according to the rank conditions discussed earlier, implies that the system has no solution. This means that there are no values of $x_1$ and $x_2$ that simultaneously satisfy all three equations.

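Instead of finding an exact solution, we typically seek the solution that minimizes the error, that is, the mismatch between the left-hand side $A\boldsymbol{x}$ and the right-hand side $\boldsymbol{b}$. The most common choice is the least squares method, which minimizes the sum of the squared residuals, $\Vert A\boldsymbol{x} - \boldsymbol{b} \Vert_2^2$. This approach is commonly used in regression, where we want to find the best-fitting curve through a set of data points.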

Mathematically, the solution that minimizes the error can be determined using the following equation: 
$$\boldsymbol{x} = \left( A^TA\right)^{-1}A^T \boldsymbol{b}$$

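where $\left( A^TA\right)^{-1}A^T$ is known as the pseudo-inverse of $A$. This expression assumes that the columns of $A$ are linearly independent, so that $A^TA$ is invertible.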

This solution can be obtained in MATLAB in three different ways. However, using the backslash operator is the recommended approach, as it is the most efficient and numerically stable way to solve overdetermined systems using the least squares method.

$\mathbf{\color{orange}{\text{MATLAB:}}}$
```octave
x = inv(A' * A) * A' * b        % Using matrix inversion. Not recommended.
x = pinv(A) * b                 % Using Moore-Penrose pseudoinverse. Generally more stable.
x = A \ b                       % Using backslash operator. Recommended approach.
```
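For readers working in Python, a comparable sketch is shown below for the overdetermined example above. It is offered as one possible counterpart to the MATLAB commands rather than an exact equivalent: `np.linalg.pinv()` and `np.linalg.lstsq()` are NumPy functions, and `rcond=None` simply selects NumPy's default cutoff for small singular values.

```python
import numpy as np

A = np.array([[2.0, 3.0],
              [4.0, 2.0],
              [3.0, -2.0]])
b = np.array([5.0, 8.0, 1.0])

# 1) Normal equations with an explicit inverse (shown for comparison only)
x_normal = np.linalg.inv(A.T @ A) @ A.T @ b

# 2) Moore-Penrose pseudoinverse
x_pinv = np.linalg.pinv(A) @ b

# 3) Dedicated least squares solver
x_lstsq, residuals, rank_A, singular_values = np.linalg.lstsq(A, b, rcond=None)

print(x_normal)
print(x_pinv)
print(x_lstsq)

# The residual is not zero because no exact solution exists
print(A @ x_lstsq - b)
```

All three approaches return approximately $x_1 = 41/33 \approx 1.242$ and $x_2 = 37/33 \approx 1.121$, which can be confirmed by hand from the formula with $A^TA = \begin{bmatrix} 29 & 8 \\ 8 & 17 \end{bmatrix}$ and $A^T\boldsymbol{b} = \begin{bmatrix} 45 \\ 29 \end{bmatrix}$.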