$$\newcommand{\F}{\mathbb{F}}
\newcommand{\R}{\mathbb{R}}
\newcommand\aug{\fboxsep=-\fboxrule\!\!\!\fbox{\strut}\!\!\!}
\newcommand{\v}{\mathbf{v}}
\newcommand{\a}{\mathbf{a}}
\newcommand{\b}{\mathbf{b}}
\newcommand{\c}{\mathbf{c}}
\newcommand{\d}{\mathbf{d}}
\newcommand{\p}{\mathbf{p}}
\newcommand{\r}{\mathbf{r}}
\newcommand{\w}{\mathbf{w}}
\newcommand{\u}{\mathbf{u}}
\newcommand{\x}{\mathbf{x}}
\newcommand{\y}{\mathbf{y}}
\newcommand{\z}{\mathbf{z}}
\newcommand{\0}{\mathbf{0}}
\newcommand{\1}{\mathbf{1}}
\newcommand{\A}{\mathbf{A}}
\newcommand{\B}{\mathbf{B}}
\newcommand{\C}{\mathbf{C}}
\newcommand{\E}{\mathbf{E}}
\newcommand{\P}{\mathbf{P}}$$

## General Form of Linear Equations

### Algebraic Definition (System of Linear Equations)

A general system of $m$ linear equations with $n$ unknowns can be written as:

$$
\begin{align}
a_{11} x_1 + a_{12} x_2  + \cdots + a_{1n} x_n  &= b_1 \\
a_{21} x_1 + a_{22} x_2  + \cdots + a_{2n} x_n  &= b_2 \\
& \ \ \vdots\\
a_{m1} x_1 + a_{m2} x_2  + \cdots + a_{mn} x_n  &= b_m,
\end{align}
$$

where $x_1, x_2,\ldots,x_n$ are the unknowns, $a_{11},a_{12},\ldots,a_{mn}$ are the coefficients of the system, and $b_1,b_2,\ldots,b_m$ are the constant terms.

### Matrix Definition (System of Linear Equations)

The system above is equivalent to a matrix equation of the form $\A\x = \b$, where $\A \in \F^{m \times n}$, $\x$ is a column vector in $\F^n$, and $\b$ is a column vector in $\F^m$:

$$
\A =
\begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{bmatrix},\quad
\mathbf{x}=
\begin{bmatrix}
x_1 \\
x_2 \\
\vdots \\
x_n
\end{bmatrix},\quad
\mathbf{b}=
\begin{bmatrix}
b_1 \\
b_2 \\
\vdots \\
b_m
\end{bmatrix}
$$

### Vector Definition (System of Linear Equations)

Recall from the chapter on Matrix Multiplication that $\A\x = \b$ is a right multiplication of the matrix $\A$ by the vector $\x$, and thus $\b$ can be represented as a **linear combination of the columns of $\A$ with the $x_i$ as coefficients**, where $\a_j$ denotes column $j$ of $\A$:

$$
\b = x_1 \a_1 + x_2 \a_2 + \cdots + x_n \a_n \implies 
\begin{bmatrix}
b_1 \\
b_2 \\
\vdots \\
b_m
\end{bmatrix} = x_1 \begin{bmatrix} a_{11} \\ a_{21} \\ \vdots \\ a_{m1} \end{bmatrix} + x_2 \begin{bmatrix} a_{12} \\ a_{22} \\ \vdots \\ a_{m2} \end{bmatrix} + \cdots + x_n \begin{bmatrix} a_{1n} \\ a_{2n} \\ \vdots \\ a_{mn} \end{bmatrix}
$$
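The equivalence of the row view and the column view of $\A\x$ can be checked numerically. The following is a minimal sketch in plain Python; the helper names `matvec` and `column_combination` and the sample numbers are illustrative assumptions, not part of the original notes.

```python
def matvec(A, x):
    """Row view: entry i of A x is the dot product of row i of A with x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def column_combination(A, x):
    """Column view: A x = x_1 a_1 + x_2 a_2 + ... + x_n a_n."""
    result = [0] * len(A)
    for j, xj in enumerate(x):
        for i in range(len(A)):
            # Add x_j times entry i of column j.
            result[i] += xj * A[i][j]
    return result


A = [[1, 2], [3, 4], [5, 6]]  # a 3 x 2 matrix
x = [10, -1]

b_rows = matvec(A, x)
b_cols = column_combination(A, x)
print(b_rows, b_cols)  # both are [8, 26, 44]
```

Both functions return the same vector, which is the content of the vector definition: $\b$ is built from the columns of $\A$, weighted by the entries of $\x$.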

### Definition (Homogeneous System of Equations)

A system of linear equations is called **homogeneous** if the constant term of every equation is $0$. A homogeneous system has the form:

$$
\begin{align}
a_{11} x_1 + a_{12} x_2  + \cdots + a_{1n} x_n  &= 0 \\
a_{21} x_1 + a_{22} x_2  + \cdots + a_{2n} x_n  &= 0 \\
& \ \ \vdots\\
a_{m1} x_1 + a_{m2} x_2  + \cdots + a_{mn} x_n  &= 0,
\end{align}
$$

where $x_1, x_2,\ldots,x_n$ are the unknowns and $a_{11},a_{12},\ldots,a_{mn}$ are the coefficients of the system.

> Note that this definition translates directly to the matrix and vector forms: the system is homogeneous exactly when $\A\x = \0$.

### Definition (Inconsistent and Consistent Systems)

- **Consistent**: A system of linear equations is called **consistent** if it has at least one solution.
- **Inconsistent**: A system of linear equations is called **inconsistent** if it has no solution.

## Elementary Row Operations

Elementary row operations give us a way to find out whether a system of linear equations is **consistent or not**.

### Definition (Elementary Row Operations)

In order to enable us to convert a system of linear equations to an **equivalent** system, we define the following **elementary row operations**:

- **Row Permutation:** Interchange any two rows of a matrix: $\r_i \leftrightarrow \r_j$
- **Row Multiply:** Replace any row of a matrix with a non-zero scalar multiple of itself: $\r_i \to \lambda\r_i$, where $\lambda \neq 0$
- **Row Addition:** Replace any row of a matrix with the sum of itself and a non-zero scalar multiple of any other row: $\r_i \to \r_i + \lambda \r_j$, where $j \neq i$.

**$\r_i$ refers to row $i$ of the matrix.**
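The three operations above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the 0-based row indices, the sample augmented matrix, and the candidate solution $(29, 16, 3)$ are all choices made here, not something fixed by the definitions.

```python
from fractions import Fraction


def swap_rows(M, i, j):
    """Row permutation: r_i <-> r_j (returns a new matrix)."""
    out = [row[:] for row in M]
    out[i], out[j] = out[j], out[i]
    return out


def scale_row(M, i, lam):
    """Row multiply: r_i -> lam * r_i, with lam non-zero."""
    if lam == 0:
        raise ValueError("lam must be non-zero")
    out = [row[:] for row in M]
    out[i] = [lam * v for v in out[i]]
    return out


def add_multiple(M, i, j, lam):
    """Row addition: r_i -> r_i + lam * r_j, with i != j."""
    if i == j:
        raise ValueError("i and j must differ")
    out = [row[:] for row in M]
    out[i] = [a + lam * b for a, b in zip(out[i], out[j])]
    return out


def satisfies(M, x):
    """True if x solves every equation of the augmented matrix M = [A | b]."""
    return all(sum(a * xi for a, xi in zip(row[:-1], x)) == row[-1] for row in M)


# Illustrative system (rows are 0-indexed here) and one of its solutions.
M = [[1, -2, 1, 0], [0, 2, -8, 8], [-4, 5, 9, -9]]
x = [29, 16, 3]

M1 = swap_rows(M, 0, 1)
M2 = scale_row(M1, 0, Fraction(1, 2))
M3 = add_multiple(M2, 2, 1, 4)
print(all(satisfies(N, x) for N in (M, M1, M2, M3)))  # prints True
```

The same vector solves the system before and after each operation, which is the behaviour the next theorem makes precise.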

### Definition (Elementary Column Operations)

By replacing the word *row* with *column*, we recover the definition of **elementary column operations**.

### Theorem (Elementary Row Operations Preserve Solution Set of Linear Systems)

This theorem will be proven again later in the context of matrices. Here, I highly recommend reading the proof (without the context of matrices) from [A First Course in Linear Algebra by Ken Kuttler](https://math.libretexts.org/Bookshelves/Linear_Algebra/A_First_Course_in_Linear_Algebra_(Kuttler)/01%3A_Systems_of_Equations/1.02%3A_Elementary_Operations), where he shows that these 3 operations do not change the solution set of the original system of linear equations.

## Gauss Elimination

### Definition (Augmented Matrix of a System of Linear Equations)

We usually combine $\A$ and $\b$ from $\A\x = \b$ into one matrix for ease of computing elementary row operations. After all, row operations are always applied to the **whole system**, including the constant terms.

Given the general form of the linear equations, the **augmented matrix** of the system of equations is:

$$
[\A ~|~ \b] = \begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} & b_1 \\
a_{21} & a_{22} & \cdots & a_{2n} & b_2 \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn} & b_m
\end{bmatrix}
$$

### Theorem (Solving Augmented Matrix Solves the System of Linear Equations)

We established that row operations on a system of linear equations preserve the original solution set. Therefore, we can apply row operations to the augmented matrix to solve the system.

### Definition (Row Echelon Form)

Given a matrix $\A \in \F^{m \times n}$ and $\b \in \F^{m}$, we say the augmented matrix $[\A ~|~ \b]$ is in **row echelon form** if:

- Any rows that are all **zeros** must be at the bottom of the matrix, that is to say, all **zero row vectors** are grouped at the bottom.
- The **leading coefficient (also called the pivot)** of a non-zero row is always strictly to the right of the leading coefficient of the row above it.
- All entries in a column below a pivot are zeros.
- Some textbooks require the leading coefficient to be 1.

### Definition (Reduced Row Echelon Form)

Given a matrix $\A \in \F^{m \times n}$ and $\b \in \F^{m}$, we say the augmented matrix $[\A ~|~ \b]$ is in **reduced row echelon form** if:

- Any rows that are all **zeros** must be at the bottom of the matrix, that is to say, all **zero row vectors** are grouped at the bottom.
- The **leading coefficient (also called the pivot)** of a non-zero row is always strictly to the right of the leading coefficient of the row above it.
- All entries in a column below a pivot are zeros.
- The leading coefficient of every non-zero row is 1.
- All entries in a column above and below a leading entry are zero.
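The conditions for reduced row echelon form can be turned into a mechanical check. The sketch below is one possible way to do so; `leading_index` and `is_rref` are hypothetical helper names, and the check treats the whole matrix it is given (including the augmented column) uniformly.

```python
def leading_index(row):
    """Column index of the first non-zero entry of row, or None for a zero row."""
    for j, v in enumerate(row):
        if v != 0:
            return j
    return None


def is_rref(M):
    """Check the reduced row echelon conditions on a matrix given as a list of rows."""
    prev_lead = -1
    seen_zero_row = False
    for i, row in enumerate(M):
        lead = leading_index(row)
        if lead is None:
            seen_zero_row = True
            continue
        if seen_zero_row:
            return False  # a non-zero row sits below a zero row
        if lead <= prev_lead:
            return False  # pivot is not strictly to the right of the one above
        if row[lead] != 1:
            return False  # leading coefficient must be 1
        if any(M[k][lead] != 0 for k in range(len(M)) if k != i):
            return False  # pivot column must be zero above and below the pivot
        prev_lead = lead
    return True


print(is_rref([[1, 0, -1, 1, -1], [0, 1, 0, 0, 2], [0, 0, 0, 0, 0]]))  # True
print(is_rref([[1, 2, -1, 1, 3], [0, 1, 0, 0, 2], [0, 0, 0, 0, 0]]))   # False
```

The second matrix is in row echelon form but not reduced, because the entry above the pivot in the second column is non-zero.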

### Definition (Pivot Position and Pivot Column)

- **Pivot Position:** A **pivot position** in a matrix is the location of a leading entry in the row-echelon form of a matrix.

- **Pivot Column:**  A **pivot column** is a column that contains a pivot position.

### Algorithm (Gaussian and Gauss-Jordan Elimination)

> Entirely taken from [A First Course in Linear Algebra by Ken Kuttler](https://math.libretexts.org/Bookshelves/Linear_Algebra/A_First_Course_in_Linear_Algebra_(Kuttler)).

This algorithm provides a method for using row operations to take a matrix to its reduced row-echelon form. We begin with the matrix in its original form.

1. Starting from the left, find the first nonzero column. This is the first pivot column, and the position at the top of this column is the first pivot position. Switch rows if necessary to place a nonzero number in the first pivot position.
2. Use row operations to make the entries below the first pivot position (in the first pivot column) equal to zero.
3. Ignoring the row containing the first pivot position, repeat steps 1 and 2 with the remaining rows. Repeat the process until there are no more rows to modify.
4. Divide each nonzero row by the value of the leading entry, so that the leading entry becomes 1. The matrix will then be in row-echelon form.

> The following step will carry the matrix from row-echelon form to reduced row-echelon form.

5. Moving from right to left, use row operations to create zeros in the entries of the pivot columns which are above the pivot positions. The result will be a matrix in reduced row-echelon form.
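The steps above can be sketched in Python. Note that this is a compact variant rather than a literal transcription: it normalises each pivot to 1 as soon as it is found and clears the entries above and below it in the same pass, instead of doing a separate right-to-left sweep. The function name `rref` and the use of `fractions.Fraction` for exact arithmetic are assumptions made for this illustration.

```python
from fractions import Fraction


def rref(M):
    """Return (R, pivot_cols): the reduced row-echelon form of M and its pivot columns."""
    R = [[Fraction(v) for v in row] for row in M]
    rows = len(R)
    cols = len(R[0]) if R else 0
    pivot_cols = []
    r = 0  # index of the row that will receive the next pivot
    for c in range(cols):
        if r == rows:
            break
        # Find a row at or below r with a non-zero entry in column c.
        p = next((i for i in range(r, rows) if R[i][c] != 0), None)
        if p is None:
            continue  # no pivot in this column
        R[r], R[p] = R[p], R[r]            # row permutation
        pivot = R[r][c]
        R[r] = [v / pivot for v in R[r]]   # row multiply: make the pivot 1
        for i in range(rows):              # row addition: clear the rest of the column
            if i != r and R[i][c] != 0:
                factor = R[i][c]
                R[i] = [a - factor * b for a, b in zip(R[i], R[r])]
        pivot_cols.append(c)
        r += 1
    return R, pivot_cols


# Illustrative augmented matrix [A | b].
R, pivots = rref([[1, -2, 1, 0], [0, 2, -8, 8], [-4, 5, 9, -9]])
print([[int(v) for v in row] for row in R])  # [[1, 0, 0, 29], [0, 1, 0, 16], [0, 0, 1, 3]]
print(pivots)                                # [0, 1, 2]
```

Column indices in `pivot_cols` are 0-based, so `[0, 1, 2]` means that all three coefficient columns are pivot columns.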

### Definition (Types of Solutions)

> Modified from [A First Course in Linear Algebra by Ken Kuttler](https://math.libretexts.org/Bookshelves/Linear_Algebra/A_First_Course_in_Linear_Algebra_(Kuttler)).

#### Definition (No Solution)

In the case where the system of equations has no solution, the row-echelon form of the augmented matrix will have a row of the form:

$$
\left[\begin{array}{@{}ccc|c@{}}
0 & 0 & \cdots & b_i \\
\end{array}\right]
$$

That is to say, there exists a row that is entirely zero in the coefficient part $\A$ but whose corresponding constant term satisfies $b_i \neq 0$.

#### Definition (One Unique Solution)

We use a small example as follows:

$$
\left[\begin{array}{@{}ccc|c@{}}
1 & 0 & 0 & b_1 \\
0 & 1 & 0 & b_2 \\
0 & 0 & 1 & b_3 \\
\end{array}\right]
$$

This system has a unique solution, as every column of the coefficient matrix is a pivot column.

#### Definition (Infinitely Many Solutions)

In the case where the system of equations has infinitely many solutions, the solution contains parameters: the system is consistent, but some columns of the coefficient matrix are not pivot columns. The following is a small example of an augmented matrix in reduced row-echelon form for such a system:

$$
\left[\begin{array}{@{}ccc|c@{}}
1 & 0 & 0 & b_1 \\
0 & 1 & 0 & b_2 \\
0 & 0 & 0 & 0 \\
\end{array}\right]
$$
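The three cases can be told apart mechanically once the augmented matrix is in row-echelon form. The following is a minimal sketch; the function name `classify` and its string labels are illustrative, and the input is assumed to already be in (reduced) row-echelon form.

```python
def classify(R):
    """Classify a system from its augmented matrix R = [A | b], assumed in row-echelon form."""
    n = len(R[0]) - 1  # number of unknowns
    pivots = 0
    for row in R:
        coeffs, rhs = row[:n], row[n]
        if any(v != 0 for v in coeffs):
            pivots += 1            # a non-zero coefficient row carries one pivot
        elif rhs != 0:
            return "inconsistent"  # row of the form [0 0 ... 0 | b_i] with b_i != 0
    # Consistent: unique exactly when every coefficient column is a pivot column.
    return "unique" if pivots == n else "infinitely many"


print(classify([[1, 0, 0, 5], [0, 1, 0, 6], [0, 0, 1, 7]]))  # unique
print(classify([[1, 0, 0, 5], [0, 1, 0, 6], [0, 0, 0, 0]]))  # infinitely many
print(classify([[1, 0, 0, 5], [0, 1, 0, 6], [0, 0, 0, 1]]))  # inconsistent
```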

## Uniqueness of Reduced Row-Echelon Form

### Definition (Basic Variable)

Assume an augmented matrix $[\A ~|~ \b]$ is in **RREF**. The variable (unknown) $x_i$ is a **basic variable** if $[\A ~|~ \b]$ has a leading 1 in column $i$; in this case, column $i$ is also a **pivot column**.

### Definition (Free Variable)

If the variable $x_i$ is not **basic**, then it is **free**.

### Definition (Free Column)

A **free column** is a column that does not contain a pivot position.

### Example (Basic and Free Variable)

This is best understood from an [example taken from A First Course in Linear Algebra by Ken Kuttler](https://math.libretexts.org/Bookshelves/Linear_Algebra/A_First_Course_in_Linear_Algebra_(Kuttler)).

Consider the system:

$$
\begin{align}
x + 2y - z + w = 3 \\
x + y - z + w = 1 \\
x + 3y - z + w = 5
\end{align}
$$

Row reducing its augmented matrix gives the row-echelon form:

$$
\left[\begin{array}{@{}cccc|c@{}}
1 & 2 & -1 & 1 & 3 \\
0 & 1 & 0  & 0 & 2 \\
0 & 0 & 0 & 0 & 0\\
\end{array}\right]
$$

---



**Solution**

- We first look for a row that involves only one variable, since it gives that variable's value directly (if such a row exists). Here it is the second row, $y = 2$. Echelon form makes such rows easy to spot.
- The first row gives $x + 2y - z + w = 3 \implies x + 4 - z + w = 3 \implies x = -1 + z - w$.
    - Since $x$ depends on $z$ and $w$, which can take any value, we call $z$ and $w$ the free variables and treat them as parameters.
    - Set $z = s$
    - Set $w = t$

So the solution set can be described as:

$$
\begin{bmatrix}
x \\ y \\ z \\ w
\end{bmatrix} 
=
\begin{bmatrix}
-1 + s - t \\ 2 \\ s \\ t
\end{bmatrix} 
$$

and the system has **infinitely many solutions**.

Here, the **free variables** are the parameters $z = s$ and $w = t$, and the **basic variables** are $x$ and $y$.
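As a quick sanity check, the parametric solution can be substituted back into the three original equations. The sketch below does this for a few arbitrary parameter values; the function names `solution` and `residuals` are illustrative.

```python
def solution(s, t):
    """Parametric solution (x, y, z, w) with free variables z = s and w = t."""
    return (-1 + s - t, 2, s, t)


def residuals(x, y, z, w):
    """Left-hand side minus right-hand side for each of the three equations."""
    return (
        x + 2 * y - z + w - 3,
        x + y - z + w - 1,
        x + 3 * y - z + w - 5,
    )


# Arbitrary parameter choices; each one should satisfy all three equations.
checks = [residuals(*solution(s, t)) for s, t in [(0, 0), (1, 0), (0, 1), (-3, 7)]]
print(all(r == (0, 0, 0) for r in checks))  # prints True
```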

### Sorting out the confusion (Basic and Free Variables)

From the example above, we can clearly see that free variables are the ones we may assign any value to. That example may seem obvious, but things are less clear if we have:

$$
\left[\begin{array}{@{}cccc|c@{}}
1 & 2 & 0 & -2 & 0 \\
0 & 0 & 1 & 2 & 0 \\
0 & 0 & 0 & 0 & 0\\
\end{array}\right]
$$

which translates to:

$$
\begin{aligned}
x_1 + 2x_2 + 0x_3 - 2x_4 = 0 \\
0 x_1 + 0x_2 + x_3 + 2x_4 = 0
\end{aligned}
$$

By definition, $x_2$ and $x_4$ are free variables. If it is unclear why $x_4$ is free (beyond the fact that the definition says so), it helps to go back to a simpler system.

Consider simply:

$$
\begin{aligned}
x + y = 0 \\
2x + 2y = 0
\end{aligned}
$$

then it is clear that this system reduces to the single equation $x + y = 0$, and if you compute the **RREF**, the free variable is $y$. If you plot the solution set, it is just the straight line $x + y = 0$ through the origin. Writing it as $x = -y$ says that $x$ depends on $y$, where $y$ can take any value. Equally, if we set aside the definition of a free variable, we can write $y = -x$ and recover the familiar high school equation of a line, where $y$ depends on $x$ and the independent variable $x$ may take any value. Matrix theory simply gives us a systematic convention: whenever there are more unknowns than equations, some column cannot be a pivot column, so there must be at least one free variable.
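Returning to the four-variable system above, the roles of the free variables become concrete once the basic variables are solved for. As a worked sketch (the parameter names $s$ and $t$ are introduced here only for illustration), set $x_2 = s$ and $x_4 = t$:

$$
\begin{aligned}
x_1 &= -2s + 2t \\
x_3 &= -2t
\end{aligned}
\quad\implies\quad
\begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix}
= s \begin{bmatrix} -2 \\ 1 \\ 0 \\ 0 \end{bmatrix}
+ t \begin{bmatrix} 2 \\ 0 \\ -2 \\ 1 \end{bmatrix}
$$

Every choice of $s$ and $t$ gives a solution, which is exactly what it means for $x_2$ and $x_4$ to be free.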

### Word of Caution (Basic and Free Variables)

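Note that the **REF** produced by plain Gaussian Elimination is not unique: different sequences of row operations can give different echelon forms of the same system. The pivot columns, and hence the basic and free variables, are nevertheless the same in every **REF**, and you will see that **RREF** guarantees full uniqueness.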

### Proposition (Basic and Free Variables)

If $x_i$  is a basic variable of a homogeneous system of linear equations, then any solution of the system with $x_j=0$ for all those free variables $x_j$ with $j>i$ must also have $x_i=0$.

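> This is best understood through the homogeneous version of the earlier example in $x, y, z, w$ (all right-hand sides set to $0$). Denote:

$$
x = x_1, y = x_2, z = x_3, w = x_4
$$

Then $x_2 = 0$ and $x_1 = x_3 - x_4$, so any solution in which the free variables satisfy $x_3 = x_4 = 0$ must also have $x_1 = 0$.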

### Lemma (Solutions and the Reduced Row-Echelon Form of a Matrix)

Let $\A$ and $\B$ be two distinct augmented matrices for two homogeneous systems of $m$ equations in $n$ variables, such that $\A$ and $\B$ are each in reduced row-echelon form. Then, the two systems do not have exactly the same solutions.

### Definition (Row Equivalence)

Two matrices $\A$ and $\B$ are **row equivalent** if one matrix can be obtained from the other matrix by a **finite sequence of elementary row operations**.

> Note that every elementary row operation can be undone by an elementary row operation of the same type. So if $\A$ can be obtained from $\B$ by a sequence of elementary row operations, then applying the inverse operations in reverse order to $\A$ recovers $\B$.
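The reversibility in the note can be illustrated with a small Python sketch. The helpers `swap`, `scale`, and `add` are hypothetical names for the three elementary row operations (with 0-based row indices), and the matrix $\B$ is an arbitrary example.

```python
from fractions import Fraction


def swap(M, i, j):
    """r_i <-> r_j; its inverse is the same swap."""
    out = [row[:] for row in M]
    out[i], out[j] = out[j], out[i]
    return out


def scale(M, i, lam):
    """r_i -> lam * r_i; its inverse scales by 1 / lam."""
    out = [row[:] for row in M]
    out[i] = [lam * v for v in out[i]]
    return out


def add(M, i, j, lam):
    """r_i -> r_i + lam * r_j; its inverse uses -lam."""
    out = [row[:] for row in M]
    out[i] = [a + lam * b for a, b in zip(out[i], out[j])]
    return out


B = [[1, 2], [3, 4]]

# Forward: three elementary row operations take B to A.
A = add(scale(swap(B, 0, 1), 0, Fraction(1, 2)), 1, 0, -3)

# Backward: undo the last operation first, each with its inverse.
B_again = swap(scale(add(A, 1, 0, 3), 0, 2), 0, 1)
print(B_again == B)  # prints True
```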

### Theorem (Every Matrix is row equivalent to its RREF)

Every matrix $\A \in \F^{m \times n}$ is row equivalent to its **RREF**.

### Theorem (Row Equivalent Augmented Matrices have the same solution set)

Given $[\A ~|~ \b]$ and $[\C ~|~ \d]$, if both are **row equivalent** to each other, then the two linear systems have the same solution sets.

### Theorem (RREF is Unique)

Every matrix $\A$ has an **RREF** and it is unique. To prove it, one should use **Lemma (Solutions and the Reduced Row-Echelon Form of a Matrix)** and **Theorem (Row Equivalent Augmented Matrices have the same solution set)**. See [A First Course in Linear Algebra by Ken Kuttler](https://math.libretexts.org/Bookshelves/Linear_Algebra/A_First_Course_in_Linear_Algebra_(Kuttler)).

## Rank and Homogeneous Systems

This section concerns matrix rank in homogeneous systems. I feel it is better covered in the matrix theory chapter, so do visit it there.

## Elementary Matrices

### Permutation Matrix

Row Exchange:

$$
\begin{align}
x_1- 2x_2+x_3&=0\\
2x_2-8x_3&=8\\
-4x_1+5x_2+9x_3&=-9
\end{align}
$$

vs

$$
\begin{align}
2x_2-8x_3&=8\\
x_1- 2x_2+x_3&=0\\
-4x_1+5x_2+9x_3&=-9
\end{align}
$$

The two systems above have exactly the same solutions; we just swapped rows 1 and 2. We can do the same on the matrix for convenience.

Also, given

$$
\P = \begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \\ \end{bmatrix}
,\quad
\A = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \\ \end{bmatrix}
$$

then 

$$\P\A = \begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \\ \end{bmatrix} \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \\ \end{bmatrix} = \begin{bmatrix} 4 & 5 & 6 \\ 1 & 2 & 3 \\ 7 & 8 & 9 \\ \end{bmatrix}$$

and notice that rows 1 and 2 are swapped by the left multiplication of the permutation matrix $\P$. Why did it work?

Recall now

$$\P\A = \begin{bmatrix}\ \p_1 \\ \p_2 \\  \p_3 \end{bmatrix}\A = \begin{bmatrix}\p_1\A \\ \p_2\A \\ \p_3\A \end{bmatrix}$$

Here $\p_i$ denotes row $i$ of $\P$. We first look at the first row of $\P\A$, which is given by $\p_1\A$, a linear combination of the rows of $\A$ with the entries of $\p_1$ as coefficients.

$$\p_1\A = 0 \begin{bmatrix} 1 & 2 & 3 \end{bmatrix} + 1 \begin{bmatrix} 4 & 5 & 6 \end{bmatrix} + 0 \begin{bmatrix} 7 & 8 & 9 \end{bmatrix} = \begin{bmatrix} 4 & 5 & 6 \end{bmatrix}$$

Then the rest is the same logic:

$$\p_2\A = 1 \begin{bmatrix} 1 & 2 & 3 \end{bmatrix} + 0 \begin{bmatrix} 4 & 5 & 6  \end{bmatrix} + 0 \begin{bmatrix} 7 & 8 & 9 \end{bmatrix} = \begin{bmatrix} 1 & 2 & 3 \end{bmatrix}$$

$$\p_3\A = 0 \begin{bmatrix} 1 & 2 & 3 \end{bmatrix} + 0 \begin{bmatrix} 4 & 5 & 6  \end{bmatrix} + 1 \begin{bmatrix} 7 & 8 & 9  \end{bmatrix} = \begin{bmatrix} 7 & 8 & 9 \end{bmatrix}$$

We now see, through **Matrix Multiplication (Left row wise)**, why the **Permutation Matrix** works the way it does!
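The computation above can be reproduced with a few lines of plain Python. This is a minimal sketch; `matmul` is a hypothetical helper written out here so that the block needs no external library. The last line goes slightly beyond the text by multiplying on the right, which permutes columns instead of rows.

```python
def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [
        [sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
        for i in range(len(X))
    ]


P = [[0, 1, 0], [1, 0, 0], [0, 0, 1]]
A = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

PA = matmul(P, A)  # left multiplication swaps rows 1 and 2
AP = matmul(A, P)  # right multiplication swaps columns 1 and 2

print(PA)  # [[4, 5, 6], [1, 2, 3], [7, 8, 9]]
print(AP)  # [[2, 1, 3], [5, 4, 6], [8, 7, 9]]
```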

## References

- [A First Course in Linear Algebra by Ken Kuttler](https://math.libretexts.org/Bookshelves/Linear_Algebra/A_First_Course_in_Linear_Algebra_(Kuttler))
- https://math.stackexchange.com/questions/1634411/why-adding-or-subtracting-linear-equations-finds-their-intersection-point