# Fundamentals of Matrix Analysis

In this notebook, we recall the basic elements of linear algebra which will be employed in the remainder of the notebooks. For most of the detailed proofs, refer to any standard linear algebra text such as Axler or Friedberg-Insel-Spence.

## Linear Transformation.

**Definition**. Let $V$ and $W$ be vector spaces over $F$. We call a function $T:V \to W$ a *linear transformation* from $V$ into $W$, if for all $x,y \in V$ and $c \in F$, we have:

(a) Additivity is preserved.
$\begin{align*}
T(x+y) = T(x) + T(y)
\end{align*}
$

(b) Scalar-multiplication is preserved.
$\begin{align*}
T(cx) = cT(x)
\end{align*}
$

We often simply call $T$ a *linear map*. If $V = W$, then $T$ is called a *linear operator*.
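The two defining properties can be spot-checked numerically. The sketch below is only an illustration: the maps `T` and `S` and the test vectors are hypothetical choices made here, and passing the check at a few points is evidence of linearity, not a proof.

```python
def T(x):
    """Hypothetical map R^3 -> R^2, T(x) = (x1 + 2*x2, 3*x3 - x1); it is linear."""
    x1, x2, x3 = x
    return (x1 + 2 * x2, 3 * x3 - x1)

def S(x):
    """Hypothetical translation x -> x + (1, 1, 1); not linear, since S(0) != 0."""
    return tuple(xi + 1 for xi in x)

def add(u, v):
    return tuple(a + b for a, b in zip(u, v))

def scale(c, u):
    return tuple(c * a for a in u)

def looks_linear(f, x, y, c):
    """Check additivity and homogeneity of f at the given sample points only."""
    additive = f(add(x, y)) == add(f(x), f(y))
    homogeneous = f(scale(c, x)) == scale(c, f(x))
    return additive and homogeneous

x, y, c = (1, 2, 3), (-4, 0, 5), 7
print(looks_linear(T, x, y, c))  # True
print(looks_linear(S, x, y, c))  # False
```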

We turn our attention to two important sets associated with linear maps: the *range space* and the *null space*. The determination of these sets allows us to more closely examine the intrinsic properties of a linear transformation.

**Definition**. Let $V$ and $W$ be vector spaces, and $T:V\to W$ be linear. We define the *null space* (or *kernel*) $N(T)$ of $T$ to be the set of all vectors $x$ in $V$, such that $T(x) = 0$. That is,

$
\begin{align*}
N(T) := \{x \in V : T(x) = 0 \}
\end{align*}
$

where $0$ represents the zero-vector in $W$.

We define the *range* (or *image*) $R(T)$ of $T$ to be the subset of $W$ consisting of all images (under $T$) of vectors in $V$; that is $R(T) = \{T(x):x \in V\}$.

**Theorem** 1. Let $V$ and $W$ be vector spaces and $T:V\to W$ be linear. Then, $N(T)$ and $R(T)$ are subspaces of $V$ and $W$ respectively.

**Proof.**

Define $0_V$ and $0_W$ to be the zero vectors in $V,W$ respectively.

Claim. $N(T)$ is a subspace of $V$.

(1) Since $T(0_V) = T(0 \cdot 0_V) = 0 \cdot T(0_V) = 0_W$, we have that $0_V \in N(T)$. So, the zero vector belongs to $N(T)$.

(2) Let $x,y \in N(T)$. Then, $T(x+y) = T(x) + T(y) = 0_W + 0_W = 0_W$. Hence, $x+y \in N(T)$. Thus, $N(T)$ is closed under vector addition.

(3) Let $x \in N(T)$ and $c \in F$. Then, $T(cx) = cT(x) = c \cdot 0_W = 0_W$. So, $cx \in N(T)$. Thus, $N(T)$ is closed under scalar multiplication.

Consequently, $N(T)$ is a subspace of $V$.

Claim. $R(T)$ is a subspace of $W$.

(1) Since $T(0_V) = 0_W$, the zero vector $0_W$ belongs to $R(T)$.

(2) Let $x,y \in R(T)$. Then, by the definition of the range, there exist $v,w \in V$ such that $T(v) = x$ and $T(w) = y$. Since $V$ is a vector space, $v + w \in V$. So, $T(v + w) = T(v) + T(w) = x + y$. It follows that $x + y \in R(T)$. Thus, $R(T)$ is closed under vector addition.

(3) Similarly, let $x \in R(T)$ and $c \in F$. Then, there exists $v \in V$ such that $T(v) = x$. Since $V$ is a vector space, $cv \in V$. Therefore, $T(cv) = cT(v) = cx$. It follows that $cx \in R(T)$. Thus, $R(T)$ is closed under scalar multiplication.

Consequently, $R(T)$ is a subspace of $W$.

**Theorem** 2. Let $V$ and $W$ be vector spaces, and let $T:V \to W$ be a linear map. If $B = \{ v_1,v_2,\ldots,v_n \}$ is a basis for $V$, then 

$\begin{align*}
R(T) = span(T(B)) = span(\{Tv_1,Tv_2,\ldots,Tv_n\})
\end{align*}
$

**Proof**.

Clearly, $T(v_i) \in R(T)$ for each $i$. Because $R(T)$ is a subspace of $W$ (Theorem 1), it contains all linear combinations of $Tv_1,Tv_2,\ldots,Tv_n$. Therefore, $span(\{Tv_1,Tv_2,\ldots,Tv_n\}) \subseteq R(T)$.

For the reverse inclusion, suppose that $w \in R(T)$. Then, $w = T(v)$ for some $v \in V$ by the definition of the range. Because $B$ is a basis for $V$, there exist scalars $\alpha_1,\alpha_2,\ldots,\alpha_n \in F$ such that

$
\begin{align*}
v = \sum_{i=1}^{n} \alpha_i v_i
\end{align*}
$

Since $T$ is linear, it follows that

$\begin{align*}
w = T(v) = \sum_{i=1}^{n} \alpha_i T(v_i) \in span(T(B))
\end{align*}$

So, $R(T) \subseteq span(T(B))$. 

Consequently, $R(T)= span(T(B))$.

**Definition**. Let $V$ and $W$ be vector spaces and let $T:V \to W$ be linear. If $N(T)$ and $R(T)$ are finite dimensional, then we define the *nullity* of $T$, denoted $nullity(T)$ and the *rank* of $T$, denoted $rank(T)$, to be the dimensions of $N(T)$ and $R(T)$ respectively.

Reflecting on the action of a linear transformation, we see intuitively that the larger the nullity, the smaller the rank. In other words, the more vectors that are carried into $0$, the smaller the range. The same heuristic reasoning tells us that the larger the rank, the smaller the nullity. This balance between rank and nullity is made precise in the next theorem, appropriately called the rank-nullity-dimension theorem.

**Theorem** 3 (*Rank Nullity Dimension Theorem*). Let $V$ and $W$ be vector spaces, and let $T:V\to W$ be linear. If $V$ is finite dimensional, then

$
\begin{align*}
nullity(T) + rank(T) = dim(V)
\end{align*}
$

**Proof**.

Suppose that $dim(V) = n$, $dim(N(T)) = k$ and $\{v_1,v_2,\ldots,v_k\}$ is a basis for $N(T)$. Recall from basic linear algebra that we may extend the linearly independent set $\{v_1,v_2,\ldots,v_k\}$ to a basis $B = \{v_1,v_2,\ldots,v_n\}$ for $V$ by repeatedly adding vectors that are not in the span of the previous ones. We claim that $S = \{T(v_{k+1}),T(v_{k+2}),\ldots,T(v_n)\}$ is a basis for $R(T)$.

(1) First we prove that $S$ generates $R(T)$.

Using Theorem 2 and the fact that $T(v_i) = 0$ for $1 \le i \le k$, we have:

$
\begin{align*}
R(T) &= span(\{Tv_1,Tv_2,\ldots,Tv_n\})\\
&= span(\{T(v_{k+1}),T(v_{k+2}),\ldots, T(v_{n})\}) = span(S)
\end{align*}
$

(2) Now we prove that $S$ is linearly independent. Suppose that,

$
\begin{align*}
\sum_{i=k+1}^{n} \beta_i T(v_i) = 0
\end{align*}
$

Using the fact that $T$ is linear, we have:

$
\begin{align*}
T\left(\sum_{i=k+1}^{n} \beta_i v_i\right) = 0
\end{align*}
$

So,

$
\begin{align*}
\sum_{i=k+1}^{n} \beta_i v_i \in N(T)
\end{align*}
$

Since $\{v_1,v_2,\ldots,v_k\}$ is a basis for $N(T)$, there exist $c_1,c_2,\ldots,c_k \in F$ such that

$
\begin{align*}
\sum_{i=k+1}^{n} \beta_i v_i = \sum_{i=1}^{k} c_i v_i
\end{align*}
$

or

$
\begin{align*}
\sum_{i=1}^{k} (-c_i) v_i + \sum_{i=k+1}^{n} \beta_i v_i = 0
\end{align*}
$

Since $B$ is a basis for $V$, the vectors $v_1,v_2,\ldots,v_n$ are linearly independent. So, all of the coefficients, and in particular all of the $\beta_i$, must be equal to $0$. This implies that $S$ is linearly independent.

Consequently, $S$ is a basis for $R(T)$ and $rank(T) = n - k$. Hence, $rank(T) + nullity(T) = dim(V)$.
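The theorem can be illustrated on a concrete matrix. The sketch below is a minimal, standard-library-only example: the matrix `A` and the helper names `rref` and `null_space_basis` are assumptions introduced here. It counts the pivots to get the rank, builds one null-space vector per non-pivot column, and checks that each of those vectors is really sent to $0$.

```python
from fractions import Fraction

def rref(rows):
    """Reduced row-echelon form of `rows`, plus the list of pivot columns."""
    A = [[Fraction(x) for x in row] for row in rows]
    m, n = len(A), len(A[0])
    pivots, r = [], 0                      # r = index of the next pivot row
    for c in range(n):
        p = next((i for i in range(r, m) if A[i][c] != 0), None)
        if p is None:
            continue                       # no pivot in this column
        A[r], A[p] = A[p], A[r]            # row interchange
        piv = A[r][c]
        A[r] = [x / piv for x in A[r]]     # scale the pivot to 1
        for i in range(m):
            if i != r:
                f = A[i][c]
                A[i] = [a - f * b for a, b in zip(A[i], A[r])]
        pivots.append(c)
        r += 1
        if r == m:
            break
    return A, pivots

def null_space_basis(rows):
    """One basis vector of the null space per free (non-pivot) column."""
    R, pivots = rref(rows)
    n = len(rows[0])
    basis = []
    for f in range(n):
        if f in pivots:
            continue
        v = [Fraction(0)] * n
        v[f] = Fraction(1)
        for i, p in enumerate(pivots):
            v[p] = -R[i][f]
        basis.append(v)
    return basis

A = [[1, 2, 0, 1],
     [2, 4, 1, 4],
     [3, 6, 1, 5]]          # third row = first row + second row

_, pivots = rref(A)
rank = len(pivots)
basis = null_space_basis(A)
nullity = len(basis)

# Every basis vector must satisfy A v = 0.
in_kernel = all(sum(a * x for a, x in zip(row, v)) == 0 for v in basis for row in A)
print(rank, nullity, in_kernel, rank + nullity == len(A[0]))  # 2 2 True True
```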

## Matrices.

Let $m$ and $n$ be two positive integers. An $m \times n$ *matrix* (a matrix having $m$ rows and $n$ columns, also called a matrix of order $(m,n)$) with elements in a field $K$ is a set of $mn$ scalars $a_{ij} \in K$, with $i=1,\ldots,m$ and $j=1,\ldots,n$, represented by the following rectangular array

$A = 
\begin{bmatrix}
a_{11} & a_{12} & \ldots & a_{1n}\\
a_{21} & a_{22} & \ldots & a_{2n}\\
\vdots & \vdots &        & \vdots\\
a_{m1} & a_{m2} & \ldots & a_{mn}
\end{bmatrix} \tag{1}$

When $K = \mathbb{R}$ or $\mathbb{C}$, I shall write $A \in \mathbb{R}^{m\times n}$ or $A \in \mathbb{C}^{m\times n}$ respectively, to make explicit the field to which the elements of $A$ belong.

From basic linear algebra, it is worthwhile to keep in mind that *matrices are concrete realizations of linear transformations from $\mathbb{R}^n$ to $\mathbb{R}^m$*. A matrix $A \in \mathbb{R}^{m \times n}$ defines the map $T:\mathbb{R}^n \to \mathbb{R}^m$, $T(x) = Ax$, and $T \in L(\mathbb{R}^n,\mathbb{R}^m)$, the space of all linear transformations from $\mathbb{R}^n$ to $\mathbb{R}^m$.

More generally, let $V$, $W$ be finite dimensional vector spaces. Let $T$ be a linear transformation from $V$ into $W$.

$$T:V \to W$$

Let $B_V = \{v_1,v_2,\ldots,v_n\}$ and $B_W = \{w_1,w_2,\ldots,w_m\}$ be ordered bases of the vector spaces $V$ and $W$ respectively, so that $dim(V)= n$ and $dim(W) = m$. The matrix of the linear transformation $T$ is defined as follows.

A linear transformation is completely determined by its action on the basis vectors. If we know $Tv_1,Tv_2,\ldots,Tv_n$, it is enough to completely determine $T$. 

Each of the vectors $Tv_1,Tv_2,\ldots,Tv_n$ is an element of the vector space $W$, so we can write each one as a linear combination of the basis vectors $w_1,w_2,\ldots,w_m$.

$
\begin{align*}
Tv_1 &= a_{11}w_1 + a_{21}w_2 + \ldots + a_{m1}w_m \\
Tv_2 &= a_{12}w_1 + a_{22}w_2 + \ldots + a_{m2}w_m \\
\vdots\\
Tv_n &= a_{1n}w_1 + a_{2n}w_2 + \ldots + a_{mn}w_m 
\end{align*} \tag{2}
$

That is, 
$
\begin{align*}
Tv_j = \sum_{i=1}^{m}a_{ij} w_i
\end{align*} \tag{3}
$

On the right hand side, $i$ is the running index and $j$ is the free index that corresponds to $Tv_j$. The matrix of the vector $Tv_j$ relative to the basis $B_W$ is the column vector whose entries are the coordinates of $Tv_j$ with respect to $B_W$:

$
\begin{align*}
[Tv_j]_{B_W} = 
\begin{bmatrix}
a_{1j}\\
a_{2j}\\
\vdots\\
a_{mj}
\end{bmatrix}
\end{align*} \tag{4}
$

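The matrix of the linear transformation $T$ with respect to the bases $B_V$ and $B_W$ is the $m \times n$ matrix whose $j$th column is $[Tv_j]_{B_W}$. If $x \in V$ has coordinates $\mathbf{x} = (x_1,x_2,\ldots,x_n)$ with respect to $B_V$, then $Tx$ has coordinates $A\mathbf{x}$ with respect to $B_W$. The matrix is defined as: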

$
A = (a_{ij}) =[ T]^{B_{W}}_{B_{V}} =\begin{bmatrix}
a_{11} & a_{12} & \ldots  & a_{1n}\\
a_{21} & a_{22} & \ldots  & a_{2n}\\
a_{31} & a_{32} & \ldots  & a_{3n}\\
\vdots  &  &  & \\
a_{m1} & a_{m2} & \ldots  & a_{mn}
\end{bmatrix}
 \tag{5}$

As an aid to remembering how $[T]_{B_V}^{B_W}$ is constructed from $T$, you might write the vectors $Tv_1,Tv_2,\ldots,Tv_n$ across the top of the matrix and the basis vectors $w_1,w_2,\ldots,w_m$ of the target space along its right-hand side. The $j$th column of $[T]_{B_V}^{B_W}$ consists of the scalars needed to write $Tv_j$ as a linear combination of the $w$'s. Thus, $Tv_j$ is recovered by multiplying each entry in the $j$th column by the corresponding $w$ on the right and then adding up the resulting vectors. This agrees with the usual convention for writing a matrix.
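As a small worked example (the spaces and the operator are chosen here purely for illustration and do not appear elsewhere in these notes), let $V = P_2$ be the space of polynomials of degree at most $2$ with ordered basis $B_V = \{1, t, t^2\}$, let $W = P_1$ with ordered basis $B_W = \{1, t\}$, and let $D:V \to W$ be differentiation. Writing the images of the basis vectors of $V$ in terms of $B_W$ gives

$
\begin{align*}
D(1) &= 0 \cdot 1 + 0 \cdot t \\
D(t) &= 1 \cdot 1 + 0 \cdot t \\
D(t^2) &= 0 \cdot 1 + 2 \cdot t
\end{align*}
$

so the columns of the matrix are $(0,0)$, $(1,0)$ and $(0,2)$:

$
[D]_{B_V}^{B_W} = \begin{bmatrix}
0 & 1 & 0\\
0 & 0 & 2
\end{bmatrix}
$

For instance, $p(t) = 3 + 5t + 4t^2$ has coordinates $(3,5,4)$ with respect to $B_V$, and multiplying by the matrix gives $(5,8)$, which are the coordinates of $p'(t) = 5 + 8t$ with respect to $B_W$.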

## Row space and column space of a matrix.

Suppose we are to solve a system of linear equations
$
\begin{align*}
Ax = b
\end{align*}
$

Writing $A = [A_1 A_2 \ldots A_n]$, where $A_j$ denotes the $j$th column of $A$, we have,

$
\begin{align*}
[A_1 A_2 \ldots A_n]
\begin{bmatrix}
x_1\\
x_2\\
x_3\\
\vdots\\
x_n
\end{bmatrix} &= b \\
x_1 A_1 + x_2 A_2 + \ldots + x_n A_n &= b
\end{align*}
$
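The identity $Ax = x_1 A_1 + x_2 A_2 + \ldots + x_n A_n$ can be checked directly. In the sketch below the matrix `A`, the vector `x` and both helper functions are illustrative assumptions: the product is computed once row by row and once as a combination of columns.

```python
def matvec(A, x):
    """Row-by-row product Ax."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def column_combination(A, x):
    """x_1*A_1 + ... + x_n*A_n, where A_j is the j-th column of A."""
    m = len(A)
    result = [0] * m
    for j, xj in enumerate(x):
        for i in range(m):
            result[i] += xj * A[i][j]   # add x_j times column j
    return result

A = [[1, 2, 0],
     [0, 1, 3]]
x = [2, -1, 4]

print(matvec(A, x))              # [0, 11]
print(column_combination(A, x))  # [0, 11]
```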

Suppose $A$ is not invertible. Then $Ax=b$ is solvable for some right hand side vectors $b$ and not solvable for others. We want to describe the good right hand side vectors $b$: the vectors that can be written as a linear combination of the column vectors of $A$. Those $b$'s form the column space of $A$.

**Definition**. The *column space* of $A$ is the subspace spanned by the columns of $A$, that is, the set of all linear combinations of the columns of $A$.

Remember that $Ax = b$ is solvable if and only if $b$ is in the column space of $A$. Since each column of $A$ lies in $\mathbb{R}^m$, the column space of $A$ is a subspace of $\mathbb{R}^m$.

**Definition**. Let $A \in \mathbb{R}^{m \times n}$ be a matrix of order $m \times n$ over the field of real numbers $\mathbb{R}$. The subspace of $\mathbb{R}^n$ generated by the row vectors of $A$ is called the *row space* of $A$.

The dimension of the row space of $A$ is called the *row rank* of $A$. The dimension of the column space of $A$ is called the *column rank* of $A$.

The row space is a subspace of $\mathbb{R}^n$ and the column space is a subspace of $\mathbb{R}^m$. Each row vector has $n$ coordinates, and each column vector has $m$ coordinates.

An interesting fact is that the row rank of $A$ equals the column rank of $A$. This is a very important result. These subspaces may lie in different vector spaces, but their dimension is the same. 
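A small example, with a matrix chosen here only for illustration, makes the equality concrete. Take

$
A = \begin{bmatrix}
1 & 2 & 3\\
2 & 4 & 6
\end{bmatrix}
$

The second row is twice the first, so the row space is spanned by $(1,2,3)$ and the row rank is $1$. Every column is a multiple of $(1,2)$, so the column space is spanned by $(1,2)$ and the column rank is also $1$. The row space is a line in $\mathbb{R}^3$ and the column space is a line in $\mathbb{R}^2$: different ambient spaces, same dimension.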

## Invertible Matrices.

**Definition**. A linear transformation $T:V\to W$ is said to be *one-to-one* if distinct elements have distinct images. Mathematically,

$\begin{align*}
x \ne y \implies T(x) \ne T(y)
\end{align*}$

The contrapositive of the above statement is,

$\begin{align*}
T(x) = T(y) \implies x = y
\end{align*}$

**Theorem**. Let $V$ and $W$ be vector spaces, and let $T : V \to W$ be linear. Then, $T$ is one-to-one if and only if $N(T) = \{0\}$. 

**Proof**. 

Suppose that $T$ is one-to-one and $x \in N(T)$. Then, $T(x) = 0 = T(0)$. Since $T$ is one-to-one, we have $x = 0$. Hence, $N(T) = \{0\}$.

Now, assume that $N(T) = \{0\}$, and suppose that $T(x) = T(y)$. Then, $0 = T(x) - T(y) = T(x-y)$. Therefore, $x-y \in N(T) = \{0\}$. So, $x - y = 0$, or $x = y$. This means that $T$ is one-to-one.

**Definition**. Let $V$ and $W$ be vector spaces, and let $T : V \to W$ be linear. Then, $T$ is said to be *onto* if $R(T) = W$.

**Definition**. Let $V$ and $W$ be vector spaces, and let $T : V \to W$ be linear. Then, $T$ is *bijective* if it is both injective (one-to-one) and surjective (onto). If $T:V \to W$ is a bijection and $V$ is finite dimensional, then $dim(V) = dim(W)$: by Theorem 3, $dim(V) = nullity(T) + rank(T) = 0 + dim(W)$.

**Theorem**. Let $V$ and $W$ be vector spaces, and let $T : V \to W$ be linear. Then, $T$ is invertible if and only if:

(1) $T$ is one-to-one (injective), and

(2) $T$ is onto (surjective).

**Definition**. Let $V$ and $W$ be vector spaces and let $T:V\to W$ be linear. A function $U:W\to V$ is said to be an inverse of $T$, if $TU = I_W$ and $UT = I_V$. If $T$ has an inverse, it is said to be *invertible*. 

Hence, if $V$ and $W$ are finite dimensional vector spaces and $T:V \to W$ is invertible, then $dim(V) = dim(W)$. This is why invertible matrices are $n \times n$ square matrices.

**Theorem**. Let $T$ be an invertible linear operator on $V = \mathbb{R}^n$ and let $B$ be a basis of $V$. Then, the matrix of the inverse operator is the inverse of the matrix of $T$:

$\begin{align*}
[T^{-1}]_B = [T]_B^{-1}
\end{align*}$

**Proof**.

Since $T$ is an invertible map, there is an inverse operator $S$ such that $ST = I_V$ and $TS = I_V$. These are identities between linear transformations; no matrices are involved yet. Now take the matrix of both sides of $TS = I_V$ with respect to $B$:

$
\begin{align*}
[TS]_{B} = [I]_B
\end{align*}
$

Invoking the formula $[TS]_B = [T]_B [S]_B$, we have,

$
\begin{align*}
[T]_{B} [S]_{B} = [I]_B
\end{align*}
$

We can also do this for the first equation $ST = I_V$:

$
\begin{align*}
[S]_{B} [T]_{B} = [I]_B
\end{align*}
$

What this means is that, $[S]_B$ is the inverse of the matrix $[T]_B$. That is, $[S]_B = [T]_B^{-1}$. But, $S = T^{-1}$. So,

$
\begin{align*}
[T^{-1}]_B = [T]_B^{-1}
\end{align*}
$
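A quick numerical check of this identity, under the assumption that $B$ is the standard basis of $\mathbb{R}^2$: the shear `T`, its inverse `T_inv` and the helpers `matrix_of` and `matmul` are hypothetical names introduced for this sketch. The matrix of each operator is built column by column from the images of the basis vectors, and the two matrices are then multiplied in both orders.

```python
def matrix_of(op, n):
    """Matrix of `op` in the standard basis: column j is op(e_j)."""
    cols = [op(tuple(1 if i == j else 0 for i in range(n))) for j in range(n)]
    return [[cols[j][i] for j in range(n)] for i in range(n)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def T(v):
    """Hypothetical shear on R^2: (x, y) -> (x + 2y, y)."""
    return (v[0] + 2 * v[1], v[1])

def T_inv(v):
    """Inverse operator of the shear: (x, y) -> (x - 2y, y)."""
    return (v[0] - 2 * v[1], v[1])

M, M_inv = matrix_of(T, 2), matrix_of(T_inv, 2)
I2 = [[1, 0], [0, 1]]

print(M)      # [[1, 2], [0, 1]]
print(M_inv)  # [[1, -2], [0, 1]]
print(matmul(M, M_inv) == I2 and matmul(M_inv, M) == I2)  # True
```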

**Theorem**. (*Invertible matrix theorem*). Let $T:\mathbb{R}^n \to \mathbb{R}^n$ be linear and let $B$ be the standard basis of $\mathbb{R}^n$. Define $A:=[T]_B$. Then, the following statements are equivalent:

(1) $A$ is invertible. That is, $A$ has an inverse (equivalently, $A$ is non-singular or non-degenerate): there exists a matrix $C$ such that $AC = I_n = CA$.

(2) The columns of $A$ are linearly independent.

(3) The rows of $A$ are linearly independent.

(4) $T$ is one-to-one. The null space of $A$ is trivial, that is, it contains only the zero vector: $N(T) = \{0\}$. The homogeneous system of equations $Ax = 0$ has only the trivial solution $x = 0$.

(5) $T$ is surjective (onto). The range space is $R(T) = \mathbb{R}^{n}$. That is, the columns of $A$ span $\mathbb{R}^n$, and $A$ has full rank. The inhomogeneous system of equations $Ax = b$ has a unique solution $x$ for each right hand side vector $b$.

(6) $det(A) \ne 0$.

(7) $A$ is row-equivalent to the $n \times n$ identity matrix $I_n$.

(8) $A$ has $n$ pivot positions.

**Proof**. 

(1) $\implies$ (2). Because $T$ is invertible, $T$ is a bijection. That is, $T$ is both an injection and a surjection. By Theorem 2, $R(T) = span(T(B))$, where $B = \{e_1,e_2,\ldots,e_n\}$.

Claim. $T(B)$ is a basis for $\mathbb{R}^n$.

(a) $T$ is surjective, so $R(T) = \mathbb{R}^n$. It follows that $span(T(B)) = \mathbb{R}^n$.

(b) Suppose $\alpha_1 Te_1 + \alpha_2 Te_2 + \ldots + \alpha_n Te_n = 0$. Then, $T(\alpha_1 e_1 + \alpha_2 e_2 + \ldots + \alpha_n e_n) = 0$, so $\alpha_1 e_1 + \alpha_2 e_2 + \ldots + \alpha_n e_n \in N(T)$. As $T$ is injective, $N(T) = \{0\}$, so $\alpha_1 e_1 + \alpha_2 e_2 + \ldots + \alpha_n e_n = 0$. But, $e_1,e_2,\ldots,e_n$ are the basis vectors of $\mathbb{R}^n$. Therefore, the $\alpha_i$'s are all zero. So, $\{Te_1,Te_2,\ldots,Te_n\}$ is a linearly independent set.

The coordinates of the vector $Te_j$ are the entries in the $j$th column of $A$. So, the columns of $A$ are linearly independent.

(1) $\implies$ (3). Because $A$ is invertible, $A^T$ is also invertible, with $(A^T)^{-1} = (A^{-1})^T$. Applying (2) to $A^T$, the columns of $A^T$, which are the rows of $A$, are linearly independent.

The implications involving determinants and row-equivalence are explained at length further ahead.
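Several of the equivalent conditions can be observed side by side on small matrices. The sketch below is only illustrative: the helper `eliminate` and the two test matrices are assumptions made here. It runs forward elimination in exact arithmetic and reports the determinant together with the number of pivots, so that $det(A) \ne 0$ and "$A$ has $n$ pivot positions" can be compared.

```python
from fractions import Fraction

def eliminate(rows):
    """Forward elimination on a square matrix; returns (determinant, number of pivots)."""
    A = [[Fraction(x) for x in row] for row in rows]
    n = len(A)
    det, r = Fraction(1), 0                # r = index of the next pivot row
    for c in range(n):
        p = next((i for i in range(r, n) if A[i][c] != 0), None)
        if p is None:
            det = Fraction(0)              # a column without a pivot forces det = 0
            continue
        if p != r:
            A[r], A[p] = A[p], A[r]
            det = -det                     # a row interchange flips the sign
        det *= A[r][c]                     # determinant = signed product of pivots
        for i in range(r + 1, n):
            f = A[i][c] / A[r][c]
            A[i] = [a - f * b for a, b in zip(A[i], A[r])]
        r += 1
    return det, r

invertible = [[0, 1, 2], [3, 0, 0], [2, 1, 0]]
singular = [[1, 2], [2, 4]]                # second row = 2 * first row

print(eliminate(invertible))  # (Fraction(6, 1), 3): non-zero determinant, 3 pivots
print(eliminate(singular))    # (Fraction(0, 1), 1): zero determinant, only 1 pivot
```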


## Elementary matrix operations and elementary matrices.

Solving a system of linear algebraic equations $Ax=b$ is one of the central problems of linear algebra. From high-school and university courses, we are familiar with performing elementary row operations on a matrix $A$, resulting in a simplified system of equations that is easier to solve. There are three such operations: interchanging two rows, multiplying a row by a non-zero scalar, and adding a scalar multiple of one row to another.

**Definition** (*Elementary matrix*). Any matrix $E$ obtained by performing a single elementary row (column) operation on $I_n$ is called an elementary matrix.

**Theorem.** Every elementary matrix $E$ is invertible. That is, given any elementary matrix $E$, there exists a matrix $D$ such that $DE = I = ED$.

**Story Proof.** 

Each elementary row (column) operation on a matrix $A$ amounts to pre (post)-multiplying $A$ by an elementary matrix $E$. For example, let

$
A = 
\begin{bmatrix}
0 & 1 & 2\\
3 & 0 & 0\\
2 & 1 & 0
\end{bmatrix}
$

Recall that the $i$th row $C_i$ of the product $C = MB$ of two matrices, where $M = (a_{ij})$ and $B$ has rows $B_1,B_2,\ldots,B_n$, is given by,

$
\begin{align*}
C_i = \begin{bmatrix}a_{i1} & a_{i2} & \ldots & a_{in}\end{bmatrix} \begin{bmatrix}B_1 \\ B_2 \\ \vdots \\ B_n\end{bmatrix} = a_{i1}\cdot B_1 + a_{i2}\cdot B_2 + \ldots + a_{in}\cdot B_n
\end{align*}
$

Left-multiplying $A$ by $E_{12}$ interchanges the first and second rows of $A$. That is,

$
\begin{align*}
E_{12}A = \begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix}0 & 1 & 2 \\ 3 & 0 & 0 \\ 2 & 1 & 0\end{bmatrix} = \begin{bmatrix}3 & 0 & 0 \\ 0 & 1 & 2 \\ 2 & 1 & 0\end{bmatrix}
\end{align*}
$

Next, consider multiplying a row by a non-zero scalar $k$. Multiplying the $i$th row by $k$ corresponds to the left-multiplication $E_i(k)A$; multiplying the $i$th column by $k$ corresponds to the right-multiplication $AE_i(k)$. As an illustration,

$
\begin{align*}
E_3(2)A = \begin{bmatrix}1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 2\end{bmatrix}\begin{bmatrix}0 & 1 & 2 \\ 3 & 0 & 0 \\ 2 & 1 & 0\end{bmatrix} = \begin{bmatrix}0 & 1 & 2 \\ 3 & 0 & 0 \\ 4 & 2 & 0\end{bmatrix}
\end{align*}
$

Lastly, replacing an equation (row) $e_i$ by the sum $e_i + ke_j$, where $j \ne i$ and $k$ is any scalar, is equivalent to left-multiplication by an elementary matrix $E_{ij}(k)$.

$
\begin{align*}
E_{13}(5)A = \begin{bmatrix}1 & 0 & 5 \\ 0 & 1 & 0 \\ 0 & 0 & 1\end{bmatrix}\begin{bmatrix}0 & 1 & 2 \\ 3 & 0 & 0 \\ 2 & 1 & 0\end{bmatrix} = \begin{bmatrix}10 & 6 & 2 \\ 3 & 0 & 0 \\ 2 & 1 & 0\end{bmatrix}
\end{align*}
$

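Each elementary row operation can be undone by an elementary row operation of the same type. Interchanging two rows twice restores the original matrix, so an interchange matrix such as $E_{12}$ is its own inverse. Scaling a row by $k \ne 0$ is undone by scaling it by $1/k$, so $E_i(k)^{-1} = E_i(1/k)$. Adding $k$ times row $j$ to row $i$ is undone by subtracting it again, so $E_{ij}(k)^{-1} = E_{ij}(-k)$. In each case the inverse $D$ is itself an elementary matrix with $DE = I = ED$. Consequently, any elementary matrix $E$ is *invertible*.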
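The three products computed above, and the fact that each elementary matrix is undone by an elementary matrix of the same type, can be reproduced in a few lines. This is a minimal sketch: the constructor names `interchange`, `scaling` and `row_addition` are introduced here for illustration, and indices are 0-based, so $E_{13}(5)$ corresponds to `row_addition(3, 0, 2, 5)`.

```python
from fractions import Fraction

def identity(n):
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def interchange(n, i, j):
    """Elementary matrix that swaps rows i and j."""
    E = identity(n)
    E[i], E[j] = E[j], E[i]
    return E

def scaling(n, i, k):
    """Elementary matrix that multiplies row i by k (k must be non-zero)."""
    E = identity(n)
    E[i][i] = Fraction(k)
    return E

def row_addition(n, i, j, k):
    """Elementary matrix that adds k times row j to row i (i != j)."""
    E = identity(n)
    E[i][j] = Fraction(k)
    return E

A = [[0, 1, 2], [3, 0, 0], [2, 1, 0]]
I3 = identity(3)

# The three products from the text.
print(matmul(interchange(3, 0, 1), A) == [[3, 0, 0], [0, 1, 2], [2, 1, 0]])       # True
print(matmul(scaling(3, 2, 2), A) == [[0, 1, 2], [3, 0, 0], [4, 2, 0]])           # True
print(matmul(row_addition(3, 0, 2, 5), A) == [[10, 6, 2], [3, 0, 0], [2, 1, 0]])  # True

# Each elementary matrix is undone by one of the same type.
print(matmul(interchange(3, 0, 1), interchange(3, 0, 1)) == I3)           # True
print(matmul(scaling(3, 2, Fraction(1, 2)), scaling(3, 2, 2)) == I3)      # True
print(matmul(row_addition(3, 0, 2, -5), row_addition(3, 0, 2, 5)) == I3)  # True
```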

## Equivalent systems of linear equations.

Any rectangular matrix in the *row-echelon* form has the following three defining properties.

(1) The first $r$ rows for some $r \ge 0$ are non-zero, and the remaining rows if any are zero.

(2) In the $i$th row $(i=1,2,3,\ldots,r)$, the first non-zero element is equal to unity (the *leading* $1$); denote the column in which it occurs by $c_i$.

(3) $c_1 < c_2 < c_3 < \ldots < c_r$

A matrix in row-echelon form has a stair-case pattern. For example,

$
\begin{align*}
U = \begin{bmatrix}
1 & 0 & 3 & 3\\
0 & 1 & 3 &4 \\
0 & 0 & 0 & 1
\end{bmatrix}
\end{align*}
$

is in the row-echelon form.

A matrix is said to be in *row-reduced echelon form* (rref) if it satisfies the following conditions.

(1) The matrix is in row-echelon form.

(2) Each leading $1$ is the only non-zero entry in its column.

The row-reduced echelon form of the matrix $U$ above is

$
\begin{align*}
R = \begin{bmatrix}
1 & 0 & 3 & 0\\
0 & 1 & 3 & 0 \\
0 & 0 & 0 & 1
\end{bmatrix}
\end{align*}
$
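The passage from $U$ to $R$ can be reproduced with a short Gauss-Jordan routine. The function `rref` below is a sketch written for these notes, not a library call. It works in exact rational arithmetic and uses only the three elementary row operations: interchange, scaling, and adding a multiple of one row to another.

```python
from fractions import Fraction

def rref(rows):
    """Row-reduced echelon form of `rows`, plus the list of pivot columns."""
    A = [[Fraction(x) for x in row] for row in rows]
    m, n = len(A), len(A[0])
    pivots, r = [], 0                      # r = index of the next pivot row
    for c in range(n):
        p = next((i for i in range(r, m) if A[i][c] != 0), None)
        if p is None:
            continue                       # no pivot in this column
        A[r], A[p] = A[p], A[r]            # row interchange
        piv = A[r][c]
        A[r] = [x / piv for x in A[r]]     # make the leading entry 1
        for i in range(m):
            if i != r:                     # clear the rest of the pivot column
                f = A[i][c]
                A[i] = [a - f * b for a, b in zip(A[i], A[r])]
        pivots.append(c)
        r += 1
        if r == m:
            break
    return A, pivots

U = [[1, 0, 3, 3],
     [0, 1, 3, 4],
     [0, 0, 0, 1]]

R, pivots = rref(U)
print(R == [[1, 0, 3, 0], [0, 1, 3, 0], [0, 0, 0, 1]])  # True
print(pivots)                                            # [0, 1, 3]
```

The pivot columns $c_1 < c_2 < c_3$ are reported 0-based, so `[0, 1, 3]` means columns $1$, $2$ and $4$.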

A central objective of linear algebra is to solve the system of equations:

$
\begin{align*}
A\mathbf{x}=\mathbf{b}
\end{align*}
$

where $A$ is a matrix of order $m \times n$ over the field of reals, $\mathbf{x} \in \mathbb{R}^n$, and the right hand side vector $\mathbf{b} \in \mathbb{R}^m$. We seek $\mathbf{x}=(x_1,x_2,x_3,\ldots,x_n)$ that satisfies the above system of equations.

A solution of a linear system is therefore an assignment of values to the variables $x_1,x_2,\ldots, x_n$ such that each of the equations is satisfied. The set of all possible solutions is called the *solution set*.

- A system of equations may have no solution at all (for example, parallel lines).
- A system of equations may have a unique solution (straight lines intersecting at a unique point).
- A system of equations may have an infinite number of solutions (coincident lines).

In general, a system of $m$ equations in $n$ unknowns is said to be *under-determined* if the number of equations is smaller than the number of unknowns, $m < n$. An under-determined system has either no solution or an infinite number of solutions. If $m=n$, the system has a unique solution exactly when $A$ is invertible; otherwise it has no solution or infinitely many. If the number of equations exceeds the number of unknowns, $m > n$, the system is *over-determined*, and generally has no solution.

This is just to give a geometric viewpoint to the solutions of a system of linear equations.
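The three cases can be told apart by comparing ranks: $A\mathbf{x}=\mathbf{b}$ is solvable exactly when $\mathbf{b}$ lies in the column space of $A$, that is, when $A$ and the augmented matrix $[A \mid \mathbf{b}]$ have the same rank, and the solution is then unique exactly when that rank equals $n$. The sketch below applies this test to the three pairs of lines described above; the helper names `rank` and `classify` and the example systems are assumptions made here.

```python
from fractions import Fraction

def rank(rows):
    """Rank via forward elimination over exact fractions."""
    A = [[Fraction(x) for x in row] for row in rows]
    m, n = len(A), len(A[0])
    r = 0                                  # number of pivots found so far
    for c in range(n):
        p = next((i for i in range(r, m) if A[i][c] != 0), None)
        if p is None:
            continue
        A[r], A[p] = A[p], A[r]
        for i in range(r + 1, m):
            f = A[i][c] / A[r][c]
            A[i] = [a - f * b for a, b in zip(A[i], A[r])]
        r += 1
        if r == m:
            break
    return r

def classify(A, b):
    """Return 'none', 'unique' or 'infinite' for the system Ax = b."""
    n = len(A[0])
    augmented = [row + [bi] for row, bi in zip(A, b)]
    r, r_aug = rank(A), rank(augmented)
    if r < r_aug:
        return "none"                      # b is not in the column space of A
    return "unique" if r == n else "infinite"

print(classify([[1, 1], [1, 1]], [1, 2]))   # none      (parallel lines)
print(classify([[1, 1], [1, -1]], [2, 0]))  # unique    (intersecting lines)
print(classify([[1, 1], [2, 2]], [1, 2]))   # infinite  (coincident lines)
```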