# Fundamentals of Matrix Analysis

In this notebook, we recall the basic elements of linear algebra which will be employed in the remainder of the notebooks. 

## Linear Transformation.

**Definition**. Let $V$ and $W$ be vector spaces over $F$. We call a function $T:V \to W$ a *linear transformation* from $V$ into $W$, if for all $x,y \in V$ and $c \in F$, we have:

(a) Additivity is preserved.
$\begin{align}
T(x+y) = T(x) + T(y)
\end{align}
$

(b) Scalar-multiplication is preserved.
$\begin{align}
T(cx) = cT(x)
\end{align}
$

We often simply call $T$ a *linear map*. If $V = W$, then $T$ is called a *linear operator*.
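The two defining properties can be spot-checked numerically. The following is a minimal sketch, not a proof: the maps `T` and `S` and the test vectors are hypothetical examples chosen here for illustration, and exact rational arithmetic from the standard library is used so that the equality tests are not affected by rounding.

```python
from fractions import Fraction

def T(v):
    """Hypothetical linear map T(x, y, z) = (x - y, 2z) from R^3 to R^2."""
    x, y, z = v
    return (x - y, 2 * z)

def S(v):
    """Hypothetical non-linear map S(x, y, z) = (x + 1, y); the shift breaks additivity."""
    x, y, z = v
    return (x + 1, y)

def add(u, v):
    """Componentwise vector addition."""
    return tuple(a + b for a, b in zip(u, v))

def scale(c, v):
    """Multiplication of a vector by the scalar c."""
    return tuple(c * a for a in v)

x = (Fraction(1), Fraction(2), Fraction(3))
y = (Fraction(-4), Fraction(0), Fraction(5, 2))
c = Fraction(7, 3)

# (a) additivity and (b) compatibility with scalar multiplication for T
additive = T(add(x, y)) == add(T(x), T(y))
homogeneous = T(scale(c, x)) == scale(c, T(x))

# The same additivity check fails for S
shift_additive = S(add(x, y)) == add(S(x), S(y))

print(additive, homogeneous, shift_additive)
```

A passing check on a few vectors only fails to refute linearity; a single failing check, as for `S`, is enough to show that a map is not linear.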

We turn our attention to two important sets associated with linear maps: the *range space* and the *null space*. The determination of these sets allows us to more closely examine the intrinsic properties of a linear transformation.

**Definition**. Let $V$ and $W$ be vector spaces, and $T:V\to W$ be linear. We define the *null space* (or *kernel*) $N(T)$ of $T$ to be the set of all vectors $x$ in $V$, such that $T(x) = 0$. That is,

$
\begin{align}
N(T) := \{x : T(x) = 0 \}
\end{align}
$

where $0$ represents the zero-vector in $W$.

We define the *range* (or *image*) $R(T)$ of $T$ to be the subset of $W$ consisting of all images (under $T$) of vectors in $V$; that is $R(T) = \{T(x):x \in V\}$.
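As a small worked example (the map below is chosen here purely for illustration), consider $T:\mathbb{R}^3 \to \mathbb{R}^2$ given by $T(a_1,a_2,a_3) = (a_1 - a_2, 2a_3)$. We have $T(a_1,a_2,a_3) = (0,0)$ exactly when $a_1 = a_2$ and $a_3 = 0$. Moreover, every $(b_1,b_2) \in \mathbb{R}^2$ is the image of $(b_1, 0, b_2/2)$. Hence,

$
\begin{align}
N(T) &= \{(a,a,0) : a \in \mathbb{R}\} \\
R(T) &= \mathbb{R}^2
\end{align}
$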

**Theorem** 1. Let $V$ and $W$ be vector spaces and $T:V\to W$ be linear. Then, $N(T)$ and $R(T)$ are subspaces of $V$ and $W$ respectively.

**Proof.**

Define $0_V$ and $0_W$ to be the zero vectors in $V,W$ respectively.

Claim. $N(T)$ is a subspace of $V$.

(1) Since $T$ is linear, $T(0_V) = T(0 \cdot 0_V) = 0 \cdot T(0_V) = 0_W$, so $0_V \in N(T)$. So, the zero vector belongs to $N(T)$.

(2) Let $x,y \in N(T)$. Then, $T(x+y) = T(x) + T(y) = 0_W + 0_W = 0_W$. Hence, $x+y \in N(T)$. Thus, $N(T)$ is closed under vector addition.

(3) Let $x \in N(T)$ and $c \in F$. Then, $T(cx) = cT(x) = c \cdot 0_W = 0_W$. So, $cx \in N(T)$. Thus, $N(T)$ is closed under scalar multiplication.

Consequently, $N(T)$ is a subspace of $V$.

Claim. $R(T)$ is a subspace of $W$.

(1) Since $T(0_V) = 0_W$, the zero vector $0_W$ belongs to $R(T)$.

(2) Let $x,y \in R(T)$. Then, by the definition of the range, there exist $v,w \in V$ such that $T(v) = x$ and $T(w) = y$. Since $V$ is a vector space, $v + w \in V$. So, $T(v + w) = T(v) + T(w) = x + y$. It follows that $x + y \in R(T)$. Thus, $R(T)$ is closed under vector addition.

(3) Similarly, let $x \in R(T)$ and $c \in F$. Then, by the definition of the range, there exists $v \in V$ such that $T(v) = x$. Since $V$ is a vector space, $cv \in V$. Therefore, $T(cv) = cT(v) = cx$. It follows that $cx \in R(T)$. Thus, $R(T)$ is closed under scalar multiplication.

Consequently, $R(T)$ is a subspace of $W$.

**Theorem** 2. Let $V$ and $W$ be vector spaces, and let $T:V \to W$ be a linear map. If $B = \{ v_1,v_2,\ldots,v_n \}$ is a basis for $V$, then 

$\begin{align}
R(T) = span(T(B)) = span(\{Tv_1,Tv_2,\ldots,Tv_n\})
\end{align}
$

**Proof**.

Clearly, $T(v_i) \in R(T)$ for each $i$. Because $R(T)$ is a subspace of $W$ (Theorem 1), it contains every linear combination of $Tv_1,Tv_2,\ldots,Tv_n$. Therefore, $span(\{Tv_1,Tv_2,\ldots,Tv_n\}) \subseteq R(T)$.

For the reverse inclusion, suppose that $w \in R(T)$. Then, $w = T(v)$ for some $v \in V$ by the definition of the range. Because $B$ is a basis for $V$, there exist scalars $\alpha_1,\alpha_2,\ldots,\alpha_n \in F$ such that:

$
\begin{align}
v = \sum_{i=1}^{n} \alpha_i v_i
\end{align}
$

Since $T$ is linear, it follows that

$\begin{align}
w = T(v) = \sum_{i=1}^{n} \alpha_i T(v_i) \in span(T(B))
\end{align}$

So, $R(T) \subseteq span(T(B))$. 

Consequently, $R(T)= span(T(B))$.

**Definition**. Let $V$ and $W$ be vector spaces and let $T:V \to W$ be linear. If $N(T)$ and $R(T)$ are finite dimensional, then we define the *nullity* of $T$, denoted $nullity(T)$ and the *rank* of $T$, denoted $rank(T)$, to be the dimensions of $N(T)$ and $R(T)$ respectively.

Reflecting on the action of a linear transformation, we see intuitively that the larger the nullity, the smaller the rank. In other words, the more vectors that are carried into $0$, the smaller the range. The same heuristic reasoning tells us that the larger the rank, the smaller the nullity. This balance between rank and nullity is made precise in the next theorem, appropriately called the rank-nullity-dimension theorem.

**Theorem** 3 (*Rank Nullity Dimension Theorem*). Let $V$ and $W$ be vector spaces, and let $T:V\to W$ be linear. If $V$ is finite dimensional, then

$
\begin{align*}
nullity(T) + rank(T) = dim(V)
\end{align*}
$

**Proof**.

Suppose that $dim(V) = n$, $dim(N(T)) = k$ and $\{v_1,v_2,\ldots,v_k\}$ is a basis for $N(T)$. Recall from basic linear algebra that we may extend the basis $\{v_1,v_2,\ldots,v_k\}$ to a basis $B = \{v_1,v_2,\ldots,v_n\}$ for $V$, as long as each new vector added is not in the span of the previous ones. We claim that $S = \{T(v_{k+1}),T(v_{k+2}),\ldots,T(v_n)\}$ is a basis for $R(T)$.

(1) First we prove that $S$ generates $R(T)$.

Using Theorem 2 and the fact that $T(v_i) = 0$ for $1 \le i \le k$, we have:

$
\begin{align}
R(T) &= span(\{Tv_1,Tv_2,\ldots,Tv_n\})\\
&= span(\{T(v_{k+1}),T(v_{k+2}),\ldots, T(v_{n})\}) = span(S)
\end{align}
$

(2) Now we prove that $S$ is linearly independent. Suppose that,

$
\begin{align}
\sum_{i=k+1}^{n} \beta_i T(v_i) = 0
\end{align}
$

Using the fact that $T$ is linear, we have:

$
\begin{align}
T\left(\sum_{i=k+1}^{n} \beta_i v_i\right) = 0
\end{align}
$

So,

$
\begin{align}
\sum_{i=k+1}^{n} \beta_i v_i \in N(T)
\end{align}
$

Hence, since $\{v_1,v_2,\ldots,v_k\}$ is a basis for $N(T)$, there exist $c_1,c_2,\ldots,c_k \in F$ such that

$
\begin{align}
\sum_{i=k+1}^{n} \beta_i v_i = \sum_{i=1}^{k} c_i v_i
\end{align}
$

or

$
\begin{align}
\sum_{i=1}^{k} (-c_i) v_i + \sum_{i=k+1}^{n} \beta_i v_i = 0
\end{align}
$

Since $B$ is a basis for $V$, the vectors $v_1,v_2,\ldots,v_n$ are linearly independent, so every coefficient in this combination is zero. In particular, $\beta_i = 0$ for $k+1 \le i \le n$. This implies that $S$ is linearly independent.

Consequently, $S$ is a basis for $R(T)$ and $rank(T) = n - k$. Hence, $rank(T) + nullity(T) = dim(V)$.
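The theorem can be checked on a concrete matrix, viewed as the linear map $x \mapsto Ax$. The sketch below is illustrative only: the helper names `rref` and `null_space_basis` and the matrix `A` are assumptions introduced here, and the row reduction uses exact rational arithmetic so that the rank is not affected by rounding.

```python
from fractions import Fraction

def rref(rows):
    """Return (reduced row echelon form, list of pivot columns), computed exactly."""
    m = [[Fraction(a) for a in row] for row in rows]
    pivots = []
    r = 0  # index of the next pivot row
    for c in range(len(m[0])):
        # Look for a row at or below r with a nonzero entry in column c.
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        piv = m[r][c]
        m[r] = [a / piv for a in m[r]]
        # Clear column c in every other row.
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                f = m[i][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
        if r == len(m):
            break
    return m, pivots

def null_space_basis(rows):
    """One basis vector of N(A) per free (non-pivot) column."""
    R, pivots = rref(rows)
    n = len(rows[0])
    basis = []
    for f in (j for j in range(n) if j not in pivots):
        v = [Fraction(0)] * n
        v[f] = Fraction(1)
        for i, p in enumerate(pivots):
            v[p] = -R[i][f]
        basis.append(v)
    return basis

# Hypothetical 3 x 4 example; the second row is twice the first.
A = [[1, 2, 3, 4],
     [2, 4, 6, 8],
     [1, 0, 1, 0]]

_, pivot_columns = rref(A)
rank = len(pivot_columns)
kernel = null_space_basis(A)
nullity = len(kernel)

print(rank, nullity, len(A[0]))
```

Here the domain is $\mathbb{R}^4$, so the theorem predicts that `rank + nullity` equals the number of columns of `A`.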

## Matrices.

Let $m$ and $n$ be two positive integers. We call a *matrix* having *m* rows and *n* columns, or a matrix $m \times n$, or a matrix $(m,n)$ with elements in $K$, a set of $mn$ scalars $a_{ij} \in K$, with $i=1,\ldots,m$ and $j=1,\ldots,n$ represented by the following rectangular array

$A = 
\begin{bmatrix}
a_{11} & a_{12} & \ldots & a_{1n}\\
a_{21} & a_{22} & \ldots & a_{2n}\\
\vdots & \vdots &        & \vdots\\
a_{m1} & a_{m2} & \ldots & a_{mn} \tag{1}
\end{bmatrix}$

When $K = \mathbb{R}$ or $\mathbb{C}$, we shall write $A \in \mathbb{R}^{m\times n}$ or $A \in \mathbb{C}^{m\times n}$, to state explicitly the numerical field to which the elements of $A$ belong.

From basic linear algebra, it is worthwhile to keep in mind that *matrices are concrete realizations of linear transformations from $\mathbb{R}^n$ to $\mathbb{R}^m$*. A matrix $A \in \mathbb{R}^{m \times n}$ defines the map $T:\mathbb{R}^n \to \mathbb{R}^m$, $T(x) = Ax$, and $T \in L(\mathbb{R}^n,\mathbb{R}^m)$, the space of all linear transformations from $\mathbb{R}^n$ to $\mathbb{R}^m$.

In particular, let $V$, $W$ be finite dimensional vector spaces. Let $T$ be a linear transformation from $V$ into $W$.

$$T:V \to W$$

Let $B_v = \{v_1,v_2,\ldots,v_n\}$ and $B_w = \{w_1,w_2,\ldots,w_m\}$ be ordered bases of the vector spaces $V$ and $W$ respectively, so that $dim(V)= n$ and $dim(W) = m$. The matrix of the linear transformation $T$ is defined as follows.

A linear transformation is completely determined by its action on the basis vectors. If we know $Tv_1,Tv_2,\ldots,Tv_n$, it is enough to completely determine $T$.

Each of the vectors $Tv_1,Tv_2,\ldots,Tv_n$ is an element of the vector space $W$, so we can resolve them in terms of the basis vectors $w_1,w_2,\ldots,w_m$.

$
\begin{align}
Tv_1 &= a_{11}w_1 + a_{21}w_2 + \ldots + a_{m1}w_m \\
Tv_2 &= a_{12}w_1 + a_{22}w_2 + \ldots + a_{m2}w_m \\
\vdots\\
Tv_n &= a_{1n}w_1 + a_{2n}w_2 + \ldots + a_{mn}w_m 
\end{align} \tag{2}
$

That is, 
$
\begin{align}
Tv_j = \sum_{i=1}^{m}a_{ij} w_i
\end{align} \tag{3}
$

On the right-hand side, $i$ is the running index and $j$ is the free index that corresponds to $Tv_j$. The matrix of the vector $Tv_j$ relative to the basis $B_w$ is the column vector whose entries are the coordinates of $Tv_j$ with respect to $B_w$:

$
\begin{align}
[Tv_j]_{B_W} = 
\begin{bmatrix}
a_{1j}\\
a_{2j}\\
\vdots\\
a_{mj}
\end{bmatrix}
\end{align} \tag{4}
$

Let $x \in V$ have coordinates $\mathbf{x} = (x_1,x_2,\ldots,x_n)$ with respect to $B_V$. The matrix of the linear transformation $T$, which sends $\mathbf{x}$ to the coordinates of $Tx$ with respect to $B_W$, is defined as:

$\displaystyle  \begin{array}{{>{\displaystyle}l}}
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \begin{matrix}
Tv_{1} & Tv_{2}
\end{matrix} \ \ \dotsc \ \ \ Tv_{n} \ \\
A\ =\ ( a_{ij}) =[ T]^{B_{W}}_{B_{V}} =\begin{bmatrix}
a_{11} & a_{12} \  & \dotsc  & a_{1n}\\
a_{21} & a_{22} \  & \dotsc  & a_{2n}\\
a_{31} & a_{32} \  & \dotsc  & a_{3n}\\
\vdots  &  &  & \\
a_{m1} & a_{m2} \  & \dotsc  & a_{mn}
\end{bmatrix}\begin{matrix}
w_{1}\\
w_{2}\\
w_{3}\\
\vdots \\
w_{m}
\end{matrix}
\end{array} \tag{5}$

As an aid to remembering how $[T]_{B_V}^{B_W}$ is constructed from $T$, you might write the vectors $Tv_1,Tv_2,\ldots,Tv_n$ across the top and the basis vectors $w_1,w_2,\ldots,w_m$ for the target space along the right. In the matrix above, the $j$th column of $[T]_{B_V}^{B_W}$ consists of the scalars needed to write $Tv_j$ as a linear combination of the $w$'s. Thus, the picture should remind you that $Tv_j$ is retrieved by multiplying each entry in the $j$th column by the corresponding $w$ on the right and then adding up the resulting vectors. This is consistent with the usual way of writing a matrix.
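The column-by-column construction can be sketched in code. As an illustrative assumption, take $T$ to be differentiation from polynomials of degree at most $3$ to polynomials of degree at most $2$, with the monomial bases $\{1,x,x^2,x^3\}$ and $\{1,x,x^2\}$, and represent a polynomial by its list of coordinates. The names `derivative` and `matrix_of` are hypothetical helpers introduced here.

```python
def derivative(coeffs):
    """Differentiate c0 + c1*x + ... + cd*x^d, given as [c0, c1, ..., cd]."""
    return [k * c for k, c in enumerate(coeffs)][1:]

def matrix_of(T, n, m):
    """Matrix of T in coordinates: column j holds the coordinates of T(v_j)."""
    columns = []
    for j in range(n):
        v_j = [1 if i == j else 0 for i in range(n)]  # coordinates of the j-th basis vector
        image = T(v_j)
        assert len(image) == m, "T must map into an m-dimensional space"
        columns.append(image)
    # Transpose the list of columns into a list of rows.
    return [[columns[j][i] for j in range(n)] for i in range(m)]

def apply(A, x):
    """Matrix-vector product A x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

A = matrix_of(derivative, n=4, m=3)
for row in A:
    print(row)

# p(x) = 5 + 3x - x^2 + 2x^3, so p'(x) = 3 - 2x + 6x^2
p = [5, 3, -1, 2]
print(apply(A, p), derivative(p))
```

Multiplying the coordinate vector of `p` by `A` gives the same coordinates as differentiating `p` directly, which is what the definition of the matrix of $T$ requires.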

## Row space and column space of a matrix.

Suppose, we are to solve a system of linear equations
$
\begin{align}
Ax = b
\end{align}
$

Writing $A = [A_1 A_2 \ldots A_n]$, we have,

$
\begin{align}
[A_1 A_2 \ldots A_n]
\begin{bmatrix}
x_1\\
x_2\\
x_3\\
\vdots\\
x_n
\end{bmatrix} &= b \\
x_1 A_1 + x_2 A_2 + \ldots + x_n A_n &= b
\end{align}
$

Suppose $A$ is not invertible. Then $Ax=b$ is solvable for some right-hand side vectors $b$ and not solvable for others. We want to describe the good right-hand side vectors $b$: those that can be written as a linear combination of the column vectors of $A$. These $b$'s form the column space of $A$.

**Definition**. The *column space* of $A$ is the subspace generated by all linear combinations of the columns of the matrix $A$.

Remember that $Ax = b$ is solvable if and only if $b$ is in the column space of $A$. Since $b \in \mathbb{R}^m$, the column space of $A$ is a subspace of $\mathbb{R}^m$.

**Definition**. Let $A \in \mathbb{R}^{m \times n}$ be a matrix of order $m \times n$ over the field of real numbers $\mathbb{R}$. The subspace of $\mathbb{R}^n$ generated by the row vectors of $A$ is called the *row space* of $A$.

The dimension of the row space of $A$ is called the *row rank* of $A$. The dimension of the column space of $A$ is called the *column rank* of $A$.
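One way to test membership in the column space, sketched below under stated assumptions, is to compare the rank of $A$ with the rank of the augmented matrix $[A \mid b]$: appending $b$ leaves the rank unchanged exactly when $b$ is a linear combination of the columns of $A$. The helpers `rank` and `in_column_space` and the example data are hypothetical names introduced here; the elimination uses exact rational arithmetic.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix via forward elimination with exact fractions."""
    m = [[Fraction(a) for a in row] for row in rows]
    r = 0  # number of pivots found so far
    for c in range(len(m[0])):
        # Look for a row at or below r with a nonzero entry in column c.
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        # Eliminate the entries below the pivot.
        for i in range(r + 1, len(m)):
            f = m[i][c] / m[r][c]
            m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r

def in_column_space(A, b):
    """True when b is a linear combination of the columns of A."""
    augmented = [row + [bi] for row, bi in zip(A, b)]
    return rank(augmented) == rank(A)

# Hypothetical example: the second column is twice the first.
A = [[1, 2],
     [2, 4],
     [3, 6]]

b_good = [1, 2, 3]  # equals the first column of A
b_bad = [1, 0, 0]   # not a multiple of (1, 2, 3)

print(in_column_space(A, b_good), in_column_space(A, b_bad))
```

For this `A`, the column space is the line through $(1,2,3)$ in $\mathbb{R}^3$, so `b_good` yields a solvable system and `b_bad` does not.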


## Elementary matrix operations and elementary matrices.

Solving a system of linear algebraic equations $Ax=b$ is one of the central problems of linear algebra. From high school and university, we are familiar with performing elementary row operations on a matrix $A$, which yield a simplified system of equations that is easier to solve.