# Basic Vector Manipulations

1) Notations

2) Operations

A vector with $n$ elements is denoted like this:

$$
\mathbf{x} = \left[ 
\begin{array}{c}
x_1 \\ x_2 \\ \vdots \\ x_n
\end{array}
\right]
$$

A column vector like this is often referred to as an $n \times 1$ vector ($n$ rows, a single column).

The transpose of this vector is:

$$
\mathbf{x}^T = \left[ 
\begin{array}{cccc}
x_1 & x_2 & \dots & x_n
\end{array}
\right]
$$

A row vector like this is often referred to as a $1 \times n$ vector (a single row, $n$ columns).

**operation / scalar multiplication**

A vector can be multiplied by a scalar $s$ by multiplying each element of the vector by the scalar $s$.

$$
s \cdot \mathbf{x} = s \cdot \left[ 
\begin{array}{c}
x_1 \\ x_2 \\ \vdots \\ x_n
\end{array} \right] = \left[ 
\begin{array}{c}
s \cdot x_1 \\ s \cdot x_2 \\ \vdots \\ s \cdot x_n
\end{array} \right]
$$
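
To make the rule concrete, here is a minimal sketch in plain Python, with vectors stored as lists. The helper name `scale` is an assumption chosen for illustration and does not come from the text.

```python
def scale(s, x):
    """Multiply every element of the vector x by the scalar s."""
    return [s * x_k for x_k in x]

x = [1, -2, 3]
scaled = scale(2, x)
print(scaled)  # [2, -4, 6]
```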

**operation / vector addition**

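Two column vectors $\mathbf{x}$, $\mathbf{y}$ can be added if they have the same number of elements. Addition is element-wise; combined with scalar multiplication by the scalars $c$ and $d$, this gives the general linear combination: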

$$
c \cdot \mathbf{x} + d \cdot \mathbf{y} = \left[
\begin{array}{c}
c \cdot x_1 + d \cdot y_1 \\
c \cdot x_2 + d \cdot y_2 \\
\vdots \\
c \cdot x_n + d \cdot y_n \\
\end{array}
\right]
$$

The addition of two row vectors $\mathbf{x}^T$, $\mathbf{y}^T$ is defined in an analogous way.
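
The expression $c \cdot \mathbf{x} + d \cdot \mathbf{y}$ can be sketched in plain Python as follows. The function name `linear_combination` is a hypothetical helper; plain addition is the special case $c = d = 1$.

```python
def linear_combination(c, x, d, y):
    """Return c*x + d*y element by element; x and y need equal length."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same number of elements")
    return [c * x_k + d * y_k for x_k, y_k in zip(x, y)]

x = [1, 2, 3]
y = [4, 5, 6]

total = linear_combination(1, x, 1, y)   # plain vector addition
combo = linear_combination(2, x, -1, y)  # 2*x - y

print(total)  # [5, 7, 9]
print(combo)  # [-2, -1, 0]
```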

**operation / length of vector**

The length of a row or column vector $\mathbf{x}$ is defined as the *non-negative* square root of the sum of the squared elements of the vector.

$$
|\mathbf{x}| = \sqrt{\sum_{k=1}^{n} x_k^2}
$$
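
A small sketch of this formula, using only the standard library. The helper name `length` is an assumption for illustration.

```python
import math

def length(x):
    """Euclidean length: square root of the sum of squared elements."""
    return math.sqrt(sum(x_k ** 2 for x_k in x))

print(length([3, 4]))  # 5.0
```

The vector $[3, 4]$ is a convenient check because $3^2 + 4^2 = 25$.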

---

**operation / vector multiplication**

There are two forms of vector multiplication:

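1) Vector product (also known as the outer product, not to be confused with the cross product): multiplication of a column vector and a row vector, each having $n$ elements. The result is an $n \times n$ *matrix*.

2) Scalar product (also known as the dot product or inner product): multiplication of a row vector and a column vector, each having $n$ elements. The result is a *scalar value*.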

*Vector product*

$$
\mathbf{z} = \mathbf{x} \cdot \mathbf{y}^T =
\left[
\begin{array}{c}
x_1 \\
x_2 \\
\vdots \\
x_n \\
\end{array}
\right] \cdot \left[
\begin{array}{cccc}
y_1 & y_2 & \dots & y_n
\end{array}
\right] = \left[
\begin{array}{cccc}
x_1 \cdot y_1 & x_1 \cdot y_2 & \dots & x_1 \cdot y_n \\
x_2 \cdot y_1 & x_2 \cdot y_2 & \dots & x_2 \cdot y_n \\
\vdots & \vdots & \ddots & \vdots \\
x_n \cdot y_1 & x_n \cdot y_2 & \dots & x_n \cdot y_n \\
\end{array}
\right]
$$
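
The vector product defined above (often called the outer product) might be sketched like this in plain Python. The name `outer` is a hypothetical helper, and the matrix is stored as a list of rows.

```python
def outer(x, y):
    """Return the matrix with entries x_i * y_j (row i, column j)."""
    return [[x_i * y_j for y_j in y] for x_i in x]

z = outer([1, 2, 3], [4, 5, 6])
print(z)  # [[4, 5, 6], [8, 10, 12], [12, 15, 18]]
```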

*Scalar product*

$$
z = \mathbf{x}^T \cdot \mathbf{y} =
\left[
\begin{array}{cccc}
x_1 & x_2 & \dots & x_n
\end{array}
\right] \cdot \left[
\begin{array}{c}
y_1 \\ y_2 \\ \vdots \\ y_n
\end{array}
\right] = \sum_{k=1}^n x_k \cdot y_k
$$
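
As a quick illustration of the scalar product, the sum $\sum_k x_k \cdot y_k$ can be written directly in plain Python. The helper name `dot` is an assumption for this sketch.

```python
def dot(x, y):
    """Scalar product: sum of the element-wise products of x and y."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same number of elements")
    return sum(x_k * y_k for x_k, y_k in zip(x, y))

print(dot([1, 2, 3], [4, 5, 6]))  # 32
```

Note that `dot(x, x)` is the squared length of `x`, which ties the scalar product back to the length defined earlier.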

## Linear equations

A system of linear equations in an unknown vector $\mathbf{x}$ with $m$ components can be expressed by the following equation:

(adapted from: `Mathematics for Machine Learning` by Marc Peter Deisenroth, A. Aldo Faisal, and Cheng Soon Ong, Cambridge University Press)

$$
\left[ 
\begin{array}{c}
a_{(1,1)} \\ a_{(2,1)} \\ \vdots \\ a_{(n,1)}
\end{array} \right] \cdot x_1 +
\left[ 
\begin{array}{c}
a_{(1,2)} \\ a_{(2,2)} \\ \vdots \\ a_{(n,2)}
\end{array} \right] \cdot x_2 + \dots +
\left[ 
\begin{array}{c}
a_{(1,m)} \\ a_{(2,m)} \\ \vdots \\ a_{(n,m)}
\end{array} \right] \cdot x_m =
\left[ 
\begin{array}{c}
b_1 \\ b_2 \\ \vdots \\ b_n
\end{array} \right] 
$$

The *target* vector $\mathbf{b}$ (the right-hand side) has $n$ elements, whereas the vector $\mathbf{x}$ contains $m$ unknowns. In general $n \ne m$: with fewer equations than unknowns ($n < m$) the system is underdetermined, with more equations than unknowns ($n > m$) it is overdetermined.

Matrix notation gives a more compact form:

$$
\left[ 
\begin{array}{cccc}
a_{(1,1)} & a_{(1,2)} & \dots & a_{(1,m)} \\
a_{(2,1)} & a_{(2,2)} & \dots & a_{(2,m)} \\
\vdots & \vdots & \ddots & \vdots \\
a_{(n,1)} & a_{(n,2)} & \dots & a_{(n,m)} \\
\end{array} \right] \cdot \left[ 
\begin{array}{c}
x_1 \\ x_2 \\ \vdots \\ x_m
\end{array} \right]
= \left[ 
\begin{array}{c}
b_1 \\ b_2 \\ \vdots \\ b_n
\end{array} \right]
$$
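
The column view of the system can be checked numerically. The sketch below builds $\mathbf{b}$ as a weighted sum of the columns of the coefficient matrix; the example values ($n = 3$, $m = 2$) are made up for illustration.

```python
# Example system with n = 3 equations and m = 2 unknowns (illustrative values).
A = [[1, 2],
     [3, 4],
     [5, 6]]
x = [1, -1]

n, m = len(A), len(A[0])

# Column view: b = x_1 * (column 1 of A) + ... + x_m * (column m of A)
b = [0] * n
for j in range(m):
    column_j = [A[i][j] for i in range(n)]
    b = [b_i + x[j] * a_ij for b_i, a_ij in zip(b, column_j)]

print(b)  # [-1, -1, -1]
```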

## Matrix * Vector Multiplication

$$
\underbrace{
\left[ 
\begin{array}{cccc}
a_{(1,1)} & a_{(1,2)} & \dots & a_{(1,m)} \\
a_{(2,1)} & a_{(2,2)} & \dots & a_{(2,m)} \\
\vdots & \vdots & \ddots & \vdots \\
a_{(n,1)} & a_{(n,2)} & \dots & a_{(n,m)} \\
\end{array} \right]}_{\mathbf{A}} \cdot \underbrace{\left[ 
\begin{array}{c}
x_1 \\ x_2 \\ \vdots \\ x_m
\end{array} \right]}_{\mathbf{x}}
= \underbrace{\left[ 
\begin{array}{c}
y_1 \\ y_2 \\ \vdots \\ y_n
\end{array} \right]}_{\mathbf{y}}
$$

Multiplying an $n \times m$ matrix $\mathbf{A}$ from the right by an $m$-element column vector $\mathbf{x}$ yields an $n$-element column vector $\mathbf{y}$.

Each element $y_k$ $(1 \le k \le n)$ of column vector $\mathbf{y}$ is computed according to this formula:

$$
y_k = \sum_{l=1}^{m} a_{(k, l)} \cdot x_l
$$
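
The formula for $y_k$ translates almost literally into code. This is a minimal sketch with the matrix stored as a list of rows; the helper name `matvec` is an assumption.

```python
def matvec(A, x):
    """Return y with y_k = sum over l of A[k][l] * x[l]."""
    if any(len(row) != len(x) for row in A):
        raise ValueError("each row of A needs as many entries as x has elements")
    return [sum(a_kl * x_l for a_kl, x_l in zip(row, x)) for row in A]

A = [[1, 2, 3],
     [4, 5, 6]]   # 2 x 3 matrix
x = [1, 0, -1]    # 3 elements

y = matvec(A, x)
print(y)  # [-2, -2]
```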

## Matrix - Matrix Addition

Given two $n \times m$ matrices $\mathbf{A}$ and $\mathbf{B}$, matrix addition is defined *element-wise*:

$$
\mathbf{C} = \mathbf{A} + \mathbf{B} =
\left[ 
\begin{array}{cccc}
a_{(1,1)} + b_{(1,1)} & a_{(1,2)} + b_{(1,2)}  & \dots & a_{(1,m)} + b_{(1,m)} \\
a_{(2,1)} + b_{(2,1)} & a_{(2,2)} + b_{(2,2)} & \dots & a_{(2,m)} + b_{(2,m)} \\
\vdots & \vdots & \ddots & \vdots \\
a_{(n,1)} + b_{(n,1)} & a_{(n,2)} + b_{(n,2)} & \dots & a_{(n,m)} + b_{(n,m)} \\
\end{array} \right]
$$

Because the addition of scalars is commutative, matrix addition is commutative as well:

$$
\mathbf{C} = \mathbf{A} + \mathbf{B} = \mathbf{B} + \mathbf{A}
$$
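
A short sketch of element-wise addition and a check of commutativity on one example. The helper name `mat_add` is hypothetical; matrices are lists of rows.

```python
def mat_add(A, B):
    """Element-wise sum of two matrices of the same shape."""
    if len(A) != len(B) or any(len(ra) != len(rb) for ra, rb in zip(A, B)):
        raise ValueError("matrices must have the same shape")
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

A = [[1, 2], [3, 4]]
B = [[10, 20], [30, 40]]

C = mat_add(A, B)
print(C)                   # [[11, 22], [33, 44]]
print(C == mat_add(B, A))  # True
```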

## Matrix * Matrix Multiplication

Matrix multiplication like

$$
\underbrace{\mathbf{C}}_{m \times k} = \underbrace{\mathbf{A}}_{m \times n} \cdot \underbrace{\mathbf{B}}_{n \times k}
$$

is defined if matrix $\mathbf{A}$ is of type $m \times n$ and matrix $\mathbf{B}$ is of type $n \times k$. The resulting matrix $\mathbf{C}$ is of type $m \times k$.

An element $c_{(i, j)}$ of matrix $\mathbf{C}$ is computed as the *dot product* of the $i$-th row vector of $\mathbf{A}$ and the $j$-th column vector of $\mathbf{B}$.

$$
c_{(i,\ j)} = \sum_{l=1}^{n} a_{(i, l)} \cdot b_{(l, j)}
$$

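In general, matrix multiplication is not commutative: $\mathbf{A} \cdot \mathbf{B} \ne \mathbf{B} \cdot \mathbf{A}$. For non-square matrices the product $\mathbf{B} \cdot \mathbf{A}$ may not even be defined.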
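
The element formula and the lack of commutativity can both be seen in a small example. The sketch below uses a hypothetical helper `matmul` on lists of rows; the two matrices are arbitrary illustrative values.

```python
def matmul(A, B):
    """Return C = A * B with c[i][j] = sum over l of A[i][l] * B[l][j]."""
    n = len(B)
    if any(len(row) != n for row in A):
        raise ValueError("columns of A must match rows of B")
    k = len(B[0])
    return [[sum(A[i][l] * B[l][j] for l in range(n)) for j in range(k)]
            for i in range(len(A))]

A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]

AB = matmul(A, B)
BA = matmul(B, A)
print(AB)  # [[2, 1], [4, 3]]
print(BA)  # [[3, 4], [1, 2]]
```

Here multiplying by `B` from the right swaps the columns of `A`, while multiplying from the left swaps its rows, so the two products differ.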




## Matrix Properties


### Associativity

$$
\underbrace{\mathbf{D}}_{m \times p} = \left(\underbrace{\mathbf{A}}_{m \times n} \cdot \underbrace{\mathbf{B}}_{n \times k} \right) \cdot \underbrace{\mathbf{C}}_{k \times p} = \underbrace{\mathbf{A}}_{m \times n} \cdot \left(\underbrace{\mathbf{B}}_{n \times k} \cdot \underbrace{\mathbf{C}}_{k \times p} \right) 
$$

**Proof**

We define the matrices $\mathbf{W}$ and $\mathbf{Q}$:

$$
\underbrace{\mathbf{W}}_{m \times k} = \mathbf{A} \cdot \mathbf{B}
$$

and

$$
\underbrace{\mathbf{Q}}_{n \times p} = \mathbf{B} \cdot \mathbf{C}
$$

First we derive an expression for the elements of matrix $\mathbf{D}$ by computing the product $\mathbf{W} \cdot \mathbf{C}$. Then we derive a second expression for the same elements by computing the product $\mathbf{A} \cdot \mathbf{Q}$.

Associativity holds if both procedures result in identical expressions for the elements of $\mathbf{D}$.

Element $w_{(i, j)}$ of $\mathbf{W}$ is obtained from the equation:

$$
w_{(i,\ j)} = \sum_{l=1}^{n} a_{(i, l)} \cdot b_{(l, j)}
$$

Element $d_{(i_1, j_1)}$ of $\mathbf{D}$ is obtained from equation:

$$
d_{(i_1, j_1)} = \sum_{l_1=1}^{k} w_{(i_1, l_1)} \cdot c_{(l_1, j_1)} = \sum_{l_1=1}^{k} \sum_{l=1}^{n} a_{(i_1, l)} \cdot b_{(l, l_1)} \cdot c_{(l_1, j_1)}
$$

----

Element $q_{(i, j)}$ of $\mathbf{Q}$ is obtained from the equation:

$$
q_{(i, j)} = \sum_{h=1}^{k} b_{(i, h)} \cdot c_{(h, j)}
$$

Element $d_{(i_1, j_1)}$ of $\mathbf{D}$ is obtained from equation:

$$
d_{(i_1, j_1)} = \sum_{l_2=1}^{n} a_{(i_1, l_2)} \cdot q_{(l_2, j_1)} = \sum_{h=1}^{k} \sum_{l_2=1}^{n} a_{(i_1, l_2)} \cdot b_{(l_2, h)} \cdot c_{(h, j_1)}
$$

*Conclusion* 

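Both double sums contain exactly the same terms. They differ only in the names of the summation indices ($l$ and $l_2$ run from $1$ to $n$, $l_1$ and $h$ run from $1$ to $k$) and in the order of summation, which does not matter for finite sums. This proves the associativity rule.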
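
A numerical spot check does not replace the proof, but it can make the statement tangible. The sketch below evaluates both groupings for one arbitrary example; `matmul` is a hypothetical helper operating on lists of rows.

```python
def matmul(A, B):
    """Return A * B with entries sum over l of A[i][l] * B[l][j]."""
    n = len(B)
    if any(len(row) != n for row in A):
        raise ValueError("columns of A must match rows of B")
    return [[sum(A[i][l] * B[l][j] for l in range(n)) for j in range(len(B[0]))]
            for i in range(len(A))]

A = [[1, 2, 3], [4, 5, 6]]    # 2 x 3
B = [[1, 0], [0, 1], [1, 1]]  # 3 x 2
C = [[2, 1], [0, 3]]          # 2 x 2

left = matmul(matmul(A, B), C)   # (A * B) * C
right = matmul(A, matmul(B, C))  # A * (B * C)

print(left)   # [[8, 19], [20, 43]]
print(right)  # [[8, 19], [20, 43]]
```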

---

### Distributivity

Let matrices $\mathbf{A}$ and $\mathbf{B}$ be of type $m \times n$ and matrix $\mathbf{C}$ of type $n \times k$. Then

$$
\mathbf{D} = \left(\mathbf{A} + \mathbf{B} \right) \cdot \mathbf{C} = \mathbf{A} \cdot \mathbf{C} + \mathbf{B} \cdot \mathbf{C}
$$

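Distributivity can be understood almost intuitively: each element of $\mathbf{D}$ is a sum of products $(a_{(i, l)} + b_{(i, l)}) \cdot c_{(l, j)}$, which splits into the corresponding elements of $\mathbf{A} \cdot \mathbf{C}$ and $\mathbf{B} \cdot \mathbf{C}$. A formal proof is therefore unnecessary.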
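
For readers who prefer a concrete check, the following sketch compares both sides of the distributivity rule on one example. The helpers `mat_add` and `matmul` are assumptions for illustration, and the matrices are arbitrary.

```python
def mat_add(A, B):
    """Element-wise sum of two matrices of the same shape."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def matmul(A, B):
    """Return A * B with entries sum over l of A[i][l] * B[l][j]."""
    return [[sum(A[i][l] * B[l][j] for l in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]
C = [[1, 2], [3, 4]]

lhs = matmul(mat_add(A, B), C)             # (A + B) * C
rhs = mat_add(matmul(A, C), matmul(B, C))  # A * C + B * C

print(lhs)  # [[10, 14], [16, 24]]
print(rhs)  # [[10, 14], [16, 24]]
```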

### Multiplication with Identity Matrix

Let $\mathbf{A}$ be an $m \times m$ square matrix and $\mathbf{I_m}$ the $m \times m$ identity matrix. Then:

$$
\mathbf{I_m} \cdot \mathbf{A} = \mathbf{A} \cdot \mathbf{I_m} = \mathbf{A}
$$

No proof required.
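
A brief sketch that builds an identity matrix and confirms that multiplying by it from either side leaves a square matrix unchanged. The helper names `identity` and `matmul` are hypothetical.

```python
def identity(m):
    """m x m matrix with ones on the diagonal and zeros elsewhere."""
    return [[1 if i == j else 0 for j in range(m)] for i in range(m)]

def matmul(A, B):
    """Return A * B with entries sum over l of A[i][l] * B[l][j]."""
    return [[sum(A[i][l] * B[l][j] for l in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

A = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
I3 = identity(3)

print(matmul(I3, A) == A)  # True
print(matmul(A, I3) == A)  # True
```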

---

## Transpose of a matrix

Defining indices $m,\ n,\ k$ and their ranges:

$1 \le m \le M$, $1 \le n \le N$ and $1 \le k \le K$ 

Let $\mathbf{A}$ be an $M \times N$ matrix with elements $a_{(m, n)}$. The transpose $\mathbf{A}^T$ of $\mathbf{A}$ is obtained by swapping the element indices.

$$
\underbrace{\mathbf{B}}_{N \times M} = \mathbf{A}^T
$$

$$
b_{(n, m)} = a_{(m, n)}
$$
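
Swapping the indices can be sketched in one line of plain Python by regrouping the rows with `zip`. The helper name `transpose` is an assumption for illustration.

```python
def transpose(A):
    """Return B with B[n][m] = A[m][n]; rows of A become columns of B."""
    return [list(column) for column in zip(*A)]

A = [[1, 2, 3],
     [4, 5, 6]]   # 2 x 3

B = transpose(A)  # 3 x 2
print(B)  # [[1, 4], [2, 5], [3, 6]]
```

Transposing twice returns the original matrix, which is a handy sanity check.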

---

### Some important rules

### Transpose of sum of matrices

$$
\left( \mathbf{A} + \mathbf{B} \right)^T = \mathbf{A}^T + \mathbf{B}^T
$$
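
The sum rule can be spot-checked on a small example. The helpers `transpose` and `mat_add` are hypothetical names used only in this sketch.

```python
def transpose(A):
    """Rows of A become columns of the result."""
    return [list(column) for column in zip(*A)]

def mat_add(A, B):
    """Element-wise sum of two matrices of the same shape."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

A = [[1, 2, 3], [4, 5, 6]]
B = [[6, 5, 4], [3, 2, 1]]

lhs = transpose(mat_add(A, B))             # (A + B)^T
rhs = mat_add(transpose(A), transpose(B))  # A^T + B^T

print(lhs)  # [[7, 7], [7, 7], [7, 7]]
print(rhs)  # [[7, 7], [7, 7], [7, 7]]
```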

---

### Transpose of product of matrices

$$
\mathbf{D} = \left( \underbrace{\mathbf{A}}_{M \times N} \cdot \underbrace{\mathbf{B}}_{N \times K} \right)^T
$$

Let the $M \times K$ matrix $\mathbf{C}$ be defined by:

$$
\mathbf{C} = \mathbf{A} \cdot \mathbf{B}
$$

and therefore

$$
\mathbf{D} = \mathbf{C}^T
$$

The elements $c_{(m, k)}$ of matrix $\mathbf{C}$ are:

$$
c_{(m,\ k)} = \sum_{n=1}^{N} a_{(m, n)} \cdot b_{(n, k)}
$$

and the elements $d_{(k, m)}$ of matrix $\mathbf{D}$:

$$
d_{(k, m)} = c_{(m, k)}
$$

Elements of $\mathbf{B}^T$ are 

$$
b'_{(k,n)} = b_{(n, k)}
$$

and elements of $\mathbf{A}^T$ are

$$
a'_{(n, m)} = a_{(m, n)}
$$

We now compute matrix $\mathbf{F}$ defined by equation:

$$
\underbrace{\mathbf{F}}_{K \times M} = \mathbf{B}^T \cdot \mathbf{A}^T
$$

Elements of $\mathbf{F}$ are computed:

$$
f_{(k, m)} = \sum_{n=1}^{N} b'_{(k,n)} \cdot a'_{(n, m)} = \sum_{n=1}^{N} b_{(n, k)} \cdot a_{(m, n)} = c_{(m, k)} = d_{(k, m)}
$$

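Since the elements of $\mathbf{F}$ and $\mathbf{D}$ agree for every index pair $(k, m)$, the two matrices are equal. This proves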

$$
\left( \mathbf{A} \cdot \mathbf{B} \right)^T = \mathbf{B}^T \cdot \mathbf{A}^T  
$$
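
As a complement to the proof, the rule can be verified numerically for one example. The sketch uses hypothetical helpers `transpose` and `matmul` on lists of rows; note the reversed order of the factors on the right-hand side.

```python
def transpose(A):
    """Rows of A become columns of the result."""
    return [list(column) for column in zip(*A)]

def matmul(A, B):
    """Return A * B with entries sum over n of A[m][n] * B[n][k]."""
    return [[sum(A[m][n] * B[n][k] for n in range(len(B)))
             for k in range(len(B[0]))] for m in range(len(A))]

A = [[1, 2, 3], [4, 5, 6]]    # 2 x 3
B = [[1, 0], [0, 1], [1, 1]]  # 3 x 2

D = transpose(matmul(A, B))             # (A * B)^T
F = matmul(transpose(B), transpose(A))  # B^T * A^T

print(D)  # [[4, 10], [5, 11]]
print(F)  # [[4, 10], [5, 11]]
```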

---


## Orthogonality

