---
# Section 2.1: Vector and Matrix Norms
---

## Motivation

When solving $Ax = b$ numerically, we get a solution $\hat{x}$ that satisfies

$$
\hat{A} \hat{x} = \hat{b}.
$$

We hope that $\hat{A} \approx A$ and $\hat{b} \approx b$.

We therefore need a way to measure the distance between two vectors and between two matrices.

---

## Vector norms

A (vector) **norm** is a function 

$$
\|\cdot\| : \mathbb{R}^n \rightarrow \mathbb{R}
$$

that satisfies

1. $\|x\| > 0$ if $x \neq 0$ and $\|0\| = 0$
2. $\|\alpha x\| = |\alpha| \|x\|$ for all $\alpha \in \mathbb{R}$
3. $\|x + y\| \leq \|x\| + \|y\| \qquad$ (triangle inequality)

The **distance** between vectors $x$ and $y$ can then be measured by

$$
\mathrm{dist}(x,y) = \| x - y \|.
$$

---

## The $p$-norm

For $1 \leq p < \infty$,

$$
\|x\|_p = \left( \sum_{i=1}^n \left|x_i\right|^p \right)^{1/p},
$$

and $\|x\|_\infty = \max_i |x_i|$ is the limit of $\|x\|_p$ as $p \to \infty$.

Some examples are:

$$
\begin{array}{llclll}
\|x\|_1 &=& \displaystyle{\sum_{i=1}^n |x_i|} &=& \texttt{norm(x, 1)} \qquad &\text{(Manhattan norm)} \\
\|x\|_2 &=& \displaystyle{\sqrt{\sum_{i=1}^n |x_i|^2}} &=& \texttt{norm(x)} \qquad &\text{(Euclidean norm)} \\
\|x\|_\infty &=& \displaystyle{\max_{i=1,\ldots,n} |x_i|} &=& \texttt{norm(x, Inf)} \qquad &\text{(max-norm)} \\
\end{array}
$$

The **unit $p$-norm ball** is the set of vectors having $p$-norm at most $1$:

$$
B_p = \big\{ x \in \mathbb{R}^n : \lVert x \rVert_p \leq 1 \big\}.
$$
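
As a rough illustration, the definition can be coded directly and compared with `norm` from `LinearAlgebra`. This is only a sketch, not how `norm` is implemented, and the helper names `pnorm` and `inball` are made up for this example.

```julia
using LinearAlgebra

# Hypothetical helper: the p-norm straight from the definition (1 ≤ p < ∞),
# with p = Inf handled separately as the largest absolute entry.
pnorm(x, p) = isinf(p) ? maximum(abs, x) : sum(abs.(x) .^ p)^(1 / p)

# Hypothetical helper: membership test for the unit p-norm ball B_p.
inball(x, p) = pnorm(x, p) <= 1

x = [0.6, -0.6]

pnorm(x, 1) ≈ norm(x, 1)        # sum of absolute values
pnorm(x, 2) ≈ norm(x, 2)        # Euclidean length
pnorm(x, Inf) == norm(x, Inf)   # largest absolute entry

# x lies outside B_1 but inside B_2 and B_∞
(inball(x, 1), inball(x, 2), inball(x, Inf))
```

Since $\|x\|_\infty \leq \|x\|_2 \leq \|x\|_1$ for every $x$, the balls are nested: $B_1 \subseteq B_2 \subseteq B_\infty$.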

---

## `norm` of a vector

In [None]:
using LinearAlgebra

In [None]:
?norm

In [None]:
x = Float64[1, -2, 1]

In [None]:
y = Float64[0.9, -1.9, 1.1]

In [None]:
# The 1-norm of x
norm(x, 1)

In [None]:
# The "0-norm" counts the nonzero entries of x (not a true norm: it is not homogeneous)
norm([1, 0, 0, 0, 1], 0)

In [None]:
# The Manhattan distance between x and y
norm(x - y, 1)

In [None]:
x

In [None]:
# The Euclidean norm of x
norm(x, 2)

In [None]:
sqrt(1 + 4 + 1)

In [None]:
# The default norm is the Euclidean norm
norm(x)

In [None]:
# The Euclidean distance between x and y
norm(x - y)

In [None]:
x

In [None]:
# The max-norm of x
norm(x, Inf)

In [None]:
# The max-norm distance between x and y
norm(x - y, Inf)

---

## The $A$-norm

$$
\|x\|_A = \sqrt{x^T A x}
$$

where $A$ is an $n \times n$ positive definite matrix.

For example, when $A = I$, we have the usual Euclidean norm:

$$
\|x\|_I = \|x\|_2.
$$
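
A small sketch of the definition, assuming an illustrative positive definite matrix `S` and vector `v` (both chosen only for this example). It also checks the identity $\|v\|_S = \|Rv\|_2$, where $S = R^TR$ is the Cholesky factorization.

```julia
using LinearAlgebra

# Illustrative symmetric positive definite matrix and vector (chosen for this sketch)
S = [2.0 1.0; 1.0 2.0]
v = [1.0, -1.0]

# Direct evaluation of the definition sqrt(v' S v)
snorm = sqrt(v' * S * v)

# The Cholesky factorization S = R'R gives v' S v = (R v)' (R v),
# so the S-norm of v is the Euclidean norm of R v.
R = cholesky(S).U
snorm ≈ norm(R * v)

# With the identity matrix, the definition reduces to the Euclidean norm
Id2 = [1.0 0.0; 0.0 1.0]
sqrt(v' * Id2 * v) ≈ norm(v)
```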

---

## The resistance distance on a social network

A practical example of the $A$-norm is the **resistance distance** between individuals on a **social network**. 

A social network can be represented as a **graph** where individuals are **nodes** which are connected by an **edge** if they are friends.

This graph can be represented using an **adjacency matrix** $A = [a_{ij}]$ where $a_{ij} = 1$ if $i$ and $j$ are friends, otherwise $a_{ij} = 0$; you cannot be friends with yourself, so $a_{ii} = 0$.

The **Laplacian matrix** $L = [l_{ij}]$ of the graph has $l_{ii} = \deg(i)$ (i.e., the number of friends of $i$), and $l_{ij} = -a_{ij}$ for $i \neq j$. That is,

$$
L = \mathrm{Diag}(Ae) - A,
$$

where $e$ is the vector of all ones, and $\mathrm{Diag}(Ae)$ is the diagonal matrix with the vector $Ae$ on its diagonal.

The **resistance distance** between $i$ and $j$ is then given by

$$
\mathrm{dist}(i,j) = \|e_i - e_j\|_B,  \qquad \text{where $B = (L + ee^T)^{-1}$}.
$$

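Here $e_i$ and $e_j$ are the $i^\mathrm{th}$ and $j^\mathrm{th}$ columns of the identity matrix. (Note: when the graph is connected, it can be shown that $L + ee^T$ is positive definite, which implies that $(L + ee^T)^{-1}$ is positive definite. If the graph is disconnected, $L + ee^T$ is singular, so the random graph generated below must be connected for the following cells to succeed.)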
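
Before trying a random graph, it may help to check the formula on a graph small enough to work out by hand. The sketch below assumes a hypothetical three-person network forming the path 1 - 2 - 3; the names `Apath`, `Lpath`, `Bpath`, `E3`, and `rdist` are introduced only for this example.

```julia
using LinearAlgebra

# Path graph 1 - 2 - 3 (a hypothetical three-person network, connected)
Apath = [0.0 1.0 0.0;
         1.0 0.0 1.0;
         0.0 1.0 0.0]

epath = ones(3)
Lpath = diagm(Apath * epath) - Apath   # Laplacian: degrees on the diagonal, minus adjacency
Bpath = inv(Lpath + epath * epath')

# Columns of the 3×3 identity matrix
E3 = diagm(ones(3))

# Resistance distance as defined above: the B-norm of e_i - e_j
rdist(i, j) = sqrt((E3[:, i] - E3[:, j])' * Bpath * (E3[:, i] - E3[:, j]))

rdist(1, 2)   # ≈ 1        (direct friends)
rdist(1, 3)   # ≈ sqrt(2)  (connected only through node 2)
```

Here $L + ee^T$ has rows $(2, 0, 1)$, $(0, 3, 0)$, $(1, 0, 2)$, so its inverse can be written down exactly and the two values confirmed by hand.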

In [None]:
n = 6
A = Symmetric(rand(n, n))
A = round.(A)
A = A - diagm(diag(A))  # Make the diagonal zero

In [None]:
e = ones(n)
L = diagm(A*e) - A

In [None]:
L + e*e'

In [None]:
isposdef(L + e*e')

In [None]:
B = inv(L + e*e')

In [None]:
cholesky(Symmetric(B))

In [None]:
eigvals(Symmetric(B))

In [None]:
isposdef(Symmetric(B))

In [None]:
# Define the resistance norm
resnorm(x) = sqrt(x' * B * x)

# Form the matrix of distances between all nodes
# (E holds the columns of the identity; the name avoids shadowing LinearAlgebra.I)
E = diagm(ones(n))
D = Float64[resnorm(E[:,i] - E[:,j]) for i = 1:n, j = 1:n]

In [None]:
D[3,4]

In [None]:
D[3,6]

In [None]:
D[3,4] > D[3,6]

In [None]:
using GraphPlot, Graphs  # Graphs.jl is the successor to the archived LightGraphs.jl

In [None]:
g = Graph(A)
gplot(g, nodelabel=1:n)

---

## The Euclidean norm is a norm

To prove that the **Euclidean norm** is indeed a norm, we need to show it satisfies the **triangle inequality**:

$$
\|x + y\|_2 \leq \|x\|_2 + \|y\|_2.
$$

We will prove this using the following fundamental result.

> ### Theorem: (Cauchy-Schwarz Inequality)
>
> $$\left|x^Ty\right| \leq \|x\|_2 \|y\|_2, \qquad \forall x, y \in \mathbb{R}^n.$$

### Proof of the triangle inequality.

Let $x, y \in \mathbb{R}^n$. Then

$$
\begin{align}
\|x + y\|_2^2 
&= (x+y)^T(x+y) \\
&= x^Tx + x^Ty + y^Tx + y^Ty \\
&= \|x\|_2^2 + 2x^Ty + \|y\|_2^2 \\
&\leq \|x\|_2^2 + 2\|x\|_2\|y\|_2 + \|y\|_2^2 \qquad \text{(Cauchy-Schwarz inequality)} \\
&= \big(\|x\|_2 + \|y\|_2\big)^2. \\
\end{align}
$$

Taking the square root of both sides, we obtain $\|x + y\|_2 \leq \|x\|_2 + \|y\|_2$. $\blacksquare$
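
The inequality can be spot-checked numerically. The sketch below reuses the vectors `x` and `y` from the earlier cells (redefined here so the block is self-contained) and also shows a case of equality.

```julia
using LinearAlgebra

x = [1.0, -2.0, 1.0]
y = [0.9, -1.9, 1.1]

# Left and right sides of the triangle inequality in the Euclidean norm
lhs = norm(x + y)
rhs = norm(x) + norm(y)
lhs <= rhs

# Equality holds when one vector is a nonnegative multiple of the other
norm(x + 2x) ≈ norm(x) + norm(2x)
```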

---

Now let's see a proof of the Cauchy-Schwarz inequality.

### Proof of Cauchy-Schwarz.

Let $x, y \in \mathbb{R}^n$ and $t \in \mathbb{R}$. Then

$$
\begin{align}
0 \leq \|x + ty\|_2^2 
&= (x + ty)^T(x + ty) \\
&= x^T x + 2tx^Ty + t^2y^Ty \\
&= \|x\|_2^2 + \big(2x^Ty\big) t + \|y\|_2^2 t^2  \\
&= c + bt + at^2,
\end{align}
$$

where 

$$
a = \|y\|_2^2, \qquad
b = 2x^Ty, \qquad 
c = \|x\|_2^2.
$$

$\therefore$ $at^2 + bt + c \geq 0$, $\forall t \in \mathbb{R}$.

This implies that the equation

$$
at^2 + bt + c = 0
$$

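either has no real solution or exactly one real solution whenever $y \neq 0$ (so that $a > 0$): with two distinct real roots, the quadratic would be negative between them. By the **quadratic formula**,

$$
t = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a},
$$

we must have $b^2 - 4ac \leq 0$. If $y = 0$, then $a = b = 0$, so $b^2 - 4ac = 0$ and the inequality still holds.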

Thus, 

$$
\big(2x^Ty\big)^2 - 4 \|y\|_2^2 \|x\|_2^2 \leq 0,
$$

which simplifies to

$$
\left|x^Ty\right| \leq \|x\|_2 \|y\|_2. \qquad \blacksquare
$$
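
To make the argument concrete, the coefficients $a$, $b$, $c$ can be computed for specific vectors. This is only a numerical illustration of the proof, using the vectors `x` and `y` from the earlier cells and an arbitrary sample value of `t`.

```julia
using LinearAlgebra

x = [1.0, -2.0, 1.0]
y = [0.9, -1.9, 1.1]

# Coefficients of the quadratic a t^2 + b t + c = ‖x + t y‖₂²
a = norm(y)^2
b = 2 * dot(x, y)
c = norm(x)^2

# The quadratic agrees with the squared norm at an arbitrary t
t = 0.37
a * t^2 + b * t + c ≈ norm(x + t * y)^2

# Nonpositive discriminant, which is equivalent to the Cauchy-Schwarz inequality
b^2 - 4 * a * c <= 0
abs(dot(x, y)) <= norm(x) * norm(y)
```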

---

## Matrix norms

A **matrix norm** is a function 

$$
\|\cdot\| : \mathbb{R}^{n \times n} \rightarrow \mathbb{R}
$$

that satisfies

1. $\|A\| > 0$ if $A \neq 0$ and $\|0\| = 0$
2. $\|\alpha A\| = |\alpha| \|A\|$ for all $\alpha \in \mathbb{R}$
3. $\|A + B\| \leq \|A\| + \|B\|$
4. $\|AB\| \leq \|A\|\|B\| \qquad$ (submultiplicativity)

The **distance** between matrices $A$ and $B$ can then be measured by

$$
\mathrm{dist}(A, B) = \| A - B \|.
$$
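
Property 4 is what separates a matrix norm from a vector norm applied to the entries. The sketch below checks it for Julia's `norm` on a matrix (the Frobenius norm, introduced next) and shows that a hypothetical candidate, the largest absolute entry (called `maxentry` here), fails it. The matrices are chosen only for illustration.

```julia
using LinearAlgebra

P = [1.0 2.0; 3.0 4.0]
Q = [0.0 1.0; -1.0 2.0]

# norm(M) on a matrix is the Frobenius norm, used here as the matrix norm
norm(P * Q) <= norm(P) * norm(Q)      # submultiplicativity
norm(P + Q) <= norm(P) + norm(Q)      # triangle inequality

# Hypothetical candidate: the largest absolute entry. It satisfies
# properties 1-3 but not property 4, so it is not a matrix norm in this sense.
maxentry(M) = maximum(abs, M)

J = ones(2, 2)
maxentry(J * J)               # 2.0
maxentry(J) * maxentry(J)     # 1.0
```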

---

### The Frobenius norm

$$
\|A\|_F = \sqrt{\sum_{i=1}^n \sum_{j=1}^n |a_{ij}|^2} = \texttt{norm(A)}
$$
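
As a quick check of the formula, the sketch below computes the Frobenius norm of a small illustrative matrix `M` in three equivalent ways: entry by entry, as the Euclidean norm of the stacked entries, and through the trace identity $\|M\|_F^2 = \mathrm{tr}(M^TM)$.

```julia
using LinearAlgebra

M = [2.0 -1.0; 0.0 2.0]

# Entry-by-entry definition: sqrt(4 + 1 + 0 + 4) = 3
frob = sqrt(sum(abs2, M))

frob ≈ norm(M)                 # norm of a matrix is the Frobenius norm
frob ≈ norm(vec(M))            # Euclidean norm of the entries stacked into a vector
frob ≈ sqrt(tr(M' * M))        # trace identity
```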

In [None]:
A = [1 2; 3 4.0]

In [None]:
norm(A)

In [None]:
sqrt(30)

---

### Exercise

Compute $\lVert I \rVert_F$, where $I$ is the $n \times n$ identity matrix.

---

In [None]:
Id = diagm(ones(4))  # named Id to avoid shadowing LinearAlgebra.I

In [None]:
norm(Id)

---

## Induced Matrix Norms

Let $\|\cdot\| : \mathbb{R}^n \rightarrow \mathbb{R}$ be a vector norm.

The **induced matrix norm** (a.k.a. the **operator norm**) is defined as

$$
\|A\| = \max_{x \neq 0} \frac{\|Ax\|}{\|x\|} = \max_{\|x\| = 1} \|Ax\|.
$$

The induced matrix norm is a matrix norm: it satisfies all four properties above, including submultiplicativity.
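
The maximization in the definition can be approximated by brute force in two dimensions. The sketch below, which assumes the Euclidean vector norm and an illustrative matrix `C`, samples unit vectors on the circle and compares the largest value of $\|Cx\|_2$ with `opnorm(C)`, the built-in induced 2-norm from `LinearAlgebra`. It is meant only to illustrate the definition, not as a practical algorithm.

```julia
using LinearAlgebra

C = [1.0 3.0; -2.0 0.0]

# Sample unit vectors x(θ) = (cos θ, sin θ) on the Euclidean unit circle
θs = range(0, 2π; length = 10_001)
samples = [norm(C * [cos(θ), sin(θ)]) for θ in θs]

# Every sample is a lower bound on the induced 2-norm;
# the largest one approaches it as the grid is refined.
estimate = maximum(samples)

(estimate, opnorm(C))
```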

---

## Exercise

Let $\|\cdot\|$ be an induced matrix norm. Compute $\|I\|$.

---

## Examples

$$
\begin{array}{llclll}
\|A\|_p &=& \displaystyle{\max_{x \neq 0} \frac{\|Ax\|_p}{\|x\|_p}} &=& \texttt{opnorm(A, p)} \qquad &\text{($p$-norm)} \\
\|A\|_1 &=& \displaystyle{\max_{1 \leq j \leq n} \sum_{i=1}^n |a_{ij}|} &=& \texttt{opnorm(A, 1)} \qquad &\text{(max-column-sum)} \\
\|A\|_2 &=& \displaystyle{\sqrt{\lambda_{\max}\left(A^TA\right)}} &=& \texttt{opnorm(A)} \qquad &\text{(spectral norm)} \\
\|A\|_\infty &=& \displaystyle{\max_{1 \leq i \leq n} \sum_{j=1}^n |a_{ij}|} &=& \texttt{opnorm(A, Inf)} \qquad &\text{(max-row-sum)} \\
\end{array}
$$

**Note:** $\lambda_{\max}\left(A^TA\right)$ is the **largest eigenvalue** of the symmetric matrix $A^TA$; $\sqrt{\lambda_{\max}\left(A^TA\right)}$ is the **largest singular value** of the matrix $A$.

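Not every matrix norm is induced by a vector norm: the Frobenius norm is **not** an induced matrix norm (compare the two exercises on $\|I\|_F$ and $\|I\|$ above).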
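
The max-column-sum and max-row-sum formulas in the table are easy to evaluate by hand. The sketch below does so for an illustrative matrix `G` and compares the results with `opnorm`.

```julia
using LinearAlgebra

G = [1.0 2.0 3.0; 4.0 5.0 6.0; 7.0 8.0 9.0]

# Largest absolute column sum (sum down each column, i.e. over dims = 1)
colsum = maximum(sum(abs.(G); dims = 1))

# Largest absolute row sum (sum along each row, i.e. over dims = 2)
rowsum = maximum(sum(abs.(G); dims = 2))

colsum == opnorm(G, 1)      # both are 18.0 (third column: 3 + 6 + 9)
rowsum == opnorm(G, Inf)    # both are 24.0 (third row: 7 + 8 + 9)
```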

In [None]:
A = [1 3; -2 0.0]

sqrt(maximum(eigvals(A'*A)))

In [None]:
maximum(svdvals(A))

---

## `opnorm` of a matrix

In [None]:
A = [1 2 3; 4 5 6; 7 8 9.0]

In [None]:
# max-column-sum
opnorm(A, 1)

In [None]:
# max-row-sum
opnorm(A, Inf)

In [None]:
# spectral norm
opnorm(A)

In [None]:
# The spectral norm of A is the 
# maximum singular value of A
svdvals(A)

In [None]:
# The singular values of A are the square roots of the
# eigenvalues of A'*A (the max with 0 guards against tiny
# negative eigenvalues caused by roundoff)
λ = max.(eigvals(A'*A), 0)
sqrt.(λ)

---

## Induced Matrix Norm Inequality

> ### Theorem: (Induced Matrix Norm Inequality)
>
> Let $\|\cdot\| : \mathbb{R}^n \rightarrow \mathbb{R}$ be a vector norm. Then the corresponding induced matrix norm satisfies
>
> $$\|Ax\| \leq \|A\|\|x\|,\qquad \text{for all $A \in \mathbb{R}^{n \times n}$ and $x \in \mathbb{R}^n$.}$$

### Proof:

Let $x \in \mathbb{R}^n$. If $x = 0$, then $\|Ax\| \le \|A\|\|x\|$ clearly holds since both sides of the inequality would equal zero. Now suppose that $x \ne 0$. Then,

$$
\frac{\|Ax\|}{\|x\|} \le \max_{y \ne 0} \frac{\|Ay\|}{\|y\|} = \|A\|.
$$

Multiplying both sides by $\|x\|$, we have $\|Ax\| \le \|A\|\|x\|$. $\blacksquare$
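
The inequality can be checked numerically for the three induced norms available through `opnorm`. The matrix `H` and vectors `z` and `s` below are illustrative choices; the last line shows that the bound can be attained, here by the all-ones vector in the max-norm.

```julia
using LinearAlgebra

H = [1.0 2.0 3.0; 4.0 5.0 6.0; 7.0 8.0 9.0]
z = [1.0, 0.0, -1.0]

for p in (1, 2, Inf)
    lhs = norm(H * z, p)
    rhs = opnorm(H, p) * norm(z, p)
    println("p = ", p, ":  ‖Hz‖ = ", lhs, "  ≤  ‖H‖‖z‖ = ", rhs, "  ", lhs <= rhs)
end

# Equality case: the entries of H are nonnegative, so the all-ones vector
# attains the max-row-sum in the ∞-norm.
s = ones(3)
norm(H * s, Inf) == opnorm(H, Inf) * norm(s, Inf)
```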

---