---
# Section 2.1: Vector and Matrix Norms
---

## Motivation

When solving $Ax = b$ numerically, we get a solution $\hat{x}$ that satisfies

$$
\hat{A} \hat{x} = \hat{b}.
$$

We hope that $\hat{A} \approx A$ and $\hat{b} \approx b$.

To quantify how good these approximations are, we need a way to measure the distance between two vectors and between two matrices.

---

## Vector norms

A (vector) **norm** is a function 

$$
\|\cdot\| : \mathbb{R}^n \rightarrow \mathbb{R}
$$

that satisfies, for all $x, y \in \mathbb{R}^n$ and $\alpha \in \mathbb{R}$,

1. $\|x\| > 0$ if $x \neq 0$ and $\|0\| = 0$
2. $\|\alpha x\| = |\alpha| \|x\|$
3. $\|x + y\| \leq \|x\| + \|y\| \qquad$ (triangle inequality)

The **distance** between vectors $x$ and $y$ can then be measured by

$$
\mathrm{dist}(x,y) = \| x - y \|.
$$
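
As a quick sanity check, the three norm properties can be tested numerically for one particular norm. The following is a minimal sketch, assuming Julia 1.x; the helper name `norm1` and the test vectors are illustrative choices, not part of the notes.

```julia
# Hand-written 1-norm (sum of absolute values), using only Base functions.
# `norm1` is a hypothetical helper name chosen for this sketch.
norm1(x) = sum(abs, x)

x = [1.0, -2.0, 1.0]
y = [2.0, 1.0, -1.0]
α = -3.0

# 1. Positivity: nonzero vectors have positive norm, the zero vector has norm 0
positivity = norm1(x) > 0 && norm1(zeros(3)) == 0

# 2. Homogeneity: scaling the vector scales the norm by |α|
homogeneity = norm1(α * x) == abs(α) * norm1(x)

# 3. Triangle inequality
triangle = norm1(x + y) <= norm1(x) + norm1(y)

(positivity, homogeneity, triangle)
```

A check like this can only expose a counterexample; it does not prove that the properties hold for every vector.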

---

## The $p$-norm

$$
\|x\|_p = \left( \sum_{i=1}^n \left|x_i\right|^p \right)^{1/p}, \quad 1 \leq p < \infty, \qquad \|x\|_\infty = \lim_{p \to \infty} \|x\|_p = \max_{i=1,\ldots,n} |x_i|.
$$

Some examples are:

$$
\begin{array}{llclll}
\|x\|_1 &=& \displaystyle{\sum_{i=1}^n |x_i|} &=& \texttt{norm(x, 1)} \qquad &\text{(Manhattan norm)} \\
\|x\|_2 &=& \displaystyle{\sqrt{\sum_{i=1}^n |x_i|^2}} &=& \texttt{norm(x)} \qquad &\text{(Euclidean norm)} \\
\|x\|_\infty &=& \displaystyle{\max_{i=1,\ldots,n} |x_i|} &=& \texttt{norm(x, Inf)} \qquad &\text{(max-norm)} \\
\end{array}
$$
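
The $p$-norm formula can also be coded directly, which makes it easy to watch $\|x\|_p$ approach the max-norm as $p$ grows. This is a sketch assuming Julia 1.x; `pnorm` and `maxnorm` are hypothetical helper names, not library functions.

```julia
# p-norm straight from the definition, for finite p >= 1
pnorm(x, p) = sum(abs.(x) .^ p)^(1 / p)

# The limiting case p = ∞: the largest absolute entry
maxnorm(x) = maximum(abs, x)

x = [1.0, -2.0, 1.0]

pnorm(x, 1)    # 4.0
pnorm(x, 2)    # ≈ 2.449, i.e. sqrt(6)

# The values shrink toward maxnorm(x) = 2.0 as p increases
vals = [pnorm(x, p) for p in (1, 2, 4, 16, 64)]
```

For large $p$ the powers $|x_i|^p$ can overflow or underflow, so this direct formula is meant only to illustrate the definition.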

The **unit $p$-norm ball** is the set of vectors having $p$-norm at most $1$:

$$
B_p = \big\{ x \in \mathbb{R}^n : \lVert x \rVert_p \leq 1 \big\}.
$$
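
Membership in $B_p$ is just a comparison of the $p$-norm with $1$. The sketch below (assuming Julia 1.x, with hypothetical helpers `pnorm` and `in_unit_ball`) tests one point against three unit balls.

```julia
# p-norm from the definition; p = Inf is handled as the max-norm
pnorm(x, p) = p == Inf ? maximum(abs, x) : sum(abs.(x) .^ p)^(1 / p)

# x belongs to the unit p-norm ball when its p-norm is at most 1
in_unit_ball(x, p) = pnorm(x, p) <= 1

v = [0.6, 0.6]

in_unit_ball(v, 1)     # false: 0.6 + 0.6 = 1.2 > 1
in_unit_ball(v, 2)     # true:  sqrt(0.72) ≈ 0.849
in_unit_ball(v, Inf)   # true:  max(0.6, 0.6) = 0.6
```

The point lies in $B_2$ and $B_\infty$ but not in $B_1$, consistent with the nesting $B_1 \subseteq B_2 \subseteq B_\infty$ that follows from $\|x\|_\infty \leq \|x\|_2 \leq \|x\|_1$.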

---

## `norm` of a vector

In [1]:
?norm

search: norm normpath normalize_string vecnorm issubnormal UniformScaling



```rst
..  norm(A, [p])

Compute the ``p``-norm of a vector or the operator norm of a matrix ``A``, defaulting to the ``p=2``-norm.

For vectors, ``p`` can assume any numeric value (even though not all values produce a mathematically valid vector norm). In particular, ``norm(A, Inf)`` returns the largest value in ``abs(A)``, whereas ``norm(A, -Inf)`` returns the smallest.

For matrices, the matrix norm induced by the vector ``p``-norm is used, where valid values of ``p`` are ``1``, ``2``, or ``Inf``. (Note that for sparse matrices, ``p=2`` is currently not implemented.) Use :func:`vecnorm` to compute the Frobenius norm.
```


In [2]:
x = Float64[1, -2, 1]

3-element Array{Float64,1}:
  1.0
 -2.0
  1.0

In [3]:
y = Float64[0.9, -1.9, 1.1]

3-element Array{Float64,1}:
  0.9
 -1.9
  1.1

In [4]:
# The 1-norm of x
norm(x, 1)

4.0

In [5]:
# The "0-norm" counts the number of nonzero entries (it is not a true norm)
norm([1, 0, 0, 0, 1], 0)

2.0

In [6]:
# The Manhattan distance between x and y
norm(x - y, 1)

0.30000000000000016

In [9]:
# The Euclidean norm of x
norm(x, 2)

2.449489742783178

In [10]:
# The default norm is the Euclidean norm
norm(x)

2.449489742783178

In [13]:
# The Euclidean distance between x and y
norm(x - y)

0.1732050807568878

In [14]:
# The max-norm of x
norm(x, Inf)

2.0

In [15]:
# The max-norm distance between x and y
norm(x - y, Inf)

0.10000000000000009

---

## The $A$-norm

$$
\|x\|_A = \sqrt{x^T A x}
$$

where $A$ is an $n \times n$ symmetric positive definite matrix.

For example, when $A = I$, we have the usual Euclidean norm:

$$
\|x\|_I = \|x\|_2.
$$
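
A small numerical illustration of the $A$-norm follows. It is a sketch assuming Julia 1.x, where `dot` comes from the `LinearAlgebra` standard library; `anorm` is a hypothetical helper name and the matrix `A` is an arbitrary example.

```julia
using LinearAlgebra

# A-norm: sqrt(x' * A * x), for a symmetric positive definite A
anorm(x, A) = sqrt(dot(x, A * x))

A  = [2.0 1.0; 1.0 2.0]      # symmetric positive definite (eigenvalues 1 and 3)
Id = [1.0 0.0; 0.0 1.0]      # identity matrix
x  = [3.0, 4.0]

anorm(x, Id)   # 5.0, the Euclidean norm of x
anorm(x, A)    # ≈ 8.602, i.e. sqrt(74)
```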

---

## The resistance distance on a social network

A practical example of the $A$-norm is the **resistance distance** between individuals on a **social network**. 

A social network can be represented as a **graph** in which individuals are **nodes**, and two nodes are connected by an **edge** if the corresponding individuals are friends.

This graph can be represented using an **adjacency matrix** $A = [a_{ij}]$ where $a_{ij} = 1$ if $i$ and $j$ are friends, otherwise $a_{ij} = 0$; you cannot be friends with yourself, so $a_{ii} = 0$.

The **Laplacian matrix** $L = [l_{ij}]$ of the graph has $l_{ii} = \deg(i)$ (i.e., the number of friends of $i$), and $l_{ij} = -a_{ij}$ for $i \neq j$. That is,

$$
L = \mathrm{Diag}(Ae) - A,
$$

where $e$ is the vector of all ones, and $\mathrm{Diag}(Ae)$ is the diagonal matrix with the vector $Ae$ on its diagonal.

The **resistance distance** between $i$ and $j$ is then given by

$$
\mathrm{dist}(i,j) = \|e_i - e_j\|_B,  \qquad \text{where $B = (L + ee^T)^{-1}$}.
$$

Here $e_i$ and $e_j$ are the $i^\mathrm{th}$ and $j^\mathrm{th}$ columns of the identity matrix. (Note: when the graph is connected, it can be shown that $L + ee^T$ is positive definite, which implies that $(L + ee^T)^{-1}$ is positive definite.)
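
To see the construction on a graph small enough to check by hand, consider three people in a chain, where 1 and 2 are friends and 2 and 3 are friends. The sketch below assumes Julia 1.x with the `LinearAlgebra` standard library; the toy graph and the helper names `bnorm` and `dist` are illustrative assumptions.

```julia
using LinearAlgebra

# Path graph 1 - 2 - 3 (a hypothetical toy network)
A = [0 1 0;
     1 0 1;
     0 1 0]
n = size(A, 1)
e = ones(n)

L = Diagonal(A * e) - A          # graph Laplacian: degrees on the diagonal, minus A
B = inv(L + e * e')              # matrix that defines the norm

bnorm(x) = sqrt(dot(x, B * x))   # the B-norm of a vector
Id = Matrix(1.0I, n, n)          # identity matrix; its columns are e_1, ..., e_n
dist(i, j) = bnorm(Id[:, i] - Id[:, j])

dist(1, 2)   # ≈ 1.0
dist(1, 3)   # ≈ 1.414, i.e. sqrt(2)
```

The squared values, $1$ and $2$, match the effective resistance of one and two unit resistors in series, which is where the name comes from.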

In [20]:
n = 8
A = Symmetric(rand(n, n))
A = round(Int64, A)
A = A - diagm(diag(A))  # Make the diagonal zero

8x8 Array{Int64,2}:
 0  1  0  0  0  0  1  1
 1  0  1  0  1  1  1  0
 0  1  0  0  0  0  1  1
 0  0  0  0  1  0  1  1
 0  1  0  1  0  0  0  1
 0  1  0  0  0  0  0  1
 1  1  1  1  0  0  0  1
 1  0  1  1  1  1  1  0

In [21]:
e = ones(n)
L = diagm(A*e) - A

8x8 Array{Float64,2}:
  3.0  -1.0   0.0   0.0   0.0   0.0  -1.0  -1.0
 -1.0   5.0  -1.0   0.0  -1.0  -1.0  -1.0   0.0
  0.0  -1.0   3.0   0.0   0.0   0.0  -1.0  -1.0
  0.0   0.0   0.0   3.0  -1.0   0.0  -1.0  -1.0
  0.0  -1.0   0.0  -1.0   3.0   0.0   0.0  -1.0
  0.0  -1.0   0.0   0.0   0.0   2.0   0.0  -1.0
 -1.0  -1.0  -1.0  -1.0   0.0   0.0   5.0  -1.0
 -1.0   0.0  -1.0  -1.0  -1.0  -1.0  -1.0   6.0

In [22]:
L + e*e'

8x8 Array{Float64,2}:
 4.0  0.0  1.0  1.0  1.0  1.0  0.0  0.0
 0.0  6.0  0.0  1.0  0.0  0.0  0.0  1.0
 1.0  0.0  4.0  1.0  1.0  1.0  0.0  0.0
 1.0  1.0  1.0  4.0  0.0  1.0  0.0  0.0
 1.0  0.0  1.0  0.0  4.0  1.0  1.0  0.0
 1.0  0.0  1.0  1.0  1.0  3.0  1.0  0.0
 0.0  0.0  0.0  0.0  1.0  1.0  6.0  0.0
 0.0  1.0  0.0  0.0  0.0  0.0  0.0  7.0

In [23]:
isposdef(L + e*e')

true

In [24]:
B = inv(L + e*e')

8x8 Array{Float64,2}:
  0.300768     0.00935015  -0.032565    …   0.0192905    -0.00133574 
  0.00935015   0.179827     0.00935015     -0.00108723   -0.0256896  
 -0.032565     0.00935015   0.300768        0.0192905    -0.00133574 
 -0.0547652   -0.0532741   -0.0547652       0.00636804    0.00761059 
 -0.0572502   -0.00804548  -0.0572502      -0.0398546     0.00114935 
 -0.0584928    0.0145688   -0.0584928   …  -0.062966     -0.00208126 
  0.0192905   -0.00108723   0.0192905       0.183803      0.000155318
 -0.00133574  -0.0256896   -0.00133574      0.000155318   0.146527   

In [26]:
chol(B)

8x8 UpperTriangular{Float64,Array{Float64,2}}:
 0.548423  0.0170491  -0.0593793  -0.0998593  …   0.0351745   -0.00243559 
 0.0       0.423718    0.0244562  -0.121712      -0.00398124  -0.0605311  
 0.0       0.0         0.544651   -0.105973       0.0394317    3.98128e-19
 0.0       0.0         0.0         0.525374       0.0258381   -1.65094e-18
 0.0       0.0         0.0         0.0           -0.062107    -4.10753e-19
 0.0       0.0         0.0         0.0        …  -0.0990148   -1.07352e-19
 0.0       0.0         0.0         0.0            0.408248    -8.85247e-20
 0.0       0.0         0.0         0.0            0.0          0.377964   

In [28]:
eigvals(B)

8-element Array{Float64,1}:
 0.526246
 0.432251
 0.333333
 0.277964
 0.182884
 0.163128
 0.125   
 0.131742

In [27]:
isposdef(B)

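false

Every eigenvalue of `B` computed above is positive, so this `false` most likely reflects rounding in `inv`: the computed `B` is not exactly symmetric, and `isposdef` also checks for symmetry.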

In [29]:
# Define the resistance norm
resnorm(x) = sqrt(x'*B*x)[1]

# Form the matrix of distances between all nodes
I = eye(n)
D = Float64[resnorm(I[:,i] - I[:,j]) for i = 1:n, j = 1:n]

8x8 Array{Float64,2}:
 0.0       0.679629  0.816497  0.849902  …  0.928169  0.667825  0.670796
 0.679629  0.0       0.679629  0.773569     0.770995  0.604818  0.6146  
 0.816497  0.679629  0.0       0.849902     0.928169  0.667825  0.670796
 0.849902  0.773569  0.849902  0.0          0.962518  0.695055  0.665838
 0.848731  0.707809  0.848731  0.723093     0.938465  0.754048  0.670302
 0.928169  0.770995  0.928169  0.962518  …  0.0       0.868032  0.770995
 0.667825  0.604818  0.667825  0.695055     0.868032  0.0       0.574474
 0.670796  0.6146    0.670796  0.665838     0.770995  0.574474  0.0     

In [30]:
D[3,4]

0.8499015652667136

In [31]:
D[3,6]

0.9281689935476016

---

## The Euclidean norm is a norm

The first two norm properties follow directly from the definition of the **Euclidean norm**, so to prove that it is indeed a norm, it remains to show that it satisfies the **triangle inequality**:

$$
\|x + y\|_2 \leq \|x\|_2 + \|y\|_2.
$$

We will prove this using the following fundamental result.

> ### Theorem: (Cauchy-Schwarz Inequality)
>
> $$
> \left|x^Ty\right| \leq \|x\|_2 \|y\|_2, \qquad \forall x, y \in \mathbb{R}^n.
> $$

### Proof of the triangle inequality.

Let $x, y \in \mathbb{R}^n$. Then

$$
\begin{align}
\|x + y\|_2^2 
&= (x+y)^T(x+y) \\
&= x^Tx + x^Ty + y^Tx + y^Ty \\
&= \|x\|_2^2 + 2x^Ty + \|y\|_2^2 \\
&\leq \|x\|_2^2 + 2\|x\|_2\|y\|_2 + \|y\|_2^2 \qquad \text{(Cauchy-Schwarz inequality)} \\
&= \big(\|x\|_2 + \|y\|_2\big)^2. \\
\end{align}
$$

Taking the square root of both sides, we obtain $\|x + y\|_2 \leq \|x\|_2 + \|y\|_2$. $\blacksquare$
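
Both inequalities are easy to spot-check on concrete vectors. The following sketch assumes Julia 1.x, where `norm` and `dot` are provided by the `LinearAlgebra` standard library; the vectors are the `x` and `y` used earlier in these notes.

```julia
using LinearAlgebra

x = [1.0, -2.0, 1.0]
y = [0.9, -1.9, 1.1]

# Cauchy-Schwarz: |x'y| <= ||x||_2 ||y||_2
lhs_cs = abs(dot(x, y))
rhs_cs = norm(x) * norm(y)
cs_holds = lhs_cs <= rhs_cs

# Triangle inequality: ||x + y||_2 <= ||x||_2 + ||y||_2
tri_holds = norm(x + y) <= norm(x) + norm(y)

(cs_holds, tri_holds)
```

Because `x` and `y` point in nearly the same direction, both inequalities hold with only a small gap here.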

---

Now let's see a proof of the Cauchy-Schwarz inequality.

### Proof of Cauchy-Schwarz.

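Let $x, y \in \mathbb{R}^n$. If $y = 0$, both sides of the inequality are zero, so assume $y \neq 0$ and let $t \in \mathbb{R}$. Then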

$$
\begin{align}
0 \leq \|x + ty\|_2^2 
&= (x + ty)^T(x + ty) \\
&= x^T x + 2tx^Ty + t^2y^Ty \\
&= \|x\|_2^2 + \big(2x^Ty\big) t + \|y\|_2^2 t^2  \\
&= c + bt + at^2,
\end{align}
$$

where 

$$
a = \|y\|_2^2, \qquad
b = 2x^Ty, \qquad 
c = \|x\|_2^2.
$$

$\therefore$ $at^2 + bt + c \geq 0$, $\forall t \in \mathbb{R}$.

This implies that the equation

$$
at^2 + bt + c = 0
$$

either has no real solution or exactly one real solution. By the **quadratic formula**,

$$
t = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a},
$$

we must have $b^2 - 4ac \leq 0$.

Thus, 

$$
\big(2x^Ty\big)^2 - 4 \|y\|_2^2 \|x\|_2^2 \leq 0,
$$

which simplifies to

$$
\left|x^Ty\right| \leq \|x\|_2 \|y\|_2. \qquad \blacksquare
$$
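
The quantities in this proof can be computed for specific vectors to see the discriminant argument at work. This is a sketch assuming Julia 1.x with the `LinearAlgebra` standard library; the vectors and the name `q` for the quadratic are illustrative choices.

```julia
using LinearAlgebra

x = [1.0, -2.0, 1.0]
y = [0.9, -1.9, 1.1]

# Coefficients of the quadratic ||x + t*y||_2^2 = a*t^2 + b*t + c
a = dot(y, y)
b = 2 * dot(x, y)
c = dot(x, x)

q(t) = a * t^2 + b * t + c

# The quadratic is nonnegative on a grid of sample points ...
nonneg = all(q(t) >= 0 for t in -3:0.01:3)

# ... so its discriminant cannot be positive
disc = b^2 - 4 * a * c     # ≈ -0.56
```

A negative discriminant means the quadratic has no real root, which corresponds to strict inequality in Cauchy-Schwarz for this pair of vectors.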

---

## Matrix norms

A **matrix norm** is a function 

$$
\|\cdot\| : \mathbb{R}^{n \times n} \rightarrow \mathbb{R}
$$

that satisfies, for all $A, B \in \mathbb{R}^{n \times n}$ and $\alpha \in \mathbb{R}$,

1. $\|A\| > 0$ if $A \neq 0$ and $\|0\| = 0$
2. $\|\alpha A\| = |\alpha| \|A\|$
3. $\|A + B\| \leq \|A\| + \|B\|$
4. $\|AB\| \leq \|A\|\|B\| \qquad$ (submultiplicativity)

The **distance** between matrices $A$ and $B$ can then be measured by

$$
\mathrm{dist}(A, B) = \| A - B \|.
$$

---

### The Frobenius norm

$$
\|A\|_F = \sqrt{\sum_{i=1}^n \sum_{j=1}^n |a_{ij}|^2} = \texttt{vecnorm(A)}
$$
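
The Frobenius norm is simple to compute from its definition, and submultiplicativity can be spot-checked on a pair of matrices. The sketch below assumes Julia 1.x, which differs from the older release used in this notebook's cells: there, `norm(A)` from the `LinearAlgebra` standard library returns the Frobenius norm of a matrix. The helper name `frob` is hypothetical.

```julia
using LinearAlgebra

# Frobenius norm from the definition: square root of the sum of squared entries
frob(A) = sqrt(sum(abs2, A))

A = [1.0 2.0; 3.0 4.0]
B = [0.0 1.0; 1.0 0.0]

frob(A)                    # ≈ 5.477, i.e. sqrt(30)
matches = frob(A) ≈ norm(A)   # agrees with the library's Frobenius norm in Julia 1.x

# Submultiplicativity: ||AB||_F <= ||A||_F ||B||_F
submult = frob(A * B) <= frob(A) * frob(B)
```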

In [2]:
A = [1 2; 3 4.0]

2x2 Array{Float64,2}:
 1.0  2.0
 3.0  4.0

In [3]:
vecnorm(A)

5.477225575051661

In [4]:
sqrt(30)

5.477225575051661

In [5]:
?vecnorm

search: vecnorm



```
vecnorm(A, [p])
```

For any iterable container `A` (including arrays of any dimension) of numbers (or any element type for which `norm` is defined), compute the `p`-norm (defaulting to `p=2`) as if `A` were a vector of the corresponding length.

For example, if `A` is a matrix and `p=2`, then this is equivalent to the Frobenius norm.


---

### Exercise

Compute $\lVert I \rVert_F$, where $I$ is the $n \times n$ identity matrix.

$\|I\|_F = \sqrt{n}$

---

In [7]:
I = eye(4)

4x4 Array{Float64,2}:
 1.0  0.0  0.0  0.0
 0.0  1.0  0.0  0.0
 0.0  0.0  1.0  0.0
 0.0  0.0  0.0  1.0

In [8]:
vecnorm(I)

2.0

---

## Induced Matrix Norms

Let $\|\cdot\| : \mathbb{R}^n \rightarrow \mathbb{R}$ be a vector norm.

The **induced matrix norm** (a.k.a. the **operator norm**) is defined as

$$
\|A\| = \max_{x \neq 0} \frac{\|Ax\|}{\|x\|} = \max_{\|x\| = 1} \|Ax\|.
$$

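The induced matrix norm is a matrix norm: it satisfies all four properties listed under **Matrix norms**, including submultiplicativity.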
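
The definition as a maximum over unit vectors can be approximated by brute force in two dimensions. The sketch below assumes Julia 1.x, where the operator norm is computed by `opnorm` from the `LinearAlgebra` standard library; the sampling grid is an arbitrary choice.

```julia
using LinearAlgebra

A = [1.0 3.0; -2.0 0.0]

# Sample unit vectors x = (cos θ, sin θ) on the circle ||x||_2 = 1
θs = range(0, stop=2π, length=10_001)

# Largest value of ||A x||_2 seen over the samples
estimate = maximum(norm(A * [cos(θ), sin(θ)]) for θ in θs)

estimate, opnorm(A)   # both ≈ 3.2566
```

The sampled maximum can never exceed the true induced norm, and it approaches it as the grid is refined.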

---

## Exercise

Let $\|\cdot\|$ be an induced matrix norm. Compute $\|I\|$.

$\|I\| = 1$

---

## Examples

$$
\begin{array}{llclll}
\|A\|_p &=& \displaystyle{\max_{x \neq 0} \frac{\|Ax\|_p}{\|x\|_p}} &=& \texttt{norm(A, p)} \qquad &\text{($p$-norm)} \\
\|A\|_1 &=& \displaystyle{\max_{1 \leq j \leq n} \sum_{i=1}^n |a_{ij}|} &=& \texttt{norm(A, 1)} \qquad &\text{(max-column-sum)} \\
\|A\|_2 &=& \displaystyle{\sqrt{\lambda_{\max}\left(A^TA\right)}} &=& \texttt{norm(A)} \qquad &\text{(spectral norm)} \\
\|A\|_\infty &=& \displaystyle{\max_{1 \leq i \leq n} \sum_{j=1}^n |a_{ij}|} &=& \texttt{norm(A, Inf)} \qquad &\text{(max-row-sum)} \\
\end{array}
$$
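
The column-sum and row-sum formulas in the table translate directly into code. This is a sketch assuming Julia 1.x (for the `dims` keyword); `colsum_norm` and `rowsum_norm` are hypothetical helper names.

```julia
# ||A||_1: largest column sum of absolute values
colsum_norm(A) = maximum(sum(abs.(A), dims=1))

# ||A||_∞: largest row sum of absolute values
rowsum_norm(A) = maximum(sum(abs.(A), dims=2))

A = [1.0 2.0 3.0; 4.0 5.0 6.0; 7.0 8.0 9.0]

colsum_norm(A)   # 18.0 (third column: 3 + 6 + 9)
rowsum_norm(A)   # 24.0 (third row: 7 + 8 + 9)
```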

**Note:** $\lambda_{\max}\left(A^TA\right)$ is the **largest eigenvalue** of the symmetric matrix $A^TA$; $\sqrt{\lambda_{\max}\left(A^TA\right)}$ is the **largest singular value** of the matrix $A$.

The Frobenius norm, however, is **not** an induced matrix norm: every induced norm satisfies $\|I\| = 1$, whereas $\|I\|_F = \sqrt{n} \neq 1$ for $n > 1$.

In [16]:
A = [1 3; -2 0.0]

sqrt(maximum(eigvals(A'*A)))

3.25661653798294

In [18]:
maximum(svdvals(A))

3.2566165379829384

---

## `norm` of a matrix

In [20]:
A = [1 2 3; 4 5 6; 7 8 9.0]

3x3 Array{Float64,2}:
 1.0  2.0  3.0
 4.0  5.0  6.0
 7.0  8.0  9.0

In [21]:
# max-column-sum
norm(A, 1)

18.0

In [22]:
# max-row-sum
norm(A, Inf)

24.0

In [24]:
# spectral norm
norm(A)

16.84810335261421

In [25]:
# The spectral norm of A is the 
# maximum singular value of A
svdvals(A)

3-element Array{Float64,1}:
 16.8481     
  1.06837    
  4.41842e-16

In [28]:
# The singular values of A are the square roots of the
# eigenvalues of A'*A
λ = max(eigvals(A'*A), 0)
sqrt(λ)

3-element Array{Float64,1}:
  0.0    
  1.06837
 16.8481 

---

## Induced Matrix Norm Inequality

> ### Theorem: (Induced Matrix Norm Inequality)
>
> Let $\|\cdot\| : \mathbb{R}^n \rightarrow \mathbb{R}$ be a vector norm. Then the corresponding induced matrix norm satisfies
>
> $$\|Ax\| \leq \|A\|\|x\|,\qquad \text{for all $A \in \mathbb{R}^{n \times n}$ and $x \in \mathbb{R}^n$.}$$
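
The inequality can be illustrated with the $1$-norm, including a vector for which it holds with equality. The sketch assumes Julia 1.x, where `norm` and `opnorm` come from the `LinearAlgebra` standard library; the matrix and vectors are illustrative.

```julia
using LinearAlgebra

A = [1.0 2.0 3.0; 4.0 5.0 6.0; 7.0 8.0 9.0]

# A generic vector: ||Ax||_1 = 6 is well below ||A||_1 ||x||_1 = 18 * 2 = 36
x = [1.0, 0.0, -1.0]
holds = norm(A * x, 1) <= opnorm(A, 1) * norm(x, 1)

# The unit vector that selects the largest column attains the bound:
# ||A e3||_1 = 3 + 6 + 9 = 18 = ||A||_1 ||e3||_1
e3 = [0.0, 0.0, 1.0]
tight = norm(A * e3, 1) == opnorm(A, 1) * norm(e3, 1)

(holds, tight)
```

The second case shows that the bound cannot be improved in general: the maximum in the definition of the induced norm is attained.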

---