## Chapter 03 -- Norm and distance

Modified by kmp 2022

Sources:

https://web.stanford.edu/~boyd/vmls/

https://github.com/vbartle/VMLS-Companions

Based on "Boyd and Vandenberghe, 2021, Introduction to Applied Linear Algebra: Vectors, Matrices, and Least Squares - Julia Language Companion" https://web.stanford.edu/~boyd/vmls/vmls-julia-companion.pdf


In [None]:
using LinearAlgebra
using VMLS


### 3.1 Norm

The norm $‖x‖$ is written in Julia as `norm(x)`. (It can be evaluated several other ways, too.) The `norm` function is provided by the `LinearAlgebra` package, which ships with Julia as a standard library, so loading it with `using LinearAlgebra` is enough; see page ix of the Julia Language Companion for package setup.

In [None]:
x = [2, -1, 2]
norm(x), sqrt(x'*x), sqrt(sum(x.^2))

**Triangle inequality.** Let us check the triangle inequality, $‖x+ y‖ ≤ ‖x‖+ ‖y‖$, for some specific values of $x$ and $y$.

In [None]:
x, y = randn(10), randn(10)
lhs = norm(x+y)
rhs = norm(x) + norm(y)
lhs, rhs, lhs <= rhs

**RMS value.** The RMS value of a vector $x$ is `rms(x)` $= \frac{‖x‖}{\sqrt{n}}$. In Julia, this is expressed as `norm(x)/sqrt(length(x))`. The VMLS package contains this function, so you can use it once you have installed this package.

Let us define a vector which represents a signal, i.e., the value of some quantity at uniformly spaced time instants, and find its RMS value. The following code plots the signal, its average value, and two constant signals at `avg(x) ± rms(x)` (Figure 3.1).

In [None]:
rms(x) = norm(x) / sqrt(length(x))
t = 0:0.01:1                      # sampled times
x = cos.(8*t) - 2*sin.(11*t);
avg(x)

In [None]:
rms(x)

In [None]:
using Plots

plot(t, x, size = (300,300))

plot!(t, avg(x)*ones(length(x)))
plot!(t, (avg(x)+rms(x))*ones(length(x)), color = :green)
plot!(t, (avg(x)-rms(x))*ones(length(x)), color = :green)
plot!(legend = false)

**Figure 3.1** A signal $x$. The horizontal lines show $avg(x) + rms(x)$, $avg(x)$ and $avg(x)− rms(x)$.

**Chebyshev inequality.** The Chebyshev inequality states that the number of entries of an $n$-vector $x$ that have absolute value at least $a$ is no more than $\frac{‖x‖^2}{a^2} = n \, rms(x)^2/a^2$. If this number is, say, $12.15$, we can conclude that no more than $12$ entries have absolute value at least $a$, since the number of entries is an integer. So the Chebyshev bound can be improved to $floor(‖x‖^2/a^2)$, where $floor(u)$ is the integer part of a positive number $u$. We define a function that computes the Chebyshev bound, including the floor function improvement, and apply it to the signal found above for a specific value of $a$.

In [None]:
# Define Chebyshev bound function: floor(‖x‖²/a²)
cheb_bound(x,a) = floor(norm(x)^2/a^2);
a = 1.5;
println("Chebyshev bound: $(cheb_bound(x,a))")

In [None]:
println("Number of entries of x with |x[i]| >= a: $(sum(abs.(x) .>= a))")

In the last line, the expression `abs.(x) .>= a` creates an array with entries that are Boolean, i.e., `true` or `false`, depending on whether the corresponding entry of `x` satisfies the inequality. When we sum the vector of Booleans, they are automatically converted to (re-cast as) the numbers `1` and `0`, respectively.
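
As a small worked check of both the bound and the Boolean sum, here is a sketch on a short made-up vector. The data `v` and the threshold `thresh` are illustrative assumptions, not values from VMLS; `count` is a Base function that counts `true` entries directly.

```julia
using LinearAlgebra

v = [3.0, -0.5, 0.2, -2.5, 0.1, 1.0]    # made-up data for illustration
thresh = 2.0                             # made-up threshold, plays the role of a

bound = floor(norm(v)^2 / thresh^2)      # Chebyshev bound: floor(16.55 / 4)
mask = abs.(v) .>= thresh                # Boolean array, true where |v[i]| >= thresh
n_large = sum(mask)                      # each true contributes 1, each false 0

bound, n_large, count(mask)
```

Here two entries (3.0 and -2.5) meet the threshold while the bound allows up to four, so the bound holds but is not tight.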

### 3.2 Euclidean Distance

The Euclidean distance between two vectors is $dist(x, y) = ‖x − y‖$. This is written in Julia as **`norm(x-y)`**. The distances between the pairs of vectors $u$, $v$, and $w$ are given by:

In [None]:
u = [1.8, 2.0, -3.7, 4.7];
v = [0.6, 2.1, 1.9, -1.4];
w = [2.0, 1.9, -4.0, 4.6];
norm(u-v), norm(u-w), norm(v-w)

We can see that $u$ and $w$ are much closer to each other than $u$ and $v$, or $v$ and $w$.

**Nearest neighbor.** We define a function that calculates the nearest neighbor of a vector in a list of vectors, and try it on the points in Figure [3.3](https://web.stanford.edu/~boyd/vmls/vmls.pdf#figure.3.3) of VMLS.

In [None]:
nearest_neighbor(x,z) = z[argmin([norm(x-y) for y in z])]   # array comprehension syntax

z = ([2,1], [7,2], [5.5,4], [4,8], [1,5], [9,6])

nearest_neighbor([5,6], z)

In [None]:
nearest_neighbor([3,3], z)

On the first line, the expression `[norm(x-y) for y in z]` uses Julia's convenient **comprehension** construction. Here `z` is a list of vectors, and the expression expands to an array with elements `norm(x-z[1])`, `norm(x-z[2])`, and so on. The function **`argmin`** applied to this array returns the index of the smallest element.
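
When the distance itself is also of interest, Base's `findmin` returns the smallest value together with its index. The following is a minimal sketch under that assumption; the helper name `nearest_neighbor_info` and the variable `pts` are hypothetical and not part of the VMLS package.

```julia
using LinearAlgebra

# Hypothetical helper: index and distance of the nearest vector in a list
function nearest_neighbor_info(x, list)
    dists = [norm(x - y) for y in list]   # comprehension: one distance per vector
    d, i = findmin(dists)                 # smallest distance and its index
    return i, d
end

pts = ([2,1], [7,2], [5.5,4], [4,8], [1,5], [9,6])   # same points as above
i, d = nearest_neighbor_info([5,6], pts)
pts[i], d
```

For the query point `[5,6]`, the nearest point is `[5.5,4]`, at distance $\sqrt{4.25}$.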

**De-meaning a vector.** We refer to the vector $x − avg(x)1$ as the de-meaned version of $x$. Its average is zero, up to rounding error.

In [None]:
de_mean(x) = x .- avg(x)     # Define de-mean function
x = [1, -2.2, 3]

avg(x), de_mean(x), avg(de_mean(x))

### 3.3 Standard deviation

**Standard deviation.** We can define a function that corresponds to the VMLS definition of the standard deviation of a vector, $std(x) = \frac{‖x − avg(x)1‖}{\sqrt{n}}$, where $n$ is the length of the vector.

In [None]:
x = rand(100)

# VMLS definition of std
stdev(x) = norm(x.-avg(x))/sqrt(length(x))

stdev(x)

This function is in the VMLS package, so you can use it once you have installed this package. (Julia’s **Statistics package** has a similar function, **`std(x)`**, which computes the value $\frac{‖x − avg(x)1‖}{\sqrt{n − 1}}$, where $n$ is the length of $x$.)
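
The two definitions differ only by the factor $\sqrt{n/(n-1)}$, which the sketch below checks numerically. It uses `mean` from the `Statistics` standard library in place of `avg`, and the `corrected=false` keyword of `std`, which switches to division by $n$; the data and the variable name `stdev_vmls` are illustrative assumptions.

```julia
using LinearAlgebra, Statistics

data = [1.0, -2.2, 3.0, 0.5]                         # made-up data for illustration
n = length(data)

stdev_vmls = norm(data .- mean(data)) / sqrt(n)      # VMLS definition, divides by n

same_as_uncorrected = std(data; corrected=false) ≈ stdev_vmls
scaled_matches = std(data) ≈ stdev_vmls * sqrt(n / (n - 1))

same_as_uncorrected, scaled_matches
```

For large $n$ the factor is close to one, so the two values are nearly identical.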

**Return and risk.** We evaluate the mean return and risk, measured by standard deviation, of the four time series in Figure [3.4](https://web.stanford.edu/~boyd/vmls/vmls.pdf#figure.3.4) of VMLS.

In [None]:
a = ones(10)
b = [ 5, 1, -2, 3, 6, 3, -1, 3, 4, 1 ]
c = [ 5, 7, -2, 2, -3, 1, -1, 2, 7, 8 ]
d = [ -1, -3, -4, -3, 7, -1, 0, 3, 9, 5 ]

[("char, avg, std"),
("a", avg(a), stdev(a)), 
("b", avg(b), stdev(b)),
("c", avg(c), stdev(c)),
("d", avg(d), stdev(d))]

**Standardizing a vector.** If a vector $x$ is not constant (i.e., at least two of its entries are different), we can standardize it by subtracting its mean and dividing by its standard deviation. The resulting standardized vector has mean value zero and RMS value one. Its entries are called *z-scores*. We will define a `standardize` function, and then check it with a random vector.

In [None]:
function standardize(x)
    x_tilde = x .- avg(x)   # De-meaned vector
    return x_tilde/rms(x_tilde)
end

In [None]:
x = rand(1000)
z = standardize(x)

[("x:", avg(x), rms(x)), ("z:", avg(z), rms(z))]
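
The `standardize` function above assumes a non-constant input; for a constant vector the de-meaned vector has RMS value zero and the division is not meaningful. One possible guard is sketched below. The name `standardize_checked` is hypothetical, and `mean` from the `Statistics` standard library stands in for `avg`.

```julia
using LinearAlgebra, Statistics

# Hypothetical variant of standardize that rejects constant vectors
function standardize_checked(x)
    # A vector is constant when every entry equals the first one
    all(==(first(x)), x) && throw(ArgumentError("cannot standardize a constant vector"))
    x_tilde = x .- mean(x)                          # de-meaned vector
    return x_tilde / (norm(x_tilde) / sqrt(length(x)))   # divide by its RMS value
end

zs = standardize_checked([1.0, 2.0, 3.0, 4.0])
mean(zs), norm(zs) / sqrt(length(zs))               # mean near 0, RMS near 1
```

Comparing entries directly, rather than testing whether the RMS value is exactly zero, avoids being fooled by rounding in the computed mean.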

### 3.4 Angle

**Angle.** Let’s define a function that computes the angle between two vectors. We will call it `ang` because Julia already includes a function `angle` (for the phase angle of a complex number).

In [None]:
# Define angle function, which returns radians
ang(x,y) = acos(x'*y/(norm(x)*norm(y)));
a = [1,2,-1]; b=[2,0,-3];

[(ang(a,b),":angle in radians"), 
(ang(a,b)*(360/(2*pi)), ":angle in degrees")]
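
For nearly parallel vectors, rounding can push the computed cosine slightly outside $[-1, 1]$, in which case `acos` throws a `DomainError`. The sketch below shows one way to guard against this with Base's `clamp`; the name `ang_safe` and the test vectors are assumptions for illustration, and `rad2deg` is the Base function for converting radians to degrees.

```julia
using LinearAlgebra

# Hypothetical variant of ang: clamps the cosine into [-1, 1] before acos
function ang_safe(x, y)
    c = dot(x, y) / (norm(x) * norm(y))
    return acos(clamp(c, -1.0, 1.0))
end

p = [1, 2, -1]; q = [2, 0, -3]       # same vectors as in the cell above
theta = ang_safe(p, q)

r = [0.1, 0.2, 0.3]
theta_parallel = ang_safe(r, 3 * r)  # parallel vectors: angle close to zero

theta, rad2deg(theta), theta_parallel
```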

**Correlation coefficient.** The correlation coefficient between two vectors $a$ and $b$ (with nonzero standard deviation) is defined as $$ρ = \frac{ã^Tb̃}{‖ã‖‖b̃‖},$$ where $ã$ and $b̃$ are the de-meaned versions of $a$ and $b$, respectively. Base Julia has no built-in function for correlation (the `Statistics` standard library provides `cor`), so we define one here. We use this function to calculate the correlation coefficients of the three pairs of vectors in Figure [3.8](https://web.stanford.edu/~boyd/vmls/vmls.pdf#figure.3.8) in VMLS.

In [None]:
function correl_coef(a,b)
    a_tilde = a .- avg(a)
    b_tilde = b .- avg(b)
    return (a_tilde'*b_tilde)/(norm(a_tilde)*norm(b_tilde))
end

In [None]:
a0 = [4.4, 9.4, 15.4, 12.4, 10.4, 1.4, -4.6, -5.6, -0.6, 7.4];
b0 = [6.2, 11.2, 14.2, 14.2, 8.2, 2.2, -3.8, -4.8, -1.8, 4.2];
a1 = [4.1, 10.1, 15.1, 13.1, 7.1, 2.1, -2.9, -5.9, 0.1, 7.1];
b1 = [5.5, -0.5, -4.5, -3.5, 1.5, 7.5, 13.5, 14.5, 11.5, 4.5];
a2 = [-5.0, 0.0, 5.0, 8.0, 13.0, 11.0, 1.0, 6.0, 4.0, 7.0];
b2 = [5.8, 0.8, 7.8, 9.8, 0.8, 11.8, 10.8, 5.8, -0.2, -3.2];

In [None]:
[("0",correl_coef(a0,b0)),

("1",correl_coef(a1,b1)),

("2",correl_coef(a2,b2))]

The correlation coefficients of the three pairs of vectors are $96.8\%, −98.8\%$, and $0.4\%$.
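
As a cross-check, the hand-written formula can be compared with `cor` from the `Statistics` standard library. The sketch below repeats the first pair of vectors so that it runs on its own; the names `p0`, `q0`, and `rho0` are illustrative, and `mean` stands in for `avg`.

```julia
using LinearAlgebra, Statistics

# First pair of vectors from the cell above, repeated so the block is self-contained
p0 = [4.4, 9.4, 15.4, 12.4, 10.4, 1.4, -4.6, -5.6, -0.6, 7.4]
q0 = [6.2, 11.2, 14.2, 14.2, 8.2, 2.2, -3.8, -4.8, -1.8, 4.2]

p_tilde = p0 .- mean(p0)                 # de-meaned vectors
q_tilde = q0 .- mean(q0)
rho0 = (p_tilde' * q_tilde) / (norm(p_tilde) * norm(q_tilde))

rho0, cor(p0, q0)                        # the two values should agree up to rounding
```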

### 3.5 Complexity

Let’s check that the time to compute the correlation coefficient of two $n$-vectors is approximately linear in $n$.

In [None]:
x0 = randn(10^6); y0 = randn(10^6);
x1 = randn(10^7); y1 = randn(10^7);

# Time each call separately; written as a tuple, the first @time would wrap both calls
@time correl_coef(x0,y0);
@time correl_coef(x1,y1);
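
To turn the two timings into a single number, Base's `@elapsed` macro returns the elapsed seconds as a value, so the ratio can be computed directly. This is a rough sketch only: the function `correl_sketch` is a stand-in that repeats the formula of `correl_coef` with `Statistics.mean`, the sizes are arbitrary, and single timings are noisy, so the ratio will only be roughly 10.

```julia
using LinearAlgebra, Statistics

# Stand-in for correl_coef, using Statistics.mean in place of avg
function correl_sketch(a, b)
    a_tilde = a .- mean(a)
    b_tilde = b .- mean(b)
    return (a_tilde' * b_tilde) / (norm(a_tilde) * norm(b_tilde))
end

correl_sketch(randn(10), randn(10))          # warm-up call, so compilation is not timed

n_lo, n_hi = 10^5, 10^6                      # arbitrary sizes, a factor of 10 apart
xs, ys = randn(n_lo), randn(n_lo)
xl, yl = randn(n_hi), randn(n_hi)

t_lo = @elapsed correl_sketch(xs, ys)        # elapsed seconds for the smaller problem
t_hi = @elapsed correl_sketch(xl, yl)        # elapsed seconds for the larger problem

t_lo, t_hi, t_hi / t_lo                      # ratio near 10 suggests linear scaling
```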