---
title: 3.4 Distance and Nearest Neighbors
subject: Inner Products and Norms
subtitle: Howdy neighbor!
short_title: 3.4 Distance and Nearest Neighbors
authors:
  - name: Nikolai Matni
    affiliations:
      - Dept. of Electrical and Systems Engineering
      - University of Pennsylvania
    email: nmatni@seas.upenn.edu
license: CC-BY-4.0
keywords: Distance, Nearest Neighbors
math:
  '\vv': '\mathbf{#1}'
  '\bm': '\begin{bmatrix}'
  '\em': '\end{bmatrix}'
  '\R': '\mathbb{R}'
---

[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/nikolaimatni/ese-2030/HEAD?labpath=/02_Ch_3_Inner_Products_and_Norms/043-distance_nearest_neighbhors.ipynb)

{doc}`Lecture notes <../lecture_notes/Lecture 06 - Clustering and K-means.pdf>`

## Reading

Material related to this page, as well as additional exercises, can be found in VMLS 3.2.

## Learning Objectives

By the end of this page, you should know:
- What is the Euclidean distance between two vectors?
- What are the properties of a general distance function?

## The Euclidean Distance

A distance function, or metric, describes how far apart two points are.

A familiar starting point for our study of distances will be the Euclidean distance, which is closely related to the Euclidean norm on $\mathbb R^n$:

:::{prf:definition} The Euclidean Distance
:label: euclidean_distance_defn

For vectors $\vv u, \vv v \in \mathbb{R}^n$, the Euclidean distance is defined as the Euclidean norm of their difference $\vv u - \vv v$. In other words,

\begin{align*}
    \text{dist}(\vv u, \vv v) = \| \vv u - \vv v\| = \sqrt{\langle \vv u - \vv v, \vv u - \vv v \rangle}
\end{align*}
:::
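Since the Euclidean norm comes from the standard dot product on $\mathbb{R}^n$, the definition above can be written out coordinate by coordinate. Here $u_i$ and $v_i$ denote the entries of $\vv u$ and $\vv v$ (notation assumed for this expansion):

\begin{align*}
    \text{dist}(\vv u, \vv v) = \sqrt{\sum_{i=1}^n (u_i - v_i)^2} = \sqrt{(u_1 - v_1)^2 + \cdots + (u_n - v_n)^2}
\end{align*}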

Geometrically, the Euclidean distance is the length of the arrow drawn from one point to the other. The figure below shows this for two points labeled $\vv x$ and $\vv y$:

:::{figure}../figures/04-euc_dist.png
:label:Euclidean distance
:alt: Euclidean distance between 2 vectors $\vv x$ and $\vv y$
:width: 200px
:align: center
:::

````{exercise}  Euclidean distance
:label: distance-ex1

Find the Euclidean distance between $\bm 1\\ 2 \em$ and $\bm 3 \\ 4 \em$.

```{solution} distance-ex1
:class: dropdown

We have

\begin{align*}
    \text{dist}\left(\bm 1\\ 2 \em, \bm 3\\ 4 \em\right) = \left\| \bm 1\\ 2 \em - \bm 3\\ 4 \em \right\| = \sqrt{ (1 - 3)^2 + (2 - 4)^2} =\boxed{2\sqrt 2}
\end{align*}

```
````

```python
# Distance between vectors
import numpy as np

v1 = np.array([1, 2])
v2 = np.array([3, 4])
euc_dist = np.linalg.norm(v1 - v2)  # Euclidean norm of the difference
print("Euclidean distance:", euc_dist)
```

```
Euclidean distance: 2.8284271247461903
```


## General Distances

In this course, we will only work with the Euclidean distance. However, given any vector space equipped with a norm (e.g., $\mathbb{R}^n$ with the Euclidean norm), we may construct a distance function by taking the norm of the difference of two vectors. This leads us to a more general notion of distance:

:::{prf:definition} General Distances
:label: general_distance_defn

For a set $S$, a function $d : S \times S \to \mathbb R$ is a distance function, or metric, if it satisfies the following:

1. **Symmetry.** For all $x, y \in S$,

\begin{align*}
    d(x, y) = d(y, x)
\end{align*}

2. **Positivity.** For all $x, y \in S$, 

\begin{align*}
    d(x, y) \geq 0
\end{align*}
and $d(x, y) = 0$ if and only if $x = y$.

3. **Triangular Inequality.** For all $x, y, z \in S$,

\begin{align*}
    d(x, z) \leq d(x, y) + d(y, z)
\end{align*}

:::

Try to convince yourself that the [Euclidean distance](#euclidean_distance_defn) satisfies all three properties.
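As a sanity check (not a proof), the three properties can be tested numerically on random vectors. The sketch below assumes vectors in $\mathbb{R}^5$ drawn from a standard normal distribution. The helper `dist`, the seed, and the tolerance `tol` are illustrative choices, not part of the course material.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the check is repeatable

def dist(u, v):
    """Euclidean distance: the Euclidean norm of u - v."""
    return np.linalg.norm(u - v)

tol = 1e-12  # slack for floating-point rounding in the triangle inequality
for _ in range(1000):
    x, y, z = rng.standard_normal((3, 5))  # three random vectors in R^5
    assert np.isclose(dist(x, y), dist(y, x))            # symmetry
    assert dist(x, y) >= 0 and dist(x, x) == 0           # positivity
    assert dist(x, z) <= dist(x, y) + dist(y, z) + tol   # triangular inequality

print("all checks passed")
```

Passing these checks only shows that no counterexample was found among the sampled vectors. The actual argument follows from the properties of the norm.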

When the distance $\| \vv x - \vv y \|$ between two vectors $\vv x, \vv y \in V$ is small, we say they are "close." If the distance $\| \vv x - \vv y \|$ is large, we say they are "far." What counts as close or far is typically application dependent.

Note that one vector space can admit many distance functions. From here on, unless otherwise mentioned, we will only be considering the [Euclidean distance](#euclidean_distance_defn).

:::{prf:example} Matrix norms and their induced distances
:label:distance-matrix_norm_ex

Let $M \in \mathbb{R}^{n \times n}$ be a symmetric matrix such that $\vv x^T M \vv x > 0$ for all nonzero $\vv x\in \mathbb{R}^n$. Such a matrix is called *positive definite*. Equivalently, a symmetric matrix is positive definite if it can be written as $A^TA$ for some invertible square matrix $A$, or if all of its eigenvalues are *strictly positive*. A familiar positive definite matrix is the identity matrix, $I_n$.

Then, $M$ induces an [inner product](#inner_defn) given by $\langle \vv u, \vv v \rangle_M = \vv u^T M \vv v$. In the case that $M$ is diagonal, this is the [weighted dot product](#weighted-dot-product-ex) we have seen before. Try for yourselves to verify that $\langle \vv u, \vv v \rangle_M$ indeed satisfies all axioms of an inner product.

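The inner product in turn induces a norm $\|\vv v\|_M = \sqrt{\langle \vv v, \vv v\rangle_M} = \sqrt{\vv v^T M \vv v}$, and this norm induces the distance $\text{dist}_M(\vv u, \vv v) = \|\vv u - \vv v\|_M = \sqrt{(\vv u - \vv v)^T M (\vv u - \vv v)}$. When $M = I_n$, this is the Euclidean distance.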

:::
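The example above can be sketched in NumPy. The matrix `A`, the vectors `u` and `v`, and the helper names `inner_M`, `norm_M`, and `dist_M` are hypothetical choices for illustration. The distance is taken to be the norm of the difference, following the construction described earlier in this section.

```python
import numpy as np

# Hypothetical invertible matrix (det = 2); M = A^T A is then positive definite.
A = np.array([[2.0, 1.0],
              [0.0, 1.0]])
M = A.T @ A

def inner_M(u, v):
    """Inner product <u, v>_M = u^T M v."""
    return u @ M @ v

def norm_M(v):
    """Norm induced by the M-inner product."""
    return np.sqrt(inner_M(v, v))

def dist_M(u, v):
    """Distance induced by the M-norm: the M-norm of u - v."""
    return norm_M(u - v)

u = np.array([1.0, 2.0])
v = np.array([3.0, 4.0])

print(np.linalg.eigvalsh(M))          # both eigenvalues are strictly positive
print(dist_M(u, v))                   # distance under M
print(np.linalg.norm(A @ (u - v)))    # same value, since v^T A^T A v = ||A v||^2
```

With this choice of `M`, the two vectors are at distance $\sqrt{40} \approx 6.32$, compared with $2\sqrt 2 \approx 2.83$ under the Euclidean distance, so the same pair of points can be nearer or farther depending on the distance function used.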

:::{prf:example} Distances on a connected graph
:label:distance-connected_graph_ex

In this example, we'll demonstrate how the [definition of a distance function](#general_distance_defn) can be satisfied when the underlying space isn't a vector space. We will consider the *shortest walk distance on a connected undirected graph*.

An undirected graph consists of a set of *vertices* and a set of *edges*, each of which connects two vertices. Undirected graphs are often drawn with vertices as dots and edges as lines connecting two dots, as in the image below. (For our purposes, we will assume that each pair of vertices has at most one edge connecting them, and that no vertex has an edge to itself.)

![An example of a connected undirected graph](../figures/04-graph_distance.png)

A *walk* in an undirected graph is a sequence of vertices $v_1, v_2, \ldots, v_{k - 1}, v_k$ such that there is an edge between adjacent vertices in this sequence. The number of edges in the walk is its *length*. In the image, for example, $3 \to 5\to 6\to 10$ is a walk with length 3.

We say an undirected graph is *connected* if there is at least one walk between every pair of vertices. The above graph is connected.

If a graph is connected, then we can define the *shortest walk distance* as follows. For vertices $u, v$ in our graph, their shortest walk distance is defined as

\begin{align*}
    d(u, v) = \text{length of the shortest walk starting at $u$ and ending at $v$}
\end{align*}

We will verify that the shortest walk distance indeed satisfies the three axioms of a distance function.

1. **Symmetry.** For any two vertices $u, v$, let $P = u \to ... \to v$ be a minimum length walk (with length $l$) starting at $u$ and ending at $v$. If we reverse $P$, we get a walk starting at $v$ and ending at $u$ with length $l$. This means that $d(v, u) \leq l = d(u, v)$.

    Next, let $Q = v\to ...\to u$ be a minimum length walk starting at $v$ and ending at $u$ (with length $l'$). If we reverse $Q$, we get a walk starting at $u$ and ending at $v$ with length $l'$. This means that $d(u, v) \leq l' = d(v, u)$.

    Taken together, these two inequalities imply that $d(u, v) = d(v, u)$, i.e., the shortest walk distance is symmetric.

2. **Positivity.** For vertices $u \neq v$, it will take at least one edge to go from $u$ to $v$, implying that $d(u, v) > 0$ if $u \neq v$. Also, $d(v, v) = 0$ because we can take the trivial walk $P = v$, which has no edges. 

3. **Triangular Inequality.** For vertices $u, v, w$, we want to show that 

    \begin{align*}
    d(u, w) \leq d(u, v) + d(v, w)
    \end{align*}

    Note that if $P_{uv}$ is a walk of length $d(u, v)$ from $u$ to $v$, and $P_{vw}$ is a walk of length $d(v, w)$ from $v$ to $w$, then we can concatenate $P_{uv} \to P_{vw}$ to get a walk starting from $u$ which ends at $w$ and has length $d(u, v) + d(v, w)$. Since the shortest walk from $u$ to $w$ can't be longer than this walk we just constructed, this implies that $d(u, w) \leq d(u, v) + d(v, w)$, i.e., the triangular inequality holds.


:::
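The shortest walk distance can also be computed and checked by machine. The sketch below uses breadth-first search, a standard way to find shortest walks in a graph whose edges all count as length 1. It is not discussed in the text above. The graph is a small hypothetical one (not the graph in the figure), and the names `graph` and `walk_distance` are illustrative.

```python
from collections import deque
from itertools import product

# Hypothetical connected undirected graph, stored as an adjacency list.
graph = {
    1: [2, 3],
    2: [1, 4],
    3: [1, 4],
    4: [2, 3, 5],
    5: [4],
}

def walk_distance(graph, source):
    """Breadth-first search: shortest walk length from source to every vertex."""
    dist = {source: 0}            # the trivial walk has no edges
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in graph[u]:
            if w not in dist:     # the first visit is along a shortest walk
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

# d[u][v] is the shortest walk distance between u and v.
d = {u: walk_distance(graph, u) for u in graph}

# Check the three axioms on every triple of vertices.
for u, v, w in product(graph, repeat=3):
    assert d[u][v] == d[v][u]                 # symmetry
    assert (d[u][v] == 0) == (u == v)         # positivity
    assert d[u][w] <= d[u][v] + d[v][w]       # triangular inequality

print(d[1][5])  # 3, e.g. via the walk 1 -> 2 -> 4 -> 5
```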

## Applications of Distances

:::{prf:example} Feature distances
:label:distance-feature_distance

If $\vv x, \vv y \in V$ are vectors containing *features* of two objects, $\|\vv  x - \vv  y\|$ is called the *feature distance*. It gives a measure of how "different" two objects are. 

For example, suppose each vector represents a patient in a hospital with entries such as age, weight, height, and test results. We can use $\| \vv x - \vv y\|$ to check if patients $\vv x$ and $\vv y$ are "close" to each other with respect to these features.
:::
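A minimal sketch of the patient example follows. The three feature vectors are made-up numbers chosen only for illustration, with entries assumed to be age in years, weight in kilograms, and height in centimeters. The helper name `feature_distance` is likewise hypothetical.

```python
import numpy as np

# Hypothetical feature vectors: [age (years), weight (kg), height (cm)].
patient_a = np.array([34.0, 70.0, 175.0])
patient_b = np.array([36.0, 72.0, 173.0])
patient_c = np.array([67.0, 95.0, 160.0])

def feature_distance(x, y):
    """Euclidean distance between two feature vectors."""
    return np.linalg.norm(x - y)

print(feature_distance(patient_a, patient_b))  # small: similar patients
print(feature_distance(patient_a, patient_c))  # large: dissimilar patients
```

With these numbers, `patient_a` is far closer to `patient_b` than to `patient_c`, matching the intuition that the first two patients have similar features.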

:::{prf:example} Nearest neighbors
:label:distance-nearest_neighbors

Suppose we are given a collection $\vv {z_1}, ..., \vv {z_m} \in V$ of $m$ vectors living in a vector space $V$. We say that $\vv{z_j}$ is the *nearest neighbor* of $\vv {x}$ among the vectors $\vv {z_1}, ..., \vv {z_m} \in V$ if 

\begin{align*}
    \| \vv x - \vv{z_j} \| \leq \| \vv x - \vv{z_i} \| \quad\text{for } i = 1, \ldots, m
\end{align*}

In words, this means $\vv{z_j}$ is the closest vector to $\vv x$ among $\vv{z_1}, ..., \vv{z_m}$. This is illustrated below. Note that the nearest neighbor may not be unique: several of the $\vv{z_j}$ may satisfy the condition above.

:::{figure}../figures/04-nearest_neighbor.png
:label:Nearest neighbor
:alt: Nearest neighbor to a vector $\vv x \in \mathbb{R}^2$
:width: 400px
:align: center
:::
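A nearest-neighbor search under the Euclidean distance can be sketched in a few lines of NumPy. The function name `nearest_neighbor`, the candidate vectors stored as rows of `Z`, and the query point `x` are hypothetical. Indices are 0-based here, whereas the text numbers the vectors from 1.

```python
import numpy as np

def nearest_neighbor(x, Z):
    """Return (j, distance) for the row of Z closest to x in Euclidean distance."""
    dists = np.linalg.norm(Z - x, axis=1)  # ||x - z_i|| for every row z_i
    j = int(np.argmin(dists))              # on ties, argmin picks the first minimizer
    return j, dists[j]

# Hypothetical collection z_1, ..., z_4 in R^2, one vector per row.
Z = np.array([[2.0, 1.0],
              [3.0, 2.0],
              [1.0, 3.0],
              [1.0, 0.0]])
x = np.array([1.0, 1.0])

j, dj = nearest_neighbor(x, Z)
print(j, dj)

# All minimizers, showing that the nearest neighbor need not be unique.
dists = np.linalg.norm(Z - x, axis=1)
print(np.flatnonzero(np.isclose(dists, dists.min())))
```

In this example the first and last rows of `Z` are both at distance 1 from `x`, so either one qualifies as a nearest neighbor. The function simply reports the first.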
