# Demo: Vectorize pairwise similarities

A demo of the article "How to vectorize pairwise (dis)similarity metrics".

> A straightforward pattern for vectorizing metrics like L1 distance and Intersection over Union for all pairs of points.

Taken from: https://towardsdatascience.com/how-to-vectorize-pairwise-dis-similarity-metrics-5d522715fb4e

In [1]:
import numpy as np

In [2]:
X = np.array([[0, 0, 0, 10, 1], [0, 1, 0, 0, 1], [1, 1, 0, 20, 1], [1, 0, 0, 1, 1]])
X

array([[ 0,  0,  0, 10,  1],
       [ 0,  1,  0,  0,  1],
       [ 1,  1,  0, 20,  1],
       [ 1,  0,  0,  1,  1]])

In [3]:
X[:, None, :].shape

(4, 1, 5)
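
Inserting `None` adds a length-1 axis, and that extra axis is what lets NumPy broadcasting pair every row with every other row. The following is a minimal sketch of the shapes involved; the names `rows`, `cols`, and `pair_shape` are illustrative and not part of the original notebook.

```python
import numpy as np

X = np.array([[0, 0, 0, 10, 1], [0, 1, 0, 0, 1], [1, 1, 0, 20, 1], [1, 0, 0, 1, 1]])

rows = X[:, None, :]  # shape (4, 1, 5): the points run along axis 0
cols = X[None, :, :]  # shape (1, 4, 5): the same points run along axis 1

# Broadcasting stretches each length-1 axis to match the other operand,
# so any elementwise operation on rows and cols yields shape (4, 4, 5).
pair_shape = np.broadcast(rows, cols).shape
print(rows.shape, cols.shape, pair_shape)
```

Entry `[i, j, :]` of any such broadcast result combines point `i` with point `j`, one value per dimension.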

## Experiment 1

Manhattan (L1) distance between every pair of 5-dimensional points (the rows of `X`).

### Expand

In [4]:
deltas = X[:, None, :] - X[None, :, :]
deltas.shape

(4, 4, 5)

In [5]:
deltas

array([[[  0,   0,   0,   0,   0],
        [  0,  -1,   0,  10,   0],
        [ -1,  -1,   0, -10,   0],
        [ -1,   0,   0,   9,   0]],

       [[  0,   1,   0, -10,   0],
        [  0,   0,   0,   0,   0],
        [ -1,   0,   0, -20,   0],
        [ -1,   1,   0,  -1,   0]],

       [[  1,   1,   0,  10,   0],
        [  1,   0,   0,  20,   0],
        [  0,   0,   0,   0,   0],
        [  0,   1,   0,  19,   0]],

       [[  1,   0,   0,  -9,   0],
        [  1,  -1,   0,   1,   0],
        [  0,  -1,   0, -19,   0],
        [  0,   0,   0,   0,   0]]])

### Reduce

In [6]:
abs_deltas = np.abs(deltas)
abs_deltas.sum(axis=-1)

array([[ 0, 11, 12, 10],
       [11,  0, 21,  3],
       [12, 21,  0, 20],
       [10,  3, 20,  0]])
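
As a sanity check, the vectorized result can be compared against an explicit double loop over all pairs. This is a sketch for verification only; the names `vectorized`, `looped`, and `matches` are introduced here and do not appear in the original notebook.

```python
import numpy as np

X = np.array([[0, 0, 0, 10, 1], [0, 1, 0, 0, 1], [1, 1, 0, 20, 1], [1, 0, 0, 1, 1]])

# Expand, then reduce: the same two steps as in the cells above.
vectorized = np.abs(X[:, None, :] - X[None, :, :]).sum(axis=-1)

# Reference implementation: one pair of points at a time.
n = X.shape[0]
looped = np.zeros((n, n), dtype=X.dtype)
for i in range(n):
    for j in range(n):
        looped[i, j] = np.abs(X[i] - X[j]).sum()

matches = np.array_equal(vectorized, looped)
print(matches)
```

The loop performs the same arithmetic, but the vectorized version materializes the full `(n, n, d)` array of deltas, so it trades memory for speed.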

## Experiment 2

Number of positions at which each pair of 5-dimensional points holds the same value.

### Expand

In [7]:
identity = X[:, None, :] == X[None, :, :]
identity

array([[[ True,  True,  True,  True,  True],
        [ True, False,  True, False,  True],
        [False, False,  True, False,  True],
        [False,  True,  True, False,  True]],

       [[ True, False,  True, False,  True],
        [ True,  True,  True,  True,  True],
        [False,  True,  True, False,  True],
        [False, False,  True, False,  True]],

       [[False, False,  True, False,  True],
        [False,  True,  True, False,  True],
        [ True,  True,  True,  True,  True],
        [ True, False,  True, False,  True]],

       [[False,  True,  True, False,  True],
        [False, False,  True, False,  True],
        [ True, False,  True, False,  True],
        [ True,  True,  True,  True,  True]]])

### Reduce

In [8]:
identity.sum(axis=-1)

array([[5, 3, 2, 3],
       [3, 5, 3, 2],
       [2, 3, 5, 3],
       [3, 2, 3, 5]])
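
The introduction also mentions Intersection over Union, which this notebook does not demonstrate. The sketch below applies the same expand-and-reduce pattern to one possible reading of IoU: the Jaccard index over binary membership vectors. Treating the nonzero entries of `X` as set membership (the array `B`) is an assumption made purely for illustration, and the source article may define IoU differently, for example over bounding boxes.

```python
import numpy as np

X = np.array([[0, 0, 0, 10, 1], [0, 1, 0, 0, 1], [1, 1, 0, 20, 1], [1, 0, 0, 1, 1]])

# Assumption: a nonzero entry means "this position belongs to the set".
B = X > 0

# Expand: pair every row with every other row, shape (4, 4, 5).
both = B[:, None, :] & B[None, :, :]    # positions present in both sets
either = B[:, None, :] | B[None, :, :]  # positions present in at least one set

# Reduce: count along the last axis, then divide.
# Every row of B has at least one True entry, so the union is never zero here.
iou = both.sum(axis=-1) / either.sum(axis=-1)
print(iou)
```

For inputs that may contain empty sets, the division would need a guard against a zero union.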