# Optimal Transport for Image Recognition

Optimal transport is a special kind of network flow in which mass moves only from source nodes to target nodes. The goal of optimal transport is to find the minimum cost of moving the mass from the source nodes to the target nodes. We can represent an image as a matrix of pixels with color intensity and measure the similarity of a source image to a target image by the cost of moving the color from the pixels of the source to the pixels of the target (the lower the cost, the more similar the images). Develop an image recognition algorithm using optimal transport as a measure of similarity.

## Problem Statement

### Optimal Transport

A discrete **optimal transport** problem is a special kind of network flow where we designate some nodes as **sources** and others as **targets**, assign a **mass** to every node, and allow mass to flow from source nodes to target nodes only. The goal is to determine the minimum cost to transport the mass distribution of the source nodes to the mass distribution of the target nodes.
  
Let $\mathbf{p}_1,\dots,\mathbf{p}_{n_s} \in \mathbf{R}^N$ be a collection of source nodes with masses $m_{\mathbf{p}_1},\dots,m_{\mathbf{p}_{n_s}}$ such that

$$
\sum_{i=1}^{n_s} m_{\mathbf{p}_i} = 1
$$

Let $\mathbf{q}_1,\dots,\mathbf{q}_{n_t} \in \mathbf{R}^N$ be a collection of target nodes with masses $m_{\mathbf{q}_1},\dots,m_{\mathbf{q}_{n_t}}$ such that

$$
\sum_{j=1}^{n_t} m_{\mathbf{q}_j} = 1
$$

Let $c_{ij}$ be the cost to move one unit of mass from source node $\mathbf{p}_i$ to target node $\mathbf{q}_j$ and let $C = [c_{ij}]$ be the **cost matrix** of size $n_s \times n_t$. We define the cost $c_{ij}$ as the distance from $\mathbf{p}_i$ to $\mathbf{q}_j$.
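
As an illustration of the cost matrix defined above, here is a minimal sketch that assumes "distance" means Euclidean distance. The function name `cost_matrix` and the node coordinates are made up for the example; `scipy.spatial.distance.cdist` computes all pairwise distances.

```python
import numpy as np
from scipy.spatial.distance import cdist


def cost_matrix(sources, targets):
    """Return the n_s x n_t matrix of Euclidean distances between nodes."""
    sources = np.asarray(sources, dtype=float)
    targets = np.asarray(targets, dtype=float)
    return cdist(sources, targets, metric="euclidean")


# Three source nodes and two target nodes in the plane (made-up coordinates).
P = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
Q = np.array([[3.0, 4.0], [1.0, 1.0]])

C = cost_matrix(P, Q)
print(C.shape)  # (3, 2)
print(C[0, 0])  # 5.0, the distance from (0, 0) to (3, 4)
```

Row $i$ of `C` holds the costs from source node $i$ to every target node, matching the $n_s \times n_t$ layout of $C$.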

Let $x_{ij} \geq 0$ be the mass flow from source node $i$ to target node $j$ and let $X = [x_{ij}]$ be the **transport matrix** of size $n_s \times n_t$. The cost of a transport matrix is

$$
\sum_{i=1}^{n_s} \sum_{j=1}^{n_t} c_{ij} x_{ij}
$$

The constraints are given by the **balance equations** where the total flow out of source node $i$ is the mass $m_{\mathbf{p}_i}$ and the total flow into target node $j$ is the mass $m_{\mathbf{q}_j}$:

$$
\sum_{k = 1}^{n_t} x_{ik} = m_{\mathbf{p}_i}
\hspace{10mm}
\sum_{k = 1}^{n_s} x_{kj} = m_{\mathbf{q}_j}
$$

Represent the transport matrix as the flattened vector $\mathbf{x}$ of length $n_sn_t$ where

$$
\mathbf{x} = \begin{bmatrix} x_{1,1} \\ x_{1,2} \\ \vdots \\ x_{1,n_t} \\ x_{2,1} \\ \vdots \\ x_{n_s,n_t} \end{bmatrix}
$$

Let $\mathbf{b}$ be the mass vector of size $n_s + n_t$ where

$$
\mathbf{b} = \begin{bmatrix} m_{\mathbf{p}_1} \\ \vdots \\  m_{\mathbf{p}_{n_s}} \\ m_{\mathbf{q}_1} \\ \vdots \\ m_{\mathbf{q}_{n_t}} \end{bmatrix}
$$

Therefore we can write the balance equations in matrix notation $A \mathbf{x} = \mathbf{b}$ where $A$ is $(n_s + n_t) \times n_sn_t$. For example, the matrix $A$ for 3 source nodes and 2 target nodes is given by

$$
A = \begin{bmatrix}
1 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 1 \\
1 & 0 & 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 1 & 0 & 1 \end{bmatrix}
$$
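
The construction above can be sketched in code as follows. This is one possible approach, not a prescribed one: it builds $A$ with Kronecker products and passes the linear program to `scipy.optimize.linprog`. The function names `constraint_matrix` and `optimal_transport`, the example costs, and the example masses are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import linprog


def constraint_matrix(n_s, n_t):
    """Build A so that A @ x = b encodes the balance equations.

    x is the transport matrix X flattened row by row.
    """
    row_sums = np.kron(np.eye(n_s), np.ones((1, n_t)))  # flow out of each source
    col_sums = np.kron(np.ones((1, n_s)), np.eye(n_t))  # flow into each target
    return np.vstack([row_sums, col_sums])


def optimal_transport(C, m_p, m_q):
    """Minimize sum c_ij x_ij subject to A x = b and x >= 0.

    Returns the transport matrix X and its cost.
    """
    n_s, n_t = C.shape
    A = constraint_matrix(n_s, n_t)
    b = np.concatenate([m_p, m_q])
    result = linprog(C.ravel(), A_eq=A, b_eq=b, bounds=(0, None), method="highs")
    if not result.success:
        raise RuntimeError(result.message)
    return result.x.reshape(n_s, n_t), result.fun


# Reproduces the matrix for 3 source nodes and 2 target nodes.
A = constraint_matrix(3, 2).astype(int)
print(A)

# Made-up costs and masses for a small test problem.
C = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 3.0]])
m_p = np.array([0.5, 0.25, 0.25])
m_q = np.array([0.5, 0.5])

X, cost = optimal_transport(C, m_p, m_q)
# By hand: 0.5 * 1 + 0.25 * 1 + 0.25 * 3 = 1.5
print(np.round(X, 6))
print(round(cost, 6))
```

One of the $n_s + n_t$ balance equations is redundant because both mass vectors sum to 1. The sketch leaves the redundant row in place and relies on the solver to handle it.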

### Images

A (grayscale) digital image is a matrix $D = [d_{ij}]$ where $0 \leq d_{ij} \leq 1$ is the color intensity of the pixel in row $i$ and column $j$ (where 0 is black and 1 is white). Treat each pixel as a node whose mass is its intensity, rescaled so that the total mass of the image is 1 as required above. Define the cost to transport one unit of color from one pixel to another as the distance between the pixels.

Using the general description above, we can compute the cost of the optimal transport from a source image to a target image.
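
To make the link between images and the general problem concrete, here is a small sketch that turns an image into node coordinates and masses and builds the pixel-to-pixel cost matrix. It assumes Euclidean distance between pixel positions and rescales the intensities to sum to 1; the names `image_to_distribution` and `pixel_cost_matrix` and the 2 x 2 image are hypothetical.

```python
import numpy as np


def image_to_distribution(D):
    """Return pixel coordinates (row, column) and masses that sum to 1."""
    D = np.asarray(D, dtype=float)
    total = D.sum()
    if total <= 0:
        raise ValueError("image has no mass to transport")
    rows, cols = np.indices(D.shape)
    coords = np.column_stack([rows.ravel(), cols.ravel()]).astype(float)
    return coords, D.ravel() / total


def pixel_cost_matrix(coords_source, coords_target):
    """Euclidean distance between every source pixel and every target pixel."""
    diff = coords_source[:, None, :] - coords_target[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


# A made-up 2 x 2 grayscale image.
D = np.array([[0.0, 0.5],
              [1.0, 0.5]])

coords, masses = image_to_distribution(D)
C = pixel_cost_matrix(coords, coords)

print(masses)   # [0.   0.25 0.5  0.25]
print(C.shape)  # (4, 4)
```

Two images of the same size share the same pixel grid, so the cost matrix only needs to be computed once and can be reused for every pair of images.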

### Image Recognition

Determine the cost to transport one image to another by optimal transport and then classify images by computing the transportation cost relative to a set of fixed reference images.

## Data and Computations

* Part 1: Discrete Optimal Transport
  * Compute the cost matrix $C$ given a set $S$ of source nodes and a set $T$ of target nodes in 2D
  * Set up the constraint equations $A \mathbf{x} = \mathbf{b}$ for optimal transport
  * Compute the optimal transport matrix for any $S$ and $T$
  * Create a data visualization to display the source and target nodes and the transport matrix
* Part 2: OT for Images
  * Compute the cost matrix $C$ for a pair of $N \times M$ pixel images
  * Compute the optimal transport between images
* Part 3: Image Recognition
  * Load the digits dataset from sklearn
  * Choose a reference image for each digit 0 to 9
  * Given a random image from the dataset compute the transport cost to each of the reference images 
  * Classify digits in the sklearn digits dataset by choosing the reference image with the minimum optimal transport cost
  * Compare the OT classifier to a nearest neighbor classifier
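
The classification step in Part 3 could look roughly like the sketch below. It is only one way to organize the computation: the helper names, the use of `scipy.optimize.linprog`, the Euclidean pixel distance, and the choice of the first occurrence of each digit as its reference image are all assumptions. The toy example runs on 3 x 3 images; `run_digits_demo` is a hypothetical helper that needs scikit-learn and is defined but not called.

```python
import numpy as np
from scipy.optimize import linprog


def pixel_cost_matrix(shape):
    """Euclidean distance between every pair of pixel positions (row-major order)."""
    rows, cols = np.indices(shape)
    coords = np.column_stack([rows.ravel(), cols.ravel()]).astype(float)
    diff = coords[:, None, :] - coords[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def balance_matrix(n):
    """Constraint matrix A for n source pixels and n target pixels."""
    return np.vstack([np.kron(np.eye(n), np.ones((1, n))),
                      np.kron(np.ones((1, n)), np.eye(n))])


def ot_cost(source, target, C, A):
    """Minimum transport cost between two images, each rescaled to total mass 1."""
    m_p = source.ravel() / source.sum()
    m_q = target.ravel() / target.sum()
    b = np.concatenate([m_p, m_q])
    result = linprog(C.ravel(), A_eq=A, b_eq=b, bounds=(0, None), method="highs")
    if not result.success:
        raise RuntimeError(result.message)
    return result.fun


def classify(image, references, C, A):
    """Return the label of the reference image with the smallest transport cost."""
    costs = {label: ot_cost(image, ref, C, A) for label, ref in references.items()}
    return min(costs, key=costs.get)


# Toy 3 x 3 example: one bright pixel compared with a nearby and a distant one.
shape = (3, 3)
C = pixel_cost_matrix(shape)
A = balance_matrix(shape[0] * shape[1])


def one_pixel(row, col):
    image = np.zeros(shape)
    image[row, col] = 1.0
    return image


references = {"near": one_pixel(0, 1), "far": one_pixel(2, 2)}
query = one_pixel(0, 0)
print(classify(query, references, C, A))  # near


def run_digits_demo(n_test=20, seed=0):
    """Estimate accuracy on a few sklearn digits (requires scikit-learn)."""
    from sklearn.datasets import load_digits

    digits = load_digits()
    images, labels = digits.images.astype(float), digits.target
    # Assumption: the first occurrence of each digit serves as its reference.
    refs = {d: images[np.argmax(labels == d)] for d in range(10)}
    C_digits = pixel_cost_matrix(images[0].shape)
    A_digits = balance_matrix(images[0].size)
    rng = np.random.default_rng(seed)
    test_idx = rng.choice(len(images), size=n_test, replace=False)
    hits = sum(classify(images[i], refs, C_digits, A_digits) == labels[i]
               for i in test_idx)
    return hits / n_test
```

Calling `run_digits_demo()` solves one linear program per test image and reference image, so a small `n_test` keeps the run time manageable. The cost matrix and constraint matrix depend only on the image size, which is why they are built once and passed to every call.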

## References

* [Computational Optimal Transport (Gabriel Peyré and Marco Cuturi)](https://optimaltransport.github.io)
* [Optimal Transport for Machine Learning (Rémi Flamary)](https://remi.flamary.com/cours/tuto_otml.html)
* [Discrete Optimal Transport (Rémi Flamary)](https://remi.flamary.com/demos/transport.html)
* [Optimal Transport for Machine Learning (Gabriel Peyré)](https://www.youtube.com/watch?v=mITml5ZpqM8)
* [sklearn digits dataset](https://scikit-learn.org/stable/datasets/toy_dataset.html#digits-dataset)