# Intro to Optimal Transport and Wasserstein Distance


## Monge Problem

### Introduction
Suppose there is a pile of sand described by a measure $\mu$ and a hole in the ground of the same volume described by a measure $v$. A worker has to move the sand from the pile into the hole. He picks up sand at a point $x$ with a shovel and drops it at the position $y = T(x)$ in the hole, so each sand particle travels a distance $D(x, T(x))$ between its initial and final positions. It is reasonable to assume that the cost of the move grows with this distance. The worker does not just travel from $x$ to $y$, however: he also carries sand of mass $\mu(x)$. The cost of moving the sand at $x$ is therefore $$\mu(x) \cdot D(x, T(x))$$
One last thing we shouldn't forget is conservation of mass. Take a segment $B$ of the hole that the worker needs to fill. The sand that ends up in $B$ comes from the preimage of $B$ under $T$: $$ T^{-1}(B) = \{x \mid T(x) \in B\} $$
Suppose this preimage consists of three disjoint segments $A_1$, $A_2$, $A_3$ of the sand pile. Then the total mass of sand taken from these segments must equal the mass that the target measure assigns to $B$: $$\mu(A_1) + \mu(A_2) + \mu(A_3) = v(B)$$ This must hold for every segment $B$ of the hole. In mathematical terms: $$ \forall B, \; \mu(T^{-1}(B)) = v(B) $$ Or, simply, $v$ is the push-forward of $\mu$ by $T$: $$ T_{\sharp}\mu = v $$
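
The mass-conservation condition can be checked by hand on a small discrete example. The sketch below is only an illustration: the segment names, the masses, and the helper `push_forward` are hypothetical and not part of the original notes.

```python
from collections import defaultdict

def push_forward(mu, T):
    """Return the measure v with v(B) = mu(T^{-1}(B)) for a discrete mu."""
    v = defaultdict(float)
    for x, mass in mu.items():
        v[T[x]] += mass  # all the mass at x lands at T(x)
    return dict(v)

# Hypothetical pile: four segments and their masses (total mass 1).
mu = {"A1": 0.25, "A2": 0.125, "A3": 0.125, "A4": 0.5}
# Hypothetical map: A1, A2, A3 fill segment B of the hole, A4 fills B'.
T = {"A1": "B", "A2": "B", "A3": "B", "A4": "B'"}

v = push_forward(mu, T)

# mu(A1) + mu(A2) + mu(A3) must equal v(B).
preimage_of_B = [x for x in mu if T[x] == "B"]
mass_of_preimage = sum(mu[x] for x in preimage_of_B)
print(mass_of_preimage, v["B"])
```
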
The question is: which map $T$ satisfying $T_{\sharp}\mu = v$ minimizes the total cost $\int D(x,T(x)) \, \mu(dx)$?
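
For a small discrete instance the question can be answered by exhaustive search. The sketch below assumes $n$ source points and $n$ target points that each carry mass $1/n$, so every one-to-one assignment satisfies the mass constraint and a Monge map is simply a permutation. The function name `monge_brute_force`, the sample points, and the squared-distance cost are illustrative choices; the search is factorial in $n$ and is not meant as a practical solver.

```python
from itertools import permutations

def monge_brute_force(sources, targets, dist):
    """Try every one-to-one map T (a permutation) and keep the cheapest.

    Assumes equally many sources and targets, each with mass 1/n, so that
    every permutation satisfies the mass-conservation constraint.
    """
    n = len(sources)
    best_cost, best_map = float("inf"), None
    for perm in permutations(range(n)):
        # Discrete analogue of the integral of D(x, T(x)) against mu.
        cost = sum(dist(sources[i], targets[perm[i]]) for i in range(n)) / n
        if cost < best_cost:
            best_cost, best_map = cost, perm
    return best_cost, best_map

sources = [0.0, 1.0, 3.0]   # hypothetical sand positions
targets = [4.0, 2.0, 5.0]   # hypothetical hole positions
cost, T = monge_brute_force(sources, targets, lambda x, y: (x - y) ** 2)
print(T, cost)  # T[i] is the index of the target assigned to source i
```

Here the cheapest map sends 0 to 2, 1 to 4, and 3 to 5: it preserves the order of the points, which is what one expects on the line for a convex cost.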

### Formal Definition

Let $\Omega$ be a measurable space, $c : \Omega \times \Omega \rightarrow \mathbb{R}$ a cost function, and $\mu, v$ two probability measures in $\mathcal{P}(\Omega)$.


[Monge'81] problem: find a map $T : \Omega \rightarrow \Omega$ that attains
$$
\inf_{T_{\sharp}\mu = v} \int_{\Omega} c(x, T(x)) \, \mu(dx)
$$

[Brenier'87] If $\Omega = \mathbb{R}^d$, $c(x, y) = \|x - y\|^2$, and $\mu, v$ are absolutely continuous, then the optimal map is $T = \nabla u$ for some convex function $u$.

![OT between two probability distributions](OTplot.png)


[Brenier'87] For any convex $u$, $\nabla u$ is the OT Monge map between $\mu$ and $(\nabla u)_{\sharp} \mu$.
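
This statement can be sanity-checked on a one-dimensional discrete example. The sketch below assumes the convex potential $u(x) = x^2/2 + x^4/4$ and five arbitrary sample points; both are illustrative choices. It pushes the points through $\nabla u$ and then searches all pairings for the cheapest one under the squared-distance cost. The cheapest pairing should match each $x_i$ with its own image $\nabla u(x_i)$.

```python
from itertools import permutations

def grad_u(x):
    """Gradient of the convex potential u(x) = x**2 / 2 + x**4 / 4."""
    return x + x ** 3

xs = [-1.5, -0.2, 0.3, 1.0, 2.0]   # hypothetical support of mu
ys = [grad_u(x) for x in xs]       # support of the push-forward of mu

def total_cost(perm):
    """Squared-distance cost of pairing xs[i] with ys[perm[i]]."""
    return sum((xs[i] - ys[perm[i]]) ** 2 for i in range(len(xs)))

best = min(permutations(range(len(xs))), key=total_cost)
print(best)  # (0, 1, 2, 3, 4): each x_i is paired with grad_u(x_i)
```
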


### Monge-Ampere Equation

If $\Omega = \mathbb{R}^d$, $c(x, y) = \|x - y\|^2$, and $\mu, v$ have densities $p$, $q$, then for a smooth, injective map $T$ with Jacobian $J_T$ the constraint $T_{\sharp} \mu = v$ is equivalent to
$$
p(x) = q(T(x)) \, |\det J_T(x)|
$$

Monge-Ampere: substituting $T = \nabla f$, find convex $f$ such that
$$
\det \nabla^2 f(x) = \frac{p(x)}{q(\nabla f(x))}
$$
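
A quick numerical check of this equation in one dimension: between two Gaussians $N(m_1, s_1^2)$ and $N(m_2, s_2^2)$ the optimal map for the quadratic cost is the increasing affine map $T(x) = m_2 + \frac{s_2}{s_1}(x - m_1)$, which is the derivative of the convex function $f(x) = m_2 x + \frac{s_2}{2 s_1}(x - m_1)^2$. The sketch below assumes specific parameter values and test points, chosen only for illustration, and verifies that $f''(x) = p(x) / q(f'(x))$ at each point.

```python
import math

def gaussian_pdf(x, mean, std):
    """Density of N(mean, std**2) at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

# Hypothetical source and target parameters.
m1, s1 = 0.0, 1.0
m2, s2 = 3.0, 2.0

def T(x):
    """T = f' for f(x) = m2 * x + (s2 / (2 * s1)) * (x - m1) ** 2."""
    return m2 + (s2 / s1) * (x - m1)

hessian = s2 / s1  # f''(x), constant because T is affine

# Right-hand side p(x) / q(T(x)) at a few test points.
ratios = [gaussian_pdf(x, m1, s1) / gaussian_pdf(T(x), m2, s2)
          for x in [-2.0, -0.5, 0.0, 1.0, 2.5]]
print(hessian, ratios)
```
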

## Kantorovich Problem

### Introduction
Imagine a general tasked with moving soldiers from barracks to frontline positions. Each barrack $i$ supplies $\mu_i$ soldiers, and each frontline position $j$ demands $v_j$ soldiers. Unlike Monge’s rigid map $T(x)$, which sends all the mass at $x$ to a single destination, here the soldiers from one barrack can be split among several positions. As long as total supply equals total demand, a feasible plan therefore always exists.

To capture this, we introduce:

- **Cost matrix $C$:** $C_{ij}$ is the cost of moving one soldier from barrack $i$ to frontline $j$.

|   | Frontline 1 | Frontline 2 | Frontline 3 |
|---|-------------|-------------|-------------|
| Barrack 1 | $C_{11}$ | $C_{12}$ | $C_{13}$ |
| Barrack 2 | $C_{21}$ | $C_{22}$ | $C_{23}$ |
| Barrack 3 | $C_{31}$ | $C_{32}$ | $C_{33}$ |

- **Transportation plan $P$:** $P_{ij}$ is the number of soldiers sent from $i$ to $j$.  

Mass conservation requires:
$$ \sum_j P_{ij} = \mu_i \quad \forall i $$
$$ \sum_i P_{ij} = v_j \quad \forall j $$
$$ P_{ij} \geq 0 $$

The total cost is:
$$ \sum_{i,j} P_{ij} \cdot C_{ij} $$

**Kantorovich’s problem**: find $P$ that **minimizes** this cost, subject to supply–demand constraints.
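
For two barracks and two frontlines the problem can be solved by hand, which makes a compact illustration. The marginal constraints leave a single free parameter $t = P_{11}$, and since the cost is linear in $t$ the minimum sits at an endpoint of the feasible interval. The sketch below relies on that 2x2 shortcut and is not a general solver (larger instances are normally handed to a linear-programming routine). The supplies, demands, costs, and function names are all hypothetical.

```python
def plan_from_t(t, mu, v):
    """2x2 plan with P[0][0] = t; the marginal constraints fix the rest."""
    return [[t, mu[0] - t],
            [v[0] - t, mu[1] - v[0] + t]]

def is_feasible(P, mu, v):
    """Check row sums, column sums, and non-negativity."""
    rows_ok = all(sum(P[i]) == mu[i] for i in range(2))
    cols_ok = all(P[0][j] + P[1][j] == v[j] for j in range(2))
    nonneg = all(p >= 0 for row in P for p in row)
    return rows_ok and cols_ok and nonneg

def total_cost(P, C):
    """Sum of P[i][j] * C[i][j] over all pairs (i, j)."""
    return sum(P[i][j] * C[i][j] for i in range(2) for j in range(2))

def solve_2x2(mu, v, C):
    """Compare the two endpoints of the feasible interval for t."""
    lo = max(0, v[0] - mu[1])   # keeps P[1][1] >= 0
    hi = min(mu[0], v[0])       # keeps P[0][1] >= 0 and P[1][0] >= 0
    candidates = [plan_from_t(t, mu, v) for t in (lo, hi)]
    return min(candidates, key=lambda P: total_cost(P, C))

mu = [30, 20]            # soldiers available in each barrack
v = [10, 40]             # soldiers required at each frontline
C = [[1, 3], [2, 1]]     # cost per soldier from barrack i to frontline j

P = solve_2x2(mu, v, C)
print(P, total_cost(P, C))  # [[10, 20], [0, 20]] 90
```

In this instance no Monge map exists, because sending each barrack as a whole to a single frontline can never produce the demands 10 and 40. The optimal plan instead splits barrack 1 between the two frontlines.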
