# All pairs shortest path
Given a graph with vertices and weighted edges, we wish to find the path with the smallest total weight from $u$ to $v$ for all pairs of $u,v$ in the graph.

We can run the [single source algorithm](./graph_algorithms.ipynb#Single-source-shortest-path) iteratively, using every vertex in turn as the source.
This gives the following complexities, where $n$ is the number of vertices and $m$ is the number of edges:

| Type of graph | Single source algorithm | Complexity |
| --- | --- | --- |
| Unweighted | BFS | $$O(n(n+m))$$
| Non-negative acyclic | One-pass Bellman Ford | $$O(n(n+m))$$
| Non-negative weight | Dijkstra | $$O(n(n \log n + m))$$
| General graph | Bellman Ford | $$O(n^2 m)$$

## Reweighting
Notice that when we introduce negative weights, our complexity increases by a significant amount.
What if we tried to remove the negative weights by transforming the graph?

One option is to simply add some constant to all the edges.
But this is clearly wrong: a path with $k$ edges grows by $k$ times the constant, so paths with many edges are penalised more than paths with few edges, and the shortest path may change.
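
As a small illustration, the sketch below uses a hypothetical four-vertex graph (the vertex names and weights are made up for this example) and compares two routes from `s` to `t` before and after adding a constant to every edge.

```python
# Hypothetical example graph: two routes from "s" to "t".
edges = {("s", "a"): -1, ("a", "b"): -1, ("b", "t"): -1, ("s", "t"): -2}


def path_weight(path, weights):
    """Sum the weights of the consecutive edges along a path."""
    return sum(weights[(u, v)] for u, v in zip(path, path[1:]))


long_path = ["s", "a", "b", "t"]   # three edges
short_path = ["s", "t"]            # one edge

# Shift every edge by the same constant so that no weight is negative.
shift = -min(edges.values())
shifted = {edge: w + shift for edge, w in edges.items()}

# Originally the three-edge route is cheaper: (-3, -2).
original = (path_weight(long_path, edges), path_weight(short_path, edges))
# After the shift the one-edge route is cheaper: (3, 0).
after_shift = (path_weight(long_path, shifted), path_weight(short_path, shifted))
```

The three-edge route pays the shift three times while the direct edge pays it once, so the ordering of the two routes flips.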

<span hidden> TODO, add graph example</span>

### Johnson's Algorithm
From the previous observation, it becomes clear that we need to reweight the graph in such a way that the total weight of any path from $u$ to $v$ changes by the same amount.

Suppose that we pick some function $f$ on the vertices and, for every edge with weight $w(u,v)$, we produce a new weight $w'(u,v)$ as below:
$$
w'(u,v) = w(u,v) + f(u) - f(v)
$$

Now consider any path $p = s, v_1, \dots, v_k, t$ with original total weight $w(p)$. The reweighted path has the following total weight:

$$
\begin{align}
& w'(s, v_1) + w'(v_1, v_2) + \dots + w'(v_k, t) \\
& = w(s, v_1) + w(v_1, v_2) + \dots + w(v_k, t) + f(s) - f(v_1) + f(v_1) - f(v_2) + \dots + f(v_k) - f(t) \\
& = w(s, v_1) + w(v_1, v_2) + \dots + w(v_k, t) + f(s) - f(t) \\
& = w(p) + f(s) - f(t)
\end{align}
$$

Thus, every path from $s$ to $t$ is modified by the same amount, $f(s) - f(t)$, so a shortest path under $w'$ is also a shortest path under $w$, and $dist'(s,t) = dist(s,t) + f(s) - f(t)$.
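
The telescoping argument can be checked numerically. The following is a minimal sketch with made-up weights and an arbitrary, hypothetical potential function `f`; it only confirms that two different paths from `s` to `t` shift by the same amount.

```python
# Hypothetical edge weights and an arbitrary potential f (any values work).
w = {("s", "a"): 4, ("a", "b"): -3, ("b", "t"): 2, ("s", "t"): 5}
f = {"s": 7, "a": -1, "b": 3, "t": 10}


def reweight(weights, potential):
    """Apply w'(u, v) = w(u, v) + f(u) - f(v) to every edge."""
    return {(u, v): wt + potential[u] - potential[v]
            for (u, v), wt in weights.items()}


def path_weight(path, weights):
    """Sum the weights of the consecutive edges along a path."""
    return sum(weights[(u, v)] for u, v in zip(path, path[1:]))


w_prime = reweight(w, f)

# Change in total weight for each path; both should equal f(s) - f(t).
deltas = [path_weight(p, w_prime) - path_weight(p, w)
          for p in (["s", "a", "b", "t"], ["s", "t"])]
```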

Hence, it follows that we need to find a function $f$ such that every reweighted edge weight $w'(u, v)$ is non-negative.

However, it is not yet clear whether such a function $f$ exists.

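Suppose that the graph has no negative cycles and has some vertex $x$ that can reach every other vertex.
Notice that for every edge $(s,t)$, $dist(x, t) \leq dist(x, s) + w(s, t)$, because going to $s$ and then taking the edge is one possible way of reaching $t$.
Rearranging, we get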
$$
w(s,t) + dist(x,s) - dist(x,t) \geq 0
$$
which is what we desired.
Hence, we can use $f(y) = dist(x,y)$ as our function.

However, the graph may not have a source vertex that can reach every other vertex.
In that case some $dist(x,y)$ would be $\infty$, and we do not wish to add or subtract $\infty$, as that may make some of our calculations undefined.


Instead, we can add a new vertex $s$ to the graph, and add an edge $(s,v)$ with weight 0 for every $v$ in $V$.
Now, we have our source as desired.
And notice that because we didn't add any edges **to** $s$, only edges **away** from $s$, we didn't create any new path between the vertices of the old graph.

#### Pseudo code
1. Add source vertex $s$
2. Add 0 weight edges from $s$ to all $v$
3. Call Bellman Ford to determine the distance of $s$ to all $v$
    * If a negative cycle is found, fail gracefully
4. Reweight all the edges to $w(u,v) + dist(s, u) - dist(s, v)$
5. For each vertex $u$, run Dijkstra with it as the source to get $dist'(u,v)$
6. For each pair of vertices $u,v$, recover the original shortest distance by $dist(u,v) = dist'(u,v) - dist(s,u) + dist(s,v)$
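
The steps above can be sketched in Python as follows. This is an illustrative version under a few assumptions that are not in the notes: vertices are numbered `0..n-1`, edges are given as `(u, v, w)` tuples, the function names are made up, and "fail gracefully" is interpreted as returning `None`. It also uses the binary heap from `heapq`, so each Dijkstra call costs $O((n + m) \log n)$ rather than the $O(n \log n + m)$ bound, which needs a Fibonacci heap.

```python
import heapq

INF = float("inf")


def bellman_ford(n, edges, source):
    """Distances from source, or None if a negative cycle is reachable."""
    dist = [INF] * n
    dist[source] = 0
    for _ in range(n - 1):
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
    # One more pass: any further improvement means a negative cycle.
    for u, v, w in edges:
        if dist[u] + w < dist[v]:
            return None
    return dist


def dijkstra(n, adj, source):
    """Distances from source; adj[u] is a list of (v, non-negative weight)."""
    dist = [INF] * n
    dist[source] = 0
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in adj[u]:
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(heap, (dist[v], v))
    return dist


def johnson(n, edges):
    """All-pairs distances as an n x n list of lists, or None on a negative cycle."""
    s = n  # steps 1-2: extra source with 0-weight edges to every vertex
    augmented = edges + [(s, v, 0) for v in range(n)]
    h = bellman_ford(n + 1, augmented, s)  # step 3
    if h is None:
        return None
    adj = [[] for _ in range(n)]
    for u, v, w in edges:  # step 4: reweight, dropping the extra source
        adj[u].append((v, w + h[u] - h[v]))
    result = []
    for u in range(n):  # step 5: one Dijkstra per source
        d_prime = dijkstra(n, adj, u)
        # step 6: undo the reweighting (inf stays inf)
        result.append([d_prime[v] - h[u] + h[v] for v in range(n)])
    return result
```

For example, `johnson(4, [(0, 1, 3), (1, 2, -2), (2, 3, 2), (0, 2, 4), (3, 1, 1)])` returns one row of distances per source vertex, with `float("inf")` for unreachable pairs.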

#### Time complexity
The algorithm requires:
* One call to Bellman Ford
    * $O(nm)$
* Reweighting
    * $O(m)$
* $n$ calls to Dijkstra
    * $O(n(n \log n + m))$

Hence, the overall complexity is $O(n(n \log n + m))$, which is much better than running Bellman Ford $n$ times.

## DP APSP
### Straight forward DP
A simpler but less efficient approach to APSP uses dynamic programming.

It is defined by the following recurrence:
$$
dist(u,v,l) = 
\begin{aligned}
\begin{cases}
0 \quad &\text{if }l = 0\text{ and }u = v \\
\infty \quad &\text{if }l = 0\text{ and }u \neq v \\ 
\min(dist(u,v,l-1), \min_{x \to v}(dist(u, x, l-1) + w(x, v)))
\quad &\text{otherwise}
\end{cases}
\end{aligned}
$$

Simply put, $dist(u,v,l)$ is the weight of the shortest path from $u$ to $v$ using at most $l$ edges.
Thus, if $l$ is 0, the base case is clear.

When $l > 0$, either the shortest path already uses at most $l-1$ edges, or its last edge is some edge $x \to v$, preceded by a shortest path from $u$ to $x$ using at most $l-1$ edges.

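Assuming there are no negative cycles, a shortest path uses at most $n-1$ edges, so $dist(u,v,n-1)$ is the answer.
There are $n$ sources $u$ and $n-1$ values of $l$, and each combination scans every edge once, so the complexity is $O(n^2m)$.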
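
A bottom-up version of this recurrence might look like the sketch below. It assumes vertices numbered `0..n-1`, edges as `(x, v, w)` tuples, and no negative cycles; the function name `apsp_dp` is made up for illustration. Only the table for the previous value of $l$ is kept.

```python
def apsp_dp(n, edges):
    """All-pairs distances via the 'at most l edges' recurrence."""
    inf = float("inf")
    # Base case l = 0: distance 0 to yourself, infinity to everyone else.
    dist = [[0 if u == v else inf for v in range(n)] for u in range(n)]
    for _ in range(n - 1):  # l = 1, ..., n - 1
        # Start from dist(u, v, l - 1), the "just reduce l" option.
        new_dist = [row[:] for row in dist]
        for u in range(n):
            for x, v, w in edges:
                # Option: reach x within l - 1 edges, then take edge x -> v.
                if dist[u][x] + w < new_dist[u][v]:
                    new_dist[u][v] = dist[u][x] + w
        dist = new_dist
    return dist
```

The three nested loops (values of $l$, sources, edges) match the $O(n^2 m)$ count, plus $O(n^2)$ per round for copying the table.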

#### Divide and conquer
Previously, we assumed that $x$ is the vertex immediately before $v$ on the path from $u$ to $v$.
However, if we instead consider $x$ as the middle vertex of the path, we have the following recurrence:
$$
\begin{aligned}
dist(u, v, 2^k) &= \min_{x \in V} (dist(u, x, 2^{k-1}) + dist(x, v, 2^{k-1})) \\
dist(u, v, 2^0) &= w(u,v)
\end{aligned}
$$

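Here we take $w(u,u) = 0$, and $w(u,v) = \infty$ when there is no edge from $u$ to $v$.
Then by setting $k = \lceil \log_2 n \rceil$, we would get our shortest path, since $2^k \geq n - 1$ edges are enough for any shortest path.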

### Floyd-Warshall
For Floyd-Warshall, we instead label the vertices $1, \dots, n$ and consider the shortest path from $u \to v$ that uses only vertices $1, \dots, r$ as intermediate vertices.

Thus, we get the following recursion:
$$
dist(u, v, r) = 
\begin{aligned}
\begin{cases} 
w(u,v)   \quad & \text{if } r = 0 \\
\min \left(dist(u, v, r-1), dist(u, r, r-1) + dist(r, v, r-1) \right) \quad &\text{otherwise}
\end{cases}
\end{aligned}
$$

In other words, when $r = 0$, the shortest path is simply the direct edge from $u \to v$ because we cannot use any other vertices.
And when we can use vertices up to $r$, the answer is the smaller of two options:
the shortest $u \to v$ distance that does not use $r$, $dist(u, v, r-1)$,
and the shortest distance that passes through $r$, $dist(u, r, r-1) + dist(r, v, r-1)$.

There are $n^2$ pairs $(u,v)$ and $n$ values of $r$, each taking constant time, which gives us a simple runtime of $O(n^3)$.
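
The recursion translates almost directly into three nested loops. The sketch below makes some assumptions that are not in the notes: vertices are numbered `0..n-1` instead of `1..n`, edges are `(u, v, w)` tuples, the distance from a vertex to itself starts at 0, and there are no negative cycles. It also updates a single table in place, a common space-saving variant, rather than keeping one table per value of $r$.

```python
def floyd_warshall(n, edges):
    """All-pairs distances; dist[u][v] is inf when v is unreachable from u."""
    inf = float("inf")
    # r = 0 case: only direct edges (and the empty path from u to u).
    dist = [[0 if u == v else inf for v in range(n)] for u in range(n)]
    for u, v, w in edges:
        dist[u][v] = min(dist[u][v], w)  # keep the lightest parallel edge
    for r in range(n):  # allow r as an intermediate vertex
        for u in range(n):
            for v in range(n):
                through_r = dist[u][r] + dist[r][v]
                if through_r < dist[u][v]:
                    dist[u][v] = through_r
    return dist
```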

## Matrix multiplication

Consider two $n \times n$ matrices $A$ and $B$.
Then entry $(i,j)$ of the matrix product $AB$ is
$$
(AB)_{ij} = A_{i1}B_{1j} + A_{i2}B_{2j} + \dots + A_{in} B_{nj}
$$


Suppose that $M$ is a [weighted adjacency matrix](TODO).

And suppose that we define a new operation, $\circ$, entry by entry:

$$
(M \circ M)_{ij} = \min (M_{i1} + M_{1j}, M_{i2} + M_{2j}, \dots, M_{in} + M_{nj})
$$

Notice that $(M \circ M)_{ij}$ is the weight of the shortest path from $i \to j$ using at most 2 edges, provided that the diagonal entries $M_{ii}$ are 0.

Also, notice that this is very similar to matrix product, $M \times M$, but multiplication is replaced by addition, and addition is replaced by minimum.

Hence, we can also derive our shortest path as simply
$$
M \circ M \circ M \dots \circ M \qquad n - 1\text{ times}
$$

Once again, notice the similarity to [matrix exponentiation](TODO) $M^{n-1}$.

Thus, we can use a similar repeated squaring approach to perform the exponentiation in $O(\log n)$ $\circ$ operations.
And since each $\circ$ operation takes $O(n^3)$ time, we obtain a runtime of $O(n^3 \log n)$.
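
A minimal sketch of the $\circ$ operation and repeated squaring is shown below. It assumes that $M$ is given as a list of lists with 0 on the diagonal and `float("inf")` where there is no edge, and that the graph has no negative cycles; the function names are made up for illustration. Because every entry means "at most this many edges", squaring past $n - 1$ edges is harmless.

```python
def min_plus(A, B):
    """The 'circ' product: (A circ B)[i][j] = min over k of A[i][k] + B[k][j]."""
    n = len(A)
    return [[min(A[i][k] + B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def apsp_by_squaring(M):
    """Square M under min-plus until paths of at least n - 1 edges are covered."""
    n = len(M)
    D = M
    covered = 1  # D currently accounts for paths with at most `covered` edges
    while covered < n - 1:
        D = min_plus(D, D)
        covered *= 2
    return D
```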

**Aside:** In actuality, this is similar (if not identical) to the [divide and conquer algorithm](#Divide-and-conquer) provided previously.

### Matrix multiplication speedup
Astute readers might suggest using [Strassen's algorithm](./recursion.ipynb#Strassen's-algorithm) to speed up matrix multiplication to $O(n^{\log_2 7}) \approx O(n^{2.81})$.
If matrix multiplication can be sped up, can we similarly improve our APSP algorithm?

Sadly, the algorithm cannot be improved so simply.
Notice that in Strassen's algorithm, we used subtraction, which is an inverse of addition.
However, in our $\circ$ function, we used a minimum instead, and there is no inverse for minimum.

It is still an open question whether APSP with general weights can be solved in truly subcubic time, that is, $O(n^{3 - \epsilon})$ for some constant $\epsilon > 0$.