## Summary notes

Implement a function that returns the shortest path from a given source node to every other node in a graph.

Dijkstra's shortest path solves:

> *Given a graph and a source vertex in the graph, find the shortest paths from the source to all vertices in the given graph.*
>
> [Dijkstra’s shortest path algorithm | Greedy Algo-7](https://www.geeksforgeeks.org/dijkstras-shortest-path-algorithm-greedy-algo-7/) (GeeksForGeeks)

We first implement the algorithm outlined in [Dijkstra's algorithm](https://en.wikipedia.org/wiki/Dijkstra%27s_algorithm) (Wikipedia).

We then show two use cases:
the first is an example of how to use the function;
the second shows how to collect the shortest paths from every city using a dictionary comprehension.

## Dependencies

In [1]:
import heapq
from collections import defaultdict
import math
import networkx as nx
import laughingrook as lr

## Functions

In [2]:
def shortest_paths(g: nx.Graph, start: object) -> dict[object, dict]:
    """Return the shortest path from start to every node in g.

    This is an implementation of Dijkstra's algorithm.

    Returns a dictionary with key=node, value={'dist', 'path'}.
    Nodes that cannot be reached from start have dist=math.inf and an
    empty path.

    Preconditions:
    - start in g
    - edge weights held in key='weight'
    - edge weights are non-negative
    """
    visited = set()
    # Unseen nodes default to an infinite distance and an empty path.
    dist = defaultdict(lambda: math.inf)
    prev = defaultdict(list)
    dist[start], prev[start] = 0, [start]
    pq = [(0, start)]
    while pq:
        _, u = heapq.heappop(pq)
        # Skip stale queue entries for nodes that are already finalised.
        if u not in visited:
            visited.add(u)
            for v in g.neighbors(u):
                alt = dist[u] + g.edges[u, v]['weight']
                if alt < dist[v]:
                    # Found a shorter route to v through u.
                    dist[v] = alt
                    prev[v] = prev[u] + [v]
                    heapq.heappush(pq, (dist[v], v))

    return {n: {'dist': dist[n], 'path': prev[n]} for n in g.nodes}
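
The return format, including what happens for a node that cannot be reached, is easiest to see on a graph small enough to check by hand. The following is a minimal sketch, not the notebook's function itself: `shortest_paths_adj` and the `toy` graph are hypothetical, and a plain `{node: {neighbour: weight}}` mapping stands in for `nx.Graph` so that the example runs without networkx.

```python
import heapq
import math
from collections import defaultdict


def shortest_paths_adj(adj: dict, start: object) -> dict:
    """Mirror of shortest_paths for a plain adjacency mapping.

    adj maps node -> {neighbour: weight}. This stand-in for nx.Graph is
    an assumption made so the sketch runs without networkx.
    """
    visited = set()
    dist = defaultdict(lambda: math.inf)
    prev = defaultdict(list)
    dist[start], prev[start] = 0, [start]
    pq = [(0, start)]
    while pq:
        _, u = heapq.heappop(pq)
        if u in visited:
            continue  # stale queue entry, node already finalised
        visited.add(u)
        for v, w in adj[u].items():
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                prev[v] = prev[u] + [v]
                heapq.heappush(pq, (dist[v], v))
    return {n: {'dist': dist[n], 'path': prev[n]} for n in adj}


# Hypothetical toy graph: A-B (4), A-C (1), C-B (2); D has no edges.
toy = {
    'A': {'B': 4, 'C': 1},
    'B': {'A': 4, 'C': 2},
    'C': {'A': 1, 'B': 2},
    'D': {},
}
result = shortest_paths_adj(toy, 'A')
print(result['B'])  # {'dist': 3, 'path': ['A', 'C', 'B']}
print(result['D'])  # {'dist': inf, 'path': []}
```

The direct edge A-B costs 4, but the route through C costs 1 + 2 = 3, so the entry for B is overwritten when C is processed. D is never reached, so it keeps the default infinite distance and empty path.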

In [3]:
def print_shortest_path(ssp, city):
    """Print the distance and shortest path from Cardiff to the given city.
    """
    print(f"Cardiff -> {city}:\n{ssp[city]['dist']}mi {ssp[city]['path']}")

## Main

We will use the "International E-road Network" dataset[^1] for this example.

### Load the data

In [4]:
eroads = lr.datasets.get_csv_file('graphs/eroads_edge_list.csv')
eroads.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1250 entries, 0 to 1249
Data columns (total 7 columns):
 #   Column                       Non-Null Count  Dtype 
---  ------                       --------------  ----- 
 0   road_number                  1250 non-null   object
 1   origin_country_code          1250 non-null   object
 2   origin_reference_place       1250 non-null   object
 3   destination_country_code     1250 non-null   object
 4   destination_reference_place  1250 non-null   object
 5   distance                     1250 non-null   int64 
 6   watercrossing                1250 non-null   bool  
dtypes: bool(1), int64(1), object(5)
memory usage: 59.9+ KB


### Initialise the graph

In [5]:
eroads = lr.datasets.get_csv_file('graphs/eroads_edge_list.csv')
U = 'origin_reference_place'
V = 'destination_reference_place'
W = 'distance'
edges = eroads.to_dict(orient='records')
g = nx.Graph((e[U], e[V], {'weight': e[W]}) for e in edges)
print(g)

Graph with 894 nodes and 1198 edges
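
The edge list has 1250 rows, but the graph reports 1198 edges. `nx.Graph` holds at most one edge per unordered pair of nodes, so this is presumably because some rows repeat a pair. When a pair is repeated, the attributes from the later row replace those from the earlier one. Whether the repeated rows in this dataset carry different distances has not been checked here. If they do, one option is to keep the smallest distance per pair before building the graph. The sketch below assumes a hypothetical helper `min_weight_edges` and made-up records.

```python
def min_weight_edges(records, u_key, v_key, w_key):
    """Return {(u, v): weight}, keeping the smallest weight per unordered pair."""
    best = {}
    for rec in records:
        # Sort the endpoints so that (u, v) and (v, u) share one key.
        pair = tuple(sorted((rec[u_key], rec[v_key])))
        w = rec[w_key]
        if pair not in best or w < best[pair]:
            best[pair] = w
    return best


# Hypothetical records; the place names and distances are made up.
records = [
    {'origin': 'X', 'destination': 'Y', 'distance': 10},
    {'origin': 'Y', 'destination': 'X', 'distance': 7},
    {'origin': 'Y', 'destination': 'Z', 'distance': 3},
]
edges_min = min_weight_edges(records, 'origin', 'destination', 'distance')
print(edges_min)  # {('X', 'Y'): 7, ('Y', 'Z'): 3}
```

The resulting mapping could then be passed to the graph constructor as `nx.Graph((u, v, {'weight': w}) for (u, v), w in edges_min.items())`.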


### Example 1: Shortest paths from Cardiff

Find the shortest paths from Cardiff to all other cities.

In [6]:
ssp = shortest_paths(g, 'Cardiff')

Output the shortest paths connecting Cardiff to London, Dublin, and Calais.

In [7]:
for city in ('London', 'Dublin', 'Calais'):
    print_shortest_path(ssp, city)

Cardiff -> London:
264mi ['Cardiff', 'Newport', 'Bristol', 'London']
Cardiff -> Dublin:
456mi ['Cardiff', 'Swansea', 'Fishguard', 'Rosslare', 'Wexford', 'Dublin']
Cardiff -> Calais:
429mi ['Cardiff', 'Newport', 'Bristol', 'London', 'Folkestone', 'Dover', 'Calais']


### Example 2: Collect all shortest paths

Collect the shortest paths from each city to all other cities.

The result is a nested dictionary: the outer key is the source city, the inner key is the destination city, and each innermost value is a `{'dist', 'path'}` dictionary.

In [8]:
all_ssp = {n: shortest_paths(g, n) for n in g.nodes}

Repeat the example above, this time printing the shortest paths from London, Dublin, and Calais to Cardiff.

In [9]:
ca = 'Cardiff'
for city in ('London', 'Dublin', 'Calais'):
    print(
        f"{city} -> {ca}:"
        + f"\n{all_ssp[city][ca]['dist']}mi {all_ssp[city][ca]['path']}"
    )

London -> Cardiff:
264mi ['London', 'Bristol', 'Newport', 'Cardiff']
Dublin -> Cardiff:
456mi ['Dublin', 'Wexford', 'Rosslare', 'Fishguard', 'Swansea', 'Cardiff']
Calais -> Cardiff:
429mi ['Calais', 'Dover', 'Folkestone', 'London', 'Bristol', 'Newport', 'Cardiff']


### Performance

In [10]:
print('Time to completion, shortest_paths =')
%timeit shortest_paths(g, 'Cardiff')
print('\nTime to completion, all_shortest_paths =')
%timeit {n: shortest_paths(g, n) for n in g.nodes}

Time to completion, shortest_paths =
3.59 ms ± 53 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

Time to completion, all_shortest_paths =
3.42 s ± 24.7 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
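
Part of the work in `shortest_paths` is the statement `prev[v] = prev[u] + [v]`, which copies a whole path each time a shorter route is found. A common alternative is to store a single predecessor per node and rebuild a path only when it is needed. The sketch below illustrates that idea under some assumptions: `dijkstra_pred` and `rebuild_path` are hypothetical names, the graph is a plain `{node: {neighbour: weight}}` mapping rather than an `nx.Graph`, and whether this is faster on the E-road graph has not been measured.

```python
import heapq
import math


def dijkstra_pred(adj: dict, start: object) -> tuple[dict, dict]:
    """Return (dist, pred) for the nodes reachable from start.

    adj maps node -> {neighbour: weight}.
    """
    dist = {start: 0}
    pred = {start: None}
    visited = set()
    pq = [(0, start)]
    while pq:
        d, u = heapq.heappop(pq)
        if u in visited:
            continue  # stale queue entry
        visited.add(u)
        for v, w in adj[u].items():
            if d + w < dist.get(v, math.inf):
                dist[v] = d + w
                pred[v] = u  # one pointer instead of a copied path
                heapq.heappush(pq, (dist[v], v))
    return dist, pred


def rebuild_path(pred: dict, target: object) -> list:
    """Walk predecessor pointers back from target; [] if unreachable."""
    if target not in pred:
        return []
    path = []
    node = target
    while node is not None:
        path.append(node)
        node = pred[node]
    return path[::-1]


# Hypothetical toy graph: A-B (4), A-C (1), C-B (2); D has no edges.
toy = {
    'A': {'B': 4, 'C': 1},
    'B': {'A': 4, 'C': 2},
    'C': {'A': 1, 'B': 2},
    'D': {},
}
dist, pred = dijkstra_pred(toy, 'A')
print(dist['B'], rebuild_path(pred, 'B'))  # 3 ['A', 'C', 'B']
print(rebuild_path(pred, 'D'))  # []
```

The trade-off is that a path is no longer available by direct lookup: each call to `rebuild_path` walks back from the target to the start.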


[^1]: See [ljk233/laughingrook-datasets/graphs](https://github.com/ljk233/laughingrook-datasets/tree/main/graphs).