In [1]:
# automatically reload dependent notebooks
%load_ext autoreload
%autoreload 2
import import_ipynb

# Elementary Graph Algorithms

If you have not read CLRS 4ed Chapter 20 *Elementary Graph Algorithms*, read it now, before proceeding with this notebook. The chapter presents two graph algorithms, breadth-first search (BFS) and depth-first search (DFS), which are used by just about every graph application. These algorithms work on both directed and undirected graphs.

## *breadth-first search*

BFS discovers a breadth-first tree (BFT) within graph $G = (V, E)$, starting from the source vertex $s$ and exploring outward one edge-distance at a time, until all the vertices reachable from $s$ have been explored. See §20.2 *Breadth-first search* p.554. The runtime of BFS is $O(V + E)$, which is linear in the size of the adjacency list. See p.558.

Since the `LstGraph` constructor has already initialised the vertex and edge attributes, it may seem superfluous to reinitialise the vertices here. But this step is actually necessary, since we may wish to call `bfs()` multiple times on the same graph. The `Gray` vertices, which have been discovered but have not finished processing, are stored in a first-in, first-out (FIFO) queue described in §10.1.3 *Stacks and queues* p.254.

The purpose of BFS is to discover a breadth-first tree (BFT) within the graph. So, we implement `bft()`. This function accepts a graph, runs `bfs()` on the graph, and extracts a BFT therefrom. Since both BFS and DFS use the same initialisation sequence, we implement the `egaInit()` function first.

In [2]:
from queue import Queue
from graph import *
from ega import *

def egaInit(g: LstGraph) -> None:
  for u in g.getVV():
    u.par = None
    u.dis = Infinity
    u.col = VCol.White

def bfs(g: LstGraph, s: Vert) -> LstGraph:
  def explore() -> LstGraph:
    if q.empty(): return g
    u = q.get()
    for v in g.adj(u):
      if v.col == VCol.White:
        # v discovered
        v.par = u
        v.dis = u.dis + 1
        v.col = VCol.Gray
        q.put(v)
    # u finished
    u.col = VCol.Black
    return explore()

  # initialize
  egaInit(g)
  # s discovered
  s.par = None
  s.dis = 0
  s.col = VCol.Gray
  # search g
  q = Queue()
  q.put(s)
  return explore()

def bft(g: LstGraph, s: Vert) -> LstTree:
  g = bfs(g, s)
  t = LstTree(f"{g.tag}†")
  for u in g.getVV():
    if u == s or not u.isRoot(): t.insV(u)
  for u in t.getVV():
    if not u.isRoot():
      e = g.getE(makeETag(u.par, u))
      t.insE(e)
  return t

importing Jupyter notebook from graph.ipynb
importing Jupyter notebook from util.ipynb
importing Jupyter notebook from ega.ipynb


Above, we call the `explore()` inner function recursively, instead of using a `while` loop as CLRS does. When studying algorithms, and indeed mathematics in general, it is essential to be comfortable with recursion. Recursive expressions can be understood by visual inspection, whereas to understand loopy statements, one must mentally execute the sequence of statements. Familiarity with recursion is also becoming important to programmers now, since many modern programming languages are OO-FP hybrids with good compilers that can optimise tail calls into jump instructions, thereby eliminating the attendant function call overhead. Python's support for recursion is pitiful, for sure: CPython performs no tail-call optimisation and caps the recursion depth (1000 frames by default), so the recursive `explore()` above, which recurses once per dequeued vertex, raises `RecursionError` on large graphs. But here, the emphasis is on studying algorithms, not efficiency. If efficiency were our top priority, we would not use Python in the first place.
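For comparison, the loop-based formulation that CLRS uses can be sketched as follows. This is a minimal, standalone sketch and not part of the notebook's library: it assumes a plain dict-of-lists digraph in place of `LstGraph`, and the names `bfs_loop`, `adj`, `dis` and `par` are hypothetical. An infinite distance plays the role of the `White` colour.

```python
from collections import deque
from math import inf

def bfs_loop(adj: dict, s: str) -> tuple:
    """Loop-based BFS over a dict-of-lists digraph (a stand-in for LstGraph)."""
    dis = {u: inf for u in adj}   # edge distance from s; inf means undiscovered
    par = {u: None for u in adj}  # parent in the breadth-first tree
    dis[s] = 0
    q = deque([s])                # FIFO queue of discovered, unfinished vertices
    while q:
        u = q.popleft()
        for v in adj[u]:
            if dis[v] == inf:     # v is undiscovered: discover it
                dis[v] = dis[u] + 1
                par[v] = u
                q.append(v)
    return dis, par

# hypothetical example graph; x has an edge into s but is not reachable from s
adj = {"s": ["a", "b"], "a": ["c"], "b": ["c"], "c": [], "x": ["s"]}
dis, par = bfs_loop(adj, "s")
print(dis)  # {'s': 0, 'a': 1, 'b': 1, 'c': 2, 'x': inf}
```

The loop version does not grow the call stack, so it is not subject to Python's recursion limit, at the cost of the explicit queue-draining loop that the recursive version hides.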

## *depth-first search*

DFS discovers a depth-first forest (DFF) within the graph by starting from some vertex $u$ and following the outbound edge $(u, v)$ to the neighbour vertex $v$, then proceeding as far as possible before backtracking to $u$ and trying another neighbour, and continuing in this manner until all the vertices have been explored. See §20.3 *Depth-first search* p.563. The runtime of DFS is $\Theta(V + E)$, which is a tighter bound than that of BFS. See p.567.

Since the purpose of DFS is to discover a DFF, we implement `dff()` as well. And since a DFF is pointless without edge classification, we implement edge classification within `dfs()` as described on CLRS p.570. Our implementation of DFS, therefore, is slightly more complicated than that described on CLRS p.565, but this is a necessary, minor departure.
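The classification rules can be tried out in isolation on a plain dict-of-lists digraph. The sketch below is an illustration under stated assumptions, not the notebook's implementation: `classify_edges`, `adj` and the single-letter labels are hypothetical names, and the discovery order (the number of vertices discovered so far) stands in for the discovery timestamp, which is enough to tell forward edges from cross edges.

```python
def classify_edges(adj: dict) -> dict:
    """DFS over a dict-of-lists digraph; returns {(u, v): 'T' | 'B' | 'F' | 'C'}."""
    WHITE, GRAY, BLACK = 0, 1, 2
    col = {u: WHITE for u in adj}
    dis = {}                      # discovery order of each vertex
    cls = {}                      # edge (u, v) -> class label

    def explore(u):
        dis[u] = len(dis)         # u discovered
        col[u] = GRAY
        for v in adj[u]:
            if col[v] == WHITE:   # tree edge: v is first discovered via (u, v)
                cls[(u, v)] = "T"
                explore(v)
            elif col[v] == GRAY:  # back edge: v is an ancestor still being explored
                cls[(u, v)] = "B"
            else:                 # v finished: forward if v is a descendant of u, else cross
                cls[(u, v)] = "F" if dis[u] < dis[v] else "C"
        col[u] = BLACK            # u finished

    for u in adj:
        if col[u] == WHITE:
            explore(u)
    return cls

# hypothetical example graph exhibiting all four edge classes
adj = {"a": ["b", "c"], "b": ["c", "a"], "c": [], "d": ["c"]}
print(classify_edges(adj))
# {('a', 'b'): 'T', ('b', 'c'): 'T', ('b', 'a'): 'B', ('a', 'c'): 'F', ('d', 'c'): 'C'}
```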

In [3]:
def dfs(g: LstGraph) -> LstGraph:
  def explore(u: Vert) -> None:
    time[0] += 1
    # u discovered
    u.dis = time[0]
    u.col = VCol.Gray
    for v in g.adj(u):
      e = g.getE(makeETag(u, v))
      if v.col == VCol.Black:  # bullet 3. p.570
        e.cls = ECls.F if u.dis < v.dis else ECls.C
      elif v.col == VCol.Gray:  # bullet 2. p.570
        e.cls = ECls.B
      elif v.col == VCol.White:  # bullet 1. p.570
        v.par = u
        e.cls = ECls.T
        explore(v)
    time[0] += 1
    # u finished
    u.fin = time[0]
    u.col = VCol.Black

  # initialize
  egaInit(g)
  # search g
  time = [0]  # use array instead of a scalar to allow explore() to mutate time
  for u in g.getVV():
    if u.col == VCol.White: explore(u)
  return g

def dff(g: LstGraph) -> LstGraph:
  g = dfs(g)
  f = LstGraph(f"{g.tag}†")
  f.dupVV(g.vv)
  for v in f.getVV():
    if not v.isRoot():
      e = g.getE(makeETag(v.par, v))
      f.insE(e)
  return f

As we did with BFS, we use the recursive inner function `explore()` to perform the search. CLRS uses a global variable `time: int`. We do not. Instead, we define a local `time: [int]` in `dfs()`, and update it from within `explore()`. The inner function `explore()` can access variables defined in its outer function `dfs()`. This is called the *closure* property of functions. Python allows an inner function to read an enclosed variable, but a plain assignment to that name inside the inner function creates a new local variable rather than rebinding the enclosed one, unless the name is declared `nonlocal`. If a closure variable references a mutable object, however, the inner function may mutate the contents of the object without rebinding the variable. This is the case here: the `time` variable in `explore()` references a list object allocated in `dfs()`, so we mutate the element `time[0]` and never rebind `time` itself. This is an inelegant solution, but it allows us to follow the CLRS description closely.
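The closure behaviour is easy to check in isolation. The three counters below are hypothetical helpers written only for this illustration: one uses the one-element list technique of `dfs()`, one uses Python's `nonlocal` declaration to rebind the enclosed variable, and one shows what happens when neither is used.

```python
def make_counter_list():
    time = [0]                # list cell, as in dfs()
    def tick() -> int:
        time[0] += 1          # mutates the list's contents; `time` is not rebound
        return time[0]
    return tick

def make_counter_nonlocal():
    time = 0
    def tick() -> int:
        nonlocal time         # rebind the enclosing variable instead of creating a local
        time += 1
        return time
    return tick

def make_counter_broken():
    time = 0
    def tick() -> int:
        time += 1             # no nonlocal: `time` is local here and read before assignment
        return time
    return tick

tick = make_counter_list()
print(tick(), tick())         # 1 2
tick = make_counter_nonlocal()
print(tick(), tick())         # 1 2
try:
    make_counter_broken()()
except UnboundLocalError as err:
    print("broken counter:", type(err).__name__)  # broken counter: UnboundLocalError
```

Either working variant would serve in `dfs()`; the list cell keeps the code close to the notebook as written, while `nonlocal` states the intent more directly.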

# Applications of Elementary Graph Algorithms

CLRS Chapter 20 presents two applications: topological sort (TSort) and strongly connected components (SCC).

## *topological sort*

TSort applies DFS to a directed acyclic graph (DAG) to obtain a linear ordering of the vertices. Many activities, from simple tasks like cooking and cleaning to intricate processes like surgical procedures and software compilation, depend on steps being performed in a particular order. A DAG can describe task dependencies, where vertices $u$ and $v$ are tasks and an edge $(u, v)$ indicates that $u$ must be performed before $v$. TSort produces a valid order of such tasks by applying DFS to the DAG `g`, then sorting the vertices in descending order of their finish times. See §20.4 *Topological sort* p.573. The runtime of TSort is the same as that of DFS: $\Theta(V + E)$.

In [4]:
def tsort(g: LstGraph) -> [Vert]:
  g = dfs(g)
  return sorted(g.getVV(), key=lambda u: u.fin, reverse=True)
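
To see the ordering on concrete data, here is a minimal standalone sketch that assumes a dict-of-lists DAG instead of `LstGraph`; `tsort_sketch` and the cooking tasks are hypothetical. Rather than sorting by finish time afterwards, it appends each vertex as it finishes and reverses the list at the end, which yields the same descending-finish-time order.

```python
def tsort_sketch(adj: dict) -> list:
    """Topological order of a DAG given as {vertex: [successors]}."""
    seen = set()
    order = []                  # vertices in increasing order of finish time

    def explore(u):
        seen.add(u)
        for v in adj[u]:
            if v not in seen:
                explore(v)
        order.append(u)         # u finishes only after all of its descendants

    for u in adj:
        if u not in seen:
            explore(u)
    return order[::-1]          # descending finish time

# an edge (u, v) means task u must be performed before task v
tasks = {
    "boil water": ["cook pasta"],
    "cook pasta": ["serve"],
    "make sauce": ["serve"],
    "serve": [],
}
print(tsort_sketch(tasks))
# ['make sauce', 'boil water', 'cook pasta', 'serve']
```

A DAG usually admits several valid orders; which one comes out depends on the order in which vertices and neighbours are visited.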

## *strongly connected components*

Strongly connected components (SCC) of a directed graph (digraph) are maximal sets of vertices that are reachable from each other. See Appendix B.4 *Graphs* pp.1164-1168. CLRS §20.5 *Strongly connected components* presents an algorithm that applies DFS twice to extract the SCCs from a digraph. The runtime of SCC is the same as that of DFS: $\Theta(V + E)$.

First, we define the `Comp` component type, which is a `Vert` that contains a set of strongly connected vertices. We also define the `makeCTag()` utility function that forms the component's tag by joining the tags of its constituent vertices.

In [5]:
class Comp(Vert):
  def __init__(self, tag: Tag):
    super().__init__(tag)
    self.vv: VSet = {}  # strongly connected vertices
  def init(self) -> None: self.__init__(self.tag)

  def insVV(self, vv: [Vert]) -> None:
    for u in vv: self.vv[u.tag] = u
  def getVV(self) -> [Vert]: return list(self.vv.values())

def makeCTag(vv: [Vert]) -> Tag: return "+".join([v.tag for v in vv])

Now, we implement the SCC algorithm. To compute SCC according to the CLRS algorithm presented on p.577, we require three utility functions: `transpose()` that reverses the edges of a digraph, `sort()` that sorts the vertices by some attribute, and `contract()` that merges the strongly connected vertices into components by contracting the edges. `contract()` uses the DFF to contract its input graph. The contraction of an undirected graph is given on p.1168.

In [6]:
from typing import Callable

def scc(g: LstGraph) -> LstGraph:
  g = dfs(g)
  r = transpose(g)
  s = sort(r, attr=lambda u: u.fin, reverse=True)  # descending sort of vertices by finish times
  s = dfs(s)
  f = dff(s)
  return contract(g, f)

def transpose(g: LstGraph) -> LstGraph:
  # reverse edges
  r = LstGraph(f"{g.tag}!")
  r.dupVV(g.vv)
  for e in g.getEE(): r.insE(Edge(e.v, e.u))  # flip (u, v) to (v, u)
  return r

def sort(g: LstGraph, attr: Callable[[Vert], int], reverse: bool = False) -> LstGraph:
  # sort vertices
  s = LstGraph(f"{g.tag}§")
  for u in sorted(g.getVV(), key=attr, reverse=reverse): s.insV(u)  # sorted vertices
  s.dupEE(g.ee)
  return s

def contract(g: LstGraph, f: LstGraph) -> LstGraph:
  # contract DFS g using DFF f
  def scv(u: Vert) -> [Vert]:
    # all descendants of vertex u in DFF f, in preorder; a tree vertex may have several children
    return [w for v in f.adj(u) for w in [v, *scv(v)]]

  c = LstGraph(f"{g.tag}₵")  # SCC c
  # create vertices of SCC c
  for r in [v for v in g.getVV() if v.isRoot()]:  # for each root vertex r in DFS g
    vv = [r, *scv(r)]  # strongly connected vertices rooted at vertex r
    x = Comp(makeCTag(vv))  # create component x by merging strongly connected vertices vv
    x.insVV(vv)
    c.insV(x)  # insert component x into SCC c
  # create edges of SCC c
  cc: [Comp] = c.getVV()  # components of SCC c
  for x in cc:  # for each component x in SCC c
    aa: VSet = {}  # adjacent vertices of component x in DFS g
    vv = x.getVV()  # constituent vertices of component x
    for u in vv:  # for each constituent vertex u of component x
      for v in [a for a in g.adj(u) if a not in vv]: aa[v.tag] = v  # for each (u, v) leaving component x
    for a in aa.values():  # for each adjacent vertex a of component x
      for y in [b for b in cc if b != x]:  # for every other component y in SCC c
        if a in y.getVV():  # component y is adjacent to component x
          c.insE(Edge(x, y))  # insert edge (x, y) into SCC c
  return c
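
The two-pass idea can also be seen without the `LstGraph` machinery. The following is a minimal standalone sketch, assuming a dict-of-lists digraph in which every vertex appears as a key; `scc_sketch`, `visit` and the example graph are hypothetical, and the sketch returns the components as plain sets rather than building the contracted graph.

```python
def scc_sketch(adj: dict) -> list:
    """Strongly connected components of {vertex: [successors]} as a list of sets."""
    seen = set()

    def visit(u, graph, out):
        # depth-first visit that appends vertices to `out` as they finish
        seen.add(u)
        for v in graph[u]:
            if v not in seen:
                visit(v, graph, out)
        out.append(u)

    # pass 1: DFS on G, recording vertices in increasing order of finish time
    finish = []
    for u in adj:
        if u not in seen:
            visit(u, adj, finish)

    # transpose: flip every edge (u, v) to (v, u)
    radj = {u: [] for u in adj}
    for u, vs in adj.items():
        for v in vs:
            radj[v].append(u)

    # pass 2: DFS on the transpose in decreasing finish time; each tree is one component
    seen.clear()
    comps = []
    for u in reversed(finish):
        if u not in seen:
            tree = []
            visit(u, radj, tree)
            comps.append(set(tree))
    return comps

# hypothetical digraph: cycle a -> b -> c -> a, edge c -> d, and cycle d -> e -> d
adj = {"a": ["b"], "b": ["c"], "c": ["a", "d"], "d": ["e"], "e": ["d"]}
print(scc_sketch(adj))  # two components: {a, b, c} and {d, e}
```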

# Conclusion

In this notebook, we implemented the BFS and DFS elementary graph algorithms and the two graph applications described in CLRS Chapter 20. Many more advanced graph algorithms build on BFS and DFS. Tests for these elementary graph algorithms and their applications are in [`egatest.ipynb`](./egatest.ipynb).