# Chapter 13: Graphs and Traversals

## Graph Terminology

A graph G is a set V of vertices and a collection E of pairs of vertices from V, which are called edges. Edges may be either:

- **directed**: (u,v) is ordered, with u preceding v
- **undirected**: (u,v) is not ordered

A graph is directed if all of its edges are directed, and undirected if all of its edges are undirected. A graph containing both types of edges is **mixed**

The two vertices joined by an edge are that edge's end vertices (or endpoints). If the edge is directed, the first vertex is the origin and the second is the destination

Two vertices are **adjacent** if they are endpoints of the same edge. An edge is **incident** on a vertex if the vertex is one of that edge's endpoints. A vertex's outgoing and incoming edges are the directed edges for which that vertex is the origin and the destination, respectively

The **degree** of a vertex v, written deg(v), is the number of edges incident on v. In a directed graph, indeg(v) and outdeg(v) count the incoming and outgoing edges of v

Edges are grouped in a **collection** rather than a set, which allows two undirected edges to have the same end vertices, and two directed edges to have the same origin and destination. Such edges are known as parallel edges. A self-loop is an edge whose two endpoints coincide. **Simple** graphs have neither parallel edges nor self-loops, so the edges of a simple graph may be treated as a set

If G is a graph with m edges, then $\sum_{v \in G}deg(v) = 2m$

If G is a directed graph with m edges, then $\sum_{v \in G}indeg(v) = \sum_{v \in G}outdeg(v) = m$
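The two degree-sum identities can be checked numerically. The sketch below assumes a graph given as a plain list of `(u, v)` pairs; the function names are illustrative and not from the text.

```python
from collections import Counter


def degree_sum(edges):
    """Sum of deg(v) over all vertices, treating each pair as an undirected edge."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1  # each edge is counted once at each endpoint
    return sum(deg.values())


def in_out_degree_sums(edges):
    """Sums of indeg(v) and outdeg(v), treating each pair as a directed edge u -> v."""
    indeg, outdeg = Counter(), Counter()
    for u, v in edges:
        outdeg[u] += 1
        indeg[v] += 1
    return sum(indeg.values()), sum(outdeg.values())


edges = [("a", "b"), ("b", "c"), ("c", "a"), ("c", "d")]
print(degree_sum(edges))          # 8, which is 2m for m = 4
print(in_out_degree_sums(edges))  # (4, 4), both equal to m
```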

Let G be a simple graph with n vertices and m edges. If G is undirected, then $m \leq n(n-1)/2$, and if G is directed then $m \leq n(n-1)$

**Path**: sequence of alternating vertices and edges that starts and ends at vertices

- **cycle**: path with the same start and end vertices
  - **forest**: graph without cycles
- **simple path/cycle**: each vertex in the path is distinct; in a simple cycle, each vertex is distinct except that the first and last coincide

**Subgraph**: graph H whose vertices and edges are subsets of the vertices and edges of G

- **spanning subgraph**: subgraph of G that contains all the vertices of graph G
  
**Connected**: a graph G is connected if there is a path between any two of its vertices

- if G is not connected then the maximal connected subgraphs are called **connected components**
- **tree**: connected forest
- **rooted tree**: tree with a root
- **free tree**: tree with no root
  
**Spanning tree**: spanning subgraph that is a free tree

Let G be an undirected graph with n vertices and m edges:

- if G is connected then $m \geq n - 1$
- if G is a tree then $m = n - 1$
- if G is a forest then $m \leq n - 1$

## Graph Data Structures

Graphs may be represented via the following data structures:

### Adjacency List

For each vertex v in a graph, a list is stored that represents all the edges incident on v

The space used by each vertex's list is $O(deg(v))$. Therefore, the space requirement for the adjacency list structure of the entire graph of n vertices and m edges is $O(n+m)$

The following operations are supported:

- return incident edges or adjacent vertices for a vertex v: $O(deg(v))$ time
- determine whether two vertices u and v are adjacent: $O(\min\{deg(u), deg(v)\})$ time
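A minimal sketch of the adjacency list structure, assuming an undirected graph whose lists store neighboring vertices rather than edge objects. The class and method names are hypothetical.

```python
class AdjacencyListGraph:
    """Undirected graph; each vertex maps to a list of its neighbors."""

    def __init__(self):
        self._adj = {}  # vertex -> list of adjacent vertices

    def add_vertex(self, v):
        self._adj.setdefault(v, [])

    def add_edge(self, u, v):
        self.add_vertex(u)
        self.add_vertex(v)
        self._adj[u].append(v)
        self._adj[v].append(u)

    def degree(self, v):
        return len(self._adj[v])

    def neighbors(self, v):
        """Adjacent vertices of v, in O(deg(v)) time."""
        return list(self._adj[v])

    def are_adjacent(self, u, v):
        """Scan the shorter of the two lists: O(min(deg(u), deg(v))) time."""
        if self.degree(u) > self.degree(v):
            u, v = v, u
        return v in self._adj[u]
```

Total storage is one list entry per edge endpoint plus one dictionary entry per vertex, which matches the $O(n+m)$ bound.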

### Adjacency Matrix

Instead of per-vertex lists, store an $n \times n$ matrix in which entry (i, j) records whether there is an edge between vertex i and vertex j

Allows for determining adjacencies between pairs of vertices in constant time, but with the trade-off of using $O(n^2)$ space for a graph of n vertices
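The trade-off can be seen in a small sketch. It assumes an undirected graph whose vertices are numbered 0 to n-1 and stores booleans rather than edge objects; the names are hypothetical.

```python
class AdjacencyMatrixGraph:
    """Undirected graph on vertices 0..n-1 stored as an n x n boolean matrix."""

    def __init__(self, n):
        self.n = n
        self._m = [[False] * n for _ in range(n)]  # O(n^2) space

    def add_edge(self, i, j):
        self._m[i][j] = True
        self._m[j][i] = True  # symmetric because the graph is undirected

    def are_adjacent(self, i, j):
        """Constant-time adjacency test."""
        return self._m[i][j]

    def neighbors(self, i):
        """Listing neighbors scans a whole row: O(n) time regardless of deg(i)."""
        return [j for j in range(self.n) if self._m[i][j]]
```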

## Depth-first Search

Useful for finding a path from one vertex to another, determining whether a graph is connected, and computing a spanning tree of a connected graph. Utilizes **backtracking**

Two types of edges are present:

- **discovery edges**: edges that lead to unexplored vertices
  - form a spanning tree of the connected component of the starting vertex s, called the **DFS** tree
- **back edges**: edges that lead to previously explored vertices

![dfs-graph](./res/13-dfs-graph.PNG)

The following algorithm is a recursive description of DFS for searching from a vertex v:

![dfs-recur](./res/13-dfs-recur.PNG)

Let G be an undirected graph on which a DFS traversal starting at a vertex s has been performed. Then the traversal visits all the vertices in the connected component of s, and the discovery edges form a spanning tree of the connected component of s
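A Python sketch of the recursive traversal, assuming a simple undirected graph given as a dictionary from each vertex to a list of its neighbors. It labels every edge in the component of s as a discovery edge or a back edge. The representation and names are assumptions, not taken from the figure.

```python
def dfs(graph, s):
    """DFS from s. Returns (visit order, discovery edges, back edges)."""
    order, discovery, back = [], [], []
    visited = set()
    labeled = set()  # undirected edges that already have a label

    def visit(v):
        visited.add(v)
        order.append(v)
        for w in graph[v]:
            edge = frozenset((v, w))
            if edge in labeled:
                continue  # each undirected edge is labeled only once
            labeled.add(edge)
            if w not in visited:
                discovery.append((v, w))  # leads to an unexplored vertex
                visit(w)
            else:
                back.append((v, w))  # leads to an already explored vertex

    visit(s)
    return order, discovery, back


graph = {
    "a": ["b", "c"],
    "b": ["a", "c"],
    "c": ["a", "b", "d"],
    "d": ["c"],
    "e": [],  # isolated vertex, not in the component of "a"
}
order, discovery, back = dfs(graph, "a")
print(order)      # ['a', 'b', 'c', 'd']
print(discovery)  # [('a', 'b'), ('b', 'c'), ('c', 'd')]
print(back)       # [('c', 'a')]
```

The three discovery edges span the four vertices of the component, as expected for a tree. For very deep graphs the recursion may exceed Python's default recursion limit, so an explicit stack would be needed there.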

The following algorithm searches an entire graph G using DFS(G,v):

![dfs-whole](./res/13-dfs-whole.PNG)
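Searching the entire graph amounts to restarting DFS from every vertex that has not yet been visited. The sketch below uses that idea to label connected components, assuming an undirected graph stored as a dictionary of neighbor lists; the function name is hypothetical.

```python
def connected_components(graph):
    """Return ({vertex: component number}, number of components)."""
    component = {}

    def visit(v, label):
        component[v] = label
        for w in graph[v]:
            if w not in component:
                visit(w, label)

    count = 0
    for s in graph:
        if s not in component:  # s starts a new, unexplored component
            visit(s, count)
            count += 1
    return component, count
```

The graph is connected exactly when the returned count is 1 (or 0 for an empty graph).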

Let G be a graph with n vertices and m edges represented with the adjacency list structure. A DFS traversal of G can be performed in $O(n+m)$ time. Also, there exist $O(n+m)$-time algorithms based on DFS for the following problems:

- testing whether G is connected
- computing a spanning forest of G
- computing the connected components of G
- computing a path between two vertices of G, or reporting that no such path exists
- computing a cycle in G, or reporting that G has no cycles
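As one instance of these DFS-based algorithms, the sketch below finds a path between two vertices by keeping the current chain of discovery edges and backtracking out of dead ends. It assumes the dictionary-of-neighbor-lists representation, and the function name is illustrative.

```python
def dfs_path(graph, start, goal):
    """Return a list of vertices from start to goal, or None if no path exists."""
    visited = set()
    path = []

    def visit(v):
        visited.add(v)
        path.append(v)
        if v == goal:
            return True
        for w in graph[v]:
            if w not in visited and visit(w):
                return True
        path.pop()  # backtrack: v does not lead to the goal
        return False

    return path if visit(start) else None
```

The path returned is not necessarily the shortest one; DFS only guarantees that some path is found when one exists.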

## Breadth-first Search

Instead of searching recursively as DFS does, BFS proceeds in rounds and subdivides the vertices into **levels**, which represent the minimum number of edges from the start vertex to each vertex

![bfs-graph](./res/13-bfs-graph.PNG)

The following algorithm implements BFS:

![bfs](./res/13-bfs.PNG)

Let G be an undirected graph on which a BFS traversal starting at vertex s has been performed. Then:

- the traversal visits all the vertices in the connected component of s
- the discovery edges form a spanning tree T of the connected component of s
- for each vertex v at level i, the path of tree T between s and v has i edges, and any other path of G between s and v has at least i edges
- if (u,v) is a cross edge (an edge of the component that is not a discovery edge), then the level numbers of u and v differ by at most 1
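These properties can be observed with a queue-based sketch of BFS, which is a common equivalent of the round-by-round formulation. It assumes an undirected graph stored as a dictionary of neighbor lists; the names are hypothetical.

```python
from collections import deque


def bfs_levels(graph, s):
    """Return ({vertex: level}, discovery edges) for the component of s."""
    level = {s: 0}
    discovery = []
    queue = deque([s])
    while queue:
        v = queue.popleft()
        for w in graph[v]:
            if w not in level:  # first time w is reached
                level[w] = level[v] + 1
                discovery.append((v, w))
                queue.append(w)
    return level, discovery


graph = {
    "a": ["b", "c"],
    "b": ["a", "c"],
    "c": ["a", "b", "d"],
    "d": ["c"],
}
level, discovery = bfs_levels(graph, "a")
print(level)      # {'a': 0, 'b': 1, 'c': 1, 'd': 2}
print(discovery)  # [('a', 'b'), ('a', 'c'), ('c', 'd')]
```

Here (b, c) is the only non-discovery edge, and its endpoints are both at level 1.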

Let G be a graph with n vertices and m edges represented with the adjacency list structure. A BFS traversal of G takes $O(n+m)$ time. Also, there exist $O(n+m)$-time algorithms based on BFS for the following problems:

- testing whether G is connected
- computing a spanning forest of G
- computing the connected components of G
- given a start vertex s of G, computing for every vertex v a path with the minimum number of edges between s and v, or reporting that no such path exists
- computing a cycle in G, or reporting that G has no cycles
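The minimum-edge path problem can be sketched by recording, for each vertex, the vertex from which BFS first reached it, then walking those parent links backward. The sketch assumes a dictionary of neighbor lists and that `None` is never used as a vertex; the function name is illustrative.

```python
from collections import deque


def bfs_shortest_path(graph, s, t):
    """Return a path from s to t with the fewest edges, or None if t is unreachable."""
    parent = {s: None}
    queue = deque([s])
    while queue:
        v = queue.popleft()
        if v == t:
            break
        for w in graph[v]:
            if w not in parent:
                parent[w] = v  # (v, w) is a discovery edge
                queue.append(w)
    if t not in parent:
        return None
    path = []
    v = t
    while v is not None:  # follow parent links back to s
        path.append(v)
        v = parent[v]
    path.reverse()
    return path
```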

## Directed Graphs

A vertex u **reaches** a vertex v if there is a directed path from u to v. A directed graph is strongly connected if for any two vertices u and v, u reaches v and v reaches u

The transitive closure of a digraph G is the digraph that has the same vertices as G and an edge (u,v) whenever G has a directed path from u to v

The following algorithm illustrates a recursive DFS for a digraph:

![directed-dfs](./res/13-directed-dfs.PNG)

Let G be a digraph with n vertices and m edges. The following problems can be solved by an algorithm that runs in $O(n(n+m))$ time:

- computing, for each vertex v, the sub-graph reachable from v
- testing whether G is strongly connected
- computing the transitive closure of G
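All three problems reduce to running one DFS per vertex, which gives the $O(n(n+m))$ bound. A sketch, assuming a digraph stored as a dictionary from each vertex to a list of its out-neighbors; the names are hypothetical, and each reachable set here includes the start vertex itself.

```python
def reachable_from(digraph, s):
    """Set of vertices reachable from s by a directed path, including s itself."""
    seen = set()

    def visit(v):
        seen.add(v)
        for w in digraph[v]:
            if w not in seen:
                visit(w)

    visit(s)
    return seen


def reachability(digraph):
    """One DFS per vertex: O(n(n+m)) time overall."""
    return {v: reachable_from(digraph, v) for v in digraph}


def is_strongly_connected(digraph):
    """True if every vertex reaches every other vertex."""
    n = len(digraph)
    return all(len(reach) == n for reach in reachability(digraph).values())
```

The transitive closure has an edge (u, v) for every v in the reachable set of u other than u itself, plus (u, u) when u lies on a directed cycle.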

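The **Floyd-Warshall** algorithm computes the transitive closure of G by incrementally computing a series of digraphs $G_0, G_1, \ldots, G_n$, where $G_0 = G$ and each $G_k$ adds the edge $(v_i, v_j)$ whenever $G_{k-1}$ contains both $(v_i, v_k)$ and $(v_k, v_j)$: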

![floyd-warshall](./res/13-floyd-warshall.PNG)

This algorithm operates in $O(n^3)$ time
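The cubic running time comes from three nested loops over the vertices. A sketch on a boolean matrix, assuming vertices are numbered 0 to n-1 and edges are given as `(i, j)` pairs; updating a single matrix in place stands in for the series of digraphs, and the function name is hypothetical.

```python
def floyd_warshall_closure(n, edges):
    """Return an n x n matrix where reach[i][j] is True iff i reaches j by a path of one or more edges."""
    reach = [[False] * n for _ in range(n)]
    for i, j in edges:
        reach[i][j] = True

    for k in range(n):  # allow vertex k as an intermediate vertex
        for i in range(n):
            if not reach[i][k]:
                continue  # no path i -> k, so k cannot help row i
            for j in range(n):
                if reach[k][j]:
                    reach[i][j] = True  # path i -> k -> j
    return reach


closure = floyd_warshall_closure(4, [(0, 1), (1, 2), (2, 0), (2, 3)])
print(closure[0][3])  # True: 0 -> 1 -> 2 -> 3
print(closure[3][0])  # False: vertex 3 has no outgoing edges
```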

**Topological ordering**: ordering $v_1, \ldots, v_n$ of the vertices of a directed graph G such that for every edge $(v_i,v_j)$ of G, i < j. A digraph has a topological ordering if and only if it is acyclic

The following algorithm topologically sorts a digraph:

![topological-sort](./res/13-topological-sort.PNG)

This algorithm runs in $O(n+m)$ time and uses $O(n)$ auxiliary space
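One common way to achieve these bounds is to repeatedly output a vertex with no remaining incoming edges, tracking in-degrees in a counter per vertex. Whether this matches the figure exactly is an assumption. The sketch expects a digraph stored as a dictionary in which every vertex appears as a key; the function name is illustrative.

```python
def topological_sort(digraph):
    """Return the vertices in a topological order, or None if the digraph has a cycle."""
    indeg = {v: 0 for v in digraph}  # O(n) auxiliary space
    for v in digraph:
        for w in digraph[v]:
            indeg[w] += 1

    ready = [v for v in digraph if indeg[v] == 0]  # vertices with no pending predecessors
    order = []
    while ready:
        v = ready.pop()
        order.append(v)
        for w in digraph[v]:
            indeg[w] -= 1  # edge (v, w) is now satisfied
            if indeg[w] == 0:
                ready.append(w)

    if len(order) != len(digraph):
        return None  # some vertices never became ready, so there is a cycle
    return order
```

Each vertex is pushed and popped once and each edge is examined once, which gives the $O(n+m)$ running time.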