# [CptS 215 Data Analytics Systems and Algorithms](https://github.com/gsprint23/cpts215)
[Washington State University](https://wsu.edu)

[Gina Sprint](http://eecs.wsu.edu/~gsprint/)
# Graphs

Learner objectives for this lesson:
* Learn about graph data structures
* Gain familiarity with graph terminology


## Acknowledgments
Content used in this lesson is based upon information in the following sources:
* [Miller and Ranum](http://interactivepython.org/runestone/static/pythonds/index.html)

## Graph Overview
We are quite familiar with the concept of a graph (the web of connections kind, not the plot kind). For example:
* Airport system: each airport is connected by airline routes/flights
* Internet: each computing device is connected by data transmission infrastructure
* Power grid: each building is connected via power lines
* Highway infrastructure: each town is connected by roads
* etc.

You can think of a graph as a generalization of a tree. As we will learn soon, a tree is a connected, acyclic graph.

### Definition
A graph is a data structure representing connections (edges) between items (vertices). A vertex (or node) is an item in a graph. An edge (or link) is a connection between two vertices in a graph. Given a graph $G = (V, E)$, the set $V$ is the set of all vertices in $G$: $V = \{v_{1}, ..., v_{i}, ..., v_{n}\}$ and the set $E$ is the set of all edges in $G$: $E = \{e_{1}, ..., e_{j}, ..., e_{m}\}$ where $e_{j} = (v_{k}, v_{l})$ is an edge connecting vertices $v_{k}$ and $v_{l}$.

Example graph:
<img src="https://upload.wikimedia.org/wikipedia/commons/thumb/5/5b/6n-graf.svg/640px-6n-graf.svg.png" width="300">
(image from [https://upload.wikimedia.org/wikipedia/commons/thumb/5/5b/6n-graf.svg/640px-6n-graf.svg.png](https://upload.wikimedia.org/wikipedia/commons/thumb/5/5b/6n-graf.svg/640px-6n-graf.svg.png)) 

For the above graph:
* $V = \{1, 2, 3, 4, 5, 6\}$
* $E =\{(1, 2), (1, 5), (2, 5), (2, 3), (3, 4), (4, 5), (4, 6)\}$
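
As a minimal sketch, the sets $V$ and $E$ above can be written directly as Python sets. Representing each edge as a tuple is an assumption made for illustration here, not the representation the course necessarily uses later.

```python
# Vertices and edges of the example graph, copied from the sets above.
V = {1, 2, 3, 4, 5, 6}
E = {(1, 2), (1, 5), (2, 5), (2, 3), (3, 4), (4, 5), (4, 6)}

# Every edge should connect two vertices that are actually in V.
assert all(v_k in V and v_l in V for (v_k, v_l) in E)

print(len(V))  # number of vertices, n
print(len(E))  # number of edges, m
```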

### Terminology
1. Adjacent: two vertices, $v_{k}$ and $v_{l}$, are adjacent if they are connected by an edge ($(v_{k}, v_{l}) \in E$).
    * Example: vertices 4 and 6 are adjacent
    * Example: vertices 4 and 2 are *not* adjacent
1. Path: a sequence of edges leading from a source (starting) vertex to a destination (ending) vertex.
    * Example: a path from 1 to 6 is: (1, 2), (2, 3), (3, 4), (4, 6)
1. Path length: the number of edges in a path.
    * Example: path length for (1, 2), (2, 3), (3, 4), (4, 6) is: 4
1. Distance: the distance between two vertices is the path length for the shortest path between the two vertices.
    * Example: the shortest path from 1 to 6 is: (1, 5), (5, 4), (4, 6) which has path length 3. Thus, the distance from 1 to 6 is 3.
    * Note: we will find out soon how to compute the shortest path between two vertices!
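
The terms above can be checked against the example graph with a few small helper functions. This is only a sketch: the helper names `are_adjacent`, `is_path`, and `path_length` are hypothetical, and the graph is treated as undirected, so an edge $(v_{k}, v_{l})$ also connects $v_{l}$ to $v_{k}$.

```python
# Edge set of the example graph (undirected).
E = {(1, 2), (1, 5), (2, 5), (2, 3), (3, 4), (4, 5), (4, 6)}


def are_adjacent(edges, v_k, v_l):
    """Return True if an undirected edge connects v_k and v_l."""
    return (v_k, v_l) in edges or (v_l, v_k) in edges


def is_path(edges, path):
    """Return True if every edge exists and consecutive edges share a vertex."""
    for (a, b) in path:
        if not are_adjacent(edges, a, b):
            return False
    for (_, b), (c, _) in zip(path, path[1:]):
        if b != c:
            return False
    return True


def path_length(path):
    """The path length is the number of edges in the path."""
    return len(path)


long_path = [(1, 2), (2, 3), (3, 4), (4, 6)]
short_path = [(1, 5), (5, 4), (4, 6)]

print(are_adjacent(E, 4, 6))   # True
print(are_adjacent(E, 4, 2))   # False
print(path_length(long_path))  # 4
print(path_length(short_path)) # 3
```

Note that these helpers only verify a path that is already given. Finding the shortest path, and therefore the distance, requires an algorithm covered later.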

## Graph Variations
### Weighted Graph
Each edge in a weighted graph has an associated "weight", or value. This weight represents the cost to move from one vertex to another.

Example weighted graph:
<img src="https://upload.wikimedia.org/wikipedia/commons/5/5f/CPT-Graphs-undirected-weighted.svg" width="300">
(image from [https://upload.wikimedia.org/wikipedia/commons/5/5f/CPT-Graphs-undirected-weighted.svg](https://upload.wikimedia.org/wikipedia/commons/5/5f/CPT-Graphs-undirected-weighted.svg)) 

The above example shows the distances in miles between pairs of towns in England. In a weighted graph, the *weighted path length* is the sum of the weights of the edges in a path. For example, the path from Dunwich to Maldon: (Dunwich, Blaxhall, 15), (Blaxhall, Feering, 46), (Feering, Maldon, 11) has weighted path length 15 + 46 + 11 = 72.
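
The weighted path length calculation can be sketched in Python as follows. The dictionary `weights` and the function `weighted_path_length` are hypothetical names chosen for illustration, and only the three edges named in the example path are included (the graph in the image has more).

```python
# Undirected weighted edges: a frozenset key ignores the order of the endpoints.
# Only the three edges used in the example path are listed here.
weights = {
    frozenset(("Dunwich", "Blaxhall")): 15,
    frozenset(("Blaxhall", "Feering")): 46,
    frozenset(("Feering", "Maldon")): 11,
}


def weighted_path_length(weights, vertices):
    """Sum the edge weights along a path given as a sequence of vertices."""
    total = 0
    for v_k, v_l in zip(vertices, vertices[1:]):
        total += weights[frozenset((v_k, v_l))]
    return total


path = ["Dunwich", "Blaxhall", "Feering", "Maldon"]
print(weighted_path_length(weights, path))  # 72
```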

### Directed Graph (Digraph)
Edges in a directed graph have a direction associated with them (e.g. one-way or two-way). An arrow is typically used to denote the direction of an edge.

Example digraph:
<img src="https://upload.wikimedia.org/wikipedia/commons/thumb/8/8c/CPT-Graphs-directed-unweighted.svg/1000px-CPT-Graphs-directed-unweighted.svg.png" width="300">
(image from [https://upload.wikimedia.org/wikipedia/commons/thumb/8/8c/CPT-Graphs-directed-unweighted.svg/1000px-CPT-Graphs-directed-unweighted.svg.png](https://upload.wikimedia.org/wikipedia/commons/thumb/8/8c/CPT-Graphs-directed-unweighted.svg/1000px-CPT-Graphs-directed-unweighted.svg.png)) 

The above example shows that it is easy to get between Dunwich and Blaxhall, but the trip between Feering and Tiptree depends on the direction of travel! Tiptree to Feering is one edge away, but Feering to Tiptree is two edges away. In fact, one could get from Harwich to Dunwich, but how could one get from Dunwich to Harwich?

#### Digraph Cycle
A path that starts and ends at the same vertex is called a *cycle*. In the above example, there are several cycles. For example:
* Feering, Maldon, Tiptree
* Maldon, Tiptree, Clacton
* Feering, Maldon, Tiptree, Clacton, Harwich, Tiptree, Feering

Graphs without cycles are called *acyclic* graphs. A directed acyclic graph is called a DAG. We will return to DAGs in future lessons.
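
To make the definition of a cycle concrete, here is a small sketch that checks whether a sequence of vertices forms a cycle in a digraph. The function name `is_cycle` is hypothetical, and the edge set holds only the three directed edges implied by the first cycle listed above, not the whole digraph from the image.

```python
# Directed edges (from, to) for the cycle Feering, Maldon, Tiptree.
# This is only a fragment of the digraph shown in the image.
edges = {("Feering", "Maldon"), ("Maldon", "Tiptree"), ("Tiptree", "Feering")}


def is_cycle(edges, vertices):
    """Return True if visiting the vertices in order, then returning to the
    first vertex, only follows directed edges that exist."""
    if not vertices:
        return False
    closed = list(vertices) + [vertices[0]]
    return all((a, b) in edges for a, b in zip(closed, closed[1:]))


print(is_cycle(edges, ["Feering", "Maldon", "Tiptree"]))  # True
print(is_cycle(edges, ["Tiptree", "Maldon", "Feering"]))  # False
```

The second call fails because direction matters in a digraph: the edges exist from Feering to Maldon and from Maldon to Tiptree, but not the other way around.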

#### Weighted Digraph
A graph can be both directed and weighted. 

Example weighted digraph:
<img src="https://upload.wikimedia.org/wikipedia/commons/thumb/2/22/CPT-Graphs-directed-weighted.svg/200px-CPT-Graphs-directed-weighted.svg.png" width="300">
(image from [https://upload.wikimedia.org/wikipedia/commons/thumb/2/22/CPT-Graphs-directed-weighted.svg/200px-CPT-Graphs-directed-weighted.svg.png](https://upload.wikimedia.org/wikipedia/commons/thumb/2/22/CPT-Graphs-directed-weighted.svg/200px-CPT-Graphs-directed-weighted.svg.png)) 

## Graph Abstract Data Type
Vertices in a graph ADT store a value, called the key, that names the vertex. Edges represent relationships/connections amongst the keys. The interface of a graph ADT includes the following constructor, methods, and operators: 
1. `Graph()` creates a new, empty graph.
1. `add_vertex(vert)` adds an instance of Vertex to the graph.
1. `add_edge(from_vert, to_vert)` adds a new, directed edge to the graph that connects two vertices.
1. `add_edge(from_vert, to_vert, weight)` adds a new, weighted, directed edge to the graph that connects two vertices.
1. `get_vertex(vert_key)` finds the vertex in the graph named `vert_key`.
1. `get_vertices()` returns the list of all vertices in the graph.
1. `in` returns `True` for a statement of the form `vertex in graph` if the given vertex is in the graph, `False` otherwise.
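
One possible way to realize this interface in Python is sketched below, loosely in the spirit of the adjacency-list approach in Miller and Ranum. Several details are assumptions made for illustration rather than part of the interface above: the attribute names `connected_to` and `vert_list`, the `add_neighbor` helper, `add_vertex` taking a key and building the `Vertex` itself, a default weight of 0 so that a single `add_edge` covers both the weighted and unweighted forms, and `get_vertices` returning the vertex keys.

```python
class Vertex:
    """A vertex identified by a key, with its outgoing edges."""

    def __init__(self, key):
        self.key = key
        self.connected_to = {}  # maps neighbor Vertex -> edge weight

    def add_neighbor(self, nbr, weight=0):
        self.connected_to[nbr] = weight


class Graph:
    """A directed graph stored as a dictionary of key -> Vertex."""

    def __init__(self):
        self.vert_list = {}

    def add_vertex(self, key):
        # Assumption: the caller passes a key and the Vertex is created here.
        new_vertex = Vertex(key)
        self.vert_list[key] = new_vertex
        return new_vertex

    def add_edge(self, from_vert, to_vert, weight=0):
        # Create either endpoint if it is not in the graph yet.
        if from_vert not in self.vert_list:
            self.add_vertex(from_vert)
        if to_vert not in self.vert_list:
            self.add_vertex(to_vert)
        self.vert_list[from_vert].add_neighbor(self.vert_list[to_vert], weight)

    def get_vertex(self, vert_key):
        # Returns None when no vertex has this key.
        return self.vert_list.get(vert_key)

    def get_vertices(self):
        return list(self.vert_list.keys())

    def __contains__(self, vert_key):
        # Supports the `in` operator: vert_key in graph
        return vert_key in self.vert_list


g = Graph()
g.add_edge("A", "B", 5)  # weighted, directed edge A -> B
g.add_edge("B", "C")     # unweighted edge uses the default weight
print(g.get_vertices())
print("A" in g, "Z" in g)
```

Because the edge is directed, `add_edge("A", "B", 5)` records B as a neighbor of A but not A as a neighbor of B. An undirected edge could be modeled by adding both directions.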

## Practice Problems

### 1
<img src="https://upload.wikimedia.org/wikipedia/commons/thumb/2/22/CPT-Graphs-directed-weighted.svg/200px-CPT-Graphs-directed-weighted.svg.png" width="300">
(image from [https://upload.wikimedia.org/wikipedia/commons/thumb/2/22/CPT-Graphs-directed-weighted.svg/200px-CPT-Graphs-directed-weighted.svg.png](https://upload.wikimedia.org/wikipedia/commons/thumb/2/22/CPT-Graphs-directed-weighted.svg/200px-CPT-Graphs-directed-weighted.svg.png)) 

In the graph above:
1. What vertices are adjacent to Harwich?
1. Is it possible to get from Clacton to all other vertices?
1. Is it possible to get from Dunwich to all other vertices?

### 2
In the graph above:
1. What are all possible paths from Harwich to Clacton?
1. What is the shortest path from Harwich to Clacton?