# Graphs
In this lecture we discuss several common problems in graph theory. The algorithms we cover are not just of theoretical interest; they are used in real-life applications. We will discuss:
1. A brief introduction to graphs
2. Topological Sort
3. Shortest-Path Algorithms
4. Network Flow Problems
5. Minimum Spanning Tree
6. Depth First Search

## Definitions
A **graph** $G = (V,E)$ consists of a set of **vertices**, $V$, and a set of **edges**, $E$. Each edge is a pair $(v,w)$, where $v,w \in V$. If the pairs are ordered (so the matrix of edges need not be symmetric), the graph is **directed**. A vertex $w$ is **adjacent** to $v$ if and only if $(v,w) \in E$. In an **undirected** graph, $(v,w) \in E$ implies $(w,v) \in E$; in other words, the matrix of edges is symmetric. Sometimes an edge has a third component, known as either a **weight** or a **cost**, which is the price of taking that edge. <br>

A **path** in a graph is a sequence of vertices $w_1,w_2,w_3,...,w_n$ such that $(w_i,w_{i+1}) \in E$ for $1 \leq i < n$. The **length** of a path is the number of edges on the path, which is equal to $n-1$. We also allow a path from a vertex to itself; if this path contains no edges, it has length 0. If the graph contains an edge $(v,v)$ from a vertex to itself, then the path $v,v$ is referred to as a **loop**. A **simple path** is a path in which all vertices are distinct, except that the first and last vertex may be the same. <br>

A **cycle** in a directed graph is a path of length at least 1 such that $w_1 = w_n$. The cycle is simple if the path is simple. For undirected graphs, we additionally require that the edges be distinct. This is so that the path $u,v,u$ is not a cycle, because $(u,v)$ and $(v,u)$ are the same edge. So we define a cycle in an undirected graph as a path $p = w_1,w_2,...,w_n$, where $w_1 = w_n$ and $\forall(w_i,w_{i+1}) \in p\,,(w_{i+1},w_i) \not\in p$. In a directed graph, however, these are different edges, so the path $p=u,v,u$ with $(u,v),(v,u) \in E$ is a cycle. A directed graph is **acyclic** if it has no cycles; we also refer to such a graph as a **DAG** (directed acyclic graph). <br>

An undirected graph is **connected** if there is a path from every vertex to every other vertex. A directed graph with this property is called **strongly connected**. If a directed graph is not strongly connected, but the underlying graph (without directed edges) is connected then it is said to be **weakly connected**. A **complete graph** is a graph in which there is an edge between every pair of vertices. <br>

A real-world example that can be modeled as a graph is a road system. Each intersection is a vertex and each street is a directed edge. You could also associate a cost with each edge, such as the speed limit, the distance, or the time it takes to travel from one intersection to the next.

## Representations of Graphs
One simple way to represent a graph is to use a two-dimensional array known as an **adjacency matrix**. For each edge $(u,v) \in E$, we set A\[u\]\[v\] to 1; all other entries are 0. If the edges have weights, we can store the weight as the entry instead, and for edges that don't exist we can store $\infty$ or $-\infty$ (depending on the problem we are trying to solve) or null. <br>

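However, if the graph is **sparse**, meaning $\mid E \mid \ll \mid V \mid^2$, a better solution is to use an **adjacency list**, which stores for each vertex only the vertices adjacent to it. An adjacency matrix always takes $\Theta(\mid V \mid^2)$ space, so for a sparse graph most of its entries are empty and take up more space than necessary, whereas an adjacency list takes $O(\mid V \mid + \mid E \mid)$ space.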
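
To make the two representations concrete, here is a minimal sketch that builds both from the same edge list. The graph, the vertex numbering `0..3`, and the helper names `build_adjacency_matrix` and `build_adjacency_list` are made up for illustration.

```python
# Hypothetical 4-vertex directed graph; vertices are numbered 0..3
edges = [(0, 1), (0, 2), (1, 2), (2, 3)]
num_vertices = 4

def build_adjacency_matrix(n, edge_list):
    """Return an n x n matrix holding 1 where edge (u, v) exists, else 0."""
    matrix = [[0] * n for _ in range(n)]
    for u, v in edge_list:
        matrix[u][v] = 1
    return matrix

def build_adjacency_list(n, edge_list):
    """Return a list whose entry u holds the vertices adjacent to u."""
    adjacency = [[] for _ in range(n)]
    for u, v in edge_list:
        adjacency[u].append(v)
    return adjacency

matrix = build_adjacency_matrix(num_vertices, edges)
adjacency = build_adjacency_list(num_vertices, edges)
```

The matrix stores 16 entries for only 4 edges, while the list stores one entry per edge plus one (possibly empty) list per vertex.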

## Topological Sort
A **topological sort** is an ordering of the vertices in a directed acyclic graph such that if there is a path from $v_i$ to $v_j$, then $v_j$ appears *after* $v_i$ in the ordering. The ordering is not necessarily unique; a directed acyclic graph may have multiple legal orderings. A topological ordering cannot exist for a graph with a cycle, because for two vertices $v$ and $w$ on the cycle, $v$ must precede $w$ and $w$ must precede $v$. In the graph below, $v_1, v_2, v_5, v_4, v_3, v_7, v_6$ and $v_1, v_2, v_5, v_4, v_7, v_6, v_3$ are both topological orderings.
<img src="./files/Graphs/topological_sort.png" width="400"/>

A simple algorithm for finding a topological ordering starts by finding any vertex with no incoming edges. We define the **indegree** of a vertex $v$ as $\mid\{(u,v) \in E\}\mid$, the number of edges going into $v$. Assume each vertex keeps track of its indegree, which is easy to calculate. The idea is to look for a vertex with an indegree of 0 and assign it the next position in the ordering. We then decrease the indegree of each of its adjacent vertices by one, which is equivalent to removing the vertex and its outgoing edges from the graph, and look for another vertex that has indegree 0 and does not already have an ordering. We repeat this until all the vertices have an ordering. Let's look at the code below:
```python
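def find_next_vertex(V):
    # Scan every vertex for one with no remaining incoming edges
    # that has not yet been placed in the ordering
    for v in V:
        if v.indegree == 0 and v.ordering is None:
            return v

    return None

def top_sort(V):
    # V is the collection of vertices; each vertex has the attributes
    # indegree, ordering (initially None) and adjacent
    for i in range(len(V)):
        v = find_next_vertex(V)
        if v is None:
            # Vertices remain but none has indegree 0
            raise ValueError("graph has a cycle")
        v.ordering = i
        for w in v.adjacent:
            w.indegree -= 1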
```
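
The indegrees that the algorithm relies on can be computed with one pass over the edges. The sketch below assumes the graph is given as a dictionary mapping every vertex to the list of vertices it points to; the name `compute_indegrees` and the example graph are hypothetical.

```python
def compute_indegrees(adjacency):
    """Count incoming edges per vertex.

    adjacency maps each vertex to the list of vertices it points to
    (assumed: every vertex appears as a key, even with no outgoing edges).
    """
    indegree = {v: 0 for v in adjacency}
    for v in adjacency:
        for w in adjacency[v]:
            indegree[w] += 1
    return indegree

# Hypothetical DAG: a -> b, a -> c, b -> d, c -> d
dag = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
indegree = compute_indegrees(dag)
```

Here `a` is the only vertex with indegree 0, so it would be the first vertex placed in the ordering.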

While this code works, it is inefficient: the running time is $O(\lvert V\rvert^2)$, since we iterate through the list of vertices each time we want to find a new vertex. Because we only decrease the indegree of vertices adjacent to the current vertex, any vertex whose indegree newly drops to 0 must be among those neighbors. Therefore we can use a queue to keep track of all vertices with an indegree of 0: we start with the vertices that have no incoming edges and enqueue further vertices as we encounter them while iterating through the neighbors. Each vertex enters the queue at most once, so we never revisit a vertex that already has an ordering. If the graph has a cycle, the queue empties before every vertex has been given an ordering. The topological ordering is then the order in which the vertices are dequeued. This algorithm is also faster: it runs in $O(\lvert V\rvert + \lvert E\rvert)$ if we use an adjacency list. Let's look at the algorithm with a queue.
```python
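from collections import deque

def top_sort(V):
    q = deque()
    counter = 0  # keeps track of our ordering

    # Gather every vertex that starts with no incoming edges
    for v in V:
        if v.indegree == 0:
            q.append(v)

    while q:
        v = q.popleft()
        v.ordering = counter
        counter += 1

        # Removing v lowers the indegree of each of its neighbors
        for w in v.adjacent:
            w.indegree -= 1
            if w.indegree == 0:
                q.append(w)

    # If the queue ran dry before every vertex was ordered,
    # the leftover vertices lie on or behind a cycle
    if counter != len(V):
        raise ValueError("graph has a cycle")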
```
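
For experimenting without a vertex class, the queue-based idea can also be sketched on a plain dictionary. This is a self-contained variant under assumptions: the graph is a dictionary from each vertex to its list of adjacent vertices, and the function name `topological_order` and the example graph are hypothetical.

```python
from collections import deque

def topological_order(adjacency):
    """Return the vertices in a topological order, or raise on a cycle."""
    # Compute the indegree of every vertex
    indegree = {v: 0 for v in adjacency}
    for v in adjacency:
        for w in adjacency[v]:
            indegree[w] += 1

    # Start with every vertex that has no incoming edges
    queue = deque(v for v in adjacency if indegree[v] == 0)
    order = []
    while queue:
        v = queue.popleft()
        order.append(v)
        for w in adjacency[v]:
            indegree[w] -= 1
            if indegree[w] == 0:
                queue.append(w)

    # Vertices on a cycle never reach indegree 0, so they are never dequeued
    if len(order) != len(adjacency):
        raise ValueError("graph contains a cycle")
    return order

# Hypothetical DAG: a -> b, a -> c, b -> d, c -> d
dag = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
order = topological_order(dag)
```

For this graph the vertices are dequeued in the order `a, b, c, d`. Passing a graph such as `{"a": ["b"], "b": ["a"]}` raises the error, because no vertex ever has indegree 0.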

## Shortest-Path Algorithms
A popular problem in graph theory is finding the shortest path from an initial vertex $v$ to another vertex $w$. You can think of this as how Google Maps finds the shortest route when you look up directions. In these problems the input is a weighted graph where each edge $(v_i,v_j)$ has an associated cost $c_{i,j}$ to traverse (or take) that edge. The cost of a path $v_1v_2...v_n$ is $\sum_{i=1}^{n-1} c_{i,i+1}$. We refer to this as the **weighted path length**. The **unweighted path length** is just the number of edges on the path, which is $n-1$. <br>
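
As a small illustration of the two path lengths, the sketch below sums edge costs along a path. The cost table, the vertex names, and the function names are assumptions made up for the example.

```python
def weighted_path_length(path, cost):
    """Sum the costs of the edges between consecutive vertices of path."""
    return sum(cost[(path[i], path[i + 1])] for i in range(len(path) - 1))

def unweighted_path_length(path):
    """Number of edges on the path: one fewer than the number of vertices."""
    return len(path) - 1

# Hypothetical edge costs, keyed by (from, to)
cost = {("v1", "v2"): 2, ("v2", "v3"): 5, ("v3", "v4"): 1}
path = ["v1", "v2", "v3", "v4"]

weighted = weighted_path_length(path, cost)    # 2 + 5 + 1
unweighted = unweighted_path_length(path)      # 3 edges
```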

##### Single-Source Shortest-Path Problem
Given as input a weighted graph, G = (V,E), and a distinguished vertex, s, find the shortest weighted path from s to every other vertex in G. <br>

For example, in the following graph the shortest weighted path from $v_1$ to $v_6$ has a cost of 6 and follows the path $v_1,v_4,v_7,v_6$.
<img src="./files/Graphs/ex_one_shortest_path.png" width="400"/>

However what happens if the graph has negative cost edges like the graph below?
<img src="./files/Graphs/ex_two_shortest_path_neg.png" width="400"/>

There is a path from $v_5$ to $v_4$ with cost 1; however, there exists a shorter path $v_5, v_4, v_2, v_5, v_4$, which has cost -5. This path is still not the shortest, because we could go around the loop arbitrarily many times and keep decreasing the cost. Thus the shortest path between the two vertices is undefined. This loop is known as a **negative-cost cycle**, and when one exists in a graph the shortest paths are undefined. Negative-cost edges are not necessarily bad the way negative-cost cycles are, but their presence makes the problem harder to solve. In the absence of a negative-cost cycle, the shortest path from a vertex to itself has cost 0.
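
The effect of a negative-cost cycle can be seen numerically. The edge costs below are an assumption: the figure is not reproduced here, so they were chosen only to agree with the totals quoted above (1 for the direct path and -5 after one trip around the loop). The helper names are hypothetical as well.

```python
# Assumed edge costs, chosen to match the path totals quoted in the text
cost = {("v5", "v4"): 1, ("v4", "v2"): -10, ("v2", "v5"): 3}

def path_cost(path):
    """Sum the costs of the edges between consecutive vertices of path."""
    return sum(cost[(path[i], path[i + 1])] for i in range(len(path) - 1))

def path_with_laps(laps):
    """Path from v5 to v4 that first goes around the cycle v5, v4, v2 `laps` times."""
    return ["v5", "v4", "v2"] * laps + ["v5", "v4"]

# Cost of reaching v4 after 0, 1, 2 and 3 trips around the cycle
costs = [path_cost(path_with_laps(k)) for k in range(4)]
```

Under these assumed costs every extra trip around the cycle lowers the total by 6, so no path can be called the shortest.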

### Unweighted Shortest Paths
When a graph is unweighted, we are only interested in the number of edges on the path. This is the same as assigning every edge a weight of 1. For now we are only interested in the lengths of the shortest paths and not the actual paths themselves, since keeping track of the paths is a simple addition. In the following graph, suppose we choose s to be $v_3$.
##### Graph image

Here we immediately know that the shortest path from s to $v_3$ has length 0, so we mark it down. Next we look at all vertices adjacent to s, which are a distance of 1 away. In this case $v_1$ and $v_6$ are both 1 away from s. We continue this way, finding all vertices at distance 2, 3, ... until the shortest paths from s to every vertex are known. What we are doing here is performing a **breadth-first search (BFS)**. It operates by processing the vertices in layers: the vertices closest to the start are evaluated first and the most distant vertices are evaluated last. Another way to think about it is with trees. We can perform BFS on a tree by looking at the children of the root first, then the children's children, and so on until we reach the lowest level, which contains only leaves. <br>

##### Table image
With this strategy we can create an initial algorithm. We create a table, as above, to keep track of information during the algorithm. For each vertex we maintain three pieces of information. First, its distance from s is kept in the entry $d_v$. Initially all distances are set to $\infty$ except for s, whose distance to itself is 0. The entry $p_v$ records the previous vertex on the path and is used to recover the actual paths from s to each vertex. The entry *known* is set to **true** after a vertex has been processed. Initially no vertex is *known*, including s. When a vertex is marked *known*, we have a guarantee that no cheaper path will ever be found, which means processing for that vertex is complete. Now let's look at the algorithm in code:
```python
import math

def shortest_unweighted(G, s):
    # Tables keyed by vertex: tentative distance, processed flag, predecessor
    dist = {v: math.inf for v in G.V}
    known = {v: False for v in G.V}
    path = {v: None for v in G.V}

    dist[s] = 0
    # Pass i processes every vertex at distance exactly i from s
    for i in range(len(G.V)):
        for v in G.V:
            if not known[v] and dist[v] == i:
                known[v] = True
                for u in v.adjacent:
                    if dist[u] == math.inf:
                        dist[u] = i + 1
                        path[u] = v

    return dist, path
```
We can easily find the paths by backtracking through the path variable. The running time of this algorithm is $O(\lvert V\rvert^2)$ because of the two nested for loops. The obvious inefficiency is that the outer loop runs $\lvert V\rvert$ times even if all the vertices are already *known*. We can remove this inefficiency by using a queue. At the start of each pass the queue contains only vertices at the current distance. As we process them, we enqueue their adjacent vertices that have not been reached yet, which are at the current distance + 1. In this case we can remove the *known* variable, since once a vertex is processed it will never enter the queue again. Now let's look at the refined algorithm:
```python
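import math
from collections import deque

def shortest_unweighted(G, s):
    q = deque()

    # Tables keyed by vertex: distance from s and predecessor on the path
    dist = {v: math.inf for v in G.V}
    path = {v: None for v in G.V}

    dist[s] = 0
    q.append(s)

    while q:
        v = q.popleft()

        for w in v.adjacent:
            # An infinite distance means w has not been reached yet
            if dist[w] == math.inf:
                dist[w] = dist[v] + 1
                path[w] = v
                q.append(w)

    return dist, path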
```

With the addition of a queue, the running time improves to $O(\lvert E\rvert + \lvert V\rvert)$, as long as we use an adjacency list.
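
Backtracking through the predecessor entries to recover an actual path can be sketched as follows. This is a self-contained illustration under assumptions: the graph is a dictionary from each vertex to its adjacent vertices, and the names `bfs_shortest_paths`, `reconstruct_path`, and the example graph are hypothetical.

```python
import math
from collections import deque

def bfs_shortest_paths(adjacency, s):
    """Return (dist, prev) for unweighted shortest paths from s."""
    dist = {v: math.inf for v in adjacency}
    prev = {v: None for v in adjacency}
    dist[s] = 0
    queue = deque([s])
    while queue:
        v = queue.popleft()
        for w in adjacency[v]:
            if dist[w] == math.inf:  # w has not been reached yet
                dist[w] = dist[v] + 1
                prev[w] = v
                queue.append(w)
    return dist, prev

def reconstruct_path(prev, s, target):
    """Walk predecessors back from target; return None if s is unreachable."""
    path = []
    v = target
    while v is not None:
        path.append(v)
        v = prev[v]
    path.reverse()
    return path if path[0] == s else None

# Hypothetical graph; vertex "e" cannot be reached from "s"
graph = {"s": ["a", "b"], "a": ["c"], "b": ["c"], "c": ["d"], "d": [], "e": ["s"]}
dist, prev = bfs_shortest_paths(graph, "s")
path_to_d = reconstruct_path(prev, "s", "d")
path_to_e = reconstruct_path(prev, "s", "e")
```

The walk starts at the target and follows the predecessor entries until it runs out, so the list is built backwards and reversed at the end.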

### Dijkstra's Algorithm
If the graph is weighted, the problem becomes harder, but we can still use the general idea of the unweighted case. We keep the same table of information as before. Each vertex is marked either *known* or *unknown*; a tentative distance $d_v$ is kept for each vertex (this distance is the length of the shortest path from s to v using only *known* vertices as intermediates); and we record $p_v$, which is the last vertex to cause a change to $d_v$. <br>

The general method to solve the single-source shortest-path problem is known as **Dijkstra's algorithm**. It is a classic example of a **greedy algorithm**. A greedy algorithm solves a problem in stages by always choosing what appears to be the best option at each stage. For example, when making change, a cashier starts with the largest bill that fits and works all the way down to the 1 dollar bill; with U.S. denominations this uses the minimum number of bills. <br>
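
The change-making example can be written as a short greedy routine. This is only a sketch: the function name `make_change` and the default set of bill denominations are assumptions made for illustration.

```python
def make_change(amount, denominations=(100, 50, 20, 10, 5, 1)):
    """Greedy change-making: always take the largest bill that still fits.

    The default denominations are an assumed set of dollar bills.
    """
    bills = []
    for bill in sorted(denominations, reverse=True):
        # Use as many of this bill as fit, then move to the next smaller one
        count, amount = divmod(amount, bill)
        bills.extend([bill] * count)
    return bills

change = make_change(88)

# With an unusual (made-up) set of denominations the greedy choice is not optimal:
# greedy uses three bills for 6, although 3 + 3 needs only two
odd_change = make_change(6, denominations=(4, 3, 1))
```

The second call shows why a greedy algorithm chooses what *appears* to be best at each stage: whether that yields the best overall answer depends on the problem.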

Dijkstra's algorithm works in stages. At each stage we select the vertex v that has the smallest $d_v$ among all the *unknown* vertices and declare that the shortest path from s to v is *known*. The remainder of the stage consists of updating the values of $d_w$ for the vertices w adjacent to v. In the unweighted case we set $d_w = d_v + 1$ if $d_w = \infty$; in effect we lowered the value of $d_w$ whenever v offered a shorter path. Here we apply the same logic and set $d_w = d_v + c_{v,w}$ if this new value for $d_w$ is less than its current value. In other words, it is only a good idea to use v on the path to w if doing so offers a lower cost. We will use the following graph for our example, followed by our initial table; in this example $s = v_1$.
##### Graph image
##### Table image

The first vertex selected is $v_1$, with a path length of 0 (since the path from a vertex to itself has length 0). We then mark this vertex as *known*. Now that $v_1$ is *known*, some entries need to be adjusted. The vertices adjacent to $v_1$ are $v_2$ and $v_4$, and both get their entries adjusted as follows:

##### Table Image

Next $v_4$ is selected, since it has the smallest distance of the *unknown* vertices, and marked as *known*. Vertices $v_3, v_5, v_6$ and $v_7$ are adjacent and are updated accordingly.

##### Table Image

Next we select $v_2$. Here $v_4$ is adjacent but it is already *known*, so no work is performed on it. $v_5$ is adjacent but it is not adjusted since it would raise the cost from 3 to 12. So no changes occur other than changing $v_2$ to *known*. Next we select $v_5$ which only has $v_7$ as adjacent, but it is not adjusted as it would raise the cost. Then we select $v_3$, and we adjust $v_6$ since 8 < 9 resulting in the following table.

##### Table Image

Next we select $v_7$ and $v_6$ gets updated from 8 to 6.

##### Table Image

Lastly $v_6$ is selected; no changes are made, resulting in the final table.

##### Table Image

Now let's look at some code for the algorithm itself:
```python
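import math

def dijkstra(G, s, c):
    # c is a matrix of costs: c[v, w] is the cost of edge (v, w)
    dist = {v: math.inf for v in G.V}
    known = {v: False for v in G.V}
    path = {v: None for v in G.V}

    dist[s] = 0
    for _ in range(len(G.V)):
        # Select the unknown vertex with the smallest tentative distance
        unknown = [u for u in G.V if not known[u]]
        v = min(unknown, key=lambda u: dist[u])
        if dist[v] == math.inf:
            break  # the remaining vertices cannot be reached from s
        known[v] = True

        # Lower d_w for each unknown neighbor if going through v is cheaper
        for w in v.adjacent:
            if not known[w] and dist[w] > dist[v] + c[v, w]:
                dist[w] = dist[v] + c[v, w]
                path[w] = v

    return dist, path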
```

This algorithm always works as long as there are no negative-cost edges. If any edge has a negative cost, the algorithm may produce the wrong answer. The total running time of this algorithm is $O(\lvert E\rvert + \lvert V \rvert^2) = O(\lvert V \rvert^2)$. Each stage takes $O(\lvert V \rvert)$ time to find the minimum vertex, so $O(\lvert V \rvert^2)$ time is spent finding minimums over the course of the algorithm, and the updates take $O(\lvert E\rvert)$ time in total, since there is at most one update per edge. We can optimize this by keeping the *unknown* vertices in a heap to get the minimum vertex, and by applying a **decrease\_key** operation whenever a vertex's distance is updated. This gives a total running time of $O(\lvert E \rvert \log\lvert V\rvert)$.
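
A heap-based version can be sketched with Python's `heapq` module. Note that `heapq` does not provide a decrease\_key operation, so this sketch swaps in a different technique: it pushes a new entry whenever a distance improves and skips stale entries when they are popped. The heap then holds at most one entry per edge, so the bound is still $O(\lvert E \rvert \log\lvert V\rvert)$. The graph format (a dictionary from each vertex to a list of `(neighbor, cost)` pairs) and the edge costs are assumptions, chosen to be consistent with the numbers in the walkthrough above.

```python
import heapq
import math

def dijkstra_heap(adjacency, s):
    """adjacency maps each vertex to a list of (neighbor, cost) pairs, cost >= 0."""
    dist = {v: math.inf for v in adjacency}
    prev = {v: None for v in adjacency}
    dist[s] = 0
    heap = [(0, s)]  # entries are (tentative distance, vertex)
    known = set()

    while heap:
        d, v = heapq.heappop(heap)
        if v in known:
            continue  # stale entry: v was already finalized with a smaller distance
        known.add(v)
        for w, cost in adjacency[v]:
            if w not in known and d + cost < dist[w]:
                dist[w] = d + cost
                prev[w] = v
                heapq.heappush(heap, (dist[w], w))
    return dist, prev

# Assumed edge costs, chosen to agree with the walkthrough's numbers
graph = {
    "v1": [("v2", 2), ("v4", 1)],
    "v2": [("v4", 3), ("v5", 10)],
    "v3": [("v1", 4), ("v6", 5)],
    "v4": [("v3", 2), ("v5", 2), ("v6", 8), ("v7", 4)],
    "v5": [("v7", 6)],
    "v6": [],
    "v7": [("v6", 1)],
}
dist, prev = dijkstra_heap(graph, "v1")
```

Under these assumed costs, $v_6$ ends with distance 6 and predecessor $v_7$, matching the final step of the walkthrough.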

### Graphs with Negative Cost Edges