# Graph Theory

The study of graphs (networks) from a computer science perspective.

[Graph Theory Youtube Series](https://www.youtube.com/watch?v=DgXR2OWQnLc&list=PLDV1Zeh2NRsDGO4--qE8yH72HFL1Km93P&index=1)

# Graph theory Introduction

### Objective: gain an understanding of how to apply graph theory to real-world applications

# What is graph theory
- Graph theory
    - is the mathematical theory of the properties and applications of graphs
    
Graphs can be used to represent almost any problem.
A typical question: given some constraints, how many different x exist?
- nodes represent entities
- edges or connections represent relationships

Types of graphs:
- Undirected
    - a graph in which edges have no orientation
    - the edge (u, v) is identical to the edge (v, u)
    - this means the cost of going from A to B is the same as the cost of going from B to A
- Directed
    - a graph in which edges have orientations
    - edge (u, v) is the edge from node u to node v
    - the reverse edge (v, u) does not exist unless it is given explicitly
- Weighted
    - a graph can have edges that carry a weight to represent an arbitrary value
    - properties like cost, distance, quantity etc
    - represented by (u, v, w), where w is the weight
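
As a rough illustration of the three edge types above, the sketch below stores each kind of edge in plain Python containers. The helper names (`add_undirected_edge`, `add_directed_edge`, `add_weighted_edge`) and the sample nodes are made up for this example.

```python
def add_undirected_edge(graph, u, v):
    # (u, v) is identical to (v, u), so record the edge in both directions
    graph.setdefault(u, set()).add(v)
    graph.setdefault(v, set()).add(u)


def add_directed_edge(graph, u, v):
    # only u -> v is recorded; v is registered as a node with no outgoing edge
    graph.setdefault(u, set()).add(v)
    graph.setdefault(v, set())


def add_weighted_edge(graph, u, v, w):
    # the triple (u, v, w) becomes graph[u][v] = w
    graph.setdefault(u, {})[v] = w
    graph.setdefault(v, {})


undirected, directed, weighted = {}, {}, {}
add_undirected_edge(undirected, 'A', 'B')
add_directed_edge(directed, 'A', 'B')
add_weighted_edge(weighted, 'A', 'B', 7)

print('A' in undirected['B'])  # True: the edge works both ways
print('A' in directed['B'])    # False: there is no edge B -> A
print(weighted['A']['B'])      # 7
```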
# Special types of graphs

- Trees
    - an undirected graph with no cycles
    - can have many branches
- Rooted trees
    - a rooted tree is a tree with a designated root node where every edge either points away from or towards the root node
    - Out-tree is when the edges point away from the root of the graph
    - In-tree is when the edges point towards the root of the graph
- Directed Acyclic Graphs (DAGs)
    - DAGs are directed graphs with no cycles
    - they represent structures with dependencies
- Bipartite Graph
    - A bipartite graph is one whose vertices can be split into two independent groups U and V such that every edge connects a vertex in U to a vertex in V
    - important for network flow
- Complete Graph
    - A complete graph is one where there is a unique edge between every pair of nodes
    - a complete graph with n vertices is denoted as the graph $K_n$
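
To make the complete graph definition concrete, here is a small sketch that lists the edges of $K_n$ for nodes labeled 0 to n-1. The function name `complete_graph_edges` is hypothetical. Since every pair of nodes gets exactly one edge, $K_n$ has n(n-1)/2 edges.

```python
from itertools import combinations


def complete_graph_edges(n):
    """Return the undirected edges of K_n on nodes 0..n-1."""
    # combinations yields each unordered pair exactly once
    return list(combinations(range(n), 2))


edges_k4 = complete_graph_edges(4)
print(edges_k4)       # [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
print(len(edges_k4))  # 6, which is 4 * 3 / 2
```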


### How to represent graphs

- Adjacency Matrix
    - An adjacency matrix m is a V x V matrix in which each cell m[i][j] holds the weight of the edge going from node i to node j
    
| Pros  | Cons  |
|---|---|
| Space efficient for representing dense graphs  | Requires O(V^2) space  |
| Edge weight lookup is O(1)  |  Iterating over all edges takes O(V^2) time |
| Simple graph representation  | -  |
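
A minimal sketch of the adjacency matrix described above, assuming nodes are numbered 0 to n-1 and that a missing edge is marked with infinity. Both the `INF` marker and the function name `build_adjacency_matrix` are choices made for this example.

```python
INF = float('inf')  # assumption: INF marks "no edge"


def build_adjacency_matrix(n, edges):
    """Build an n x n matrix from (u, v, w) triples of a directed graph."""
    matrix = [[INF] * n for _ in range(n)]
    for i in range(n):
        matrix[i][i] = 0  # going from a node to itself costs nothing
    for u, v, w in edges:
        matrix[u][v] = w
    return matrix


m = build_adjacency_matrix(3, [(0, 1, 4), (0, 2, 1), (2, 1, 2)])
print(m[0][1])  # 4, an O(1) lookup
print(m[1][0])  # inf, because there is no edge 1 -> 0

# Listing all edges means scanning every cell, hence O(V^2)
all_edges = [(i, j, w) for i, row in enumerate(m)
             for j, w in enumerate(row) if i != j and w != INF]
print(all_edges)  # [(0, 1, 4), (0, 2, 1), (2, 1, 2)]
```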

- Adjacency List
    - An adjacency list is a way to represent a graph as a map from nodes to lists of edges

It can be represented as an adjacency dictionary, which is usually what is meant when the term adjacency list is used.

A good walkthrough of graph representations is the [Bradfield post on representing a graph](https://bradfieldcs.com/algos/graphs/representing-a-graph/)

[Graph Theory Youtube Series](https://www.youtube.com/watch?v=DgXR2OWQnLc&list=PLDV1Zeh2NRsDGO4--qE8yH72HFL1Km93P&index=2)

Introduction to graphs with Bradfield

- Graphs can be used to represent many interesting things about our world including
    - systems of roads
    - airline flights from city to city
    - how the internet is connected
    - sequence of classes you must complete first


- Vertex
    - also called a node, a vertex is a fundamental part of a graph
    - it can have a name, which we will call the 'key'
    - a vertex may also have additional information, which we call the 'payload'
    
- Edge
    - an edge is another fundamental part of a graph
    - an edge connects two vertices to show that there is a relationship between them
    - edges may be one-way or two-way
    - if the edges in a graph are all one-way, then it is a directed graph
    
- Weight
    - An edge may be weighted to show that there is a cost to go from one vertex to another

- Path
    - A path in a graph is a sequence of vertices that are connected by edges
    
- Cycle
    - A cycle in a directed graph is a path that starts and ends at the same vertex
    - a graph with no cycles is called an acyclic graph
    - a directed graph with no cycles is called a directed acyclic graph or a DAG
    
- The Graph Abstract Data Type (ADT)
    - Graph()
        - creates a new graph
    - add_vertex(vertex)
        - adds an instance of Vertex to the graph
    - add_edge(from_vertex, to_vertex, weight)
        - adds a new, weighted, directed edge to the graph that connects two vertices
    - get_vertex(key)
        - finds the vertex in the graph named key
    - get_vertices()
        - returns the list of all vertices in the graph
    - in
        - returns True if the given vertex is in the graph, else False

[link](https://bradfieldcs.com/algos/graphs/introduction/)

# Representing a graph

- Two most common abstract representations of graphs are:
    - adjacency matrix
    - adjacency list
        - is really a mapping that can be represented in python by
            - an object-oriented approach with a Python dict as its underlying data type
            - or a plain dict directly

- The Adjacency Matrix
    - One of the easiest ways to implement a graph is to use a two-dimensional matrix
    - in this matrix implementation, each of the rows and columns represents a vertex in the graph
    - the value stored in the cell at the intersection of row v and column w indicates whether there is an edge from vertex v to vertex w
    - when two vertices are connected by an edge, we say that they are adjacent
    - a value in a cell represents the weight of the edge from vertex v to vertex w
        - pros: simple
        - cons: not a space-efficient way to store sparse graphs, since most cells are empty

- The Adjacency List
    - A more space-efficient way to implement a sparsely connected graph
    - we keep a master collection of all the vertices in the graph object, and each vertex object maintains a list of the other vertices that it is connected to
    - in the implementation, each vertex uses a dictionary rather than a list, where the dictionary keys are the neighboring vertices and the values are the edge weights
        - pros: it allows us to compactly represent a sparse graph. The adjacency list also allows us to easily find all the links that are directly connected to a particular vertex
        
- An object-oriented approach
    - using dictionaries, it is easy to implement the adjacency list in Python
    - in this implementation we create two classes: Graph, which holds the master list of vertices, and Vertex, which will represent each vertex in the graph
    
- Each vertex uses a dictionary to keep track of the vertices to which it is connected, and the weight of each edge
    - if we weren't concerned with edge weights, we could use a set in place of a dictionary
    - this dictionary is called neighbors
    
- In the code below the add_neighbor method is used to add a connection from the vertex to another
- the get_connections method returns all the vertices in the adjacency list, as represented by the neighbors instance variable
- the get_weight method returns the weight of the edge from this vertex to the vertex passed as a parameter


[link](https://bradfieldcs.com/algos/graphs/representing-a-graph/)

In [9]:
class Vertex(object):
    def __init__(self, key):
        self.key = key
        self.neighbors = {}

    def add_neighbor(self, neighbor, weight=0):
        self.neighbors[neighbor] = weight

    def __str__(self):
        return '{} neighbors: {}'.format(
            self.key,
            [x.key for x in self.neighbors]
        )

    def get_connections(self):
        return self.neighbors.keys()

    def get_weight(self, neighbor):
        return self.neighbors[neighbor]
        
        
class Graph(object):
    def __init__(self):
        self.verticies = {}

    def add_vertex(self, vertex):
        self.verticies[vertex.key] = vertex

    def get_vertex(self, key):
        try:
            return self.verticies[key]
        except KeyError:
            return None

    def __contains__(self, key):
        return key in self.verticies

    def add_edge(self, from_key, to_key, weight=0):
        if from_key not in self.verticies:
            self.add_vertex(Vertex(from_key))
        if to_key not in self.verticies:
            self.add_vertex(Vertex(to_key))
        self.verticies[from_key].add_neighbor(self.verticies[to_key], weight)

    def get_vertices(self):
        return self.verticies.keys()

    def __iter__(self):
        return iter(self.verticies.values())

In [13]:
g = Graph()
for i in range(6):
    g.add_vertex(Vertex(i))
g.add_edge(0, 1, 5)
g.add_edge(0, 5, 2)
g.add_edge(1, 2, 4)
g.add_edge(2, 3, 9)
g.add_edge(3, 4, 7)
g.add_edge(3, 5, 3)
g.add_edge(4, 0, 1)
g.add_edge(5, 4, 8)
g.add_edge(5, 2, 1)
for v in g:
    for w in v.get_connections():
        print(f'{v.key} -> {w.key}')

0 -> 1
0 -> 5
1 -> 2
2 -> 3
3 -> 4
3 -> 5
4 -> 0
5 -> 4
5 -> 2


In [14]:
# Using dictionaries directly

graph_dict = {
    0: {1: 5, 5: 2},
    1: {2: 4},
    2: {3: 9},
    3: {4: 7, 5: 3},
    4: {0: 1},
    5: {4: 8, 2: 1}
}
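
A short sketch of how a dict-of-dicts graph can be queried without any classes. The tiny graph and the helper name `iter_edges` are invented for this illustration.

```python
graph = {
    'a': {'b': 5, 'c': 2},
    'b': {'c': 4},
    'c': {},
}


def iter_edges(graph):
    """Yield every (from_key, to_key, weight) triple in the graph."""
    for from_key, neighbors in graph.items():
        for to_key, weight in neighbors.items():
            yield from_key, to_key, weight


print(graph['a']['b'])         # 5, the weight of edge a -> b
print('a' in graph['c'])       # False, there is no edge c -> a
print(list(iter_edges(graph)))
# [('a', 'b', 5), ('a', 'c', 2), ('b', 'c', 4)]
```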

# Common graph theory problems

1. Shortest path problem
    - given a weighted graph, find the shortest path of edges from node A to node B
    - Algorithms
        - BFS (unweighted)
        - Dijkstra's
        - Bellman-Ford
        - Floyd-Warshall
        - A*

2. Connectivity
    - Does there exist a path between node A and node B?
    - Algorithms
        - Union find
        - any search algorithm (DFS)

3. Negative cycles
    - Does my weighted digraph have any negative cycles?
    - Algorithms
        - Bellman-Ford
        - Floyd-Warshall
        
4. Strongly connected components
    - Strongly connected components (SCCs) can be thought of as self-contained cycles within a directed graph where every vertex in a given cycle can reach every other vertex in the same cycle
    - Algorithms
        - Tarjan's
        - Kosaraju's

5. Traveling salesman problem
    - Given a list of cities and the distance between each pair of cities, what is the shortest possible route that visits each city exactly once and returns to the origin city?
    - Algorithms
        - Held-Karp
        - branch and bound

6. Bridges
    - A bridge / cut edge is any edge in a graph whose removal increases the number of connected components
    - Bridges are important in graph theory because they often hint at weak points, bottlenecks or vulnerabilities in a graph

7. Articulation points
    - An articulation point / cut vertex is any node in a graph whose removal increases the number of connected components
    - Articulation points are important in graph theory because they often hint at weak points, bottlenecks or vulnerabilities in a graph
    
8. Minimum spanning tree (MST)
    - a minimum spanning tree (MST) is a subset of the edges of a connected, edge-weighted graph that connects all the vertices together, without any cycles and with the minimum possible total edge weight
    - Algorithms
        - Kruskal's
        - Prim's
        - Boruvka's
        
9. Network flow: Max flow
    - with an infinite input source, how much 'flow' can we push through the network?
    - suppose the edges are roads with cars, pipes with water or hallways packed with people. Flow represents the volume of water allowed to flow through the pipes, the number of cars the roads can sustain in traffic and the maximum number of people that can navigate through the hallways
    - Algorithms
        - Ford-Fulkerson
        - Edmonds-Karp & Dinic's
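
As one concrete example from the list above, here is a minimal sketch of Dijkstra's algorithm for the shortest path problem. It assumes a dict-of-dicts graph with non-negative weights and uses Python's `heapq` as the priority queue. The graph reuses the edges from the object-oriented example earlier in these notes, and the function name `dijkstra` is our own.

```python
import heapq


def dijkstra(graph, start):
    """Return the shortest distance from start to every reachable node."""
    dist = {start: 0}
    heap = [(0, start)]  # (distance so far, node)
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float('inf')):
            continue  # stale heap entry, a shorter route was already found
        for neighbor, weight in graph[node].items():
            new_d = d + weight
            if new_d < dist.get(neighbor, float('inf')):
                dist[neighbor] = new_d
                heapq.heappush(heap, (new_d, neighbor))
    return dist


graph = {
    0: {1: 5, 5: 2},
    1: {2: 4},
    2: {3: 9},
    3: {4: 7, 5: 3},
    4: {0: 1},
    5: {4: 8, 2: 1},
}
dist = dijkstra(graph, 0)
print(dist[2])  # 3, via 0 -> 5 -> 2
print(dist[3])  # 12, via 0 -> 5 -> 2 -> 3
```

Dijkstra's algorithm does not handle negative edge weights; for those, Bellman-Ford from the list above is the usual choice.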
        

# DFS overview YT series

- The Depth First Search (DFS) is the most fundamental search algorithm used to explore the nodes and edges of a graph
- It runs with a time complexity of O(V+E) and is often used as a building block in other algorithms

- By itself the DFS is not all that useful
- when augmented it can perform other tasks such as
    - count connected components
    - determine connectivity
    - find bridges/articulation points
    
# Basics DFS

- DFS plunges depth first into a graph without regard for which edge it takes next until it cannot go any further at which point it backtracks and continues

Image is at timestamp 2:40
- order
    - Starts at a node (0)
        - search node (9)
            - search node (8)
                - search node (7)
                    - search node (10)
                        - search node (11)
                    - backtrack to node (10)
                - backtrack to node (7)
                    - search node (3)
                        - search node (2)
                    - backtrack to node (3)
                        - search node (4)
                    - backtrack to node (3)
                        - search node (5)
                            - search node (6)
                        - backtrack to node (5)
                    - backtrack to node (3)
                - backtrack to node (7)
            - backtrack to node (8)
                - search node (1)
            - backtrack to node (8)
        - backtrack to node (9)
    - backtrack to node (0)
                    
The algorithm could just as easily have visited node 1 instead of node 9 at the beginning, so there are many valid DFS orderings

# Pseudo code

### Global or class scope variables
n = number of nodes in the graph
g = adjacency list representing graph
visited = [false, ..., false] # size n

function dfs(at):
    if visited[at]:
        return
    visited[at] = true

    neighbours = g[at]
    for next in neighbours:
        dfs(next)

### Start DFS at node zero
start_node = 0
dfs(start_node)


### Other applications for DFS

- Can augment the DFS algorithm to
    - Compute a graph's minimum spanning tree
    - Detect and find cycles in a graph
    - Check if a graph is bipartite
    - Find strongly connected components
    - Topologically sort the nodes of a graph
    - Find bridges and articulation points
    - Generate mazes
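
To show what "augmenting" a DFS can look like, here is a hedged sketch of a topological sort: each node is appended to the ordering only after all of its descendants have been explored, and the result is reversed. It assumes the input is a DAG (it does not detect cycles); the function name and the course graph are made up for this example.

```python
def topological_sort(graph):
    """Return the nodes of a DAG so that every edge u -> v has u before v."""
    visited = set()
    post_order = []

    def visit(node):
        visited.add(node)
        for neighbor in graph[node]:
            if neighbor not in visited:
                visit(neighbor)
        post_order.append(node)  # all descendants are already recorded

    for node in graph:
        if node not in visited:
            visit(node)
    return post_order[::-1]


# an edge means "must be taken before"
courses = {
    'intro': ['data structures'],
    'data structures': ['algorithms'],
    'discrete math': ['algorithms'],
    'algorithms': [],
}
order = topological_sort(courses)
print(order)
# ['discrete math', 'intro', 'data structures', 'algorithms']
```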
    
    
[link](https://www.youtube.com/watch?v=7fujbpJ0LB4&list=PLDV1Zeh2NRsDGO4--qE8yH72HFL1Km93P&index=4)

In [5]:
# putting in weight as 1 for right now
graph_dict = {
    0: {9: 1},
    1: {0: 1},
    2: {},
    3: {2: 1, 4: 1, 5: 1},
    4: {},
    5: {6: 1},
    6: {7: 1},
    7: {10: 1, 3: 1},
    8: {7: 1, 1: 1},
    9: {8: 1},
    10: {11: 1},
    11: {7: 1},
    12: {},
}


def dfs(graph, at, visited):
    if visited[at]:
        return
    visited[at] = True
    print(at)
    
    neighbors = graph[at]
    for neighbor in neighbors:
        dfs(graph, neighbor, visited)
    
visited = [False]*len(graph_dict.keys())
at = 0
dfs(graph_dict, at, visited)

0
9
8
7
10
11
3
2
4
5
6
1


In [None]:
'''
    Connected components

    - Sometimes a graph is split into multiple components. It's useful to be able to identify and count these components
    
    - Assign an integer value to each group to be able to tell them apart
   
    - We can use a DFS to identify components.
    - First, make sure all the nodes are labeled from [0, n) where n is the number of nodes.
    
    - Algorithm: Start a DFS at every node (except if it's already been visited) and
        mark all reachable nodes as being part of the same component
        
'''

In [23]:
# putting in weight as 1 for right now
graph_dict = {
    0: {8: 1},
    1: {5: 1},
    2: {9: 1},
    3: {9: 1},
    4: {0: 1},
    5: {16: 1, 17: 1},
    6: {11: 1},
    7: {6: 1},
    8: {4: 1, 14: 1},
    9: {8: 1, 15:1},
    10: {},
    11: {7: 1},
    12: {},
    13: {0: 1},
    14: {0: 1, 13:1},
    15: {2: 1, 10: 1},
    16: {},
    17: {},
}


def dfs(graph, at, label, visited):
    if visited[at] >= 0:
        return
    visited[at] = label
    print(at, label)
    
    neighbors = graph[at]
    for neighbor in neighbors:
        dfs(graph, neighbor, label, visited)
    
visited = [-1]*len(graph_dict.keys())
for counter, vertex in enumerate(graph_dict.keys()):
    dfs(graph_dict, vertex, counter, visited)
# This does not work for a directed graph; for that, use an algorithm such as Tarjan's

0 0
8 0
4 0
14 0
13 0
1 1
5 1
16 1
17 1
2 2
9 2
15 2
10 2
3 3
6 6
11 6
7 6
12 12
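
The cell above only follows edges in the direction they are stored, so the labels depend on the order in which nodes are visited. The sketch below is one possible fix, assuming the graph is meant to be undirected: it first mirrors every edge, then labels components with an iterative DFS using an explicit stack. The function name `label_components` and the small sample graph are hypothetical.

```python
def label_components(graph):
    """Return (component count, {node: component id}), edges taken as undirected."""
    undirected = {node: set() for node in graph}
    for u, neighbors in graph.items():
        for v in neighbors:
            undirected[u].add(v)
            undirected.setdefault(v, set()).add(u)  # mirror the edge

    labels = {}
    count = 0
    for start in undirected:
        if start in labels:
            continue  # already reached from an earlier start node
        labels[start] = count
        stack = [start]
        while stack:
            node = stack.pop()
            for neighbor in undirected[node]:
                if neighbor not in labels:
                    labels[neighbor] = count
                    stack.append(neighbor)
        count += 1
    return count, labels


sample = {0: {1: 1}, 1: {}, 2: {1: 1}, 3: {4: 1}, 4: {}, 5: {}}
count, labels = label_components(sample)
print(count)   # 3
print(labels)  # {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 2}
```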


# General Depth First Search

DFS's goal is to search as deeply as possible, connecting as many nodes in the graph as possible and branching where necessary.  

It is even possible that a depth first search will create more than one tree. When the depth first search algorithm creates a group of trees we call this a depth first forest. As with the breadth first search, our depth first search makes use of predecessor links to construct the tree. In addition, the depth first search will make use of two additional instance variables in the vertex class. The new instance variables are the discovery and finish times. The discovery time tracks the number of steps in the algorithm before a vertex is first encountered. The finish time is the number of steps in the algorithm before a vertex is colored black. As we will see after looking at the algorithm, the discovery and finish times of the nodes provide some interesting properties we can use in later algorithms.  

The code for our depth first search is shown below. We use a set to maintain a record of the nodes that have been visited as we recursively traverse through our sample graph. For each vertex, any neighboring vertices that have not yet been visited are traversed. This is much like our depth first traversal for our knight's tour solution, except that we do not need to keep track of the path taken to reach every vertex, allowing us to more simply use our visited set.  

We also introduce a traversal_times dictionary here, whose keys are vertices and whose values we populate as dictionaries of the form {'discovery': m, 'finish': n}, where m and n are integers obtained by incrementing a counter before and after each vertex is traversed.  



In [37]:
from collections import defaultdict

simple_graph = {
    'A': ['B', 'D'],
    'B': ['C', 'D'],
    'C': [],
    'D': ['E'],
    'E': ['B', 'F'],
    'F': ['C'],
}

def dfs(graph, starting_vertex):
    visited = set()
    counter = [0]
    traversal_times = defaultdict(dict)
    
    def traversal(vertex):
        visited.add(vertex)
        counter[0] += 1
        traversal_times[vertex]['discovery'] = counter[0]
        
        for next_vertex in graph[vertex]:
            if next_vertex not in visited:
                traversal(next_vertex)
                
        counter[0] += 1
        traversal_times[vertex]['finish'] = counter[0]
        
    # in this case start with just one vertex, but we could equally
    # dfs from all vertices to produce a dfs forest
    traversal(starting_vertex)
    return traversal_times

traversal_times = dfs(simple_graph, 'A')

traversal_times

defaultdict(dict,
            {'A': {'discovery': 1, 'finish': 12},
             'B': {'discovery': 2, 'finish': 11},
             'C': {'discovery': 3, 'finish': 4},
             'D': {'discovery': 5, 'finish': 10},
             'E': {'discovery': 6, 'finish': 9},
             'F': {'discovery': 7, 'finish': 8}})

The discovery and finish times for each node display a property called the parenthesis property. This property means that all the children of a particular node in the depth first tree have a later discovery time and an earlier finish time than their parent.  
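
One way to check the parenthesis property on the times printed above: treat each vertex as an interval from discovery to finish. Any two intervals should be either nested or disjoint, and never partially overlap. The function name below is our own, and the times are copied from the output above.

```python
from itertools import combinations

traversal_times = {
    'A': {'discovery': 1, 'finish': 12},
    'B': {'discovery': 2, 'finish': 11},
    'C': {'discovery': 3, 'finish': 4},
    'D': {'discovery': 5, 'finish': 10},
    'E': {'discovery': 6, 'finish': 9},
    'F': {'discovery': 7, 'finish': 8},
}


def has_parenthesis_property(times):
    """True if every pair of intervals is nested or disjoint."""
    for a, b in combinations(times.values(), 2):
        a_start, a_end = a['discovery'], a['finish']
        b_start, b_end = b['discovery'], b['finish']
        a_inside_b = b_start < a_start and a_end < b_end
        b_inside_a = a_start < b_start and b_end < a_end
        disjoint = a_end < b_start or b_end < a_start
        if not (a_inside_b or b_inside_a or disjoint):
            return False
    return True


print(has_parenthesis_property(traversal_times))  # True
```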


# Breadth first search

- The Breadth First Search (BFS) is another fundamental search algorithm used to explore the nodes and edges of a graph. It runs with a time complexity of O(V+E) and is often used as a building block in other algorithms.  

- The BFS algorithm is particularly useful for one thing: finding the shortest path on unweighted graphs
- A BFS starts at some arbitrary node of a graph and explores the neighbor nodes first, before moving to the next level of neighbors

### Using a Queue

- The BFS algorithm uses a queue data structure to track which node to visit next
- Upon reaching a new node the algorithm adds it to the queue to visit it later
- The queue data structure has an enqueue operation to add new elements at the end and a dequeue operation to remove elements from the front
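
In Python, one common way to get these queue operations is `collections.deque`, where `append` plays the role of enqueue and `popleft` the role of dequeue. This is a small sketch of that mapping, not the only option.

```python
from collections import deque

queue = deque()
queue.append('a')  # enqueue at the end
queue.append('b')
queue.append('c')

first = queue.popleft()  # dequeue from the front
print(first)             # a
print(list(queue))       # ['b', 'c']
```

A plain list would also work, but `list.pop(0)` shifts every remaining element, while `deque.popleft()` runs in constant time.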

### Pseudo code

n = number of nodes in the graph
g = adjacency list representing unweighted graph

function bfs(s, e): # Do a BFS starting at node s
    prev = solve(s)
    return reconstructPath(s, e, prev) # returns reconstructed path from s -> e
    
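function solve(s):
    q = queue()
    q.enqueue(s)
    visited = [false, false, ...] # size n
    visited[s] = true
    prev = [null, null, ...] # size n
    while !q.isEmpty():
        node = q.dequeue()
        neighbors = g.get(node)
        for (next : neighbors):
            if !visited[next]:
                q.enqueue(next)
                visited[next] = true
                prev[next] = node
    return prev
    
function reconstructPath(s, e, prev):
    # walk backwards from e using prev, then reverse
    path = []
    for (at = e; at != null; at = prev[at]):
        path.add(at)
    path.reverse()
    # if s and e are connected, the path starts at s
    if path[0] == s:
        return path
    return []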
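
The pseudo code above could be written in Python roughly as follows. This is a sketch under a few assumptions: the graph is a dict mapping each node to a list of neighbors, and a single `prev` dict doubles as the visited set. The names `solve`, `reconstruct_path` and `bfs` mirror the pseudo code; the sample graph is made up.

```python
from collections import deque


def solve(graph, start):
    """BFS from start, recording for each reached node where we came from."""
    prev = {start: None}  # also serves as the visited set
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in prev:
                prev[neighbor] = node
                queue.append(neighbor)
    return prev


def reconstruct_path(start, end, prev):
    """Walk backwards from end to start, then reverse."""
    if end not in prev:
        return []  # end was never reached from start
    path = []
    at = end
    while at is not None:
        path.append(at)
        at = prev[at]
    path.reverse()
    return path


def bfs(graph, start, end):
    return reconstruct_path(start, end, solve(graph, start))


graph = {0: [1, 2], 1: [3], 2: [3], 3: [4], 4: [], 5: []}
print(bfs(graph, 0, 4))  # [0, 1, 3, 4]
print(bfs(graph, 0, 5))  # [], node 5 is unreachable
```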
    
    
                

[source](https://www.youtube.com/watch?v=oDqjPvD54Ss&list=PLDV1Zeh2NRsDGO4--qE8yH72HFL1Km93P&index=5)