# Graph Algorithms

Key topics include:
* Data structures for graphs
* Graph traversals
* Transitive closure
* Directed acyclic graphs
* Shortest paths
* Minimum spanning trees

```python
# packages and data
import pandas as pd
import numpy as np
```

* A graph is a way of representing relationships that exist between pairs of objects. That is, a graph is a set of objects, called vertices, together with a collection of pairwise connections between them, called edges
* Edges in a graph are either directed or undirected. An edge (u,v) is said to be directed from u to v if the pair (u,v) is ordered, with u preceding v. An edge (u,v) is said to be undirected if the pair (u,v) is not ordered
* If all the edges in a graph are undirected, then we say the graph is an undirected graph. Likewise, a directed graph, also called a digraph, is a graph whose edges are all directed. A graph that has both directed and undirected edges is often called a mixed graph
* A path is a sequence of alternating vertices and edges that starts at a vertex and ends at a vertex such that each edge is incident to its predecessor and successor vertex. A cycle is a path that starts and ends at the same vertex, and that includes at least one edge. We say that a path is simple if each vertex in the path is distinct, and we say that a cycle is simple if each vertex in the cycle is distinct, except for the first and last one. A directed path is a path such that all edges are directed and are traversed along their direction. A directed cycle is similarly defined
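The definitions of directed paths, simple paths, and simple cycles can be checked mechanically. The sketch below is only an illustration: the example edges and the helper names (`is_directed_path`, `is_simple_path`, `is_simple_cycle`) are hypothetical, and it assumes a directed graph without parallel edges, so a path can be written as its vertex sequence alone.

```python
# Hypothetical example: directed edges stored as ordered (u, v) pairs.
directed_edges = {("a", "b"), ("b", "c"), ("c", "a"), ("c", "d")}


def is_directed_path(vertices, edges):
    """True if every consecutive pair (u, v) is an edge traversed along its direction."""
    return all((u, v) in edges for u, v in zip(vertices, vertices[1:]))


def is_simple_path(vertices, edges):
    """A simple path is a path in which every vertex is distinct."""
    return is_directed_path(vertices, edges) and len(set(vertices)) == len(vertices)


def is_simple_cycle(vertices, edges):
    """A simple cycle starts and ends at the same vertex, has at least one edge,
    and repeats no vertex other than the first/last one."""
    if len(vertices) < 2 or vertices[0] != vertices[-1]:
        return False
    interior = vertices[:-1]  # drop the repeated end vertex before checking distinctness
    return is_directed_path(vertices, edges) and len(set(interior)) == len(interior)
```

For instance, `["a", "b", "c", "d"]` is a simple directed path in this graph, while `["a", "b", "c", "a"]` is a simple directed cycle.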

## The Graph ADT - Data Structures for Graphs

* A graph is a collection of vertices and edges. The ADT can be modeled using 3 data types: vertex, edge and graph:
    * Vertex is a lightweight object that stores an arbitrary element provided by the user
    * An edge also stores an associated object
* Graphs can be represented using 4 data structures:
    * edge list - used to maintain an unordered list of all edges
    * adjacency list - used to maintain, for each vertex, a separate list containing those edges that are incident to the vertex
    * adjacency map - similar to an adjacency list, but the secondary container of all edges incident to a vertex is organized as a map, rather than as a list, with the adjacent vertex serving as a key
    * adjacency matrix - provides worst-case O(1) access to a specific edge (u,v) by maintaining an n × n matrix, for a graph with n vertices
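To make the four representations concrete, here is a minimal sketch that builds each one for the same small undirected graph using plain Python containers. The vertex labels, edge weights, and variable names are made up for illustration; a full Graph ADT would wrap these in vertex and edge classes.

```python
# Hypothetical undirected graph: each edge is (endpoint, endpoint, element).
vertices = ["u", "v", "w", "z"]

# Edge list: one unordered collection of all edges.
edge_list = [("u", "v", 1), ("u", "w", 4), ("v", "w", 2), ("w", "z", 3)]

# Adjacency list: for each vertex, a list of its incident edges.
adjacency_list = {x: [] for x in vertices}
for a, b, element in edge_list:
    adjacency_list[a].append((a, b, element))
    adjacency_list[b].append((a, b, element))

# Adjacency map: for each vertex, a map keyed by the adjacent vertex.
adjacency_map = {x: {} for x in vertices}
for a, b, element in edge_list:
    adjacency_map[a][b] = element
    adjacency_map[b][a] = element

# Adjacency matrix: n x n table indexed by vertex position; None means "no edge".
index = {x: i for i, x in enumerate(vertices)}
n = len(vertices)
adjacency_matrix = [[None] * n for _ in range(n)]
for a, b, element in edge_list:
    adjacency_matrix[index[a]][index[b]] = element
    adjacency_matrix[index[b]][index[a]] = element  # symmetric for undirected edges
```

Looking up edge (v, w) is a single indexing step in the matrix (`adjacency_matrix[index["v"]][index["w"]]`) and an expected O(1) dictionary lookup in the map (`adjacency_map["v"]["w"]`), whereas the edge list requires scanning all edges.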

## Graph Traversals

* Formally, a traversal is a systematic procedure for exploring a graph by examining all of its vertices and edges. A traversal is efficient if it visits all the vertices and edges in time proportional to their number, that is, in linear time
* There are generally two classes of graph traversal algorithms:
    * depth first search: this algorithm explores as deeply as possible along each branch before backtracking. Starting from a chosen node, it explores one path completely until it reaches a dead end or a visited node, then backtracks and explores another path
        * DFS is efficient in the sense above: with an adjacency list or adjacency map, it runs in O(n + m) time on a graph with n vertices and m edges
    * breadth first search: this algorithm explores a graph level by level. It starts at a designated node, then visits all its immediate neighbors, then all their unvisited neighbors, and so on
        * BFS is akin to sending out many explorers in all directions who collectively traverse the graph in a coordinated fashion. DFS, by contrast, is akin to sending out a single person who searches and backtracks, marking the path with paint or a string (as Theseus did in the labyrinth of the Minotaur)
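The two traversals can be sketched as follows. This is a minimal illustration, not a definitive implementation: the example graph and the function names `dfs` and `bfs` are hypothetical, and the graph is assumed to be an adjacency map from each vertex to a list of its neighbors.

```python
from collections import deque

# Hypothetical undirected graph: vertex -> list of neighbors.
graph = {
    "a": ["b", "c"],
    "b": ["a", "d"],
    "c": ["a", "d"],
    "d": ["b", "c", "e"],
    "e": ["d"],
}


def dfs(graph, start, discovered=None):
    """Recursive depth-first search.

    Returns a dict mapping each reached vertex to the vertex it was
    discovered from (None for the start); key order is discovery order.
    """
    if discovered is None:
        discovered = {start: None}
    for neighbor in graph[start]:
        if neighbor not in discovered:
            discovered[neighbor] = start
            dfs(graph, neighbor, discovered)  # go deeper before trying siblings
    return discovered


def bfs(graph, start):
    """Breadth-first search; returns a dict mapping each reached vertex to its level."""
    level = {start: 0}
    queue = deque([start])
    while queue:
        u = queue.popleft()  # FIFO order yields level-by-level exploration
        for v in graph[u]:
            if v not in level:
                level[v] = level[u] + 1
                queue.append(v)
    return level
```

Starting from `"a"`, `dfs` follows the branch a, b, d, c before backtracking to reach e, while `bfs` reaches b and c at level 1, d at level 2, and e at level 3. Each vertex is discovered once and each adjacency list is scanned once, which is where the linear running time comes from.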
* Other graph algorithms, for shortest paths and minimum spanning trees, include:
    * Dijkstra's algorithm: finds the shortest paths from a source node to the other nodes in a weighted graph with nonnegative edge weights, which may represent, for example, a road network. The algorithm uses a min-priority queue to select, at each step, the unfinished node with the smallest distance known so far
    * Prim-Jarník algorithm: a greedy algorithm that finds a minimum spanning tree for a weighted undirected graph. This means it finds a subset of the edges that forms a tree that includes every vertex, where the total weight of all the edges in the tree is minimized
    * Kruskal's algorithm: a greedy algorithm that in each step adds to the forest the lowest-weight edge that will not form a cycle. It finds a minimum spanning forest of an undirected edge-weighted graph.
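As a rough sketch of two of these algorithms, the code below implements Dijkstra's algorithm with Python's `heapq` as the min-priority queue, and Kruskal's algorithm with a simple union-find structure to detect cycles. The example edges and the function names `dijkstra` and `kruskal` are assumptions made for illustration; the graph is treated as undirected with nonnegative weights.

```python
import heapq

# Hypothetical undirected weighted graph: (u, v, weight), all weights nonnegative.
edges = [("a", "b", 4), ("a", "c", 1), ("c", "b", 2), ("b", "d", 5), ("c", "d", 8)]


def dijkstra(edges, source):
    """Return a dict of shortest-path distances from source to every reachable vertex."""
    adj = {}
    for u, v, w in edges:
        adj.setdefault(u, []).append((v, w))
        adj.setdefault(v, []).append((u, w))
    dist = {source: 0}
    heap = [(0, source)]  # min-priority queue of (distance, vertex)
    done = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue  # stale queue entry; u was already finalized
        done.add(u)
        for v, w in adj.get(u, []):
            candidate = d + w
            if candidate < dist.get(v, float("inf")):
                dist[v] = candidate
                heapq.heappush(heap, (candidate, v))
    return dist


def kruskal(edges):
    """Return the edges of a minimum spanning forest, in the order they are added."""
    parent = {}

    def find(x):
        # Find the representative of x's component, shortening the path as we go.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    forest = []
    for u, v, w in sorted(edges, key=lambda e: e[2]):  # lowest weight first
        root_u, root_v = find(u), find(v)
        if root_u != root_v:  # different components, so no cycle is formed
            parent[root_u] = root_v
            forest.append((u, v, w))
    return forest
```

On this example, `dijkstra(edges, "a")` finds that the route a, c, b (total weight 3) beats the direct edge a-b (weight 4), and `kruskal(edges)` skips edge a-b for the same reason: it would close a cycle with the cheaper edges a-c and c-b.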