# Graphs in Python

## Origins of Graph Theory
Before we turn to implementing graphs in Python and to the Python modules that deal with graphs, we want to look at the origins of graph theory.
The origins take us back to the Königsberg of the 18th century. Königsberg was a city in Prussia at that time. The river Pregel flowed through the town, creating two islands. The city and the islands were connected by seven bridges, as shown. The inhabitants of the city were moved by the question of whether it was possible to take a walk through the town that visits each area and crosses each bridge exactly once. Every bridge had to be crossed completely, i.e., it is not allowed to walk halfway onto a bridge, turn around and later cross the other half from the other side. The walk need not start and end at the same spot. Leonhard Euler solved the problem in 1735 by proving that such a walk is not possible. He found that the choice of route inside each land area is irrelevant and that the only thing that matters is the order in which the bridges are crossed. He formulated an abstraction of the problem, eliminating unnecessary facts and focusing on the land areas and the bridges connecting them. In this way, he laid the foundations of graph theory. If we see each land area as a vertex and each bridge as an edge, we have "reduced" the problem to a graph.
<img src=koenigsberg_bridges.png>
## Introduction to Graph Theory Using Python
<img src=simple_graph_isolated.png>
Before we start our treatise on possible Python representations of graphs, we want to present some general definitions of graphs and their components.
A "graph" in mathematics and computer science consists of "nodes", also known as "vertices". Nodes may or may not be connected with one another. In our illustration, which is a pictorial representation of a graph, the node "a" is connected with the node "c", but "a" is not connected with "b". The connecting line between two nodes is called an edge. If the edges between the nodes are undirected, the graph is called an undirected graph. If the edges are directed from one vertex (node) to another, the graph is called a directed graph. A directed edge is called an arc.
Though graphs may look very theoretical, many practical problems can be represented by graphs. They are often used to model problems or situations in physics, biology, psychology and above all in computer science. In computer science, graphs are used to represent networks of communication, data organization, computational devices, and the flow of computation.
Examples are the file system of an operating system or a communication network. The link structure of websites can be seen as a graph as well, namely a directed graph, because a link is a directed edge, i.e. an arc.
Python has no built-in data type or class for graphs, but it is easy to implement them in Python. One data type is well suited for representing graphs: the dictionary. The graph in our illustration can be implemented in the following way:

In [2]:
graph = {"a" : ["c"],
         "b" : ["c", "e"],
         "c" : ["a", "b", "d", "e"],
         "d" : ["c"],
         "e" : ["c", "b"],
         "f" : []
        }

The keys of the dictionary above are the nodes of our graph. The corresponding values are lists of the nodes that are connected to the key node by an edge. This is a simple and elegant way to represent a graph.
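
The same dictionary layout also works for directed graphs, such as the link structure of websites mentioned above: each list then holds only the outgoing arcs of its key. The following is a minimal sketch; the page names and the helper `is_symmetric` are made up for illustration and do not appear elsewhere in this tutorial.

```python
# Hypothetical link structure: each key is a page, each value lists the
# pages it links to (outgoing arcs only).
links = {
    "home": ["about", "blog"],
    "about": ["home"],
    "blog": [],
}


def is_symmetric(graph):
    """Return True if every arc (u, v) has a matching arc (v, u)."""
    return all(
        node in graph.get(neighbour, [])
        for node in graph
        for neighbour in graph[node]
    )


undirected = {"a": ["c"], "c": ["a"]}

print(is_symmetric(links))       # False: "home" links to "blog", but not back
print(is_symmetric(undirected))  # True
```

A dictionary that passes this check can be read as an undirected graph, because every edge is listed at both of its endpoints.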

An edge can be seen as a 2-tuple with nodes as elements, e.g. ("a", "c").

The following function generates the list of all edges:

In [3]:
def generate_edges(graph):
    edges = []
    for node in graph:
        for neighbour in graph[node]:
            edges.append((node, neighbour))

    return edges

print(generate_edges(graph))

[('a', 'c'), ('b', 'c'), ('b', 'e'), ('c', 'a'), ('c', 'b'), ('c', 'd'), ('c', 'e'), ('d', 'c'), ('e', 'c'), ('e', 'b')]


As we can see, there is no edge containing the node "f". "f" is an isolated node of our graph.
The following Python function calculates the isolated nodes of a given graph: 

In [4]:
def find_isolated_nodes(graph):
    """ returns a list of isolated nodes. """
    isolated = []
    for node in graph:
        if not graph[node]:
            isolated.append(node)  # "+=" would split a longer node name into letters
    return isolated

In [5]:
find_isolated_nodes(graph)

['f']
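
The function above treats a node as isolated when its own adjacency list is empty. That is sufficient as long as every edge is listed at both of its endpoints. If an edge is recorded in one direction only, a node with an empty list may still be the target of an edge. The following sketch also checks incoming edges; the name `find_isolated_nodes_strict` and the small graph `one_way` are assumptions made for this illustration.

```python
def find_isolated_nodes_strict(graph):
    """Return the nodes that have neither outgoing nor incoming edges."""
    # Collect every node that appears as a neighbour of some node.
    targets = {neighbour
               for neighbours in graph.values()
               for neighbour in neighbours}
    return [node for node in graph
            if not graph[node] and node not in targets]


# "b" has an empty list but is reachable from "a"; only "c" is isolated.
one_way = {"a": ["b"], "b": [], "c": []}
print(find_isolated_nodes_strict(one_way))  # ['c']
```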

## Graphs as a Python Class
Before we go on with writing functions for graphs, we have a first go at a Python graph class implementation. If you look at the following listing of our class, you can see in the \_\_init\_\_-method that we use a dictionary "self.\_\_graph\_dict" for storing the vertices and their corresponding adjacent vertices. 
<img src=simple_graph_with_loop.png>

In [6]:
""" A Python Class
A simple Python graph class, demonstrating the essential 
facts and functionalities of graphs.
"""


class Graph(object):
    
    def __init__(self, graph_dict=None):
        """ initializes a graph object 
            If no dictionary or None is given,
            an empty dictionary will be used
        """
        if graph_dict is None:
            graph_dict = {}
        self.__graph_dict = graph_dict

    def vertices(self):
        """ returns the vertices of a graph """
        return list(self.__graph_dict.keys())

    def edges(self):
        """ returns the edges of a graph """
        return self.__generate_edges()

    def add_vertex(self, vertex):
        """ If the vertex "vertex" is not in 
            self.__graph_dict, a key "vertex" with an empty
            list as a value is added to the dictionary. 
            Otherwise nothing has to be done. 
        """
        if vertex not in self.__graph_dict:
            self.__graph_dict[vertex] = []

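    def add_edge(self, edge):
        """ assumes that edge is of type set, tuple or list and
            contains two different vertices; the edge is stored
            only in the adjacency list of one of the two vertices
        """
        edge = set(edge)
        # unpacking fails for a loop, because the set then has one element
        (vertex1, vertex2) = tuple(edge)
        if vertex1 in self.__graph_dict:
            self.__graph_dict[vertex1].append(vertex2)
        else:
            self.__graph_dict[vertex1] = [vertex2]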
            
    def __generate_edges(self):
        """ A private method generating the edges of the 
            graph. Edges are represented as sets 
            with one (a loop back to the vertex) or two 
            vertices 
        """
        edges = []
        for vertex in self.__graph_dict:
            for neighbour in self.__graph_dict[vertex]:
                if {neighbour, vertex} not in edges:
                    edges.append({vertex, neighbour})
        return edges

    def __str__(self):
        res = "vertices: "
        for k in self.__graph_dict:
            res += str(k) + " "
        res += "\nedges: "
        for edge in self.__generate_edges():
            res += str(edge) + " "
        return res
    
    '''
    def find_path(self, start_vertex, end_vertex, path=None):
        """ find a path from start_vertex to end_vertex
        in graph """
        if path == None:
            path = []
        graph = self.__graph_dict
        path = path + [start_vertex]
        if start_vertex == end_vertex:
            return path
        if start_vertex not in graph:
            return None
        for vertex in graph[start_vertex]:
            if vertex not in path:
                extended_path = self.find_path(vertex, 
                                               end_vertex, 
                                               path)
                if extended_path: 
                    return extended_path
        return None
    '''
        
    '''
    def find_all_paths(self, start_vertex, end_vertex, path=[]):
        """ find all paths from start_vertex to
        end_vertex in graph """
        graph = self.__graph_dict
        path = path + [start_vertex]
        if start_vertex == end_vertex:
            return [path]
        if start_vertex not in graph:
            return []
        paths = []
        for vertex in graph[start_vertex]:
            if vertex not in path:
                extended_paths = self.find_all_paths(vertex, 
                                                     end_vertex,
                                                     path)
                for p in extended_paths:
                    paths.append(p)
        return paths
    '''
        
if __name__ == "__main__":

    g = {"a" : ["d"],
         "b" : ["c"],
         "c" : ["b", "c", "d", "e"],
         "d" : ["a", "c"],
         "e" : ["c"],
         "f" : []
        }


    graph = Graph(g)

    print("Vertices of graph:")
    print(graph.vertices())

    print("Edges of graph:")
    print(graph.edges())

    print("Add vertex:")
    graph.add_vertex("z")

    print("Vertices of graph:")
    print(graph.vertices())
 
    print("Add an edge:")
    graph.add_edge({"a","z"})
    
    print("Vertices of graph:")
    print(graph.vertices())

    print("Edges of graph:")
    print(graph.edges())

    print('Adding an edge {"x","y"} with new vertices:')
    graph.add_edge({"x","y"})
    print("Vertices of graph:")
    print(graph.vertices())
    print("Edges of graph:")
    print(graph.edges())

Vertices of graph:
['a', 'b', 'c', 'd', 'e', 'f']
Edges of graph:
[{'d', 'a'}, {'c', 'b'}, {'c'}, {'c', 'd'}, {'c', 'e'}]
Add vertex:
Vertices of graph:
['a', 'b', 'c', 'd', 'e', 'f', 'z']
Add an edge:
Vertices of graph:
['a', 'b', 'c', 'd', 'e', 'f', 'z']
Edges of graph:
[{'d', 'a'}, {'c', 'b'}, {'c'}, {'c', 'd'}, {'c', 'e'}, {'z', 'a'}]
Adding an edge {"x","y"} with new vertices:
Vertices of graph:
['a', 'b', 'c', 'd', 'e', 'f', 'z', 'x']
Edges of graph:
[{'d', 'a'}, {'c', 'b'}, {'c'}, {'c', 'd'}, {'c', 'e'}, {'z', 'a'}, {'x', 'y'}]
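
The last vertex list contains "x" but not "y": `add_edge` stores a new edge in the adjacency list of only one endpoint and never creates a key for the other. It also cannot handle a loop such as {"c"}, because a one-element set cannot be unpacked into two vertices. One possible variant is sketched below as a standalone function on a plain dictionary. The name `add_undirected_edge` is an assumption; it is not part of the class above.

```python
def add_undirected_edge(graph_dict, edge):
    """Add an undirected edge to graph_dict, creating missing vertices.

    edge is a set, tuple or list with two vertices, or with a single
    vertex for a loop.
    """
    vertices = list(edge)
    if len(vertices) == 1:          # a loop such as {"c"}
        vertices = vertices * 2
    vertex1, vertex2 = vertices
    graph_dict.setdefault(vertex1, []).append(vertex2)
    if vertex1 != vertex2:          # a loop is stored only once
        graph_dict.setdefault(vertex2, []).append(vertex1)


g = {"a": ["d"], "d": ["a"]}
add_undirected_edge(g, ("x", "y"))
add_undirected_edge(g, {"a"})
print(g)
# {'a': ['d', 'a'], 'd': ['a'], 'x': ['y'], 'y': ['x']}
```

Storing a loop once matches the convention of the example graph, where "c" appears a single time in its own adjacency list.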


## Paths in Graphs
We now want to find paths from one node to another. Before we come to the Python code for this problem, we have to present some formal definitions.

**Adjacent vertices:**
Two vertices are adjacent when they are both incident to a common edge.

**Path in an undirected Graph:**
A path in an undirected graph is a sequence of vertices $P = (v_1, v_2, \ldots, v_n)$ such that $v_i$ is adjacent to $v_{i+1}$ for $1 \le i < n$. Such a path P is called a path of length $n - 1$ (the number of edges it traverses) from $v_1$ to $v_n$.

**Simple Path:**
A path with no repeated vertices is called a simple path.

Example:
(a, c, e) is a simple path in the graph from the introduction, as is (a, c, e, b). (a, c, e, b, c, d) is a path but not a simple path, because the node c appears twice.
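
These definitions can be checked mechanically. The following sketch uses the dictionary representation of the graph from the introduction; the function names `is_path` and `is_simple_path` are chosen here for illustration.

```python
graph = {"a": ["c"],
         "b": ["c", "e"],
         "c": ["a", "b", "d", "e"],
         "d": ["c"],
         "e": ["c", "b"],
         "f": []}


def is_path(graph, vertices):
    """True if all vertices exist and each one is adjacent to its successor."""
    return (all(v in graph for v in vertices)
            and all(w in graph[v] for v, w in zip(vertices, vertices[1:])))


def is_simple_path(graph, vertices):
    """True if the sequence is a path without repeated vertices."""
    return is_path(graph, vertices) and len(set(vertices)) == len(vertices)


print(is_simple_path(graph, ["a", "c", "e"]))                 # True
print(is_simple_path(graph, ["a", "c", "e", "b"]))            # True
print(is_path(graph, ["a", "c", "e", "b", "c", "d"]))         # True
print(is_simple_path(graph, ["a", "c", "e", "b", "c", "d"]))  # False
```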

The following method finds a path from a start vertex to an end vertex: 

In [9]:
def find_path(self, start_vertex, end_vertex, path=None):
    """ find a path from start_vertex to end_vertex
    in graph """
    if path is None:
        path = []
    graph = self.__graph_dict
    path = path + [start_vertex]
    if start_vertex == end_vertex:
        return path
    if start_vertex not in graph:
        return None
    for vertex in graph[start_vertex]:
        if vertex not in path:
            extended_path = self.find_path(vertex, 
                                           end_vertex, 
                                           path)
            if extended_path: 
                return extended_path
    return None

If we add the find\_path method to our graph class (don't forget to rerun the cell after you add the method!), we can see how find\_path works:

In [2]:
g = { "a" : ["d"],
      "b" : ["c"],
      "c" : ["b", "c", "d", "e"],
      "d" : ["a", "c"],
      "e" : ["c"],
      "f" : []
    }

graph = Graph(g)

print("Vertices of graph:")
print(graph.vertices())

print("Edges of graph:")
print(graph.edges())


print('The path from vertex "a" to vertex "b":')
path = graph.find_path("a", "b")
print(path)

print('The path from vertex "a" to vertex "f":')
path = graph.find_path("a", "f")
print(path)

print('The path from vertex "c" to vertex "c":')
path = graph.find_path("c", "c")
print(path)

Vertices of graph:
['a', 'b', 'c', 'd', 'e', 'f']
Edges of graph:
[{'d', 'a'}, {'c', 'b'}, {'c'}, {'c', 'd'}, {'c', 'e'}]
The path from vertex "a" to vertex "b":
['a', 'd', 'c', 'b']
The path from vertex "a" to vertex "f":
None
The path from vertex "c" to vertex "c":
['c']
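
Note that `find_path` returns the first path its depth-first search reaches, which is not necessarily the shortest one. If a path with the fewest edges is wanted, a breadth-first search is a common alternative. The following standalone sketch is not part of the class above: the function name `find_shortest_path`, its signature and the small `triangle` graph are assumptions made for this illustration.

```python
from collections import deque


def find_shortest_path(graph, start_vertex, end_vertex):
    """Return a path with the fewest edges, or None if there is none."""
    if start_vertex not in graph:
        return None
    # Each queue entry is a complete path ending at the vertex to expand.
    queue = deque([[start_vertex]])
    visited = {start_vertex}
    while queue:
        path = queue.popleft()
        vertex = path[-1]
        if vertex == end_vertex:
            return path
        for neighbour in graph.get(vertex, []):
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append(path + [neighbour])
    return None


triangle = {"a": ["b", "c"],
            "b": ["a", "c"],
            "c": ["a", "b"]}

print(find_shortest_path(triangle, "a", "c"))  # ['a', 'c']
print(find_shortest_path(triangle, "a", "z"))  # None
```

On this triangle, the depth-first `find_path` would explore "b" first and return the longer path ['a', 'b', 'c'].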


The method find_all_paths finds all the paths from a start vertex to an end vertex:

In [4]:
def find_all_paths(self, start_vertex, end_vertex, path=None):
    """ find all paths from start_vertex to
    end_vertex in graph """
    if path is None:
        path = []
    graph = self.__graph_dict
    path = path + [start_vertex]
    if start_vertex == end_vertex:
        return [path]
    if start_vertex not in graph:
        return []
    paths = []
    for vertex in graph[start_vertex]:
        if vertex not in path:
            extended_paths = self.find_all_paths(vertex, 
                                                 end_vertex,
                                                 path)
            for p in extended_paths:
                paths.append(p)
    return paths

We slightly changed our example graph by adding edges from "a" to "f" and from "f" to "d" to test the previously defined method: 

In [7]:
g = { "a" : ["d", "f"],
      "b" : ["c"],
      "c" : ["b", "c", "d", "e"],
      "d" : ["a", "c"],
      "e" : ["c"],
      "f" : ["d"]
    }


graph = Graph(g)

print("Vertices of graph:")
print(graph.vertices())

print("Edges of graph:")
print(graph.edges())


print('All paths from vertex "a" to vertex "b":')
path = graph.find_all_paths("a", "b")
print(path)

print('All paths from vertex "a" to vertex "f":')
path = graph.find_all_paths("a", "f")
print(path)

print('All paths from vertex "c" to vertex "c":')
path = graph.find_all_paths("c", "c")
print(path)

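
The search can also be tried without the class. The sketch below is a standalone adaptation of `find_all_paths` that takes the dictionary as its first argument (this signature is an assumption made for illustration) and applies it to the modified graph.

```python
def find_all_paths(graph, start_vertex, end_vertex, path=None):
    """Return all simple paths from start_vertex to end_vertex."""
    path = (path or []) + [start_vertex]
    if start_vertex == end_vertex:
        return [path]
    if start_vertex not in graph:
        return []
    paths = []
    for vertex in graph[start_vertex]:
        if vertex not in path:   # never visit a vertex twice
            paths.extend(find_all_paths(graph, vertex, end_vertex, path))
    return paths


g = {"a": ["d", "f"],
     "b": ["c"],
     "c": ["b", "c", "d", "e"],
     "d": ["a", "c"],
     "e": ["c"],
     "f": ["d"]}

print(find_all_paths(g, "a", "b"))
# [['a', 'd', 'c', 'b'], ['a', 'f', 'd', 'c', 'b']]
print(find_all_paths(g, "a", "f"))
# [['a', 'f']]
print(find_all_paths(g, "c", "c"))
# [['c']]
```

There is only one path from "a" to "f" because the adjacency lists of this dictionary are not symmetric: "f" lists "d", but "d" does not list "f".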

## Degree
<img src=simple_graph_with_loop.png>
The degree of a vertex v in a graph is the number of edges incident to it, with loops counted twice. The degree of a vertex v is denoted deg(v). The maximum degree of a graph G, denoted by Δ(G), and the minimum degree of a graph, denoted by δ(G), are the maximum and minimum degree of its vertices.

In the example graph, the maximum degree is 5 at vertex c and the minimum degree is 0 at the isolated vertex f.

If all the degrees in a graph are the same, the graph is a regular graph, and we can speak of the degree of the graph.
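
As a minimal sketch of these definitions, the following standalone functions compute Δ(G) and δ(G) for a dictionary graph and test for regularity. The function names are assumptions, and returning 0 for a graph without vertices is a choice made here, not something the definitions prescribe.

```python
def degree(graph, vertex):
    """Number of edges at vertex; a loop contributes two."""
    neighbours = graph[vertex]
    return len(neighbours) + neighbours.count(vertex)


def max_degree(graph):
    """Maximum degree; 0 for a graph without vertices."""
    return max((degree(graph, v) for v in graph), default=0)


def min_degree(graph):
    """Minimum degree; 0 for a graph without vertices."""
    return min((degree(graph, v) for v in graph), default=0)


def is_regular(graph):
    """True if all vertices have the same degree."""
    return max_degree(graph) == min_degree(graph)


g = {"a": ["d"], "b": ["c"], "c": ["b", "c", "d", "e"],
     "d": ["a", "c"], "e": ["c"], "f": []}
triangle = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b"]}

print(max_degree(g), min_degree(g))  # 5 0
print(is_regular(g))                 # False
print(is_regular(triangle))          # True
```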


The degree sum formula (Handshaking lemma):

$\sum_{v \in V} \deg(v) = 2|E|$

This means that the sum of the degrees of all vertices equals twice the number of edges, because every edge contributes to the degree of both of its endpoints. We can conclude that the number of vertices with odd degree has to be even. This statement is known as the handshaking lemma. The name stems from a popular mathematical problem: in any group of people, the number of people who have shaken hands with an odd number of other people from the group is even.
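
The formula can be checked on the example graph with a few lines. This sketch assumes a dictionary in which every edge is listed at both endpoints, a loop appears once in the list of its vertex, and there are no parallel edges.

```python
g = {"a": ["d"], "b": ["c"], "c": ["b", "c", "d", "e"],
     "d": ["a", "c"], "e": ["c"], "f": []}

# Degree of each vertex: one per list entry, plus one extra per loop.
degrees = {v: len(adj) + adj.count(v) for v, adj in g.items()}

# Undirected edges as frozensets, so ("a", "d") and ("d", "a") coincide.
edges = {frozenset((v, w)) for v, adj in g.items() for w in adj}

odd_vertices = [v for v, d in degrees.items() if d % 2 == 1]

print(sum(degrees.values()), len(edges))  # 10 5
print(odd_vertices)                       # ['a', 'b', 'c', 'e']
```

The degree sum is twice the number of edges, and four vertices have odd degree, which is an even count as the lemma predicts.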

The following method calculates the degree of a vertex: 

In [8]:
def vertex_degree(self, vertex):
    """ The degree of a vertex is the number of edges connecting
        it, i.e., the number of adjacent vertices. Loops are counted
        double, i.e. every occurrence of vertex in the list
        of adjacent vertices. """ 
    adj_vertices =  self.__graph_dict[vertex]
    degree = len(adj_vertices) + adj_vertices.count(vertex)
    return degree

The following method calculates a list containing the isolated vertices of a graph: 

In [8]:
def find_isolated_vertices(self):
    """ returns a list of isolated vertices. """
    graph = self.__graph_dict
    isolated = []
    for vertex in graph:
        if not graph[vertex]:
            isolated.append(vertex)
    return isolated

## Degree Sequence
The degree sequence of an undirected graph is defined as the sequence of its vertex degrees in non-increasing order.
The following method returns a tuple with the degree sequence of the graph instance:

In [9]:
def degree_sequence(self):
    """ calculates the degree sequence """
    seq = []
    for vertex in self.__graph_dict:
        seq.append(self.vertex_degree(vertex))
    seq.sort(reverse=True)
    return tuple(seq)
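
To see the result for the example graph without the class, the same computation can be written as a standalone function on a dictionary. This is only a sketch; the signature with the dictionary as argument is an assumption made for illustration.

```python
def degree_sequence(graph):
    """Return the vertex degrees in non-increasing order as a tuple."""
    degrees = (len(adj) + adj.count(v) for v, adj in graph.items())
    return tuple(sorted(degrees, reverse=True))


g = {"a": ["d"], "b": ["c"], "c": ["b", "c", "d", "e"],
     "d": ["a", "c"], "e": ["c"], "f": []}

print(degree_sequence(g))  # (5, 2, 1, 1, 1, 0)
```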