In [2]:
%%html
<style>h1{text-align:center;}h1{text-transform:none;}.rendered_html h4{color:#17b6eb;font-size: 1.6em;}img[alt=dia1]{width:35%;}img[alt=book]{width:20%;font-size: 3em;}img[alt=dia2]{width:50%;}.author{font-size:8px;}</style>

# Lecture 11: Graphs

## 0. Rate me!

![dia2](img/11evaluation.png)

Link: https://lehrevaluation.hochschulevaluierungsverbund.de/evasys/online/

Password: YDXSP 

open until June 17th


## 1. Introduction to Graphs


### 1.1 It's NOT about the following "Graphs" 

![dia2](img/11batman.jpg)
<div class="author">src: J. Matthew Register</div>

### It's also not about this guy

![dia1](img/11dracula.jpg)
<div class="author">src: klexikon.de</div>

### Today, it's all about this...

![dia1](img/11graph.png)
<div class="author">src: medium.com</div>

### 1.2 Definition
<div class="author">src: wikipedia.org</div>

- A __graph__ is an abstract data type that is meant to implement the undirected graph and directed graph concepts from the field of graph theory within mathematics.

- A __graph data structure__ consists of a finite __set of vertices__ (aka nodes/points), together with a set of unordered pairs of these vertices for an undirected graph or a set of ordered pairs for a directed graph. These pairs are known as __edges__ (aka links/lines).

![book](img/11graph.svg)


### 1.3 Real world examples

Drawing a subway map can be optimized using graph theory.

![dia1](img/11subway.png)

Social media networks model the connection between their users and their friends in a graph structure.

![dia1](img/11social.webp)

Search engines use graph representations to store semantic relationships in *Knowledge Graphs*.
![dia1](img/11knowledge.png)
<div class="author">src: wipro.com</div>

Recommendation algorithms use graph theoretic principles.

![dia1](img/11reco.png)

Looking at the examples above, a __graph data structure__ is a collection of data nodes (let's call them __vertices__) which are connected to other data vertices.

Every relationship is an __edge__ from one vertex to another.

Hence, a __graph__ is a __data structure $(V, E)$__, which consists of:

- a collection of *vertices* $V$

- a collection of *edges* $E$, represented as ordered pairs of vertices $(u, v)$

![dia1](img/11vertix2.png)
<div class="author">src: David Condrey, CC BY-SA 3.0, via wikimedia.org</div>

<div class="author">src: wikipedia.org</div>

The two vertices forming an edge are said to be the endpoints of this edge, and the edge is said to be incident to the vertices. A vertex $w$ is said to be adjacent to another vertex $v$ if the graph contains an edge $(v,w)$. The *neighborhood* of a vertex $v$ is an *induced subgraph* of the graph, formed by all vertices adjacent to *v*. 

- The __degree of a vertex__ $v$, denoted $𝛿(v)$, is the number of edges incident to it.

- An __isolated vertex__ is a vertex with degree zero (it is not an endpoint of any edge)

- A __leaf vertex__ is a vertex with degree one

- A __simplicial vertex__ is one whose neighbors form a clique: every two neighbors are adjacent. 

- A __universal vertex__ is a vertex that is adjacent to every other vertex in the graph. 

![book](img/11vertix.png)

Let's try to describe this graph as a collection of vertices and edges:

```
V = {1, 2, 3, 4, 5, 6}
E = {(1,2), (1,5), (2,3), (2,5), (3,4), (4,5), (4,6)}
G = (V, E)
```

Note that the position of the vertices in a drawing does not change the semantics of the graph.
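
The vertex properties defined above can be read directly off the edge set. The following is a minimal sketch, assuming the graph is undirected and stored as the Python sets `V` and `E` from the listing above; the names `degree`, `leaves` and `isolated` are chosen here for illustration only.

```python
from collections import Counter

V = {1, 2, 3, 4, 5, 6}
E = {(1, 2), (1, 5), (2, 3), (2, 5), (3, 4), (4, 5), (4, 6)}

# Every undirected edge contributes 1 to the degree of both of its endpoints.
degree = Counter()
for u, v in E:
    degree[u] += 1
    degree[v] += 1

for v in sorted(V):
    print("vertex", v, "has degree", degree[v])

# Leaf vertices have degree one, isolated vertices have degree zero.
leaves = sorted(v for v in V if degree[v] == 1)
isolated = sorted(v for v in V if degree[v] == 0)
print("leaves:", leaves, "isolated:", isolated)
```

Since every edge is counted once per endpoint, the degrees always sum to $2|E|$.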


#### Exercise 1
<div class="author">macs.hw.ac.uk</div>


In each of the following, two of the graphs are the same and the third is different. Find the odd man out in each case.

![](img/11ex0.png)


Solution:

- (a) the last one

- (b) the middle one

- (c) the last one

### 1.4 Graph Terminology
<div class="author">programiz.com</div>

- __Adjacency__: A vertex is said to be adjacent to another vertex if there is an edge connecting them. 
    
- __Path__: A sequence of edges that allows you to go from vertex A to vertex B is called a path. 
    
- __Directed Graph__: A graph in which an edge $(u,v)$ doesn't necessarily mean that there is an edge $(v, u)$ as well. The edges in such a graph are represented by arrows to show the direction of the edge. One can distinguish the __outdegree__ (number of outgoing edges), denoted $𝛿^+(v)$, from the __indegree__ (number of incoming edges), denoted $𝛿^-(v)$. A __source vertex__ is a vertex with indegree zero, while a __sink vertex__ is a vertex with outdegree zero.

![book](img/11directed.svg)
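
As a small illustration of indegree, outdegree, sources and sinks, here is a sketch on a hypothetical directed graph (it is not the graph from the figure). Each pair `(u, v)` is assumed to mean an edge pointing from `u` to `v`.

```python
# Hypothetical directed graph: (u, v) means an edge from u to v.
vertices = {"A", "B", "C", "D"}
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]

outdegree = {v: 0 for v in vertices}
indegree = {v: 0 for v in vertices}
for u, v in edges:
    outdegree[u] += 1  # the edge leaves u
    indegree[v] += 1   # the edge enters v

# A source has no incoming edges, a sink has no outgoing edges.
sources = sorted(v for v in vertices if indegree[v] == 0)
sinks = sorted(v for v in vertices if outdegree[v] == 0)
print("sources:", sources, "sinks:", sinks)
```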




### 1.5 Graph Representations

1. Edge list

2. Adjacency matrix

3. Adjacency list

4. Incidence Matrix

### 1.5.1 Edge list

An edge list is a data structure used to represent a graph as an unordered list of its edges. 

An edge list may be considered a variation on an adjacency list which is represented as a length $|V|$ array of lists. Since each edge contains just two or three numbers, the total space for an edge list is $Θ(|E|)$.

__Advantages__:
 
- easy to loop/iterate over all edges

__Disadvantages__:

- hard to quickly tell if an edge exists between two vertices
- hard to find the degree of a vertex

##### Python implementation

![dia2](img/11edgelist.png)

```
[e_1, e_2, ... , e_n]
e_n = [v_x, v_y]
```

In [7]:
edge_list = [[1,2], [1,4], [1,7], [2,3], [2,5], [3,6], [4,7], [5,6], [6,7]]
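
The two disadvantages listed above become visible once the operations are written out. In this sketch, `has_edge` and `degree` are illustrative helper names (not part of any library); both have to scan the whole list, so each call takes $O(|E|)$ time. The graph is assumed to be undirected.

```python
edge_list = [[1,2], [1,4], [1,7], [2,3], [2,5], [3,6], [4,7], [5,6], [6,7]]

def has_edge(edges, u, v):
    """Linear scan over all edges; both orientations count as the same edge."""
    return any((a == u and b == v) or (a == v and b == u) for a, b in edges)

def degree(edges, v):
    """Count how often v appears as an endpoint, again by scanning every edge."""
    return sum((a == v) + (b == v) for a, b in edges)

print(has_edge(edge_list, 7, 1))
print(has_edge(edge_list, 1, 3))
print(degree(edge_list, 6))
```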

#### Exercise 2
If the graph is dense and the number of edges is large, an adjacency matrix should be the first choice. Even if the graph and the adjacency matrix are sparse, we can represent it using data structures for sparse matrices.

Implement an edge list of the graph shown below.

![dia1](img/11ex1.png)


In [None]:
edge_list = []  # TODO: list the edges of the graph as [u, v] pairs

In [10]:
# Solution
edge_list = [[1,2], [1,5], [1,6], [2,3], [2,7], [3,4], [4,5], [4,7], [5,6], [5,7]]

### 1.5.2 Adjacency Matrix

An adjacency matrix is a 2D array of size $N \times N$, where $N$ is the number of vertices. It is a way of representing a graph as a matrix of booleans. A finite graph can be represented in the form of a square matrix on a computer, where the boolean value of an entry indicates whether there is an edge between two vertices. Each row and each column represents a vertex.

If the value of an element $a[i][j]$ is 1, there is an edge connecting vertex $i$ and vertex $j$.

- the non-diagonal entry $a[i][j]$ is the number of edges joining vertex $i$ and vertex $j$
- the diagonal entry $a[i][i]$ corresponds to the number of loops at vertex $i$ (often disallowed)
- in an undirected graph $a[i][j]=a[j][i]$ for all $i, j$ (diagonally symmetric)

![dia2](img/11amatrix.png)
<div class="author">src: Tone Brathen</div>


__Advantages__:
 
- fast to tell whether an edge exists between any two vertices $i$ and $j$
- other basic operations like adding an edge or removing an edge are also extremely time efficient, constant time operations
- if the graph is dense and the number of edges is large, an adjacency matrix should be the first choice
- expensive matrix operations can be calculated in parallel (e.g. using GPUs)
- efficient analysis of the relationships between vertices by performing operations on the adjacency matrix

__Disadvantages__:

- consumes a lot of memory on sparse graphs with only a few edges, since space must be reserved for every possible link between all vertices. Hence, adjacency lists are often a better choice for many tasks


##### Python implementation

![dia2](img/11amatrix.png)


In [35]:
adj_matrix = [
[0, 1, 0, 1, 0, 0, 1],
[1, 0, 1, 0, 1, 0, 0],
[0, 1, 0, 0, 0, 1, 0],
[1, 0, 0, 0, 0, 0, 1],
[0, 1, 0, 0, 0, 1, 1],
[0, 0, 1, 0, 1, 0, 1],
[1, 0, 0, 1, 1, 1, 0], 
]
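
The advantages listed above can be tried out on this matrix. The sketch below assumes the vertices are labelled 1 to 7 while Python lists are 0-indexed, hence the `- 1` offsets; `has_edge` and `degree` are illustrative helper names.

```python
adj_matrix = [
    [0, 1, 0, 1, 0, 0, 1],
    [1, 0, 1, 0, 1, 0, 0],
    [0, 1, 0, 0, 0, 1, 0],
    [1, 0, 0, 0, 0, 0, 1],
    [0, 1, 0, 0, 0, 1, 1],
    [0, 0, 1, 0, 1, 0, 1],
    [1, 0, 0, 1, 1, 1, 0],
]

def has_edge(matrix, i, j):
    """Constant-time lookup; vertices are labelled 1..N."""
    return matrix[i - 1][j - 1] == 1

def degree(matrix, i):
    """In a loop-free undirected graph, the degree is the sum of row i."""
    return sum(matrix[i - 1])

n = len(adj_matrix)
# An undirected graph has a symmetric adjacency matrix.
is_symmetric = all(adj_matrix[r][c] == adj_matrix[c][r]
                   for r in range(n) for c in range(n))
# Each undirected edge appears twice in the matrix, once per direction.
num_edges = sum(map(sum, adj_matrix)) // 2

print(has_edge(adj_matrix, 1, 7), degree(adj_matrix, 7), is_symmetric, num_edges)
```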

#### Exercise 3
<div class="author">macs.hw.ac.uk</div>

For each of the graphs below write down the adjacency matrix.

![](img/11ex3.png)


__Solution:__

![](img/11ex3sol.png)


#### Exercise 4
<div class="author">macs.hw.ac.uk</div>

Draw the graphs having the following adjacency matrices
![dia2](img/11ex4.png)


__Solution:__

![](img/11ex4sol.png)


### 1.5.3 Adjacency List

An adjacency list represents a graph as an array of linked lists.

The index of the array represents a vertex and each element in its linked list represents the other vertices that form an edge with the vertex.

- in unweighted graphs, the lists can simply be references to other vertices and thus use little memory
- in undirected graphs, edge $(i,j)$ is stored in both $i$ and $j$ lists

![](img/11adjlist.png)
<div class="author">src: Tone Brathen</div>


__Advantages__:
 
- an adjacency list is efficient in terms of storage because we only need to store the values for the edges
- it is efficient to find all vertices adjacent to a vertex
- new vertices can be added to the graph easily, and they can be connected with existing nodes simply by adding elements to the appropriate arrays

__Disadvantages__:

- determining whether an edge exists between two vertices requires $O(N)$ time, where $N$ is the average number of edges per node


##### Python implementation

![dia2](img/11adjlist.png)


In [38]:
# Vertices are labelled 1..7; the list at index i-1 holds the neighbors of vertex i
adj_list = [[2, 4, 7], [1, 3, 5], [2, 6], [1, 7], [2, 6, 7], [3, 5, 7], [1, 4, 5, 6]]

# Alternative: simple dictionary mapping each vertex to the set of its neighbors

adjlist_dict = {'1': {'2', '4', '7'},
                '2': {'1', '3', '5'},
                '3': {'2', '6'},
                '4': {'1', '7'},
                '5': {'2', '6', '7'},
                '6': {'3', '5', '7'},
                '7': {'1', '4', '5', '6'}}
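
An adjacency list can also be derived from an edge list. The following is a minimal sketch, assuming an undirected graph and using the edge list from section 1.5.1 as input; `edges_to_adjacency` is a hypothetical helper name.

```python
from collections import defaultdict

def edges_to_adjacency(edges):
    """Build a dict that maps every vertex to the set of its neighbors."""
    adjacency = defaultdict(set)
    for u, v in edges:
        # Undirected graph: store the edge in the lists of both endpoints.
        adjacency[u].add(v)
        adjacency[v].add(u)
    return adjacency

edge_list = [[1,2], [1,4], [1,7], [2,3], [2,5], [3,6], [4,7], [5,6], [6,7]]
adjacency = edges_to_adjacency(edge_list)

for vertex in sorted(adjacency):
    print(vertex, "->", sorted(adjacency[vertex]))
```

Using sets instead of lists makes the membership test `v in adjacency[u]` an expected constant-time operation, at the price of losing the insertion order.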

In [40]:
# Adjacency List representation in Python
# Source: programiz.com

class AdjNode:
    def __init__(self, value):
        self.vertex = value
        self.next = None


class Graph:
    def __init__(self, num):
        self.V = num
        self.graph = [None] * self.V

    # Add edges
    def add_edge(self, s, d):
        node = AdjNode(d)
        node.next = self.graph[s]
        self.graph[s] = node
        node = AdjNode(s)
        node.next = self.graph[d]
        self.graph[d] = node

    # Print the graph
    def print_agraph(self):
        for i in range(self.V):
            print("Vertex " + str(i) + ":", end="")
            temp = self.graph[i]
            while temp:
                print(" -> {}".format(temp.vertex), end="")
                temp = temp.next
            print(" \n")


if __name__ == "__main__":
    V = 5

    # Create graph and edges
    graph = Graph(V)
    graph.add_edge(0, 1)
    graph.add_edge(0, 2)
    graph.add_edge(0, 3)
    graph.add_edge(1, 2)

    graph.print_agraph()

Vertex 0: -> 3 -> 2 -> 1 

Vertex 1: -> 2 -> 0 

Vertex 2: -> 1 -> 0 

Vertex 3: -> 0 

Vertex 4: 



#### Exercise 5

Write down the adjacency list specifying the graph below.

![dia2](img/11ex5.png)


__Solution:__

![](img/11ex5sol.png)


### 1.5.4 Incidence Matrix

An *Incidence Matrix* is a two-dimensional matrix, in which the rows represent the vertices and columns represent the edges. The entries indicate the incidence relation between the vertex at a row and edge at a column.

![dia2](img/11incidence.png)
<div class="author">src: Laura Leal-Taixe</div>
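
A possible way to build an incidence matrix for an undirected graph is sketched below. The edge list is a hypothetical example (not the graph from the figure), vertices are assumed to be labelled 1 to `num_vertices`, and `incidence_matrix` is an illustrative helper name.

```python
def incidence_matrix(num_vertices, edges):
    """Rows are vertices, columns are edges; 1 marks 'vertex is an endpoint of edge'."""
    matrix = [[0] * len(edges) for _ in range(num_vertices)]
    for col, (u, v) in enumerate(edges):
        matrix[u - 1][col] = 1
        matrix[v - 1][col] = 1
    return matrix

# Hypothetical graph with 4 vertices and 4 edges.
edges = [(1, 2), (1, 3), (2, 3), (3, 4)]
M = incidence_matrix(4, edges)
for row in M:
    print(row)
```

In this encoding every column contains exactly two ones (the two endpoints of the edge), and the sum of a row is the degree of the corresponding vertex.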


## 1.6 Weighted/directed Graphs

A __weighted graph__ or a network is a graph in which a number (the weight) is assigned to each edge. Such weights might represent for example costs, lengths or capacities, depending on the problem at hand. Such graphs arise in many contexts, for example in shortest path problems such as the traveling salesman problem. 

A __directed graph__ or digraph is a graph in which edges have orientations. 

![dia1](img/11weight.png)


##### 1.6.1 Representation of weighted/directed graphs in Adjacency Lists and Adjacency Matrices

1. *weighted*
    - __Adjacency List__: store weight in each edge node
    - __Adjacency Matrix__: store weight in each matrix entry
2. *directed*
    - __Adjacency List__: edges appear only in start vertex's list
    - __Adjacency Matrix__: no longer diagonally symmetric
    
![dia2](img/11wd.png)
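
Both representations can be sketched for a graph that is weighted and directed at the same time. The edges below are a hypothetical example, and marking a missing edge with `math.inf` is only one possible convention (it is assumed here because it works well for cost comparisons).

```python
import math

# Hypothetical weighted, directed edges: (start, end, weight).
edges = [(0, 1, 4), (0, 2, 1), (2, 1, 2), (1, 3, 5)]
n = 4

# Adjacency list: each edge is stored only in the list of its start vertex,
# together with its weight.
adj_list = {v: [] for v in range(n)}
for u, v, w in edges:
    adj_list[u].append((v, w))

# Adjacency matrix: entry [u][v] holds the weight of the edge u -> v.
adj_matrix = [[math.inf] * n for _ in range(n)]
for v in range(n):
    adj_matrix[v][v] = 0
for u, v, w in edges:
    adj_matrix[u][v] = w

print(adj_list)
for row in adj_matrix:
    print(row)
```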


## 1.7 Runtime Comparison
<div class="author">src: wikipedia.org</div>

||Adjacency List|Adjacency Matrix|Incidence Matrix|
|:---|:---|:---|:---|
|Store Graph|$O(|V|+|E|)$|$O(|V|^2)$|$O(|V|\cdot|E|)$|
|Add vertex|$O(1)$|$O(|V|^2)$|$O(|V|\cdot|E|)$|
|Add edge|$O(1)$|$O(1)$|$O(|V|\cdot|E|)$|
|Remove vertex|$O(|E|)$|$O(|V|^2)$|$O(|V|\cdot|E|)$|
|Remove edge|$O(|V|)$|$O(1)$|$O(|V|\cdot|E|)$|
|Remarks|slow to remove vertices and edges because finding them is inefficient|slow to add or remove vertices because matrix must be copied/resized|slow to add or remove vertices and edges because matrix must be resized/copied|

## 1.8 Types of Graphs

##### Types
![](img/11types.png)
<div class="author">src: gousios.gr</div>



##### Connected Graph
<div class="author">src: programiz.com</div>


A connected graph is a graph in which there is always a path from a vertex to any other vertex.

![book](img/11connected.webp)
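
One way to test this property is to start at an arbitrary vertex and check whether every other vertex can be reached. The sketch below uses a breadth-first search for this, which is a technique chosen here for illustration and not introduced in this section; it assumes an undirected graph stored as a dictionary of neighbor sets, and `is_connected` is a hypothetical helper name.

```python
from collections import deque

def is_connected(adjacency):
    """Return True if every vertex is reachable from an arbitrary start vertex."""
    if not adjacency:
        return True
    start = next(iter(adjacency))
    seen = {start}
    queue = deque([start])
    while queue:
        vertex = queue.popleft()
        for neighbor in adjacency[vertex]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return len(seen) == len(adjacency)

connected = {1: {2}, 2: {1, 3}, 3: {2}}
disconnected = {1: {2}, 2: {1}, 3: set()}
print(is_connected(connected), is_connected(disconnected))
```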


##### Spanning Trees

Sometimes, when working with graphs, it is only necessary that all nodes are connected, but it does not matter how they are connected. An example is a network with redundant connections.

For routing decisions, we want to reduce a graph to a tree. The tree tells us exactly how data will be sent.

A spanning tree is a sub-graph of an undirected connected graph, which includes all the vertices of the graph with a minimum possible number of edges. If a vertex is missed, then it is not a spanning tree.


![dia2](img/11sp2.webp)

For a graph with $n$ vertices, the maximum number of spanning trees is reached by the complete graph and is equal to (Cayley's formula):
$$n^{n-2}$$

#### Exercise 6

What is the maximum number of possible spanning trees for a graph with 4 vertices? Draw all (some) of the possible spanning trees.

__Solution:__

![dia2](img/11ex6sol.jpg)
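
The result can be double-checked by brute force. The sketch below assumes a complete graph with vertices `0..n-1`, enumerates every subset of $n-1$ edges and keeps those without a cycle; the helper names are illustrative, and the approach is only feasible for very small $n$.

```python
from itertools import combinations

def is_spanning_tree(n, edges):
    """n-1 edges on n vertices form a spanning tree iff they contain no cycle."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            x = parent[x]
        return x

    for u, v in edges:
        root_u, root_v = find(u), find(v)
        if root_u == root_v:
            return False  # u and v are already connected: this edge closes a cycle
        parent[root_u] = root_v
    return True

def count_spanning_trees_complete(n):
    """Count the spanning trees of the complete graph on n vertices."""
    all_edges = list(combinations(range(n), 2))
    return sum(is_spanning_tree(n, subset)
               for subset in combinations(all_edges, n - 1))

print(count_spanning_trees_complete(4), 4 ** (4 - 2))
```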


##### Minimum Spanning Tree (MST)

A __minimum spanning tree (MST)__ or __minimum weight spanning tree__ is a subset of the edges of a connected, edge-weighted undirected graph that connects all the vertices together, without any cycles and with the minimum possible total edge weight.

![dia2](img/11mst.webp)
<div class="author">src: i1.wp.com</div>

__Applications:__

- Approximating travelling salesman problem
- Approximating multi-terminal minimum cut problem
- Approximating minimum-cost weighted perfect matching
- Cluster analysis
- Handwriting recognition
- Image segmentation
- Circuit design

__Algorithms for finding the MST:__

- Kruskal's Algorithm
- Prim's Algorithm

### 1.9. Kruskal's Algorithm

Kruskal's algorithm finds a minimum spanning forest of an undirected edge-weighted graph. If the graph is connected, it finds a minimum spanning tree. It is a *greedy algorithm* in graph theory as in each step it adds the next lowest-weight edge that will not form a cycle to the minimum spanning forest.

1. The edges are sorted in ascending order of their weights.

2. The remaining edge with the minimum weight is selected and added to the MST. If an edge would create a cycle, it is rejected.
    
3. Step 2 is repeated until all the vertices are connected, i.e. until the MST of a connected graph contains $|V| - 1$ edges.

![dia2](img/11kruskal.png)
<div class="author">src: O'Reilly Media</div>

In [45]:
# Kruskal's algorithm in Python
# Source: programiz.com

class Graph:
    def __init__(self, vertices):
        self.V = vertices
        self.graph = []

    def add_edge(self, u, v, w):
        self.graph.append([u, v, w])

    # Search function

    def find(self, parent, i):
        if parent[i] == i:
            return i
        return self.find(parent, parent[i])

    def apply_union(self, parent, rank, x, y):
        xroot = self.find(parent, x)
        yroot = self.find(parent, y)
        if rank[xroot] < rank[yroot]:
            parent[xroot] = yroot
        elif rank[xroot] > rank[yroot]:
            parent[yroot] = xroot
        else:
            parent[yroot] = xroot
            rank[xroot] += 1

    #  Applying Kruskal algorithm
    def kruskal_algo(self):
        result = []
        i, e = 0, 0
        self.graph = sorted(self.graph, key=lambda item: item[2])
        parent = []
        rank = []
        for node in range(self.V):
            parent.append(node)
            rank.append(0)
        while e < self.V - 1 and i < len(self.graph):  # also stop when the edges run out (disconnected graph)
            u, v, w = self.graph[i]
            i = i + 1
            x = self.find(parent, u)
            y = self.find(parent, v)
            if x != y:
                e = e + 1
                result.append([u, v, w])
                self.apply_union(parent, rank, x, y)
        for u, v, weight in result:
            print("%d - %d: %d" % (u, v, weight))


g = Graph(6)
g.add_edge(0, 1, 4)
g.add_edge(0, 2, 4)
g.add_edge(1, 2, 2)
g.add_edge(1, 0, 4)
g.add_edge(2, 0, 4)
g.add_edge(2, 1, 2)
g.add_edge(2, 3, 3)
g.add_edge(2, 5, 2)
g.add_edge(2, 4, 4)
g.add_edge(3, 2, 3)
g.add_edge(3, 4, 3)
g.add_edge(4, 2, 4)
g.add_edge(4, 3, 3)
g.add_edge(5, 2, 2)
g.add_edge(5, 4, 3)
g.kruskal_algo()

1 - 2: 2
2 - 5: 2
2 - 3: 3
3 - 4: 3
0 - 1: 4


#### Exercise 7

Consider the graph below. Provide the order in which Kruskal's algorithm adds the edges to the MST. Compute the MST using Kruskal's algorithm.

![dia2](img/11ex7.png)


__Solution:__

{c,d},{d,f},{b,e},{e,f},{a,c}

Cost: 18

#### Exercise 8

Consider a set of 5 towns. The cost of construction of a road between town $i$ and $j$ is $a_{ij}$. Find the minimum cost of connecting the towns with each other.

$$\begin{bmatrix}
0&3&5&11&9\\
3&0&3&9&8\\
5&3&0&\infty&10\\
11&9&\infty&0&7\\
9&8&10&7&0
\end{bmatrix}$$

__Solution:__

Applying Kruskal's algorithm yields the minimum spanning tree $\{(1,2),(2,3),(2,5),(4,5)\}$ with a total cost of $3 + 3 + 8 + 7 = 21$.
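
The result can be reproduced in code. The following is a compact sketch of Kruskal's algorithm that works directly on the cost matrix, assuming that `math.inf` marks a road that cannot be built and that towns are numbered from 1 in the output; `kruskal_from_matrix` is a hypothetical helper and not the `Graph` class from section 1.9.

```python
import math

INF = math.inf
cost = [
    [0, 3, 5, 11, 9],
    [3, 0, 3, 9, 8],
    [5, 3, 0, INF, 10],
    [11, 9, INF, 0, 7],
    [9, 8, 10, 7, 0],
]

def kruskal_from_matrix(cost):
    n = len(cost)
    # Collect each undirected edge once (upper triangle) and sort by weight.
    edges = sorted((cost[i][j], i, j)
                   for i in range(n) for j in range(i + 1, n)
                   if cost[i][j] != INF)
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # shorten the path while walking up
            x = parent[x]
        return x

    tree, total = [], 0
    for weight, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:  # the edge joins two components, so no cycle
            parent[root_i] = root_j
            tree.append((i + 1, j + 1))
            total += weight
    return tree, total

tree, total = kruskal_from_matrix(cost)
print(sorted(tree), total)
```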