A graph is a non-linear data structure (its elements do not have to be traversed sequentially) consisting of nodes and edges. The nodes are also referred to as vertices, and the edges are lines or arcs that connect any two nodes in the graph.

Graphs are used to solve many real-life problems, most often to represent networks such as paths in a city, a telephone network, or a circuit network. Graphs are also used in social networks like LinkedIn and Facebook. For example, in Facebook, each person is represented by a vertex (or node). Each node is a structure that contains information such as person id, name, gender, and locale.

# Introduction, DFS, BFS

A graph is a data structure that consists of the following two components:<br>
1. A finite set of vertices, also called nodes.<br>
2. A finite set of ordered pairs of the form (u, v), called edges. The pair is ordered because (u, v) is not the same as (v, u) in a directed graph (di-graph). A pair of the form (u, v) indicates that there is an edge from vertex u to vertex v. Edges may carry a weight/value/cost.
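
As a minimal sketch of this definition, the two components can be written directly as a set of vertices and a set of ordered pairs. The names `vertices`, `directed_edges`, `weights`, `has_edge` and `undirected_view` are illustrative assumptions, not part of any library.

```python
# Hypothetical example: a graph as a vertex set plus a set of ordered pairs.
vertices = {0, 1, 2, 3}

# Each pair (u, v) is an edge from u to v; (0, 1) and (1, 0) are different edges.
directed_edges = {(0, 1), (1, 2), (2, 0), (2, 3)}

# Weighted edges can carry a cost keyed by the same ordered pair.
weights = {(0, 1): 10, (1, 2): 20, (2, 0): 30, (2, 3): 40}


def has_edge(edges, u, v):
    """Return True if there is an edge from u to v."""
    return (u, v) in edges


def undirected_view(edges):
    """Treat every edge as two-way by adding the reversed pair."""
    return edges | {(v, u) for (u, v) in edges}


print(has_edge(directed_edges, 0, 1))                    # True
print(has_edge(directed_edges, 1, 0))                    # False
print(has_edge(undirected_view(directed_edges), 1, 0))   # True
```

In a directed graph the order of the pair matters, so the second query fails until the reversed pairs are added.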

<strong>Representations</strong>

Following two are the most commonly used representations of a graph.<br>
1. Adjacency Matrix<br>
2. Adjacency List<br>
There are other representations as well, such as Incidence Matrix and Incidence List. The choice of graph representation is situation-specific: it depends on the type of operations to be performed and ease of use.

Adjacency Matrix is a 2D array of size V x V where V is the number of vertices in a graph. Let the 2D array be adj[][]; a slot adj[i][j] = 1 indicates that there is an edge from vertex i to vertex j. The adjacency matrix for an undirected graph is always symmetric. An adjacency matrix can also represent weighted graphs: if adj[i][j] = w, then there is an edge from vertex i to vertex j with weight w.

Pros: The representation is easy to implement and follow. Removing an edge takes O(1) time. Queries such as whether there is an edge from vertex ‘u’ to vertex ‘v’ are efficient and take O(1) time.
<br><br>
Cons: Consumes O(V^2) space. Even if the graph is sparse (contains few edges), it consumes the same space. Adding a vertex takes O(V^2) time.
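
These trade-offs can be sketched with a plain list-of-lists matrix. This is only an illustration: `make_matrix`, `add_undirected_edge`, `remove_edge` and `add_vertex` are hypothetical helper names, and 0 is assumed to mean "no edge" (the class further down uses -1 instead).

```python
def make_matrix(num_vertices):
    """Build a V x V matrix where 0 means 'no edge' (an assumption of this sketch)."""
    return [[0] * num_vertices for _ in range(num_vertices)]


def add_undirected_edge(adj, i, j, w=1):
    """Store the edge in both directions, which keeps the matrix symmetric."""
    adj[i][j] = w
    adj[j][i] = w


def remove_edge(adj, i, j):
    """Removing an undirected edge is two constant-time writes."""
    adj[i][j] = 0
    adj[j][i] = 0


def add_vertex(adj):
    """Grow the matrix by one vertex by copying into a (V+1) x (V+1) matrix.

    Copying every cell is what makes this step O(V^2).
    """
    n = len(adj)
    bigger = make_matrix(n + 1)
    for i in range(n):
        for j in range(n):
            bigger[i][j] = adj[i][j]
    return bigger


adj = make_matrix(3)
add_undirected_edge(adj, 0, 1)
add_undirected_edge(adj, 1, 2, w=5)
print(adj[1][2] != 0)          # O(1) edge query
adj = add_vertex(adj)
print(len(adj), len(adj[0]))   # matrix is now 4 x 4
```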

<strong>Adjacency Matrix</strong>

In [2]:
class Graph:
    def __init__(self,numVertices):
        self.adjMat=[[-1 for j in range(numVertices)] for i in range(numVertices)]
        self.numVertices=numVertices
        self.vertices={}
        self.verticesList=[None]*numVertices

    def addVertex(self,vtx,id):
        if vtx>=0 and vtx<self.numVertices:
            self.vertices[id]=vtx
            self.verticesList[vtx]=id

    def setEdge(self,frm,to,cost):
        frm=self.vertices[frm]
        to=self.vertices[to]
        self.adjMat[frm][to]=cost

    def getVertices(self):
        return self.verticesList

    def getEdges(self):
        edges=[]
        for i in range(self.numVertices):
            for j in range(self.numVertices):
                if self.adjMat[i][j]!=-1:
                    edges.append((self.verticesList[i],self.verticesList[j],self.adjMat[i][j]))
        return edges

if __name__ == '__main__':
    graph=Graph(6)
    graph.addVertex(0,'a')
    graph.addVertex(1,'b')
    graph.addVertex(2,'c')
    graph.addVertex(3,'d')
    graph.addVertex(4,'e')
    graph.addVertex(5,'f')
    print(graph.getVertices())
    graph.setEdge('a','e',10)
    graph.setEdge('a','c',20)
    graph.setEdge('c','b',30)
    graph.setEdge('b','e',40)
    graph.setEdge('e','d',50)
    graph.setEdge('f','e',60)
    print(graph.getEdges())
    #print(graph.adjMat)


['a', 'b', 'c', 'd', 'e', 'f']
[('a', 'c', 20), ('a', 'e', 10), ('b', 'e', 40), ('c', 'b', 30), ('e', 'd', 50), ('f', 'e', 60)]


<strong>Adjacency List</strong>

An array of lists is used. The size of the array is equal to the number of vertices. Let the array be array[]. An entry array[i] represents the list of vertices adjacent to the ith vertex. The code below uses a defaultdict of lists in place of a fixed-size array.

In [3]:
from collections import defaultdict
class Graph:
    def __init__(self,v):
        self.v=v
        self.graph=defaultdict(list)

    def addEdge(self,u,v):
        self.graph[u].append(v)

    def getGraph(self):
        return self.graph

if __name__ == '__main__':
    graph=Graph(5)
    graph.addEdge(0, 1)
    graph.addEdge(0, 4)
    graph.addEdge(1, 2)
    graph.addEdge(1, 3)
    graph.addEdge(1, 4)
    graph.addEdge(2, 3)
    graph.addEdge(3, 4)
    print(graph.getGraph())


defaultdict(<class 'list'>, {0: [1, 4], 1: [2, 3, 4], 2: [3], 3: [4]})


Pros: Saves space, using O(|V|+|E|). In the worst case, a graph can have C(V, 2) edges, in which case the list consumes O(V^2) space. Adding a vertex is easier.
<br><br>
Cons: Queries such as whether there is an edge from vertex u to vertex v are not efficient and can take O(V) time.
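
To make the cost of an edge query concrete, here is a small sketch over the same edges as the example above. `build_adjacency_list` and `has_edge` are hypothetical helpers introduced only for illustration.

```python
from collections import defaultdict


def build_adjacency_list(edges):
    """Map each vertex to the list of vertices it points to."""
    graph = defaultdict(list)
    for u, v in edges:
        graph[u].append(v)
    return graph


def has_edge(graph, u, v):
    """Scan u's neighbour list; in the worst case this inspects O(V) entries."""
    # .get avoids creating an empty entry in the defaultdict for unknown vertices
    return v in graph.get(u, [])


edges = [(0, 1), (0, 4), (1, 2), (1, 3), (1, 4), (2, 3), (3, 4)]
graph = build_adjacency_list(edges)
print(has_edge(graph, 1, 3))   # True: 3 is in the list of vertex 1
print(has_edge(graph, 3, 1))   # False: edges are directed
```

Storing each neighbour collection as a set instead of a list would make the membership test constant time on average, at the cost of losing insertion order.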

# Breadth First Search or BFS for a Graph

Breadth First Traversal (or Search) for a graph is similar to Breadth First Traversal of a tree. The only catch here is, unlike trees, graphs may contain cycles, so we may come to the same node again. To avoid processing a node more than once, we use a boolean visited array. For simplicity, it is assumed that all vertices are reachable from the starting vertex.

In [5]:
from collections import defaultdict, deque

class Graph:
    def __init__(self, v):
        self.v = v
        self.graph = defaultdict(list)

    def addEdge(self, u, v):
        self.graph[u].append(v)

    def BFS(self, s):
        # deque gives O(1) pops from the front; list.pop(0) is O(n)
        queue=deque()
        visited=[False]*self.v

        # Mark the source as visited and enqueue it
        queue.append(s)
        visited[s]=True

        while queue:
            temp=queue.popleft()
            print(temp,end=' ')
            # Enqueue every unvisited neighbour and mark it visited right away
            for i in self.graph[temp]:
                if visited[i]==False:
                    queue.append(i)
                    visited[i]=True


if __name__ == '__main__':
    g = Graph(4)
    g.addEdge(0, 1)
    g.addEdge(0, 2)
    g.addEdge(1, 2)
    g.addEdge(2, 0)
    g.addEdge(2, 3)
    g.addEdge(3, 3)
    g.BFS(2)


2 0 3 1 

Note that the above code traverses only the vertices reachable from a given source vertex. Not all vertices may be reachable from a given vertex (for example, in a disconnected graph). To print all the vertices, we can modify the BFS function to start a traversal from every vertex that has not yet been visited.
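
A minimal sketch of that modification, assuming vertices are numbered 0 to V-1 as in the class above. The function name `bfs_all` is hypothetical, and it returns the visit order as a list instead of printing it.

```python
from collections import defaultdict, deque


def bfs_all(num_vertices, graph):
    """Return vertices in BFS order, restarting from every unvisited vertex."""
    visited = [False] * num_vertices
    order = []
    for start in range(num_vertices):
        if visited[start]:
            continue  # already reached by an earlier traversal
        visited[start] = True
        queue = deque([start])
        while queue:
            node = queue.popleft()
            order.append(node)
            for nxt in graph[node]:
                if not visited[nxt]:
                    visited[nxt] = True
                    queue.append(nxt)
    return order


# Two separate components (0-1 and 2-3) plus an isolated vertex 4.
graph = defaultdict(list)
for u, v in [(0, 1), (1, 0), (2, 3), (3, 2)]:
    graph[u].append(v)
print(bfs_all(5, graph))
```

Because the visited array is shared across restarts, every vertex and edge is still handled once, so the total cost stays O(V+E).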

Time Complexity: O(V+E) where V is number of vertices in the graph and E is number of edges in the graph.

# Depth First Search or DFS for a Graph

In [7]:
from collections import defaultdict
class Graph:
    def __init__(self,v):
        self.v=v
        self.graph=defaultdict(list)

    def addEdge(self,u,v):
        self.graph[u].append(v)

    def DFSUtil(self,v,visited):
        visited[v]=True
        print(v,end=" ")
        for i in self.graph[v]:
            if visited[i]==False:
                self.DFSUtil(i,visited)

    def DFS(self,s):
        visited=[False]*self.v
        self.DFSUtil(s,visited)

if __name__ == '__main__':
    g = Graph(4)
    g.addEdge(0, 1)
    g.addEdge(0, 2)
    g.addEdge(1, 2)
    g.addEdge(2, 0)
    g.addEdge(2, 3)
    g.addEdge(3, 3)
    g.DFS(2)


2 0 1 3 

Time Complexity: O(V+E) where V is number of vertices in the graph and E is number of edges in the graph.

The above code traverses only the vertices reachable from a given source vertex. Not all vertices may be reachable from a given vertex (for example, in a disconnected graph). To do a complete DFS traversal of such graphs, we must call DFSUtil() for every vertex. Before each call to DFSUtil(), we should check whether the vertex has already been visited by an earlier call.
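
The complete traversal can be sketched as follows, again assuming vertices numbered 0 to V-1. `dfs_all` and its inner `visit` are hypothetical names, and the visit order is collected in a list instead of being printed.

```python
from collections import defaultdict


def dfs_all(num_vertices, graph):
    """Return vertices in DFS order, covering every component of the graph."""
    visited = [False] * num_vertices
    order = []

    def visit(v):
        visited[v] = True
        order.append(v)
        for nxt in graph[v]:
            if not visited[nxt]:
                visit(nxt)

    for v in range(num_vertices):
        # Only start a new DFS from vertices no earlier call has reached
        if not visited[v]:
            visit(v)
    return order


# A cycle 0 -> 2 -> 1 -> 0 and a separate edge 3 -> 4.
graph = defaultdict(list)
for u, v in [(0, 2), (2, 1), (1, 0), (3, 4)]:
    graph[u].append(v)
print(dfs_all(5, graph))
```

For very deep graphs the recursion may hit Python's recursion limit; an explicit stack is a common alternative.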

# Applications of Breadth First Traversal

1) Shortest Path and Minimum Spanning Tree for unweighted graph: In an unweighted graph, the shortest path is the path with the least number of edges. With Breadth First Search, we always reach a vertex from the given source using the minimum number of edges. Also, in an unweighted graph, any spanning tree is a Minimum Spanning Tree.
<br><br>
2) Peer to Peer Networks. In Peer to Peer Networks like BitTorrent, Breadth First Search is used to find all neighbor nodes.
<br><br>
3) Crawlers in Search Engines: Crawlers build index using Breadth First. The idea is to start from source page and follow all links from source and keep doing same. Depth First Traversal can also be used for crawlers, but the advantage with Breadth First Traversal is, depth or levels of the built tree can be limited.
<br><br>
4) Social Networking Websites: In social networks, we can find people within a given distance ‘k’ from a person using Breadth First Search till ‘k’ levels.
<br><br>
5) GPS Navigation systems
<br><br>
6) Broadcasting in Network
<br><br>
7) Garbage Collection: Breadth First Search is preferred over Depth First Search because of better locality of reference.
<br><br>
8) Cycle Detection
<br><br>
9) Path Finding
<br><br>
10) Finding all nodes within one connected component: We can either use Breadth First or Depth First Traversal to find all nodes reachable from a given node.
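
Items 1 and 4 above can be illustrated with one short sketch that records the BFS level of each vertex. The helper names `bfs_distances` and `within_k`, and the `friends` data, are made up for this example.

```python
from collections import deque


def bfs_distances(graph, source):
    """Return {vertex: fewest edges from source} for every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, []):
            if nxt not in dist:
                # First time we see nxt, so this is its shortest distance
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def within_k(graph, source, k):
    """Vertices at most k edges away from source (excluding source itself)."""
    return sorted(v for v, d in bfs_distances(graph, source).items() if 0 < d <= k)


# Hypothetical friendship graph stored as a plain dict of neighbour lists.
friends = {
    'ann': ['bob', 'cat'],
    'bob': ['ann', 'dan'],
    'cat': ['ann', 'dan'],
    'dan': ['bob', 'cat', 'eve'],
    'eve': ['dan'],
}
print(bfs_distances(friends, 'ann'))
print(within_k(friends, 'ann', 2))
```

The distance recorded for a vertex is final the first time BFS reaches it, which is why the traversal yields shortest paths only when every edge counts the same.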

# Applications of Depth First Search

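1) For an unweighted graph, DFS traversal of the graph produces a spanning tree, which is also a minimum spanning tree because every spanning tree has the same number of edges. Unlike BFS, the DFS tree does not in general give shortest paths.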
<br><br>
2) Detecting cycle in a graph
<br><br>
3) Path Finding<br>
i) Call DFS(G, u) with u as the start vertex.<br>
ii) Use a stack S to keep track of the path between the start vertex and the current vertex.<br>
iii) As soon as destination vertex z is encountered, return the path as the contents of the stack.
<br><br>
4) Topological Sorting
<br><br>
5) Finding Strongly Connected Components of a graph
<br><br>
6) Solving puzzles with only one solution, such as mazes
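
The path-finding steps in item 3 can be sketched as below. This is one possible reading of those steps, not a definitive implementation: `dfs_path` is a hypothetical name, and the list `path` plays the role of the stack S.

```python
def dfs_path(graph, start, goal):
    """Return one path from start to goal as a list, or None if none exists."""
    visited = set()
    path = []  # acts as the stack S of vertices from start to the current vertex

    def visit(v):
        visited.add(v)
        path.append(v)
        if v == goal:
            return True
        for nxt in graph.get(v, []):
            if nxt not in visited and visit(nxt):
                return True
        path.pop()  # dead end: v is not on a path to the goal
        return False

    return path if visit(start) else None


graph = {0: [1, 2], 1: [2], 2: [0, 3], 3: [3]}
print(dfs_path(graph, 0, 3))   # [0, 1, 2, 3]
print(dfs_path(graph, 3, 0))   # None
```

The path returned is the first one DFS happens to find. Here it has three edges even though 0 → 2 → 3 is shorter, which is why BFS is the usual choice when the fewest edges matter.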


In [10]:
# Graph representations using set and hash LATER

# Find a Mother Vertex in a Graph

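A mother vertex in a graph G = (V, E) is a vertex v such that all other vertices in G can be reached by a path from v.

Brute force solution: run BFS/DFS from each vertex and check whether all the vertices are reachable. This approach takes O(V(E+V)) time, which is very inefficient for large graphs.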

In [16]:
from collections import defaultdict
class Graph:
    def __init__(self,v):
        self.v=v
        self.graph=defaultdict(list)

    def addEdge(self,u,v):
        self.graph[u].append(v)

    def BFSUtil(self,s):
        queue=[]
        self.visited[s]=True
        queue.append(s)

        while queue:
            temp=queue.pop(0)
            for node in self.graph[temp]:
                if self.visited[node]==False:
                    queue.append(node)
                    self.visited[node]=True
        if False in self.visited:
            return False
        return True


    def BFS(self):
        self.visited=[]
        for i in range(self.v):
            self.visited=[False]*self.v
            if self.BFSUtil(i):
                return i
        return -1

if __name__ == '__main__':
    g=Graph(7)
    g.addEdge(0, 1)
    g.addEdge(0, 2)
    g.addEdge(1, 3)
    g.addEdge(4, 1)
    g.addEdge(6, 4)
    g.addEdge(5, 6)
    g.addEdge(5, 2)
    g.addEdge(6, 0)
    print(g.BFS())


5


How to find a mother vertex?
<br><br>
Case 1: Undirected Connected Graph: In this case, all the vertices are mother vertices, as we can reach all the other nodes in the graph from any of them.
<br>
Case 2: Undirected/Directed Disconnected Graph: In this case, there are no mother vertices, as we cannot reach all the other nodes in the graph from any vertex.
<br>
Case 3: Directed Connected Graph: In this case, we have to find a vertex v in the graph such that we can reach all the other nodes in the graph through a directed path.

The idea is based on Kosaraju’s Strongly Connected Component Algorithm. In the component graph (the graph of strongly connected components), mother vertices always belong to the source component. The idea is:
<br>
If there exists a mother vertex (or vertices), then one of the mother vertices is the last finished vertex in DFS. (Or a mother vertex has the maximum finish time in DFS traversal.)
<br>
A vertex is said to be finished in DFS if the recursive call for its DFS is over, i.e., all descendants of the vertex have been visited.



Approach:<br>
Do a DFS traversal of the given graph. While traversing, keep track of the last finished vertex ‘v’. This step takes O(V+E) time.<br>
If there exists a mother vertex (or vertices), then v must be one (or one of them). Check if v is a mother vertex by doing DFS/BFS from v. This step also takes O(V+E) time.

In [17]:
from collections import defaultdict
class Graph:
    def __init__(self,v):
        self.v=v
        self.graph=defaultdict(list)

    def addEdge(self,u,v):
        self.graph[u].append(v)

    def DFSUtil(self,v,visited):
        visited[v]=True
        for i in self.graph[v]:
            if visited[i]==False:
                self.DFSUtil(i,visited)

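    def findMotherVertex(self):
        visited=[False]*self.v
        v=0
        # Start a DFS from every unvisited vertex; the root of the last DFS
        # started here is the last vertex to finish, so it is the only candidate
        for i in range(self.v):
            if visited[i]==False:
                self.DFSUtil(i,visited)
                v=i

        # Verify the candidate: a fresh DFS from v must reach every vertex
        visited=[False]*self.v
        self.DFSUtil(v,visited)
        if False in visited:
            return -1
        return v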

if __name__ == '__main__':
    g=Graph(7)
    g.addEdge(0, 1)
    g.addEdge(0, 2)
    g.addEdge(1, 3)
    g.addEdge(4, 1)
    g.addEdge(6, 4)
    g.addEdge(5, 6)
    g.addEdge(5, 2)
    g.addEdge(6, 0)
    result=g.findMotherVertex()
    print(result)


5


# Transitive Closure of a Graph using DFS

Given a directed graph, find out if a vertex v is reachable from another vertex u for all vertex pairs (u, v) in the given graph. Here reachable means that there is a path from vertex u to v. The reachability matrix is called the transitive closure of a graph.

In [18]:
from collections import defaultdict
class Graph:
    def __init__(self,v):
        self.v=v
        self.graph=defaultdict(list)
        # tClosure[i][j] is 1 if j is reachable from i; every vertex reaches itself
        self.tClosure=[[1 if i==j else 0 for j in range(self.v)] for i in range(self.v)]

    def addEdge(self,u,v):
        self.graph[u].append(v)

    def DFSUtil(self,s,v,visited):
        # Mark v as reachable from the source s, then explore its neighbours
        visited[v]=True
        self.tClosure[s][v]=1
        for i in self.graph[v]:
            if visited[i]==False:
                self.DFSUtil(s,i,visited)

    def findTClosure(self):
        # Run one DFS per source vertex, with a fresh visited array each time
        for s in range(self.v):
            visited=[False]*self.v
            self.DFSUtil(s,s,visited)
        return self.tClosure

if __name__ == '__main__':
    g = Graph(4)
    g.addEdge(0, 1)
    g.addEdge(0, 2)
    g.addEdge(1, 2)
    g.addEdge(2, 0)
    g.addEdge(2, 3)
    g.addEdge(3, 3)
    result=g.findTClosure()
    for i in range(g.v):
        print(result[i])


[1, 1, 1, 1]
[1, 1, 1, 1]
[1, 1, 1, 1]
[0, 0, 0, 1]
