
# **Fundamentals of Graphs**
<div style="text-align: right">INFO 6205 Program Structures and Algorithms</div>
<div style="text-align: right">Goutham Kanahasabai, 18 August 2023</div>

 <img src="https://drive.google.com/uc?export=view&id=1R3_GwUjuQ_VGjjWi9VIHfiXBYnVcceVg" alt="" width="700px">

  **Image generated using DALL·E**  
  Prompt: "Create an image of a computer science graph network that visually captures the interconnections between nodes and edges. Showcase the complexity and patterns of relationships within the graph, highlighting its relevance in representing real-world networks and systems."






## **Table of Contents**



1. [Introduction](#introduction)
   - What are Graphs?
   - Types of Graphs
   
2. [Representation of Graphs](#rep)
   - Adjacency Matrices
   - Adjacency Lists
   
3. [Graph Traversals](#traversals)
   - Breadth-first search
   - Depth-first search

4. [Directed Acyclic Graphs](#dags)
   - What are DAGs
   - Topological Sort
   
5. [Real-World Applications](#applications)

   
6. [References](#refs)


## <a name="introduction"></a> **Introduction**


 Graphs are a fundamental concept in computer science that represent a wide range of relationships and connections between various entities. They are a powerful way to model and solve problems that involve networks, relationships, and dependencies.


### **What are Graphs?**

A graph is a collection of nodes (or vertices) and edges that connect pairs of nodes.
A graph is written as `G = (V, E)`, where `G` is the graph, `V` is the set of vertices and `E` is the set of edges.

For example, here's a simple graph:
<img src="https://drive.google.com/uc?export=view&id=1Ily9mhuaeHo4sUShjqGXhTCJAR4gUc1Y" alt="" width="500px">

In this graph, the set of vertices is `V = {1, 2, 3, 4, 5}`,
and the set of edges is `E = {(1, 2), (2, 1), (1, 3), (3, 1), (2, 3), (3, 2), (2, 4), (4, 2), (2, 5), (5, 2), (3, 5), (5, 3), (4, 5), (5, 4)}`. Because the graph is undirected, each edge is listed once in each direction.


Common terminology related to graphs includes:

1. **Vertex**: An individual element (node) in the graph.
2. **Edge**: A connection between two nodes. It can have a direction (directed graph) or not (undirected graph).
3. **Path**: A sequence of nodes where each adjacent pair is connected by an edge.
4. **Cycle**: A path that starts from a given vertex and ends at the same vertex.
5. **Degree**: The degree of a node is the number of edges incident on it (for undirected graphs). In a directed graph, a node has an out-degree (number of outgoing edges) and an in-degree (number of incoming edges).
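
The terms above can be made concrete with a small sketch over the example graph. The helper names `has_edge`, `degree`, `is_path`, and `is_cycle` are illustrative choices rather than part of any library, and each undirected edge is stored once here instead of once per direction.

```python
# Vertices and undirected edges of the example graph (each edge stored once).
V = {1, 2, 3, 4, 5}
E = {(1, 2), (1, 3), (2, 3), (2, 4), (2, 5), (3, 5), (4, 5)}

def has_edge(u, v):
    """True if u and v are joined by an edge, in either orientation."""
    return (u, v) in E or (v, u) in E

def degree(v):
    """Number of edges incident on v."""
    return sum(1 for edge in E if v in edge)

def is_path(nodes):
    """True if every adjacent pair in the sequence is connected by an edge."""
    return all(has_edge(a, b) for a, b in zip(nodes, nodes[1:]))

def is_cycle(nodes):
    """True if the sequence is a path that returns to its start without repeating a vertex."""
    inner = nodes[:-1]
    return (len(nodes) >= 4 and nodes[0] == nodes[-1]
            and len(set(inner)) == len(inner) and is_path(nodes))

print(degree(2))                # vertex 2 touches the edges to 1, 3, 4 and 5
print(is_path([1, 2, 4, 5]))    # 1-2, 2-4 and 4-5 are all edges
print(is_cycle([1, 2, 3, 1]))   # the triangle 1-2-3 closes back on vertex 1
```

The length check in `is_cycle` rules out walking a single edge back and forth, which would not count as a cycle in a simple undirected graph.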

### **Types of Graphs**

There are primarily 2 types of graphs:

1. **Directed Graphs**: A directed graph is a set of vertices (nodes) connected by edges, where each edge has a direction associated with it. Edges are usually drawn as arrows pointing in the direction in which they can be traversed.


  <img src="https://drive.google.com/uc?export=view&id=1P4bfQKRgUumVwRdeE4PYmgzgMBB83MLH" alt="" width="400px">

  In the directed graph depicted above, the edge set comprises the following edges: `{(A, B), (B, C), (B, D), (C, E), (D, B), (E, D), (E, F)}`


2. **Undirected Graphs**: Undirected graphs have edges without any direction. Each edge signifies a two-way relationship and can be traversed in both directions.

  The following is a simple undirected graph:

<img src="https://drive.google.com/uc?export=view&id=1q3K5af1VUH9mofBpB04Fp-Co55nxpc3n" alt="" width="400px">
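
To illustrate the difference between the two types, here is a minimal sketch that reads the directed edge set listed earlier in two ways: once respecting direction and once ignoring it. The function name `build_adjacency` and its `directed` flag are assumptions made for this example.

```python
# Edge set of the directed example: (u, v) means an arrow from u to v.
edges = [("A", "B"), ("B", "C"), ("B", "D"), ("C", "E"),
         ("D", "B"), ("E", "D"), ("E", "F")]

def build_adjacency(edges, directed):
    """Map each vertex to the set of vertices reachable over a single edge."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set())   # make sure v has an entry even with no outgoing edges
        if not directed:
            adjacency[v].add(u)          # an undirected edge can be walked both ways
    return adjacency

as_directed = build_adjacency(edges, directed=True)
as_undirected = build_adjacency(edges, directed=False)

# In the directed reading A reaches B but B does not reach A.
print("B" in as_directed["A"], "A" in as_directed["B"])
# In the undirected reading the relationship is symmetric.
print("B" in as_undirected["A"], "A" in as_undirected["B"])
```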


The edges of a graph can also have numeric values associated with them, making that graph a weighted graph. A weighted graph is a graph in which each edge is assigned a weight that represents a cost, a distance, or some other quantity.
Edge weights can be positive or negative, and they do not have to be integers.

<img src="https://drive.google.com/uc?export=view&id=1QB-492SAe7RTj2EG2e1SautPm-OqPUPf" alt="" width="400px">
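
One common way to hold edge weights in Python is a dictionary of dictionaries. The sketch below uses a small hypothetical graph (the vertices and weights are made up for illustration and are not taken from the figure) and a helper, `path_cost`, that adds up the weights along a path.

```python
# Hypothetical weighted, undirected graph: weights[u][v] is the cost of edge u-v.
weights = {
    "A": {"B": 4, "C": 1},
    "B": {"A": 4, "C": -2},
    "C": {"A": 1, "B": -2},
}

def path_cost(path):
    """Sum the weights along consecutive vertices; raises KeyError if an edge is missing."""
    return sum(weights[u][v] for u, v in zip(path, path[1:]))

print(path_cost(["A", "B"]))        # the direct edge costs 4
print(path_cost(["A", "C", "B"]))   # the detour costs 1 + (-2)
```

The second path is cheaper even though it uses more edges, which is why weighted graphs need different shortest-path techniques than unweighted ones.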




## <a name="rep"></a> **Representation of Graphs**

The two most common ways to represent a graph are as follows:

1. Adjacency Matrix
2. Adjacency List

Let's look at each of these representations with some code!

### **Adjacency Matrix**

In the adjacency matrix approach, the graph is represented as a boolean matrix of 0s and 1s.
Let’s assume that we have a graph with `n` vertices. We need to construct a 2D matrix, `Mat[n][n]` which is of dimensions `n x n`.

The matrix is then filled out based on the following criteria:

* For every edge from vertex `i` to vertex `j`, mark `Mat[i][j]` as 1.
* For every pair of vertices `i` and `j` with no edge from `i` to `j`, mark `Mat[i][j]` as 0.

Here's a simple undirected graph:

<img src="https://drive.google.com/uc?export=view&id=1z2hex--GCdqSb4E1kC7KIOx8nz2XSdOJ" alt="" width="400px">

The adjacency matrix for the graph above would be as follows:

<img src="https://drive.google.com/uc?export=view&id=1WfhIOqlIFqKhrmqJVOeNa2dO5Xd5Ov5T" alt="" width="400px">

Let's look at some code!

In [2]:
# Define a Graph class for adjacency matrix representation
class Graph(object):

    # Initialize the matrix
    def __init__(self, num_nodes):
        self.adj_matrix = []
        for _ in range(num_nodes):
            self.adj_matrix.append([0 for _ in range(num_nodes)])  # Initialize with zeros
        self.num_nodes = num_nodes  # Number of nodes in the graph

    # Add edges to the matrix
    def add_edge(self, node1, node2):
        if node1 == node2:
            print("Same vertex %d and %d" % (node1, node2))
            return  # Do not record a self-loop
        self.adj_matrix[node1][node2] = 1  # Mark edge between node1 and node2
        self.adj_matrix[node2][node1] = 1  # Mark edge between node2 and node1 (undirected)

    # Remove edges from the matrix
    def remove_edge(self, node1, node2):
        if self.adj_matrix[node1][node2] == 0:
            print("No edge between %d and %d" % (node1, node2))
            return
        self.adj_matrix[node1][node2] = 0  # Remove edge between node1 and node2
        self.adj_matrix[node2][node1] = 0  # Remove edge between node2 and node1 (undirected)

    # Get the number of nodes in the graph
    def __len__(self):
        return self.num_nodes

    # Print the adjacency matrix
    def print_matrix(self):
        for row in self.adj_matrix:
            for val in row:
                print('{:4}'.format(val), end='')  # Print each value with formatting
            print()  # Move to the next line after a row is printed


# Main function
def main():
    # Create a graph with 6 nodes
    graph = Graph(6)

    # Add edges to the graph
    graph.add_edge(0, 1)
    graph.add_edge(0, 2)
    graph.add_edge(1, 3)
    graph.add_edge(2, 4)
    graph.add_edge(3, 5)
    graph.add_edge(4, 5)

    # Print the adjacency matrix
    graph.print_matrix()


# Run the main function if the script is executed directly
if __name__ == '__main__':
    main()


   0   1   1   0   0   0
   1   0   0   1   0   0
   1   0   0   0   1   0
   0   1   0   0   0   1
   0   0   1   0   0   1
   0   0   0   1   1   0


### Facts about Adjacency Matrix:

* Basic operations such as adding an edge, removing an edge, and checking whether there is an edge from vertex `i` to vertex `j` take constant time, i.e. `O(1)`.
* This representation uses a `V x V` matrix, so the space required is `O(|V|^2)` regardless of how many edges exist. Many real-world graphs are sparse, meaning they have far fewer edges than `|V|^2`, which is why adjacency lists are often the better choice.
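
As a rough illustration of both facts, the sketch below rebuilds the 6-vertex example as a plain list of lists and compares the number of stored cells with the number of non-zero entries. The names `has_edge`, `cells`, and `nonzero` are chosen for this example only.

```python
# The 6-vertex example from the code cell above, as a plain list of lists.
n = 6
edges = [(0, 1), (0, 2), (1, 3), (2, 4), (3, 5), (4, 5)]

matrix = [[0] * n for _ in range(n)]
for i, j in edges:
    matrix[i][j] = 1
    matrix[j][i] = 1   # undirected: mirror the entry

def has_edge(i, j):
    """Edge lookup is a single index operation, independent of the graph size."""
    return matrix[i][j] == 1

cells = n * n                               # storage grows as |V|^2
nonzero = sum(sum(row) for row in matrix)   # two entries per undirected edge

print(has_edge(0, 1), has_edge(0, 5))
print(f"{nonzero} of {cells} cells are non-zero")
```

Only a third of the cells carry information here, and the fraction shrinks further as a sparse graph grows.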


### **Adjacency List**

In this representation, an array of lists stores the edges of the graph. The array's size equals the number of vertices (i.e., `n`). Each index in this array represents a specific vertex in the graph. The entry at index `i` of the array holds a linked list of the vertices that are adjacent to vertex `i`.

Assuming there are `n` vertices in the graph, we create an array of lists of size `n` called `List[n]`.

* `List[0]` contains all nodes connected to vertex 0.
* `List[1]` contains all nodes connected to vertex 1 and so on.


For the same graph we used for the adjacency matrix, here's the adjacency list representation:

<img src="https://drive.google.com/uc?export=view&id=13fL982jP0NDNkHvSibBDQ2pRFWUse0xb" alt="" width="500px">







In [8]:
# Adjacency List representation in Python

# Class to represent a node in the adjacency list
class AdjacencyNode:
    def __init__(self, vertex_value):
        self.vertex = vertex_value  # Value of the vertex/node
        self.next_node = None       # Pointer to the next node in the linked list


# Class to represent a graph using adjacency list
class Graph:
    def __init__(self, num_vertices):
        self.num_vertices = num_vertices         # Number of vertices (nodes) in the graph
        self.adjacency_list = [None] * self.num_vertices  # Initialize the array to hold adjacency lists

    # Method to add edges to the graph
    def add_edge(self, source_vertex, destination_vertex):
        # Create a new node for the destination vertex and add it to the source's list
        destination_node = AdjacencyNode(destination_vertex)
        destination_node.next_node = self.adjacency_list[source_vertex]
        self.adjacency_list[source_vertex] = destination_node

        # Create a new node for the source vertex and add it to the destination's list
        source_node = AdjacencyNode(source_vertex)
        source_node.next_node = self.adjacency_list[destination_vertex]
        self.adjacency_list[destination_vertex] = source_node

    # Method to print the graph
    def print_adjacency_list(self):
        for vertex_index in range(self.num_vertices):
            print("Vertex " + str(vertex_index) + ":", end="")
            current_node = self.adjacency_list[vertex_index]
            while current_node:
                print(" -> {}".format(current_node.vertex), end="")
                current_node = current_node.next_node
            print(" \n")


if __name__ == "__main__":
    num_vertices = 5

    # Create a graph and add edges
    graph = Graph(num_vertices)
    graph.add_edge(0, 1)
    graph.add_edge(0, 2)
    graph.add_edge(0, 3)
    graph.add_edge(1, 2)

    # Print the adjacency list representation of the graph
    graph.print_adjacency_list()


Vertex 0: -> 3 -> 2 -> 1 

Vertex 1: -> 2 -> 0 

Vertex 2: -> 1 -> 0 

Vertex 3: -> 0 

Vertex 4: 



### Facts about Adjacency List:

* An adjacency list is efficient with respect to memory because it stores only the edges that exist, so the space required is `O(|V| + |E|)`.
* Checking whether an edge from vertex `i` to vertex `j` exists requires scanning the list of `i`. A vertex can have at most `|V| - 1` neighbours, so in the worst case this check takes `O(|V|)` time.
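
The same trade-off can be seen in a compact sketch that uses built-in Python lists instead of hand-built linked nodes (an assumption made here for brevity). The edges match the 5-vertex example above, and `has_edge` and `stored_entries` are illustrative names.

```python
# Adjacency list for the 5-vertex example, using Python lists as the per-vertex lists.
num_vertices = 5
edges = [(0, 1), (0, 2), (0, 3), (1, 2)]

adjacency = [[] for _ in range(num_vertices)]
for u, v in edges:
    adjacency[u].append(v)
    adjacency[v].append(u)   # undirected: record the edge on both endpoints

def has_edge(u, v):
    """Scans the neighbour list of u, so the cost grows with the degree of u."""
    return v in adjacency[u]

stored_entries = sum(len(neighbours) for neighbours in adjacency)

print(adjacency)
print(has_edge(0, 3), has_edge(3, 4))
print(stored_entries)   # two entries per undirected edge
```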


## <a name="traversals"></a> **Graph Traversals**


The goal of a graph traversal, generally, is to find all nodes reachable from a given source node. In an undirected graph we follow all edges; in a directed graph we follow only the outgoing edges.
Two of the most commonly used algorithms for traversal are ***Depth-First Search (DFS)*** and ***Breadth-First Search (BFS)***.
Let's look into each of these algorithms in more detail.

### **Breadth-First Search (BFS)**

Breadth-First Search (BFS) explores a graph level by level. It starts from a selected node (usually called the source node) and systematically visits all nodes that are reachable from the source node, in breadth-first order.  
BFS **guarantees** that nodes are visited level by level, meaning all nodes at distance 1 from the source are visited before moving on to nodes at distance 2, and so on. This property makes BFS suitable for finding the shortest path in unweighted graphs.  

BFS is commonly used in finding shortest paths, network routing protocols, social network analysis, and more.
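
The shortest-path property can be sketched by recording, for each node, the distance at which BFS first reaches it. The function name `bfs_distances` and the example graph are hypothetical and chosen only for illustration.

```python
from collections import deque

def bfs_distances(graph, source):
    """Return the fewest-edges distance from source to every reachable node."""
    distance = {source: 0}
    queue = deque([source])
    while queue:
        current = queue.popleft()
        for neighbor in graph[current]:
            if neighbor not in distance:   # the first visit is along a shortest path
                distance[neighbor] = distance[current] + 1
                queue.append(neighbor)
    return distance

# Hypothetical undirected graph, with every edge listed in both directions.
example = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C", "E"],
    "E": ["D"],
}
print(bfs_distances(example, "A"))
```

Node `D` can be reached through either `B` or `C`, but it is assigned its distance only once, when the first of the two paths reaches it.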

#### Implementation

To implement BFS, we use a queue to keep track of nodes to be visited. The queue ensures that nodes are visited in the order they were added, which is crucial for maintaining the breadth-first exploration order.
Consider the following undirected graph:

<img src="https://drive.google.com/uc?export=view&id=1zKamWjh9Tno0o9rItUXJkzKRVlzYsiT2" alt="" width="400px">


In [16]:
graph = {
    'A': ['B', 'C'],
    'B': ['D', 'E'],
    'C': ['F'],
    'D': [],
    'E': [],
    'F': []
}

from collections import deque

visited = set()   # Set to track visited nodes (constant-time membership checks)
queue = deque()   # Queue of nodes waiting to be explored

def bfs(visited, graph, node):
    visited.add(node)   # Mark the start node as visited
    queue.append(node)  # Enqueue the start node

    while queue:  # Loop to visit nodes in BFS order
        current_node = queue.popleft()  # Dequeue the front node in constant time
        print(current_node, end=" ")    # Print the visited node

        # Explore neighbors of the current node
        for neighbor in graph[current_node]:
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(neighbor)  # Enqueue unvisited neighbors

# Driver Code
print("Following is the Breadth-First Search traversal order:")
bfs(visited, graph, 'A')  # Call the BFS function starting from node 'A'

# Explanation:
# The graph dictionary represents the adjacency list of the graph. Each key is a node, and its corresponding value is a list of neighbors.
# The visited set is used to keep track of visited nodes.
# The queue (a deque) is used to maintain the order in which nodes are explored.
# The bfs function implements the BFS algorithm. It starts by marking the initial node as visited and enqueuing it in the queue. Then, it enters a loop that continues as long as the queue is not empty.
# In each iteration of the loop, the first node is dequeued from the queue, printed, and then all its unvisited neighbors are enqueued and marked as visited.
# The code then prints the nodes visited in the BFS order.



Following is the Breadth-First Search traversal order:
A B C D E F 

#### **Time and Space complexity**

**Time**
* The time complexity is determined by the number of vertices and edges in the graph.
* In the worst case, BFS visits every vertex once and examines every edge.

Therefore, the time complexity of BFS is `O(V + E)`, where `V` represents the number of vertices and `E` represents the number of edges in the graph. This bound assumes that queue operations and visited checks take constant time.

When the graph is not stored explicitly but generated as the search proceeds, as in state-space search, the cost is often expressed instead as `O(b^d)`, where `b` is the branching factor and `d` is the depth reached by the search.

**Space**

* The space complexity of BFS depends on the maximum number of vertices in the queue at any given time.
* In the worst case, for example when every other vertex is adjacent to the source, almost all vertices are in the queue at the same time. Therefore, the space complexity can be stated as `O(V)`.


### **Depth-First Search (DFS)**

 Depth-First Search (DFS) is another graph traversal algorithm, similar to Breadth-First Search (BFS), but it explores the graph in a different manner. Instead of visiting all neighbors at a given depth level before moving to the next level, DFS focuses on exploring as far down a branch as possible before backtracking.



#### Implementation

DFS uses a stack to remember the nodes still to be explored; in the recursive implementation below, the call stack plays that role. The algorithm keeps moving deeper into the graph until it reaches a dead end (a node with no unvisited neighbors), at which point it backtracks to a previous node and explores other branches.

Using the same graph as before, let's look at how DFS is implemented and at the resulting traversal order:
<img src="https://drive.google.com/uc?export=view&id=1zKamWjh9Tno0o9rItUXJkzKRVlzYsiT2" alt="" width="400px">


In [17]:
graph = {
    'A': ['B', 'C'],
    'B': ['D', 'E'],
    'C': ['F'],
    'D': [],
    'E': [],
    'F': []
}

visited = set()  # Using a set for efficient membership checking

def dfs(graph, node):
    if node not in visited:
        print(node, end=" ")  # Print the current node
        visited.add(node)      # Mark the current node as visited

        for neighbor in graph[node]:
            dfs(graph, neighbor)  # Recursively visit unvisited neighbors

# Driver Code
print("Following is the Depth-First Search:")
dfs(graph, 'A')  # Call the DFS function starting from node 'A'


Following is the Depth-First Search:
A B D E C F 
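
The recursive version relies on Python's call stack. A minimal sketch of the same traversal with an explicit stack might look as follows; the function name `dfs_iterative` is an assumption for this example, and the graph is the one used in the cell above.

```python
def dfs_iterative(graph, start):
    """Depth-first traversal using an explicit stack; returns the visit order."""
    visited = set()
    order = []
    stack = [start]
    while stack:
        node = stack.pop()   # take the most recently added node
        if node in visited:
            continue
        visited.add(node)
        order.append(node)
        # Push neighbours in reverse so the first neighbour is explored first,
        # which matches the order produced by the recursive version.
        for neighbor in reversed(graph[node]):
            if neighbor not in visited:
                stack.append(neighbor)
    return order

tree = {"A": ["B", "C"], "B": ["D", "E"], "C": ["F"], "D": [], "E": [], "F": []}
print(dfs_iterative(tree, "A"))
```

An explicit stack avoids Python's recursion limit, which can matter on graphs with very long paths.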

#### **Time and Space complexity**

**Time**
* In DFS, the time complexity is once again determined by the number of vertices and edges in the graph.
* In the worst case, DFS visits every vertex and examines every edge in the graph.

Therefore, the time complexity of DFS is `O(V + E)`, where `V` is the number of vertices and `E` is the number of edges.

**Space**

* The space complexity of DFS depends on the maximum depth of recursion.
* In the worst case, if the graph is one long path, the recursion can go as deep as the number of vertices.

Therefore, the space complexity of DFS is `O(V)`, where `V` represents the number of vertices in the graph.

## <a name="dags"></a> **Directed Acyclic Graphs**



A Directed Acyclic Graph (DAG) is a directed graph that contains no cycles. Starting from any vertex and following the direction of the edges, you can never return to that same vertex.

<img src="https://drive.google.com/uc?export=view&id=1Psm4U0svZr3RgXKzaADmDV7CMGvRieYH" alt="" width="400px">

The key characteristics of a DAG are:

1. **Directed Edges**: Each edge in a DAG has a specific direction, indicating a one-way relationship from one vertex to another.

2. **Acyclic**: A fundamental property of a DAG is that it contains no cycles
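
Whether a directed graph satisfies the acyclic property can be checked with a depth-first search that tracks which vertices are still on the current path. This is a sketch under the assumption that every vertex appears as a key in the adjacency dictionary; `has_cycle` and both example graphs are hypothetical.

```python
def has_cycle(graph):
    """Return True if the directed graph (dict of adjacency lists) contains a cycle."""
    UNVISITED, IN_PROGRESS, DONE = 0, 1, 2
    state = {vertex: UNVISITED for vertex in graph}

    def visit(vertex):
        state[vertex] = IN_PROGRESS
        for neighbor in graph[vertex]:
            if state[neighbor] == IN_PROGRESS:   # edge back into the current path
                return True
            if state[neighbor] == UNVISITED and visit(neighbor):
                return True
        state[vertex] = DONE                     # fully explored, no cycle through here
        return False

    return any(state[v] == UNVISITED and visit(v) for v in graph)

dag = {"A": ["B"], "B": ["C", "D"], "C": ["E"], "D": ["E"], "E": []}
cyclic = {"A": ["B"], "B": ["C"], "C": ["A"]}
print(has_cycle(dag), has_cycle(cyclic))
```

In `dag`, vertex `E` is reached twice (through `C` and through `D`), but that is not a cycle because `E` is already marked as done on the second visit.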


### **Topological Sorting**

Topological sorting provides a linear ordering of the vertices that respects the direction of edges, ensuring that if there's an edge from vertex A to vertex B, vertex A will appear before vertex B in the sorted order.

 If the graph is a DAG, it's guaranteed that a valid topological order exists. If the graph has cycles (is not acyclic), it's impossible to find a topological order, as cycles introduce circular dependencies that cannot be resolved.

 Let's look at how we can implement topological sorting in code for the following graph!

 <img src="https://drive.google.com/uc?export=view&id=1Mx7hHwfZGQkFFGJ7vRqjhHG7mwvj9uwc" alt="" width="700px">



In [18]:
from collections import defaultdict

class DirectedGraph:
    def __init__(self):
        self.adj_list = defaultdict(list)  # Create an adjacency list to represent the graph

    def add_edge(self, source, destination):
        self.adj_list[source].append(destination)  # Add an edge from source to destination

        # Ensure destination vertex is initialized in the adjacency list
        if destination not in self.adj_list:
            self.adj_list[destination] = []

    def topological_sort_util(self, vertex, visited, stack):
        visited[vertex] = True  # Mark the current vertex as visited

        # Visit all neighbors of the current vertex
        for neighbor in self.adj_list[vertex]:
            if not visited[neighbor]:  # If the neighbor is not visited yet
                self.topological_sort_util(neighbor, visited, stack)  # Recursively visit the neighbor

        stack.append(vertex)  # Add the current vertex to the stack after all its neighbors are visited

    def topological_sort(self):
        visited = {vertex: False for vertex in self.adj_list}  # Mark all vertices as not visited
        stack = []  # Initialize an empty stack for the result

        # Iterate through all vertices to perform DFS for topological sorting
        for vertex in self.adj_list:
            if not visited[vertex]:  # If the vertex is not visited yet
                self.topological_sort_util(vertex, visited, stack)  # Perform DFS on the vertex

        return stack[::-1]  # Reverse the stack to obtain the topological order

# Example usage
graph = DirectedGraph()
graph.add_edge('A', 'B')
graph.add_edge('B', 'C')
graph.add_edge('B', 'D')
graph.add_edge('C', 'E')
graph.add_edge('D', 'E')
graph.add_edge('E', 'F')

sorted_order = graph.topological_sort()
print("Topological Sorting:", sorted_order)


Topological Sorting: ['A', 'B', 'D', 'C', 'E', 'F']
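
The implementation above is based on DFS. A different, commonly used approach is Kahn's algorithm, which repeatedly removes vertices that have no remaining incoming edges. The sketch below is an alternative to the method above rather than a restatement of it; the function name `kahn_topological_sort` is an assumption, and it returns `None` when the graph has a cycle and therefore no valid order.

```python
from collections import deque

def kahn_topological_sort(graph):
    """Return a topological order, or None if the graph contains a cycle."""
    in_degree = {vertex: 0 for vertex in graph}
    for neighbors in graph.values():
        for neighbor in neighbors:
            in_degree[neighbor] += 1

    # Vertices with no incoming edges have no unmet prerequisites.
    ready = deque(v for v in graph if in_degree[v] == 0)
    order = []
    while ready:
        vertex = ready.popleft()
        order.append(vertex)
        for neighbor in graph[vertex]:
            in_degree[neighbor] -= 1
            if in_degree[neighbor] == 0:
                ready.append(neighbor)

    # If some vertices were never released, they lie on or behind a cycle.
    return order if len(order) == len(graph) else None

dag = {"A": ["B"], "B": ["C", "D"], "C": ["E"], "D": ["E"], "E": ["F"], "F": []}
print(kahn_topological_sort(dag))
print(kahn_topological_sort({"X": ["Y"], "Y": ["X"]}))
```

For this graph Kahn's algorithm places `C` before `D`, whereas the DFS version above places `D` before `C`. Both are valid, since a DAG can have more than one topological order.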


## <a name="applications"></a> **Real World Applications and Resources**



### 1. Adjacency Matrices and Adjacency Lists

 - **Social Networks**: In social network analysis, an adjacency matrix can represent relationships between individuals. The matrix can indicate whether two individuals are friends or connected in some way. It's used for analyzing network properties, finding influential individuals, and understanding social dynamics.
 - **Web Page Linkages**: For web pages, an adjacency matrix can represent links between pages. This is used in web ranking algorithms (like PageRank) to determine the importance of a web page based on its links and the links of pages that link to it.

### 2. Breadth-First Search

 - **Shortest Path Finding**: BFS can be used to find the shortest path between two nodes in an unweighted graph. This is useful in navigation systems, map routing applications, and logistics planning.
 - **Broadcasting and Networking**: BFS can simulate broadcasting in networks. It helps in broadcasting messages efficiently to all nodes within a certain distance from the source node. Also, BFS can be used to find the shortest path for data packets to traverse through a network of routers or nodes.

### 3. Depth-First Search

  - **Maze Solving**: DFS can be used to solve mazes by exploring possible paths deeply before backtracking. It's commonly applied in robotics and automated systems to navigate through complex environments.

  - **Topological Sorting**: In project scheduling and task dependencies, DFS can be used to perform topological sorting, helping to determine the order in which tasks should be executed.

### Resources/Tools for learning
- Depth First Search Visualization tool developed by USF is a great tool to help understand DFS in a visual manner: https://www.cs.usfca.edu/~galles/visualization/DFS.html
- Breadth First Search visualization tool developed by USF is great too: https://www.cs.usfca.edu/~galles/visualization/BFS.html
- Adjacency Matrix tool: https://graphonline.ru/en/create_graph_by_matrix


 ## **Quiz**

Time to test your knowledge with a small quiz!

## <a name="refs"></a>**References**



1. Kleinberg, J., & Tardos, É. (2021). Algorithm Design. Pearson.
2. Cormen, T. H., Leiserson, C. E., Rivest, R. L., & Stein, C. (2009). Introduction to Algorithms. MIT Press.
3. [GeeksForGeeks](https://www.geeksforgeeks.org/)
4. [Programiz](https://www.programiz.com/dsa/graph)
5. [Cornell CS2112 material](https://www.cs.cornell.edu/courses/cs2112/2012sp/lectures/lec24/lec24-12sp.html)


