# <center><b> Graphs </b></center>
Graphs are a non-linear data structure in which a problem is represented as a network: a set of nodes connected by edges, like a telephone network or a social network. For example, the nodes of a graph can represent different cities, while the edges represent the links between them.

A graph is a finite set of vertices (also known as nodes) together with a set of edges, where each edge links two vertices (or, in the case of a loop, a vertex to itself). Formally, a graph $G$ is an ordered pair of a set $V$ of vertices and a set $E$ of edges, written $G = (V, E)$.


<center><img src="./img/92.png" width="200"/></center>


<u> Notations </u>
- **Node or vertex**: A point or node in a graph is called a vertex. In the preceding diagram, the vertices or nodes are A, B, C, D, and E and are denoted by a dot.

- **Edge**: This is a connection between two vertices. The line connecting A and B is an example of an edge. 
- **Loop**: An edge that starts and ends at the same node forms a loop, e.g. the loop at the D node.
- **Degree of a vertex/node**: The total number of edges that are incident on a given vertex is called the degree of that vertex. For example, the degree of the B vertex in the previous diagram is 4.
- **Adjacency**: This refers to the connection(s) between any two nodes; thus, if there is a connection between any two vertices or nodes, then they are said to be adjacent to each other. For example, the C node is adjacent to the A node because there is an edge between them. 
- **Path**: A sequence of vertices and edges between any two nodes represents a path. For example, CABE represents a path from the C node to the E node. 
- **Leaf vertex (also called pendant vertex)**: A vertex or node is called a leaf vertex or pendant vertex if its degree is exactly one.
- **Distance**: The distance between two nodes is the smallest number of edges on any path that connects them.
- **Diameter**: The diameter of a graph is the maximum distance between any pair of nodes in the graph.
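
The notions of degree, adjacency, and leaf vertex can be checked on a small example. The sketch below assumes a hypothetical undirected graph stored as a dictionary of neighbor sets; the vertex names, the edges, and the helper functions are made up for illustration and do not reproduce the diagram above.

```python
# Hypothetical undirected graph (NOT the one in the figure):
# each key maps to the set of its neighbors, with no self-loops.
example_graph = {
    "A": {"B", "C"},
    "B": {"A", "C", "D"},
    "C": {"A", "B"},
    "D": {"B"},
}

def degree(graph, node):
    """Number of edges incident on node (assumes no self-loops)."""
    return len(graph[node])

def are_adjacent(graph, u, v):
    """True if there is an edge between u and v."""
    return v in graph[u]

def leaf_vertices(graph):
    """All vertices whose degree is exactly one."""
    return sorted(n for n in graph if degree(graph, n) == 1)

print(degree(example_graph, "B"))             # 3
print(are_adjacent(example_graph, "A", "D"))  # False
print(leaf_vertices(example_graph))           # ['D']
```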

## <center><b> Directed and Undirected graphs </b></center>

If the connecting edges in a graph are undirected, then the graph is called an undirected graph, and if the connecting edges in a graph are directed, then it is called a directed graph. 

An **undirected graph** simply represents edges as lines between the nodes. 
There is no additional information about the relationship between the nodes, other than the fact that they are connected.

For **directed graphs**, the arrow on an edge indicates its direction.

In directed graphs, we distinguish between the **in-degree** and the **out-degree** of a node: the number of edges coming into the node and the number of edges going out of it.
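
As a minimal sketch, in-degree and out-degree can be counted directly from a list of directed edges. The edge list below is a made-up example, not the graph shown in the figure.

```python
from collections import Counter

# Hypothetical directed edges (u, v), each meaning an arrow from u to v.
directed_edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "A")]

# The out-degree counts edges leaving a node, the in-degree counts edges entering it.
out_degree = Counter(u for u, _ in directed_edges)
in_degree = Counter(v for _, v in directed_edges)

for node in sorted(set(out_degree) | set(in_degree)):
    print(node, "in:", in_degree[node], "out:", out_degree[node])
```

A `Counter` returns 0 for nodes that never appear on one side of an edge, so nodes without incoming or outgoing edges need no special handling.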

<center><img src="./img/94.png" width="500"/></center>


#### <center><b> Special Cases </b></center>
In the following, $n$ is the number of nodes and $m$ is the number of edges.

- **Complete graph**: a graph with an edge between every pair of nodes
- **Sparse graph**: a graph with few edges, e.g. $m = O(n)$ or $m = O(n \log n)$
- **Dense graph**: a graph with many edges, $m = \Omega(n^2)$
- **Unrooted tree**: a connected graph with $m = n - 1$
- **Rooted tree**: a connected graph with $m = n - 1$ in which one node is designated as the *root*
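
The edge counts in these definitions can be illustrated with a short sketch. It assumes undirected graphs without loops, stored as a set of two-element `frozenset` edges; the helper names are hypothetical. Note that having $n - 1$ edges is only a necessary condition for a tree, since the graph must also be connected, which this sketch does not check.

```python
from itertools import combinations

def max_undirected_edges(n):
    """Number of edges in a complete undirected graph on n nodes."""
    return n * (n - 1) // 2

def is_complete(n, edges):
    """True if every pair of the n nodes is joined by an edge."""
    return len(edges) == max_undirected_edges(n)

def has_tree_edge_count(n, edges):
    """True if m = n - 1 (connectivity is NOT checked here)."""
    return len(edges) == n - 1

nodes = ["A", "B", "C", "D"]
n = len(nodes)

# Complete graph: one edge for every pair of nodes.
complete_edges = {frozenset(pair) for pair in combinations(nodes, 2)}
# A simple path A-B-C-D, which is a tree.
path_edges = {frozenset(("A", "B")), frozenset(("B", "C")), frozenset(("C", "D"))}

print(len(complete_edges), is_complete(n, complete_edges))  # 6 True
print(len(path_edges), has_tree_edge_count(n, path_edges))  # 3 True
```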


<center><img src="./img/95.png" width="500"/></center>


<u> Graph operations </u>

<center><img src="./img/93.png" width="300"/></center>

## <center><b> Graph representation </b></center>

A graph representation technique means how we store the graph in memory, i.e., how we store the vertices, edges, and weights (if the graph is a weighted graph). Graphs can be represented with two methods, i.e. (1) an **adjacency list** based on a linked list, and (2) an **adjacency matrix**.


An adjacency list representation is based on a linked list. We represent the graph by maintaining, for every vertex (or node) of the graph, a list of its neighbors (also called adjacent nodes).


In an adjacency matrix representation of a graph, we maintain a matrix that represents which node is adjacent to which other node in the graph; i.e., the adjacency matrix has the information of every edge in the graph, which is represented by cells of the matrix.


An adjacency list is preferable when we expect the graph to be sparse, i.e. to have relatively few edges.

The adjacency matrix is preferable when we expect the graph to have a lot of edges, so that the matrix will be dense.

## <center><b> Adjacency Lists </b></center>

In this representation, all the nodes directly connected to a node x are listed in its adjacency list.

The graph below is represented by displaying the adjacency list of every node of the graph. Two nodes, A and B, in the graph shown in Figure 9.7 are said to be adjacent if there is a direct connection between them:

<center><img src="./img/96.png" width="230"/></center>

A linked list can be used to implement the adjacency list. To represent the graph, we need as many linked lists as there are nodes in the graph. At each index, the nodes adjacent to that vertex are stored.

<center><img src="./img/97.png" width="230"/></center>


Here, the first node represents the A vertex of the graph, with its adjacent nodes being B and C.

The second node represents the B vertex of the graph, with its adjacent nodes E, C, and A. Similarly, the other vertices of the graph, C, E, and F, are represented with their adjacent nodes.

In Python, a dictionary is a convenient way to implement this representation: each vertex is a key, and its value holds the adjacent nodes.
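
A minimal sketch of the dictionary-based adjacency list follows. The neighbor lists of A and B are the ones described above; the lists for C, E, and F are assumptions chosen so that the undirected graph is consistent, and they may differ from the figure.

```python
# Adjacency list as a dictionary: vertex -> list of adjacent vertices.
adjacency = {
    "A": ["B", "C"],
    "B": ["E", "C", "A"],
    "C": ["A", "B", "E", "F"],  # assumed, not stated in the text
    "E": ["B", "C"],            # assumed, not stated in the text
    "F": ["C"],                 # assumed, not stated in the text
}

# Neighbors of a vertex are a single dictionary lookup.
print(adjacency["B"])         # ['E', 'C', 'A']

# Checking an edge scans the neighbor list of one endpoint.
print("C" in adjacency["A"])  # True

# In an undirected graph every edge is stored twice, once per endpoint.
edge_count = sum(len(neighbors) for neighbors in adjacency.values()) // 2
print(edge_count)             # 6
```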


<center><img src="./img/98.png" width="230"/></center>

**Pros**:
- Uses less space
- Flexible: nodes can be complex objects

**Cons**:
- Checking if an edge exists is $O(n)$ in the worst case, generally slower than with a matrix (we must scan the neighbor list of a node)
- Getting all the incoming edges of a node is slow, since every neighbor list has to be inspected

In [2]:
class Graph_AdjacencyList:
    
    def __init__(self):
        """initializer, the nodes are private"""
        self.__nodes = dict()
        
    def __len__(self):
        """return the size of the graph
           accessible through len(Graph)
        """
        return len(self.__nodes)
    
    def V(self):
        """returns the nodes"""
        return self.__nodes.keys()
    
    def node_iterator(self):
        """a generator of nodes to access all of them once"""
        for n in self.__nodes.keys():
            yield n
            
    def edge_iterator(self):
        """a generator of edges to access all of them once"""
        for u in self.__nodes:
            for v in self.__nodes[u]:
                yield (u,v, self.__nodes[u][v])
                
    def insert_node(self, node):
        """adds the node to the graph"""
        if node not in self.__nodes:
            self.__nodes[node] = dict()
            
    def insert_edge(self, u, v, weight=1):
        """adds the edge to the graph"""
        self.insert_node(u)  #start node
        self.insert_node(v)  # end node
        self.__nodes[u][v] = weight # add the edge

## <center><b> Adjacency Matrix </b></center>

Another approach to representing a graph is to use an adjacency matrix, which records the nodes and their interconnections through edges. With this method, a matrix of dimensions $V \times V$ represents the graph, where each cell denotes a possible edge between a pair of nodes.

A matrix is a two-dimensional array. So, the idea here is to fill the cells of the matrix with a 1 or a 0, depending on whether two nodes are connected by an edge or not.

An adjacency matrix can be built from a given adjacency list.

The adjacency matrix is a 2D array of size $V \times V$, where $V$ is the number of vertices in the graph. Let the 2D array be $adj$; a slot $adj[i][j] = 1$ indicates that there is an edge from vertex $i$ to vertex $j$. The adjacency matrix of an undirected graph is always symmetric. An adjacency matrix can also be used to represent weighted graphs: if $adj[i][j] = w$, then there is an edge from vertex $i$ to vertex $j$ with weight $w$.
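
As a sketch of how a matrix can be derived from an adjacency list, the hypothetical helper `to_adjacency_matrix` below assigns each node an index (in sorted order, an arbitrary choice) and sets a cell to 1 for every listed edge. The small input graph is made up for illustration.

```python
def to_adjacency_matrix(adjacency):
    """Build a 0/1 matrix from a dict mapping node -> list of neighbors."""
    nodes = sorted(adjacency)                       # fix an order for rows and columns
    index = {node: i for i, node in enumerate(nodes)}
    size = len(nodes)
    matrix = [[0] * size for _ in range(size)]      # independent rows, all zeros
    for u, neighbors in adjacency.items():
        for v in neighbors:
            matrix[index[u]][index[v]] = 1          # edge from u to v
    return nodes, matrix

# Hypothetical undirected graph: A is connected to B and C.
small_graph = {"A": ["B", "C"], "B": ["A"], "C": ["A"]}
nodes, matrix = to_adjacency_matrix(small_graph)

print(nodes)   # ['A', 'B', 'C']
for row in matrix:
    print(row)
# [0, 1, 1]
# [1, 0, 0]
# [1, 0, 0]
```

Because the input graph is undirected (every edge is listed from both endpoints), the resulting matrix is symmetric.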

<center><img src="./img/99.png" width="260"/></center>

There are multiple implementation possibilities in Python:
- list/set/dict
- libraries such as networkx, igraph

**Pros**:
- Flexible, can be used to put weights on edges
- Quick to check if an edge exists, $O(1)$
- In undirected graphs, the matrix is symmetric, so we can save space by only storing the upper or lower triangle

**Cons**:
- Uses a lot of space (an $n \times n$ matrix no matter how many edges)

In [3]:
class GraphAsAdjacencyMatrix:
    def __init__(self):
        # would be better a set, but I need an index
        self.__nodes = list()
        self.__matrix = list()
        
    def __len__(self):
        return len(self.__nodes)
    
    def nodes(self):
        return self.__nodes
    
    def matrix(self):
        return self.__matrix
    
    def __str__(self):
        return str(self.__matrix)
    
    def insertNode(self):
        self.__nodes.append(len(self.__nodes))
        for row in self.__matrix:
            row.append(0)
        self.__matrix.append([0]*len(self.__nodes))
        
    def insertEdge(self, u, v, weight=1):
        self.__matrix[u][v] = weight
        
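    def deleteEdge(self, u, v):
        self.__matrix[u][v] = 0
        
    def deleteNode(self, u):
        # remove row u and column u from the matrix
        self.__matrix.pop(u)
        for row in self.__matrix:
            row.pop(u)
        # nodes are identified by their index, so relabel them 0..n-1
        self.__nodes = list(range(len(self.__matrix)))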
            
    def adjacent(self, u, v):
        return self.__matrix[u][v] != 0
    
    def edges(self):
        for u in range(len(self.__nodes)):
            for v in range(len(self.__nodes)):
                if self.__matrix[u][v] != 0:
                    yield (u,v, self.__matrix[u][v])
                    

## <center><b> Complexity Recap </b></center>


**ADJACENCY MATRIX**
- Requires $O(n^2)$ space
- Checking if an edge exists is $O(1)$
- Looping through all the edges takes $O(n^2)$
- Ideal for dense graphs

**ADJACENCY LIST**
- Requires $O(n + m)$ space
- Checking if an edge exists is $O(n)$
- Looping through all the edges takes $O(n+m)$
- Ideal for sparse graphs
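
To make the space comparison concrete, here is a rough sketch that counts stored entries for both representations. It assumes an undirected graph in which the adjacency list stores each edge twice (once per endpoint); the function names and the example sizes are hypothetical, and constant factors such as pointer overhead are ignored.

```python
def matrix_cells(n):
    """Cells in an n x n adjacency matrix, regardless of the number of edges."""
    return n * n

def list_entries(n, m):
    """Entries in an adjacency list: one slot per node plus two per undirected edge."""
    return n + 2 * m

# Hypothetical sparse graph: a path on 1000 nodes has 999 edges.
n, m = 1000, 999
print(matrix_cells(n))     # 1000000
print(list_entries(n, m))  # 2998
```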

<hr>

## <center><b> Graph Traversal </b></center>

A graph traversal means to visit all the vertices of the graph while keeping track of which nodes or vertices have already been visited and which ones have not. A graph traversal algorithm is efficient if it traverses all the nodes of the graph in the minimum possible time. 

Graph traversal, also known as graph search, is quite similar to tree traversal algorithms such as **preorder**, **inorder**, **postorder**, and level order; as with them, in a graph search algorithm we start at a node and traverse through edges to all other nodes in the graph.

A common strategy of graph traversal is to follow a path until a dead end is reached, then traverse back up until there is a point where we meet an alternative path. We can also iteratively move from one node to another in order to traverse the full graph or part of it.

## <center><b> Breadth-first traversal </b></center>

Breadth-first search (BFS) works very similarly to how a level order traversal algorithm works in a tree data structure. 

The BFS algorithm also works level by level; it starts by visiting the root node at level 0, and then all the nodes directly connected to the root node are visited at level 1. Each level 1 node has a distance of 1 from the root node. After visiting all the nodes at level 1, the level 2 nodes are visited next. Likewise, the graph is traversed level by level until all the nodes are visited.


A **queue** data structure is used to store the information of vertices that are to be visited in a graph. We begin with the starting node. Firstly, we visit that node, and then we look up all of its neighboring, or adjacent, vertices. We first visit these adjacent vertices one by one, while adding their neighbors to the list of vertices that are to be visited. We follow this process until we have visited all the vertices of the graph, ensuring that no vertex is visited twice.

We start by visiting the first node, i.e., the A node, and then we add all its adjacent vertices, B, C, and E, to the queue.

<center><img src="./img/100.png" width="260"/></center>

Once we have visited the A vertex, we next visit its first adjacent vertex, B, and add to the queue those vertices adjacent to B that are neither visited nor already in the queue. In this case, we have to add the D vertex (since B has two adjacent vertices, A and D, of which A is already visited), as shown in the following diagram:

<center><img src="./img/101.png" width="260"/></center>

Next, we visit the C vertex, and then we add its adjacent vertices, i.e., A, B, and D, to the queue. However, the A and B vertices are already visited, so we add only the D vertex to the queue, as shown in the following diagram:

<center><img src="./img/102.png" width="260"/></center>

Next, we visit the E vertex, and then we add its adjacent vertices, i.e., A, B, and D, to the queue. However, the A and B vertices are already visited, so we add only the D vertex to the queue, as shown in the following diagram:

<center><img src="./img/103.png" width="260"/></center>

Similarly, after visiting the E vertex, we visit the D vertex in the last step.

<center><img src="./img/104.png" width="260"/></center>

In [4]:
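# BFS on a graph stored as a dictionary: node -> list of neighbors
from collections import deque

class Graph:
    def __init__(self):
        self.graph = {}

    def add_edge(self, node, neighbors):
        # register all the neighbors of a node at once
        self.graph[node] = neighbors

    def bfs(self, start_node):
        # nodes are marked as visited when they are enqueued,
        # so no node enters the queue twice
        visited = {start_node}
        queue = deque([start_node])

        while queue:
            current_node = queue.popleft()
            print(current_node, end=' ')
            for neighbor in self.graph.get(current_node, []):
                if neighbor not in visited:
                    visited.add(neighbor)
                    queue.append(neighbor)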

# Example usage:
# Create a graph and add edges
graph = Graph()
graph.add_edge(1, [2, 3])
graph.add_edge(2, [4, 5])
graph.add_edge(3, [6])
graph.add_edge(4, [])
graph.add_edge(5, [7])
graph.add_edge(6, [])
graph.add_edge(7, [])

# Perform BFS starting from node 1
print("BFS starting from node 1:")
graph.bfs(1)    

BFS starting from node 1:
1 2 3 4 5 6 7 
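
The level-by-level behavior described earlier can also be made explicit by recording the level of each node, i.e. its distance from the start node. The function `bfs_levels` below is a hypothetical helper, sketched on the same example graph as the cell above.

```python
from collections import deque

def bfs_levels(graph, start):
    """Map every node reachable from start to its level (distance from start)."""
    level = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, []):
            if neighbor not in level:            # first time we reach this node
                level[neighbor] = level[node] + 1
                queue.append(neighbor)
    return level

# Same example graph as above: node -> list of neighbors.
example = {1: [2, 3], 2: [4, 5], 3: [6], 4: [], 5: [7], 6: [], 7: []}
levels = bfs_levels(example, 1)
print(levels)  # {1: 0, 2: 1, 3: 1, 4: 2, 5: 2, 6: 2, 7: 3}
```

Since BFS reaches each node along a path with the fewest edges, the recorded level equals the distance defined in the notations section; the largest value, here 3, is the greatest distance from the start node.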

## <center><b> Depth-first traversal </b></center>

The depth-first search (DFS) or traversal algorithm traverses the graph similarly to how the preorder traversal algorithm works in trees. In the DFS algorithm, we follow a particular path in the graph as deep as possible before exploring the alternatives. As such, child nodes are visited before sibling nodes.


In this, we start with the root node; firstly we visit it, and then we see all the adjacent vertices of the current node. We start visiting one of the adjacent nodes. If the edge leads to a visited node, we backtrack to the current node. And, if the edge leads to an unvisited node, then we go to that node and continue processing from that node. We continue the same process until we reach a dead end when there is no unvisited node; in that case, we backtrack to previous nodes, and we stop when we reach the root node while backtracking.


We start by visiting the A node, and then we look at the neighbors of the A vertex, then a neighbor of that neighbor, and so on. After visiting the A vertex, we visit one of its neighbors, B (in our example, we sort alphabetically; however, any neighbor can be added).

<center><img src="./img/105.png" width="260"/></center>

After visiting the B vertex, we look at another neighbor of A, that is, S, as there is no vertex connected to B that can be visited. Next, we look for the neighbors of the S vertex, which are the C and G vertices. We visit C, as shown:

<center><img src="./img/106.png" width="260"/></center>

After visiting the C node, we visit its neighboring vertices, D and E, as shown:

<center><img src="./img/107.png" width="260"/></center>

Similarly, after visiting the E vertex, we visit the H and G vertices, as shown:

<center><img src="./img/108.png" width="260"/></center>

Finally, we visit the F node:

<center><img src="./img/109.png" width="260"/></center>

In [5]:
class Graph:
    def __init__(self):
        self.graph = {}

    def add_edge(self, node, neighbors):
        self.graph[node] = neighbors

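    def dfs(self, start_node, visited=None):
        # the set of visited nodes is shared across the recursive calls
        if visited is None:
            visited = set()

        if start_node not in visited:
            print(start_node, end=' ')
            visited.add(start_node)

            # go as deep as possible along each neighbor before the next one
            for neighbor in self.graph.get(start_node, []):
                self.dfs(neighbor, visited)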

# Example usage:
# Create a graph and add edges
graph = Graph()
graph.add_edge(1, [2, 3])
graph.add_edge(2, [4, 5])
graph.add_edge(3, [6])
graph.add_edge(4, [])
graph.add_edge(5, [7])
graph.add_edge(6, [])
graph.add_edge(7, [])

# Perform DFS starting from node 1
print("DFS starting from node 1:")
graph.dfs(1)

DFS starting from node 1:
1 2 4 5 7 3 6 
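
The recursive version relies on the call stack to backtrack. As an alternative sketch, the same traversal can be written iteratively with an explicit stack, which avoids hitting Python's recursion limit on very deep graphs. The function `dfs_iterative` is a hypothetical helper; neighbors are pushed in reverse so that they are popped in their listed order.

```python
def dfs_iterative(graph, start):
    """Return the nodes reachable from start in depth-first (preorder) order."""
    visited = set()
    order = []
    stack = [start]
    while stack:
        node = stack.pop()
        if node in visited:
            continue                      # already handled through another path
        visited.add(node)
        order.append(node)
        # push in reverse so the first listed neighbor is explored first
        for neighbor in reversed(graph.get(node, [])):
            if neighbor not in visited:
                stack.append(neighbor)
    return order

# Same example graph as above: node -> list of neighbors.
example = {1: [2, 3], 2: [4, 5], 3: [6], 4: [], 5: [7], 6: [], 7: []}
print(dfs_iterative(example, 1))  # [1, 2, 4, 5, 7, 3, 6]
```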