This notebook implements the **Depth First Search (DFS)** algorithm, which is widely used for searching graphs. It begins at a starting node and goes deeper and deeper along one path until it finds the target node. When it reaches a dead end, it backtracks to the most recent node that still has unexplored neighbours and tries another path from there. On a finite graph, with visited nodes tracked, the algorithm is complete: it finds a path whenever one exists. It is not optimal, however, because the path it returns is not guaranteed to be the shortest one. Its time complexity and space complexity are O(V + E) and O(V) respectively, where V is the total number of nodes and E is the total number of edges.

Real-world applications of this algorithm include:

1. Path finding and navigation
2. Web crawling
3. Social network analysis
4. Compiler design
5. Robotics and autonomous vehicles
6. Game development
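
The "go deeper, then backtrack" behaviour described in the introduction can be illustrated with a minimal sketch that uses an explicit stack. The function name `dfs_order` and the toy graph are hypothetical and are not part of the notebook's dataset. The visited nodes are kept in a set here on the assumption that constant-time membership checks are what keep the search within O(V + E).

```python
def dfs_order(adjacency, start):
    """Return nodes in the order an iterative depth-first search visits them."""
    visited = set()      # set membership is O(1) on average
    order = []
    stack = [start]      # the most recently pushed node is explored first
    while stack:
        node = stack.pop()
        if node in visited:
            continue
        visited.add(node)
        order.append(node)
        # Push neighbours in reverse so the first listed neighbour is explored first.
        for neighbour in reversed(adjacency.get(node, [])):
            if neighbour not in visited:
                stack.append(neighbour)
    return order

# Hypothetical toy graph, not taken from the notebook's dataset.
toy_graph = {
    'A': ['B', 'C'],
    'B': ['D'],
    'C': ['D'],
    'D': [],
}
print(dfs_order(toy_graph, 'A'))   # ['A', 'B', 'D', 'C']
```

The search follows A, B, D to the end of the first branch before it backtracks and visits C.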

In [28]:
# Importing required modules to work with datasets.
import pandas as pd

# Reading data from the datasets.
cities_data = pd.read_csv('connected_cities.csv')   #Reading data for cities.
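
The loading step assumes a CSV file with two columns, `node1` and `node2`, where each row is one directed connection. The sketch below shows how such rows could turn into an adjacency list. The rows are hypothetical stand-ins for `connected_cities.csv`, and a plain dictionary of columns is used instead of a DataFrame so that the example runs without pandas or the real file.

```python
# Hypothetical rows standing in for connected_cities.csv.
# Each road is assumed to be listed once per direction.
sample = {
    'node1': ['Jodhpur', 'Bikaner', 'Jodhpur', 'Rajsamand'],
    'node2': ['Bikaner', 'Jodhpur', 'Rajsamand', 'Jodhpur'],
}

adjacency = {}
for node1, node2 in zip(sample['node1'], sample['node2']):
    adjacency.setdefault(node1, []).append(node2)   # one directed edge per row

directed_edges = sum(len(neighbours) for neighbours in adjacency.values())
print(adjacency)
print('nodes:', len(adjacency))            # nodes: 3
print('directed edges:', directed_edges)   # directed edges: 4
print('roads:', directed_edges // 2)       # roads: 2
```

Under the assumption that every road appears in both directions, the number of roads is half the number of directed edges.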

The cell below defines a Graph class that builds a graph from the CSV data read above. The class also provides other information about the graph, such as the total number of nodes. An object of this class serves as the graph in the algorithms that follow.

In [30]:
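# Creating graph class.
class Graph(object):
    '''A directed graph stored as an adjacency list (a dictionary mapping each node to a list of its neighbours).'''
    def __init__(self, dataset):
        '''Initializes the graph from a dataset with 'node1' and 'node2' columns.'''
        self.graph = self._create_graph(dataset)

    # Accessor methods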
    @property
    def graph_data(self):
        '''Return the adjacency list of the nodes.'''
        return self.graph
    
    @property
    def total_nodes(self):
        '''The total number of unique nodes in the graph.'''
        return len(self.graph)  # Each key in the dictionary is one node.
    
    @property
    def edges(self):
        '''Total number of directed edges in the graph. Each connection is stored in both
        directions, so every pair of connected nodes contributes two edges.'''
        total_edges = 0
        for i in self.graph:
            total_edges += len(self.graph[i])  # Adding the length of each node's neighbour list.
        return total_edges
    
    def is_connected(self):
        '''
        Return True if the graph is connected, otherwise False.

        Each node of the graph is a key in the adjacency list (actually a dictionary), and its
        value lists all the nodes to which it is connected. Starting from an arbitrary node,
        the stored edges are followed and every node reached is recorded. The graph is
        connected only if every node is reached. This assumes that each connection is stored
        in both directions.

        Note: Connected does not mean fully connected. This function only tells whether every
        node can be reached from every other node.
        '''
        if not self.graph:
            return True
        start = next(iter(self.graph))
        reached = {start}
        stack = [start]
        while stack:
            for neighbour in self.graph.get(stack.pop(), []):
                if neighbour not in reached:
                    reached.add(neighbour)
                    stack.append(neighbour)
        return all(node in reached for node in self.graph)
    
    # Non-public helper functions.
    def _create_graph(self, dataset):
        '''Helper that builds the adjacency list from the 'node1' and 'node2' columns of the dataset.'''
        graph = {}
        for node1, node2 in zip(dataset['node1'], dataset['node2']):
            if node1 not in graph:
                graph[node1] = []   # Adding nodes to the graph.
            graph[node1].append(node2)  # Adding edges to the graph.
        return graph

# Graph creation.
graph = Graph(cities_data)

#------------------Checking some information about the graph.---------------------------
print(graph.graph_data) # Giving details of the nodes connected to each node.
print('Total nodes in the graph: ', graph.total_nodes)
print('Total edges in the graph: ', graph.edges)
print('Is graph connected: ', graph.is_connected())

{'Jodhpur': ['Bikaner', 'Rajsamand'], 'Rajsamand': ['Jodhpur', 'Sikar'], 'Bikaner': ['Jodhpur', 'Sri Ganganagar'], 'Sri Ganganagar': ['Bikaner', 'Sikar'], 'Sikar': ['Sri Ganganagar', 'Rajsamand', 'Una', 'Jaipur'], 'Una': ['Sikar', 'Baghpat'], 'Jaipur': ['Bundi', 'Delhi', 'Sikar'], 'Bundi': ['Jaipur', 'Kota', 'Belagavi'], 'Belagavi': ['Bundi', 'Hanamkonda', 'Calicut'], 'Calicut': ['Belagavi'], 'Delhi': ['Jaipur', 'Faridabad', 'Baghpat'], 'Kota': ['Bundi', 'Bhopal', 'Agra'], 'Baghpat': ['Una', 'Delhi', 'Aligarh'], 'Faridabad': ['Agra', 'Delhi'], 'Bhopal': ['Kota', 'Morena'], 'Agra': ['Faridabad', 'Aligarh', 'Morena', 'Kota'], 'Aligarh': ['Agra', 'Baghpat', 'Sitapur', 'Mahoba'], 'Morena': ['Bhopal', 'Sagar', 'Agra'], 'Sagar': ['Morena', 'Balaghat'], 'Balaghat': ['Sagar', 'Hanamkonda'], 'Hanamkonda': ['Balaghat', 'Belagavi'], 'Mahoba': ['Aligarh', 'Lucknow', 'Chitrakoot'], 'Sitapur': ['Aligarh', 'Lucknow'], 'Lucknow': ['Sitapur', 'Mahoba', 'Raebareli', 'Lakhimpur'], 'Lakhimpur': ['Lucknow'

In [33]:
# Implementing the actual algorithm.
visited_list = []               # List to keep track of the nodes visited in the graph.
path = []                       # This stores the path followed to search for the target node.
def DFS(starting_node, target_node):
    '''
    Search for the target node and track the path to it from the starting node.
    
    Returns True and leaves the path in `path` if the target node is found, otherwise
    returns False. A path is always found if the target node exists and the graph is
    connected. `visited_list` and `path` are shared between calls, so both must be
    empty when a new search begins.
    '''
    if starting_node not in visited_list:
        visited_list.append(starting_node)  # Appending node to visited list.
        path.append(starting_node)
        for node in graph.graph[starting_node]:
            if node == target_node:     # Checking for target node.
                path.append(node)
                return True
            if node not in visited_list:
                if DFS(node, target_node):
                    return True # Stopping further looping if target found.
        path.pop()  # Removing element from path while backtracking.
    return False

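#-------------Checking output of algorithm for different inputs-----------------
def show_path(starting_node, target_node):
    '''Empty the shared search state, run DFS and print the path that was found.'''
    visited_list.clear()
    path.clear()
    if DFS(starting_node, target_node):
        print(f'The path from {starting_node} to {target_node} is: {path}')
    else:
        print(f'No path found from {starting_node} to {target_node}.')

show_path('Bikaner', 'Pakur')
show_path('Bikaner', 'Sri Ganganagar')    # Long path even though the two nodes are neighbours.
show_path('Sagar', 'Bhopal')  # Different path length when starting and target nodes are swapped.
show_path('Bhopal', 'Sagar')
show_path('Calicut', 'Gaya')
show_path('Gaya', 'Calicut')  # Same path length, but the path is not the same.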

The path from Bikaner to Pakur is: ['Bikaner', 'Jodhpur', 'Rajsamand', 'Sikar', 'Una', 'Baghpat', 'Delhi', 'Jaipur', 'Bundi', 'Kota', 'Bhopal', 'Morena', 'Agra', 'Aligarh', 'Sitapur', 'Lucknow', 'Mahoba', 'Chitrakoot', 'Prayagraj', 'Mirzapur', 'Ghazipur', 'Rohtas', 'Daudnagar', 'Patna', 'Sitamarhi', 'Madhepura', 'Araria', 'Bhagalpur', 'Pakur']
The path from Bikaner to Sri Ganganagar is: ['Bikaner', 'Jodhpur', 'Rajsamand', 'Sikar', 'Sri Ganganagar']
The path from Sagar to Bhopal is: ['Sagar', 'Morena', 'Bhopal']
The path from Bhopal to Sagar is: ['Bhopal', 'Kota', 'Bundi', 'Jaipur', 'Delhi', 'Faridabad', 'Agra', 'Morena', 'Sagar']
The path from Calicut to Gaya is: ['Calicut', 'Belagavi', 'Bundi', 'Jaipur', 'Delhi', 'Faridabad', 'Agra', 'Aligarh', 'Sitapur', 'Lucknow', 'Mahoba', 'Chitrakoot', 'Prayagraj', 'Mirzapur', 'Ghazipur', 'Rohtas', 'Daudnagar', 'Patna', 'Sitamarhi', 'Madhepura', 'Araria', 'Bhagalpur', 'Nawada', 'Gaya']
The path from Gaya to Calicut is: ['Gaya', 'Palamu', 'Daudnaga
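
The search above keeps its state in two module-level lists, which have to be empty before every new search. As a hedged alternative, the sketch below keeps the visited set local to each top-level call and returns the path instead of mutating shared lists. The name `dfs_path` is hypothetical, and `subgraph` contains five cities copied from the adjacency list printed earlier, with neighbours outside this group left out to keep the example small.

```python
def dfs_path(adjacency, start, target, visited=None):
    """Return the path a depth-first search finds from start to target, or None."""
    if visited is None:
        visited = set()              # fresh state for every top-level call
    visited.add(start)
    if start == target:
        return [start]
    for neighbour in adjacency.get(start, []):
        if neighbour not in visited:
            rest = dfs_path(adjacency, neighbour, target, visited)
            if rest is not None:
                return [start] + rest   # prepend this node while unwinding
    return None                      # dead end: the caller tries its next neighbour

# Five cities from the printed adjacency list; other neighbours are omitted.
subgraph = {
    'Jodhpur': ['Bikaner', 'Rajsamand'],
    'Rajsamand': ['Jodhpur', 'Sikar'],
    'Bikaner': ['Jodhpur', 'Sri Ganganagar'],
    'Sri Ganganagar': ['Bikaner', 'Sikar'],
    'Sikar': ['Sri Ganganagar', 'Rajsamand'],
}
print(dfs_path(subgraph, 'Bikaner', 'Sri Ganganagar'))
# ['Bikaner', 'Jodhpur', 'Rajsamand', 'Sikar', 'Sri Ganganagar']
print(dfs_path(subgraph, 'Bikaner', 'Pakur'))
# None
```

Because 'Jodhpur' is listed before 'Sri Ganganagar' among Bikaner's neighbours, the search commits to the Jodhpur branch first and reaches the target the long way round.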

Observations:

1. 'Bikaner' and 'Pakur' are very far from each other. The DFS algorithm returns a very long path between them (29 cities), which is of little use in real life.
2. When searching for 'Sri Ganganagar' from 'Bikaner', the algorithm returns a path of five cities even though the two are direct neighbours.
3. When searching for 'Bhopal' from 'Sagar', the algorithm gives the shortest possible path.
4. The algorithm gives a different path if we simply interchange the starting node and the target node.
5. The algorithm can give a much longer path when only the starting and target nodes are swapped: 'Bhopal' to 'Sagar' takes nine cities, while 'Sagar' to 'Bhopal' takes three.
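
These observations follow from the fact that DFS returns the first path it finds, not the shortest one. For comparison only, the sketch below uses Breadth First Search (BFS), a different algorithm from the one implemented in this notebook, which explores nodes level by level and so returns a path with the fewest edges. The name `bfs_shortest_path` is hypothetical, and `subgraph` contains five cities copied from the adjacency list printed earlier, with other neighbours left out.

```python
from collections import deque

def bfs_shortest_path(adjacency, start, target):
    """Return a path with the fewest edges from start to target, or None."""
    parents = {start: None}          # also serves as the visited set
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == target:
            # Walk back through the parents to rebuild the path.
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for neighbour in adjacency.get(node, []):
            if neighbour not in parents:
                parents[neighbour] = node
                queue.append(neighbour)
    return None

# Five cities from the printed adjacency list; other neighbours are omitted.
subgraph = {
    'Jodhpur': ['Bikaner', 'Rajsamand'],
    'Rajsamand': ['Jodhpur', 'Sikar'],
    'Bikaner': ['Jodhpur', 'Sri Ganganagar'],
    'Sri Ganganagar': ['Bikaner', 'Sikar'],
    'Sikar': ['Sri Ganganagar', 'Rajsamand'],
}
print(bfs_shortest_path(subgraph, 'Bikaner', 'Sri Ganganagar'))
# ['Bikaner', 'Sri Ganganagar']
```

On this small subgraph BFS connects the two neighbouring cities directly, whereas the DFS output above passes through three intermediate cities.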