# Lesson 3: Understanding and Implementing Depth-First Search (DFS) for Graphs


Welcome back! Today, we continue our journey through graph data structures. Having already covered adjacency matrices and adjacency lists, we now turn to the **Depth-First Search (DFS)** algorithm. Often likened to a maze explorer, DFS is a key tool for graph-related challenges in fields ranging from computer networking to genetic genealogy.

One of the hallmarks of DFS is its penchant for penetrating as far as possible into a graph along a route before retracing its steps (or backtracking) when it reaches an endpoint. Then it delves into the next available route. It can be visualized as exploration within a network of caves, where each cave has multiple tunnels. You choose a tunnel, traverse as far as you can until you reach a dead end, return, choose another unexplored tunnel, and continue this process until no path is left unexplored.

# Understanding DFS

**Depth-First Search**, or DFS, is an algorithm for traversing or searching tree and graph data structures. It takes its name from its strategy of diving as deep as possible along one branch before backtracking.

Let’s visualize a familiar scenario for a moment. Suppose we're playing a video game situated in a complex map, loaded with winding paths and hidden rooms. You opt for a path and continue walking until you encounter a dead end. What's the next move? You revert, select another available path, and persist with this procedure until all possible paths are traversed — that’s DFS for you!

To better understand DFS within a graph context, consider this graph:

### Initial Graph:

*(Diagram: an undirected graph with five nodes and the edges A-B, A-C, B-D, and B-E.)*

Here's how DFS explores the graph: **A > B > D > E > C**. DFS proceeds from A to B, then advances from B to D. As D has no unvisited adjacent nodes, DFS backtracks to B and resumes the traversal towards E. When all adjoining nodes of E are visited, DFS backtracks to B and then to A, since B has no unvisited neighbors left, and finally advances from A towards C.

# DFS Algorithm

Now, let's look at the DFS algorithm itself! It starts at the root or a chosen start node of a graph, plunges as far as feasible down a branch, and then backtracks when it cannot go further (i.e., it arrives at a node with no unvisited adjacent nodes).

Here's a high-level pseudocode illustration of the DFS algorithm to ease our discussion:

```plaintext
1. Mark the current node as 'visited' and print the node.
2. For every adjacent unvisited node of the current node:
    2.1. Recursively invoke DFS on that adjacent node.
```

Analyzing time and space complexity is pivotal to understanding an algorithm's efficiency. The time complexity of DFS is **O(V + E)**, where **V** is the number of vertices and **E** is the number of edges (connections between vertices) in the graph, because each vertex is visited once and each adjacency-list entry is examined once. The space complexity is **O(V)**, covering both the set of visited nodes and, in the recursive version, the call stack, which can grow as deep as the longest path explored.
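To make the O(V + E) bound a little more tangible, here is a minimal sketch that counts the work a DFS performs. The function name `dfs_count` and the graph `example_graph` are illustrative assumptions, not part of the lesson's code. In an undirected graph each edge appears in two adjacency lists, so the number of neighbor checks comes out to 2E.

```python
def dfs_count(graph, start):
    """Return (vertices_visited, neighbor_checks) for a DFS from start."""
    visited = set()
    neighbor_checks = 0

    def visit(node):
        nonlocal neighbor_checks
        visited.add(node)
        for neighbor in graph[node]:
            neighbor_checks += 1  # every adjacency-list entry is examined once
            if neighbor not in visited:
                visit(neighbor)

    visit(start)
    return len(visited), neighbor_checks


# Hypothetical undirected graph with V = 5 vertices and E = 4 edges.
example_graph = {
    'A': ['B', 'C'],
    'B': ['A', 'D', 'E'],
    'C': ['A'],
    'D': ['B'],
    'E': ['B'],
}

vertices, checks = dfs_count(example_graph, 'A')
print(vertices, checks)  # 5 8  (8 checks = 2 * E)
```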

# Implementation of DFS

Let's shift from theory to practice by implementing the DFS algorithm. We'll demonstrate it in Python using our trusty adjacency list representation of the graph.

### Python Code Example:

```python
def DFS(graph, start, visited):
    visited.add(start)
    print(start, end=' ')
    
    for next_node in graph[start]:
        if next_node not in visited:
            DFS(graph, next_node, visited)

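# Neighbors are stored in lists so that the traversal order is reproducible;
# Python sets of strings have no guaranteed iteration order.
graph = {
    'A': ['B', 'C'],
    'B': ['A', 'D', 'E'],
    'C': ['A'],
    'D': ['B'],
    'E': ['B'],
}

visited = set()
DFS(graph, 'A', visited)  # Output: A B D E C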
```

In this Python code, we define a recursive function **DFS()**, taking three parameters — `graph`, `start`, and `visited`. The set `visited` keeps track of all visited nodes. We commence from the start node and add it to `visited`. For any `next_node` in the adjacency list of `start` that is not yet visited, we recursively invoke **DFS()**.
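Recursion is not the only way to write DFS. As a hedged sketch, the same traversal can be driven by an explicit stack, which avoids Python's recursion limit on very deep graphs. The name `dfs_iterative` and the graph `sample_graph` are assumptions made for illustration, and the neighbors are stored in lists so the order is reproducible. Pushing neighbors in reverse means the first-listed neighbor is popped first, which for this example gives the same order as the recursive approach.

```python
def dfs_iterative(graph, start):
    """Return the nodes reachable from start in DFS (preorder) order."""
    visited = set()
    order = []
    stack = [start]

    while stack:
        node = stack.pop()
        if node in visited:
            continue  # a node may be pushed more than once before it is visited
        visited.add(node)
        order.append(node)
        # Reverse so that the first-listed neighbor ends up on top of the stack.
        for neighbor in reversed(graph[node]):
            if neighbor not in visited:
                stack.append(neighbor)

    return order


# Hypothetical graph; lists (not sets) keep the neighbor order fixed.
sample_graph = {
    'A': ['B', 'C'],
    'B': ['A', 'D', 'E'],
    'C': ['A'],
    'D': ['B'],
    'E': ['B'],
}

print(dfs_iterative(sample_graph, 'A'))  # ['A', 'B', 'D', 'E', 'C']
```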

# Analyzing DFS

DFS's versatility makes it a powerful tool with a broad spectrum of applications. On a higher level, DFS excels in problems about connectivity within graphs and the discovery of a pathway between two nodes. Compared with BFS, it tends to use less memory on wide graphs, since it only keeps track of the current branch rather than an entire frontier, and it can reach targets that lie deep in the graph sooner.

However, all tools have their limitations and nuances. DFS does not perform optimally in problems necessitating the shortest path, such as GPS routing problems, where BFS is a superior choice. Additionally, DFS requires careful management when dealing with cycles within the graph, as it could end up in an infinite loop without effective control over the visited nodes.
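To make the connectivity use case concrete, here is a small sketch that groups the nodes of a graph into connected components by launching a DFS from every node that has not been reached yet. The function name `connected_components` and the graph `network` are hypothetical and chosen only for this illustration.

```python
def connected_components(graph):
    """Return a list of components, each a list of nodes in DFS order."""
    visited = set()
    components = []

    def collect(node, component):
        visited.add(node)
        component.append(node)
        for neighbor in graph[node]:
            if neighbor not in visited:
                collect(neighbor, component)

    for node in graph:
        if node not in visited:  # an unreached node starts a new component
            component = []
            collect(node, component)
            components.append(component)

    return components


# Hypothetical graph with three separate groups of nodes.
network = {
    'A': ['B'],
    'B': ['A', 'C'],
    'C': ['B'],
    'D': ['E'],
    'E': ['D'],
    'F': [],
}

print(connected_components(network))  # [['A', 'B', 'C'], ['D', 'E'], ['F']]
```

Two nodes are connected exactly when they land in the same component, so this is one simple way to answer "is there any route between X and Y?" for many pairs at once.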

# Application of DFS to Real-life Scenarios

Having mastered the DFS algorithm, let's delve into its real-world applications. One significant application of DFS is in the domain of computer games. Envision a scenario where you are an explorer venturing through a mythical jungle in search of sacred artifacts scattered across a complex network of trails filled with obstacles and rewards. To ensure a unique route is chosen each time, the game could employ DFS to steer your game character through the virtual jungle.

Another intriguing application of DFS lies within the social network domain. DFS algorithms could navigate a connection web from a known user to an unknown user. Using DFS, developers can innovate a feature showcasing how two users are connected through mutual connections on the platform, similar to LinkedIn's feature that displays how a user is connected to another user through mutual connections.

# DFS for Cycle Detection

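One practical use of DFS is determining whether a graph contains a cycle. In an undirected graph, if DFS encounters a previously visited node that is not the node it just came from (its parent in the traversal), then a cycle exists. In a directed graph, the condition is instead that the visited node is still on the current recursion path.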

For instance, consider the following graph:

*(Insert graph diagram here)*

Exploring the graph starting at A using DFS, our path would be **A > B > D > E > C**. On arriving at node C and checking its neighbors, we find that A and E have already been visited. One of them is simply the node we arrived from, but the other was reached by a different route, which reveals a cycle in the graph.
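The idea can be sketched in code for undirected graphs. Since the diagram is not reproduced here, the graph `ring_graph` below is an assumption: a ring A-B-D-E-C-A that is consistent with the traversal described above. The function name `has_cycle` is likewise illustrative, and the sketch assumes a simple graph without parallel edges.

```python
def has_cycle(graph):
    """Return True if the undirected graph contains a cycle."""
    visited = set()

    def visit(node, parent):
        visited.add(node)
        for neighbor in graph[node]:
            if neighbor not in visited:
                if visit(neighbor, node):
                    return True
            elif neighbor != parent:
                # Visited, and not the node we came from: a second route exists.
                return True
        return False

    # Start a DFS from every unreached node so disconnected parts are covered.
    return any(node not in visited and visit(node, None) for node in graph)


# Hypothetical ring: A-B, B-D, D-E, E-C, C-A.
ring_graph = {
    'A': ['B', 'C'],
    'B': ['A', 'D'],
    'C': ['E', 'A'],
    'D': ['B', 'E'],
    'E': ['D', 'C'],
}

# Hypothetical tree (no cycle): A-B, A-C, B-D, B-E.
tree_graph = {
    'A': ['B', 'C'],
    'B': ['A', 'D', 'E'],
    'C': ['A'],
    'D': ['B'],
    'E': ['B'],
}

print(has_cycle(ring_graph))  # True
print(has_cycle(tree_graph))  # False
```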

# DFS for Pathfinding

DFS can also be deployed for pathfinding. Suppose we have a maze represented as a graph, and the goal is to find a path from one corner to another. DFS picks a path and follows it as far as possible; on reaching a dead end, it backtracks and tries the next available path, repeating until the destination is reached. Each node is marked as visited as it is entered, and the path taken so far is retained.
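A minimal sketch of this idea follows. The function name `dfs_path` and the tiny maze `maze_graph` are hypothetical, invented purely for illustration. The function returns the first path it happens to find, or `None` if the goal cannot be reached.

```python
def dfs_path(graph, start, goal, visited=None):
    """Return some path from start to goal as a list of nodes, or None."""
    if visited is None:
        visited = set()
    visited.add(start)

    if start == goal:
        return [start]

    for neighbor in graph[start]:
        if neighbor not in visited:
            sub_path = dfs_path(graph, neighbor, goal, visited)
            if sub_path is not None:
                return [start] + sub_path  # prepend this node on the way back

    return None  # dead end: every branch from here failed


# Hypothetical maze: two routes lead from 'Start' to 'Exit'.
maze_graph = {
    'Start': ['A', 'B'],
    'A': ['Start', 'C'],
    'B': ['Start', 'Exit'],
    'C': ['A', 'Exit'],
    'Exit': ['B', 'C'],
}

print(dfs_path(maze_graph, 'Start', 'Exit'))  # ['Start', 'A', 'C', 'Exit']
```

Because 'A' is listed before 'B', DFS commits to the route through 'A' and 'C' even though a shorter route through 'B' exists.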

However, while DFS can help locate a path, it does not guarantee the most efficient or shortest path. In scenarios requiring the shortest path, we would use another algorithm, such as BFS for unweighted graphs or Dijkstra's algorithm for graphs with weighted edges.
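For contrast, here is a hedged sketch of BFS finding a shortest path (fewest edges) in an unweighted graph. The function name `bfs_shortest_path` and the graph `small_maze` are assumptions made for this example. In this maze, a DFS that tries 'A' first would return the longer route Start, A, C, Exit, whereas BFS explores level by level and finds the two-step route.

```python
from collections import deque


def bfs_shortest_path(graph, start, goal):
    """Return a shortest path (by edge count) from start to goal, or None."""
    parents = {start: None}  # doubles as the visited set
    queue = deque([start])

    while queue:
        node = queue.popleft()
        if node == goal:
            # Walk the parent links back to the start, then reverse.
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for neighbor in graph[node]:
            if neighbor not in parents:
                parents[neighbor] = node
                queue.append(neighbor)

    return None


# Hypothetical maze: two routes lead from 'Start' to 'Exit'.
small_maze = {
    'Start': ['A', 'B'],
    'A': ['Start', 'C'],
    'B': ['Start', 'Exit'],
    'C': ['A', 'Exit'],
    'Exit': ['B', 'C'],
}

print(bfs_shortest_path(small_maze, 'Start', 'Exit'))  # ['Start', 'B', 'Exit']
```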

# Conclusion

Congratulations on cementing your understanding of a foundational algorithm in computer science — **Depth-First Search**! You've delved into the inner workings of DFS, differentiated between DFS and BFS, and implemented DFS in Python. Moreover, you have recognized the applicability of DFS through several uses in real-world scenarios such as social networking or game algorithms.

# Practice Exercises Announcement

Brace yourself for a series of engaging practice exercises tailor-made to fortify your newfound DFS skills. Are you ready for a quest to discover hidden treasures in complex mazes or to untangle a web of connections on a social platform? With the power of DFS at your disposal, these challenges are just a few lines of Python code away. So, adorn your explorer's hat, and embark on the whirlwind journey of DFS!


## Exploring States with Depth-First Search

Are you ready to map out your journey through the States as a future graph whiz? Imagine that you are planning a road trip starting from Washington, but you are unsure where to go next.

Why not use Depth-First Search (DFS) to plan the route?

This task challenges you to run a Python program that navigates a graph of inter-state connections using DFS. Understanding the output of this program will give you a tangible understanding of DFS!

Press the Run button to begin exploring!

```python
graph = {
    'Washington': ['California', 'Nevada'],
    'California': ['Washington', 'Oregon'],
    'Nevada': ['Washington', 'Oregon'],
    'Oregon': ['California', 'Nevada']
}

def DFS(graph, start, visited):
    if start in visited:  # if the state has already been visited, stop here
        return

    visited.add(start)
    print(start, end=" ")

    # Neighbors are stored in lists, so they are explored in the order written above
    for state in graph[start]:
        if state not in visited:
            DFS(graph, state, visited)

# Call the DFS function starting with 'Washington'
visited = set()
DFS(graph, 'Washington', visited)  # Output: Washington California Oregon Nevada
print('\nVisited states:', sorted(visited))  # Print all visited states in alphabetical order
```

## Adjusting the Start Node for DFS Traversal

Great job on running your first DFS traversal! Now, let's delve further into the cave.

The provided starter code traverses a social networking graph starting at 'Alice'. However, what if we were to begin the traversal from another node, say 'Bob'? Could you adjust the code to initiate the DFS from 'Bob'? Implementing this change might result in a new traversal order.

Keep exploring, brave coder!

```python
# Define a graph using dictionary
graph = {
    'Alice': set(['Carol', 'David']),
    'Bob': set(['Alice', 'Eve']),
    'Carol': set(['Alice', 'Eve']),
    'David': set(['Alice']),
    'Eve': set(['Bob','Carol']),
}

def DFS(graph, start_node, visited):
    """
    Function to implement DFS for the graph.
    """
    if start_node in visited:
        return

    visited.add(start_node)
    print(start_node, end=' --> ')

    for neighbour in graph[start_node]:
        if neighbour not in visited:
            DFS(graph, neighbour, visited)

visited = set()
# Call our DFS function, starting with 'Alice'
DFS(graph, 'Alice', visited)  # Prints the DFS traversal; the order may vary with the order in which set neighbors are iterated.
print('\nVisited nodes:', visited)

```

## Debug and Correct the DFS Traversal

Fantastic! You're doing a great job implementing depth-first searches in a graph. Now, let's try a more challenging task!

Imagine you've been given the responsibility of analyzing a business offers network. A particular piece of code should identify the companies that have been visited, but it appears to be malfunctioning. Could you locate and rectify the issue to reveal the correct DFS traversal order?

Best of luck, young explorer!

```python
# Define the graph using a dictionary
offer_network = {
    'Company_A': set(['Company_B', 'Company_D']),
    'Company_B': set(['Company_A', 'Company_C']),
    'Company_C': set(['Company_B', 'Company_D']),
    'Company_D': set(['Company_A', 'Company_C'])
}

def DFS(offer_network, start_company, visited_offers):
    """
    Function implementing the DFS algorithm to traverse the graph.
    """
    visited_offers.add(start_company)
    print(start_company, end=' --> ')
    
    for partner in offer_network[start_company]:
        DFS(offer_network, partner, visited_offers)

visited_companies = set()
# Invoke DFS function, starting with 'Company_A'
DFS(offer_network, 'Company_A', visited_companies) 
print('\nVisited companies:', visited_companies) 


```

Great job! You're doing well navigating through these challenges. Let's take a look at the code and address the issue causing it to malfunction.

### Problem:
The issue with the current implementation is that the DFS never checks whether a node (company) has already been visited before recursing into it. Because this graph contains cycles, the traversal bounces between already-visited companies indefinitely, and Python eventually stops it with a `RecursionError`.

In the given code:
- The traversal starts from `'Company_A'` but does not check if a node has already been visited before calling DFS on it. This causes the DFS to revisit already visited companies, which is not the intended behavior.
- The `visited_offers` set is only passed around and updated but is not being checked properly before further recursive calls.
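To see the failure mode in isolation, here is a minimal sketch on a hypothetical two-node graph (`tiny_network` and `unguarded_dfs` are names invented for this illustration). Without a visited check, the two nodes keep calling each other until Python's recursion limit is hit.

```python
def unguarded_dfs(graph, node, visited):
    """DFS that records visits but never consults the visited set."""
    visited.add(node)
    for neighbor in graph[node]:
        unguarded_dfs(graph, neighbor, visited)  # no 'not in visited' check


# Hypothetical graph: X and Y point at each other, forming a cycle.
tiny_network = {'X': ['Y'], 'Y': ['X']}

hit_recursion_limit = False
try:
    unguarded_dfs(tiny_network, 'X', set())
except RecursionError:
    hit_recursion_limit = True

print(hit_recursion_limit)  # True
```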

### Solution:
We need to make sure that each company is visited only once. The solution is to check if a company is in the `visited_offers` set before recursively calling DFS on it. This way, we avoid revisiting companies.

Here's the corrected version of the code:

```python
# Define the graph using a dictionary
offer_network = {
    'Company_A': set(['Company_B', 'Company_D']),
    'Company_B': set(['Company_A', 'Company_C']),
    'Company_C': set(['Company_B', 'Company_D']),
    'Company_D': set(['Company_A', 'Company_C'])
}

def DFS(offer_network, start_company, visited_offers):
    """
    Function implementing the DFS algorithm to traverse the graph.
    """
    if start_company in visited_offers:
        return  # If the company has been visited, stop further recursion
    
    visited_offers.add(start_company)
    print(start_company, end=' --> ')
    
    for partner in offer_network[start_company]:
        DFS(offer_network, partner, visited_offers)

visited_companies = set()
# Invoke DFS function, starting with 'Company_A'
DFS(offer_network, 'Company_A', visited_companies)
print('\nVisited companies:', visited_companies)
```

### Explanation of the Changes:
1. **Check if Company is Already Visited**: Before performing the DFS on a company, we check if it has already been visited by using `if start_company in visited_offers`. If it has been visited, we return early to avoid revisiting it.
2. **Recursive DFS Traversal**: The rest of the DFS traversal remains the same. For each company, we continue traversing to its partners unless it's already in the visited set.

### Sample Output:
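Running this corrected code should produce a DFS traversal that visits each company exactly once. Because the neighbors are stored in sets, the exact order may differ between runs; one possible output is: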

```
Company_A --> Company_B --> Company_C --> Company_D --> 
Visited companies: {'Company_A', 'Company_B', 'Company_C', 'Company_D'}
```

Now, the DFS traversal will correctly visit each company once and avoid any redundant visits or infinite loops.

## Implementing DFS Traversal for University Courses