# Task
Implement and explain the Depth-First Search (DFS) algorithm for navigating a library, represented as a graph, to find a 'Scroll'. The explanation should cover DFS backtracking, its suitability for the problem, the role of a 'visited' set, DFS completeness, and a comparison with Breadth-First Search (BFS).

## Represent Library as Graph

### Subtask:
Represent the library's corridors and connections as a graph data structure (e.g., an adjacency list) and define the initial state (Hall - H) and the goal state (Scroll - S).


**Reasoning**:
The subtask requires representing the library as a graph using an adjacency list, defining a start node 'H', and a goal node 'S'. I will use a Python dictionary for the adjacency list and assign 'H' and 'S' accordingly.



In [1]:
graph = {
    'H': ['A', 'B'],
    'A': ['H', 'C', 'D'],
    'B': ['H', 'E'],
    'C': ['A', 'F'],
    'D': ['A', 'G'],
    'E': ['B', 'I'],
    'F': ['C', 'J'],
    'G': ['D', 'K'],
    'I': ['E', 'L'],
    'J': ['F', 'M'],
    'K': ['G', 'N'],
    'L': ['I', 'S'],
    'M': ['J', 'S'],
    'N': ['K', 'S'],
    'S': ['L', 'M', 'N'] # Goal node
}

start_node = 'H'
goal_node = 'S'

print("Library Graph (Adjacency List):")
for node, connections in graph.items():
    print(f"{node}: {connections}")

print(f"\nStarting Node: {start_node}")
print(f"Goal Node: {goal_node}")

Library Graph (Adjacency List):
H: ['A', 'B']
A: ['H', 'C', 'D']
B: ['H', 'E']
C: ['A', 'F']
D: ['A', 'G']
E: ['B', 'I']
F: ['C', 'J']
G: ['D', 'K']
I: ['E', 'L']
J: ['F', 'M']
K: ['G', 'N']
L: ['I', 'S']
M: ['J', 'S']
N: ['K', 'S']
S: ['L', 'M', 'N']

Starting Node: H
Goal Node: S
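
Because the later discussion treats every corridor as walkable in both directions, it can help to confirm that the adjacency list is symmetric. The following is a minimal sketch; the helper name `asymmetric_edges` is an assumption introduced here for illustration and is not part of the original notebook.

```python
# Library graph copied from the cell above.
graph = {
    'H': ['A', 'B'],
    'A': ['H', 'C', 'D'],
    'B': ['H', 'E'],
    'C': ['A', 'F'],
    'D': ['A', 'G'],
    'E': ['B', 'I'],
    'F': ['C', 'J'],
    'G': ['D', 'K'],
    'I': ['E', 'L'],
    'J': ['F', 'M'],
    'K': ['G', 'N'],
    'L': ['I', 'S'],
    'M': ['J', 'S'],
    'N': ['K', 'S'],
    'S': ['L', 'M', 'N'],
}


def asymmetric_edges(graph):
    """Return (u, v) pairs where u lists v as a neighbor but v does not list u."""
    return [
        (u, v)
        for u, neighbors in graph.items()
        for v in neighbors
        if u not in graph.get(v, [])
    ]


missing = asymmetric_edges(graph)
print(missing)  # an empty list means every corridor is two-way
```

An empty result also means that every edge forms a trivial two-node cycle (for example `H -> A -> H`), which is why the search below needs some form of cycle check.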


## Implement DFS Algorithm

### Subtask:
Implement a Depth-First Search (DFS) algorithm to traverse the graph and locate the Scroll (S).


### Reasoning for DFS Implementation

To implement the DFS algorithm, I will follow these steps:
1.  **Define `dfs` function:** This function will take `graph`, `current_node`, `goal_node`, and `path` as arguments. The `current_node` is essentially the `start_node` for each recursive call, and `path` will track the nodes visited in the current exploration branch.
2.  **Add `current_node` to `path`:** Before exploring, add the `current_node` to the `path` to mark it as visited in the current search. This also helps in reconstructing the path if the goal is found.
3.  **Goal Check:** If `current_node` is the `goal_node`, the path has been found, so return the current `path`.
4.  **Explore Neighbors:** Iterate through all neighbors of the `current_node`.
5.  **Avoid Cycles:** For each `neighbor`, check if it's already in the `path`. This prevents infinite loops in graphs with cycles.
6.  **Recursive Call:** If a `neighbor` has not been visited in the current path, make a recursive call to `dfs` with the `neighbor` as the new `current_node`. Pass a copy of the current `path` to the recursive call to maintain separate paths for different branches.
7.  **Path Found:** If the recursive call returns a valid `found_path` (meaning the goal was found deeper in that branch), immediately return that `found_path`.
8.  **Backtrack:** If all neighbors have been explored and the goal is not found through any of them, return `None` to signal backtracking.
9.  **Initial Call:** Finally, call the `dfs` function with the `graph`, the initial `start_node` ('H'), and the `goal_node` ('S'). The `path` argument is omitted, so it defaults to `None` and is initialized to an empty list inside the function.

**Reasoning**:
Now I will implement the Depth-First Search (DFS) algorithm using the `dfs` function as outlined in the reasoning and instructions, and then call it to find the path from the `start_node` to the `goal_node`.



In [2]:
def dfs(graph, current_node, goal_node, path=None):
    if path is None:
        path = []

    # Add the current node to the path
    path = path + [current_node]

    # If the current node is the goal node, we found the path
    if current_node == goal_node:
        return path

    # Explore neighbors
    for neighbor in graph.get(current_node, []):
        # Only visit unvisited neighbors (not in current path to avoid cycles)
        if neighbor not in path:
            found_path = dfs(graph, neighbor, goal_node, path)
            if found_path:  # If a path was found down this branch, return it
                return found_path

    # No path found from this node, backtrack
    return None

# Call the DFS function
found_route = dfs(graph, start_node, goal_node)

# Print the result
if found_route:
    print(f"Path from {start_node} to {goal_node}: {' -> '.join(found_route)}")
else:
    print(f"No path found from {start_node} to {goal_node}.")

Path from H to S: H -> A -> C -> F -> J -> M -> S


## Illustrate DFS Backtracking

### Subtask:
Modify the DFS implementation to log each step, including node visits and backtracking, to illustrate the search process.


**Reasoning**:
I will create a new Python function `dfs_with_logging` that extends the previous DFS implementation by adding print statements to illustrate the search process, including node visits and backtracking as per the instructions.



In [3]:
def dfs_with_logging(graph, current_node, goal_node, path=None, visited_nodes=None):
    if path is None:
        path = []
    if visited_nodes is None:
        visited_nodes = set()

    # Add the current node to the path for this branch
    path = path + [current_node]
    # Add to globally visited nodes for tracking
    visited_nodes.add(current_node)

    print(f"Visiting: {current_node}. Current path in this branch: {' -> '.join(path)}")

    # If the current node is the goal node, we found the path
    if current_node == goal_node:
        print(f"Goal found! Path: {' -> '.join(path)}")
        return path

    # Explore neighbors
    for neighbor in graph.get(current_node, []):
        # Skip neighbors already on the current path to avoid cycles within a branch.
        # visited_nodes only records every node seen (used for the completeness discussion later).
        if neighbor not in path:
            print(f"Exploring neighbor {neighbor} from {current_node}.")
            found_path = dfs_with_logging(graph, neighbor, goal_node, path, visited_nodes)
            if found_path:  # If a path was found down this branch, return it
                return found_path

    # If no path found from this node through its neighbors, backtrack
    print(f"Backtracking from: {current_node}")
    return None

print("--- Starting DFS with Logging ---")
# Call the DFS function with logging
found_route_logged = dfs_with_logging(graph, start_node, goal_node)

# Print the final result
if found_route_logged:
    print(f"\nFinal route found: {' -> '.join(found_route_logged)}")
else:
    print(f"\nNo path found from {start_node} to {goal_node}.")

--- Starting DFS with Logging ---
Visiting: H. Current path in this branch: H
Exploring neighbor A from H.
Visiting: A. Current path in this branch: H -> A
Exploring neighbor C from A.
Visiting: C. Current path in this branch: H -> A -> C
Exploring neighbor F from C.
Visiting: F. Current path in this branch: H -> A -> C -> F
Exploring neighbor J from F.
Visiting: J. Current path in this branch: H -> A -> C -> F -> J
Exploring neighbor M from J.
Visiting: M. Current path in this branch: H -> A -> C -> F -> J -> M
Exploring neighbor S from M.
Visiting: S. Current path in this branch: H -> A -> C -> F -> J -> M -> S
Goal found! Path: H -> A -> C -> F -> J -> M -> S

Final route found: H -> A -> C -> F -> J -> M -> S
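
In the run above, the first branch DFS tried led straight to the goal, so no "Backtracking from" line was printed. To see backtracking in action, the sketch below runs the same search on a small hypothetical graph (`demo_graph`, not part of the library) in which the first branch is a dead end. The function name `dfs_trace` and the event-list format are assumptions made for this illustration.

```python
# Hypothetical mini-library: the corridor H -> A -> C is a dead end,
# so DFS must back out of it before trying H -> B -> S.
demo_graph = {
    'H': ['A', 'B'],
    'A': ['H', 'C'],
    'C': ['A'],
    'B': ['H', 'S'],
    'S': ['B'],
}


def dfs_trace(graph, node, goal, path=(), log=None):
    """Path-based DFS that records visit and backtrack events in `log`."""
    if log is None:
        log = []
    path = path + (node,)          # new tuple, so sibling branches are unaffected
    log.append(f"visit {node}")
    if node == goal:
        return list(path), log
    for neighbor in graph.get(node, []):
        if neighbor not in path:   # skip nodes already on the current path
            found, _ = dfs_trace(graph, neighbor, goal, path, log)
            if found:
                return found, log
    log.append(f"backtrack {node}")  # every neighbor failed: give up on this node
    return None, log


route, events = dfs_trace(demo_graph, 'H', 'S')
for event in events:
    print(event)
print(" -> ".join(route))  # H -> B -> S
```

The recorded events are `visit H`, `visit A`, `visit C`, `backtrack C`, `backtrack A`, `visit B`, `visit S`: the search retreats from the dead end one node at a time and then resumes with the next untried neighbor of `H`.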


## Explain DFS Suitability

### Subtask:
Explain why DFS is a suitable algorithm for this problem, considering its ability to explore deep paths quickly, despite not guaranteeing the shortest path.


### Why DFS is Suitable for this Problem

1.  **Fundamental Characteristic: Deep Exploration:**
    Depth-First Search (DFS) inherently explores as deeply as possible along each branch before backtracking. In the context of our library, this means DFS will pick a corridor and follow it to its end, or until it hits a dead-end or a previously visited node, before trying another path. This is akin to a person exploring one section of the library thoroughly before moving to another.

2.  **Suitability for Finding *Any* Path to a Deep Goal:**
    This deep exploration makes DFS particularly suitable for problems where we need to find *any* path to a goal, especially if the goal might be located deep within the graph. DFS quickly dives into one part of the graph. If the 'Scroll' (our goal 'S') happens to be at the end of an early, deep path, DFS will find it efficiently without having to exhaustively explore all shallow nodes first. For instance, in our library, if the scroll is many rooms away but along a direct path from the start, DFS would likely find it faster than BFS, which would expand layer by layer.

3.  **Not Guaranteeing the Shortest Path:**
    A significant characteristic of DFS is that it **does not guarantee finding the shortest path**. Because it explores deeply first, it might find a very long path to the goal if that path is encountered early in its search order, even if a much shorter path exists elsewhere. For our problem of finding a 'Scroll' in a library, this is generally an acceptable trade-off. Our primary objective is to *locate* the 'Scroll', not necessarily to find the absolute shortest route to it. As long as we find a way to the 'Scroll', the mission is accomplished. If finding the shortest path were critical (e.g., in an evacuation scenario), then an algorithm like Breadth-First Search (BFS) would be more appropriate. For a simple discovery task, DFS's ability to quickly find *a* path is sufficient.
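
The dependence on search order described above can be checked directly. The sketch below reuses the `dfs` function from the implementation cell and runs it twice: once on the original graph and once on a copy (`reordered`, a name introduced here for illustration) in which the Hall lists corridor `B` before corridor `A`.

```python
# Library graph and dfs() copied from the cells above.
graph = {
    'H': ['A', 'B'],
    'A': ['H', 'C', 'D'],
    'B': ['H', 'E'],
    'C': ['A', 'F'],
    'D': ['A', 'G'],
    'E': ['B', 'I'],
    'F': ['C', 'J'],
    'G': ['D', 'K'],
    'I': ['E', 'L'],
    'J': ['F', 'M'],
    'K': ['G', 'N'],
    'L': ['I', 'S'],
    'M': ['J', 'S'],
    'N': ['K', 'S'],
    'S': ['L', 'M', 'N'],
}


def dfs(graph, current_node, goal_node, path=None):
    if path is None:
        path = []
    path = path + [current_node]
    if current_node == goal_node:
        return path
    for neighbor in graph.get(current_node, []):
        if neighbor not in path:
            found_path = dfs(graph, neighbor, goal_node, path)
            if found_path:
                return found_path
    return None


# Same corridors, but the Hall now lists B first (illustrative variant).
reordered = dict(graph, H=['B', 'A'])

original_route = dfs(graph, 'H', 'S')
reordered_route = dfs(reordered, 'H', 'S')

print(" -> ".join(original_route))   # H -> A -> C -> F -> J -> M -> S
print(" -> ".join(reordered_route))  # H -> B -> E -> I -> L -> S
```

With the original ordering, DFS returns a route of six corridors even though a five-corridor route exists; simply swapping the neighbor order at `H` makes it return the shorter one. The result is a property of the traversal order, not of the graph.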

## Discuss Visited Set Role

### Subtask:
Discuss the crucial role of a 'visited' set in DFS to prevent infinite loops in the presence of 'illusion corridors' that lead back to previously visited sections.


### The Crucial Role of a 'Visited' Set in DFS

In graph traversal algorithms like Depth-First Search (DFS), a 'visited' set (or similar mechanism) plays a critical role in ensuring the algorithm terminates and correctly explores the graph. It helps prevent infinite loops and redundant computations, especially in graphs containing cycles.

1.  **What is a 'visited' set?**
    A 'visited' set is a data structure (commonly a set, list, or boolean array) used to keep track of nodes that have already been explored or are currently in the process of being explored during a graph traversal. When the algorithm encounters a node, it first checks if that node is already in the 'visited' set. If it is, the algorithm skips that node to avoid reprocessing it.

2.  **Why is a 'visited' set essential for DFS, especially with cycles?**
    Graphs can contain cycles, which are paths that lead back to a previously visited node. In our library example, an "illusion corridor" could be a connection that takes you from Room X back to Room Y, where Room Y was previously visited on the same path. Without a mechanism to detect already visited nodes, DFS would continuously traverse these cycles, leading to an infinite loop. The 'visited' set acts as a memory, telling the algorithm, "I've been here before, don't go down this path again in the current search branch."

3.  **Consequence of not using a 'visited' set in such graphs:**
    If a 'visited' set is not used in a graph with cycles, the DFS algorithm will fall into an infinite loop. It would keep traversing the cycle indefinitely, never reaching the goal node (if it's outside the cycle) and never terminating on its own. In Python, a recursive implementation would hit the interpreter's recursion limit and raise a `RecursionError` (the equivalent of a stack overflow), while an iterative implementation would loop forever, typically growing its explicit stack until memory runs out.

4.  **Reference to `dfs` and `dfs_with_logging`:**
    In our implemented `dfs` and `dfs_with_logging` functions, the check `if neighbor not in path:` serves this purpose for the *current search branch*. The `path` list, which is passed recursively, ensures that a node is not revisited if it is already part of the current path leading to the `current_node`. This effectively prevents cycles *within that specific branch* from causing infinite loops.

    *   In `dfs`:
        ```python
        if neighbor not in path:
            found_path = dfs(graph, neighbor, goal_node, path)
        ```
    *   In `dfs_with_logging`:
        ```python
        if neighbor not in path:
            print(f"Exploring neighbor {neighbor} from {current_node}.")
            found_path = dfs_with_logging(graph, neighbor, goal_node, path, visited_nodes)
        ```
    The `path` variable essentially acts as a 'visited' set for the current depth of recursion, preventing the algorithm from going back up the same path it just came from.

5.  **Global `visited_nodes` for overall graph exploration:**
    While the `path` variable tracks the visited state for the current branch only, a *global* `visited_nodes` set tracks every node visited across *all* search branches from the start. In our `dfs_with_logging` function, `visited_nodes` only records the nodes seen for potential later analysis; the actual cycle prevention still relies on `if neighbor not in path:`. Either check is enough to guarantee termination on a finite graph, but they suit different goals. The path-based check allows a node to be re-entered from a different branch, which is what enumerating *all* simple paths requires, at the cost of potentially repeated work (exponential in the worst case). A global `visited_nodes` check processes each reachable node at most once, which makes it the usual choice for reachability tests, full-graph traversal, or finding *a* path efficiently, but it cannot enumerate all paths because it never re-enters a node.
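
The two ends of this spectrum can be sketched side by side. In the block below, `dfs_no_check` performs no cycle check at all, and `dfs_global` skips any node already in a shared `visited` set. Both function names are assumptions introduced for illustration; neither appears in the original notebook.

```python
# Library graph copied from the first cell.
graph = {
    'H': ['A', 'B'],
    'A': ['H', 'C', 'D'],
    'B': ['H', 'E'],
    'C': ['A', 'F'],
    'D': ['A', 'G'],
    'E': ['B', 'I'],
    'F': ['C', 'J'],
    'G': ['D', 'K'],
    'I': ['E', 'L'],
    'J': ['F', 'M'],
    'K': ['G', 'N'],
    'L': ['I', 'S'],
    'M': ['J', 'S'],
    'N': ['K', 'S'],
    'S': ['L', 'M', 'N'],
}


def dfs_no_check(graph, node, goal):
    """DFS with no cycle check (demonstration only: do not use)."""
    if node == goal:
        return [node]
    for neighbor in graph.get(node, []):
        tail = dfs_no_check(graph, neighbor, goal)
        if tail:
            return [node] + tail
    return None


def dfs_global(graph, node, goal, visited=None):
    """DFS that enters each node at most once, using a shared visited set."""
    if visited is None:
        visited = set()
    visited.add(node)
    if node == goal:
        return [node]
    for neighbor in graph.get(node, []):
        if neighbor not in visited:
            tail = dfs_global(graph, neighbor, goal, visited)
            if tail:
                return [node] + tail
    return None


try:
    dfs_no_check(graph, 'H', 'S')
    outcome = "terminated"
except RecursionError:
    outcome = "RecursionError"  # H -> A -> H -> A -> ... never reaches S

seen = set()
route = dfs_global(graph, 'H', 'S', seen)

print(outcome)                # RecursionError
print(" -> ".join(route))     # H -> A -> C -> F -> J -> M -> S
print(sorted(seen))           # ['A', 'C', 'F', 'H', 'J', 'M', 'S']
```

On this graph the global-set variant returns the same route as the path-based `dfs`, because the first branch already reaches the goal. The difference would only show up on graphs where the search has to backtrack: the global set then prevents dead-end regions from being explored a second time.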

## Analyze DFS Completeness

### Subtask:
Analyze the completeness of DFS for this specific problem, considering the graph structure and the use of a visited set.


### Analysis of DFS Completeness

1.  **Definition of Completeness:**
    A search algorithm is considered **complete** if it is guaranteed to find a solution (a path to the goal state) if one exists. For an algorithm to be complete, it must explore all possible paths or nodes relevant to finding the goal without getting stuck in infinite loops or discarding potential solutions.

2.  **Role of the `path` variable (branch-specific visited set):**
    In our DFS implementations (`dfs` and `dfs_with_logging`), the `path` variable serves as a branch-specific 'visited set'. When exploring a neighbor, we check `if neighbor not in path`. This is crucial for preventing infinite loops in graphs that contain cycles. By adding the `current_node` to `path` at the beginning of each recursive call and only exploring neighbors *not already in the current `path`*, we ensure that DFS does not revisit nodes within the *same branch of exploration*. If a cycle is encountered, DFS will backtrack from the node that would form the cycle, rather than getting stuck in it. This mechanism ensures that each node in a given branch is visited only once, making the algorithm finite for any given branch.

3.  **Implications of a finite graph and absence of infinite-depth paths:**
    The library graph is explicitly defined as a finite set of nodes and connections (an adjacency list), so there are no infinite-depth paths. Because our DFS never revisits a node already on the current path, every path it follows is a simple path of at most 15 nodes (the number of rooms in the graph), and a finite graph has only finitely many simple paths. The search therefore always terminates, whether or not the goal is reachable. Since the goal node 'S' is reachable from the `start_node` 'H', it terminates by returning a path.

4.  **Summary for *this specific problem*:**
    Given the characteristics of our problem:
    *   The graph is **finite**.
    *   The goal node ('S') is **reachable** from the start node ('H').
    *   Our DFS implementation uses a `path` variable (a branch-specific visited set) that effectively handles **cycles** by preventing infinite loops along a single path.

    Therefore, the DFS algorithm, as implemented, **is complete for this specific problem**. It is guaranteed to find the 'Scroll' if a path exists because it systematically explores all possible paths in a depth-first manner without getting trapped in cycles and within a finite search space. Since we know a path exists, DFS will eventually discover one.
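
The termination half of this argument can be checked with a small experiment. The sketch below adds a hypothetical sealed room `'Z'` with no corridors (not part of the original library) and searches for it: a complete, terminating search should exhaust every simple path from `'H'` and then report failure rather than loop.

```python
# Library graph and dfs() copied from the cells above.
graph = {
    'H': ['A', 'B'],
    'A': ['H', 'C', 'D'],
    'B': ['H', 'E'],
    'C': ['A', 'F'],
    'D': ['A', 'G'],
    'E': ['B', 'I'],
    'F': ['C', 'J'],
    'G': ['D', 'K'],
    'I': ['E', 'L'],
    'J': ['F', 'M'],
    'K': ['G', 'N'],
    'L': ['I', 'S'],
    'M': ['J', 'S'],
    'N': ['K', 'S'],
    'S': ['L', 'M', 'N'],
}


def dfs(graph, current_node, goal_node, path=None):
    if path is None:
        path = []
    path = path + [current_node]
    if current_node == goal_node:
        return path
    for neighbor in graph.get(current_node, []):
        if neighbor not in path:
            found_path = dfs(graph, neighbor, goal_node, path)
            if found_path:
                return found_path
    return None


# Hypothetical variant: room 'Z' exists but no corridor leads to it.
sealed = dict(graph, Z=[])

reachable_result = dfs(sealed, 'H', 'S')
unreachable_result = dfs(sealed, 'H', 'Z')

print(reachable_result)    # ['H', 'A', 'C', 'F', 'J', 'M', 'S']
print(unreachable_result)  # None
```

The search for `'S'` succeeds as before, and the search for `'Z'` returns `None` after trying every simple path, which is the behavior the completeness argument relies on.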

## Compare DFS and BFS

### Subtask:
Compare the behavior and characteristics of DFS and Breadth-First Search (BFS) if both algorithms were applied to find the scroll in this library.
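
As one concrete point of comparison, the sketch below runs a standard queue-based BFS on the same library graph. The function name `bfs` and the `parent` dictionary used to reconstruct the route are assumptions made for this illustration; the original notebook does not include a BFS implementation.

```python
from collections import deque

# Library graph copied from the first cell.
graph = {
    'H': ['A', 'B'],
    'A': ['H', 'C', 'D'],
    'B': ['H', 'E'],
    'C': ['A', 'F'],
    'D': ['A', 'G'],
    'E': ['B', 'I'],
    'F': ['C', 'J'],
    'G': ['D', 'K'],
    'I': ['E', 'L'],
    'J': ['F', 'M'],
    'K': ['G', 'N'],
    'L': ['I', 'S'],
    'M': ['J', 'S'],
    'N': ['K', 'S'],
    'S': ['L', 'M', 'N'],
}


def bfs(graph, start, goal):
    """Breadth-first search; returns a fewest-edges path or None."""
    parent = {start: None}      # doubles as the visited set
    queue = deque([start])
    while queue:
        node = queue.popleft()  # FIFO: closest rooms are expanded first
        if node == goal:
            path = []
            while node is not None:   # walk parent links back to the start
                path.append(node)
                node = parent[node]
            return path[::-1]
        for neighbor in graph.get(node, []):
            if neighbor not in parent:
                parent[neighbor] = node
                queue.append(neighbor)
    return None


bfs_route = bfs(graph, 'H', 'S')
dfs_route = ['H', 'A', 'C', 'F', 'J', 'M', 'S']  # route reported by dfs() above

print(" -> ".join(bfs_route))                 # H -> B -> E -> I -> L -> S
print(len(bfs_route) - 1, len(dfs_route) - 1)  # 5 6 (corridors walked)
```

On this graph BFS returns the five-corridor route through `B`, one corridor shorter than the DFS route. The price is breadth: BFS dequeues all 15 rooms before it reaches `S`, whereas the DFS run above visited only the 7 rooms on its route. DFS keeps only the current path in memory, while BFS must hold an entire frontier, so the trade-off is shortest-path optimality against memory and, in favorable cases, the number of rooms visited.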


## Summary:

### Data Analysis Key Findings

*   **Graph Representation:** The library was successfully represented as an adjacency list, with 'H' (Hall) as the starting node and 'S' (Scroll) as the goal node.
*   **DFS Path Found:** The Depth-First Search (DFS) algorithm successfully identified a path from the starting node 'H' to the goal node 'S', which was `H -> A -> C -> F -> J -> M -> S`.
*   **DFS Suitability:** DFS is suitable for this problem because it efficiently explores deep paths to find *any* path to the 'Scroll', even though it does not guarantee finding the shortest path. This trade-off is acceptable as the primary objective is discovery, not optimal travel time.
*   **Role of 'Visited' Set:** The `path` variable within the DFS implementation effectively acts as a branch-specific 'visited' set (`if neighbor not in path`), preventing infinite loops when cycles are present in the graph. Without this mechanism, the algorithm could endlessly traverse cyclic paths.
*   **DFS Completeness:** For this specific problem, DFS is considered complete. This is due to the finite nature of the library graph, the reachability of the goal node 'S' from 'H', and the implementation's handling of cycles using the `path` variable, which prevents infinite loops and ensures all reachable paths are explored.

### Insights or Next Steps

*   **Comparative Analysis:** While DFS is suitable for finding *a* path, if the shortest path to the 'Scroll' were a critical requirement (e.g., in an emergency evacuation scenario), Breadth-First Search (BFS) would be a more appropriate algorithm to consider due to its completeness and optimality guarantee for unweighted graphs.
*   **Dynamic Graph Exploration:** For more complex scenarios, such as a library that dynamically changes (e.g., new corridors open, old ones close), an adaptive search algorithm or one that can handle dynamic graphs might be beneficial to maintain efficient pathfinding.
