![pythonLogo.png](attachment:pythonLogo.png)
# Depth-First Search (DFS) and Breadth-First Search (BFS) #

### Examples by Codecademy + commentary and exercise by N. Day - September 2022 ###

## This section introduces Depth-First Search and Breadth-First Search. It contains interactive code (there may be some errors for you to repair) and a multiple choice quiz. At the end there is also <font color="red">a logbook exercise for you to complete</font>  ##




# Reminder about Trees #
 - Trees have a root node (top of the tree), from which branches stem (child nodes). Nodes that have no children (descendants) are known as leaf nodes.

 ![Treesgif](http://www.thecrazyprogrammer.com/wp-content/uploads/2017/08/Tree-Data-Structure.gif)


 - Last week we contrasted the (non-binary) tree with the Binary Search Tree (BST). As with Binary Search, we saw that a BST has to be arranged in a particular way to enable logarithmic search. Each parent/root node can have at most two children - a left child with a value less than the parent, and a right child with a value greater than the parent.

 ![BST_arrange](https://blog.penjee.com/wp-content/uploads/2015/12/optimal-binary-search-tree-from-sorted-array.gif)
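
As a quick reminder of the ordering rule, here is a minimal sketch of a BST with insert and search. The class name `BSTNode` and its methods are illustrative assumptions, not last week's code, and the logarithmic behaviour only holds while the tree stays reasonably balanced.

```python
class BSTNode:
    """Illustrative BST node (hypothetical name, not last week's class)."""

    def __init__(self, value):
        self.value = value
        self.left = None    # subtree holding values less than self.value
        self.right = None   # subtree holding values greater than self.value

    def insert(self, value):
        # Smaller values go left, larger values go right; duplicates are ignored
        if value < self.value:
            if self.left is None:
                self.left = BSTNode(value)
            else:
                self.left.insert(value)
        elif value > self.value:
            if self.right is None:
                self.right = BSTNode(value)
            else:
                self.right.insert(value)

    def contains(self, value):
        # Each comparison discards one whole subtree
        node = self
        while node is not None:
            if value == node.value:
                return True
            node = node.left if value < node.value else node.right
        return False


bst_root = BSTNode(50)
for new_value in [30, 70, 20, 40, 60, 80]:
    bst_root.insert(new_value)

print(bst_root.contains(60))  # True
print(bst_root.contains(65))  # False
```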

In [1]:
class TreeNode:
  def __init__(self, value):
    self.value = value
    self.children = []

  def __repr__(self, level=0):
    # HELPER METHOD TO PRINT TREE!
    ret = "--->" * level + repr(self.value) + "\n"
    for child in self.children:
      ret += child.__repr__(level+1)
    return ret

  def add_child(self, child_node):
    self.children.append(child_node) 

### TEST CODE TO PRINT TREE
company = [
  "Monkey Business CEO", 
  "VP of Bananas", 
  "VP of Lazing Around", 
  "Associate Chimp", 
  "Chief Bonobo", "Produce Manager", "Tire Swing R & D"]
root = TreeNode(company.pop(0))
for count in range(2):
  child = TreeNode(company.pop(0))
  root.add_child(child)

root.children[0].add_child(TreeNode(company.pop(0)))
root.children[0].add_child(TreeNode(company.pop(0)))
root.children[1].add_child(TreeNode(company.pop(0)))
root.children[1].add_child(TreeNode(company.pop(0)))
print("MONKEY BUSINESS, LLC.")
print("=====================")
print(root)


MONKEY BUSINESS, LLC.
=====================
'Monkey Business CEO'
--->'VP of Bananas'
--->--->'Associate Chimp'
--->--->'Chief Bonobo'
--->'VP of Lazing Around'
--->--->'Produce Manager'
--->--->'Tire Swing R & D'



# Breadth-First Search (BFS) vs. Depth-First Search (DFS)

A breadth-first search is when you inspect every node on a level starting at the top of the tree and then move to the next level. A depth-first search is where you search deep into a branch and don’t move to the next one until you’ve reached the end. Each approach has unique characteristics but the process for each one is almost exactly the same. The only difference in their approach is how they store the nodes that need to be searched next. These nodes are known as the frontier.

## Queues and stacks

The queue and the stack are the two data structures that can be used for storing nodes in a search frontier. A queue follows “First In First Out” (FIFO) behavior, where the order the data goes into the queue is the order it leaves the queue. This is like any line you may have stood in to wait for the bus or to grab a cup of coffee.

A stack follows “Last In First Out” (LIFO) behavior which means that the most recent data added will be the first to leave. To get to a book at the bottom of a stack of books you must first remove the books that were more recently placed in the stack. The different behaviors of the queue and the stack will help define the behavior of the two search algorithms in this article.
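
The two behaviours can be seen side by side in a short sketch. It assumes `collections.deque` for the queue and a plain Python list for the stack; the items "A", "B" and "C" are just placeholders.

```python
from collections import deque

# Queue: items leave in the same order they arrived (FIFO)
queue = deque()
for item in ["A", "B", "C"]:
    queue.append(item)              # enqueue at the back
queue_order = []
while queue:
    queue_order.append(queue.popleft())  # dequeue from the front

# Stack: the most recently added item leaves first (LIFO)
stack = []
for item in ["A", "B", "C"]:
    stack.append(item)              # push onto the top
stack_order = []
while stack:
    stack_order.append(stack.pop())      # pop from the top

print("Queue (FIFO):", queue_order)  # ['A', 'B', 'C']
print("Stack (LIFO):", stack_order)  # ['C', 'B', 'A']
```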

## Breadth-First Search Using a Queue

Storing the frontier nodes in a queue creates the level-by-level pattern of a breadth-first search. Child nodes are searched in the order they are added to the frontier. The nodes on the next level are always behind the nodes on the current level. Breadth-first search is known as a complete algorithm: provided each node has a finite number of children, it will always locate the goal no matter how deep it is in the tree.

## Depth-First Search Using a Stack

Frontier nodes stored in a stack create the deep dive of a depth-first search. Nodes added to the frontier early on can expect to remain in the stack while their sibling’s children (and their children, and so on) are searched. Depth-first search is not considered a complete algorithm since searching an infinite branch in a tree can go on forever. In this situation, an entire section of the tree would be left un-inspected.
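
To see that the frontier really is the only difference, here is a minimal sketch of one traversal function that behaves as BFS or DFS depending on which end of the frontier it takes from. The dictionary `tree`, the function name `visit_order` and the `use_stack` flag are assumptions made for this illustration.

```python
from collections import deque

# Hypothetical tree for illustration: each key maps to its list of children
tree = {
    "A": ["B", "C"],
    "B": ["D", "E"],
    "C": ["F", "G"],
    "D": [], "E": [], "F": [], "G": [],
}

def visit_order(root, use_stack):
    frontier = deque([root])
    order = []
    while frontier:
        # The only difference: take from the back (stack) or the front (queue)
        node = frontier.pop() if use_stack else frontier.popleft()
        order.append(node)
        children = tree[node]
        # For the stack, push children reversed so the leftmost is popped first
        frontier.extend(reversed(children) if use_stack else children)
    return order

bfs_order = visit_order("A", use_stack=False)
dfs_order = visit_order("A", use_stack=True)
print("BFS:", bfs_order)  # ['A', 'B', 'C', 'D', 'E', 'F', 'G']
print("DFS:", dfs_order)  # ['A', 'B', 'D', 'E', 'C', 'F', 'G']
```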

## Path to the goal

It is important to note that it is not enough to find the node with the correct value. Once the goal node is found using either method of tree traversal, you must be able to provide the path of nodes from the root node to the goal node. This can be done in many ways from saving paths as you search down the tree to working with trees that can supply the path when needed.
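
One of the "many ways" is to record each node's parent as it is discovered and walk back up from the goal once it is found. The sketch below assumes a dictionary-based tree with unique values; `tree`, `find_path` and `parent` are hypothetical names used only for this example.

```python
from collections import deque

# Hypothetical tree for illustration: each key maps to its list of children
tree = {
    "Home": ["Documents", "Photos"],
    "Documents": ["WishList.txt", "TodoList.txt"],
    "Photos": ["Fluffy.jpg", "Spot.jpg"],
    "WishList.txt": [], "TodoList.txt": [], "Fluffy.jpg": [], "Spot.jpg": [],
}

def find_path(root, goal):
    parent = {root: None}       # remembers how each node was reached
    frontier = deque([root])
    while frontier:
        node = frontier.popleft()
        if node == goal:
            # Walk back up the parent links, then reverse to get root -> goal
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for child in tree[node]:
            parent[child] = node
            frontier.append(child)
    return None                 # goal is not in the tree

print(find_path("Home", "Fluffy.jpg"))  # ['Home', 'Photos', 'Fluffy.jpg']
```

Storing one parent link per node uses less memory than copying a whole path for every frontier entry, at the cost of a short walk back up the tree at the end.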

The location of the goal node has a significant impact on determining which search algorithm will be able to find the goal first. That is why these approaches are generally used as building blocks for more complex traversal algorithms. With more information on the location of the goal value in the tree, you can optimize the breadth-first search and depth-first search algorithms. Then they become powerful tools that can help you find that file you were looking for.

# Tree recap and intro to Breadth-First-Search

The breadth-first search is a tree traversal algorithm that searches a tree level by level. The search starts with the root node and works its way through every sibling node on a level before moving deeper into the tree. There are multiple ways to implement a breadth-first search. In this lesson, we will use an iterative approach that involves maintaining a collection of nodes in a frontier queue.

Here are the steps needed to implement this search algorithm:

- Import values into a tree data structure
- Determine the root node of the tree and the goal value to search for
- Create a queue of node paths that lead to the nodes that need to be searched
- Get a path from the queue to obtain the next node to search
- If the goal value isn’t found in the node, add a path to the queue for each of the node’s children

In this lesson, we’ll learn how to implement this algorithm in Python.

In [None]:
from tree import TreeNode
from bfs import bfs

sample_root_node = TreeNode("Home")
docs = TreeNode("Documents")
photos = TreeNode("Photos")
sample_root_node.children = [docs, photos]
my_wish = TreeNode("WishList.txt")
my_todo = TreeNode("TodoList.txt")
my_cat = TreeNode("Fluffy.jpg")
my_dog = TreeNode("Spot.jpg")
docs.children = [my_wish, my_todo]
photos.children = [my_cat, my_dog]

print(sample_root_node)
goal_path = bfs(sample_root_node, "Fluffy.jpg")
if goal_path is None:
  print("No path found")
else:
  print("Path found")
  for node in goal_path:
    print(node.value)

# Another Example of a Tree

In [None]:
from tree import TreeNode

sample_root_node = TreeNode("Home")
docs = TreeNode("Documents")
photos = TreeNode("Photos")
sample_root_node.children = [docs, photos]
my_wish = TreeNode("WishList.txt")
my_todo = TreeNode("TodoList.txt")
my_cat = TreeNode("Fluffy.jpg")
my_dog = TreeNode("Spot.jpg")
docs.children = [my_wish, my_todo]
photos.children = [my_cat, my_dog]

# Write your code below. 
print(sample_root_node)
goal_path = None
if goal_path is None:
  print("No path found")
else:
  print("Path found")
  for node in goal_path:
    print(node.value)

# The Breadth-First Search (BFS) Algorithm

Key point - BFS uses a QUEUE to store the frontier one level at a time.

- 1 - Add the root node to the queue
- 2 - Remove (dequeue) the node at the front of the queue and check it - is the root node the goal (target)? Remember that items are removed from the queue as they are checked
- 3 - If not, add the children of the root to the back of the queue
- 4 - Dequeue and check each node in turn - is the goal (target) one of the nodes in the queue?
- 5 - If not, add the children of each checked node to the back of the queue
- 6 - Repeat steps 4 and 5 until the goal is found or the queue is empty

<b>Revision question: In what order are items added and removed from the queue?</b>

![BFS](Breadth-First-Tree-Traversal.gif)


In [None]:
from collections import deque

# Breadth-first search function
def bfs(root_node, goal_value):

  # initialize frontier queue
  path_queue = deque()

  # add root path to the frontier
  initial_path = [root_node]
  path_queue.appendleft(initial_path)
  
  # search loop that continues as long as
  # there are paths in the frontier
  while path_queue:
    # get the next path and node 
    # then output node value
    current_path = path_queue.pop()
    current_node = current_path[-1]
    print(f"Searching node with value: {current_node.value}")

    # check if the goal node is found
    if current_node.value == goal_value:
      return current_path

    # add paths to children to the frontier
    for child in current_node.children:
      new_path = current_path[:]
      new_path.append(child)
      path_queue.appendleft(new_path)

  # return None if the goal is not found
  return None

# The Depth-First Search (DFS) Algorithm

Key point: DFS uses a STACK to store each descendant as it traverses each branch of the tree.

- 1 - Add the root node to the stack
- 2 - Check the contents of the stack - is the root node the goal (target)? Remember that items are removed from the stack once checked (popped off the stack)
- 3 - If not, add the children of the root to the stack
- 4 - Pop the last item from the top of the stack
- 5 - Check to see if it is the goal (target)
- 6 - If not, push (add) its children onto the stack (on top of the other node(s) from level 1)
- 7 - Pop them from the stack one by one to see if they are the goal
- 8 - If not, repeat steps 4 - 7 for each branch from level 1
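
The steps above can be sketched with an explicit stack, using a plain Python list where `append` pushes and `pop` removes from the top. This is one possible version rather than the Codecademy code: `DemoNode` is a hypothetical stand-in with the same `value` and `children` attributes as `TreeNode`, and `dfs_iterative` is an illustrative name.

```python
class DemoNode:
    # Hypothetical stand-in with the same attributes as TreeNode
    def __init__(self, value):
        self.value = value
        self.children = []

def dfs_iterative(root_node, goal_value):
    # Each stack entry is the full path from the root to a frontier node
    path_stack = [[root_node]]
    while path_stack:
        current_path = path_stack.pop()        # take the most recent path
        current_node = current_path[-1]
        if current_node.value == goal_value:
            return current_path
        # Push children reversed so the leftmost child is explored first
        for child in reversed(current_node.children):
            path_stack.append(current_path + [child])
    return None                                # goal is not in the tree

# Small sample tree (values are illustrative)
a = DemoNode("A")
b = DemoNode("B")
c = DemoNode("C")
d = DemoNode("D")
e = DemoNode("E")
f = DemoNode("F")
a.children = [b, c]
b.children = [d, e]
c.children = [f]

found_path = dfs_iterative(a, "F")
found_values = []
for node in found_path:
    found_values.append(node.value)
print(found_values)  # ['A', 'C', 'F']
```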

<b> Revision question: In what order are items added and removed from the stack?</b>

![DFS](Depth-First-Tree-Traversal.gif)


DFS is great for solving puzzles with one solution. In this maze, the idea is to explore one route fully (until a dead end or the goal is reached) before attempting other routes.

![DFS_maze](https://d18l82el6cdm1i.cloudfront.net/uploads/mf7THWHAbL-mazegif.gif)
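
The same idea can be sketched on a tiny grid maze. The layout, the symbols (`S` start, `G` goal, `#` wall, `.` open) and the function name `solve` are all assumptions made for this illustration.

```python
# Hypothetical maze: 'S' start, 'G' goal, '#' wall, '.' open cell
maze = [
    "S.#",
    ".##",
    "..G",
]

def solve(maze, row, col, visited=None):
    if visited is None:
        visited = set()
    # Off the grid, into a wall, or already explored: this route is a dead end
    if not (0 <= row < len(maze) and 0 <= col < len(maze[0])):
        return None
    if maze[row][col] == "#" or (row, col) in visited:
        return None
    visited.add((row, col))
    if maze[row][col] == "G":
        return [(row, col)]
    # Follow one direction as far as it goes before trying the next
    for d_row, d_col in [(0, 1), (1, 0), (0, -1), (-1, 0)]:
        rest = solve(maze, row + d_row, col + d_col, visited)
        if rest is not None:
            return [(row, col)] + rest
    return None

route = solve(maze, 0, 0)
print(route)  # [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2)]
```

The search first steps right into the dead end at `(0, 1)`, backtracks, and only then follows the route down the left-hand side.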

In [None]:
from TreeNode import TreeNode, sample_root_node, print_path, print_tree

print_tree(sample_root_node)

def dfs(root, target, path=()):
  path = path + (root,)

  if root.value == target:
    return path

  for child in root.children:
    path_found = dfs(child, target, path)

    if path_found is not None:
      return path_found

  return None
        
node = dfs(sample_root_node, "F")
print(node)

## <font color="red">Logbook Exercise 10</font> ##

Insert a 'code' cell below. In this do the following:

- 1 - First implement a standard (non-binary) tree class. Remember that nodes in a non-binary tree can have more than two children.
- 2 - Now insert nodes with values of single letters (A-Z) in your tree at a variety of levels. For this exercise, it would be best to avoid duplicates again. At least four levels are recommended.
- 3 - Now, copy the BFS code from above, and modify it so that it works with your Queue class that you have written previously. 
- 4 - Implement BFS on your sample tree, and direct it towards one of the nodes within your tree. Run this several times to check that it correctly finds the path to the target node. Feel free to print the tree so you can check whether it returns the correct path.
- 5 - Now, copy the DFS code from above, and modify it to work with your Stack class written previously. 
- 6 - Implement DFS on the same dataset. Check this correctly finds the path to the target node specified. 



## <font color="red">Logbook Exercise 11</font> ##

Your last logbook is a challenge! 

Your task: <b>Write a function called is_symmetrical that accepts a tree as an argument. This function should establish whether any tree (BST or non-binary) has the same number of nodes stored on each branch of the tree. The function should return True if the tree passed in is symmetrical, and should return False if not.</b>

Insert a 'code' cell below and write an algorithm to solve this task. You are advised to test this function on several trees of different sizes and contents. 

# References & Learning Resources #

 - W3Schools - there are many online resources for Python but the Python tutorial at https://www.w3schools.com/python/ is thorough, progressive, interactive and free. If you complete the main tutorial (skip the bits on installing Python as we will be using Anaconda/Jupyter) the later sections on **"File Handling"**, **"NumPy"** and **"Machine Learning"** are also relevant. The **"Exercises"** and **"Quiz"** sections are also worthwhile activities for consolidating knowledge.
 - **Phillips, D. (2015). Python 3 object-oriented programming. Packt Publishing Ltd.** Although a 3rd edition has been released the 2nd edition is still pretty much up-to-date  and seems to be widely available in PDF format. As an added bonus this covers Design Patterns in some detail.
 - **https://www.learnpython.org/** is another comprehensive and interactive resource
 - **https://docs.python.org/3.7/tutorial/** is Python's own text-based tutorial. Despite the seemingly daunting number of sub-sections, it can be consumed in a fairly short time and manages to be both concise and comprehensive.
 - **Think Python 2e** is an excellent in-depth and free version of the O'Reilly hardcopy by Allen B. Downey and is available here ... https://greenteapress.com/wp/think-python-2e/
 - https://www.sololearn.com/ - great for mobile learning on the go ... free! Recommended by JJ
 - I have also adapted examples from *Learn Python In A Day: The Ultimate Crash Course To Learning The Basics Of Python In No Time* by *Acodemy* but this is out of print and is only mentioned for completeness.