# Chapter 6: Breadth-first search

This chapter introduces graphs, breadth-first search, and topological sort.  Put simply, breadth-first search solves the shortest-path problem on an unweighted graph: it finds the path with the fewest edges between two nodes.

## What is a graph

A graph models a set of connections.  It is made up of nodes and edges: the nodes are the items, and the edges are the lines connecting the items.

## Breadth-first search

Breadth-first search can be used to answer two main questions: 1) is there a path from node A to node B; and 2) if a path exists between node A and node B, what's the shortest path?

In answering the first question (is there a path from node A to node B?), we first check all of the first-degree connections before checking second-degree connections, and so on.  Consequently, the order in which you add people to the search list matters: people must be checked in the order they were added.

In breadth-first search, because we're going to be searching broadly and checking all of the first-degree connections before others, we'll use a data structure called a queue.  A queue is similar to the stack that we looked at in previous chapters to understand how recursive functions are handled behind the scenes: items are kept in the order they were added, and we are not able to access a random element in the data structure.  However, unlike a stack, which is LIFO (last in, first out), a queue is FIFO (first in, first out).

With stacks, the only operations were push and pop, which add an item to the top of the stack and remove an item from the top of the stack.  In contrast, with queues, we only have the enqueue and dequeue operations, which add an item to the end of the queue and remove an item from the start of the queue, respectively.
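
To make the LIFO/FIFO difference concrete, here is a minimal sketch that pushes the same three items onto a stack and a queue and then removes them.  It assumes a plain Python list as the stack and `collections.deque` as the queue; the item names are made up for illustration.

```python
from collections import deque

items = ['first', 'second', 'third']

# Stack (LIFO): push and pop both work on the same end.
stack = []
for item in items:
    stack.append(item)                           # push
stack_order = [stack.pop() for _ in items]       # pop

# Queue (FIFO): enqueue at the end, dequeue from the start.
queue = deque()
for item in items:
    queue.append(item)                           # enqueue
queue_order = [queue.popleft() for _ in items]   # dequeue

print(stack_order)  # ['third', 'second', 'first']
print(queue_order)  # ['first', 'second', 'third']
```

The queue hands items back in the order they were added, which is what lets breadth-first search finish all first-degree connections before moving on to second-degree ones.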

## Exercises


**6.1) Find the length of the shortest path from start to finish.**

The length of the shortest path with this example is 2.

**6.2) Find the length of the shortest path from "cab" to "bat".**

The length of the shortest path here is also 2-- "cab" to "cat" to "bat" (finish).

## Implementing the graph

We can implement a graph using a hash table: each key is a node (for example, "you", the starting point), and its value is an array of that node's first-degree neighbors.

In [1]:
# Implement the graph using a hash table
graph = {}
graph['you'] = ['alice', 'bob', 'claire']

# Add neighbors for bob
graph['bob'] = ['anuj', 'peggy']

# Add neighbors for alice
graph['alice'] = ['peggy']

# Add neighbors for claire
graph['claire'] = ['thom', 'jonny']

# Add empty neighbor arrays 
graph['anuj'] = []
graph['peggy'] = []
graph['jonny'] = []
graph['thom'] = []

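The answer to the pop quiz is no: the order in which the key/value pairs are added doesn't matter.  The graph is always looked up by key, and a hash table, unlike an array, doesn't depend on the position of its entries.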
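
As a quick check of the pop quiz answer, the sketch below builds the same two entries in opposite orders and compares the results.  The names `graph_a` and `graph_b` are only for illustration.

```python
# Same key/value pairs, added in a different order.
graph_a = {}
graph_a['you'] = ['alice', 'bob', 'claire']
graph_a['bob'] = ['anuj', 'peggy']

graph_b = {}
graph_b['bob'] = ['anuj', 'peggy']
graph_b['you'] = ['alice', 'bob', 'claire']

# Dictionary equality and key lookups ignore insertion order.
same_graph = graph_a == graph_b
print(same_graph)                          # True
print(graph_a['you'] == graph_b['you'])    # True
```

Note that the order of names inside each neighbor list is a different matter: it does determine the order in which those neighbors are added to the search queue.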


## Implementing the algorithm

We'll implement the algorithm by keeping a queue of all of the people to check.  We'll first dequeue (pop) the first person off the queue and check if they are a mango seller.  If yes, we're done; if not, we add all of their neighbors to the end of the queue.  This process repeats in a loop until the queue is empty.

In [22]:
# Import collections, which provides deque (a double-ended queue)
import collections

# Create a new queue
search_queue = collections.deque()

# Add all of your first-degree neighbors to the search queue
search_queue += graph['you']

# View the current search queue
search_queue

deque(['alice', 'bob', 'claire'])

Now we'll implement the rest of the algorithm to find the mango sellers.  We'll assume that anyone with "m" as the last letter in their name is a mango seller.  First, we'll build a while loop that will iterate over the queue while it isn't empty.  We'll pop a person off the front of the queue and check to see if they're a mango seller.  If they are, we'll print `True` and stop, since our job is done!  If not, we'll add that person's neighbors to the end of the queue and repeat.  I've added a few print statements to better track the people in the queue for each iteration.

In [23]:
while search_queue:
    print('----------------------------------\n')
    print(f'Queue is currently: {search_queue}')
    person = search_queue.popleft()
    print(f'Checking if {person} is a mango seller...')
    if person[-1] == 'm':
        print(f'{person} is a mango seller!')
        print(True) # if True, you're done
        break
    else:
        print(f'{person} is not a mango seller, adding their neighbors {graph[person]}.\n')
        search_queue += graph[person]

----------------------------------

Queue is currently: deque(['alice', 'bob', 'claire'])
Checking if alice is a mango seller...
alice is not a mango seller, adding their neighbors ['peggy'].

----------------------------------

Queue is currently: deque(['bob', 'claire', 'peggy'])
Checking if bob is a mango seller...
bob is not a mango seller, adding their neighbors ['anuj', 'peggy'].

----------------------------------

Queue is currently: deque(['claire', 'peggy', 'anuj', 'peggy'])
Checking if claire is a mango seller...
claire is not a mango seller, adding their neighbors ['thom', 'jonny'].

----------------------------------

Queue is currently: deque(['peggy', 'anuj', 'peggy', 'thom', 'jonny'])
Checking if peggy is a mango seller...
peggy is not a mango seller, adding their neighbors [].

----------------------------------

Queue is currently: deque(['anuj', 'peggy', 'thom', 'jonny'])
Checking if anuj is a mango seller...
anuj is not a mango seller, adding their neighbors [].

--

This is a good implementation, but we also need to make sure that we don't check the same person twice, as happened with peggy above: she is a neighbor of both alice and bob, so she was added to the queue twice.  We'll use the same basic structure of the algorithm we've already written, but also keep an array of people that we've already searched, sort of like setting a "visited" flag for each node in the graph.

In [26]:
def bfs(name):
    search_queue = collections.deque()
    search_queue += graph[name]
    searched = [] # keep track of people already searched
    
    while search_queue:
        print('----------------------------------\n')
        print(f'Queue is currently: {list(search_queue)}')
        person = search_queue.popleft()
        if person not in searched:
            print(f'Check if {person} is a mango seller')
            if person[-1] == 'm':
                print(f'{person} is a mango seller!')
                return True
            else: 
                print(f'{person} is not a mango seller, adding their neighbors {graph[person]}\n')
                search_queue += graph[person]
                searched.append(person) # mark this person as searched so they aren't checked again
    return False # no mango seller was found
    

In [27]:
# Test using 'you' as the start
bfs('you')

----------------------------------

Queue is currently: ['alice', 'bob', 'claire']
Check if alice is a mango seller
alice is not a mango seller, adding their neighbors ['peggy']

----------------------------------

Queue is currently: ['bob', 'claire', 'peggy']
Check if bob is a mango seller
bob is not a mango seller, adding their neighbors ['anuj', 'peggy']

----------------------------------

Queue is currently: ['claire', 'peggy', 'anuj', 'peggy']
Check if claire is a mango seller
claire is not a mango seller, adding their neighbors ['thom', 'jonny']

----------------------------------

Queue is currently: ['peggy', 'anuj', 'peggy', 'thom', 'jonny']
Check if peggy is a mango seller
peggy is not a mango seller, adding their neighbors []

----------------------------------

Queue is currently: ['anuj', 'peggy', 'thom', 'jonny']
Check if anuj is a mango seller
anuj is not a mango seller, adding their neighbors []

----------------------------------

Queue is currently: ['peggy', 'thom'

True
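
The function above answers only the first question (is there a path?).  As a rough sketch of how the second question could be handled, the version below stores each person in the queue together with their distance from the start and returns the first seller found along with that distance.  The graph is repeated so the cell runs on its own; `is_seller` and `shortest_distance_to_seller` are names made up for this sketch, not from the textbook.

```python
from collections import deque

graph = {
    'you': ['alice', 'bob', 'claire'],
    'bob': ['anuj', 'peggy'],
    'alice': ['peggy'],
    'claire': ['thom', 'jonny'],
    'anuj': [], 'peggy': [], 'jonny': [], 'thom': [],
}

def is_seller(name):
    # Same assumption as above: a name ending in "m" is a mango seller.
    return name[-1] == 'm'

def shortest_distance_to_seller(start):
    # Each queue entry is (person, number of edges from start).
    search_queue = deque((neighbor, 1) for neighbor in graph[start])
    searched = {start}  # a set gives fast "already searched?" checks
    while search_queue:
        person, distance = search_queue.popleft()
        if person in searched:
            continue
        searched.add(person)
        if is_seller(person):
            return person, distance
        search_queue.extend((n, distance + 1) for n in graph[person])
    return None  # no mango seller is reachable from start

result = shortest_distance_to_seller('you')
print(result)  # ('thom', 2)
```

Because the queue is processed in first-in, first-out order, every person at distance 1 is checked before anyone at distance 2, so the first seller found is also one of the closest.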

## Exercises

**6.3) For the example morning routine, which of the three lists are valid and which are invalid?**

The first list is invalid, because eating breakfast depends on brushing your teeth.  The second list is valid, because showering doesn't depend on anything else besides waking up, and brushing teeth comes before eating breakfast.  The last list is invalid, since we can't shower before we wake.
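
The reasoning above can be sketched as a small check that walks a list and confirms every task's dependencies come before it.  The dependency table is an assumption pieced together from the answer above (including that brushing teeth depends on waking up), and the three lists are illustrative stand-ins for the ones in the textbook.

```python
# Each task maps to the tasks that must already be done (assumed dependencies).
depends_on = {
    'wake up': [],
    'shower': ['wake up'],
    'brush teeth': ['wake up'],
    'eat breakfast': ['brush teeth'],
}

def is_valid_order(order):
    done = set()
    for task in order:
        # A task is only allowed once all of its dependencies are done.
        if any(dep not in done for dep in depends_on[task]):
            return False
        done.add(task)
    return True

list_1 = ['wake up', 'shower', 'eat breakfast', 'brush teeth']
list_2 = ['wake up', 'brush teeth', 'eat breakfast', 'shower']
list_3 = ['shower', 'wake up', 'brush teeth', 'eat breakfast']

print(is_valid_order(list_1))  # False: breakfast comes before brushing teeth
print(is_valid_order(list_2))  # True
print(is_valid_order(list_3))  # False: shower comes before waking up
```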

**6.4) Below is a valid list for the graph in the textbook.**

1) Wake up

2) Exercise

3) Brush teeth

4) Pack lunch

5) Shower

6) Eat breakfast

7) Get dressed

**6.5) Which of the following graphs are also trees?**

A tree is a special type of graph in which no edges ever point back.  In the examples provided, A is a tree.  C is also a tree because no edges ever point back; it is just drawn in a slightly different way.
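
One way to test the "no edges ever point back" condition in code is a depth-first walk that reports whether any edge leads to a node still on the current path.  This is only a sketch of that single condition, using two made-up graphs rather than the textbook's A, B, and C; `has_back_edge` is a hypothetical helper name.

```python
def has_back_edge(graph):
    """Return True if some edge points back to a node on the current path."""
    visiting = set()  # nodes on the path currently being explored
    done = set()      # nodes whose edges have all been explored

    def visit(node):
        if node in visiting:
            return True   # this edge points back to an ancestor
        if node in done:
            return False
        visiting.add(node)
        for neighbor in graph.get(node, []):
            if visit(neighbor):
                return True
        visiting.remove(node)
        done.add(node)
        return False

    return any(visit(node) for node in graph)

# Hypothetical examples: one tree-shaped graph and one with a cycle.
tree_like = {'a': ['b', 'c'], 'b': ['d'], 'c': [], 'd': []}
with_cycle = {'a': ['b'], 'b': ['c'], 'c': ['a']}

print(has_back_edge(tree_like))   # False
print(has_back_edge(with_cycle))  # True
```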