### Depth First Search -- DFS 

**Depth First Search (DFS)**, as the name implies, searches aggressively deep into a graph, backtracking only when it can go no further because every neighbor of the current node has already been explored.  Just like BFS, DFS will find all the nodes that are findable from a given starting point of a graph in linear O(m+n) time, where n is the number of nodes and m is the number of edges.  DFS, however, is uniquely suited to finding a Topological Ordering of a directed acyclic graph and to finding the Strongly Connected Components of a directed graph, two applications that BFS cannot handle.
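
To make the "deep first" behaviour concrete, here is a minimal sketch that records the order in which DFS and BFS first visit the nodes of a small graph. The graph and the helper names `dfs_order` and `bfs_order` are made up for illustration and are not part of the exercises.

```python
from collections import deque

# Hypothetical example graph (not from the lesson), as an adjacency list dictionary
graph = {
    'a': ['b', 'c'],
    'b': ['a', 'd'],
    'c': ['a', 'e'],
    'd': ['b'],
    'e': ['c'],
}

def dfs_order(G, s, seen=None, order=None):
    """Return nodes in the order a recursive DFS first visits them."""
    if seen is None:
        seen, order = set(), []
    seen.add(s)
    order.append(s)
    for v in G[s]:
        if v not in seen:
            dfs_order(G, v, seen, order)  # go deeper before trying the next neighbor
    return order

def bfs_order(G, s):
    """Return nodes in the order BFS first visits them."""
    seen, order, queue = {s}, [s], deque([s])
    while queue:
        u = queue.popleft()  # oldest node first (FIFO)
        for v in G[u]:
            if v not in seen:
                seen.add(v)
                order.append(v)
                queue.append(v)
    return order

print(dfs_order(graph, 'a'))  # ['a', 'b', 'd', 'c', 'e']
print(bfs_order(graph, 'a'))  # ['a', 'b', 'c', 'd', 'e']
```

DFS follows `a`, `b`, `d` to the end of that branch before it returns to `c`, while BFS visits both neighbors of `a` before moving a level deeper. Both visit the same set of nodes.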

When to use BFS and DFS:

Use BFS for:

    - finding shortest paths between nodes
    - when solutions are rare but shallow in graphs

Use DFS for:

    - finding Strongly Connected Components of directed graphs
    - finding a topological ordering of directed acyclic graphs
    - searching for solutions to mazes
    - when solutions are plentiful but deep within graphs
    
Use either BFS or DFS:

    - finding all nodes that can be found from a given node
    - finding the Connected Components of an undirected graph
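
As an illustration of one of the DFS-only uses listed above, the sketch below computes a topological ordering by reversing the order in which DFS finishes each node. The graph `dag` and the function name `topological_order` are hypothetical, and the sketch assumes the input really is acyclic (it does no cycle detection).

```python
# Hypothetical directed acyclic graph: an edge u -> v means "u must come before v"
dag = {
    'a': ['b', 'c'],
    'b': ['d'],
    'c': ['d'],
    'd': [],
}

def topological_order(G):
    """Return the nodes of a DAG so that every edge points forward in the list."""
    seen = set()
    finished = []  # nodes in the order DFS finishes them

    def visit(u):
        seen.add(u)
        for v in G[u]:
            if v not in seen:
                visit(v)
        finished.append(u)  # u finishes only after everything reachable from it

    for u in G:
        if u not in seen:
            visit(u)
    return finished[::-1]  # reverse finishing order

order = topological_order(dag)
print(order)  # ['a', 'c', 'b', 'd']
```

A DAG can have several valid orderings; this one depends on the order in which neighbors appear in the adjacency lists.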
    

   
Today you are going to code two versions of DFS to find all the findable nodes of an undirected graph.  You'll code a recursive version that uses the "call stack" to aggressively search the edges of each new node, and an iterative (looping) version that uses a Stack data structure following the Last In First Out (LIFO) principle.

### Exercise 1: Recursive DFS

Follow the explanation of DFS from the class lecture, or from the [Tim Roughgarden video](https://youtu.be/_9_VUNrWGUs), to implement a recursive version of Depth First Search that returns a set of all the findable nodes from a given start node.



In [1]:
# Write a recursive version of Depth First Search that returns the set of explored nodes

def DFS_recursive(G, s, E=None):
    '''
    input Graph G as adjacency list dictionary
    s starting point to begin search
    E set of points explored, default value of None creates a new empty set
    returns: the set of explored elements
    '''
    # a mutable default such as E=set() would be shared between calls,
    # so create a fresh set when none is passed in
    if E is None:
        E = set()
    E.add(s)
    for v in G[s]:
        # recursive call on each unexplored neighbor
        if v not in E:
            DFS_recursive(G, v, E=E)
    # base case: every neighbor of s has been explored
    return E




### Exercise 2: Iterative or Looping DFS

Sometimes, as with our Scrabble word graph, a recursive DFS might exceed the recursion depth that Python allows.  In this case, we'll want to use a loop instead.  To do this, you make use of a data structure called a STACK, which is a Last In First Out (LIFO) data structure.
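
The recursion-depth problem can be reproduced on purpose. The sketch below builds a hypothetical path graph (one long chain of nodes, not the Scrabble graph) that is slightly longer than Python's current recursion limit and shows that a recursive DFS raises `RecursionError` on it. The lowercase name `dfs_recursive` is just a local stand-in for illustration.

```python
import sys

def dfs_recursive(G, s, E=None):
    """Recursive DFS returning the set of nodes reachable from s."""
    if E is None:
        E = set()
    E.add(s)
    for v in G[s]:
        if v not in E:
            dfs_recursive(G, v, E)
    return E

# Hypothetical path graph 0 - 1 - 2 - ... - (n-1), longer than the recursion limit
n = sys.getrecursionlimit() + 100
path = {i: [j for j in (i - 1, i + 1) if 0 <= j < n] for i in range(n)}

try:
    dfs_recursive(path, 0)
    hit_limit = False
except RecursionError:
    hit_limit = True  # the chain needs one nested call per node

print(sys.getrecursionlimit())
print(hit_limit)  # True
```

Raising the limit with `sys.setrecursionlimit` is possible, but a loop with an explicit stack avoids the issue entirely, which is the approach taken in this exercise.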

Again, you'll use the deque class from the collections module for your Stack, as it allows O(1) appends to the top of the stack and O(1) removal from the top of the stack.

Here is an example of adding elements to and removing elements from a deque stack.

```python
from collections import deque
stack = deque()
stack.append('some node') # adds a node to the top of the stack
stack.pop() # removes the last node added to the stack
```
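
To see the LIFO behaviour with more than one element, here is a small sketch comparing a deque used as a stack (`append` / `pop`) with a deque used as a queue (`append` / `popleft`). The list `items` is just illustrative data.

```python
from collections import deque

items = ['a', 'b', 'c']

# Stack: push onto the right end, pop from the same end (LIFO)
stack = deque()
for x in items:
    stack.append(x)
lifo = [stack.pop() for _ in range(len(items))]

# Queue: push onto the right end, remove from the left end (FIFO)
queue = deque()
for x in items:
    queue.append(x)
fifo = [queue.popleft() for _ in range(len(items))]

print(lifo)  # ['c', 'b', 'a']
print(fifo)  # ['a', 'b', 'c']
```

The only difference between the two is which end the elements are removed from, and that is also the only structural difference between iterative DFS and BFS.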

Follow the explanation from class and the outline below to implement the iterative/looping version of DFS.

In [2]:
# Write an iterative version of DFS using a Stack
# Again you can use deque from the collections module which can .pop from and .append to the
# "TOP" of the stack in O(1)
from collections import deque

def DFS_iter(G, s):
    '''
    DFS of Graph G from point s
    return set of all explored nodes
    '''
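    E = set()  # explored nodes

    # initialize a stack holding only the start node
    stack = deque([s])
    while stack:
        i = stack.pop()  # take the most recently added node (LIFO)
        if i in E:
            continue  # a node can be pushed more than once; skip repeats
        E.add(i)  # mark i as explored when it is actually visited
        for v in G[i]:
            if v not in E:
                stack.append(v)

    return E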
    

In [3]:
# small test graph for checking your functions

G = {'s':['a','b'],
    'a': ['s','b','c'],
    'b':['s','c','d'],
    'c':['a','b','d'],
    'd':['b','c','e'],
    'e':['c','d'],
     'f':['g','h'],
     'g':['f'],
     'h':['f']
    }


for node in ['f','s']:
    print(DFS_recursive(G, node, E=set()), 1)
    print(DFS_iter(G, node), 2)

{'g', 'h', 'f'} 1
{'g', 'h', 'f'} 2
{'e', 'a', 'd', 's', 'b', 'c'} 1
{'e', 'a', 'd', 's', 'b', 'c'} 2


### Part 2: Connected Components of Undirected Graphs

A connected component of a graph consists of all the elements that can be reached from a given starting point.  If there are two nodes that are not connected by any path in an undirected graph, then these nodes must be part of different Connected Components.

A connected component can be thought of as an island of nodes that are not connected to the other nodes of a given graph.  For example, in the 4_letter_word_graph you created, the following 3 words constitute a connected component of the graph: `{'AMBO', 'UMBO', 'AMMO'}`  Those three words are connected to each other, but there is NO path from these words to any other words in the 4_letter_word_graph.

Since DFS and BFS will both find all the findable nodes from a given starting node, to find the Connected Components of a graph you only need to embed DFS or BFS inside a loop over all the nodes of the graph, keeping track of all the nodes you've seen so far and calling DFS or BFS on each node that has not been seen already.

You can review this idea by re-watching the recording of class posted in Classroom, or by watching this [7-minute Tim Roughgarden video on Connected Components](https://youtu.be/vHqaiQlOzOw).
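
The loop described above can be sketched as follows. This is one possible shape under stated assumptions, not the required solution: the toy graph and the names `reachable` and `connected_components` are hypothetical, and the inner search is an iterative DFS, though BFS would work equally well.

```python
from collections import deque

def reachable(G, s):
    """Iterative DFS: return the set of nodes reachable from s."""
    seen, stack = set(), deque([s])
    while stack:
        u = stack.pop()
        if u in seen:
            continue
        seen.add(u)
        stack.extend(v for v in G[u] if v not in seen)
    return seen

def connected_components(G):
    """Return a list of sets, one per connected component of undirected graph G."""
    seen = set()
    components = []
    for node in G:
        if node not in seen:           # node belongs to a component not found yet
            comp = reachable(G, node)
            components.append(comp)
            seen |= comp               # never start a search inside this component again
    return components

# Hypothetical toy graph with three "islands"
toy = {
    'a': ['b'], 'b': ['a', 'c'], 'c': ['b'],
    'd': ['e'], 'e': ['d'],
    'f': [],
}

comps = connected_components(toy)
print(len(comps))                     # 3
print(sorted(len(c) for c in comps))  # [1, 2, 3]
```

Each node is explored by exactly one search, so the whole loop still runs in O(m+n) time.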


### Exercise 3
Write a loop or a function containing a loop to find the answer to this question:

How many different connected components exist in the 4_letter_word_graph?

In [29]:
def one_away(word1, word2):
    '''
    determines if 2 words of the same length differ in exactly 1 letter
    inputs: 2 words
    outputs: whether or not the words are 1 away
    '''
    if word1 == word2:
        return False
    if len(word1) == len(word2):
        dif = False
        for i in range(len(word1)):
            # using the dif variable as an extra strike before the out
            if word1[i] != word2[i]:
                if dif:
                    return False
                dif = True
        return True
    return False


def create_dictionary_graph(word_length=4):
    '''reads the word_length words from the sowpods.txt file and 
    return an adjacency list dictionary of the form
    key: word, value: list of words that are one_away from that word'''
    with open("sowpods.txt","r") as sowpods:
        # may not run as fast, but looked fancier, don't flame me please
        words = [l.strip().upper() for l in sowpods if len(l.strip().upper()) == word_length]
    letter_graph, count, length = {}, 0, len(words)
    for w in words:
        letter_graph[w] = []
        for o in words:
            if one_away(w, o):
                letter_graph[w].append(o)
        count += 1
        if not count%500:
            print(count/length)
    return letter_graph


def write_to_file(l):
    '''
    only need to tell the program how long you want the words to be
    input: word length
    output: none, but writes/creates a file
    '''
    d = create_dictionary_graph(l)
    with open(f'{l}_letter_graph.txt', 'w') as file:
        for w in d:
            file.write(f"{w}")
            for w_ in d[w]:
                file.write(f" {w_}")
            # for formatting
            file.write("\n")
    print(f'done for {l}')


# for i in range(4, 8):
#     write_to_file(i)


def load_graph(file = "4_letter_graph.txt"):
    '''
    Reads a file in and returns a graph in the form of an adjacency list dictionary
    inputs: file name
    outputs: the adjacency list dictionary
    '''
    letter_dict = {}
    with open(file, 'r') as f:
        for line in f:
            words = line.split()
            if not words:
                continue  # skip blank lines
            # first word is the node, the rest are its neighbors
            letter_dict[words[0]] = words[1:]
    return letter_dict


for length in range(4, 8):
    exec(f"letter_graph_{length} = load_graph('{length}_letter_graph.txt')")

In [39]:
# your code 
from collections import deque


def BFS(G, s):
    '''
    Breadth First Search of an adjacency list Graph
    inputs: graph of words, starting node
    outputs: the set of all explored
    
    '''
    E = set([s])
    # starts with s as the starting node
    Q = deque([s])
    while Q:
        v = Q.popleft()
        for w in G[v]:
            if w not in E:
                Q.append(w)
                E.add(w)


    return E # the set of all the found nodes



def connected(G):
    '''
    takes in graph and finds the different connected groups
    inputs: the graph
    outputs: connected sections amount
    '''
    E = set()
    sectors = []  # making this here in case the user wants the different sectors returned
    for i in G:
        if i not in E:
            # finding all the connected
            explored = BFS(G, i)
            sectors.append(explored)
            # adding them to explored
            E.update(explored)
    return len(sectors)


# your answer to the question
print(connected(letter_graph_4))  # 67

67


### Exercise 4

What is the size of the second largest Connected Component of the 7_letter_word_graph and what are all the words in that Connected Component?



In [4]:
# your code for number 4
def connected_alt(G):
    '''
    takes in graph and finds the different connected groups
    inputs: the graph
    outputs: connected sections
    '''
    E = set()
    sectors = []  # told you
    for i in G:
        if i not in E:
            # finding all the connected
            explored = BFS(G, i)
            sectors.append(explored)
            # adding them to explored
            E.update(explored)
    return sectors


# group the components by their size
len_dict = {}
for sector in connected_alt(letter_graph_7):
    len_dict.setdefault(len(sector), []).append(sector)

# your answer to number 4
second_largest = sorted(len_dict)[-2]  # second largest component size
print(second_largest)
print(len_dict[second_largest])

NameError: name 'letter_graph_7' is not defined

### Exercise 5:  LOOM Videos

Complete Loom videos explaining BFS, DFS and your Adjacency List Graph Representation.  Add these links to the [ALGORITHMS DOC](https://docs.google.com/spreadsheets/d/1QgLD9CET85d9O7AMwoSCk7IIS5JaTNzUc2upwfNlVxM/edit?usp=sharing)