# Graph Traversals (Cont'd)

The implementation of DFS went by very quickly yesterday, so we'll take a second look with a few more examples.

In [37]:
%config InteractiveShell.ast_node_interactivity="none"

In [None]:
!wget https://raw.githubusercontent.com/jamcoders/syllabus-resources-2023/main/week3/lecs/boaz_utils.ipynb
%run "boaz_utils.ipynb"

## Trees
A **tree** is a connected undirected graph that contains no cycles.  This means that every pair of nodes has exactly one path between them.  This property makes trees well suited to understanding the DFS algorithm, because we do not need the `visited` dictionary.

So we'll take a look at some computations on trees to see the power of recursion, as well as gain a little more insight into the DFS traversal.


### Binary Trees
Let us consider directed graphs that otherwise resemble trees.  
- Each node will have at most 2 out-neighbours.  
- Each node will also have exactly 1 in-neighbour, except for one node called the **root**, which will have no in-neighbour. 
- A node with zero out-neighbours is called a **leaf**.

Examples of graphs that count as binary trees under this definition were developed collaboratively on the board during lecture.
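Since the board examples are not reproduced here, the following is a minimal sketch of how the degree conditions in the list above could be checked. It assumes a graph is stored as a list in which entry `i` holds the out-neighbours of node `i`; the function name `check_degree_conditions` and the sample graphs are hypothetical, not from the lecture. Note that it checks only the in- and out-neighbour counts, so it would not detect a cycle that is disconnected from the root.

```python
def check_degree_conditions(graph):
    """Check the neighbour-count conditions for a binary tree.

    Assumes graph[i] is the list of out-neighbours of node i.
    This is an illustrative helper, not part of the lecture code.
    """
    n = len(graph)
    in_degree = [0] * n
    for node in range(n):
        # At most 2 out-neighbours per node.
        if len(graph[node]) > 2:
            return False
        for nbr in graph[node]:
            in_degree[nbr] += 1
    # Exactly one node (the root) has no in-neighbour;
    # every other node has exactly one.
    return in_degree.count(0) == 1 and in_degree.count(1) == n - 1


print(check_degree_conditions([[1, 2], [], []]))        # True: root with two leaves
print(check_degree_conditions([[1], [2], []]))          # True: a simple chain
print(check_degree_conditions([[1, 2, 3], [], [], []])) # False: three out-neighbours
print(check_degree_conditions([[1, 2], [2], []]))       # False: node 2 has two in-neighbours
```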

#### Counting Descendants

In [38]:
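def count_reachable(tree, node):
    """Return the number of nodes reachable from node, counting node itself."""
    nbrs = graph_nbrs(tree, node)
    if len(nbrs) == 0:
        # A leaf reaches only itself.
        return 1
    elif len(nbrs) == 1:
        c_l = count_reachable(tree, nbrs[0])
        return c_l + 1
    else:
        # Two children: add both subtree counts, plus 1 for this node.
        c_l = count_reachable(tree, nbrs[0])
        c_r = count_reachable(tree, nbrs[1])
        return c_l + c_r + 1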


For expediency, we'll revert to our list representation of graphs, which makes typing in examples faster.

In [39]:
def graph_nodes(graph):
    return range(len(graph))

def graph_nbrs(graph, node):
    return graph[node]

def graph_size(graph):
    return len(graph)

In [40]:
g1 = [[1], [2, 3], [], []]
print(count_reachable(g1, 0))

4


In [41]:
g2 = [[1, 2], [3, 4], [5], [6, 7], [], [8, 9], [], [10], [], [], [11], []]
print(count_reachable(g2, 0))

12


#### Count Leaves
Try to develop a program that counts the number of _leaves_ that are reachable from the root.
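If you want to compare against your own attempt, here is one possible sketch. It follows the same three-case shape as `count_reachable`; the name `count_leaves` is our own choice, and `graph_nbrs` is restated so that the block runs on its own. The only changes are that a leaf counts as 1 and an internal node contributes nothing beyond what its children report.

```python
def graph_nbrs(graph, node):
    # List representation: graph[node] is the list of out-neighbours.
    return graph[node]


def count_leaves(tree, node):
    """Return the number of leaves reachable from node in a binary tree."""
    nbrs = graph_nbrs(tree, node)
    if len(nbrs) == 0:
        # The node is itself a leaf.
        return 1
    elif len(nbrs) == 1:
        return count_leaves(tree, nbrs[0])
    else:
        return count_leaves(tree, nbrs[0]) + count_leaves(tree, nbrs[1])


g1 = [[1], [2, 3], [], []]
g2 = [[1, 2], [3, 4], [5], [6, 7], [], [8, 9], [], [10], [], [], [11], []]
print(count_leaves(g1, 0))  # 2
print(count_leaves(g2, 0))  # 5
```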

### General Trees

If you have a general tree, how would you count the number of reachable nodes?

In [42]:
def count_reachable_gen(tree, node):
    nbrs = graph_nbrs(tree, node)
    if len(nbrs) == 0:
        return 1
    else:
        s = 0
        for nbr in nbrs:
            s += count_reachable_gen(tree, nbr)
        return s + 1


In [43]:
print(count_reachable_gen(g1, 0))

4


In [44]:
print(count_reachable_gen(g2, 0))

12


In [45]:
g3 = [[1, 2, 3], [4], [5, 6], [7], [8], [], [], [], []]
print(count_reachable_gen(g3, 0))

9


Observe how there is a recursive call _inside_ a for loop in `count_reachable_gen`.  This can seem scary if we come at it without preparation, but having built up to it, we can recognise the for loop as generalising over the number of out-neighbours, and the recursive call as the wishful thinking that we carry out once per out-neighbour.

Everything else we do inside and outside the for loop is about putting together the results of those "wishful thinking" processes (i.e. recursive calls) to produce the final answer that we were after.
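To see the same pattern with a different way of combining results, here is a small sketch that computes the height of a general tree, taken here to mean the number of edges on the longest downward path. The function `tree_height` is a hypothetical example, not something from the lecture, and `graph_nbrs` is restated so the block is self-contained. Inside the loop we keep the largest answer from the recursive calls; outside the loop we add 1 for the edge down to that child.

```python
def graph_nbrs(graph, node):
    # List representation: graph[node] is the list of out-neighbours.
    return graph[node]


def tree_height(tree, node):
    """Return the number of edges on the longest path from node down to a leaf."""
    nbrs = graph_nbrs(tree, node)
    if len(nbrs) == 0:
        return 0
    best = 0
    for nbr in nbrs:
        # Wishful thinking: assume the height of each subtree is computed for us.
        best = max(best, tree_height(tree, nbr))
    # Combine: one extra edge from this node to its tallest child.
    return best + 1


g1 = [[1], [2, 3], [], []]
g3 = [[1, 2, 3], [4], [5, 6], [7], [8], [], [], [], []]
print(tree_height(g1, 0))  # 2
print(tree_height(g3, 0))  # 3
```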

## DFS Implementation
Now we are ready to re-examine the DFS implementation. 

Here is the main code from yesterday:

In [None]:
def init_visited(graph):
    """Return a dictionary mapping each node in graph to False."""
    result = {}
    for node in graph_nodes(graph):
        result[node] = False
    return result

# This is just to get started
def dfs_find_reachable(g, start):
    visited = init_visited(g)
    reachable = []
    dfs_explore_reachable(g, start, visited, reachable)
    return reachable

# This one will do the hard work
def dfs_explore_reachable(g, node, visited, reachable):
    visited[node] = True
    reachable.append(node)
    for nbr in graph_nbrs(g, node):
        if not visited[nbr]:
            dfs_explore_reachable(g, nbr, visited, reachable)

### Discussion

Counting reachable nodes is the same thing as exploring, except that we also keep track of a count.  In our discussion so far, we have limited ourselves to trees, which have no cycles, so it was impossible to reach the same node twice by starting out along different out-neighbours of the root.

In a general graph, this will not be the case though.  It might be possible that if we start at one node, $u$, and explore from its first out-neighbour, we could get to some node $v$; and then later, when exploring from $u$'s second out-neighbour, we also get to the same node $v$.  If we counted reachable nodes in the way we did for trees, we would end up double counting $v$.  So, to avoid this, we have to keep track of a "register" of which nodes have already been visited.  We need to make sure that it is available to every recursive call, so that as it is updated, other recursive calls see those updates.
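The double counting can be seen on a small example. The sketch below uses a hypothetical "diamond" graph in which node 0 reaches node 3 through both node 1 and node 2, and a tree-style counter (named `count_tree_style` here, our own name) that has no record of visited nodes. The graph has 4 nodes, but the counter reports 5 because node 3 is counted once per route.

```python
def graph_nbrs(graph, node):
    # List representation: graph[node] is the list of out-neighbours.
    return graph[node]


def count_tree_style(graph, node):
    """Count reachable nodes the way we did for trees (no visited record)."""
    s = 0
    for nbr in graph_nbrs(graph, node):
        s += count_tree_style(graph, nbr)
    return s + 1


# Hypothetical example: 0 -> 1 -> 3 and 0 -> 2 -> 3.
diamond = [[1, 2], [3], [3], []]
print(len(diamond))                  # 4 nodes in total
print(count_tree_style(diamond, 0))  # 5, because node 3 is counted twice
```

On a graph that contains a cycle the situation is worse: this counter would keep recursing around the cycle until Python raises a `RecursionError`.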

Looking back at our DFS implementation, we can see a strong similarity to the process for counting reachable nodes. The main difference is the existence of a record of the nodes that have been visited (the `visited` dictionary), which is passed to every recursive call.  Notice also that we consult that record to avoid making the recursive call at any out-neighbours that have already been accounted for (i.e. visited).
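To make the similarity concrete, here is a sketch that merges the two ideas: it counts reachable nodes like the tree code, but shares a `visited` dictionary across recursive calls like the DFS code. The names `dfs_count_reachable` and `dfs_count_explore` and the two sample graphs are our own, and the helper functions are restated so the block runs on its own.

```python
def graph_nodes(graph):
    return range(len(graph))


def graph_nbrs(graph, node):
    return graph[node]


def dfs_count_reachable(g, start):
    """Return the number of nodes reachable from start, counting start itself."""
    visited = {node: False for node in graph_nodes(g)}
    return dfs_count_explore(g, start, visited)


def dfs_count_explore(g, node, visited):
    visited[node] = True
    count = 1  # this node
    for nbr in graph_nbrs(g, node):
        # Only recurse into out-neighbours that have not been accounted for.
        if not visited[nbr]:
            count += dfs_count_explore(g, nbr, visited)
    return count


# Hypothetical examples: a graph where two routes lead to node 3,
# and a graph containing the cycle 0 -> 1 -> 2 -> 0.
diamond = [[1, 2], [3], [3], []]
cycle = [[1], [2], [0, 3], []]
print(dfs_count_reachable(diamond, 0))  # 4
print(dfs_count_reachable(cycle, 0))    # 4
print(dfs_count_reachable(cycle, 3))    # 1
```

Because every call updates the same dictionary, a node marked by one recursive call is skipped by all later ones, so each node is counted exactly once and cycles no longer cause endless recursion.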