# Python Algorithms
## Chapter 5 Traversal: The Skeleton Key of Algorithmics
+ traversal: discovering, and later visiting, all the nodes in a graph.   

Consider the problem of finding the connected components of a graph. A graph is connected if there is a path from each node to each of the others, and the connected components are the maximal subgraphs that are connected.  
One way of finding a connected component would be to start at some place in the graph and gradually grow a larger connected subgraph until we can’t get any further. Let’s look at the following related problem. Show that you can order the nodes in a connected graph, $v_1, v_2, \dots, v_n$, so that for any $i = 1, \dots, n$, the subgraph over $v_1, \dots, v_i$ is connected. If we can show this and we can figure out how to do the ordering, we can go through all the nodes in a connected component and know when they’re all used up.  
The proof is by induction. A single node is trivially connected, so we need to get from $i-1$ to $i$. We know that the subgraph over the first $i-1$ nodes is connected. Because the whole graph is connected, there is a path between any pair of nodes, so consider a node $u$ among the first $i-1$ nodes and a node $v$ in the remainder. On the path from $u$ to $v$, consider the last node that is in the component we’ve built so far, as well as the node right after it, which must lie outside the component. Let’s call them $x$ and $y$. They are adjacent on the path, so there is an edge between them, and adding $y$ to the nodes of our growing component keeps it connected.
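The ordering argument above can be checked on a small example. The following is a minimal sketch, not the book's code: `connected_order`, `is_connected`, and `G_demo` are hypothetical names introduced here, and neighbors are visited in sorted order only to make the result reproducible.

```python
def connected_order(G, start):
    """Order the nodes reachable from start so that every prefix is connected.

    Each node is appended only when it is adjacent to a node already in the
    order, which is exactly the inductive step of the argument.
    """
    order, seen = [start], {start}
    i = 0
    while i < len(order):
        u = order[i]
        i += 1
        for v in sorted(G[u]):  # sorted only for a reproducible order
            if v not in seen:
                seen.add(v)
                order.append(v)
    return order


def is_connected(G, nodes):
    """Check whether the subgraph of G induced by nodes is connected."""
    nodes = set(nodes)
    if not nodes:
        return True
    start = next(iter(nodes))
    stack, seen = [start], {start}
    while stack:
        u = stack.pop()
        for v in G[u] & nodes:  # follow only edges that stay inside the subgraph
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen == nodes


# Hypothetical demo graph (undirected, adjacency sets)
G_demo = {
    'a': set('e'),
    'b': set('efg'),
    'c': set('df'),
    'd': set('cf'),
    'e': set('abf'),
    'f': set('bcde'),
    'g': set('b'),
}

order = connected_order(G_demo, 'a')
print(order)  # ['a', 'e', 'b', 'f', 'g', 'c', 'd']
print(all(is_connected(G_demo, order[:i]) for i in range(1, len(order) + 1)))  # True
```

Every prefix of `order` induces a connected subgraph, which is the property the induction establishes.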


In [15]:
G = {
    'a':set('e'),
    'b':set('efg'),
    'c':set('df'),
    'd':set('cf'),
    'e':set('abf'),
    'f':set('bcde'),
    'g':set('b'),
    '1':set('23'),
    '2':set('13'),
    '3':set('12')
}

In [12]:
def walk(G,start='d'):
    visited,ToVisit = dict(), set()
    ToVisit.add(start)
    visited[start] = None
    while ToVisit:
        u = ToVisit.pop()
        for v in G[u].difference(visited):
            ToVisit.add(v)
            visited[v] = u  #the parent node of this node in the traversal tree
    return visited
walk(G)


{'d': None, 'c': 'd', 'f': 'd', 'e': 'f', 'b': 'f', 'a': 'e', 'g': 'b'}

In [17]:
def components(G): #Find all connected components
    comp = []
    seen = set()
    for node in G:
        if node in seen:
            continue
        C = walk(G,node)
        seen.update(C)
        comp.append(C)
    return comp
components(G)

[{'a': None, 'e': 'a', 'f': 'e', 'b': 'e', 'c': 'f', 'd': 'f', 'g': 'b'},
 {'1': None, '2': '1', '3': '1'}]

### A Walk in the Park
#### No Cycles Allowed
#### How to Stop Walking in Circles
+ start walking in any direction, backtracking whenever you come to a dead end or an intersection you have already walked through

In [42]:
G = {
    'a':set('e'),
    'b':set('efg'),
    'c':set('df'),
    'd':set('cf'),
    'e':set('abf'),
    'f':set('bcde'),
    'g':set('b')
}

In [27]:
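def DFS_rec(G,start,visited=None):
    if visited is None: # a mutable default ([]) would be shared between separate top-level calls
        visited = []
    visited.append(start)
    print(visited)
    for n in G[start]:
        if n in visited:
            continue
        DFS_rec(G,n,visited)
DFS_rec(G,'b')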

['b']
['b', 'e']
['b', 'e', 'a']
['b', 'e', 'a', 'f']
['b', 'e', 'a', 'f', 'd']
['b', 'e', 'a', 'f', 'd', 'c']
['b', 'e', 'a', 'f', 'd', 'c', 'g']


#### Go Deep!

In [30]:
def DFS_it(G,start):
    visited, ToVisit = set(),[]
    ToVisit.append(start)
    while ToVisit:
        u = ToVisit.pop()
        if u in visited:
            continue
        visited.add(u)
        ToVisit.extend(G[u])
        yield u
list(DFS_it(G,'a'))

['a', 'e', 'b', 'g', 'f', 'c', 'd']

#### Depth-First Timestamps and Topological Sorting

In [39]:
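def DFS_time(G,d=None,f=None,start='a',visited=None,t=0): #DFS with timestamps
    # create fresh containers per top-level call (mutable defaults are shared between calls)
    if d is None: d = dict()
    if f is None: f = dict()
    if visited is None: visited = []
    visited.append(start)
    print('d:{}'.format(d)) # discovery times recorded so far, before start is stamped
    d[start] = t #discovery time
    t += 1
    for n in G[start]:
        if n in visited:
            continue
        t = DFS_time(G,d,f,start=n,visited=visited,t=t)
    f[start] = t # finish time
    t += 1
    print('f:{}'.format(f))
    return t
DFS_time(G)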

d:{}
d:{'a': 0}
d:{'a': 0, 'e': 1}
d:{'a': 0, 'e': 1, 'f': 2}
d:{'a': 0, 'e': 1, 'f': 2, 'd': 3}
f:{'c': 5}
f:{'c': 5, 'd': 6}
d:{'a': 0, 'e': 1, 'f': 2, 'd': 3, 'c': 4}
d:{'a': 0, 'e': 1, 'f': 2, 'd': 3, 'c': 4, 'b': 7}
f:{'c': 5, 'd': 6, 'g': 9}
f:{'c': 5, 'd': 6, 'g': 9, 'b': 10}
f:{'c': 5, 'd': 6, 'g': 9, 'b': 10, 'f': 11}
f:{'c': 5, 'd': 6, 'g': 9, 'b': 10, 'f': 11, 'e': 12}
f:{'c': 5, 'd': 6, 'g': 9, 'b': 10, 'f': 11, 'e': 12, 'a': 13}


14
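The discovery and finish times are useful because they nest like parentheses: a node `v` lies in the DFS subtree of `u` exactly when `u` is discovered before `v` and finished after it. The following is a minimal sketch of that check; `dfs_timestamps` and `is_proper_ancestor` are hypothetical helper names, and neighbors are visited in sorted order so that the timestamps are reproducible.

```python
def dfs_timestamps(G, start):
    """Return (d, f): discovery and finish times of a DFS from start."""
    d, f = {}, {}
    t = 0

    def rec(u):
        nonlocal t
        d[u] = t  # discovery time
        t += 1
        for v in sorted(G[u]):  # sorted only for reproducible timestamps
            if v not in d:
                rec(v)
        f[u] = t  # finish time
        t += 1

    rec(start)
    return d, f


def is_proper_ancestor(d, f, u, v):
    """True if u is a proper ancestor of v in the DFS tree."""
    return d[u] < d[v] and f[v] < f[u]


# Hypothetical demo graph (undirected, adjacency sets)
G_demo = {
    'a': set('e'),
    'b': set('efg'),
    'c': set('df'),
    'd': set('cf'),
    'e': set('abf'),
    'f': set('bcde'),
    'g': set('b'),
}

d, f = dfs_timestamps(G_demo, 'a')
print(is_proper_ancestor(d, f, 'f', 'd'))  # True: d is explored while f is still open
print(is_proper_ancestor(d, f, 'f', 'g'))  # False: g is discovered after f has finished
```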

In [61]:
def DFS_topsort(G):
    # Sorts the nodes by decreasing DFS finish time. On a DAG this is a topological order;
    # on a general graph the same ordering is used when looking for strongly connected components.
    visited,res=set(),[]
    def rec(u):
        if u in visited:
            return
        visited.add(u)
        for v in G[u]:
            rec(v)
        res.append(u) #finished exploring its children, add to res
    for n in G:
        rec(n)
    res.reverse()
    return res
print(DFS_topsort(G))

['a', 'e', 'f', 'b', 'g', 'd', 'c']
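The graph used above is undirected, so the result is only an ordering by decreasing finish time. On a directed acyclic graph the same procedure yields a topological order, meaning every edge points forward in the result. The following is a minimal sketch under that assumption; `dfs_topsort` mirrors the function above, while `dag` and `is_topological` are hypothetical names added for the check.

```python
def dfs_topsort(G):
    """Return the nodes of G sorted by decreasing DFS finish time."""
    visited, res = set(), []

    def rec(u):
        if u in visited:
            return
        visited.add(u)
        for v in G[u]:
            rec(v)
        res.append(u)  # u is finished once all its successors are finished

    for u in G:
        rec(u)
    res.reverse()
    return res


def is_topological(G, order):
    """Check that every edge u -> v goes from an earlier to a later position."""
    pos = {u: i for i, u in enumerate(order)}
    return all(pos[u] < pos[v] for u in G for v in G[u])


# Hypothetical DAG: a -> b, a -> c, b -> d, c -> d
dag = {
    'a': {'b', 'c'},
    'b': {'d'},
    'c': {'d'},
    'd': set(),
}

order = dfs_topsort(dag)
print(is_topological(dag, order))  # True
```

The relative order of `b` and `c` depends on set iteration order, so the sketch checks the defining property instead of printing one particular ordering.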


#### Infinite Mazes and Shortest (Unweighted) Paths
+ If we’re looking for the shortest paths (disregarding edge weights, for now) from our start node to all the others, DFS will, most likely, give us the wrong answer
+ Iterative deepening depth-first search (IDDFS) avoids this problem: it simply consists of running a depth-constrained DFS with an iteratively incremented depth limit
+ There is really only one situation where IDDFS would be preferable over BFS: when searching a huge tree (or some state space “shaped” like a tree). Because there are no cycles, we don’t need to remember which nodes we’ve visited, which means that IDDFS needs only store the path back to the starting node. BFS, on the other hand, must keep the entire fringe in memory (as its queue), and as long as there is some branching, this fringe will grow exponentially with the distance to the root. In other words, in these cases IDDFS can save a significant amount of memory, with little or no asymptotic slowdown.
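The tree case described in the last point can be sketched as follows. This is an illustrative example, not the book's code: the implicit infinite binary tree defined by the hypothetical `children` function (node `n` has children `2n` and `2n + 1`) stands in for a huge state space, and `iddfs_tree` keeps only the current path in memory.

```python
def children(n):
    """Hypothetical implicit infinite binary tree: node n has children 2n and 2n + 1."""
    return (2 * n, 2 * n + 1)


def depth_limited(node, target, limit, path):
    """Depth-limited DFS; path holds only the nodes from the root to the current node."""
    path.append(node)
    if node == target:
        return True
    if limit > 0:
        for c in children(node):
            if depth_limited(c, target, limit - 1, path):
                return True
    path.pop()  # backtrack: this node is not on a path to the target
    return False


def iddfs_tree(root, target, max_depth=50):
    """Run depth-limited searches with limits 0, 1, 2, ... until the target is found."""
    for limit in range(max_depth + 1):
        path = []
        if depth_limited(root, target, limit, path):
            return path
    return None  # not found within max_depth


print(iddfs_tree(1, 11))  # [1, 2, 5, 11]
```

No visited set is needed because the tree has no cycles, and the first limit at which the target is found equals its distance from the root.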

In [53]:
def idDFS(G,start):
    yielded = set() # nodes already reported
    def rec(G,s,depth,visited=None):
        if s not in yielded:
            yield s
            yielded.add(s)
        if depth == 0: 
            return  # max depth reached
        if visited is None: # fresh set for every depth limit; a shared default set would block the later, deeper passes
            visited = set()
        visited.add(s)
        for u in G[s]:
            if u in visited:
                continue
            for v in rec(G,u,depth=depth-1,visited=visited):
                yield v
    n = len(G)
    for d in range(n):
        if len(yielded) == n: #all nodes visited
            break
        for u in rec(G,start,d):
            yield u
list(idDFS(G,'a')) # the exact order depends on set iteration order

['a', 'e', 'f', 'b', 'd', 'c', 'g']

In [58]:
from collections import deque
def bfs(G,start):
    visited,ToVisit = {start:None},deque([start])
    while ToVisit:
        u = ToVisit.popleft()
        for v in G[u]:
            if v in visited:
                continue
            visited[v] = u
            ToVisit.append(v)
    return visited
bfs(G,'b')

{'b': None, 'e': 'b', 'f': 'b', 'g': 'b', 'a': 'e', 'd': 'f', 'c': 'f'}
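The parent map returned by BFS encodes a shortest (unweighted) path from the start node to every reachable node. The following is a minimal sketch of reading such a path back out; `bfs_parents` repeats the BFS above so that the block is self-contained, and `path_to` and `G_demo` are hypothetical names.

```python
from collections import deque


def bfs_parents(G, start):
    """BFS from start; returns a dict mapping each reached node to its parent."""
    parents, queue = {start: None}, deque([start])
    while queue:
        u = queue.popleft()
        for v in G[u]:
            if v not in parents:
                parents[v] = u
                queue.append(v)
    return parents


def path_to(parents, target):
    """Follow parent pointers from target back to the start, then reverse."""
    path = []
    while target is not None:
        path.append(target)
        target = parents[target]
    path.reverse()
    return path


# Hypothetical demo graph (undirected, adjacency sets)
G_demo = {
    'a': set('e'),
    'b': set('efg'),
    'c': set('df'),
    'd': set('cf'),
    'e': set('abf'),
    'f': set('bcde'),
    'g': set('b'),
}

parents = bfs_parents(G_demo, 'b')
print(path_to(parents, 'c'))  # ['b', 'f', 'c']
print(path_to(parents, 'a'))  # ['b', 'e', 'a']
```

In this graph both paths are the unique shortest ones, so the result does not depend on the order in which neighbors are visited.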

### Strongly Connected Components
+ A connected component is a maximal subgraph where all nodes can reach each other if you ignore edge directions (or if the graph is undirected). To get strongly connected components, though, you need to follow the edge directions; so, SCCs are the maximal subgraphs where there is a directed path from any node to any other.
+ In general, if there is an edge from any strong component X to another strong component Y, the last DFS finish time in X will be later than the last finish time in Y.

In [64]:
G = {
    'a':set('bc'),
    'b':set('dei'),
    'c':set('d'),
    'd':set('ah'),
    'e':set('f'),
    'f':set('g'),
    'g':set('eh'),
    'h':set('i'),
    'i':set('h'),
}

In [67]:
def walk(G,start='d',S = frozenset()): # S: nodes the walk must not enter (already assigned to a component)
    visited,ToVisit = dict(), set()
    ToVisit.add(start)
    visited[start] = None
    while ToVisit:
        u = ToVisit.pop()
        for v in G[u].difference(visited,S):
            ToVisit.add(v)
            visited[v] = u  #the parent node of this node in the traversal tree
    return visited

In [65]:
DFS_topsort(G)

['a', 'b', 'e', 'f', 'g', 'c', 'd', 'h', 'i']

In [68]:
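def tr(G): # transpose: reverse all edges of G
    GT = {}
    for u in G:
        GT[u] = set()
    for u in G:
        for v in G[u]:
            GT[v].add(u)
    return GT
def scc(G): # Kosaraju's algorithm
    GT = tr(G)
    comps,seen = [],set() # comps avoids shadowing the function name
    for u in DFS_topsort(G): # nodes by decreasing finish time
        if u in seen:
            continue
        C = walk(GT,start=u,S=seen) # walk the reversed graph, never entering earlier components
        seen.update(C)
        comps.append(C)
    return comps
scc(G)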

[{'a': None, 'd': 'a', 'c': 'd', 'b': 'd'},
 {'e': None, 'g': 'e', 'f': 'g'},
 {'h': None, 'i': 'h'}]
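The result can be cross-checked against the definition: two nodes belong to the same strongly connected component exactly when each can reach the other. The following is a deliberately naive sketch of that check, not the algorithm used above; it runs one traversal per node, and `reachable`, `naive_sccs`, and `G_directed` are hypothetical names.

```python
def reachable(G, start):
    """Set of nodes reachable from start by following edge directions."""
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for v in G[u]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen


def naive_sccs(G):
    """Group nodes by mutual reachability (one traversal per node)."""
    reach = {u: reachable(G, u) for u in G}
    comps, assigned = [], set()
    for u in G:
        if u in assigned:
            continue
        comp = {v for v in reach[u] if u in reach[v]}  # u reaches v and v reaches u
        comps.append(comp)
        assigned |= comp
    return comps


# Same directed graph as above, repeated so the block is self-contained
G_directed = {
    'a': set('bc'),
    'b': set('dei'),
    'c': set('d'),
    'd': set('ah'),
    'e': set('f'),
    'f': set('g'),
    'g': set('eh'),
    'h': set('i'),
    'i': set('h'),
}

print([sorted(c) for c in naive_sccs(G_directed)])
# [['a', 'b', 'c', 'd'], ['e', 'f', 'g'], ['h', 'i']]
```

The three groups match the key sets of the dictionaries returned by `scc(G)`.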