## Graphs boot camp

Graphs are ideal for modeling and analyzing relationships between pairs of objects. For example, suppose you were given a list of the outcomes of matches between pairs of teams, with each outcome being a win or loss. A natural question is as follows: given teams A and B, is there a sequence of teams starting with A and ending with B such that each team in the sequence has beaten the next team in the sequence?

A slick way of solving this problem is to model the problem using a graph. Teams are vertices, and an edge from one team to another indicates that the team corresponding to the source vertex has beaten the team corresponding to the destination vertex. Now we can apply graph reachability to perform the check. Both DFS and BFS are reasonable approaches--the program below uses DFS. 

In [1]:
import collections

In [11]:
MatchResult = collections.namedtuple('MatchResult',
                                    ('winning_team', 'losing_team'))

def can_team_a_beat_team_b(matches, team_a, team_b):
    def build_graph():
        graph = collections.defaultdict(set)
        for match in matches:
            graph[match.winning_team].add(match.losing_team)
        return graph
    def is_reachable_dfs(graph, curr, dest, visited):
        # visited is passed explicitly rather than relying on a shared
        # mutable default argument.
        if curr == dest:
            return True
        elif curr in visited or curr not in graph:
            return False
        visited.add(curr)
        print(visited)  # Debug trace of the teams explored so far
        return any(is_reachable_dfs(graph, team, dest, visited)
                   for team in graph[curr])
    
    return is_reachable_dfs(build_graph(), team_a, team_b, set())

In [12]:
mr1 = MatchResult('a', 'b')
mr2 = MatchResult('b', 'a')
mr3 = MatchResult('a', 'c')
mr4 = MatchResult('b', 'k')
mr5 = MatchResult('d', 'c')
mr6 = MatchResult('c', 'e')
mr7 = MatchResult('e', 'd')
mr8 = MatchResult('d', 'h')
mr9 = MatchResult('k', 'i')
mr10 = MatchResult('i', 'j')
mr11 = MatchResult('j', 'l')
mr12 = MatchResult('l', 'i')
mr13 = MatchResult('j', 'f')
mr14 = MatchResult('f', 'g')
mr15 = MatchResult('g', 'f')
mr16 = MatchResult('g', 'h')
mr17 = MatchResult('m', 'n')
mr18 = MatchResult('n', 'm')
matches = [mr1, mr2, mr3, mr4, mr5, mr6, mr7, mr8, mr9, mr10, 
          mr11, mr12, mr13, mr14, mr15, mr16, mr17, mr18]

In [13]:
can_team_a_beat_team_b(matches, 'a', 'm')

{'a'}
{'c', 'a'}
{'c', 'e', 'a'}
{'c', 'e', 'a', 'd'}
{'a', 'd', 'e', 'b', 'c'}
{'a', 'd', 'e', 'k', 'b', 'c'}
{'i', 'a', 'd', 'e', 'k', 'b', 'c'}
{'i', 'a', 'd', 'e', 'k', 'b', 'c', 'j'}
{'i', 'a', 'd', 'e', 'k', 'b', 'c', 'f', 'j'}
{'i', 'a', 'd', 'g', 'e', 'k', 'b', 'c', 'f', 'j'}
{'i', 'l', 'a', 'd', 'g', 'e', 'k', 'b', 'c', 'f', 'j'}


False

In [14]:
can_team_a_beat_team_b(matches, 'a', 'h')

{'a'}
{'c', 'a'}
{'c', 'e', 'a'}
{'c', 'e', 'a', 'd'}


True

In [7]:
graph = collections.defaultdict(set)

In [8]:
for match in matches:
    graph[match.winning_team].add(match.losing_team)

In [9]:
print(graph)

defaultdict(<class 'set'>, {'a': {'c', 'b'}, 'b': {'a', 'k'}, 'd': {'c', 'h'}, 'c': {'e'}, 'e': {'d'}, 'k': {'i'}, 'i': {'j'}, 'j': {'f', 'l'}, 'l': {'i'}, 'f': {'g'}, 'g': {'f', 'h'}, 'm': {'n'}, 'n': {'m'}})


In [10]:
dest = 'e'
curr = 'a'
for team in graph[curr]:
    print(team)

c
b


The time complexity and space complexity are both O(E), where E is the number of outcomes. 
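The text notes that BFS is an equally reasonable choice for the reachability check. The following is a minimal sketch of that variant, not the notebook's own code: the name `can_reach_bfs` and the `sample` data are made up for illustration, and matches are assumed to be `(winner, loser)` pairs (a `MatchResult` namedtuple unpacks the same way).

```python
import collections


def can_reach_bfs(matches, team_a, team_b):
    """Return True if a chain of wins leads from team_a to team_b."""
    # Adjacency sets: winner -> teams it has beaten.
    graph = collections.defaultdict(set)
    for winner, loser in matches:
        graph[winner].add(loser)

    visited = {team_a}
    queue = collections.deque([team_a])
    while queue:
        curr = queue.popleft()
        if curr == team_b:
            return True
        # .get avoids inserting empty entries into the defaultdict.
        for team in graph.get(curr, ()):
            if team not in visited:
                visited.add(team)
                queue.append(team)
    return False


# Hypothetical sample data, a subset of the match list used above.
sample = [('a', 'b'), ('b', 'k'), ('a', 'c'), ('c', 'e'),
          ('e', 'd'), ('d', 'h'), ('m', 'n'), ('n', 'm')]
print(can_reach_bfs(sample, 'a', 'h'))
print(can_reach_bfs(sample, 'a', 'm'))
```

Marking a team as visited when it is enqueued, rather than when it is dequeued, keeps each team in the queue at most once.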

## 18.1 Search a maze

Consider a black and white digitized image of a maze: white pixels represent open areas and black pixels are walls. There are two special white pixels: one is designated the entrance and the other is the exit. The goal in this problem is to find a way of getting from the entrance to the exit. 

Given a 2D array of black and white entries representing a maze with designated entrance and exit points, find a path from the entrance to the exit, if one exists. 

**Sol:** Model the maze as a graph. Each vertex corresponds to a white pixel. We will index the vertices based on the coordinates of the corresponding pixel, i.e., vertex $v_{i,j}$ corresponds to the white entry at (i,j) in the 2D array. Edges model adjacent white pixels. 

Now, run a DFS starting from the vertex corresponding to the entrance. If at some point, we discover the exit vertex in the DFS, then there exists a path from the entrance to the exit. If we implement recursive DFS then the path would consist of all the vertices in the call stack corresponding to previous recursive calls to the DFS routine. 

In [85]:
WHITE, BLACK = range(2)

Coordinate = collections.namedtuple('Coordinate', ('x', 'y') )

def search_maze(maze, s: Coordinate, e: Coordinate) -> list:
    # Perform DFS to find a feasible path
    def search_maze_helper(cur):
        # Checks cur is within maze and is a white pixel
        print(cur.x, cur.y)
        if not(0 <= cur.x < len(maze) and 0 <= cur.y < len(maze[cur.x])
              and maze[cur.x][cur.y] == WHITE):
            return False 
        path.append(cur)
        maze[cur.x][cur.y] = BLACK # has passed by this pixel
        if cur == e:
            return True
        
        search_cur = list(map(Coordinate, (cur.x - 1, cur.x+1, cur.x, cur.x),
                      (cur.y, cur.y, cur.y-1, cur.y +1)))
        if any(map(search_maze_helper,search_cur)):
            return True
        # Cannot find a path, remove the entry added in path.append(cur)
        print(path)
        del path[-1]
        return False
    
    path = []
    search_maze_helper(s)
    return path

In [68]:
a = map(Coordinate, (0,0,0,0,1,1,2,2,3,3,3,3), (0,1,2,3,1,3,1,2,0,1,2,3))

In [69]:
print(list(a))

[Coordinate(x=0, y=0), Coordinate(x=0, y=1), Coordinate(x=0, y=2), Coordinate(x=0, y=3), Coordinate(x=1, y=1), Coordinate(x=1, y=3), Coordinate(x=2, y=1), Coordinate(x=2, y=2), Coordinate(x=3, y=0), Coordinate(x=3, y=1), Coordinate(x=3, y=2), Coordinate(x=3, y=3)]


In [90]:
maze = [[0,1,0,0],
       [1,0,1,0],
       [1,0,0,1],
       [0,0,0,0]]

In [87]:
s = Coordinate(3,0)
e = Coordinate(0,3)

In [88]:
print(s)

Coordinate(x=3, y=0)


In [91]:
search_maze(maze,s,e)

3 0
2 0
4 0
3 -1
3 1
2 1
1 1
0 1
2 1
1 0
1 2
[Coordinate(x=3, y=0), Coordinate(x=3, y=1), Coordinate(x=2, y=1), Coordinate(x=1, y=1)]
3 1
2 0
2 2
1 2
3 2
2 2
4 2
3 1
3 3
2 3
4 3
3 2
3 4
[Coordinate(x=3, y=0), Coordinate(x=3, y=1), Coordinate(x=2, y=1), Coordinate(x=2, y=2), Coordinate(x=3, y=2), Coordinate(x=3, y=3)]
[Coordinate(x=3, y=0), Coordinate(x=3, y=1), Coordinate(x=2, y=1), Coordinate(x=2, y=2), Coordinate(x=3, y=2)]
2 1
2 3
[Coordinate(x=3, y=0), Coordinate(x=3, y=1), Coordinate(x=2, y=1), Coordinate(x=2, y=2)]
[Coordinate(x=3, y=0), Coordinate(x=3, y=1), Coordinate(x=2, y=1)]
4 1
3 0
3 2
[Coordinate(x=3, y=0), Coordinate(x=3, y=1)]
[Coordinate(x=3, y=0)]


[]

In [61]:
maze[0][3] = 0

In [62]:
WHITE

0

In [63]:
BLACK

1

The time complexity is the same as that for DFS, namely O(|V| + |E|). 
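DFS finds some path, not necessarily a shortest one, and the routine above marks visited pixels by overwriting the maze. As a hedged alternative, here is a BFS sketch that leaves the maze untouched and returns a shortest path. The function name `shortest_maze_path`, the use of plain `(row, col)` tuples, and the `grid` example are assumptions made for illustration.

```python
import collections

WHITE, BLACK = range(2)


def shortest_maze_path(maze, start, end):
    """Return a shortest list of (row, col) cells from start to end, or []."""
    def is_open(r, c):
        return (0 <= r < len(maze) and 0 <= c < len(maze[r])
                and maze[r][c] == WHITE)

    if not (is_open(*start) and is_open(*end)):
        return []
    parent = {start: None}  # Doubles as the visited set.
    queue = collections.deque([start])
    while queue:
        cur = queue.popleft()
        if cur == end:
            path = []
            while cur is not None:  # Walk parent links back to start.
                path.append(cur)
                cur = parent[cur]
            return path[::-1]
        r, c = cur
        for nxt in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if is_open(*nxt) and nxt not in parent:
                parent[nxt] = cur
                queue.append(nxt)
    return []


# Hypothetical maze: like the one above, but with (2, 3) opened up.
grid = [[0, 1, 0, 0],
        [1, 0, 1, 0],
        [1, 0, 0, 0],
        [0, 0, 0, 0]]
print(shortest_maze_path(grid, (3, 0), (0, 3)))
```

Because BFS explores cells in order of distance from the start, the first time the exit is dequeued the parent links trace a path with the fewest steps.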

## 18.2 Paint a Boolean matrix 

Let A be a Boolean 2D array encoding a black-and-white image. The entry A(a,b) can be viewed as encoding the color at entry (a,b). Call two entries adjacent if one is to the left, right, above or below the other. Note that the definition implies that an entry can be adjacent to at most four other entries, and that adjacency is symmetric, i.e., if e0 is adjacent to entry e1, then e1 is adjacent to e0. 

Implement a routine that takes an n*m Boolean array A together with an entry (x,y) and flips the color of the region associated with (x,y). 

**Sol:** For this problem, we are searching for all vertices whose color is the same as that of (x,y) and that are reachable from (x,y). Breadth-first search is natural when starting with a set of vertices. Specifically, we can use a queue to store such vertices. The queue is initialized to (x,y) and is popped iteratively. Call the popped point p. First, we record p's initial color, and then flip its color. Next we examine p's neighbors. Any neighbor which is the same color as p's initial color is added to the queue. The computation ends when the queue is empty. Correctness follows from the fact that any point that is added to the queue is reachable from (x,y) via a path consisting of points of the same color, and all points reachable from (x,y) via points of the same color will eventually be added to the queue. 

In [113]:

def flip_color(x: int, y:int, image: list) -> None:
    color = image[x][y]
    q = collections.deque([(x,y)])
    image[x][y] ^= 1 #flips 
    while q:
        x,y = q.popleft()
        for next_x, next_y in ((x,y+1), (x,y-1), (x-1,y), (x+1,y)):
            if (0<= next_x < len(image) and 0 <= next_y < len(image[next_x])
                and image[next_x][next_y] == color):
                # Flips the color
                image[next_x][next_y] ^= 1
                q.append((next_x, next_y))
                

In [125]:
image = [[1,0,1,1,1],
        [0,0,0,1,1],
        [0,0,1,0,1],
        [0,0,1,0,1],
        [1,0,1,1,0]]

#flip_color(0,0,image)

In [119]:
for i in range(len(image)):
    print(image[i])

[0, 0, 1, 1, 1]
[0, 0, 0, 1, 1]
[0, 0, 1, 0, 1]
[0, 0, 1, 0, 1]
[1, 0, 1, 1, 0]


In [105]:
1^1

0

In [106]:
1^0

1

In [107]:
0^1

1

In [108]:
0^0

0

In [109]:
val = 1
val ^= 1

In [110]:
val

0

In [111]:
val ^=1

In [112]:
val

1

The time complexity is the same as that of BFS, i.e., O(mn). The space complexity is a little better than the worst-case for BFS, since there are at most O(m+n) vertices that are at the same distance from a given entry. 

We also provide a recursive solution which is in the spirit of DFS. It does not need a queue but implicitly uses a stack, namely the function call stack. 

In [121]:
def flip_color_2(x:int, y:int, image) -> None:
    color = image[x][y]
    image[x][y] ^= 1 # flips 
    for next_x, next_y in ((x,y+1), (x,y-1), (x+1, y), (x-1,y)):
        if (0 <= next_x < len(image) and 0 <= next_y < len(image[next_x])
           and image[next_x][next_y] == color):
            flip_color_2(next_x, next_y, image)

In [126]:
flip_color_2(4,0,image)

In [127]:
for i in range(len(image)):
    print(image[i])

[1, 0, 1, 1, 1]
[0, 0, 0, 1, 1]
[0, 0, 1, 0, 1]
[0, 0, 1, 0, 1]
[0, 0, 1, 1, 0]


In [128]:
flip_color_2(4,0,image)

In [131]:
for i in range(len(image)):
    print(image[i])

[1, 1, 1, 1, 1]
[1, 1, 1, 1, 1]
[1, 1, 1, 0, 1]
[1, 1, 1, 0, 1]
[1, 1, 1, 1, 0]


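The time complexity is O(mn). No explicit queue is needed, but the recursion depth, and hence the space used on the function call stack, can reach O(mn) in the worst case.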
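Since the recursive version keeps one stack frame per pending entry, a large region can exceed Python's default recursion limit. A minimal sketch of the same DFS-style fill with an explicit stack follows; `flip_color_iterative` and `picture` are illustrative names, not part of the original notebook.

```python
def flip_color_iterative(x, y, image):
    """Flip the region containing (x, y) using an explicit stack."""
    color = image[x][y]
    image[x][y] ^= 1
    stack = [(x, y)]
    while stack:
        cx, cy = stack.pop()
        for nx, ny in ((cx, cy + 1), (cx, cy - 1), (cx + 1, cy), (cx - 1, cy)):
            if (0 <= nx < len(image) and 0 <= ny < len(image[nx])
                    and image[nx][ny] == color):
                # Flip when pushing, so no entry is pushed twice.
                image[nx][ny] ^= 1
                stack.append((nx, ny))


picture = [[1, 0, 1, 1, 1],
           [0, 0, 0, 1, 1],
           [0, 0, 1, 0, 1],
           [0, 0, 1, 0, 1],
           [1, 0, 1, 1, 0]]
flip_color_iterative(1, 1, picture)
for row in picture:
    print(row)
```

The only structural change from the BFS version is `stack.pop()` in place of `popleft()`, which changes the visiting order but not the set of entries flipped.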

## 18.3 Compute enclosed regions 

This problem is concerned with computing regions within a 2D grid that are enclosed, i.e., regions from which there is no path to the boundary that passes only through white squares. The computational problem can be formalized using 2D arrays of Bs (blacks) and Ws (whites). Let A be a 2D array whose entries are either W or B. Write a program that takes A and replaces all Ws that cannot reach the boundary with a B. 

**Sol:** It is easier to focus on the inverse problem, namely identifying Ws that can reach the boundary. The reason that the inverse is simpler is that if a W is adjacent to a W that can reach the boundary, then the first W can reach it too. The Ws on the boundary are the initial set. Subsequently, we find Ws neighboring the boundary Ws, and iteratively grow the set. Whenever we find a new W that can reach the boundary, we need to record it, and at some stage search for new Ws from it. A queue is a reasonable data structure to track Ws to be processed. The approach amounts to breadth-first search starting with a set of vertices rather than a single vertex. 

In [132]:
def fill_surrouned_regions(board:list) -> None:
    n,m = len(board), len(board[0])
    q = collections.deque([(i,j) for k in range(n) for i,j in ((k,0), (k,m-1))]
                         + [(i,j) for k in range(m) for i, j in ((0,k), (n-1,k))])
    while q:
        x,y = q.popleft()
        if 0 <= x < n and 0 <= y < m and board[x][y] == 'W':
            board[x][y] = 'T'
            q.extend([(x-1,y), (x+1, y), (x, y-1), (x, y+1)])
    board[:] = [['B' if c != 'T' else 'W' for c in row] for row in board]

In [133]:
A = [['B', 'B',  'B', 'B'],
    ['W', 'B', 'W', 'B'],
    ['B', 'W', 'W', 'B'],
    ['B', 'B', 'B', 'B']]

In [135]:
fill_surrouned_regions(A)

In [137]:
for i in range(len(A)):
    print(A[i])

['B', 'B', 'B', 'B']
['W', 'B', 'B', 'B']
['B', 'B', 'B', 'B']
['B', 'B', 'B', 'B']


In [139]:
n = 4
m= 4

In [144]:
for k in range(n):
    for i,j in ((k,0), (k,m-1)):
        print(i,j)

0 0
0 3
1 0
1 3
2 0
2 3
3 0
3 3


In [145]:
for k in range(m):
    for i,j in ((0,k), (n-1,k)):
        print(i,j)

0 0
3 0
0 1
3 1
0 2
3 2
0 3
3 3


In [146]:
temp = [(i,j) for k in range(n) for i,j in ((k,0), (k,m-1))] + [(i,j) for k in range(m) for i,j in ((0,k), (n-1,k))]

In [147]:
print(temp)

[(0, 0), (0, 3), (1, 0), (1, 3), (2, 0), (2, 3), (3, 0), (3, 3), (0, 0), (3, 0), (0, 1), (3, 1), (0, 2), (3, 2), (0, 3), (3, 3)]


In [148]:
q = collections.deque(temp)

In [149]:
print(q)

deque([(0, 0), (0, 3), (1, 0), (1, 3), (2, 0), (2, 3), (3, 0), (3, 3), (0, 0), (3, 0), (0, 1), (3, 1), (0, 2), (3, 2), (0, 3), (3, 3)])


In [150]:
while q:
    x,y = q.popleft()
    print(x,y)

0 0
0 3
1 0
1 3
2 0
2 3
3 0
3 3
0 0
3 0
0 1
3 1
0 2
3 2
0 3
3 3


In [151]:
x = 0
y = 0
q.extend([(x-1, y), (x+1,y), (x,y-1), (x,y+1)])

In [152]:
print(q)

deque([(-1, 0), (1, 0), (0, -1), (0, 1)])


The time and space complexity are the same as those for BFS, namely O(mn), where n and m are the numbers of rows and columns in A. 
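The routine above rewrites the board in place and uses a temporary 'T' marker. As a hedged variant of the same multi-source BFS, the sketch below seeds the queue only with boundary cells that are actually W, tracks reachable cells in a set, and returns a new board. The name `filled_copy` and the `board` example are assumptions for illustration.

```python
import collections


def filled_copy(board):
    """Return a new board in which every enclosed 'W' is replaced by 'B'."""
    n, m = len(board), len(board[0])
    # Seed the search with boundary cells that are actually white.
    border = [(i, j) for i in range(n) for j in range(m)
              if (i in (0, n - 1) or j in (0, m - 1)) and board[i][j] == 'W']
    reachable = set(border)
    queue = collections.deque(border)
    while queue:
        x, y = queue.popleft()
        for nx, ny in ((x - 1, y), (x + 1, y), (x, y - 1), (x, y + 1)):
            if (0 <= nx < n and 0 <= ny < m and board[nx][ny] == 'W'
                    and (nx, ny) not in reachable):
                reachable.add((nx, ny))
                queue.append((nx, ny))
    # Only Ws that reached the boundary survive; everything else is B.
    return [['W' if (i, j) in reachable else 'B' for j in range(m)]
            for i in range(n)]


board = [['B', 'B', 'B', 'B'],
         ['W', 'B', 'W', 'B'],
         ['B', 'W', 'W', 'B'],
         ['B', 'B', 'B', 'B']]
for row in filled_copy(board):
    print(row)
```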

## 18.4 Deadlock detection 

High performance database systems use multiple processes and resource locking. These systems may not provide mechanisms to avoid or prevent deadlock: a situation in which two or more competing actions are each waiting for the other to finish, which precludes all these actions from progressing. Such systems must support a mechanism to detect deadlocks, as well as an algorithm for recovering from them. 

One deadlock detection algorithm makes use of a "wait-for" graph to track which other processes a process is currently blocking on. In a wait-for graph, processes are represented as nodes, and an edge from process P to Q implies Q is holding a resource that P needs and thus P is waiting for Q to release its lock on that resource. A cycle in this graph implies the possibility of a deadlock. This motivates the following problem. 

Write a program that takes as input a directed graph and checks if the graph contains a cycle. 

**Sol:** We can check for the existence of a cycle in G by running DFS on G. Recall that DFS maintains a color for each vertex. Initially, all vertices are white. When a vertex is first discovered, it is colored gray. When DFS finishes processing a vertex, that vertex is colored black. 

As soon as we discover an edge from a gray vertex back to a gray vertex, a cycle exists in G and we can stop. Conversely, if there exists a cycle, once we first reach a vertex in the cycle (call it v), we will visit its predecessor in the cycle (call it u) before finishing processing v, i.e., we will find an edge from a gray vertex to a gray vertex. In summary, a cycle exists if and only if DFS discovers an edge from a gray vertex to a gray vertex. Since the graph may not be strongly connected, we must examine each vertex, and run DFS from it if it has not already been explored. 

In [12]:
class GraphVertex:
    WHITE, GRAY, BLACK = range(3)
    
    def __init__(self) -> None:
        self.color = GraphVertex.WHITE
        self.edges = []

In [28]:
def is_deadlocked(graph: list) -> bool:
    def has_cycle(cur):
        # Visiting a gray vertex means a cycle
        if cur.color == GraphVertex.GRAY:
            return True
        
        cur.color = GraphVertex.GRAY # Marks current vertex as a gray one 
        # Traverse the neighbor vertices 
        if any(next.color != GraphVertex.BLACK and has_cycle(next)
              for next in cur.edges):
            return True
        cur.color = GraphVertex.BLACK #Marks current vertex as black
        return False
    
    return any(vertex.color == GraphVertex.WHITE and has_cycle(vertex)
              for vertex in graph)

In [13]:
a = GraphVertex()
b = GraphVertex()
c = GraphVertex()
d = GraphVertex()
e = GraphVertex()
f = GraphVertex()
g = GraphVertex()
h = GraphVertex()
i = GraphVertex()
j = GraphVertex()
k = GraphVertex()
l = GraphVertex()
m = GraphVertex()
n = GraphVertex()

In [14]:
a.edges = [b,c]

In [15]:
for next in a.edges:
    print(next)

<__main__.GraphVertex object at 0x7f2a80d7edd8>
<__main__.GraphVertex object at 0x7f2a80d7ee48>


In [33]:
b.edges = [k,a]
c.edges = [e]
d.edges = [c,h]
e.edges = [d]
f.edges = [g]
g.edges = [h]
h.edges = []
i.edges = [j]
j.edges = [f]
l.edges = [i]
m.edges = [n]
n.edges = [m]

In [34]:
graph =[a,b,f,g,h,i,j,k,l]

In [35]:
is_deadlocked(graph)

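False

Remark: the edges above contain the cycle a -> b -> a, so a run on freshly created vertices returns True. The False shown here most likely reflects vertex colors left over from an earlier call: `is_deadlocked` stores colors on the vertices and never resets them, so vertices that are already gray or black are skipped on later calls.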

The time complexity of DFS is O(|V| + |E|): we iterate over all vertices, and spend a constant amount of time per edge. The space complexity is O(|V|), which is the maximum stack depth --if we go deeper than |V| calls, some vertex must repeat, implying a cycle in the graph, which leads to early termination. 
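Because the implementation above stores colors on the vertex objects, calling it twice on the same graph can give different answers. The following is a minimal sketch of the same white/gray/black DFS that keeps its colors in a dictionary local to each call. The adjacency-dictionary representation, the name `has_cycle`, and the `wait_for` example are assumptions made for illustration.

```python
WHITE, GRAY, BLACK = range(3)


def has_cycle(adjacency):
    """adjacency maps each vertex to an iterable of its successors."""
    color = {}  # A missing key means WHITE; state is local to this call.

    def visit(u):
        color[u] = GRAY
        for v in adjacency.get(u, ()):
            state = color.get(v, WHITE)
            if state == GRAY:  # Edge back to a vertex still being processed.
                return True
            if state == WHITE and visit(v):
                return True
        color[u] = BLACK
        return False

    return any(color.get(u, WHITE) == WHITE and visit(u) for u in adjacency)


# Hypothetical wait-for graph containing the cycle a -> b -> a.
wait_for = {'a': ['b', 'c'], 'b': ['k', 'a'], 'c': ['e'], 'k': []}
print(has_cycle(wait_for))
print(has_cycle(wait_for))  # Same answer on a second call.
print(has_cycle({'f': ['g'], 'g': ['h'], 'h': []}))
```

Vertices that appear only as successors (such as 'e' here) are treated as having no outgoing edges.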

## 18.5 Clone a graph

Consider a vertex type for a directed graph in which there are two fields: an integer label and a list of references to other vertices. Design an algorithm that takes a reference to a vertex u, and creates a copy of the graph on the vertices reachable from u. Return the copy of u. 

**Hint:** Maintain a map from vertices in the original graph to their counterparts in the clone.

**Sol:** We traverse the graph starting from u. Each time we encounter a vertex or an edge that is not yet in the clone, we add it to the clone. We recognize new vertices by maintaining a hash table mapping vertices in the original graph to their counterparts in the new graph. Any standard graph traversal algorithm works --the code below uses breadth first search. 

In [36]:
class GraphVertex:
    def __init__(self, label: int) -> None:
        self.label = label
        self.edges = []

In [37]:
def clone_graph(graph: GraphVertex) -> GraphVertex:
    if graph is None:
        return None
    
    q = collections.deque([graph])
    vertex_map = {graph: GraphVertex(graph.label)}
    
    while q:
        v = q.popleft()
        for e in v.edges:
            # Try to copy vertex e
            if e not in vertex_map:
                vertex_map[e] = GraphVertex(e.label)
                q.append(e)
            # Copy edge 
            vertex_map[v].edges.append(vertex_map[e])
    return vertex_map[graph]

In [38]:
a = GraphVertex(1)
b = GraphVertex(2)
c = GraphVertex(3)
d = GraphVertex(4)
e = GraphVertex(5)
f = GraphVertex(6)
g = GraphVertex(7)
h = GraphVertex(8)
i = GraphVertex(9)
j = GraphVertex(10)
k = GraphVertex(11)
l = GraphVertex(12)
m = GraphVertex(13)
n = GraphVertex(14)

In [39]:
a.edges = [b,c]
b.edges = [a,k]
c.edges = [a,e,d]
d.edges = [e,c,h]
e.edges = [c,d]
f.edges = [j,g]
h.edges = [g,d]
i.edges = [j,l]
j.edges = [l,f]
k.edges = [b,i]
l.edges = [i,j]
m.edges = [n]
n.edges = [m]

In [40]:
graph = [a,b,c,d,e,f,g,h,i,j,k,l,m,n]

In [41]:
copy = clone_graph(b)

In [42]:
copy.label

2

In [43]:
copy.edges

[<__main__.GraphVertex at 0x7f2a80d7e0b8>,
 <__main__.GraphVertex at 0x7f2a80d7e860>]

In [44]:
for i in copy.edges:
    print(i.edges)

[<__main__.GraphVertex object at 0x7f2a80d7e7f0>, <__main__.GraphVertex object at 0x7f2a80d7e780>]
[<__main__.GraphVertex object at 0x7f2a80d7e7f0>, <__main__.GraphVertex object at 0x7f2a80d7e748>]


In [45]:
for i in copy.edges:
    for j in i.edges:
        print(j.label)

2
3
2
9


In [46]:
for i in copy.edges:
    for j in i.edges:
        for k in j.edges:
            print(k.label)

1
11
1
5
4
1
11
10
12


Remark: did not find a good way to print graph with edges 

The space complexity is O(|V| + |E|), which is the space taken by the result. Excluding the space for the result, the space complexity is O(|V|)--this comes from the hash table, as well as the BFS queue. 
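One possible way to inspect a graph with its edges is to walk it with BFS and collect each vertex's label together with its neighbors' labels. This is only a sketch: `Vertex` is a hypothetical stand-in for the `GraphVertex` class above, `adjacency_by_label` is a made-up helper name, and labels are assumed to be unique.

```python
import collections


class Vertex:  # Hypothetical stand-in for the GraphVertex class above.
    def __init__(self, label):
        self.label = label
        self.edges = []


def adjacency_by_label(start):
    """Map each reachable vertex's label to its neighbors' labels."""
    seen = {start}
    queue = collections.deque([start])
    result = {}
    while queue:
        v = queue.popleft()
        result[v.label] = [e.label for e in v.edges]
        for e in v.edges:
            if e not in seen:
                seen.add(e)
                queue.append(e)
    return result


x, y, z = Vertex(1), Vertex(2), Vertex(3)
x.edges = [y, z]
y.edges = [x]
z.edges = [x, y]
for label, neighbors in adjacency_by_label(x).items():
    print(label, '->', neighbors)
```

Under the unique-label assumption, comparing the dictionaries produced for a vertex and for its clone gives a quick structural check of the copy.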

## 18.6 Making wired connections

Consider a collection of electrical pins on a printed circuit board (PCB). For each pair of pins, there may or may not be a wire joining them. 

Design an algorithm that takes a set of pins and a set of wires connecting pairs of pins, and determines if it is possible to place some pins on the left half of a PCB, and the remainder on the right half, such that each wire is between the left and right halves. Return such a division, if one exists. 

**Sol:** A better approach than trying every possible partition is to use connectivity information to guide the partitioning. Assume the pins are numbered from 0 to p-1. Create an undirected graph G whose vertices are pins. Add an edge between pairs of vertices if the corresponding pins are connected by a wire. For simplicity, assume G is connected; if not, the connected components can be analyzed independently. 

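Run BFS on G beginning with any vertex v_0. Assign v_0 arbitrarily to lie on the left half, together with all vertices at an even distance from v_0. All vertices at an odd distance from v_0 are assigned to the right half. If BFS ever finds an edge between two vertices at the same distance from v_0, no valid placement exists. 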

**A cycle in which the vertices can be partitioned into two sets must have an even number of edges: it has to go back and forth between the sets and terminate at the starting vertex, and each back and forth adds two edges. Therefore, the vertices in an odd length cycle cannot be partitioned into two sets such that all edges are between the sets.**

In [48]:
class GraphVertex:
    def __init__(self) -> None:
        self.d = -1
        self.edges = []

In [57]:
def is_any_placement_feasible(graph: list) -> bool:
    def bfs(s):
        s.d = 0
        q = collections.deque([s])
        
        while q:
            cur = q.popleft()
            for t in cur.edges:
                if t.d == -1: # Unvisited vertex
                    t.d = cur.d + 1
                    q.append(t)
                elif t.d == cur.d:
                    # Edge between two vertices at the same distance
                    return False
            print(cur.d) # Debug trace of the distance just processed
        return True
    return all(bfs(v) for v in graph if v.d == -1)

In [58]:
a = GraphVertex()
b = GraphVertex()
c = GraphVertex()
d = GraphVertex()
e = GraphVertex()
f = GraphVertex()
g = GraphVertex()
h = GraphVertex()
i = GraphVertex()
j = GraphVertex()
k = GraphVertex()
l = GraphVertex()
m = GraphVertex()
n = GraphVertex()

In [59]:
a.edges = [b,c]
b.edges = [a,k]
c.edges = [a,e,d]
d.edges = [e,c,h]
e.edges = [c,d]
f.edges = [j,g]
h.edges = [g,d]
i.edges = [j,l]
j.edges = [l,f]
k.edges = [b,i]
l.edges = [i,j]
m.edges = [n]
n.edges = [m]

In [60]:
graph = [a,b,c,d,e,f,g,h,i,j,k,l,m,n]
is_any_placement_feasible(graph)

0
1
1
2


False

The complexity is the same as for BFS, i.e., O(p+w) time complexity, where w is the number of wires, and O(p) space complexity.

Graphs that can be partitioned as described above are known as bipartite graphs. Another term for such graphs is 2-colorable (since the vertices can be assigned one of two colors without neighboring vertices having the same color).
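The problem statement asks for the division itself, while the routine above only reports whether one exists. A minimal sketch that returns the two halves is shown below. It assumes pins are numbered 0 to p-1 and wires are given as pairs of pin numbers; the name `split_pins` and that input format are illustrative choices, not the notebook's API.

```python
import collections


def split_pins(num_pins, wires):
    """Return (left, right) lists of pin ids, or None if no split exists."""
    neighbors = [[] for _ in range(num_pins)]
    for u, v in wires:
        neighbors[u].append(v)
        neighbors[v].append(u)

    dist = [-1] * num_pins
    for s in range(num_pins):  # Handle each connected component.
        if dist[s] != -1:
            continue
        dist[s] = 0
        queue = collections.deque([s])
        while queue:
            u = queue.popleft()
            for v in neighbors[u]:
                if dist[v] == -1:
                    dist[v] = dist[u] + 1
                    queue.append(v)
                elif dist[v] == dist[u]:
                    return None  # Same-distance edge: odd cycle.
    # Even BFS distance goes left, odd distance goes right.
    left = [p for p in range(num_pins) if dist[p] % 2 == 0]
    right = [p for p in range(num_pins) if dist[p] % 2 == 1]
    return left, right


print(split_pins(4, [(0, 1), (1, 2), (2, 3), (3, 0)]))
print(split_pins(3, [(0, 1), (1, 2), (2, 0)]))
```

The first call uses a 4-cycle, which is bipartite; the second uses a triangle, an odd cycle, for which no division exists.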

## 18.7 Transform one string to another 

Let s and t be strings and D a dictionary, i.e., a set of strings. Define s to produce t if there exists a sequence of strings from the dictionary $P = \langle s_0, s_1, \ldots, s_{n-1} \rangle$ such that the first string is s, the last string is t, and adjacent strings have the same length and differ in exactly one character. The sequence P is called a production sequence. E.g., <cat, cot, dot, dog> is a production sequence. 

Given a dictionary D and two strings s and t, write a program to determine if s produces t. Assume that all characters are lowercase letters. If s does produce t, output the length of a shortest production sequence; otherwise, output -1. 

**Hint:** Treat strings as vertices in an undirected graph, with an edge between u and v if and only if the corresponding strings differ in one character. 

**Sol:** Let G be the graph described in the hint. A production sequence is simply a path in G, so what we need is a shortest path from s to t in G. Shortest paths in an unweighted, undirected graph are naturally computed using BFS. 

In [61]:
import string

In [72]:
# Uses BFS to find the least steps of transformation.

def transform_string(D:set, s:str, t:str) -> int:
    StringWithDistance = collections.namedtuple(
    'StringWithDistance', ('candidate_string', 'distance'))
    
    q = collections.deque([StringWithDistance(s,0)])
    D.discard(s) # Marks s as visited by erasing it from D (no error if s is absent)
    
    while q:
        f = q.popleft()
        # Return if we find a match
        if f.candidate_string == t:
            return f.distance # Number of steps reaches t. 
        
        # Tries all possible transformations of f.candidate_string
        for i in range(len(f.candidate_string)):
            for c in string.ascii_lowercase: # Iterates through 'a'-'z'.
                cand = f.candidate_string[:i] + c + f.candidate_string[i+1:]
                if cand in D:
                    D.remove(cand)
                    q.append(StringWithDistance(cand, f.distance+1))
        print(q)
    return -1 # Cannot find a possible transformation. 

In [92]:
D = set(['bat', 'cot', 'dog', 'dag','dot','cat','fag','bet', 'cug', 'ceg'])

In [88]:
s = 'bat'

In [75]:
t = 'cot'

In [76]:
transform_string(D,s,t)

deque([StringWithDistance(candidate_string='cat', distance=1)])
deque([StringWithDistance(candidate_string='cot', distance=2)])


2

In [78]:
t= 'dog'

In [82]:
transform_string(D,s,t)

deque([StringWithDistance(candidate_string='cat', distance=1)])
deque([StringWithDistance(candidate_string='cot', distance=2)])
deque([StringWithDistance(candidate_string='dot', distance=3)])
deque([StringWithDistance(candidate_string='dog', distance=4)])


4

In [85]:
t = 'fag'

In [86]:
transform_string(D,s,t)

deque([StringWithDistance(candidate_string='cat', distance=1)])
deque([StringWithDistance(candidate_string='cot', distance=2)])
deque([StringWithDistance(candidate_string='dot', distance=3)])
deque([StringWithDistance(candidate_string='dog', distance=4)])
deque([StringWithDistance(candidate_string='dag', distance=5)])
deque([StringWithDistance(candidate_string='fag', distance=6)])


6

In [89]:
t = 'bet'

In [90]:
transform_string(D,s,t)

deque([StringWithDistance(candidate_string='cat', distance=1), StringWithDistance(candidate_string='bet', distance=1)])
deque([StringWithDistance(candidate_string='bet', distance=1), StringWithDistance(candidate_string='cot', distance=2)])


1

In [93]:
t= 'cug'

In [94]:
transform_string(D,s,t)

deque([StringWithDistance(candidate_string='cat', distance=1), StringWithDistance(candidate_string='bet', distance=1)])
deque([StringWithDistance(candidate_string='bet', distance=1), StringWithDistance(candidate_string='cot', distance=2)])
deque([StringWithDistance(candidate_string='cot', distance=2)])
deque([StringWithDistance(candidate_string='dot', distance=3)])
deque([StringWithDistance(candidate_string='dog', distance=4)])
deque([StringWithDistance(candidate_string='dag', distance=5)])
deque([StringWithDistance(candidate_string='fag', distance=6)])
deque([])


-1

In [95]:
s

'bat'

In [96]:
D

{'ceg', 'cug'}

The number of vertices is d, the number of words in the dictionary. The number of edges is, in the worst case, O(d^2). The time complexity is that of BFS, namely O(d+d^2) = O(d^2). If the string length n is less than d, then the maximum number of edges out of a vertex is O(n), implying an O(nd) bound on the number of edges and hence a tighter bound for the BFS. 
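The routine above marks words as visited by deleting them from D, so the caller's dictionary is consumed (as the final value of D shows). As a hedged alternative, the sketch below leaves the dictionary untouched by keeping a separate parent map, and returns the production sequence itself rather than only a count. The name `production_sequence` and the `words` example are assumptions for illustration.

```python
import collections
import string


def production_sequence(dictionary, s, t):
    """Return a shortest production sequence from s to t, or [] if none."""
    if s not in dictionary:
        return []
    parent = {s: None}  # Visited set and back-pointers; dictionary is untouched.
    queue = collections.deque([s])
    while queue:
        word = queue.popleft()
        if word == t:
            seq = []
            while word is not None:  # Follow back-pointers to s.
                seq.append(word)
                word = parent[word]
            return seq[::-1]
        # Try every single-character substitution of the current word.
        for i in range(len(word)):
            for c in string.ascii_lowercase:
                cand = word[:i] + c + word[i + 1:]
                if cand in dictionary and cand not in parent:
                    parent[cand] = word
                    queue.append(cand)
    return []


words = {'bat', 'cot', 'dog', 'dag', 'dot', 'cat'}
print(production_sequence(words, 'cat', 'dog'))
```

The number of transformation steps reported by the original routine corresponds to the length of the returned list minus one.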