# <center> **Depth-first Search vs Breadth-first Search**

### **Important Notes:**
*   **Deadline: 12PM (Noon) Feb 27, 2023** 
*   Any questions should be sent to [minhducl@uark.edu](mailto:minhducl@uark.edu) (author) and [sangt@uark.edu](mailto:sangt@uark.edu)
*   Please **start early** and read the instructions as well as examples. 
*   **We recommend running this file in Google Colab**
*   Run the following code cell to download the data







In [None]:
# the data will be downloaded to content directory
! gdown 14En8oUe2Upott1zQGZ3okLLIYrB1jAS- --quiet
! unzip -qq data.zip -d ./
! rm data.zip

If you click on the Files icon on the left panel of Google Colab, you will see a folder named *data* that contains all the test cases and examples. 

# **I. Introduction**
In this homework, we will implement the two most popular search algorithms on *undirected graphs*, Depth-first Search (DFS) and Breadth-first Search (BFS), and look at their real-world applications. Before we get our hands on search, we need a proper definition of a graph and an API for it.


> **Definition:** A *graph* is a set of vertices and a collection of *edges* that each connect a pair of vertices.





<center><img src="https://drive.google.com/uc?export=view&id=1ERKwXnrEXm4DWjJrxwZqg67jGHMrqm-g" width="35%" height="30%">
<figcaption>Figure 1: Example of a graph. This graph has 6 vertices with 7 edges.</figcaption> </center>

An example graph is shown in Fig 1; it has 6 vertices, named 0 to 5, and 7 edges connecting them. Within the scope of this homework, we use the names 0 through V-1 for the vertices of a V-vertex graph. In general, a vertex name does not have to be a number, but this naming convention simplifies our implementation because we have not yet covered the symbol table concept. We use the notation **v-w** to refer to an edge that connects v and w. For example, 0-1 is an edge connecting 0 and 1, and 1-0 refers to the same edge in an undirected graph.
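To illustrate why integer names are convenient, here is a minimal sketch of how string vertex names could be mapped to the indices 0 through V-1 with a plain Python dictionary. The city labels and the names `index_of` and `edge` are hypothetical and are not part of the homework data or API.

```python
# Hypothetical vertex names; the homework itself only uses integers 0..V-1.
names = ["Fayetteville", "Little Rock", "Tulsa"]

# name -> index lookup (a tiny stand-in for a symbol table)
index_of = {name: i for i, name in enumerate(names)}


def edge(a: str, b: str) -> frozenset:
    """Return the undirected edge a-b as an unordered pair of indices."""
    return frozenset((index_of[a], index_of[b]))


# v-w and w-v refer to the same undirected edge
print(edge("Tulsa", "Fayetteville") == edge("Fayetteville", "Tulsa"))  # True
```

With integer names, the index itself can be used directly to look up a vertex's adjacency list, so no such lookup table is needed.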

Before developing any graph-processing algorithms, we want an API that defines the fundamental graph operations. The Graph API below contains the basic operations for an undirected graph: a constructor that reads the graph from a file, methods that return the number of vertices and edges, a method to add an edge, a method that returns a string representation of the graph, and a method *adj* to iterate through all the vertices adjacent to a given vertex.


**Graph API**
```
class Graph:
    __init__(in_file: str)->None  # read a graph from file
    V()->int                      # number of vertices
    E()->int                      # number of edges
    addEdge(v: int, w: int)->None # add edge v-w to this graph
    adj(v: int)->Bag              # vertices adjacent to v
    __str__()->str                # string representation
```

In [1]:
# import the bag implementation from homework 1
from data.adt import Bag

class Graph:
    def __init__(self, in_file: str) -> None:
        # read the input file
        with open(in_file, 'r') as f:
            lines = f.readlines()

        # the first line is V, the second line is E
        self.vertex_num = int(lines[0])  # number of vertices
        self.edge_num = int(lines[1])  # number of edges

        # create adjacency lists, each element in this list is a Bag object from HW1
        self.graph_adjacency = [Bag() for _ in range(self.vertex_num)]

        # read each remaining line to get the edges
        for line in lines[2:]:
            v, w = line.split()
            self.__addEdge(int(v), int(w))

    # return the number of vertices
    @property
    def V(self):
        return self.vertex_num

    # return the number of edges
    @property
    def E(self):
        return self.edge_num

    # add edge v-w to this graph
    def __addEdge(self, v: int, w: int) -> None:
        self.graph_adjacency[v].add(w)
        if v != w:  # if self-loop, only add edge one time
            self.graph_adjacency[w].add(v)

    # return all the vertices adjacent to v
    def adj(self, v: int) -> Bag:
        return self.graph_adjacency[v]

    # string representation of the graph - print()
    def __str__(self):
        s = f"{self.vertex_num} vertices, {self.edge_num} edges \n"
        for v in range(self.V):
            s += f"{v} : "
            for w in self.adj(v):
                s += f"{w} "
            s += "\n"
        return s
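Judging from the constructor, the input file is expected to hold V on the first line, E on the second line, and one edge `v w` on each line after that. The sketch below parses text in that layout into plain Python lists. The sample text and the helper name `parse_graph` are assumptions made for illustration; the real `data/example_G.txt` may list its edges in a different order, and the `Bag` class may iterate in a different order than a list does.

```python
def parse_graph(text: str):
    """Parse 'V, E, then one edge per line' text into adjacency lists."""
    lines = text.strip().splitlines()
    v_count = int(lines[0])  # number of vertices
    e_count = int(lines[1])  # number of edges
    adj = [[] for _ in range(v_count)]
    for line in lines[2:]:
        v, w = map(int, line.split())
        adj[v].append(w)
        if v != w:  # a self-loop is stored only once
            adj[w].append(v)
    return v_count, e_count, adj


# Hypothetical file content describing the 7 edges of Fig 1
SAMPLE = """6
7
0 2
0 1
0 3
0 4
1 4
1 5
2 3
"""

v_count, e_count, adj = parse_graph(SAMPLE)
print(v_count, e_count)  # 6 7
print(adj[0])            # [2, 1, 3, 4]
```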

Let's create the graph given in Fig 1

In [2]:
# construct a small graph in figure 1
graph = Graph("data/example_G.txt")

print("Information about the graph")
print(graph)

Information about the graph
6 vertices, 7 edges 
0 : 4 3 1 2 
1 : 5 4 0 
2 : 3 0 
3 : 2 0 
4 : 1 0 
5 : 1 



To iterate through all the vertices adjacent to vertex 0, we simply do

In [None]:
print("Vertices adjacent to vertex 0 are", end=' ')
for v in graph.adj(0):
  print(v, end=' ')

Vertices adjacent to vertex 0 are 4 3 1 2 

It looks very simple, right? Now it's your turn to explore what else we can do with this API and graph. The following problems are a good warm-up before we work on DFS and BFS.

## **Problem 1: (5pts)** 
Write a function *degree* that takes a graph and a vertex as input, and returns the degree of that vertex. 

> **Definition:** The degree of a vertex is the number of edges incident to it


In [None]:
def degree(graph: Graph, v: int) -> int:
    return len(graph.graph_adjacency[v])
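As a sanity check on the definition: in a graph without self-loops, every edge is incident to exactly two vertices, so the degrees sum to 2E. The sketch below checks this on the adjacency lists printed earlier for Fig 1, written as plain Python lists. That representation is an assumption made to keep the example self-contained; it does not use the `Graph` or `Bag` classes.

```python
# Adjacency lists of the Fig 1 graph, copied from the printed output above
ADJ = [[4, 3, 1, 2], [5, 4, 0], [3, 0], [2, 0], [1, 0], [1]]
E = 7  # number of edges in Fig 1

# the degree of a vertex is the length of its adjacency list
degrees = [len(neighbors) for neighbors in ADJ]

print(degrees)                # [4, 3, 2, 2, 2, 1]
print(sum(degrees) == 2 * E)  # True
```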

### **Test Cases**

In [None]:
graph_ex = Graph("/content/data/example_G.txt")
assert(degree(graph_ex, 1) == 3)

graph_tc_1 = Graph("/content/data/tc_1.txt")
assert(degree(graph_tc_1, 11) == 2)

graph_tc_2 = Graph("/content/data/tc_2.txt")
assert(degree(graph_tc_2, 2) == 4)

graph_tc_3 = Graph("/content/data/tc_3.txt")
assert(degree(graph_tc_3, 1) == 1)


## **Problem 2: (5pts)** 
Write a function *maxDegree* that takes a graph as input, and returns the maximum degree of that graph. 


In [None]:
def maxDegree(graph: Graph)->int:
    max_degree = 0
    for item in graph.graph_adjacency:
        if len(item) > max_degree:
            max_degree=len(item)
    return max_degree

### **Test Cases**

In [None]:
graph_ex = Graph("data/example_G.txt")
assert(maxDegree(graph_ex) == 4)

graph_tc_1 = Graph("data/tc_1.txt")
assert(maxDegree(graph_tc_1) == 4)

graph_tc_2 = Graph("data/tc_2.txt")
assert(maxDegree(graph_tc_2) == 4)

graph_tc_3 = Graph("data/tc_3.txt")
assert(maxDegree(graph_tc_3) == 2)

## **Problem 3: (5pts)** 
Write a function *numberOfSelfLoops* that takes a graph as input, and returns the number of self-loops.


<center><img src="https://drive.google.com/uc?export=view&id=1Vi-VvGhQcENPrKw8IEeS4jEqG7Ey2ANz" width="25%" height="25%">
<figcaption>Figure 2: Example of a self loop.</figcaption> </center>

In [None]:
def numberOfSelfLoops(graph: Graph) -> int:
    count = 0
    for item in range(len(graph.graph_adjacency)):
        if item in graph.adj(item):
            count += 1

    return count

### **Test Cases**

In [None]:
graph_ex = Graph("data/example_G.txt")
assert(numberOfSelfLoops(graph_ex) == 0)

graph_tc_1 = Graph("data/tc_1.txt")
assert(numberOfSelfLoops(graph_tc_1) == 0)

graph_tc_2 = Graph("data/tc_2.txt")
assert(numberOfSelfLoops(graph_tc_2) == 0)

graph_tc_3 = Graph("data/tc_3.txt")
assert(numberOfSelfLoops(graph_tc_3) == 3)

# **II. Depth-first Search (DFS)**

In this section, we will learn how to implement DFS. This algorithm is super simple with only 2 steps:


1.   Visit a vertex and mark it as having been visited
2.   Visit all the vertices that are adjacent to it and that have not yet been marked. 

The important keyword is **marked**, so our API needs an attribute to keep track of visited vertices. An interesting property of DFS is that it can be implemented either recursively or with an explicit stack. We provide an API that implements recursive DFS, which serves as a good template for implementing DFS the other way.

---
## **Q&A**:

**[Q]**: What exactly are we using DFS for in this section?

**[A]**: DFS can tackle several graph-processing problems. Within this section, we will write a common API that answers the following 2 queries:


1.   **Connectivity:** Are two given vertices connected? How many vertices are connected to a source vertex s?
2.   **Single-source paths:** Is there a path from a source vertex s to a given target vertex v? If so, return the path.
---
To answer these queries, we need two variables to track connectivity and paths; we will call them *marked* and *edgeTo*. *marked* records whether the search has visited each vertex. It is a simple array of boolean values, all initially set to false. *edgeTo* records, for each vertex, the previous vertex on the known path from the source to it. It is also an array, but of integer values. An animation of how recursive DFS works is provided in Fig 3 below. The source vertex does not have to be 0; we could start at any vertex, but we choose 0 as an example. Make sure you follow the changes in *marked* and *edgeTo*.

<center><img src="https://drive.google.com/uc?export=view&id=1M-K5tBpPQnbA4oqQmW-sw4mByqw49kfp" width="60%" height="60%">
<figcaption>Figure 3: Implementation of Recursive Depth-first Search</figcaption> </center>
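To make the two arrays concrete, the sketch below runs a recursive DFS from vertex 0 and prints the final *marked* and *edgeTo* values. It uses the adjacency lists printed earlier for Fig 1 as plain Python lists, and the function name `dfs_arrays` is hypothetical; this is an illustration under those assumptions, not the homework API.

```python
# Adjacency lists of the Fig 1 graph, in the iteration order printed earlier
ADJ = [[4, 3, 1, 2], [5, 4, 0], [3, 0], [2, 0], [1, 0], [1]]


def dfs_arrays(adj, s):
    """Run recursive DFS from s and return the (marked, edge_to) arrays."""
    marked = [False] * len(adj)
    edge_to = [None] * len(adj)

    def visit(v):
        marked[v] = True          # step 1: mark the vertex
        for w in adj[v]:          # step 2: visit unmarked neighbors
            if not marked[w]:
                edge_to[w] = v    # we reached w through v
                visit(w)

    visit(s)
    return marked, edge_to


marked, edge_to = dfs_arrays(ADJ, 0)
print(marked)   # [True, True, True, True, True, True]
print(edge_to)  # [None, 4, 3, 0, 0, 1]
```

The source keeps `None` in `edge_to` because no edge leads into it on any path that starts there.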





**Recursive Depth-first Search API**
```
class RecursiveDFS:
    __init__(graph: Graph, s: int)->None  # constructor of DFS
    dfs(graph: Graph, v: int)->None       # find all the paths
    hasPathTo(v: int)->bool               # is there path between s and v
    pathTo(v: int)->Stack                 # return the path from s to v
```

In [None]:
from data.adt import Stack 

class RecursiveDFS:
  def __init__(self, g: Graph, s: int)->None:
    self.__marked = [False]*g.V   # has dfs() visited this vertex?
    self.__edgeTo = [None]*g.V    # last vertex on known path to this vertex
    self.__s      = s             # source vertex
    self.dfs(g, s)                # doing dfs right in the constructor

  def dfs(self, g: Graph, v: int)->None:
    self.__marked[v] = True       # mark the vertex
    for w in g.adj(v):            # see the vertices adjacent to v
      if self.__marked[w] is False:  # if not visited yet
        self.__edgeTo[w] = v         
        self.dfs(g, w)

  def hasPathTo(self, v: int)->bool: return self.__marked[v]

  def pathTo(self, v: int)->Stack:
    if self.hasPathTo(v) is False: return None
    path = Stack()
    while v != self.__s:
      path.push(v)
      v = self.__edgeTo[v]
    path.push(self.__s)
    return path
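The reason *pathTo* uses a stack is that following *edgeTo* walks the path backwards, from v to the source, and a stack reverses that order. The sketch below shows the same idea with a plain Python list. The function name `path_to` is hypothetical, and the `EDGE_TO` values are the ones implied by the DFS paths listed for the example graph.

```python
def path_to(edge_to, s, v):
    """Rebuild the path s -> v by following edge_to backwards from v."""
    if v != s and edge_to[v] is None:
        return None  # v was never reached from s
    path = [v]
    while v != s:
        v = edge_to[v]   # step to the previous vertex on the path
        path.append(v)
    path.reverse()       # collected from v back to s, so flip it
    return path


# edgeTo array implied by the DFS paths of the example graph (source 0)
EDGE_TO = [None, 4, 3, 0, 0, 1]

print(path_to(EDGE_TO, 0, 5))  # [0, 4, 1, 5]
print(path_to(EDGE_TO, 0, 2))  # [0, 3, 2]
```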


Let's create the graph from the example and check whether we find the same paths as those shown in Fig 4.
<center><img src="https://drive.google.com/uc?export=view&id=1DHe05r6oerX9czwAK00NsvG2_Q9CHIua" width="40%" height="40%">
<figcaption>Figure 4: Paths found by DFS</figcaption> </center>


In [None]:
# create the graph given in the example
graph_ex = Graph("/content/data/example_G.txt")

# run dfs 
dfs_ex = RecursiveDFS(graph_ex, 0)

# print the path to vertex i
print("The path to vertex ")
for i in range(1,6):
  print(f" - {i} is {dfs_ex.pathTo(i)}")

The path to vertex 
 - 1 is [0,4,1]
 - 2 is [0,3,2]
 - 3 is [0,3]
 - 4 is [0,4]
 - 5 is [0,4,1,5]


Our implementation finds exactly the paths shown in Fig 4. Woohoo! 

Now it's your turn to implement another kind of DFS that utilizes Stack. 

## **Problem 4: (30pts)** 
Write an API for StackDFS that utilizes Stack to implement DFS. 



```
class StackDFS:
    __init__(graph: Graph, s: int)->None  # constructor of DFS
    dfs(graph: Graph, v: int)             # find all the paths
    hasPathTo(v: int)->bool               # is there path between s and v
    pathTo(v: int)->Stack                 # return the path from s to v
```

**Hint:** StackDFS is very similar to RecursiveDFS. The only difference is in the method *dfs(graph: Graph, v: int)*. Instead of making a recursive DFS call on the next vertex, we push the next vertex onto a stack. A stack follows the LIFO (last in, first out) policy, which means the vertex pushed last leaves the stack first.
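The LIFO policy can be seen in a few lines. In this minimal sketch a plain Python list stands in for the homework's `Stack` class (an assumption; `append` plays the role of push and `pop` removes the most recently added item).

```python
stack = []
for vertex in (4, 3, 1, 2):
    stack.append(vertex)        # push in the order 4, 3, 1, 2

order = []
while stack:
    order.append(stack.pop())   # pop returns the most recently pushed item

print(order)  # [2, 1, 3, 4]
```

The vertices leave in the reverse of the order in which they entered, which is what lets a stack replace the call stack of the recursive version.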

In [None]:
class StackDFS:
    def __init__(self, g: Graph, s: int) -> None:
        self.__marked = [False] * g.V  # has dfs() visited this vertex?
        self.__edgeTo = [None] * g.V  # last vertex on known path to this vertex
        self.__s = s  # source vertex
        self.dfs(g, s)

    def dfs(self, g: Graph, s: int) -> None:
        stack = Stack()
        stack.push(s)
        self.__marked[s] = True  # mark the source vertex

        # keep going until every reachable vertex has been fully explored
        while not stack.isEmpty():
            v = stack.pop()
            for w in g.adj(v):  # see the vertices adjacent to v
                if self.__marked[w] is False:  # if not visited yet
                    self.__marked[w] = True  # mark the vertex
                    self.__edgeTo[w] = v
                    stack.push(v)  # v may still have unvisited neighbors
                    stack.push(w)  # LIFO: w is explored before we return to v
                    break
            # if no unvisited neighbor was found, v stays popped (backtrack)

    def hasPathTo(self, v: int) -> bool:
        return self.__marked[v]

    def pathTo(self, v: int):
        if self.hasPathTo(v) is False:
            return None
        path = Stack()
        while v != self.__s:
            path.push(v)
            v = self.__edgeTo[v]
        path.push(self.__s)
        return path


### **Test Cases**  

In [None]:
# create the graph given in the example
graph_ex = Graph("data/example_G.txt")

# run dfs
dfs_ex = StackDFS(graph_ex, 0)

# print the path to vertex i
print("The path to vertex ")
for i in range(1, 6):
    print(f" - {i} is {dfs_ex.pathTo(i)}")

The path to vertex 
 - 1 is [0,4,1]
 - 2 is [0,3,2]
 - 3 is [0,3]
 - 4 is [0,4]
 - 5 is [0,4,1,5]


# **III. Breadth-first Search (BFS)**



DFS may seem sufficient for search problems, but it is not guaranteed to find a shortest path (one with a minimal number of edges). BFS extends the queries answered by DFS by also solving the single-source shortest-path problem. Fig 5 illustrates how this algorithm works.

<center><img src="https://drive.google.com/uc?export=view&id=1uZk__ZDlYrqrZ0zfG0z2-VYwAR94BFgq" width="60%" height="60%">
<figcaption>Figure 5: Implementation of Breadth-first Search</figcaption> </center>


## **Problem 5: (30pts)** 
Write an API for BFS.


**Breadth-first Search API**
```
class BFS:
    __init__(graph: Graph, s: int)->None  # constructor of BFS
    bfs(graph: Graph, v: int)             # find all the paths
    hasPathTo(v: int)->bool               # is there path between s and v
    pathTo(v: int)->Stack                 # return the path from s to v
```

**Hint:** Make use of queue. 
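As one possible illustration of the hint, the sketch below runs BFS on the Fig 1 adjacency lists using `collections.deque` from the standard library as the queue. The list representation and the name `bfs_edge_to` are assumptions made to keep the example self-contained; the homework's `Queue` class from `data.adt` is not used here because its methods are not shown in this notebook.

```python
from collections import deque

# Adjacency lists of the Fig 1 graph, in the iteration order printed earlier
ADJ = [[4, 3, 1, 2], [5, 4, 0], [3, 0], [2, 0], [1, 0], [1]]


def bfs_edge_to(adj, s):
    """Run BFS from s and return the (marked, edge_to) arrays."""
    marked = [False] * len(adj)
    edge_to = [None] * len(adj)
    queue = deque([s])
    marked[s] = True
    while queue:
        v = queue.popleft()          # FIFO: oldest vertex leaves first
        for w in adj[v]:
            if not marked[w]:
                marked[w] = True     # mark when enqueued, not when dequeued
                edge_to[w] = v
                queue.append(w)
    return marked, edge_to


marked, edge_to = bfs_edge_to(ADJ, 0)
print(edge_to)  # [None, 0, 0, 0, 0, 1]
```

Because vertices leave the queue in order of their distance from the source, each `edge_to` entry lies on a path with the fewest possible edges. Here vertex 1 is reached directly from 0, whereas the DFS example reached it through 4.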



In [None]:
# Write the API for BFS below
from data.adt import Queue

class BFS:
    def __init__(self, g: Graph, s: int) -> None:
        self.__marked = [False] * g.V  # has bfs() visited this vertex?
        self.__edgeTo = [None] * g.V  # last vertex on known path to this vertex
        self.__s = s  # source vertex
        self.bfs(g, s)  # run bfs right in the constructor

    def bfs(self, g: Graph, v: int) -> None:
        # Write your code here
        pass

    def hasPathTo(self, v: int) -> bool:
        return self.__marked[v]

    def pathTo(self, v: int):
        if self.hasPathTo(v) is False:
            return None
        path = Stack()
        while v != self.__s:
            path.push(v)
            v = self.__edgeTo[v]
        path.push(self.__s)
        return path



### **Test Cases**  


In [None]:
# create the graph given in the example
graph_ex = Graph("/content/data/example_G.txt")

# run bfs
bfs_ex = BFS(graph_ex, 0)

# print the path to vertex i
print("The path to vertex ")
for i in range(1,6):
  print(f" - {i} is {bfs_ex.pathTo(i)}")

The path to vertex 
 - 1 is [0,1]
 - 2 is [0,2]
 - 3 is [0,3]
 - 4 is [0,4]
 - 5 is [0,1,5]


# **IV. Applications**

## **Problem 6: (25pts)**

---


Write a function *hasCycle* that takes a graph as input and returns True if the graph has a cycle, or False if no cycle is detected.

**Hint:** Given a vertex, use DFS to find a path that leads back to that vertex. Keep in mind that a self-loop and a cycle are two different things!


In [None]:
def hasCycle(graph: Graph)->bool:
  # Write your code here
  pass
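One way to read the hint is sketched below: during DFS, reaching an already marked vertex that is not the vertex we just came from means a path leads back into the visited part of the graph, so a cycle exists. The sketch works on plain adjacency lists rather than the `Graph` class, the name `has_cycle` is hypothetical, and it assumes the graph has no parallel edges. Self-loops are skipped, following the hint that they do not count as cycles.

```python
def has_cycle(adj):
    """Return True if the undirected graph given by adj contains a cycle."""
    marked = [False] * len(adj)

    def visit(v, parent):
        marked[v] = True
        for w in adj[v]:
            if w == v:
                continue              # self-loop: not treated as a cycle
            if not marked[w]:
                if visit(w, v):
                    return True
            elif w != parent:
                return True           # marked vertex reached by a new edge
        return False

    # start a DFS in every connected component
    for s in range(len(adj)):
        if not marked[s] and visit(s, -1):
            return True
    return False


FIG1 = [[4, 3, 1, 2], [5, 4, 0], [3, 0], [2, 0], [1, 0], [1]]
TREE = [[1, 2], [0], [0]]        # hypothetical graph: edges 0-1 and 0-2
LOOP_ONLY = [[0, 1], [0]]        # hypothetical graph: self-loop 0-0, edge 0-1

print(has_cycle(FIG1))       # True
print(has_cycle(TREE))       # False
print(has_cycle(LOOP_ONLY))  # False
```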



---

