# Algorithms by Yandex 3.0

## Lesson 5. Depth-first search traversal of graphs

### Depth-first search traversal

Depth-first search (DFS) is a method for traversing or searching a graph. The algorithm starts at a chosen node and explores as far as possible along each branch before backtracking.  

The basic idea is to visit each node in the graph only once, marking each visited node to avoid visiting it again. This approach is often used in algorithms such as finding connected components, determining if a path exists between two nodes, or finding a cycle in a graph.  

DFS can be implemented with an explicit stack of nodes still to be visited, or with recursion, where the call stack plays the same role. In the explicit version, the algorithm pops a node from the stack, visits it if it has not been visited yet, and pushes all of its unvisited neighbors onto the stack. This continues until the stack is empty.  
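The stack-based procedure described above can be sketched as follows. This is a minimal illustration, assuming the graph is a dict that maps each node to a list of its neighbors; the name `dfs_iterative` is chosen here for illustration and is not part of the lesson. A node may be pushed more than once before it is first popped, so the sketch checks the visited mark when popping.

```python
def dfs_iterative(graph, start):
    """Return the nodes reachable from start, in the order they are visited."""
    visited = set()
    order = []
    stack = [start]
    while stack:
        now = stack.pop()
        if now in visited:
            # The node was pushed several times; skip the later copies
            continue
        visited.add(now)
        order.append(now)
        # Push neighbors in reverse so the first listed neighbor is explored first
        for neig in reversed(graph[now]):
            if neig not in visited:
                stack.append(neig)
    return order


graph = {1: [2], 2: [4, 5], 3: [5], 4: [], 5: []}
print(dfs_iterative(graph, 1))  # [1, 2, 4, 5]
```

Node 3 is missing from the result because no edge leads to it from node 1.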

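The time complexity of depth-first search is O(V + E) when the graph is stored as adjacency lists, where V is the number of vertices and E is the number of edges: every vertex is visited once and every edge is examined a constant number of times. Since this is linear in the size of the graph, DFS is practical for large graphs too. The recursive implementation is bounded by the language's recursion depth limit, so on graphs with very long paths an explicit stack or a raised limit may be needed.  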

Overall, depth-first search traversal is a fundamental algorithmic technique used in various domains such as graph theory, network analysis, and artificial intelligence.  

In [3]:
# This function implements depth-first search (DFS) algorithm on a graph
def dfs(graph, visited, now):
    # Mark the current node as visited
    visited[now] = True
    # Traverse all neighbors of the current node in the graph
    for neig in graph[now]:
        # If a neighbor node is not visited yet, 
        # recursively call DFS on it
        if not visited[neig]:
            dfs(graph, visited, neig)

In [6]:
# example run of the dfs function
graph = {
    1: [2],
    2: [4, 5],
    3: [5],
    4: [],
    5: []
}

visited = [False] * (len(graph) + 1)

# Call dfs starting from node 1
dfs(graph, visited, 1)

print(visited)

[False, True, True, False, True, True]


### Graph connectivity

Graph connectivity refers to the ability to travel from one node in a graph to another by following a path of edges.  

In graph theory, an undirected graph is connected if there is a path between every pair of nodes.  

A graph that is not connected is said to be disconnected. A disconnected graph can be broken down into two or more connected components, where a connected component is a maximal subgraph in which every pair of nodes is connected by a path.

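Connected component labeling: run DFS from every node that has not been labeled yet and give all nodes reached in that run the same component number. Note that the example graph stores each edge in one direction only, so the labels show what each DFS run reaches rather than the connected components of the underlying undirected graph. For an undirected graph, each edge must appear in the adjacency lists of both of its endpoints.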

In [14]:
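# Modified DFS that labels every reached node with its component number
def dfs(graph, visited, now, comp):
    # A non-zero label doubles as the "visited" mark
    visited[now] = comp
    for neig in graph[now]:
        # visited[neig] is still False (falsy) for unlabeled nodes
        if not visited[neig]:
            dfs(graph, visited, neig, comp)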


graph = {
    1: [2],
    2: [4, 5],
    3: [5],
    4: [],
    5: []
}

visited = [False] * (len(graph) + 1)


comp = 1
# Nodes are numbered 1..len(graph); the upper bound of range is exclusive
for i in range(1, len(graph) + 1):
    if not visited[i]:
        dfs(graph, visited, i, comp)
        comp += 1
        
print(visited)

[False, 1, 1, 2, 1, 1]
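
The run above follows each edge in one direction only. For an undirected graph, one option is to build the adjacency lists from an edge list and store every edge in both directions before labeling. The sketch below assumes nodes are numbered from 1 to n; the helper name `label_components` and the second edge list are illustrative, not part of the lesson. It uses an explicit stack instead of recursion, which is a design choice made here rather than the lesson's method.

```python
def label_components(n, edges):
    """Label the connected components of an undirected graph with nodes 1..n."""
    graph = {v: [] for v in range(1, n + 1)}
    for a, b in edges:
        # Store each undirected edge in both adjacency lists
        graph[a].append(b)
        graph[b].append(a)

    comp_of = [0] * (n + 1)  # 0 means "not labeled yet"; index 0 is unused
    comp = 0
    for start in range(1, n + 1):
        if comp_of[start]:
            continue
        comp += 1
        comp_of[start] = comp
        stack = [start]
        while stack:
            now = stack.pop()
            for neig in graph[now]:
                if not comp_of[neig]:
                    comp_of[neig] = comp
                    stack.append(neig)
    return comp_of


# The lesson's edges, treated as undirected: all five nodes are connected
print(label_components(5, [(1, 2), (2, 4), (2, 5), (3, 5)]))  # [0, 1, 1, 1, 1, 1]
# Hypothetical edge list with two components: {1, 2, 4} and {3, 5}
print(label_components(5, [(1, 2), (2, 4), (3, 5)]))          # [0, 1, 1, 2, 1, 2]
```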


### Cycles in a directed graph

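A cycle is a path that starts at a vertex and leads back to that same vertex.

To find one with DFS, each vertex is given one of three colors:

0 - a white vertex (we have never been there)  
1 - a grey vertex (we have entered it but have not exited its dfs call yet, so it lies on the current path)  
2 - a black vertex (we have exited from dfs and have already visited all children of this vertex)  

If dfs encounters an edge that leads to a grey vertex, the graph contains a cycle: the grey vertices from that vertex up to the current one, together with this edge, form a loop.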
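
The three-color scheme above can be turned into a cycle check. The following is a minimal sketch, assuming a directed graph stored as a dict of adjacency lists in which every neighbor also appears as a key; the names `WHITE`, `GREY`, `BLACK` and `has_cycle` are chosen for illustration.

```python
WHITE, GREY, BLACK = 0, 1, 2


def has_cycle(graph):
    """Return True if the directed graph contains a cycle."""
    color = {v: WHITE for v in graph}

    def visit(now):
        color[now] = GREY  # entered, not yet exited
        for neig in graph[now]:
            if color[neig] == GREY:
                # Edge back to a vertex on the current DFS path
                return True
            if color[neig] == WHITE and visit(neig):
                return True
        color[now] = BLACK  # everything reachable from now is processed
        return False

    # Start a new DFS from every vertex that is still white
    return any(color[v] == WHITE and visit(v) for v in graph)


print(has_cycle({1: [2], 2: [4, 5], 3: [5], 4: [], 5: []}))  # False
print(has_cycle({1: [2], 2: [3], 3: [1]}))                   # True
```

An edge to a black vertex does not indicate a cycle in a directed graph, which is why two colors are not enough here.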

### Bipartite graphs

A **bipartite graph (bigraph)** is a graph whose vertices can be divided into two disjoint sets such that no two vertices within the same set are adjacent. In other words, every edge goes between the two sets.

This property can be restated in terms of graph coloring: A graph is bipartite if and only if it can be colored with two colors such that no two adjacent vertices have the same color.
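
The coloring formulation suggests a direct test: color a start vertex, give every newly reached neighbor the opposite color, and report failure when an edge joins two vertices of the same color. Below is a minimal sketch under the assumption that the graph is undirected and stored as a dict with each edge listed in both adjacency lists; the function name `two_coloring` and the sample graphs are illustrative.

```python
def two_coloring(graph):
    """Return a dict vertex -> 0/1 if the graph is bipartite, otherwise None."""
    color = {}
    for start in graph:
        if start in color:
            continue
        # Each connected component is colored independently
        color[start] = 0
        stack = [start]
        while stack:
            now = stack.pop()
            for neig in graph[now]:
                if neig not in color:
                    color[neig] = 1 - color[now]
                    stack.append(neig)
                elif color[neig] == color[now]:
                    # Two adjacent vertices share a color: not bipartite
                    return None
    return color


square = {1: [2, 4], 2: [1, 3], 3: [2, 4], 4: [1, 3]}
triangle = {1: [2, 3], 2: [1, 3], 3: [1, 2]}
print(two_coloring(square))    # {1: 0, 2: 1, 4: 1, 3: 0}
print(two_coloring(triangle))  # None
```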

Bipartite graphs have many applications, including:

- Modeling social networks, where one set of vertices represents people and the other set represents events, groups, or activities that people can participate in.
- Solving problems in combinatorial optimization, such as the assignment problem or the transportation problem.
- Solving problems in electronic circuit design, where one set of vertices represents input terminals and the other set represents output terminals.
- Studying structures in mathematics, where bipartite graphs have connections to algebraic graph theory and topological graph theory.  

A bipartite graph contains no odd-length cycles: a cycle in a bipartite graph must alternate between the two sets of vertices, so its length must be even. The converse also holds, so a graph is bipartite if and only if it has no odd-length cycle.

### Topological sorting

Topological sorting arranges the nodes of a directed acyclic graph (DAG) in a linear order such that for every directed edge (u, v), node u comes before node v.  

The algorithm starts by finding a node with no incoming edges, which is called a "source" node. It then removes this node and all its outgoing edges from the graph, and adds it to the sorted list. The process is then repeated with the remaining nodes until all nodes have been added to the sorted list.  
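
The source-removal procedure described above is commonly known as Kahn's algorithm. The sketch below is one way to write it, assuming the graph is a dict of adjacency lists in which every neighbor also appears as a key. Instead of physically deleting nodes, it keeps an in-degree counter per node and decrements it when a predecessor is removed; the function name `topological_sort` is chosen for illustration.

```python
from collections import deque


def topological_sort(graph):
    """Return a topological order of the graph, or None if it has a cycle."""
    indegree = {v: 0 for v in graph}
    for v in graph:
        for neig in graph[v]:
            indegree[neig] += 1

    # Start with the source nodes (no incoming edges)
    queue = deque(v for v in graph if indegree[v] == 0)
    order = []
    while queue:
        now = queue.popleft()
        order.append(now)
        # "Removing" now lowers the in-degree of each of its neighbors
        for neig in graph[now]:
            indegree[neig] -= 1
            if indegree[neig] == 0:
                queue.append(neig)

    # Nodes on a cycle never reach in-degree 0, so they are left out
    return order if len(order) == len(graph) else None


graph = {1: [2], 2: [4, 5], 3: [5], 4: [], 5: []}
print(topological_sort(graph))  # [1, 3, 2, 4, 5]
```

A graph usually has several valid topological orders; which one comes out depends on the order in which sources are taken from the queue.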

If the graph contains cycles, then no topological ordering is possible, since a cycle creates a situation where a node would have to come before itself in the ordering. Therefore, a prerequisite for topological sorting is that the graph must be acyclic.  

Topological sorting has numerous applications in computer science and engineering, such as task scheduling, dependency resolution, and circuit design. Compilers use it as well, for example to order instructions according to their data dependencies.