# Course Schedule
There are a total of `numCourses` courses you have to take, labeled from `0` to `numCourses - 1`. You are given an array `prerequisites` where <code>prerequisites[i] = [a<sub>i</sub>,b<sub>i</sub>]</code> indicates that you must take course <code>b<sub>i</sub></code> first if you want to take course <code>a<sub>i</sub></code>.

For example, the pair `[0, 1]` indicates that to take course `0` you have to first take course `1`.

Return `true` if you can finish all courses. Otherwise, return `false`.

# Examples

**Example 1:**
```Python
Input: numCourses = 2, prerequisites = [[1,0]]
Output: true
```
Explanation: There are a total of 2 courses to take. To take course 1 you should have finished course 0. So it is possible.

**Example 2:**
```Python
Input: numCourses = 2, prerequisites = [[1,0],[0,1]]
Output: false
```
Explanation: There are a total of 2 courses to take. To take course 1 you should have finished course 0, and to take course 0 you should also have finished course 1. So it is impossible.

**Example 3:**
```Python
Input: numCourses = 3, prerequisites = [[0,2],[1,2],[2,0]]
Output: false
```
Explanation: There are a total of 3 courses to take. To take course 0 you should have finished course 2, and to take course 2 you should have finished course 0. So it is impossible.

# Analysis

Consider a directed graph `(G,E)` where the nodes in `G` represent all the courses and `E` is the set of edges. There is a directed edge from course `a` to course `b` (not necessarily different) if and only if `a` is a prerequisite for `b`.

We define a type of order: two nodes `a` and `b` (not necessarily different) are said to have a relationship `a < b` if there exists a directed path with at least one edge from `a` to `b`. It is evident that this relationship is transitive.

However, it should be noted that this relationship is generally not antisymmetric. In fact, if there is an arrow from `a` to `b` and an arrow from `b` to `a`, then we simultaneously have `a < b` and `b < a`, but `b != a`.

With this order, our goal is to determine whether there exists a permutation (usually not unique) of all the nodes <code>v<sub>1</sub>, ... ,v<sub>n</sub></code> such that if <code>v<sub>i</sub> &lt; v<sub>j</sub></code>, then `i < j`. If such a permutation is possible, then this relationship is called a topological order. 
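To make the definition concrete, here is a minimal sketch that checks whether a given sequence respects the order. It relies on the observation that checking direct edges is enough: if every edge points forward in the sequence, then so does every directed path. The function name `is_topological_order` is hypothetical, and the sketch assumes that `order` lists every course that appears in `prerequisites`.

```python
def is_topological_order(order, prerequisites):
    """Return True if `order` has no repeats and every prerequisite b
    appears before the course a that depends on it."""
    position = {course: i for i, course in enumerate(order)}
    if len(position) != len(order):
        return False  # a repeated course means `order` is not a permutation
    # Each pair [a, b] is an edge b -> a, which must point forward in `order`.
    return all(position[b] < position[a] for a, b in prerequisites)
```

For instance, with `prerequisites = [[0, 2], [1, 2]]` the sequence `[2, 0, 1]` passes the check, while any sequence that places `0` before `2` does not.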

We claim that the following two conditions are equivalent:
1. The relationship `<` is a topological order;
2. For any node `a`, `a < a` does not hold.

Note that condition 2 is equivalent to saying that there are no cycles in the directed graph `(G,E)`. Therefore, determining whether a directed graph defines a topological order is equivalent to checking whether the graph contains a cycle.
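As an illustration of condition 2, the sketch below tests it literally: for each node `a` it computes the set of nodes reachable by a path with at least one edge and checks whether `a` is among them. This is a brute-force check meant only to mirror the definition, not an efficient solution; the name `has_self_relation` is hypothetical.

```python
from collections import defaultdict


def has_self_relation(prerequisites):
    """Return True if some node a satisfies a < a, i.e. lies on a directed cycle."""
    graph = defaultdict(set)
    for a, b in prerequisites:
        graph[b].add(a)  # edge b -> a: b is a prerequisite for a

    def reachable_from(start):
        # Nodes reachable from `start` by a path with at least one edge.
        seen = set()
        stack = list(graph.get(start, ()))
        while stack:
            node = stack.pop()
            if node not in seen:
                seen.add(node)
                stack.extend(graph.get(node, ()))
        return seen

    # Only nodes with outgoing edges can lie on a cycle.
    return any(node in reachable_from(node) for node in list(graph))
```

On Example 2 and Example 3 above the function reports a violation of condition 2, while on Example 1 it does not.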

Note also that condition 2 implies that the relationship `<=` (where `a <= b` means `a < b` or `a = b`) is antisymmetric. Indeed, assume that there exist two distinct nodes `a` and `b` such that `a <= b` and `b <= a`. Since they are different nodes, we have `a < b` and `b < a`. By transitivity, we have `a < a`, which contradicts condition 2. However, conversely, antisymmetry does not imply condition 2, as it cannot exclude the existence of isolated nodes with self-loops.

We now prove our claim.

`1 => 2`: Assume first that this relationship is a topological order. Then there exists a permutation (usually not unique) of all the nodes <code>v<sub>1</sub>, ... ,v<sub>n</sub></code> such that if <code>v<sub>i</sub> &lt; v<sub>j</sub></code>, then `i < j`. If `a < a` holds for some node <code>a = v<sub>i</sub></code>, then this property gives `i < i`, which is impossible. Put differently, `a` would have to appear at least twice in the sequence, which contradicts the fact that the sequence is a permutation.

`2 => 1`: We perform induction on the number of nodes `n = |G|`. The case `n = 1` is trivial, since a single node without a self-loop is a valid permutation by itself. Assume that the conclusion holds for any directed graph with `n-1` nodes. We claim that there exists a node `c` such that for any node `b`, `b < c` does not hold. Suppose such a node does not exist. Starting from any node <code>x<sub>0</sub></code>, by assumption, there exists a node <code>x<sub>1</sub></code> such that <code>x<sub>1</sub> &lt; x<sub>0</sub></code>. Again, by assumption, there exists a node <code>x<sub>2</sub></code> such that <code>x<sub>2</sub> &lt; x<sub>1</sub></code>. Continuing in this way, we obtain an infinite sequence <code>..., x<sub>k</sub>, ..., x<sub>1</sub>, x<sub>0</sub></code> such that <code>x<sub>i+1</sub> &lt; x<sub>i</sub></code> for any `i >= 0`. By transitivity, <code>x<sub>j</sub> &lt; x<sub>i</sub></code> whenever `j > i`, so a repeated node would give `a < a`. Since `a < a` does not hold for any node `a`, all nodes in this sequence are distinct. However, since the graph has only a finite number of nodes, this is impossible. Therefore, there exists a node `c` such that for any node `b`, `b < c` does not hold.

Now we remove the node `c` (together with its edges) from the graph `G`, resulting in a directed subgraph `G'`. Since no directed path passes through `c` except as its starting point, the relationship `<` on `G'` agrees with that of `G`, and condition 2 still holds in `G'`. By the induction hypothesis, `G'` defines a topological order, i.e., there exists a permutation <code>v<sub>1</sub>, ... ,v<sub>n-1</sub></code> of all the nodes of `G'` such that if <code>v<sub>i</sub> &lt; v<sub>j</sub></code>, then `i < j`. We now put `c` at the beginning of this sequence. No node `b` satisfies `b < c`, so nothing has to precede `c`. Any node `a` in `G` with `c < a` must be in `G'`, since `c < c` does not hold, and therefore comes after `c`. Therefore, the sequence <code>c, v<sub>1</sub>, ..., v<sub>n-1</sub></code> is a topological order of `G`.

This completes the proof of the claim.

Note that the proof of `2 => 1` above actually provides a constructive method for obtaining a topological order.
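The construction in the proof can be sketched as follows: repeatedly pick a node `c` that no remaining node precedes, append it to the output, and continue with the rest of the graph. The sketch unrolls the induction into a loop and returns the order itself rather than a boolean. The names `build_order`, `successors`, and `predecessors` are hypothetical, and candidates are scanned in sorted order only to make the result deterministic.

```python
from collections import defaultdict


def build_order(num_courses, prerequisites):
    """Return a topological order of courses 0..num_courses-1, or None if a cycle exists."""
    successors = defaultdict(set)
    predecessors = {course: set() for course in range(num_courses)}
    for a, b in prerequisites:
        successors[b].add(a)    # edge b -> a
        predecessors[a].add(b)

    order = []
    remaining = set(range(num_courses))
    while remaining:
        # A node with no remaining predecessor plays the role of `c` in the proof.
        c = next((v for v in sorted(remaining) if not predecessors[v]), None)
        if c is None:
            return None  # every remaining node has a predecessor, so a cycle exists
        order.append(c)
        remaining.remove(c)
        for nxt in successors[c]:
            predecessors[nxt].discard(c)  # removing c also removes its outgoing edges
    return order
```

This direct transcription rescans the remaining nodes on every step, so it is meant to illustrate the proof rather than to be fast.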

# Method 1: Check the Existence of Cycles
The first method is to check whether the graph contains a cycle, which can be done using Depth-First Search (DFS).

```Python
from collections import defaultdict


class Solution:
    def canFinish(self, numCourses: int, prerequisites: list[list[int]]) -> bool:
        # graph[b] holds every course that has b as a direct prerequisite.
        graph = defaultdict(set)
        for a, b in prerequisites:
            graph[a]  # if a is not in graph, set graph[a] = set()
            graph[b].add(a)

        visited = set()    # nodes whose DFS has started (finished or in progress)
        rec_stack = set()  # nodes on the current DFS path

        def dfs(node):
            """Return True if a cycle is reachable from node."""
            if node in rec_stack:
                return True  # back edge to the current path: cycle found
            if node in visited:
                # The node was on the recursion stack earlier and has already been
                # popped. All directed paths starting from it have been checked,
                # so it can be skipped.
                return False
            visited.add(node)
            rec_stack.add(node)

            for neighbor in graph[node]:
                if dfs(neighbor):
                    return True

            rec_stack.remove(node)  # Backtrack

            return False

        for a in graph:
            if a not in visited:
                if dfs(a):
                    return False

        return True
```
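Because the DFS above is recursive, a long chain of prerequisites can exceed Python's default recursion limit (commonly 1000). As a hedged alternative, the sketch below performs the same cycle check with an explicit stack and the usual three-state marking. The function name `can_finish_iterative` and the color constants are illustrative choices, not part of the original solution.

```python
from collections import defaultdict


def can_finish_iterative(num_courses, prerequisites):
    graph = defaultdict(list)
    for a, b in prerequisites:
        graph[b].append(a)  # edge b -> a

    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the current path, fully explored
    color = [WHITE] * num_courses

    for start in range(num_courses):
        if color[start] != WHITE:
            continue
        color[start] = GRAY
        stack = [(start, iter(graph[start]))]
        while stack:
            node, neighbors = stack[-1]
            advanced = False
            for neighbor in neighbors:
                if color[neighbor] == GRAY:
                    return False  # edge back to the current path: cycle found
                if color[neighbor] == WHITE:
                    color[neighbor] = GRAY
                    stack.append((neighbor, iter(graph[neighbor])))
                    advanced = True
                    break
            if not advanced:
                color[node] = BLACK  # all paths from node have been checked
                stack.pop()
    return True
```

Here `GRAY` corresponds to membership in `rec_stack`, and `BLACK` corresponds to a node that is in `visited` but no longer on the recursion stack.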

# Method 2: Check the Existence of Topological Order
From the above analysis, we only need to find a node with an in-degree of 0. Once such a node is identified, the problem reduces to the subgraph with that node removed (this can be implemented using a recursive function).

Returning to the proof, we can actually find all nodes with an in-degree of 0 at once and place them at the beginning of the sequence. This approach can accelerate the algorithm.

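```Python
from collections import defaultdict


class Solution:
    def canFinish(self, numCourses: int, prerequisites: list[list[int]]) -> bool:
        # source_graph[a]: remaining prerequisites of a (incoming edges).
        # target_graph[b]: courses that require b (outgoing edges).
        source_graph = defaultdict(set)
        target_graph = defaultdict(set)
        for a, b in prerequisites:
            source_graph[b]  # register b even if it has no prerequisites
            source_graph[a].add(b)
            target_graph[b].add(a)

        def helper():
            if len(source_graph) == 0:
                return True

            # Look for a node with in-degree 0.
            beginning_pt = None
            for node in source_graph:
                if len(source_graph[node]) == 0:
                    beginning_pt = node
                    break

            if beginning_pt is None:
                return False  # every remaining node has a prerequisite: cycle

            # Remove the node and its outgoing edges, then recurse on the subgraph.
            del source_graph[beginning_pt]
            for target in target_graph[beginning_pt]:
                source_graph[target].remove(beginning_pt)

            return helper()

        return helper()
```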

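However, this algorithm is not very efficient: every call scans the remaining nodes to find one with an in-degree of 0, so the worst case is quadratic in the number of nodes, and the recursion depth grows with the number of courses. After modifying it to the following queue-based form, the efficiency will be much higher.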

```Python
from collections import defaultdict, deque


class Solution:
    def canFinish(self, numCourses: int, prerequisites: list[list[int]]) -> bool:
        # Step 1: Construct the graph
        graph = defaultdict(set)
        for a, b in prerequisites:
            graph[a]
            graph[b].add(a)

        # Step 2: Calculate in-degrees of all nodes
        in_degree = {node: 0 for node in graph}
        for neighbors in graph.values():
            for neighbor in neighbors:
                in_degree[neighbor] += 1

        # Step 3: Collect all nodes with in-degree 0
        queue = deque([node for node in graph if in_degree[node] == 0])

        # Step 4: Process nodes
        processed_count = 0
        while queue:
            current = queue.popleft()
            # A node enters the queue exactly once, at the moment its in-degree
            # reaches 0, so no extra bookkeeping is needed for current.
            processed_count += 1
            for neighbor in graph[current]:
                in_degree[neighbor] -= 1
                if in_degree[neighbor] == 0:
                    queue.append(neighbor)

        # Step 5: Check if all nodes are processed
        return processed_count == len(graph)
```

Note that the introduction of the `in_degree` dictionary eliminates the need for the two graphs `source_graph` and `target_graph`. We can build the `in_degree` dictionary at the same time as constructing `graph`.

```Python
from collections import defaultdict, deque


class Solution:
    def canFinish(self, numCourses: int, prerequisites: list[list[int]]) -> bool:
        # Step 1: Construct the graph and calculate in-degrees of all nodes simultaneously
        graph = defaultdict(set)
        in_degree = defaultdict(int)
        for a, b in prerequisites:
            graph[a]
            if a not in graph[b]:  # count each edge once, even if a pair is repeated
                graph[b].add(a)
                in_degree[a] += 1

        # Step 2: Collect all nodes with in-degree 0
        queue = deque([node for node in graph if in_degree[node] == 0])

        # Step 3: Process nodes
        processed_count = 0
        while queue:
            current = queue.popleft()
            # A node enters the queue exactly once, at the moment its in-degree
            # reaches 0, so no extra bookkeeping is needed for current.
            processed_count += 1
            for neighbor in graph[current]:
                in_degree[neighbor] -= 1
                if in_degree[neighbor] == 0:
                    queue.append(neighbor)

        # Step 4: Check if all nodes are processed
        return processed_count == len(graph)
```

The code at the end of this section is the revised version of the earlier recursive algorithm, rewritten to use `in_degree`. However, it is still very slow, and the likely cause is the statement

<div style="text-align: center;">
    <code>begin_pts = [node for node in graph if in_degree[node] == 0]</code>
</div>

since it rescans every node of the graph on each recursive call. This is why it is better to switch to using a queue.

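```Python
from collections import defaultdict


class Solution:
    def canFinish(self, numCourses: int, prerequisites: list[list[int]]) -> bool:
        graph = defaultdict(set)
        in_degree = defaultdict(int)
        for a, b in prerequisites:
            graph[a]      # register both endpoints as keys
            in_degree[b]
            if a not in graph[b]:  # count each edge once, even if a pair is repeated
                graph[b].add(a)
                in_degree[a] += 1

        visited = set()

        def helper():
            if len(visited) == len(graph):
                return True

            # This scan touches every node on every call.
            begin_pts = [node for node in graph if in_degree[node] == 0]

            if len(begin_pts) == 0:
                return False

            for begin_pt in begin_pts:
                # Mark begin_pt as removed by setting its in-degree to -1. This
                # must be done manually, since begin_pt is not the neighbor of
                # any remaining node and would otherwise be selected again.
                in_degree[begin_pt] -= 1
                visited.add(begin_pt)
                for neighbor in graph[begin_pt]:
                    in_degree[neighbor] -= 1

            return helper()

        return helper()
```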