<h1>Creating Graph</h1>

The question will usually give us:
- An integer `n` as the number of nodes
- A matrix as the edges, where `edges[i][0]` is the start node, `edges[i][1]` is the end node, and `edges[i][2]` is the weight of the edge

<h3>Create Graph with Adjacency List (Use This!)</h3>

In [5]:
from collections import defaultdict

class Graph:
    def __init__(self):
        # Adjacency list: node -> list of (neighbor, weight)
        self.graph = defaultdict(list)

    # Initialize (reset) the adjacency list
    def build(self, n):
        self.graph = defaultdict(list)

    # Directed weighted graph
    def directed_weighted_graph(self, edges):
        for u, v, w in edges:
            self.graph[u].append((v, w))

    # Undirected weighted graph
    def undirected_weighted_graph(self, edges):
        for u, v, w in edges:
            self.graph[u].append((v, w))
            self.graph[v].append((u, w))
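
As a quick usage sketch, the same adjacency-list idea can be written as a standalone function. The helper name `build_adjacency_list`, its `directed` flag, and the sample edges are illustrative assumptions, not part of the notes above.

```python
from collections import defaultdict

def build_adjacency_list(edges, directed=True):
    """Build node -> [(neighbor, weight), ...] from [u, v, w] triples."""
    graph = defaultdict(list)
    for u, v, w in edges:
        graph[u].append((v, w))
        if not directed:
            # An undirected edge is stored once in each direction
            graph[v].append((u, w))
    return graph

# Hypothetical sample input: 3 nodes, 2 weighted edges
sample_edges = [[0, 1, 5], [1, 2, 3]]
directed_graph = build_adjacency_list(sample_edges, directed=True)
undirected_graph = build_adjacency_list(sample_edges, directed=False)
print(dict(directed_graph))    # {0: [(1, 5)], 1: [(2, 3)]}
print(dict(undirected_graph))  # {0: [(1, 5)], 1: [(0, 5), (2, 3)], 2: [(1, 3)]}
```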

<h1>Topological Sort (Key Word: Dependency)</h1>

Topological sorting of a Directed Acyclic Graph (DAG) is a linear ordering of its vertices such that:           
- **for every directed edge u → v, vertex u comes before v in the ordering.**   

**Important Notes:**
- The graph must be a **DAG** if we want to use Topological Sort
- Topological order is **NOT** unique

### Intuition:
Topological sorting is a **dependency problem** in which completion of one task depends upon the completion of several other tasks whose order can vary    

### Therefore we can use it to:
- **Schedule tasks** or **events based on dependencies**.
- Detect cycles in a directed graph.
- Solve problems with **precedence constraints**.


### Applications:
- Course scheduling in universities
- Task scheduling and project management.
- Dependency resolution in package management systems.
- Determining the order of compilation in software build systems.
- Deadlock detection in operating systems.

### Algorithm for Topological Sorting (Kahn's Algorithm):
Definitions:           
**in-degree**: the number of incoming edges a node has            
**out-degree**: the number of outgoing edges a node has          
1. Compute the in-degree of every node and add all nodes with in-degree 0 to a queue.
2. While the queue is not empty:
   - Remove a node from the queue and append it to the result.
   - For each outgoing edge from the removed node, decrement the in-degree of the destination node by 1.
   - If the in-degree of a destination node becomes 0, add it to the queue.
3. If the queue is empty but some nodes were never processed, the graph contains a cycle and cannot be topologically sorted.
4. Otherwise, the order in which the nodes were removed from the queue is a topological ordering of the graph.
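
The steps above can be sketched as a small generic function. The name `topo_sort`, the `(u, v)` edge format (meaning `u` must come before `v`), and the two sample graphs are assumptions made for illustration only.

```python
from collections import defaultdict, deque

def topo_sort(n, edges):
    """Return a topological order of nodes 0..n-1, or [] if there is a cycle."""
    graph = defaultdict(list)
    in_degree = [0] * n
    for u, v in edges:              # edge u -> v: u must come before v
        graph[u].append(v)
        in_degree[v] += 1

    # Step 1: start with every node that has no prerequisites
    queue = deque(i for i in range(n) if in_degree[i] == 0)
    order = []

    # Step 2: repeatedly remove a node and release its neighbors
    while queue:
        node = queue.popleft()
        order.append(node)
        for nxt in graph[node]:
            in_degree[nxt] -= 1
            if in_degree[nxt] == 0:
                queue.append(nxt)

    # Steps 3 and 4: unprocessed nodes mean a cycle
    return order if len(order) == n else []

print(topo_sort(4, [(0, 1), (0, 2), (1, 3), (2, 3)]))  # [0, 1, 2, 3]
print(topo_sort(3, [(0, 1), (1, 2), (2, 0)]))          # []
```

The second call shows the cycle check: no node ever reaches in-degree 0, so the result stays shorter than `n`.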


---
<h2>Q1: Topological Sort Template: Course Schedule II (LC.210)</h2>

*There are a total of numCourses courses you have to take, labeled from 0 to numCourses - 1. You are given an array prerequisites where prerequisites[i] = [ai, bi] indicates that you must take course bi first if you want to take course ai.*

*For example, the pair [0, 1], indicates that to take course 0 you have to first take course 1.*              

*Return the ordering of courses you should take to finish all courses. If there are many valid answers, return any of them. If it is impossible to finish all courses, return an empty array.*

In [52]:
from collections import defaultdict, deque

class Solution(object):
    def findOrder(self, numCourses, prerequisites):
        # Calculate in-degrees and build the adjacency list
        in_degree = [0] * numCourses
        graph = defaultdict(list)

        # note that in this question the parent vertex is prerequisite[1] and child is prerequisite[0]
        for u, v in prerequisites:
            graph[v].append(u)
            in_degree[u] += 1

        # Add all nodes with in-degree 0 to the queue
        queue = deque([i for i in range(numCourses) if in_degree[i] == 0])
        topo_order = []

        # Process the queue
        while queue:
            node = queue.popleft()
            topo_order.append(node)

            for neighbor in graph[node]:
                in_degree[neighbor] -= 1
                if in_degree[neighbor] == 0:
                    queue.append(neighbor)

        # Check for cycles
        if len(topo_order) != numCourses:
            return []
            
        return topo_order       

---
<h2>Q2: Alien Dictionary (LC.269)</h2>

*There is a new alien language that uses the English alphabet. However, the order of the letters is unknown to you.*   

*You are given a list of strings words from the alien language's dictionary. Now it is claimed that the strings in words are 
sorted lexicographically by the rules of this new language.*     

*If this claim is incorrect, and the given arrangement of string in words cannot correspond to any order of letters, return "".*     

*Otherwise, return a string of the unique letters in the new alien language sorted in lexicographically increasing order by the new language's rules. If there are multiple solutions, return any of them.*

**Solution:**


In [None]:
from collections import defaultdict, deque

class Solution(object):
    def alienOrder(self, words):
        in_degree = {}
        for word in words:
            for ch in word:
                in_degree[ch] = 0

        graph = defaultdict(list)

        for i in range(len(words) - 1):
            cur = words[i]
            nxt = words[i + 1]  # named nxt to avoid shadowing the built-in next()
            min_len = min(len(cur), len(nxt))
            for j in range(min_len):
                if cur[j] != nxt[j]:
                    # The first differing letter gives one ordering rule: cur[j] comes before nxt[j]
                    graph[cur[j]].append(nxt[j])
                    in_degree[nxt[j]] += 1
                    break
            # A for loop's else block runs only if the loop finished without hitting break
            else:
                # nxt is a strict prefix of cur but is listed after it, so the input is invalid
                if len(nxt) < len(cur):
                    return ""

        # Perform topo sort
        queue = deque(ch for ch in in_degree if in_degree[ch] == 0)
        topo_sort = []
        while queue:
            vertex = queue.popleft()
            topo_sort.append(vertex)
            for child in graph[vertex]:
                in_degree[child] -= 1
                if in_degree[child] == 0:
                    queue.append(child)

        if len(topo_sort) == len(in_degree):
            return "".join(topo_sort)
        else:
            return ""
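
To make the edge-building step easier to inspect in isolation, here is a minimal sketch that only extracts the ordering rules from adjacent words. The helper name `extract_edges` and its `None` return value for invalid input are assumptions for illustration; the solution above returns `""` instead.

```python
def extract_edges(words):
    """Return a list of (before, after) letter pairs, or None if the word order is invalid."""
    edges = []
    for cur, nxt in zip(words, words[1:]):
        for a, b in zip(cur, nxt):
            if a != b:
                edges.append((a, b))   # first difference decides the order
                break
        else:
            # No difference found in the common prefix:
            # a longer word must not come before its own prefix
            if len(nxt) < len(cur):
                return None
    return edges

print(extract_edges(["wrt", "wrf", "er", "ett", "rftt"]))
# [('t', 'f'), ('w', 'e'), ('r', 't'), ('e', 'r')]
print(extract_edges(["abc", "ab"]))  # None
```

Running a topological sort over these four pairs gives `w, e, r, t, f`, which is the order the full solution joins into a string.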

---
<h1>Minimum Spanning Tree</h1>

A minimum spanning tree (MST) or minimum weight spanning tree for a **weighted, connected, and undirected graph** is a spanning tree with a weight less than or equal to the weight of every other spanning tree.

**Key Properties:**
- **Spanning**: It includes all the vertices of the original graph.
- **Tree**: All vertices are connected and there is no cycle.
  - Therefore if the graph has `v` vertices, the MST must have `v-1` edges
- **Minimum Weight**: The sum of the edge weights in the MST is the smallest among all possible spanning trees of the graph.          

**Note:** MST may not be unique

### Kruskal's Algorithm for MST
**Procedure:**
- Sort all edges based on their weights
- Iterate through the sorted edges:
  - If taking the current edge will not create a cycle, take it
  - Otherwise don't take it and continue

**How do we know whether taking an edge will result in a cycle?**
- Use Union-Find
- Before we run the algorithm, first create a union-find where each node is in a separate set
- Whenever we want to take an edge, check the two nodes connected by this edge
  - if the two nodes are in the same set, taking this edge will result in a cycle in the MST
  - otherwise, we can safely take this edge and union the two sets
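
The cycle check can be sketched on a three-node triangle. The factory `make_union_find`, the boolean return value of `union`, and the sample edges are illustrative assumptions rather than the template used later in these notes.

```python
def make_union_find(n):
    """Create a union-find over nodes 0..n-1, each starting in its own set."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    def union(x, y):
        """Merge the sets of x and y; return False if they were already joined."""
        rx, ry = find(x), find(y)
        if rx == ry:
            return False
        parent[rx] = ry
        return True

    return find, union

find, union = make_union_find(3)
triangle = [(0, 1, 1), (1, 2, 2), (0, 2, 3)]

taken = []
for u, v, w in sorted(triangle, key=lambda e: e[2]):
    if union(u, v):          # different sets: taking the edge is safe
        taken.append((u, v, w))
    # same set: the edge would close a cycle, so it is skipped

print(taken)  # [(0, 1, 1), (1, 2, 2)]
```

The heaviest edge `(0, 2, 3)` is rejected because nodes 0 and 2 are already connected through node 1.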

<h3>Q1: MST Template</h3>
<p>https://www.luogu.com.cn/problem/P3366</p>

In [2]:
import sys

MAXN = 5001 #max number of nodes
MAXM = 200001 #max number of edges
parent = [0] * MAXN
edges = []

def build(n):
    for i in range(1, n + 1):
        parent[i] = i

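def find(x):
    # Iterative find with path compression; a recursive version can exceed
    # Python's default recursion limit on a long parent chain
    root = x
    while parent[root] != root:
        root = parent[root]
    while parent[x] != root:
        parent[x], x = root, parent[x]
    return root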

def isSameSet(x, y):
    return find(x) == find(y)
    
def union(x, y):
    root_x = find(x)
    root_y = find(y)
    if root_x != root_y:
        parent[root_x] = root_y

def kruskal(n, m):
    """
    Kruskal's algorithm for finding the Minimum Spanning Tree (MST).
    Returns the total weight of the MST, or 'orz' if the MST cannot be formed.
    """
    build(n)
    edges.sort(key=lambda edge: edge[2])
    total_weight = 0
    edge_count = 0

    # Process each edge in sorted order
    for u, v, w in edges:
        if not isSameSet(u, v):  # If u and v are not already connected
            union(u, v)
            total_weight += w
            edge_count += 1

    # If we connected n-1 edges, we formed a valid MST
    return total_weight if edge_count == n - 1 else "orz"

def main():
    global edges
    data = sys.stdin.read().split()
    index = 0

    while index < len(data):
        # Read number of nodes and edges
        n = int(data[index])
        index += 1
        m = int(data[index])
        index += 1

        # Read each edge as (u, v, w)
        edges = []
        for _ in range(m):
            u = int(data[index])
            index += 1
            v = int(data[index])
            index += 1
            w = int(data[index])
            index += 1
            edges.append((u, v, w))

        # Find the MST weight using Kruskal's algorithm
        result = kruskal(n, m)

        # Print the result
        print(result)

if __name__ == "__main__":
    main()

### Prim's Algorithm for MST

In [None]:
later...
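
A minimal sketch of Prim's algorithm with a binary heap and lazy deletion, offered as one common way to write it rather than the version planned for this cell. The function name `prim`, the 1-indexed node labels, the `None` return for a disconnected graph, and the sample edges are all assumptions.

```python
import heapq
from collections import defaultdict

def prim(n, edges):
    """Return the MST weight for nodes 1..n, or None if the graph is disconnected."""
    graph = defaultdict(list)
    for u, v, w in edges:
        graph[u].append((w, v))
        graph[v].append((w, u))

    visited = set()
    heap = [(0, 1)]          # (cost to reach node, node); start from node 1 at cost 0
    total = 0
    while heap and len(visited) < n:
        w, u = heapq.heappop(heap)
        if u in visited:     # stale entry: u was already reached by a cheaper edge
            continue
        visited.add(u)
        total += w
        for nw, v in graph[u]:
            if v not in visited:
                heapq.heappush(heap, (nw, v))

    return total if len(visited) == n else None

print(prim(4, [(1, 2, 2), (1, 3, 2), (1, 4, 3), (2, 3, 4), (3, 4, 3)]))  # 7
print(prim(3, [(1, 2, 1)]))                                              # None
```

Unlike Kruskal's algorithm, this grows a single tree outward from one start node, so it needs no union-find; the heap may hold outdated entries, which are skipped when popped.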

### Optimization of Prim's Algorithm with Reverse Index Heap

In [None]:
later...

---
<h3>Q1: Optimize Water Distribution In A Village (LC.1168)</h3>

*There are n houses in a village. We want to supply water for all the houses by building wells and laying pipes.*         

*For each house i, we can either build a well inside it directly with cost wells[i - 1] (note the -1 due to 0-indexing), or pipe in water from another well to it. The costs to lay pipes between houses are given by the array pipes where each pipes[j] = [house1j, house2j, costj] represents the cost to connect house1j and house2j together using a pipe. Connections are bidirectional, and there could be multiple valid connections between the same two houses with different costs.*           

*Return the minimum total cost to supply water to all houses.*

**Solution:**       
We know that this can be solved by MST, but in our MST algorithm there isn't a cost at each node. How do we transform the problem to a classical MST template?
- Easy. Create a virtual node that acts as a water source, and treat the cost of building a well in house `i` as the weight of an edge between house `i` and that source. The problem then becomes a classical MST over `n + 1` nodes.
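
The transformation can be sketched separately from the MST step. The helper names `add_virtual_source` and `mst_weight` and the sample input are illustrative assumptions; `mst_weight` also assumes the resulting graph is connected, which holds here because every house has an edge to node 0.

```python
def add_virtual_source(n, wells, pipes):
    """Return a new edge list in which a well in house i is an edge (0, i, cost)."""
    well_edges = [[0, i + 1, wells[i]] for i in range(n)]  # houses are 1-indexed
    return well_edges + [list(p) for p in pipes]           # input list is not mutated

def mst_weight(num_nodes, edges):
    """Kruskal's algorithm over nodes 0..num_nodes-1; assumes a connected graph."""
    parent = list(range(num_nodes))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    total = 0
    for u, v, w in sorted(edges, key=lambda e: e[2]):
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            total += w
    return total

all_edges = add_virtual_source(3, [1, 2, 2], [[1, 2, 1], [2, 3, 1]])
print(all_edges)                 # [[0, 1, 1], [0, 2, 2], [0, 3, 2], [1, 2, 1], [2, 3, 1]]
print(mst_weight(4, all_edges))  # 3
```

In this sample the cheapest plan is one well in house 1 plus both pipes, for a total cost of 3.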

In [12]:
class Solution(object):
    def minCostToSupplyWater(self, n, wells, pipes):
        # create an imaginary water source (node 0)
        # then connect every house i + 1 to it with weight wells[i], since houses are 1-indexed
        for i in range(n):
            pipes.append([0, i + 1, wells[i]])
        
        # Create union find
        parent = list(range(n + 1))

        def find(x):
            if parent[x] != x:
                parent[x] = find(parent[x])
            return parent[x]

        def is_same_set(x, y):
            return find(x) == find(y)

        def union(x, y):
            px = find(x)
            py = find(y)
            if px != py:
                parent[px] = py

        # Kruskal's Algo for MST
        pipes.sort(key=lambda x: x[2])
        total_cost = 0
        for u, v, w in pipes:
            if not is_same_set(u, v):
                union(u, v)
                total_cost += w

        return total_cost

---
<h3>Q2: Checking Existence of Edge Length Limited Paths (LC.1697)</h3>

*An undirected graph of `n` nodes is defined by `edgeList`, where `edgeList[i] = [ui, vi, disi]` denotes an edge between nodes `ui` and `vi` with distance `disi`. Note that there may be multiple edges between two nodes.*

*Given an array `queries`, where `queries[j] = [pj, qj, limitj]`, your task is to determine for each `queries[j]` whether there is a path between `pj` and `qj` such that each edge on the path has a distance strictly less than `limitj`.*

*Return a boolean array `answer`, where `answer.length == queries.length` and the jth value of `answer` is true if there is a path for `queries[j]`, and false otherwise.*

**Solution**:     
This is a Union-Find + MST problem.                     
Note that the query asks whether **every edge** on a path is shorter than `limit`. Therefore we can sort the queries by `limit` and the edges by distance, then iterate through the sorted queries:
- for each query, **connect all edges** in the graph that are strictly shorter than the `limit`
- if `p` and `q` are connected after those edges have been added (checked with union-find), a valid path exists

**Time Complexity**: O(e log e + m log m), where e is the number of edges and m is the number of queries
- Although the edge loop is nested inside the query loop, the edge pointer only moves forward, so each edge is processed once no matter how many queries there are. The nested part therefore costs O(e + m) union-find operations, and the bottleneck is the sorting

In [25]:
class Solution(object):
    def distanceLimitedPathsExist(self, n, edgeList, queries):
        
        m = len(queries)
        # add the original index to each query so answers can be placed correctly after sorting
        for i in range(m):
            queries[i].append(i)

        # sort the queries by limit and the edges by distance; sorting dominates the running time
        queries.sort(key=lambda x: x[2])
        edgeList.sort(key=lambda x: x[2])
        
        # Create union-find
        parent = list(range(n))

        def find(x):
            if parent[x] != x:
                parent[x] = find(parent[x])
            return parent[x]

        def is_same_set(x, y):
            return find(x) == find(y)

        def union(x, y):
            px = find(x)
            py = find(y)
            if px != py:
                parent[px] = py
                
        # Connect edges and fill the answer array; each edge is unioned at most once across all queries
        ans = [False] * m
        j = 0
        for p, q, limit, i in queries:
            # connect all edges < limit
            while j < len(edgeList) and edgeList[j][2] < limit:
                union(edgeList[j][0], edgeList[j][1])
                j += 1
            ans[i] = is_same_set(p, q)

        return ans
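
A variant sketch of the same offline approach that sorts query indices instead of appending an id to each query, so neither input list is modified. The function name `limited_paths` and the sample input are assumptions for illustration.

```python
def limited_paths(n, edge_list, queries):
    """Answer each [p, q, limit] query offline with a union-find."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    edges = sorted(edge_list, key=lambda e: e[2])
    # Process queries in increasing order of limit, remembering original positions
    order = sorted(range(len(queries)), key=lambda i: queries[i][2])

    ans = [False] * len(queries)
    j = 0
    for i in order:
        p, q, limit = queries[i]
        # Add every edge strictly shorter than this query's limit
        while j < len(edges) and edges[j][2] < limit:
            parent[find(edges[j][0])] = find(edges[j][1])
            j += 1
        ans[i] = find(p) == find(q)
    return ans

sample_edges = [[0, 1, 2], [1, 2, 4], [2, 0, 8], [1, 0, 16]]
sample_queries = [[0, 1, 2], [0, 2, 5]]
print(limited_paths(3, sample_edges, sample_queries))  # [False, True]
```

The first query fails because the only edge between 0 and 1 below the limit would need distance strictly less than 2; the second succeeds through the path 0, 1, 2 with distances 2 and 4.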

---
<h2> Theorem: Every MST is also a Minimum Bottleneck Spanning Tree (MBST) </h2>

In [None]:
later...
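
A small brute-force check of the theorem on one hypothetical graph, not a proof. It compares the largest edge in a Kruskal MST with the smallest possible largest edge over all spanning trees. The helper names and the sample graph are assumptions, and the graph is assumed to be connected.

```python
from itertools import combinations

def acyclic_subset(n, candidate_edges):
    """Scan edges in the given order and keep each one that does not close a cycle."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    kept = []
    for u, v, w in candidate_edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            kept.append((u, v, w))
    return kept

def mst_bottleneck(n, edges):
    """Largest edge weight in the MST built by Kruskal's algorithm."""
    mst = acyclic_subset(n, sorted(edges, key=lambda e: e[2]))
    return max(w for _, _, w in mst)

def brute_force_bottleneck(n, edges):
    """Smallest possible largest edge over every spanning tree (exponential time)."""
    best = None
    for subset in combinations(edges, n - 1):
        # n - 1 edges with no cycle form a spanning tree
        if len(acyclic_subset(n, subset)) == n - 1:
            bottleneck = max(w for _, _, w in subset)
            best = bottleneck if best is None else min(best, bottleneck)
    return best

sample = [(0, 1, 1), (1, 2, 2), (2, 3, 5), (0, 3, 4), (0, 2, 3)]
print(mst_bottleneck(4, sample), brute_force_bottleneck(4, sample))  # 4 4
```

Node 3 can only be reached through an edge of weight 4 or 5, so no spanning tree has a bottleneck below 4, and the MST achieves exactly that.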