---
<h1>Breadth First Search (BFS)</h1>

Breadth First Search (BFS) is a fundamental graph traversal algorithm. It begins with a node, then first traverses all its adjacent nodes. Once all adjacent nodes are visited, it proceeds to traverse their adjacent nodes.         
- **Given a graph  G = (V, E)  and a source vertex \( S \)**:
  - It computes the **distance from \( S \) to each reachable vertex**.
  - BFS also produces a **breadth-first tree with root \( S \)** that contains all reachable vertices.
- **In an unweighted graph**, BFS finds the **shortest distance from a given vertex \( S \)** to each vertex it can reach.
- BFS can start from multiple vertices at once (see Q1)

**Notes:**
- BFS only gives shortest distances in an **unweighted graph** (every edge counts as one step)
- We usually do BFS level by level; when the BFS finishes, the depth of the last level is the distance from S to the farthest vertex it can reach
- We need an array/table to keep track of visited nodes

**Time Complexity: O(V + E)**

<span style="color: red;">BFS is not a hard algorithm, the crucial part is identifying what are our nodes and what are our edges!</span>
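
The description above can be sketched on a plain adjacency-list graph. This is a minimal illustration, not part of the original notes: the function name `bfs_distances` and the example graph are assumptions made for the example. The `dist` dictionary doubles as the visited table.

```python
from collections import deque

def bfs_distances(adj, source):
    """Return {vertex: number of edges on a shortest path from source}."""
    dist = {source: 0}              # also serves as the visited table
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, []):
            if v not in dist:       # first visit is the shortest in an unweighted graph
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

# Hypothetical undirected graph stored as adjacency lists
adj = {
    "S": ["A", "B"],
    "A": ["S", "C"],
    "B": ["S", "C"],
    "C": ["A", "B", "D"],
    "D": ["C"],
    "E": [],                        # not reachable from S
}
print(bfs_distances(adj, "S"))      # {'S': 0, 'A': 1, 'B': 1, 'C': 2, 'D': 3}
```

Unreachable vertices such as `"E"` simply never appear in the result.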


---
<h3>Q1. As Far from Land As Possible (LC.1162)---BFS Template</h3>

*Given an n x n grid containing only values 0 and 1, where 0 represents water and 1 represents land, find a water cell such that its distance to the nearest land cell is maximized, and return the distance. If no land or water exists in the grid, return -1.*

*The distance used in this problem is the Manhattan distance: the distance between two cells (x0, y0) and (x1, y1) is |x0 - x1| + |y0 - y1|.*

**Solution:**                          
Read the question carefully: we need to find "a water cell whose distance to the nearest land cell is maximized" and return that distance. Which cell is this? What exactly are we looking for?  
BFS explores all nodes at the current "distance" level before moving on to nodes at the next distance level. This means that if we start BFS from a set of nodes, we can discover the closest nodes first, followed by nodes that are further away:
- By starting the BFS from all land cells simultaneously, we treat each land cell as a "starting point" with a distance of 0.
- The BFS then proceeds to visit neighboring water cells level by level, effectively calculating the minimum distance from any water cell to the nearest land cell
- The last level of nodes in the BFS will be cells farthest from any land cell.
- When the BFS completes, the depth of the last level (with the land cells at depth 0) is this maximum distance

**Note:**
We don't need to explicitly declare a visited table. We can simply set each visited cell to 1 (land) so that it won't be visited again, since the original grid is only needed to locate the starting nodes.

In [40]:
from collections import deque

class Solution(object):
    def maxDistance(self, grid):
        move = [-1, 0, 1, 0, -1]  # For traversing in 4 directions
        n = len(grid)
        q = deque()

        for i in range(n):
            for j in range(n):
                if grid[i][j] == 1:
                    q.append((i, j))  # Start BFS from all land cells

        # If there are no land cells or no water cells, return -1
        if len(q) == 0 or len(q) == n * n:
            return -1

        # Perform BFS
        level = 0
        while q:
            level += 1
            for _ in range(len(q)):
                x, y = q.popleft()
                # Explore in 4 possible directions
                for j in range(4):
                    nx, ny = x + move[j], y + move[j + 1]
                    # Check if the new position is within bounds and is unvisited water
                    if 0 <= nx < n and 0 <= ny < n and grid[nx][ny] == 0:
                        grid[nx][ny] = 1  # Mark as visited by setting it to land
                        q.append((nx, ny))

        return level - 1

---
<h3>Q2: Stickers to Spell Word (LC.691)</h3>

*We are given n different types of stickers. Each sticker has a lowercase English word on it.*

*You would like to spell out the given string target by cutting individual letters from your collection of stickers and rearranging them. You can use each sticker more than once if you want, and you have infinite quantities of each sticker.*

*Return the minimum number of stickers that you need to spell out target. If the task is impossible, return -1.*

*Note: In all test cases, all words were chosen randomly from the 1000 most common US English words, and target was chosen as a concatenation of two random words.*

**Solution:**  
There are multiple ways to do this (you can also use DP), but here we use BFS.  

Think of this example:  
- Target: `"aaabbbccc"`  
- Stickers available: `"abc"` and `"bbb"`  

We can use either sticker:  
- If we use the first sticker, the remaining target is `"aabbcc"`  
- If we use the second sticker, the remaining target is `"aaaccc"`  

Isn't this like a graph?
- **Each string is a node, and using a sticker is traversing an edge**
- Our goal is to reach the node representing the empty string `""`, starting from the node for the target

Since all the strings can be considered as nodes, we can use BFS:
- For each level, we try all the edges (i.e., all possible stickers).
- We continue this process to find the level at which we reach the empty string `""`. This level represents the minimum number of stickers required to form the target string.

Therefore, the answer is the level at which we first reach the empty string in our BFS traversal.

**Pruning:**  
If we use a simple BFS, we have to try every edge for every node. However, there will be many useless edges for a node.  
For example, if our current target is `"aabb"`, and an edge is `"ccc"`, we don't need to try it because the sticker doesn't even contain the letter `'a'`.  
If you want to change the target to `""`, eventually you will have to take a sticker that contains the letter `'a'`, so why not take it right away?  
By putting stickers into a table, we can prune those useless edges:  
- Since we only have 26 letters, we prepare a table with 26 rows.  
- Each row contains all the stickers that contain the corresponding letter.  
- For example, the sticker `"aabc"` will be put into row `'a'`, row `'b'`, and row `'c'`.  
- When we are at a node that starts with the letter `'a'`, we only need to try the stickers in row `'a'`.  

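**How do we know this is correct?**  
- BFS explores all paths level by level, so the first level at which `""` appears is the minimum number of stickers.  
- The pruning does not lose any solution: the first letter of the current string must eventually be covered by some sticker, and the order in which stickers are applied does not change the result, so we may assume that sticker is applied first. Pruning only fixes the **order** in which stickers are taken, leading to a faster algorithm.  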


In [64]:
from collections import defaultdict, deque

class Solution(object):
    def minStickers(self, stickers, target):
        #preprocess stickers and generate table for pruning
        stickers_map = defaultdict(list)
        for sticker in stickers:
            sticker = sorted(sticker)
            for i in range(len(sticker)):
                if i == 0 or sticker[i] != sticker[i - 1]:
                    stickers_map[sticker[i]].append(sticker)

        queue = deque([target])
        visited = set([target])
        level = 0

        while queue:
            level += 1
            n = len(queue)
            for _ in range(n):
                cur = queue.popleft()
                cur = "".join(sorted(cur)) # sort the current string for pruning
                # select stickers based on the first letter in current string
                for sticker in stickers_map[cur[0]]:
                    processed_str = self.process(cur, sticker)
                    if processed_str == "":
                        return level    # If we reach the empty string, success
                    if processed_str not in visited:
                        visited.add(processed_str)
                        queue.append(processed_str)

        return -1  # If we exhaust the BFS without finding an empty string, fail


    # if you find this function confusing, try it in your mind with the example target="abc" and sticker="acd" and you will get it
    def process(self, target, sticker):
        processed_str = []
        i, j = 0, 0
        while i < len(target):
            if j == len(sticker):
                processed_str.append(target[i])
                i += 1
            else:
                if target[i] < sticker[j]:
                    processed_str.append(target[i])
                    i += 1
                elif target[i] > sticker[j]:
                    j += 1
                else:
                    i += 1
                    j += 1
        return "".join(processed_str)

---
<h1>01-BFS</h1>

**0-1 BFS** is an algorithm used for finding the shortest path in a **Binary Weighted Graph** where the edge weights are restricted to either 0 or 1. It efficiently computes the shortest distance from a source vertex to all other vertices in such a graph.          

In fact, 0-1 BFS is just **a special case of Dijkstra's Algorithm**  

#### Algorithm:
1. **Initialization:**
   - Create a distance array `distance[]`, where `distance[i]` represents the shortest distance from the source vertex to vertex `i`. Initialize all distances to infinity (`∞`), except the distance to the source itself, which is set to 0.
   - Use a deque, because we need to insert nodes at both the front and the back of the queue.
   - Insert the source vertex into the deque.

2. **Procedure:**
   - While the deque is not empty:
     - Pop a vertex `cur` from the front of the deque.
     - If `cur` is the target vertex, return `distance[cur]` (shortest distance to the target is found).
     - For each outgoing edge from `cur` (leading to vertex `v` with weight `w`):
       - If `distance[v] <= distance[cur] + w`, do nothing (no shorter path is found).
       - If `distance[v] > distance[cur] + w`, update the distance: `distance[v] = distance[cur] + w`, then:
         - If the edge weight `w` is 0, push vertex `v` to the front of the deque.
         - If the edge weight `w` is 1, push vertex `v` to the back of the deque.

#### Correctness:
- Whenever a vertex `cur` is popped from the deque, `distance[cur]` represents the shortest distance from the source to `cur`.
- This holds because the deque stays ordered by distance: a vertex reached over a 0-weight edge has the same distance as `cur` and goes to the front, while a vertex reached over a 1-weight edge is one step farther and goes to the back.
- The algorithm guarantees finding the shortest path to all reachable vertices, even if no specific target is provided.

#### Why No `visited` Array is Needed:
- In a 0-1 graph, a vertex is only pushed into the deque when a strictly shorter path to it is found.
- A vertex may first be reached over a 1-weight edge and later over a 0-weight edge from a vertex at the same distance, so its distance can be lowered once after its first insertion, but not more.
- Each vertex therefore enters the deque at most twice, which makes a `visited` array redundant.

#### Time Complexity O(V + E), because:
- Each vertex can be pushed into and popped from the deque at most twice (once for updating and potentially a second time if a shorter path is found).
- An edge is relaxed only when its tail vertex is popped, so each edge is processed at most twice

#### Key Insight:
- The efficiency of 0-1 BFS comes from pushing vertices to the front of the deque for 0-weight edges, ensuring that the shortest paths are quickly propagated through the graph.
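
The procedure above can be sketched on a generic adjacency-list graph. This is a minimal illustration under assumptions that are not in the original notes: the function name `zero_one_bfs`, the `(neighbor, weight)` adjacency format, and the example graph are all made up for the example. Note that it keeps no `visited` array.

```python
from collections import deque

def zero_one_bfs(n, adj, src):
    """adj[u] is a list of (v, w) pairs with w in {0, 1}; vertices are 0..n-1."""
    dist = [float('inf')] * n
    dist[src] = 0
    dq = deque([src])
    while dq:
        u = dq.popleft()
        for v, w in adj[u]:
            nd = dist[u] + w
            if nd < dist[v]:            # push only when a shorter path is found
                dist[v] = nd
                if w == 0:
                    dq.appendleft(v)    # same distance as u: front of the deque
                else:
                    dq.append(v)        # one step farther: back of the deque
    return dist

# Hypothetical directed graph with 4 vertices
adj = [
    [(1, 1), (2, 0)],   # 0 -> 1 (weight 1), 0 -> 2 (weight 0)
    [(3, 1)],           # 1 -> 3 (weight 1)
    [(1, 0), (3, 1)],   # 2 -> 1 (weight 0), 2 -> 3 (weight 1)
    [],
]
print(zero_one_bfs(4, adj, 0))  # [0, 0, 0, 1]
```

In this example vertex 1 is pushed twice: first with distance 1 (directly from 0), then with distance 0 (through 2), which is the "at most twice" case discussed above.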


---
<h3>Q1: Minimum Obstacle Removal (LC.2290) --- 01-BFS Template</h3>

*You are given a 0-indexed 2D integer array grid of size m x n. Each cell has one of two values:*      

- *`0` represents an empty cell,*
- *`1` represents an obstacle that may be removed.*

*You can move up, down, left, or right from and to an empty cell.*

*Return the minimum number of obstacles to remove so you can move from the upper left corner (0, 0) to the lower right corner (m - 1, n - 1).*

In [4]:
from collections import deque

class Solution(object):
    def minimumObstacles(self, grid):
        move = [-1, 0, 1, 0, -1]

        q = deque([[0, 0]])     # Make sure it is not deque([0, 0])
        m = len(grid)
        n = len(grid[0])

        distance = [[float('inf')] * n for _ in range(m)]
        distance[0][0] = 0
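        # A visited table is safe here because the weight of a move depends only on the
        # destination cell, so the first time a cell is reached is already its cheapest.
        visited = [[False] * n for _ in range(m)]
        visited[0][0] = True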

        # Perform 0-1 BFS
        while q:
            x, y = q.popleft()
            for i in range(4):
                nx = x + move[i]
                ny = y + move[i + 1]
                if 0 <= nx < m and 0 <= ny < n and not visited[nx][ny]:
                    new_distance = distance[x][y] + grid[nx][ny]
                    if new_distance < distance[nx][ny]:
                        distance[nx][ny] = new_distance
                        # Push to front for weight 0, back for weight 1
                        if grid[nx][ny] == 0:
                            q.appendleft([nx, ny])
                        else:
                            q.append([nx, ny])
                    visited[nx][ny] = True

        return distance[m - 1][n - 1]

---
<h3>Q2: Minimum Cost to Make at Least One Valid Path in a Grid (LC.1368)</h3>

*Given an `m x n` grid. Each cell of the grid has a sign pointing to the next cell you should visit if you are currently in this cell. The sign of `grid[i][j]` can be:*

- *`1` which means go to the cell to the **right**. (i.e., go from `grid[i][j]` to `grid[i][j + 1]`)*
- *`2` which means go to the cell to the **left**. (i.e., go from `grid[i][j]` to `grid[i][j - 1]`)*
- *`3` which means go to the **lower** cell. (i.e., go from `grid[i][j]` to `grid[i + 1][j]`)*
- *`4` which means go to the **upper** cell. (i.e., go from `grid[i][j]` to `grid[i - 1][j]`)*

*Notice that there could be some signs on the cells of the grid that point **outside the grid**.*

*You will initially start at the upper-left cell `(0, 0)`. A **valid path** in the grid is a path that starts from the upper-left cell `(0, 0)` and ends at the bottom-right cell `(m - 1, n - 1)` following the signs on the grid. The valid path does not have to be the shortest.*

*You can **modify the sign** on a cell with `cost = 1`. You can modify the sign on a cell **one time only**.*

*Return the **minimum cost** to make the grid have at least one valid path.*
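
One possible way to model this problem with 0-1 BFS, sketched under the assumption that each cell is a node and each of the four moves is an edge: following the cell's own sign costs 0, and moving in any other direction costs 1 (the sign is changed). The function name `min_cost` is hypothetical and this is not necessarily the intended solution.

```python
from collections import deque

def min_cost(grid):
    m, n = len(grid), len(grid[0])
    # Sign value -> (row delta, col delta): 1 = right, 2 = left, 3 = down, 4 = up
    directions = {1: (0, 1), 2: (0, -1), 3: (1, 0), 4: (-1, 0)}
    cost = [[float('inf')] * n for _ in range(m)]
    cost[0][0] = 0
    dq = deque([(0, 0)])
    while dq:
        x, y = dq.popleft()
        for sign, (dx, dy) in directions.items():
            nx, ny = x + dx, y + dy
            if not (0 <= nx < m and 0 <= ny < n):
                continue
            # Weight 0 if we follow the existing sign, 1 if we must change it
            w = 0 if grid[x][y] == sign else 1
            if cost[x][y] + w < cost[nx][ny]:
                cost[nx][ny] = cost[x][y] + w
                if w == 0:
                    dq.appendleft((nx, ny))
                else:
                    dq.append((nx, ny))
    return cost[m - 1][n - 1]

print(min_cost([[1, 1, 3], [3, 2, 2], [1, 1, 4]]))  # 0, a valid path already exists
```

A cheapest path never revisits a cell, so the "modify one time only" rule is respected automatically.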


---
<h1>Dijkstra's Algorithm</h1>

Dijkstra's Algorithm is used to find the shortest paths from a **source vertex** to **all other vertices** in a given weighted graph.

Basically Dijkstra is just **BFS + Cost Function**    

**Cost Function g(n)** = path cost from source vertex to node n

**Limitation:**
- The graph must not have any negative edges.
- Only a single source                     

**Time Complexity**: O(E log V) with a binary heap, where V is the number of vertices and E is the number of edges

**Procedure:**

1. `distance[i]` represents the shortest distance from the source node to node `i`, and `visited[i]` indicates whether node `i` has been popped from the priority queue.

2. Prepare a priority queue(Heap). The priority queue stores records in the form `(x, distance from the source)`, and the priority is determined based on the distance.

3. Set `distance[source] = 0`, and push `(source, 0)` into the priority queue.

4. While the priority queue is not empty, repeat the following steps:
   - Pop `(u, distance[u])` from the priority queue.
   - If `visited[u]` is `true`, skip
   - If `visited[u]` is `false`, set `visited[u] = true`, indicating node `u` has been processed.
   - Then, consider each edge of node `u`. Suppose an edge leads to node `v` with weight `w`:
       - If `visited[v]` is `false` and `distance[u] + w < distance[v]`:
         - Update `distance[v]`
         - Push `(v, distance[v])` into the priority queue.

---
<h3>Q1: Network Delay Time (LC.743) --- Dijkstra Template 1 (Given Edges)</h3>

*You are given a network of n nodes, labeled from 1 to n. You are also given times, a list of travel times as directed edges times[i] = (ui, vi, wi), where ui is the source node, vi is the target node, and wi is the time it takes for a signal to travel from source to target.*          

*We will send a signal from a given node k. Return the minimum time it takes for all the n nodes to receive the signal. If it is impossible for all the n nodes to receive the signal, return -1.*

In [120]:
import heapq
from collections import defaultdict

class Solution(object):
    def networkDelayTime(self, times, n, k):
        distances = [float('inf')] * (n + 1) #index start at 1
        distances[k] = 0
        visited = [False] * (n + 1)

        graph = defaultdict(list)
        for u, v, w in times:
            graph[u].append((v, w))

        heap = []
        heapq.heappush(heap, (0, k))    # (distance, node) since python heap will sort based on first element

        # Dijkstra's algorithm
        while heap:
            d, u = heapq.heappop(heap)  # d: distance, u: current node
            if visited[u]:
                continue
            # First time u is popped: d is its final shortest distance
            visited[u] = True
            # Relax edges
            for v, w in graph[u]:       # v: next node, w: distance between u and v (weight)
                nd = d + w              # nd: new distance
                if nd < distances[v]:
                    distances[v] = nd
                    heapq.heappush(heap, (nd, v))

        max_distance = max(distances[1:])

        # According to question, if there's a node that is unreachable, return -1
        return max_distance if max_distance != float('inf') else -1

---
<h3>Q2: Path With Minimum Effort (LC.1631) --- Dijkstra Template 2 (Given Grid)</h3>

*You are a hiker preparing for an upcoming hike. You are given heights, a 2D array of size rows x columns, where heights[row][col] represents the height of cell (row, col). You are situated in the top-left cell, (0, 0), and you hope to travel to the bottom-right cell, (rows-1, columns-1) (i.e., 0-indexed). You can move up, down, left, or right, and you wish to find a route that requires the minimum effort.*           

*A route's effort is the maximum absolute difference in heights between two consecutive cells of the route.*             

*Return the minimum effort required to travel from the top-left cell to the bottom-right cell.*

In [34]:
import heapq

class Solution(object):
    def minimumEffortPath(self, heights):
        """
        :type heights: List[List[int]]
        :rtype: int
        """
        move = [-1, 0, 1, 0, -1]

        n = len(heights)
        m = len(heights[0])

        efforts = [[float('inf')] * m for _ in range(n)]
        efforts[0][0] = 0
        visited = [[False] * m for _ in range(n)]
        
        heap = []
        heapq.heappush(heap, (0, 0, 0))

        while heap:
            d, x, y = heapq.heappop(heap)
            if visited[x][y]:
                continue
            visited[x][y] = True
            for i in range(4):
                nx = x + move[i]
                ny = y + move[i + 1]
                if 0 <= nx < n and 0 <= ny < m and not visited[nx][ny]:
                    w = abs(heights[nx][ny] - heights[x][y])
                    # Here we update distance like this because the problem is asking for the max distance(effort) along a route, not the total distance
                    new_effort = max(w, d)
                    if new_effort < efforts[nx][ny]:
                        efforts[nx][ny] = new_effort
                        heapq.heappush(heap, (new_effort, nx, ny))
        
        return efforts[n - 1][m - 1]

---
<h3>Q3: Swimming In Rising Water (LC.778)</h3>

*You are given an n x n integer matrix grid where each value grid[i][j] represents the elevation at that point (i, j).*       

*The rain starts to fall. At time t, the depth of the water everywhere is t. You can swim from a square to another 4-directionally adjacent square if and only if the elevation of both squares individually are at most t. You can swim infinite distances in zero time. Of course, you must stay within the boundaries of the grid during your swim.*         

*Return the least time until you can reach the bottom right square (n - 1, n - 1) if you start at the top left square (0, 0).*
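
This problem can be approached with the same grid template as Q2, sketched here under the assumption that the "cost" of a path is the highest elevation on it (the earliest time at which every cell of the path is under water). The function name `swim_in_water` is hypothetical, and this is one possible solution rather than the only one.

```python
import heapq

def swim_in_water(grid):
    n = len(grid)
    move = [-1, 0, 1, 0, -1]
    # best[x][y] = smallest "highest elevation on the path" found so far
    best = [[float('inf')] * n for _ in range(n)]
    best[0][0] = grid[0][0]
    visited = [[False] * n for _ in range(n)]
    heap = [(grid[0][0], 0, 0)]
    while heap:
        t, x, y = heapq.heappop(heap)
        if visited[x][y]:
            continue
        visited[x][y] = True
        if x == n - 1 and y == n - 1:
            return t
        for i in range(4):
            nx, ny = x + move[i], y + move[i + 1]
            if 0 <= nx < n and 0 <= ny < n and not visited[nx][ny]:
                # Path cost is the maximum elevation seen, not the sum
                nt = max(t, grid[nx][ny])
                if nt < best[nx][ny]:
                    best[nx][ny] = nt
                    heapq.heappush(heap, (nt, nx, ny))
    return -1

print(swim_in_water([[0, 2], [1, 3]]))  # 3
```

As in Q2, the only change from the standard template is that a path's cost is combined with `max` instead of `+`.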

---
<h1>Shortest Path Problems in Layered Graph(Graph With States)</h1>

<h3>Q1: Shortest Path To Get All Keys (LC.864)</h3>

*You are given an m x n grid where:*

- *`'.'` is an empty cell.*
- *`'#'` is a wall.*
- *`'@'` is the starting point.*
- *Lowercase letters represent keys.*
- *Uppercase letters represent locks.*

*You start at the starting point and one move consists of walking one space in one of the four cardinal directions. You cannot walk outside the grid, or walk into a wall.*

*If you walk over a key, you can pick it up, and you cannot walk over a lock unless you have its corresponding key.*

*For some `1 <= k <= 6`, there is exactly one lowercase and one uppercase letter of the first k letters of the English alphabet in the grid. This means that there is exactly one key for each lock, and one lock for each key; and also that the letters used to represent the keys and locks were chosen in the same order as the English alphabet.*

*Return the lowest number of moves to acquire all keys. If it is impossible, return -1.*
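
A possible way to read "graph with states" for this problem, sketched under the assumption that a node is the triple (row, column, set of keys held) rather than just a cell. The key set is stored as a bitmask, and a plain level-by-level BFS runs over these states. The function name `shortest_path_all_keys` is hypothetical.

```python
from collections import deque

def shortest_path_all_keys(grid):
    m, n = len(grid), len(grid[0])
    all_keys = 0
    start = None
    for i in range(m):
        for j in range(n):
            c = grid[i][j]
            if c == '@':
                start = (i, j)
            elif 'a' <= c <= 'f':
                all_keys |= 1 << (ord(c) - ord('a'))
    if all_keys == 0:
        return 0

    # State = (row, col, bitmask of keys held); the same cell with a
    # different key set is a different node (a different "layer")
    queue = deque([(start[0], start[1], 0)])
    visited = {(start[0], start[1], 0)}
    move = [-1, 0, 1, 0, -1]
    steps = 0
    while queue:
        steps += 1
        for _ in range(len(queue)):
            x, y, keys = queue.popleft()
            for d in range(4):
                nx, ny = x + move[d], y + move[d + 1]
                if not (0 <= nx < m and 0 <= ny < n):
                    continue
                c = grid[nx][ny]
                if c == '#':
                    continue
                # A lock blocks us unless we hold the matching key
                if 'A' <= c <= 'F' and not (keys & (1 << (ord(c) - ord('A')))):
                    continue
                nkeys = keys
                if 'a' <= c <= 'f':
                    nkeys |= 1 << (ord(c) - ord('a'))
                if nkeys == all_keys:
                    return steps
                if (nx, ny, nkeys) not in visited:
                    visited.add((nx, ny, nkeys))
                    queue.append((nx, ny, nkeys))
    return -1

print(shortest_path_all_keys(["@.a..", "###.#", "b.A.B"]))  # 8
```

With at most 6 keys there are at most 64 key sets, so the state graph has at most `m * n * 64` nodes and BFS stays cheap.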

---
<h1>A* Algorithm</h1>

A* is an **Informed Search** algorithm aiming to find the shortest path from a **source vertex** to a **destination vertex**    

Thus, the priority function used in A* can be expressed as:
- **Priority function f(n) = g(n) + h(n)**, where
- **Cost function g(n)** = The actual cost from the source node to the current node.
- **Heuristic Function h(n)** = Estimated cost of the cheapest path from node n to destination vertex

So basically A* is just **Dijkstra + Heuristic Function**

**Procedure**:
Almost the same as Dijkstra; the only difference is that instead of putting `g(n)` into the heap, we put `f(n)` into the heap for A*.

**Time Complexity**: O(E log V) in the worst case, where V is the number of vertices and E is the number of edges


## Common Heuristic Functions

### Manhattan Distance:
h(n) = |Xj - Xi| + |Yj - Yi|            
This is commonly used when movement is restricted to four directions (left, right, up, down) on a grid.

### Diagonal Distance:
h(n) = max(|Xj - Xi|, |Yj - Yi|)             
This heuristic is used when diagonal movement (45-degree angles) is allowed.

### Euclidean Distance:
h(n) = sqrt((Xj - Xi)^2 + (Yj - Yi)^2)  
This is appropriate when movement is unrestricted, allowing for any angle.


In [40]:
import heapq
import math

move = [-1, 0, 1, 0, -1]

def min_distance_astar(grid, startX, startY, targetX, targetY):
    if grid[startX][startY] == 0 or grid[targetX][targetY] == 0:
        return -1
    
    n = len(grid)
    m = len(grid[0])
    distance = [[float('inf')] * m for _ in range(n)]
    distance[startX][startY] = 1
    visited = [[False] * m for _ in range(n)]
    heap = []
    
    # Push (f(n), x, y) into heap, using manhattan distance here
    heapq.heappush(heap, (1 + manhattan_distance(startX, startY, targetX, targetY), startX, startY))
    
    while heap:
        cur_f, x, y = heapq.heappop(heap)
        
        if visited[x][y]:
            continue
        visited[x][y] = True
        
        if x == targetX and y == targetY:
            return distance[x][y]
        
        for i in range(4):
            nx = x + move[i]
            ny = y + move[i + 1]
            
            if 0 <= nx < n and 0 <= ny < m and grid[nx][ny] == 1 and not visited[nx][ny]:
                if distance[x][y] + 1 < distance[nx][ny]:
                    distance[nx][ny] = distance[x][y] + 1
                    f_n = distance[nx][ny] + manhattan_distance(nx, ny, targetX, targetY)
                    heapq.heappush(heap, (f_n, nx, ny))
    
    return -1

def manhattan_distance(x, y, targetX, targetY):
    return abs(targetX - x) + abs(targetY - y)

def diagonal_distance(x, y, targetX, targetY):
    return max(abs(targetX - x), abs(targetY - y))

def euclidean_distance(x, y, targetX, targetY):
    return math.sqrt((targetX - x) ** 2 + (targetY - y) ** 2)

---
<h1>Floyd Warshall Algorithm</h1>

The Floyd Warshall algorithm is used to find the shortest paths between **all pairs of nodes** in any weighted graph.  

- It can handle **directed or undirected** graphs, with both **positive and negative edge weights**; the only constraint is that there must not be a negative weight cycle (otherwise the notion of "shortest path" doesn't apply). Note that in an undirected graph a single negative edge already forms such a cycle.

**Time Complexity**: O(N^3), so we should only consider using it when the number of vertices is small

**Aux Space Complexity**: O(N^2)

**Basic Idea**:
- for all pairs of nodes `SRC` and `DES`, does there exist an intermediate node `MED` that makes their distance smaller?
- using only **enumeration**

In [62]:
def floyd(n, edges):
    # edges: list of (u, v, w) directed edges, vertices 0..n-1 (assumed input format)
    distance = [[float('inf')] * n for _ in range(n)]
    for i in range(n):
        distance[i][i] = 0
    for u, v, w in edges:
        distance[u][v] = min(distance[u][v], w)

    # O(N^3) procedure
    # Enumerate each possible bridge
    for bridge in range(n):
        for i in range(n):
            for j in range(n):
                # If there's a path i -> ...bridge... -> j
                # Check if distance[i][j] can be shortened
                if distance[i][bridge] != float('inf') and distance[bridge][j] != float('inf'):
                    distance[i][j] = min(distance[i][j], distance[i][bridge] + distance[bridge][j])

    return distance
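
A small worked example may help here. The sketch below is self-contained and makes its own assumptions: the function names `floyd_warshall` and `has_negative_cycle`, the `(u, v, w)` edge format, and the example graph are hypothetical. It also shows a common way to check the "no negative cycle" constraint: after the triple loop, a negative value on the diagonal means some vertex can reach itself with negative total weight.

```python
def floyd_warshall(n, edges):
    """edges: (u, v, w) directed edges on vertices 0..n-1 (assumed input format)."""
    INF = float('inf')
    dist = [[INF] * n for _ in range(n)]
    for i in range(n):
        dist[i][i] = 0
    for u, v, w in edges:
        dist[u][v] = min(dist[u][v], w)     # keep the cheapest parallel edge
    for k in range(n):                      # k is the intermediate "bridge"
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist

def has_negative_cycle(dist):
    # A vertex that can reach itself with negative cost lies on a negative cycle
    return any(dist[i][i] < 0 for i in range(len(dist)))

# Hypothetical graph with one negative edge but no negative cycle
edges = [(0, 1, 4), (0, 2, 5), (1, 2, -2), (2, 3, 3)]
dist = floyd_warshall(4, edges)
print(dist[0][3])                   # 5, via 0 -> 1 -> 2 -> 3 (4 - 2 + 3)
print(has_negative_cycle(dist))     # False
```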


---
<h1>Bellman Ford Algorithm and SPFA</h1>

<p>I'll do it later</p>