# Graphs 🥰

**A graph G is simply a set V of vertices (nodes) and a collection E of pairs of vertices (nodes) from V, called edges.**

Thus, a graph is a way of representing connections or relationships between pairs of objects from some set V.


## Definitions about Graphs

- Edges in a graph are either **directed or undirected.** An edge $(u, v)$ is said to be directed from $u$ to $v$ if the pair $(u, v)$ is ordered, with $u$ preceding $v$. An edge $(u, v)$ is said to be undirected if the pair $(u, v)$ is not ordered.

- If all the edges in a graph are undirected, then we say the graph is an **undirected graph**. Likewise, a **directed graph**, also called a digraph, is a graph whose edges are all directed. A graph that has both directed and undirected edges is often called **a mixed graph.** 

- If an edge is directed, its first endpoint is its **origin** and the other is the **destination** of the edge.

- Two vertices u and v are said to be **adjacent** if there is an edge whose end vertices are u and v. 

- An edge is said to be **incident** to a vertex if the vertex is one of the edge’s endpoints. 

- **The outgoing edges** of a vertex are the directed edges whose origin is that vertex. 

- **The incoming edges** of a vertex are the directed edges whose destination is that vertex. 

- The degree of a vertex v, denoted **deg(v)**, is the number of incident edges of v. 

- The in-degree and out-degree of a vertex v are the number of the incoming and outgoing edges of v, and are denoted **indeg(v)** and **outdeg(v)** , respectively.

A flight network is a great example!
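To make the degree definitions concrete, here is a minimal sketch on a tiny flight network. The airport codes and routes are made up for illustration, and the helper names (`out_degree`, `in_degree`, `degree`) are assumptions, not part of any library.

```python
# Hypothetical flight network: each (origin, destination) pair is a directed edge.
flights = [
    ("ORD", "JFK"),
    ("JFK", "ORD"),
    ("ORD", "SFO"),
    ("SFO", "LAX"),
    ("LAX", "ORD"),
]

def out_degree(edges, v):
    """Number of directed edges whose origin is v."""
    return sum(1 for origin, _ in edges if origin == v)

def in_degree(edges, v):
    """Number of directed edges whose destination is v."""
    return sum(1 for _, destination in edges if destination == v)

def degree(edges, v):
    """Number of edges incident to v (this sketch assumes no self-loops)."""
    return in_degree(edges, v) + out_degree(edges, v)

print(out_degree(flights, "ORD"))  # 2
print(in_degree(flights, "ORD"))   # 2
print(degree(flights, "ORD"))      # 4
```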

## Data Structures for Graphs 🥰

We will see four data structures for representing a graph.

In each representation, we maintain a collection to store the **vertices** (nodes) of a graph. 

However, the four representations differ greatly in the way they organize the edges.

• In an **edge list**, we maintain an unordered list of all edges. This minimally suffices, but there is ***no efficient way to locate a particular edge*** $(u, v)$, or the set of all edges incident to a vertex v.

• In an **adjacency list**, we maintain, for each vertex, a separate list containing those edges that are incident to the vertex. The complete set of edges can be determined by taking the union of the smaller sets, while the organization allows us to more efficiently find all edges incident to a given vertex.

• An **adjacency map** is very similar to an adjacency list, but the secondary container of all edges incident to a vertex is organized as a map, rather than as a list, with the adjacent vertex serving as a key. This allows for access to a specific edge $(u, v)$ in $O(1)$ expected time.

• An **adjacency matrix** provides worst-case $O(1)$ access to a specific edge $(u, v)$ by maintaining an n × n matrix, for a graph with n vertices. Each entry is dedicated to storing a reference to the edge $(u, v)$ for a particular pair of vertices u and v; if no such edge exists, the entry will be `None`. The cost is memory: the matrix uses $O(n^2)$ space.

### Edge List Structure 💙

The edge list structure is possibly the simplest, though not the most efficient, representation of a graph G. All vertex objects are stored in an unordered list V, and all edge objects are stored in unordered list E.

Performance:

The space usage is $O(n + m)$ for representing a graph with n vertices and m edges.

The most significant limitations of an edge list structure, especially when compared to the other graph representations, are the $O(m)$ running times of methods `get_edge(u,v)`, `degree(v)`, and `incident_edges(v)`. 

The problem is that with all edges of the graph in an unordered list E, the only way to answer those queries is through an exhaustive inspection of all edges. The other data structures introduced in this section will implement these methods **more efficiently**.
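A minimal sketch of the edge list idea, assuming an undirected graph whose edges are plain `(u, v)` tuples. The class name `EdgeListGraph` and its method bodies are illustrative assumptions; only the method names `get_edge`, `degree`, and `incident_edges` come from the notes above.

```python
class EdgeListGraph:
    """Edge list sketch: vertices and edges live in two unordered lists."""

    def __init__(self):
        self.vertices = []  # the list V
        self.edges = []     # the list E, each edge stored as a (u, v) tuple

    def insert_vertex(self, v):
        self.vertices.append(v)

    def insert_edge(self, u, v):
        self.edges.append((u, v))

    def get_edge(self, u, v):
        """O(m): every edge may have to be inspected."""
        for e in self.edges:
            if e == (u, v) or e == (v, u):  # undirected, so order is ignored
                return e
        return None

    def incident_edges(self, v):
        """O(m): nothing tells us where the edges of v are stored."""
        return [e for e in self.edges if v in e]

    def degree(self, v):
        """O(m), because it relies on incident_edges."""
        return len(self.incident_edges(v))


g = EdgeListGraph()
for name in ("a", "b", "c"):
    g.insert_vertex(name)
g.insert_edge("a", "b")
g.insert_edge("b", "c")
print(g.get_edge("c", "b"))  # ('b', 'c')
print(g.degree("b"))         # 2
```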



### Adjacency List Structure 💚

In contrast to the edge list representation of a graph, the adjacency list structure groups the edges of a graph by storing them in smaller, secondary containers that are associated with each individual vertex.

Specifically, for each vertex v, we maintain a collection I(v), called the incidence collection of v, whose entries are edges incident to v. (In the case of a directed graph, outgoing and incoming edges can be respectively stored in two separate collections, $I_{out}(v)$ and $I_{in}(v)$.) Traditionally, the incidence collection I(v) for a vertex v is a list, which is why we call this way of representing a graph the adjacency list structure.

The primary **benefit** of an ==adjacency list== is that the collection $I(v)$ contains exactly those edges that should be reported by the method `incident_edges(v)`. Therefore, we can implement this method by iterating the edges of $I(v)$ in $O(deg(v))$ time, where $deg(v)$ is the degree of vertex v. This is the best possible outcome for any graph representation, because there are $deg(v)$ edges to be reported.


#### Performance:
 
Asymptotically, the space requirements for an adjacency list are the same as an edge list structure, using $O(n + m)$ space for a graph with n vertices and m edges. 

The primary list of vertices uses $O(n)$ space.  The sum of the lengths of all secondary lists is $O(m)$, for reasons that were formalized in Propositions 14.8 and 14.9. In short, an undirected edge (u, v) is referenced in both $I(u)$ and $I(v)$, but its presence in the graph results in only a constant amount of additional space.

We have already noted that the `incident_edges(v)` method can be achieved in $O(deg(v))$ time based on use of $I(v)$. We can achieve the `degree(v)` method of the graph ADT to use $O(1)$ time, assuming collection $I(v)$ can report its size in similar time. 

To locate a specific edge for implementing `get_edge(u,v)`, we can search through either $I(u)$ or $I(v)$. By choosing the smaller of the two, we get $O(min(deg(u), deg(v)))$ running time.
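The running times above can be sketched as follows. This is an illustrative assumption rather than a reference implementation: the class name `AdjacencyListGraph` is made up, the graph is undirected without self-loops, and the primary collection is a dict keyed by vertex for convenience.

```python
class AdjacencyListGraph:
    """Adjacency list sketch: each vertex v owns a list I(v) of incident edges."""

    def __init__(self):
        self.incidence = {}  # vertex -> list I(v)

    def insert_vertex(self, v):
        self.incidence.setdefault(v, [])

    def insert_edge(self, u, v):
        edge = (u, v)
        # The same edge object is referenced from both I(u) and I(v).
        self.incidence[u].append(edge)
        self.incidence[v].append(edge)

    def degree(self, v):
        """O(1): a Python list reports its length in constant time."""
        return len(self.incidence[v])

    def incident_edges(self, v):
        """O(deg(v)): iterate exactly the edges stored in I(v)."""
        return iter(self.incidence[v])

    def get_edge(self, u, v):
        """O(min(deg(u), deg(v))): scan the shorter incidence list."""
        shorter = min(self.incidence[u], self.incidence[v], key=len)
        for e in shorter:
            if e == (u, v) or e == (v, u):
                return e
        return None


g = AdjacencyListGraph()
for name in ("a", "b", "c", "d"):
    g.insert_vertex(name)
g.insert_edge("a", "b")
g.insert_edge("a", "c")
g.insert_edge("a", "d")
print(g.degree("a"))         # 3
print(g.get_edge("d", "a"))  # ('a', 'd')
```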

### Adjacency Map Structure 💛 `{"vertices" : edges}`

Expected $O(1)$ time for edge lookup. I think this is the one! (Vertices as keys, edges as values)

In the adjacency list structure, we assume that the secondary incidence collections are implemented as unordered linked lists. Such a collection $I(v)$ uses space proportional to $O(deg(v))$, allows an edge to be added or removed in $O(1)$ time, and allows an iteration of all edges incident to vertex v in $O(deg(v))$ time. 

However, the best implementation of `get_edge(u,v)` requires $O(min(deg(u), deg(v)))$ time, because we must search through either $I(u)$ or $I(v)$.

We can improve the performance by using a hash-based map to implement $I(v)$ for each vertex v. Specifically, we let the opposite endpoint of each incident edge serve as a key in the map, with the edge structure serving as the value. We call such a graph representation an adjacency map. (See Figure 14.6.) The space usage for an adjacency map remains $O(n + m)$, because $I(v)$ uses $O(deg(v))$ space for each vertex v, as with the adjacency list.

The adjacency map essentially achieves optimal running times for all methods, making it an excellent all-purpose choice as a graph representation.
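A minimal sketch of the adjacency map, assuming an undirected graph and using Python's built-in `dict` as the hash-based map. The class name `AdjacencyMapGraph` and the `(u, v, element)` edge tuple are assumptions made for illustration.

```python
class AdjacencyMapGraph:
    """Adjacency map sketch: I(v) is a dict keyed by the opposite endpoint."""

    def __init__(self):
        self.adj = {}  # vertex -> {neighbor: edge}

    def insert_vertex(self, v):
        self.adj.setdefault(v, {})

    def insert_edge(self, u, v, element=None):
        edge = (u, v, element)
        self.adj[u][v] = edge  # v is the key inside I(u)
        self.adj[v][u] = edge  # u is the key inside I(v)

    def get_edge(self, u, v):
        """Expected O(1): a single hash lookup in I(u)."""
        return self.adj[u].get(v)

    def degree(self, v):
        """O(1)."""
        return len(self.adj[v])

    def incident_edges(self, v):
        """O(deg(v))."""
        return self.adj[v].values()


g = AdjacencyMapGraph()
for code in ("ORD", "JFK", "SFO"):
    g.insert_vertex(code)
g.insert_edge("ORD", "JFK", 740)
g.insert_edge("ORD", "SFO", 1846)
print(g.get_edge("JFK", "ORD"))  # ('ORD', 'JFK', 740)
print(g.get_edge("JFK", "SFO"))  # None
```

The numbers attached to the edges are placeholder elements, not real distances.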

### Adjacency Matrix Structure 🧡 `$O(n^2)$ space`

$O(1)$ worst case access FOR EDGES. Cool but lots of memory 😯 (Vertices as integers - edges being pairs of integers. Problem is adding and removing vertices. (resize matrix)) 
 
The adjacency matrix structure for a graph G augments the edge list structure with a matrix A (that is, a two-dimensional array, as in Chapter 5.6), which allows us to locate an edge between a given pair of vertices in worst-case constant time. In the adjacency matrix representation, we think of the vertices as being the integers in the set $\{0, 1, \ldots, n - 1\}$ and the edges as being pairs of such integers. This allows us to store references to edges in the cells of a two-dimensional $n \times n$ array A.

The most significant advantage of an adjacency matrix is that any edge (u, v) can be accessed in worst-case $O(1)$ time; recall that the adjacency map supports that operation in $O(1)$ expected time. However, several operations are less efficient with an adjacency matrix. For example, to find the edges incident to vertex v, we must presumably examine all n entries in the row associated with v; recall that an adjacency list or map can locate those edges in optimal $O(deg(v))$ time. Adding or removing vertices from a graph is problematic, as the matrix must be resized.

Furthermore, the $O(n^2)$ space usage of an adjacency matrix is typically far worse than the $O(n + m)$ space required of the other representations. Although, in the worst case, the number of edges in a dense graph will be proportional to $n^2$, most real-world graphs are sparse.
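The trade-off can be sketched in a few lines, assuming an undirected graph with a fixed number of vertices labeled `0` to `n - 1`. The class name `AdjacencyMatrixGraph` is an assumption, and each cell stores an arbitrary element instead of a full edge object to keep the sketch short.

```python
class AdjacencyMatrixGraph:
    """Adjacency matrix sketch: cell [u][v] holds the edge element or None."""

    def __init__(self, n):
        self.n = n
        # n x n cells are allocated up front, even if the graph has few edges.
        self.matrix = [[None] * n for _ in range(n)]

    def insert_edge(self, u, v, element=True):
        self.matrix[u][v] = element
        self.matrix[v][u] = element  # undirected: keep the matrix symmetric

    def get_edge(self, u, v):
        """Worst-case O(1): a direct index into the matrix."""
        return self.matrix[u][v]

    def incident_edges(self, v):
        """O(n): the whole row of v is scanned, however few edges v has."""
        return [(v, w) for w in range(self.n) if self.matrix[v][w] is not None]


g = AdjacencyMatrixGraph(4)
g.insert_edge(0, 1)
g.insert_edge(0, 3)
print(g.get_edge(3, 0))      # True
print(g.get_edge(1, 2))      # None
print(g.incident_edges(0))   # [(0, 1), (0, 3)]
```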

### Here is a real-world example as a teaser.

We will go deeper on this example later.

In [1]:
"""
Given an m x n 2D binary grid grid which represents a map 
of '1's (land) and '0's (water), return the number of islands.

An island is surrounded by water and is formed by connecting 
adjacent lands horizontally or vertically. 

You may assume all four edges of the grid are all 
surrounded by water.

Example 1:

    Input: grid = [
      ["1","1","1","1","0"],
      ["1","1","0","1","0"],
      ["1","1","0","0","0"],
      ["0","0","0","0","0"]
    ]
    
    Output: 1

Example 2:

    Input: grid = [
      ["1","1","0","0","0"],
      ["1","1","0","0","0"],
      ["0","0","1","0","0"],
      ["0","0","0","1","1"]
    ]
    Output: 3
    

Constraints:

    m == grid.length
    n == grid[i].length
    1 <= m, n <= 300
    grid[i][j] is '0' or '1'.

Takeaway:

    To solve this question, think like a kindergartner.

    How can you find a single island?

    You look.

    For each 1, you need to check neighbours, this is clearly bfs or dfs.

    Setting the cell to 0 in order to not visit it again is pretty cool.
"""

class Solution:
    
    def numIslands(self, grid: list[list[str]]) -> int:
        m = len(grid)
        n = len(grid[0])
        result = 0

        def dfs(i, j):
            if (i < 0 or 
                j < 0 or 
                i >= m or 
                j >= n or 
                grid[i][j] == '0'):
                return
            # make the current tile 0
            # so you will not count this tile again
            grid[i][j] = '0'

            # go to every direction possible
            dfs(i - 1, j)
            dfs(i + 1, j)
            dfs(i, j - 1)
            dfs(i, j + 1)
            
        for i in range(m):
            for j in range(n):
                if grid[i][j] == '1':
                    result += 1
                    dfs(i, j)
        
        return result

# Graph Traversals 💗

A `traversal` is a systematic procedure for exploring a graph by examining all of its vertices and edges. 

A traversal is efficient if it visits all the vertices and edges in time proportional to their number, that is, in linear time.

Graph traversal algorithms are key to answering many fundamental questions about graphs involving the `notion of reachability`, that is, in determining how to travel from one vertex to another while following paths of a graph.

### Problems to deal with reachability for **undirected graphs**:

• Computing a path from vertex u to vertex v, or reporting that no such path exists.

• Given a start vertex s of G, computing, for every vertex v of G, a path with the minimum number of edges between s and v, or reporting that no such path exists.

• Testing whether G is connected.

• Computing a spanning tree of G, if G is connected.

• Computing the connected components of G.

• Computing a cycle in G, or reporting that G has no cycles.

### Problems to deal with reachability for **directed graphs**:

• Computing a directed path from vertex u to vertex v, or reporting that no such path exists.

• Finding all the vertices of $\vec{G}$ that are reachable from a given vertex s.

• Determining whether $\vec{G}$ is acyclic.

• Determining whether $\vec{G}$ is strongly connected.

## DFS

Wisdom: DFS is like sending a single agent with a rope into a labyrinth. We go as deep as possible and backtrack from dead ends.

### Properties of DFS

Proposition 14.12: Let G be an undirected graph on which a DFS traversal starting at a vertex s has been performed. Then the traversal visits all vertices in the connected component of s, and the discovery edges form a spanning tree of the connected component of s.

Depth-first search visits each vertex in the connected component of s.

For a directed graph $\vec{G}$, DFS visits all vertices reachable from s.

### Running time of DFS

Every edge is examined at most twice.

In terms of its running time, depth-first search is an efficient method for traversing a graph. 


### When is it good to use DFS?

Proposition 14.14: Let G be an undirected graph with n vertices and m edges. A DFS traversal of G can be performed in $O(n + m)$ time, and can be used to solve the following problems in $O(n + m)$ time:

• Computing a path between two given vertices of G, if one exists.

• Testing whether G is connected.

• Computing a spanning tree of G, if G is connected.

• Computing the connected components of G.

• Computing a cycle in G, or reporting that G has no cycles.
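A few of the problems listed above can be sketched with one recursive DFS. This is a minimal sketch under stated assumptions: the graph is a plain dict mapping each vertex to a list of neighbors, and the function names (`dfs`, `construct_path`, `count_components`) are chosen for illustration.

```python
def dfs(graph, u, discovered):
    """Visit everything reachable from u.

    discovered maps each visited vertex to the vertex it was discovered from
    (None for the start vertex), so it also records the discovery edges.
    """
    for v in graph[u]:
        if v not in discovered:
            discovered[v] = u
            dfs(graph, v, discovered)

def construct_path(v, discovered):
    """Walk the discovery edges backward from v to the DFS start vertex."""
    path = []
    if v in discovered:
        walk = v
        while walk is not None:
            path.append(walk)
            walk = discovered[walk]
        path.reverse()
    return path  # empty list means v was never reached

def count_components(graph):
    """Start a fresh DFS from every vertex that no earlier DFS reached."""
    forest = {}
    count = 0
    for u in graph:
        if u not in forest:
            forest[u] = None
            dfs(graph, u, forest)
            count += 1
    return count

# Hypothetical undirected graph with two components.
graph = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A"],
    "D": ["B"],
    "E": ["F"],
    "F": ["E"],
}
discovered = {"A": None}
dfs(graph, "A", discovered)
print(construct_path("D", discovered))  # ['A', 'B', 'D']
print(construct_path("E", discovered))  # []
print(count_components(graph))          # 2
```

Because the recursion depth grows with the length of the explored path, a very large graph may need an explicit stack instead of recursion in Python.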


## BFS

In this section, we consider another algorithm for traversing a connected component of a graph, known as a breadth-first search (BFS). 

The BFS algorithm is more akin to sending out, in all directions, many explorers who collectively traverse a graph in coordinated fashion.

- A path in a breadth-first search tree rooted at vertex s to any other vertex v is guaranteed to be the shortest such path from s to v in terms of the number of edges.

Proposition 14.16: Let G be an undirected or directed graph on which a BFS traversal starting at vertex s has been performed. Then

• The traversal visits all vertices of G that are reachable from s.

• For each vertex v at level i, the path of the BFS tree T between s and v has i edges, and any other path of G from s to v has at least i edges.

• If $(u, v)$ is an edge that is not in the BFS tree, then the level number of v can be at most 1 greater than the level number of u.
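The level property can be sketched with a queue-based BFS. The function name `bfs_levels` and the dict-of-lists graph format are assumptions for this illustration.

```python
from collections import deque

def bfs_levels(graph, s):
    """Return {vertex: level}, where level is the number of edges on a
    shortest path from s. Unreachable vertices are absent from the result."""
    level = {s: 0}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in level:           # first time v is seen
                level[v] = level[u] + 1  # one edge farther than u
                queue.append(v)
    return level

# Hypothetical undirected graph; E and F are not reachable from A.
graph = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C"],
    "E": ["F"],
    "F": ["E"],
}
print(bfs_levels(graph, "A"))  # {'A': 0, 'B': 1, 'C': 1, 'D': 2}
```

In this example the edge between C and D is not a BFS tree edge, and the levels of its endpoints differ by one, which is consistent with the last bullet.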

# For the Curious

## Shortest Paths

Some edges are better than others.

We might want to use a graph to represent the roads between cities, and we might be interested in finding the fastest way to travel cross-country. In this case, it is probably not appropriate for all the edges to be equal to each other, for some inter-city distances will likely be much larger than others.

## Weighted Graphs

A weighted graph is a graph that has a numeric (for example, integer) label $w(e)$ associated with each edge $e$, called the weight of edge $e$. For $e = (u, v)$, we use the notation $w(u, v) = w(e)$.

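If there is a negative weight (ORD to JFK with weight -70: someone paying us to go to JFK) and we can keep cycling through it, that would be an infinite money glitch 😅 With a negative-weight cycle, shortest paths are no longer well defined.

The special case of computing a shortest path when all weights are equal to one was already solved with the BFS traversal algorithm.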

## Dijkstra's Algorithm 🥳

The main idea in applying the greedy method pattern to the single-source shortest-path problem is to perform a “weighted” breadth-first search starting at the source vertex s. 

In particular, we can use the greedy method to develop an algorithm that iteratively grows a “cloud” of vertices out of s, with the vertices entering the cloud in order of their distances from s. Thus, in each iteration, the next vertex chosen is the vertex outside the cloud that is closest to s. 

The algorithm terminates when no more vertices are outside the cloud (or when those outside the cloud are not connected to those within the cloud), at which point we have a shortest path from s to every vertex of G that is reachable from s.

This approach is a simple, but nevertheless powerful, example of the greedy method design pattern. Applying the greedy method to the single-source shortest-path problem results in an algorithm known as **Dijkstra’s algorithm**.
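The cloud-growing idea can be sketched with a binary heap from the standard library. This is a minimal sketch under two assumptions: all edge weights are nonnegative, and the graph is a dict of dicts (`graph[u][v]` is the weight of edge `(u, v)`). The function name `dijkstra` and the lazy handling of stale heap entries are choices made for this illustration.

```python
import heapq

def dijkstra(graph, s):
    """Return {vertex: shortest distance from s} for every reachable vertex."""
    dist = {s: 0}
    cloud = set()        # vertices whose distance is final
    heap = [(0, s)]      # (tentative distance, vertex)
    while heap:
        d, u = heapq.heappop(heap)
        if u in cloud:
            continue     # stale entry: u already entered the cloud
        cloud.add(u)     # u is the closest vertex outside the cloud
        for v, w in graph[u].items():
            new_dist = d + w
            if v not in dist or new_dist < dist[v]:
                dist[v] = new_dist  # found a shorter route to v through u
                heapq.heappush(heap, (new_dist, v))
    return dist

# Hypothetical directed graph with nonnegative weights.
graph = {
    "s": {"a": 4, "b": 1},
    "a": {"c": 1},
    "b": {"a": 2, "c": 5},
    "c": {},
    "z": {},  # not reachable from s
}
print(dijkstra(graph, "s"))  # {'s': 0, 'a': 3, 'b': 1, 'c': 4}
```

Vertices that cannot be reached from the source, like `z` here, simply never appear in the result.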

## Minimum Spanning Trees

Find the tree T that contains all vertices of G and has the minimum total weight.

## Prim-Jarník Algorithm

Make a single tree like Dijkstra.

In the Prim-Jarník algorithm, we grow a minimum spanning tree from a single cluster starting from some “root” vertex s. The main idea is similar to that of Dijkstra’s algorithm.
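A minimal sketch of growing one tree from a root, assuming a connected undirected graph stored as a dict of dicts with symmetric weights. The function name `prim_jarnik` and the heap of candidate edges are assumptions for this illustration.

```python
import heapq

def prim_jarnik(graph, s):
    """Return the edges (u, v, weight) of a minimum spanning tree grown from s."""
    in_tree = {s}
    tree_edges = []
    # Candidate edges leaving the tree, ordered by weight.
    heap = [(w, s, v) for v, w in graph[s].items()]
    heapq.heapify(heap)
    while heap and len(in_tree) < len(graph):
        w, u, v = heapq.heappop(heap)
        if v in in_tree:
            continue  # both endpoints already in the tree: skip this edge
        in_tree.add(v)
        tree_edges.append((u, v, w))
        for x, wx in graph[v].items():
            if x not in in_tree:
                heapq.heappush(heap, (wx, v, x))
    return tree_edges

# Hypothetical weighted undirected graph.
graph = {
    "A": {"B": 1, "C": 4},
    "B": {"A": 1, "C": 2},
    "C": {"A": 4, "B": 2, "D": 3},
    "D": {"C": 3},
}
tree = prim_jarnik(graph, "A")
print(tree)                        # [('A', 'B', 1), ('B', 'C', 2), ('C', 'D', 3)]
print(sum(w for _, _, w in tree))  # 6
```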

## Kruskal's Algorithm

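Start with a forest in which every vertex is its own tree. Go through the edges in order of increasing weight: discard an edge if both endpoints are already in the same tree, otherwise use it to merge two trees. Stop once a single tree spans all the vertices (n - 1 edges).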
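The merge-or-discard loop can be sketched with a small union-find structure. This is a minimal sketch under stated assumptions: edges are given as `(weight, u, v)` tuples, and the function name `kruskal` plus the simple path-halving `find` helper are choices made for this illustration.

```python
def kruskal(vertices, edges):
    """Return the edges (u, v, weight) of a minimum spanning forest."""
    parent = {v: v for v in vertices}  # every vertex starts as its own tree

    def find(v):
        """Return the representative of the tree containing v."""
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving keeps trees shallow
            v = parent[v]
        return v

    tree = []
    for w, u, v in sorted(edges):      # increasing weight
        root_u, root_v = find(u), find(v)
        if root_u == root_v:
            continue                   # same tree already: discard the edge
        parent[root_u] = root_v        # merge the two trees
        tree.append((u, v, w))
        if len(tree) == len(parent) - 1:
            break                      # one tree spans every vertex
    return tree

# Hypothetical weighted undirected graph.
vertices = ["A", "B", "C", "D"]
edges = [(1, "A", "B"), (4, "A", "C"), (2, "B", "C"), (3, "C", "D")]
tree = kruskal(vertices, edges)
print(tree)                        # [('A', 'B', 1), ('B', 'C', 2), ('C', 'D', 3)]
print(sum(w for _, _, w in tree))  # 6
```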


# Examples are here! 🐦

In [1]:
"""
Given an m x n 2D binary grid grid which represents a map 
of '1's (land) and '0's (water), return the number of islands.

An island is surrounded by water and is formed by connecting 
adjacent lands horizontally or vertically. 

You may assume all four edges of the grid are all 
surrounded by water.

Example 1:

    Input: grid = [
      ["1","1","1","1","0"],
      ["1","1","0","1","0"],
      ["1","1","0","0","0"],
      ["0","0","0","0","0"]
    ]
    
    Output: 1

Example 2:

    Input: grid = [
      ["1","1","0","0","0"],
      ["1","1","0","0","0"],
      ["0","0","1","0","0"],
      ["0","0","0","1","1"]
    ]
    Output: 3
    

Constraints:

    m == grid.length
    n == grid[i].length
    1 <= m, n <= 300
    grid[i][j] is '0' or '1'.

Takeaway:

    To solve this question, think like a kindergartner.

    How can you find a single island?

    You look.

    For each 1, you need to check neighbours, this is clearly bfs or dfs.

    Setting the cell to 0 in order to not visit it again is pretty cool.
"""

from collections import deque

class Solution:
    def numIslands(self, grid: "list[list[str]]") -> int:
        # it is just bfs man. We got this.
        if not grid:
            return 0
        
        rows , cols = len(grid), len(grid[0])

        visited = set()
        islands = 0

        def bfs(r ,c):
            q = deque()
            visited.add((r,c))
            # add the coordinates to the queue
            q.append((r,c))

            while q:
                row, col = q.popleft()
                directions = [[1, 0], [-1, 0], [0, 1], [0, -1]]

                for deltar, deltac in directions:
                    r , c = row + deltar , col + deltac

                    if (r in range(rows) and
                        c in range(cols) and
                        grid[r][c] == "1" and
                        (r, c) not in visited):
                        # add the coordinate to queue
                        q.append((r, c)) 
                        visited.add((r,c))

        for r in range(rows):
            for c in range(cols):
                if grid[r][c] == "1" and (r, c) not in visited:
                    # we found a new Island!
                    # lets see how big it is?
                    bfs(r,c)
                    islands += 1

        return islands
    
    def numIslands_(self, grid: list[list[str]]) -> int:
        # dfs solution

        row = len(grid)
        column = len(grid[0])
        result = 0

        def dfs(i, j):

            # return if not proper island
            if (i < 0 or 
                j < 0 or 
                i >= row or 
                j >= column or 
                grid[i][j] == '0'):
                    return

            #  Otherwise, it marks the current cell 
            # as visited by setting it to '0' and recursively 
            # calls dfs on its neighbors 
            # (up, down, left, and right).
            grid[i][j] = '0'

            # 4 direction dfs. cool
            dfs(i - 1, j)
            dfs(i + 1, j)
            dfs(i, j - 1)
            dfs(i, j + 1)
            
        for i in range(row):
            for j in range(column):
                if grid[i][j] == '1':
                    result += 1
                    dfs(i, j)
        
        return result

In [2]:
"""
Given a reference of a node in a connected undirected graph.

Return a deep copy (clone) of the graph.

Each node in the graph contains a value (int) 
and a list (List[Node]) of its neighbors.

    class Node {
        public int val;
        public List<Node> neighbors;
    }

Test case format:

    For simplicity, each node's value is the same as the 
    node's index (1-indexed). For example, the first node 
    with val == 1, the second node with val == 2, and so on. 
    The graph is represented in the test case using an adjacency list.

    An adjacency list is a collection of unordered lists used to represent 
    a finite graph. Each list describes the set of neighbors of a node in the graph.

    The given node will always be the first node with val = 1. You must 
    return the copy of the given node as a reference to the cloned graph.

Example 1:

    Input: adjList = [[2,4],[1,3],[2,4],[1,3]]
    
    Output: [[2,4],[1,3],[2,4],[1,3]]
    
    Explanation: There are 4 nodes in the graph.
        
        1st node (val = 1)'s neighbors are 2nd node (val = 2) and 4th node (val = 4).
        2nd node (val = 2)'s neighbors are 1st node (val = 1) and 3rd node (val = 3).
        3rd node (val = 3)'s neighbors are 2nd node (val = 2) and 4th node (val = 4).
        4th node (val = 4)'s neighbors are 1st node (val = 1) and 3rd node (val = 3).

Example 2:

    Input: adjList = [[]]
    
    Output: [[]]
    
    Explanation: Note that the input contains one empty list. The 
    graph consists of only one node with val = 1 and it does not 
    have any neighbors.

Example 3:

    Input: adjList = []
    Output: []
    Explanation: This an empty graph, it does not have any nodes.

Constraints:

    The number of nodes in the graph is in the range [0, 100].
    
    1 <= Node.val <= 100
    
    Node.val is unique for each node.
    
    There are no repeated edges and no self-loops in the graph.
    
    The Graph is connected and all nodes can be visited starting from the given node.

Takeaway:

    Use a mapping to clone every old node to new ones.

    As you need to clone every vertex, you need a dfs 
    or bfs approach
"""

# Definition for a Node.
class Node:
    def __init__(self, val = 0, neighbors = None):
        self.val = val
        self.neighbors = neighbors if neighbors is not None else []

from typing import Optional

class Solution:
    def cloneGraph_(self, node: Optional['Node']) -> Optional['Node']:
        # this does not work.
        # there is no code herE LOL
        
        # result is also going to be an adjacency list
        # result = []
        # TypeError Node object is not iterable!
        
        # for index, elem in enumerate(node):
        #     all_neighbours = elem.neighbors
        #     result.append(Node(val = index, neighbors = list(all_neighbours)))
        
        # i = 1
        # while node:
        #     all_neighbours = node.neighbors
        #     result.append(Node(val = i , neighbors = list(all_neighbours)))
        
        # return result        
        pass


    def cloneGraph(self, node: Optional['Node']) -> Optional['Node']:
        # we can see that we need to make some sort of traverse operation
        # for all nodes we can visit. - DFS is cool
        
        # this map holds the old nodes mapped to new nodes
        old_to_new = {}
        
        def dfs_cloner(node):
            if node in old_to_new:
                # we already made a clone of node
                return old_to_new[node]
            
            # no clone is made yet, lets make it
            
            # we have to keep making new nodes, 
            # this is the recursive case
            temp = Node(node.val)
            # the mapped new node is temp
            old_to_new[node] = temp
            # for all neighbors of node
            for nei in node.neighbors:
                # append all expected neighbors to new node with calls to dfs
                temp.neighbors.append(dfs_cloner(nei))
            
            return temp
        
        return dfs_cloner(node) if node else None

In [3]:
"""
You are given an m x n binary matrix grid. 

An island is a group of 1's (representing land) connected 4-directionally 
(horizontal or vertical.) 

You may assume all four edges of the grid are surrounded by water.

The area of an island is the number of cells with a value 1 
in the island.

Return the maximum area of an island in grid. If there is no 
island, return 0.

Example 1:

    Input: grid = [[0,0,1,0,0,0,0,1,0,0,0,0,0],
                    [0,0,0,0,0,0,0,1,1,1,0,0,0],
                    [0,1,1,0,1,0,0,0,0,0,0,0,0],
                    [0,1,0,0,1,1,0,0,1,0,1,0,0],
                    [0,1,0,0,1,1,0,0,1,1,1,0,0],
                    [0,0,0,0,0,0,0,0,0,0,1,0,0],
                    [0,0,0,0,0,0,0,1,1,1,0,0,0],
                    [0,0,0,0,0,0,0,1,1,0,0,0,0]]
    Output: 6

    Explanation: 
    
        The answer is not 11, because the island must be connected 4-directionally.

Example 2:

    Input: grid = [[0,0,0,0,0,0,0,0]]
    
    Output: 0
 

Constraints:

    m == grid.length
    
    n == grid[i].length
    
    1 <= m, n <= 50
    
    grid[i][j] is either 0 or 1.

Takeaway:

    DFS is wonderful. 

    There is a condition on calling dfs

    AND

    We do not have to call dfs on every tile.
"""

class Solution:
    def maxAreaOfIsland_(self, grid: list[list[int]]) -> int:
        # out of bounds is water
        # works
        
        rows = len(grid)
        cols = len(grid[0])
        
        def dfs_area(i, j):
            if (i < 0 or
                j < 0 or
                i >= rows or
                j >= cols or 
                grid[i][j] == 0):
                return 0
            
            grid[i][j] = 0

            area = 1 
            area += dfs_area(i + 1, j)
            area += dfs_area(i, j + 1)
            area += dfs_area(i -1, j)
            area += dfs_area(i, j - 1)
            
            return area
            
        max_area = 0
        for i in range(rows):
            for j in range(cols):
                if grid[i][j] == 1:
                    max_area = max(max_area, dfs_area(i, j))
        
        return max_area
    
    
    def maxAreaOfIsland(self, grid: list[list[int]]) -> int:
        # uses a visited set for the dfs
        # this way, we will not run dfs on the same island twice or more
        
        # if we bump into a 1 value, we will run dfs
        rows, cols = len(grid), len(grid[0])
        visit = set()
        
        def dfs(r, c):
            # base case first
            if (r < 0 or r == rows or c < 0 or c == cols or 
                grid[r][c] == 0 or (r, c) in visit):
                return 0
            
            visit.add((r,c))
            
            return (1 + dfs(r + 1, c) +
                        dfs(r - 1, c) +
                        dfs(r, c + 1) +
                        dfs(r, c - 1))
        area = 0
        for r in range(rows):
            for c in range(cols):
                area = max(area, dfs(r, c))
        
        return area

In [4]:
"""
There is an m x n rectangular island that borders both 
the Pacific Ocean and Atlantic Ocean. 

The Pacific Ocean touches the island's left and top edges, and the Atlantic Ocean 
touches the island's right and bottom edges.

The island is partitioned into a grid of square cells. 

You are given an m x n integer matrix heights where heights[r][c] 
represents the height above sea level of the cell at coordinate (r, c).

The island receives a lot of rain, and the rain water 
can flow to neighboring cells directly north, south, east, 
and west if the neighboring cell's height is less than or equal 
to the current cell's height. 

Water can flow from any cell adjacent to an ocean into the ocean.

Return a 2D list of grid coordinates result where 
result[i] = [ri, ci] denotes that rain water can flow from 
cell (ri, ci) to both the Pacific and Atlantic oceans.

Example 1:

    Input: heights = [[1,2,2,3,5],
                      [3,2,3,4,4],
                      [2,4,5,3,1],
                      [6,7,1,4,5],
                      [5,1,1,2,4]]

    Output: [[0,4],[1,3],[1,4],[2,2],[3,0],[3,1],[4,0]]

    Explanation: 
    
        The following cells can flow to the Pacific and Atlantic 
            oceans, as shown below:

        [0,4]: [0,4] -> Pacific Ocean 
               [0,4] -> Atlantic Ocean
        [1,3]: [1,3] -> [0,3] -> Pacific Ocean 
               [1,3] -> [1,4] -> Atlantic Ocean
        [1,4]: [1,4] -> [1,3] -> [0,3] -> Pacific Ocean 
               [1,4] -> Atlantic Ocean
        [2,2]: [2,2] -> [1,2] -> [0,2] -> Pacific Ocean 
               [2,2] -> [2,3] -> [2,4] -> Atlantic Ocean
        [3,0]: [3,0] -> Pacific Ocean 
               [3,0] -> [4,0] -> Atlantic Ocean
        [3,1]: [3,1] -> [3,0] -> Pacific Ocean 
               [3,1] -> [4,1] -> Atlantic Ocean
        [4,0]: [4,0] -> Pacific Ocean 
               [4,0] -> Atlantic Ocean

        Note that there are other possible paths for these cells 
        to flow to the Pacific and Atlantic oceans.

Example 2:

    Input: heights = [[1]]

    Output: [[0,0]]

    Explanation: 
    
        The water can flow from the only cell to the 
            Pacific and Atlantic oceans.

Constraints:

    m == heights.length
    n == heights[r].length
    1 <= m, n <= 200
    0 <= heights[r][c] <= 10^5
    
Takeaway:

    Oh what a surprise, we will use DFS!

    Compartmentalize the code.
"""

class Solution:
    def pacificAtlantic_(self, heights: list[list[int]]) -> list[list[int]]:
        # could not make this work
        # pacific left - top
        # atlantic - bottom - right
        # return cells where there can be bidirectional flow
        
        rows , cols = len(heights), len(heights[0])
        
        # for every cell, check every neighbour
        # return True if dual flow
        
        def dfs(i, j):
            # this method has to return the [i][j] 
            # if conditions hold
            # otherwise returns None
            if (i < 0 or
               j < 0 or
               i >= rows or
               j >= cols):
                return
            return
        
        result = []
        for i in range(rows):
            for j in range(cols):
                result.append(dfs(i, j))
        
        return result
    
    def pacificAtlantic(self, heights: list[list[int]]) -> list[list[int]]:
        # instead of running a search from every cell in the grid, which would cost O((m*n)^2),
        # check separately which cells the pacific and the atlantic can reach
        # the intersection of those two sets is the result
        
        rows, cols = len(heights), len(heights[0])
        pacific, atlantic = set(), set()
        
        def dfs(r, c, visit, previous_height):
            # already visited
            # out of bounds
            # or height is smaller than previous height
            if ((r, c) in visit or 
               r < 0 or c < 0 or r == rows or c == cols or 
               heights[r][c] < previous_height):
                return 
            
            # add the tile into set
            visit.add((r,c))
            
            # call dfs on neighbours
            dfs(r + 1, c, visit, heights[r][c])
            dfs(r - 1, c, visit, heights[r][c])
            dfs(r, c + 1, visit, heights[r][c])
            dfs(r, c - 1, visit, heights[r][c])
            
        # go for every single position in the first and last rows
        for c in range(cols):
            # first row, we are checking for pacific
            dfs(0, c, pacific, heights[0][c])
            # bottom row, we are checking for atlantic
            dfs(rows - 1, c, atlantic, heights[rows - 1][c])
            
        # first column and last column
        for r in range(rows):
            # first col, we are checking for pacific
            dfs(r, 0, pacific, heights[r][0])
            # last col, we are checking for atlantic
            dfs(r, cols - 1, atlantic, heights[r][cols - 1])
        
        result = []
        for r in range(rows):
            for c in range(cols):
                if (r, c) in pacific and (r, c) in atlantic:
                    result.append([r, c])
                    
        return result

    def pacificAtlantic__(self, heights: list[list[int]]) -> list[list[int]]:
        # another approach
        if not heights:
            return []

        rows, cols = len(heights), len(heights[0])
        
        # Define directions for moving to neighboring cells (up, down, left, right)
        directions = [(-1, 0), (1, 0), (0, -1), (0, 1)]

        # Helper function to perform DFS
        def dfs(i, j, visited):
            visited[i][j] = True
            
            for direction in directions:
                ni, nj = i + direction[0], j + direction[1]
                
                # Check if the neighbor is within bounds and the height is greater than or equal
                # Also, check if the neighbor cell has not been visited
                if 0 <= ni < rows and 0 <= nj < cols and heights[ni][nj] >= heights[i][j] and not visited[ni][nj]:
                    dfs(ni, nj, visited)

        # Create matrices to track cells that can reach Pacific and Atlantic Oceans
        pacific_reachable = [[False] * cols for _ in range(rows)]
        atlantic_reachable = [[False] * cols for _ in range(rows)]

        # Check cells in the first and last columns (Pacific and Atlantic Oceans)
        for i in range(rows):
            dfs(i, 0, pacific_reachable)  # Pacific Ocean
            dfs(i, cols - 1, atlantic_reachable)  # Atlantic Ocean

        # Check cells in the first and last rows (Pacific and Atlantic Oceans)
        for j in range(cols):
            dfs(0, j, pacific_reachable)  # Pacific Ocean
            dfs(rows - 1, j, atlantic_reachable)  # Atlantic Ocean

        # Find cells that can reach both Pacific and Atlantic Oceans
        result = []
        for i in range(rows):
            for j in range(cols):
                if pacific_reachable[i][j] and atlantic_reachable[i][j]:
                    result.append([i, j])

        return result
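
Both versions above recurse once per cell, so on a large grid the recursive DFS can hit Python's default recursion limit. A minimal sketch of the same "start from the ocean borders and walk uphill" idea with an explicit queue is shown below. The names `pacific_atlantic` and `flood` are made up for this sketch, and the result is returned in row-major order, which is an assumption about the desired ordering.

```python
from collections import deque


def pacific_atlantic(heights):
    """Return the cells from which water can reach both oceans."""
    if not heights or not heights[0]:
        return []
    rows, cols = len(heights), len(heights[0])

    def flood(starts):
        # BFS "uphill" from the border cells of one ocean:
        # a neighbour is reachable if it is at least as high as the current cell
        seen = set(starts)
        queue = deque(seen)
        while queue:
            r, c = queue.popleft()
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nr, nc = r + dr, c + dc
                if (0 <= nr < rows and 0 <= nc < cols
                        and (nr, nc) not in seen
                        and heights[nr][nc] >= heights[r][c]):
                    seen.add((nr, nc))
                    queue.append((nr, nc))
        return seen

    # pacific touches the first row and first column
    pacific = flood([(0, c) for c in range(cols)] + [(r, 0) for r in range(rows)])
    # atlantic touches the last row and last column
    atlantic = flood([(rows - 1, c) for c in range(cols)]
                     + [(r, cols - 1) for r in range(rows)])

    # cells in both sets can drain into both oceans
    return [list(cell) for cell in sorted(pacific & atlantic)]
```

The intersection step is the same as in the set-based version above; only the traversal changed from recursion to a queue.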

In [5]:
"""
Given an m x n matrix board containing 'X' and 'O', capture all 
regions that are 4-directionally surrounded by 'X'.

A region is captured by flipping all 'O's into 'X's in that surrounded region.

Example 1:

    Input: board = [["X","X","X","X"],
                    ["X","O","O","X"],
                    ["X","X","O","X"],
                    ["X","O","X","X"]]

    Output: [["X","X","X","X"],
             ["X","X","X","X"],
             ["X","X","X","X"],
             ["X","O","X","X"]]

    Explanation: 
        Notice that an 'O' should not be flipped if:
            - It is on the border, or
            - It is adjacent to an 'O' that should not be flipped.
        The bottom 'O' is on the border, so it is not flipped.
        The other three 'O' form a surrounded region, so they are flipped.

Example 2:

    Input: board = [["X"]]
    
    Output: [["X"]]

Constraints:

    m == board.length
    n == board[i].length
    1 <= m, n <= 200
    board[i][j] is 'X' or 'O'.

Takeaway:

    reverse thinking ?

    we will run through the border and change reachable O's into T's
    so every O that is still left afterwards is surrounded and becomes X

    1. (DFS) CAPTURE UNSURROUNDED REGIONS - MARK T'S
    2. CHANGE O'S INTO X'S
    3. MAKE T'S INTO O'S AGAIN
"""
class Solution:
    
    def solve(self, board: list[list[str]]) -> None:
        """
        Do not return anything, modify board in-place instead.
        """
        # reverse thinking ?
        # Capture surrounded regions
        # is equal to
        # capture everything except unsurrounded regions
        
        # we will run through the border and change reachable O's into T's
        # so every O that is still left afterwards is surrounded and becomes X
        
        # 1. (DFS) CAPTURE UNSURROUNDED REGIONS - MARK T'S
        # 2. CHANGE O'S INTO X'S
        # 3. MAKE T'S INTO O'S AGAIN
        
        rows, cols = len(board), len(board[0])
        
        def dfs(r, c):
            # stop if out of bounds or the cell is not an "O"
            if (r < 0 or c < 0 or 
                r == rows or c == cols or
                board[r][c] != "O"):
                return
            # change the "O" into "T"
            board[r][c] = "T"
            # go other directions
            dfs(r + 1, c)
            dfs(r - 1, c)
            dfs(r, c + 1)
            dfs(r, c - 1)
            
        # 1. (DFS) CAPTURE UNSURROUNDED REGIONS - MARK T'S
        for r in range(rows):
            for c in range(cols):
                if (board[r][c] == "O" and 
                   (r in [0, rows -1] or c in [0, cols - 1])):
                    dfs(r, c)
                    
        # 2. CHANGE O'S INTO X'S             
        for r in range(rows):
            for c in range(cols):
                if board[r][c] == "O":
                    board[r][c] = "X"
                    
        # 3. MAKE T'S INTO O'S AGAIN
        for r in range(rows):
            for c in range(cols):
                if board[r][c] == "T":
                    board[r][c] = "O"
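
With the stated bound of 200 x 200 cells, a single region can be deep enough for the recursive DFS above to exceed Python's default recursion limit. The three steps can also be sketched with an explicit stack. The function name `capture_surrounded` is made up for this sketch, and steps 2 and 3 are merged into one pass, which is safe because every cell is rewritten at most once.

```python
def capture_surrounded(board):
    """Flip every surrounded 'O' region to 'X', modifying board in place."""
    if not board or not board[0]:
        return
    rows, cols = len(board), len(board[0])

    # 1. start from every "O" on the border and mark its region as "T"
    stack = [(r, c)
             for r in range(rows)
             for c in range(cols)
             if (r in (0, rows - 1) or c in (0, cols - 1)) and board[r][c] == "O"]
    while stack:
        r, c = stack.pop()
        if board[r][c] != "O":
            continue  # already marked through another path
        board[r][c] = "T"
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and board[nr][nc] == "O":
                stack.append((nr, nc))

    # 2 + 3. remaining O's are surrounded (become X), T's go back to O
    for r in range(rows):
        for c in range(cols):
            if board[r][c] == "O":
                board[r][c] = "X"
            elif board[r][c] == "T":
                board[r][c] = "O"
```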

In [6]:
"""
You are given an m x n grid where each cell can have one of three values:

    0 representing an empty cell,
    1 representing a fresh orange, or
    2 representing a rotten orange.

Every minute, any fresh orange that is 4-directionally adjacent 
to a rotten orange becomes rotten.

Return the minimum number of minutes that must elapse until no cell 
has a fresh orange. If this is impossible, return -1.

Example 1:

    Input: grid = [[2,1,1],[1,1,0],[0,1,1]]
    
    Output: 4

Example 2:

    Input: grid = [[2,1,1],[0,1,1],[1,0,1]]
    
    Output: -1
    
    Explanation: 
    
        The orange in the bottom left corner (row 2, column 0) 
            is never rotten, because rotting only happens 4-directionally.

Example 3:

    Input: grid = [[0,2]]
    
    Output: 0
    
    Explanation: 
    
        Since there are already no fresh oranges at minute 0, the answer is just 0.
 
Constraints:

    m == grid.length
    n == grid[i].length
    1 <= m, n <= 10
    grid[i][j] is 0, 1, or 2.

Takeaway:

    DFS won't work because oranges rot at the same time:
    if there is more than one rotten orange,
    we cannot count the time correctly

    we should do a multi-source BFS
    BFS is usually implemented with a queue
    if there are still fresh oranges
    after the BFS is done,
    we cannot rot the whole grid, return -1
"""

from collections import deque

class Solution:
    def orangesRotting__(self, grid: list[list[int]]) -> int:
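        # just try
        # kinda like a fool
        # a fool is the precursor to savior
        # NOTE: first attempt, kept for reference only. It does NOT solve
        # the problem: it only counts the rotten oranges connected to (0, 0)
        
        counter = 0
        rows , cols = len(grid), len(grid[0])
        # remember visited cells so the recursion terminates
        seen = set()
        
        # if we bump into a rotten orange,
        # we will rot all 4 other directions
        
        # if all rotten we stop
        
        def dfs(r, c):
            nonlocal counter
            if (r < 0 or c < 0 or
                r == rows or c == cols or
                (r, c) in seen or
                grid[r][c] in [0, 1]):
                return
            
            # rotten orange
            seen.add((r, c))
            counter += 1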
            dfs(r + 1, c)
            dfs(r - 1, c)
            dfs(r, c + 1)
            dfs(r, c - 1)
                
        dfs(0, 0)
        return counter
                
    def orangesRotting_(self, grid: list[list[int]]) -> int:
        # this works

        # Time Complexity: O(m * n)
        # Space Complexity: O(m * n)
        
        fresh, rotten = set(), deque()

        # iterate through the grid to get all 
        # the fresh and rotten oranges
        for row in range(len(grid)):
            for col in range(len(grid[0])):
                # if we see a fresh orange, put its position in fresh
                if grid[row][col] == 1:
                    fresh.add((row, col))

                # if we see a rotten orange, put its position in rotten
                elif grid[row][col] == 2:
                    rotten.append((row, col))

        minutes = 0
        # If there are rotten oranges in the queue and 
        # there are still fresh oranges in the grid 
        # keep looping
        while fresh and rotten:

            minutes += 1

            # iterate through rotten, popping off the (row, col) that's currently in rotten
            # we don't touch the newly added (row, col) that are added during the loop until the next loop
            for rot in range(len(rotten)):
                row, col = rotten.popleft()

                for direction in ((row - 1, col), (row + 1, col), (row, col - 1), (row, col + 1)):
                    if direction in fresh:
                        # if the (row, col) is in fresh, remove it then add it to rotten
                        fresh.remove(direction)
                        # we will perform 4-directional checks on each (row, col)
                        rotten.append(direction)

        # if fresh is not empty, then there is an orange we were not able to reach 4-directionally    
        return -1 if fresh else minutes

        
    def orangesRotting(self, grid: list[list[int]]) -> int:
        # dfs won't work because oranges rot at the same time:
        # if there is more than one rotten orange,
        # we cannot count the time correctly
        
        # we should do a multi-source bfs
        # bfs is usually implemented with a queue
        # if there are still fresh oranges
        # after bfs is done,
        # we cannot rot the whole grid
        
        q = deque()
        time, fresh = 0, 0
        
        rows, cols = len(grid), len(grid[0])
        for r in range(rows):
            for c in range(cols):
                if grid[r][c] == 1:
                    fresh += 1
                if grid[r][c] == 2:
                    q.append([r, c])
                    
        # make directions list
        directions = [[0, 1], [0, -1], [1, 0], [-1, 0]]
        
        while q and fresh > 0:
            
            for i in range(len(q)):
                # this is a queue so not pop, popleft
                r, c = q.popleft()
                for dr, dc in directions:
                    row, col = dr + r, dc + c
                    # if in bounds and fresh, make rotten
                    if (row < 0 or row == len(grid) or
                       col < 0 or col == len(grid[0])
                       or grid[row][col] != 1):
                        continue
                    # else rot it
                    grid[row][col] = 2
                    # add the new rotten orange to queue
                    q.append([row, col])
                    fresh -= 1
            time += 1
            
        return time if fresh == 0 else -1
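
Both working versions above count minutes by processing the queue one level at a time. An equivalent way to see the multi-source BFS is to store the minute at which each orange rots. The sketch below does that without modifying the input grid. `oranges_rotting` and `rot_time` are names invented for this sketch.

```python
from collections import deque


def oranges_rotting(grid):
    """Minutes until no fresh orange is left, or -1 if that never happens."""
    rows, cols = len(grid), len(grid[0])
    rot_time = {}      # (row, col) -> minute at which that orange is rotten
    queue = deque()
    fresh = 0

    for r in range(rows):
        for c in range(cols):
            if grid[r][c] == 2:
                # every initially rotten orange is a BFS source at minute 0
                rot_time[(r, c)] = 0
                queue.append((r, c))
            elif grid[r][c] == 1:
                fresh += 1

    last_minute = 0
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if (0 <= nr < rows and 0 <= nc < cols
                    and grid[nr][nc] == 1 and (nr, nc) not in rot_time):
                # the neighbour rots one minute after the orange that infects it
                rot_time[(nr, nc)] = rot_time[(r, c)] + 1
                last_minute = max(last_minute, rot_time[(nr, nc)])
                fresh -= 1
                queue.append((nr, nc))

    return last_minute if fresh == 0 else -1
```

Because a BFS queue hands out cells in non-decreasing distance order, the first time a fresh orange is reached is also the earliest minute at which it can rot.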

In [7]:
"""
You are given an m x n 2D grid initialized with these three possible values.

    -1 - A wall or an obstacle.

    0 - A gate.

    INF - Infinity means an empty room. We use the 
       value 2^31 - 1 = 2147483647 to represent INF as 
       you may assume that the distance to a gate 
       is less than 2147483647.

Fill each empty room with the distance to its nearest gate. 

If it is impossible to reach a Gate, that room should remain filled with INF

Example 1:

    Input: [[2147483647, -1, 0, 2147483647],
            [2147483647, 2147483647, 2147483647, -1],
            [2147483647, -1, 2147483647, -1],
            [0, -1, 2147483647, 2147483647]]

    Output: [[3, -1, 0, 1],
             [2, 2, 1, -1],
             [1, -1, 2, -1],
             [0, -1, 3, 4]]

    Explanation:

        the 2D grid is:

        INF  -1  0  INF

        INF INF INF  -1

        INF  -1 INF  -1

          0  -1 INF INF

    the answer is:

          3  -1   0   1

          2   2   1  -1

          1  -1   2  -1

          0  -1   3   4

Example 2:

    Input: [[0, -1],
            [2147483647, 2147483647]]

    Output: [[0, -1],
             [1, 2]]

Takeaway:

    BFS is a good fit here, because with DFS the distances
    computed from different gates would get mixed up.

    A multi-source BFS comes with a deque (the queue) and a set (visited cells) :)

"""
from collections import deque

class Solution:
    def wallsAndGates(self, rooms: list[list[int]]) -> None:
        # we can try a dfs solution where we run 
        # dfs on every single cell
        # and update min distance to gate
        # can we do better than this complexity ?

        # we can use bfs
        # simultaneously start from all gates 
        # and mark rooms with distances
        
        # we will stop when all reachable rooms are marked
        # so we do not do repeated work 

        # for this, we will use a queue with 
        # all positions of gates
        rows, cols = len(rooms), len(rooms[0])
        visit = set()
        q = deque()

        def add_room(r, c):
            # if out of bounds OR visited OR wall
            if (r < 0 or r == rows or 
                c < 0 or c == cols or
                (r, c) in visit or rooms[r][c] == -1):
                return
            visit.add((r, c))
            q.append([r, c])

        # add all gates into the q and visited set
        for r in range(rows):
            for c in range(cols):
                if rooms[r][c] == 0:
                    q.append([r, c])
                    visit.add((r, c))
        
        dist = 0
        while q:
            for i in range(len(q)):
                r, c = q.popleft()
                rooms[r][c] = dist
                add_room(r + 1, c)
                add_room(r - 1, c)
                add_room(r, c + 1)
                add_room(r, c - 1)
            dist += 1

if __name__ == "__main__":
    sol = Solution()
    a = [[2147483647,-1,0,2147483647],
         [2147483647,2147483647,2147483647,-1],
         [2147483647,-1,2147483647,-1],
         [0,-1,2147483647,2147483647]]

    sol.wallsAndGates(a)
    print(a) # [[3, -1, 0, 1], [2, 2, 1, -1], [1, -1, 2, -1], [0, -1, 3, 4]]

[[2147483647, -1, 0, 2147483647], [2147483647, 2147483647, 2147483647, -1], [2147483647, -1, 2147483647, -1], [0, -1, 2147483647, 2147483647]]
[[3, -1, 0, 1], [2, 2, 1, -1], [1, -1, 2, -1], [0, -1, 3, 4]]


In [8]:
"""
There are a total of numCourses courses you have to take, labeled 
from 0 to numCourses - 1. 

You are given an array prerequisites where prerequisites[i] = [ai, bi] 
indicates that you must take course bi first if you want to take course ai.

    For example, the pair [0, 1], indicates that to take course 
        0 you have to first take course 1.

Return true if you can finish all courses. Otherwise, return false.

Example 1:

    Input: numCourses = 2, prerequisites = [[1,0]]
    
    Output: true

    Explanation: 
    
        There are a total of 2 courses to take. 
            To take course 1 you should have finished course 0. So it is possible.

Example 2:

    Input: numCourses = 2, prerequisites = [[1,0],[0,1]]

    Output: false

    Explanation: 
    
        There are a total of 2 courses to take. 
        To take course 1 you should have finished course 0, and to take 
        course 0 you should also have finished course 1. 
        So it is impossible.

Constraints:

    1 <= numCourses <= 2000
    
    0 <= prerequisites.length <= 5000
    
    prerequisites[i].length == 2
    
    0 <= ai, bi < numCourses
    
    All the pairs prerequisites[i] are unique.

Takeaway:

    We can make an adjacency list using a hashmap

    we will check the possible cycles with a set.

    Because the graph may not be connected, we have to run 
    dfs on every course =)

"""
from collections import defaultdict

class Solution:
    def canFinish_(self, numCourses: int, prerequisites: list[list[int]]) -> bool:
        # this was my first attempt
        # failed
        
        """Return True if one can 
        finish their courses"""
        # prerequisites
        # [a, b]
        # we must take b before a
        
        # we will have paths from prerequisites
        # and if there are disagreements, we cannot finish it.
        
        result = []
        # kinda like a directed arrows
        # [[1, 0], [4, 5], [5, 3], [2, 1]]
        # we can concatenate paths that are intersecting 
        for end, start in prerequisites:
            if [end, start] not in result:
                result.append([end, start])
            if start in [prerequisites[i][1] for i in range(len(prerequisites))]:
                # concatenate paths and add it to result
                # (this step was never written: a bare result.append() raises TypeError)
                pass
               

        def is_paths_crossing(seq):
            pass

        return False if is_paths_crossing(result) else True
    
    def canFinish__(self, numCourses: int, prerequisites: list[list[int]]) -> bool:
        """Return True if one can finish their courses
        This solution works but it is complex"""
        # Make a graph to represent the 
        # courses and their prerequisites
        graph = defaultdict(list)
        for end, start in prerequisites:
            graph[start].append(end)

        # Helper function to perform DFS and detect cycles
        def is_cyclic(course, visited, path):
            visited[course] = True
            path[course] = True

            # Explore neighbors
            for neighbor in graph[course]:
                if not visited[neighbor]:
                    if is_cyclic(neighbor, visited, path):
                        return True
                elif path[neighbor]:
                    return True

            # Backtrack
            path[course] = False
            return False

        # Check for cycles in each course's prerequisites
        visited = [False] * numCourses
        path = [False] * numCourses

        for course in range(numCourses):
            if not visited[course]:
                if is_cyclic(course, visited, path):
                    return False
                
        # If there are no cycles, it is possible to finish all courses
        return True            

    def canFinish(self, numCourses: int, prerequisites: list[list[int]]) -> bool:
        """Return True if one can finish their courses
        This one is the simplest approach"""
        # WORKS
        
        # every course and its prerequisites form the edges 
        # time complexity is O(n + p) (n courses, p prerequisites) because we 
        # will move from every single node using every single edge
        
        # for [[0, 1], [0, 2], [1, 3], [1, 4], [3, 4]]
        # we will make an adjacency list
        # which we can implement with a hashmap
        
        # map = { course: prerequisites}
        # 0 - [1, 2]
        # 1 - [3, 4]
        # 2 - []
        # 3 - [4]
        # 4 - []
        pre_map = {i: [] for i in range(numCourses)}
        
        for course, pre in prerequisites:
            pre_map[course].append(pre)
        
        # how can we detect loops?
        # we can use a set
        # just to check the visited courses
        # if we bump into an already visited course, that means we found a loop!
        visited = set()
        
        def dfs(crs):
            if crs in visited:
                # found a loop
                return False
            if pre_map[crs] == []:
                # no prerequisites
                # we can complete this course
                return True
            
            visited.add(crs)
            
            # otherwise, we have some work to do on prerequisites
            for elem in pre_map[crs]:
                if not dfs(elem):
                    # we do not have to wait if 
                    # we find 1 False
                    # just return False
                    return False
                
            # we finished visiting this course
            visited.remove(crs)
            
            # set it to an empty list so that we 
            # do not have to do all the work again
            pre_map[crs] = []
            return True
        
        # we have to manually run dfs on every single course
        # because the graph can be not connected
        for course in range(numCourses):
            if not dfs(course):
                return False
        
        return True
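
The DFS solutions above detect a cycle by tracking the current path. A different technique for the same question is Kahn's algorithm, which repeatedly takes any course whose in-degree (number of unmet prerequisites) is zero. The sketch below is that alternative, not the method used above. `can_finish_kahn` and `unlocks` are names invented here.

```python
from collections import deque


def can_finish_kahn(num_courses, prerequisites):
    """Return True if every course can be taken (no prerequisite cycle)."""
    indegree = [0] * num_courses                 # unmet prerequisites per course
    unlocks = [[] for _ in range(num_courses)]   # pre -> courses that depend on it

    for course, pre in prerequisites:
        unlocks[pre].append(course)
        indegree[course] += 1

    # start with every course that has no prerequisites at all
    queue = deque(c for c in range(num_courses) if indegree[c] == 0)
    taken = 0

    while queue:
        current = queue.popleft()
        taken += 1
        for nxt in unlocks[current]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                # all prerequisites of nxt are done now
                queue.append(nxt)

    # courses stuck in a cycle never reach in-degree 0
    return taken == num_courses
```

Courses that sit on a cycle keep a positive in-degree forever, so the count of taken courses falls short of `num_courses` exactly when a cycle exists.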

In [9]:
"""
There are a total of numCourses courses you have to take, 
labeled from 0 to numCourses - 1. 

You are given an array prerequisites where prerequisites[i] = [ai, bi] 
indicates that you must take course bi first if you want to take course ai.

    For example, the pair [0, 1], indicates that to take course 
        0 you have to first take course 1.

Return the ordering of courses you should take to finish all courses. 

If there are many valid answers, return any of them. 

If it is impossible to finish all courses, return an empty array.

Example 1:

    Input: numCourses = 2, prerequisites = [[1,0]]

    Output: [0,1]

    Explanation: 
        There are a total of 2 courses to take. To take 
        course 1 you should have finished course 0. So the 
        correct course order is [0,1].

Example 2:

    Input: numCourses = 4, prerequisites = [[1,0],[2,0],[3,1],[3,2]]
    
    Output: [0,2,1,3]

    Explanation: 
        There are a total of 4 courses to take. To 
        take course 3 you should have finished both courses 1 and 2. 
        Both courses 1 and 2 should be taken after you finished course 0.

        So one correct course order is [0,1,2,3].
        Another correct ordering is [0,2,1,3].

Example 3:

    Input: numCourses = 1, prerequisites = []
    Output: [0]

Constraints:

    1 <= numCourses <= 2000
    
    0 <= prerequisites.length <= numCourses * (numCourses - 1)
    
    prerequisites[i].length == 2
    
    0 <= ai, bi < numCourses
    
    ai != bi
    
    All the pairs [ai, bi] are distinct.

Takeaway:

    Fantastic way to learn topological sort.

    We can make an adjacency list with a dict
"""
class Solution:
    
    def findOrder(self, numCourses: int, prerequisites: list[list[int]]) -> list[int]:
        """Return the order of courses based on prerequisites"""
        
        # this question teaches topological sort
        # which is a graph algorithm
        # vertex - node & edge - path
        # if we detect a cycle we will stop the run immediately,
        # no need to continue
        
        # we will make an adjacency list
        pre_map = {i: [] for i in range(numCourses)}
        
        # populate the map
        for course, pre in prerequisites:
            pre_map[course].append(pre)

        # a course has 3 possible states
        # visited: course has been added to output
        # visiting: course has not been added to output, but it is on the current dfs path (the cycle set)
        # unvisited: course not added to output or cycle
        
        output = []
        visited, cycle = set(), set()
            
        # we can use dfs 
        # which we will use on every course
        def dfs(crs):
            if crs in cycle:
                # detected a cycle
                return False
            if crs in visited:
                # we do not need to visit this again
                # this is because we will run dfs on 
                # every course, which might result in repeated work
                return True
            
            cycle.add(crs)
            for pre in pre_map[crs]:
                if dfs(pre) == False:
                    # we detected a cycle
                    return False
                
            # now we can remove the course from cycle
            cycle.remove(crs)
            # we went through all prerequisites
            # we can add this to visited
            visited.add(crs)
            # now that we actually visited all prerequisites
            # we can add it to output
            output.append(crs)
            
            return True
        
        # run dfs on all courses
        for c in range(numCourses):
            if dfs(c) == False:
                return []
            
        # if all dfs run successfully
        # return the output we built!
        return output
    
if __name__ == "__main__":
    sol = Solution()
    print(sol.findOrder(4, [[1,0],[2,0],[3,1],[3,2]])) 
    # [0, 1, 2, 3]

[0, 1, 2, 3]
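
Since several orderings can be valid, comparing the output against one fixed list is a fragile check. A small helper that verifies any candidate ordering could look like the sketch below. `is_valid_order` is a hypothetical name and is not part of the problem statement.

```python
def is_valid_order(num_courses, prerequisites, order):
    """Return True if order takes every course once, prerequisites first."""
    # every course must appear exactly once
    if sorted(order) != list(range(num_courses)):
        return False

    # position of each course inside the proposed ordering
    position = {course: index for index, course in enumerate(order)}

    # each prerequisite has to come before the course that needs it
    return all(position[pre] < position[course]
               for course, pre in prerequisites)
```

Both orderings listed in Example 2 pass this check, while a reversed ordering does not.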


In [10]:
"""
In this problem, a tree is an undirected graph that is 
connected and has no cycles.

You are given a graph that started as a tree with n nodes 
labeled from 1 to n, with one additional edge added. 

The added edge has two different vertices chosen from 1 to n, and was not 
an edge that already existed. 

The graph is represented as an array edges of length n where 
edges[i] = [ai, bi] indicates that there is an edge between 
nodes ai and bi in the graph.

Return an edge that can be removed so that the resulting graph 
is a tree of n nodes. 

If there are multiple answers, return the answer that occurs last in the input.

Example 1:

    Input: edges = [[1,2],[1,3],[2,3]]
    
    Output: [2,3]

Example 2:

    Input: edges = [[1,2],[2,3],[3,4],[1,4],[1,5]]
    
    Output: [1,4]
 
Constraints:

    n == edges.length
    
    3 <= n <= 1000
    
    edges[i].length == 2
    
    1 <= ai < bi <= edges.length
    
    ai != bi
    
    There are no repeated edges.
    
    The given graph is connected.

Takeaway:

    This question is a great opportunity to learn about UnionFind

    We need to find parent of nodes and union them if they 
        do not have the same parents. While doing that we will 
        update their ranks in order to union on a condition.
"""
class Solution:

    def findRedundantConnection(self, edges: list[list[int]]) -> list[int]:
        # tree - undirected graph that is connected and has no cycles.

        # Return an edge that can be removed so that the resulting graph is a tree of n nodes. 
        # If there are multiple answers, return the answer that occurs 
        # last in the input.

        # we will start from the initial edge
        
        # we can use dfs to get an O(n^2) solution
        # but we can use the "Union Find" algorithm to get it in nearly O(n)
        
        # the full graph is connected anyway
        # because it is a tree plus one extra edge (n nodes, n edges)
        
        # how can we decide if adding an edge made a cycle ?
        # when we add that edge, its two nodes are ALREADY connected
        
        # that edge is the redundant connection (which is why the question 
        # is called Redundant Connection)

        # for every node, the parent is itself at the start
        parent = [i for i in range(len(edges) + 1)]
        
        # ranks is the number of nodes in the tree under each root (itself included)
        # initially 1 for all of them
        ranks = [1] * (len(edges) + 1)
        
        def find(n):
            """Given a node n, find what its root parent is."""
            # there could be multiple links to get to the root parent
            p = parent[n]
            
            # n could be the parent of itself, 
            # so we will keep going until we find the parent equals to self
            while p != parent[p]:
                # path compression
                # shorten the path as we go up the link
                parent[p] = parent[parent[p]]
                # go up the link
                p = parent[p]
            
            # once we got to the root parent
            return p
        
        def union(n1, n2):
            """Union two given nodes"""
            # to union two nodes we need to find 
            # both of the parents first
            p1, p2 = find(n1), find(n2)
            
            if p1 == p2:
                # they already have the same parent
                # we cannot union these two
                # we found a redundant connection
                return False
            
            # union them by ranks
            if ranks[p1] > ranks[p2]:
                # p1 is going to be the parent
                parent[p2] = p1
                # update the rank of p1
                ranks[p1] = ranks[p1] + ranks[p2]
            else:
                # do the opposite - p2 is parent
                parent[p1] = p2
                # update the rank of p2
                ranks[p2] += ranks[p1]
            
            # successfully union them 
            return True
        
        for n1, n2 in edges:
            # call union on n1, n2
            if not union(n1, n2):
                # if cannot union it 
                return [n1, n2]
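
The comments above mention a slower DFS solution without showing it. A minimal sketch of that idea: before adding each edge, search the graph built so far, and if the two endpoints are already connected, the new edge closes a cycle. `find_redundant_connection_dfs` and `connected` are names made up for this sketch.

```python
from collections import defaultdict


def find_redundant_connection_dfs(edges):
    """Return the first edge whose endpoints are already connected."""
    graph = defaultdict(set)   # adjacency sets of the edges added so far

    def connected(src, dst):
        # iterative DFS from src, looking for dst
        stack, seen = [src], {src}
        while stack:
            node = stack.pop()
            if node == dst:
                return True
            for nxt in graph[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return False

    for a, b in edges:
        # only search when both endpoints have been seen before
        if a in graph and b in graph and connected(a, b):
            return [a, b]
        graph[a].add(b)
        graph[b].add(a)

    return []   # no cycle found (cannot happen for valid input)
```

Each of the n edges may trigger a search over up to n nodes, which is where the O(n^2) bound comes from.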

In [11]:
"""
There is an undirected graph with n nodes. 

There is also an edges array, where edges[i] = [a, b] means that there 
is an edge between node a and node b in the graph.

Return the total number of connected components in that graph.

Example 1:

    Input: n=3, edges=[[0,1], [0,2]]
    
    Output: 1

Example 2:

    Input:  n=6, edges=[[0,1], [1,2], [2, 3], [4, 5]]
    
    Output: 2

Constraints:

    1 <= n <= 100
    
    0 <= edges.length <= n * (n - 1) / 2

Takeaway:

    Another chance to learn UnionFind.

    Both DFS and UnionFind will work.

    If you have the edges you can make your own adjacency list
"""

class Solution:

    def countComponents(self, n: int, edges: list[list[int]]) -> int:
        """The dfs solution =) 
        """
        # [[0, 1], [0, 2]] 
        # single portion that is connected, result is 1

        # [[0, 1], [1, 2], [2, 3], [4, 5]]
        # 2 separate connected portions, result is 2

        # Make an adjacency list
        adj_list = [[] for i in range(n)]

        # populate adj_list with edges:
        for edge in edges:
            adj_list[edge[0]].append(edge[1])
            adj_list[edge[1]].append(edge[0])

        def dfs(node):
            # Mark the current node as visited
            visited[node] = True
            # Recur for all the vertices adjacent to this vertex
            for neighbor in adj_list[node]:
                if not visited[neighbor]:
                    dfs(neighbor)

        # Mark all the vertices as not visited
        visited = [False for _ in range(n)]

        # Store the number of connected components
        count = 0
         
        for v in range(n):
            if not visited[v]:  # same as: visited[v] == False
                dfs(v)
                count += 1
                 
        return count
   
    def countComponents_(self, n: int, edges: list[list[int]]) -> int:
        """Union Find Solution"""
        # we will find parents of nodes

        # if two union candidates have the same root parent
        # they are already connected
        # we will use rank as we union two nodes.

        # parent and rank will change as we 
        # iterate through nodes

        # initially all parents are nodes themselves
        parent = [i for i in range(n)]

        # rank is initially 1 for all
        rank = [1] * n

        def find(n):
            # initially the result is n
            res = n

            # until you get to node that is parent to itself
            while res != parent[res]:
                # path compression - shorten the path
                parent[res] = parent[parent[res]]
                # set result to the parent
                res = parent[res]
            # the node that is parent to itself
            return res

        def union(n1, n2):
            p1 = find(n1)
            p2 = find(n2)

            if p1 == p2:
                return 0
            
            if rank[p2] > rank[p1]:
                # p2 is the parent
                parent[p1] = p2
                rank[p2] += rank[p1]
            else:
                # p1 is the parent
                parent[p2] = p1
                rank[p1] += rank[p2]

            # we actually did the union
            return 1

        # the result is initially n different nodes
        res = n
        for n1, n2 in edges:
            # every time we make a union, we decrease the result
            res -= union(n1, n2)

        return res
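
The union-find helpers in this cell and in the redundant-connection cell are nearly identical, so they can be pulled into one small class. The sketch below is one way to do that, using union by size with the same path-shortening loop as above. The class name `UnionFind` and the function `count_components` are choices made for this sketch.

```python
class UnionFind:
    """Disjoint sets over the nodes 0 .. size - 1."""

    def __init__(self, size):
        self.parent = list(range(size))   # every node starts as its own root
        self.size = [1] * size            # number of nodes under each root

    def find(self, node):
        # walk up to the root, shortening the path on the way
        while self.parent[node] != node:
            self.parent[node] = self.parent[self.parent[node]]
            node = self.parent[node]
        return node

    def union(self, a, b):
        """Merge the sets of a and b. Return False if already merged."""
        root_a, root_b = self.find(a), self.find(b)
        if root_a == root_b:
            return False
        # attach the smaller tree under the larger one
        if self.size[root_a] < self.size[root_b]:
            root_a, root_b = root_b, root_a
        self.parent[root_b] = root_a
        self.size[root_a] += self.size[root_b]
        return True


def count_components(n, edges):
    uf = UnionFind(n)
    # every successful union merges two components into one
    return n - sum(uf.union(a, b) for a, b in edges)
```

The same class also covers the redundant-connection problem: the first edge for which `union` returns `False` is the one that closes a cycle.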

In [12]:
"""
Given n nodes labeled from 0 to n - 1 and a list of undirected 
edges (each edge is a pair of nodes), write a function to 
check whether these edges make up a valid tree.

Example 1:

    Input: n = 5, edges = [[0, 1], [0, 2], [0, 3], [1, 4]]
    
    Output: True

Example 2:

    Input: n = 5, edges = [[0, 1], [1, 2], [2, 3], [1, 3], [1, 4]]

    Output: False

Note:

    You can assume that no duplicate edges will appear in edges. 
    
    Since all edges are undirected, [0, 1] is the same as [1, 0] and 
        thus will not appear together in edges.

Constraints:

    1 <= n <= 100
    0 <= edges.length <= n * (n - 1) / 2

Takeaway:

    We do not have loops in trees.
    
    A tree needs to be connected.

    DFS will help us out.
"""

class Solution:
    def validTree(self, n: int, edges: list[list[int]]) -> bool:
        # we do not have loops in trees.
        # a tree needs to be connected.

        # in a dfs traversal, number of visited nodes
        # should match the input node size

        # edge case, empty graph is a tree
        if not n:
            return True

        # make an adjacency list
        adj = {i:[] for i in range(n)}
        for n1, n2 in edges:
            adj[n1].append(n2)
            adj[n2].append(n1)

        print(adj)

        visit = set()
        def dfs(i, prev):
            if i in visit:
                # node i was seen before!
                return False
            
            visit.add(i)
            
            for j in adj[i]:
                # for every neighbor
                if j == prev:
                    # go on
                    continue
                # make i as the new previous and call the dfs
                if not dfs(j, i):
                    # we detected a loop
                    return False
            return True

        # start at node 0 with prev = -1, because -1 is not a node
        # it is a tree if the dfs finds no loop AND every node was visited
        return dfs(0, -1) and n == len(visit)
    
sol = Solution()
print(sol.validTree(n=5, edges=[[0,1],[0,2],[0,3],[1,4]]))

{0: [1, 2, 3], 1: [0, 4], 2: [0], 3: [0], 4: [1]}
True
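
The two takeaways (no loops, connected) can also be checked without DFS. A minimal sketch, assuming the same input format as above: a graph on n nodes is a tree exactly when it has n - 1 edges and none of them closes a cycle, and union-find detects the cycle. The function name `is_valid_tree` is hypothetical, and returning `True` for `n == 0` simply mirrors the edge case chosen in the DFS solution.

```python
def is_valid_tree(n: int, edges: list[list[int]]) -> bool:
    # mirror the DFS solution above: treat the empty graph as a tree
    if n == 0:
        return True

    # a tree on n nodes has exactly n - 1 edges
    if len(edges) != n - 1:
        return False

    parent = list(range(n))

    def find(x: int) -> int:
        while x != parent[x]:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for a, b in edges:
        ra, rb = find(a), find(b)
        if ra == rb:
            # a and b were already connected, so this edge closes a loop
            return False
        parent[ra] = rb

    # n - 1 edges and no loop means the graph is connected
    return True


print(is_valid_tree(5, [[0, 1], [0, 2], [0, 3], [1, 4]]))          # True
print(is_valid_tree(5, [[0, 1], [1, 2], [2, 3], [1, 3], [1, 4]]))  # False
```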


In [13]:
"""
A transformation sequence from word beginWord to word endWord using 
a dictionary wordList is a sequence of words 
beginWord -> s1 -> s2 -> ... -> sk such that:

    Every adjacent pair of words differs by a single letter.

    Every si for 1 <= i <= k is in wordList. 
        Note that beginWord does not need to be in wordList.
    
    sk == endWord

Given two words, beginWord and endWord, and a dictionary wordList, return 
the number of words in the shortest transformation sequence from 
beginWord to endWord, or 0 if no such sequence exists.

Example 1:

    Input: 
        beginWord = "hit", endWord = "cog", 
        wordList = ["hot","dot","dog","lot","log","cog"]
    
    Output: 5

    Explanation: 
    
    One shortest transformation sequence 
        is "hit" -> "hot" -> "dot" -> "dog" -> "cog", which is 5 words long.

Example 2:

    Input: 
        beginWord = "hit", endWord = "cog", 
        wordList = ["hot","dot","dog","lot","log"]
    
    Output: 0

    Explanation: 
        The endWord "cog" is not in wordList, therefore 
            there is no valid transformation sequence.

Constraints:

    1 <= beginWord.length <= 10
    
    endWord.length == beginWord.length
    
    1 <= wordList.length <= 5000
    
    wordList[i].length == beginWord.length
    
    beginWord, endWord, and wordList[i] consist of lowercase English letters.
    
    beginWord != endWord
    
    All the words in wordList are unique.

Takeaway:

    BFS - A SET AND A DEQUE IS A CLASSIC

    Make an adjacency map with a defaultdict

    For shortest path, BFS is GOOD
"""

from collections import deque
from collections import defaultdict
from collections import Counter

class Solution:
    def ladderLength_(self, beginWord: str, endWord: str, wordList: list[str]) -> int:
        # every word with one letter difference is connected.
        # if there is a path from start to end, we found the solution!
        
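        # brute force: compare every pair of words
        # building the graph this way is O(N^2 * M), too slow for large inputs

        if endWord not in wordList:
            return 0

        # beginWord does not need to be in wordList, so add it ourselves
        words = list(set(wordList) | {beginWord})
        adj = {word: [] for word in words}

        for i in range(len(words)):
            for j in range(i + 1, len(words)):
                if self.one_diff_(words[i], words[j]):
                    # undirected edge between the two words
                    adj[words[i]].append(words[j])
                    adj[words[j]].append(words[i])

        visited = {beginWord}
        queue = deque([(beginWord, 1)])  # word and its level

        while queue:
            word, level = queue.popleft()
            if word == endWord:
                return level
            for next_word in adj[word]:
                if next_word not in visited:
                    visited.add(next_word)
                    queue.append((next_word, level + 1))

        return 0

    def one_diff_(self, word1, word2):
        # True when the two words differ in exactly one position
        return sum(a != b for a, b in zip(word1, word2)) == 1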
    
    def ladderLength__(self, beginWord: str, endWord: str, wordList: list[str]) -> int:
        # expert help, works
        if endWord not in wordList:
            return 0

        wordList = set(wordList)

        queue = deque([(beginWord, 1)])  # Initial word and its level

        while queue:
            current_word, level = queue.popleft()

            if current_word == endWord:
                return level

            for i in range(len(current_word)):
                for char in 'abcdefghijklmnopqrstuvwxyz':
                    next_word = current_word[:i] + char + current_word[i + 1:]

                    if next_word in wordList:
                        wordList.remove(next_word)
                        queue.append((next_word, level + 1))

        return 0  # No transformation sequence found
    
    def ladderLength(self, beginWord: str, endWord: str, wordList: list[str]) -> int:
        # we need to make the graph ourselves,
        # beginning word might not be in wordList
        
        # making the graph in a nested loop takes O(N^2 * M) time where
        # M is the length of a word, N is the number of words
        # this does not cut it
        
        # To make the adjacency list, use patterns:
        
        #      / *it
        #   hit- h*t  
        #      \ hi*
        
        # for every pattern, make a dictionary
        
        # To find the shortest path, BFS is really good
        
        if endWord not in wordList:
            return 0
        
        # a defaultdict for adjacency map
        neigh = defaultdict(list)
        wordList.append(beginWord)
        
        # make the adjacency list
        for word in wordList:
            for j in range(len(word)):
                # every word is same len
                pattern = word[:j] + "*" + word[j+1:]
                neigh[pattern].append(word)
        
        visit = {beginWord}  # to mark visited words (set(beginWord) would be a set of letters)
        
        # BFS THING
        # add the beginning word, 
        #  pop elements and go layer by layer
        q = deque([beginWord]) 
        res = 1

        while q:
            for i in range(len(q)):
                word = q.popleft()
                if word == endWord:
                    # the end!
                    return res
                # using the pattern, visit all neighbours in BFS
                for j in range(len(word)):
                    pattern = word[:j] + "*" + word[j + 1:]
                    for neigh_word in neigh[pattern]:
                        if neigh_word not in visit:
                            visit.add(neigh_word)
                            q.append(neigh_word)
            res += 1
            
        # could not reach
        return 0
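
To see what the pattern map in `ladderLength` holds, the bucket-building step can be run on its own with the words from Example 1. This is a small sketch for illustration; the helper name `build_pattern_buckets` is hypothetical and not part of the solution above.

```python
from collections import defaultdict


def build_pattern_buckets(words: list[str]) -> dict[str, list[str]]:
    # map each wildcard pattern (one letter replaced by "*")
    # to the words that match it
    buckets = defaultdict(list)
    for word in words:
        for j in range(len(word)):
            pattern = word[:j] + "*" + word[j + 1:]
            buckets[pattern].append(word)
    return buckets


buckets = build_pattern_buckets(["hit", "hot", "dot", "dog", "lot", "log", "cog"])
print(buckets["h*t"])  # ['hit', 'hot']
print(buckets["*ot"])  # ['hot', 'dot', 'lot']
print(buckets["*og"])  # ['dog', 'log', 'cog']
```

Two words are neighbours exactly when they share a bucket, so each word reaches its neighbours through M pattern lookups instead of being compared against all N words.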