# Oral Exam preparation:


### Algorithms to know
- Merge sort, quicksort
- Longest common subsequence
- Huffman encoding
- Signals, BFS, DFS, Shortest paths, topological sort
- Dijkstra, Bellman-Ford, Floyd-Warshall, Prim, PageRank






### Helpful links :

- PythonTutor : step-by-step visualization of any Python code ([here](https://pythontutor.com/render.html#mode=edit)).

- BigO Calculator : helps to understand the time and space complexity of an algorithm ([here](https://www.bigocalc.com/)).

### Big O

In [1]:
# Why are slicing and concatenation O(n) and not O(k) ?
#
#
# Slicing :
#       k is the length of the substring being sliced or concatenated.
#       k is different for different subproblems (between 1...n).
#       k ultimately depends on n.
#   
#       In the worst case, k = n (k grows linearly with the input size).
#       
#       Slicing is O(n).
#
# Concatenation :
#       Strings are immutable in Python.
#       So to add a char (concatenation), Python must create a NEW string
#       and copy every existing character into it (length k).
#       
#       Again, in the worst case k = n.
#       
#       So concatenation is O(n).
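To make the argument above concrete, here is a minimal sketch that counts how many characters get copied when a string is consumed slice by slice or grown char by char. The two helper names are made up for this illustration, and the count follows the simple "every slice or concatenation copies the whole result" model (CPython can sometimes avoid a copy when concatenating, so treat this as a model, not a measurement).

```python
def chars_copied_by_slicing(s: str) -> int:
    """Count the characters copied when s is consumed one slice at a time."""
    copied = 0
    while s:
        s = s[1:]            # builds a NEW string of len(s) - 1 characters
        copied += len(s)
    return copied


def chars_copied_by_concatenation(n: int) -> int:
    """Count the characters copied when a string is grown one char at a time."""
    copied = 0
    s = ""
    for _ in range(n):
        s = s + "x"          # builds a NEW string of len(s) + 1 characters
        copied += len(s)
    return copied


n = 1000
print(chars_copied_by_slicing("x" * n))     # (n - 1) * n / 2 = 499500
print(chars_copied_by_concatenation(n))     # n * (n + 1) / 2 = 500500
```

Each single operation is O(n), so doing it n times in a loop or a recursion is O(n^2) in this model.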


## 1. Divide and Conquer

### Merge Sort

In [2]:
# Merge Sort :
#       To visualize : https://pythontutor.com/
#
# Methodology :
#       - Goal : sort a list of numbers
#       - Idea : halve the list recursively until we have lists of size 1 or 0 (which are sorted by definition)
#               then merge the sorted lists back together
# 
#       1. Base case : Return immediately if list is size 1 (sorted).
#       2. Divide : Split the input into two halves.
#       3. Conquer : 
#           recursively sort each half,
#           then merge: compare the first elements of the two sorted halves
#           and move the smaller one to sorted_list, until one half is empty.

#   Example : [3, 7, 4, 1]
#       Split into [3, 7] and [4, 1]
#       split [3, 7] into [3] and [7] 
#        -> while A and B are not empty, compare their first elements, pop the smaller one and add it to sorted_list.
#        -> result is [3, 7]
#       same for [4, 1] -> [1, 4]
#
#       finally merge [3, 7] and [1, 4] 
#         -> compare A[0]=3 and B[0]=1, pop B[0], sorted_list = [1]
#         -> compare A[0]=3 and B[0]=4, pop A[0], sorted_list = [1, 3]
#         -> compare A[0]=7 and B[0]=4, pop B[0], sorted_list = [1, 3, 4]
#         -> B is empty, add the remaining A to sorted_list = [1, 3, 4, 7]

# ====================================
def merge_sort(numbers):
    # Simple cases
    if len(numbers) < 2: return numbers         # O(1)
        
    # Divide
    M = len(numbers) // 2                       # O(1)
    L1 = numbers[:M]                            # O(n) - slicing copies M elements
    L2 = numbers[M:]                            # O(n) - slicing copies n - M elements
    
    # Conquer - assuming merge_sort worked on L1 and L2
    A = merge_sort(L1)                          # T(n/2)
    B = merge_sort(L2)                          # T(n/2)
    sorted_numbers = []                         # O(1)
    while A and B:                              # at most n iterations
        sorted_numbers += [
            A.pop(0) if A[0] < B[0] else B.pop(0)   # pop(0) is O(n): it shifts the remaining elements
        ]
    return sorted_numbers + A + B              # O(n) for concatenation

cards = [3, 7, 4, 1, 2, 7, 3]
sorted_cards = merge_sort(cards)
print(sorted_cards)


# ======== COMPLEXITY =========
#
#       Base case : O(1)
#       Divide    : O(n)  (the two slices copy the list)
#       Conquer :
#           -Recursion : 2 T(n/2)
#           -Merging   : O(n)  (assuming O(1) removal from the front, see the caveat on pop(0) below)
#
#       Total : 
#           T(n) = 2 T(n/2) + O(n) 
#
#       Halving the list until size 1 gives O(log n) levels of recursion: 8->4->2->1 is 3 divisions (log2(8)=3)
#       Dividing and merging cost O(n) in total at each level
#       So total cost is O(n) * O(log n) = O(n log n)
#
#       => T(n) = O( n log n )      (or prove it by Master Theorem)
#
#
#       Caveat for merge_sort above: list.pop(0) is O(n), because every remaining element is shifted by one position.
#       A merge of n elements can then cost O(n^2), so T(n) = 2 T(n/2) + O(n^2) = O(n^2) in the worst case.
#       Replacing pop(0) with list indexes (see below) gives O(1) access and restores O(n log n).
#

# -----------------------------------
# This one avoids pop(0) which adds O(n) complexity due to shifting elements
# Instead we use indices for both sublists.
def merge_sort_efficient(numbers):
    if len(numbers) < 2:
        return numbers                                   # O(1)
    m = len(numbers) // 2
    left = merge_sort_efficient(numbers[:m])             # T(n/2) (slicing cost O(m))
    right = merge_sort_efficient(numbers[m:])            # T(n/2) (slicing cost O(n-m))
    i = j = 0
    out = []
    while i < len(left) and j < len(right):              # O(n) total for merge
        if left[i] < right[j]:
            out.append(left[i]); i += 1                  # append is amortized O(1)
        else:
            out.append(right[j]); j += 1
    if i < len(left):
        out.extend(left[i:])                             # O(k) copy of remainder
    if j < len(right):
        out.extend(right[j:])                            # O(k) copy of remainder
    return out



[1, 2, 3, 3, 4, 7, 7]
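As a sanity check of the O(n log n) bound, here is a small sketch that counts the element comparisons made while merging. `merge_sort_counting` is a hypothetical variant written for this note (index-based merge, with `<=` so that equal elements keep their original order); it is not part of the course code.

```python
import math


def merge_sort_counting(numbers):
    """Return (sorted copy, number of comparisons made while merging)."""
    if len(numbers) < 2:
        return list(numbers), 0
    m = len(numbers) // 2
    left, c_left = merge_sort_counting(numbers[:m])
    right, c_right = merge_sort_counting(numbers[m:])

    out, i, j, comparisons = [], 0, 0, 0
    while i < len(left) and j < len(right):
        comparisons += 1
        if left[i] <= right[j]:          # <= keeps equal elements in order
            out.append(left[i])
            i += 1
        else:
            out.append(right[j])
            j += 1
    out.extend(left[i:])                 # at most one of these two is non-empty
    out.extend(right[j:])
    return out, c_left + c_right + comparisons


for n in (8, 64, 1024):
    data = list(range(n, 0, -1))         # reversed input
    result, comparisons = merge_sort_counting(data)
    assert result == sorted(data)
    # the comparison count stays below n * log2(n)
    print(n, comparisons, int(n * math.log2(n)))
```

Each merge of k elements needs fewer than k comparisons and there are log2(n) levels, which is where the n log n bound comes from.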


### Quick Sort

In [3]:
# Quick Sort :
#
# Methodology :
#    Goal : sort a list of numbers
#    Idea : pick a pivot (median, first/last elem, random, ...),
#             put the elems smaller than the pivot to the left, the bigger ones to the right, 
#             recursively sort low and high, then combine: low + pivots + high
#
#   1. Base case : Return list immediately if list size is 0 or 1 (by definition, sorted).
#   2. Divide    : Partition the array around a pivot into `low`, `pivots`, `high`.
#   3. Conquer   : Recursively sort `low` and `high`, then return `low + pivots + high`.
#
#   Example : [3, 7, 4, 1]
#       pivot = 4 (middle)
#       low = [3,1], pivots=[4], high=[7]
#       quicksort(low) -> [1,3]; quicksort(high) -> [7]
#       result -> [1,3,4,7]


# ====================================
def quicksort(numbers):
    # Simple cases
    if len(numbers) < 2: return numbers               # O(1)
    
    # CHOOSE pivot type
    # pivot = numbers[0]                              # First elem : O(1)
    pivot = numbers[len(numbers) // 2]                # Middle elem : O(1)

    # Divide
    low    = [x for x in numbers if x <  pivot]       # O(n) - each list scans ALL the elems of the original list -> O(n).
    pivots = [x for x in numbers if x == pivot]       # O(n)
    high   = [x for x in numbers if x >  pivot]       # O(n)
        # ---> Total: 3 * O(n) = O(n)
    
    # Conquer
    return quicksort(low) + pivots + quicksort(high)  # O(n) for concatenation


cards = [1, 2, 6, 5, 3, 7, 4]
res = quicksort(cards)
print(res)



# ======== COMPLEXITY =========
#
#   Best and average case: O(n log n)
#   - When the pivot divides the list into roughly equal parts at each recursive step, the depth of recursion is about log n.
#   - Each level of recursion processes all n elements (for partitioning into low, pivots, high), leading to O(n) work per level.
#     Total: O(n log n).
#
#   Worst case: O(n^2)
#   - When the pivot is always the smallest or largest element, resulting in highly unbalanced partitions.
#   - The recursion depth becomes n, and each level still processes all elements.
#   Total: O(n^2).
#
#
#
#
#   Details : 
#   
#   1. Base case    : O(1)
#   2. Pick a pivot : O(1)
#
#   3. Divide (partition) :
#           Three list comprehensions, each scanning the whole list : 3 * O(n) = O(n)
#
#   4. Conquer :
#       Recursion : 
#           Average : pivot splits the list roughly in half = 2 T(n/2)
#           Worst   : unbalanced lists = T(n-1) + T(0)     (one list has all elements except the pivot, the other is empty)
#       Combining :
#           Concatenation (low + pivots + high): O(n)
#
#   Total : 
#       Average : T(n) = 2 T(n/2) + O(n)
#                    => T(n) = O(n log n)   (Master Theorem)
#       Worst   : T(n) = T(n-1) + T(0) + O(n)
#                   => T(n) = O(n^2)       (arithmetic series: n + (n-1) + ... + 1)



# Conclusion :
    # Time Complexity:
    # - Best and average case: O(n log n), when each pivot splits the list into roughly equal halves (depth log n, O(n) work per level).
    # - Worst case: O(n^2), when each pivot is the smallest or largest element (depth n, O(n) work per level).




[1, 2, 3, 4, 5, 6, 7]
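The difference between the average and the worst case is easy to observe by measuring the recursion depth. The sketch below is an illustration only: `recursion_depth`, `first_elem` and `middle_elem` are hypothetical helpers that reuse the same partitioning idea as `quicksort` but return the depth instead of the sorted list.

```python
def recursion_depth(numbers, pick_pivot, depth=1):
    """Depth of the quicksort recursion tree for a given pivot rule."""
    if len(numbers) < 2:
        return depth
    pivot = pick_pivot(numbers)
    low = [x for x in numbers if x < pivot]
    high = [x for x in numbers if x > pivot]
    return max(recursion_depth(low, pick_pivot, depth + 1),
               recursion_depth(high, pick_pivot, depth + 1))


def first_elem(xs):
    return xs[0]


def middle_elem(xs):
    return xs[len(xs) // 2]


already_sorted = list(range(64))
print(recursion_depth(already_sorted, first_elem))    # 64 : one level per element
print(recursion_depth(already_sorted, middle_elem))   # close to log2(64) = 6
```

On an already sorted list, the first-element pivot is always the minimum, so `low` is empty every time and the depth equals n. The middle pivot halves the list at each level.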


## 2. Dynamic Programming :

#### LCS - Return Int:

In [5]:
# Longest Common Subsequence (LCS) between two strings A and B
# 
#
# Methodology :
#      Goal : return the LENGTH of the longest common subsequence of A and B
#             (common characters in the same relative order, not necessarily contiguous)
#      Idea : if first letters match -> take them and recurse on A[1:], B[1:]
#             else -> try skipping A[0] or B[0] and take the max
#   
#   
#   Example : see complexity below.   
#   
# ================================================================
import functools
@functools.lru_cache(maxsize=None)
def LCS(A, B):
    # Base case:
    if len(A) == 0 or len(B) == 0: return 0         # O(1)

    # check if first letters match
    if A[0] == B[0]: return 1 + LCS(A[1:], B[1:])   # O(slicing) + T(n-1, m-1)

    l1 = LCS(A, B[1:])                              # O(slicing) + T(n, m-1)
    l2 = LCS(A[1:], B)                              # O(slicing) + T(n-1, m) 

    return max(l1, l2)                     # O(1)

A = "ACE"
B = "ABCDE"
print(LCS(A, B))  # Output: 3

A = "HYPERLINKING"
B = "DOLPHINSPEAK"
print(LCS(A, B))  # Output: 4


# ======== COMPLEXITY =========
#
# Idea :
#   At each call LCS(A,B) we either:
#     - match first chars -> one recursive call on (A[1:],B[1:])
#     - or mismatch -> two recursive calls: (A, B[1:]) and (A[1:], B)
# 
# 
#    T(n, m) : length of A is n, length of B is m
# 
#
# No memoization : 
#
#      Best Case : 
#         All characters match, leading to a single chain of recursive calls.
#         -> T(n, m) = 1 + T(n-1, m-1)
#
#         Each recursive call reduces both strings by 1.
#         Linear chain of calls, not a tree.
#         The chain has min(n, m) calls: it stops when the shorter string is empty.
#         
#         Each call slices both strings: O(n + m) at worst.
#         
#         Total Cost : O( min(n,m) * (n + m) ) 
#                    
#         if n ~ m :
#                -> O(n^2) - Quadratic time.
#         
#      Worst Case : 
#           No characters match. Each call branches into two further calls, leading to an exponential number of calls.
#           -> T(n, m) = 1 + T(n-1, m  ) + T(n  , m-1) 
#
#           Each "mismatch" node branches into 2 subproblems.
#           Each branch decreases either n or m by 1.
#           A path ends when either string is empty, so the height of the tree is at most n + m.
#           
#           A binary tree of height h has at most 2^{h+1} - 1 nodes, so there are O( 2^{n+m} ) calls.
#           
#           Each call slices both strings: O(n + m) at worst.
#           
#           Total Cost : O( (n + m) * 2^{n+m} ) (= slicing cost * number of nodes)
#           
#           if n ~ m :
#                    -> O(n * 4^n) - Exponential time.


# With memoization :
#       
#      Example: 
#           LCS("AB","CD") branches into :
#                   -> LCS("AB","D") and 
#                   -> LCS("B","CD") 
#           and both of them need LCS("B","D"), so without memoization it is computed twice.
# 
#       Memoization: 
#           Store results of each subproblem (i,j) in a table (dictionary or 2D array).
#           If we ever reach (i,j) again, return the stored value instead of recomputing
#           -> This removes redundant computations entirely.
#           
#        How many subproblems are there in total? 
#           Take A = "ABCDE" and B = "ACE": subproblem (i, j) compares A[i:] with B[j:].
#
#           A = "ABCDE" (length n=5)
#           B = "ACE"   (length m=3)
#
#           (0, 0) -> comparing "ABCDE" with "ACE" (full strings)
#           (0, 1) -> comparing "ABCDE" with "CE"
#           (1, 0) -> comparing "BCDE"  with "ACE"
#           ...
#
#           i can be 0..n (length of A) -> n+1 possibilities 
#           j can be 0..m (length of B) -> m+1 possibilities
#
#           Total unique subproblems = (n+1) * (m+1) = O(n*m)
#
#           Each subproblem (i,j) takes O(1) time to compute (just a few comparisons and additions).
#           But slicing adds O(n + m) time per call (both strings are copied).
#           -> we could replace slicing with string indexes (i, j) to make it O(1) !
#           
#           T(n, m) = (Number of unique subproblems) * (cost per subproblem)
#                   = O(n * m) * O(n + m) 
#                   = O(n * m  * (n + m))
#           
#           if n ~ m :
#                    -> O(n^2) - Quadratic time. (NO slicing)
#                    -> O(n^3) - Cubic time.     (slicing)
#


def LCS_inverted(A, B):
    if len(A) == 0 or len(B) == 0: return 0

    if A[-1] == B[-1]: return 1 + LCS_inverted(A[:-1], B[:-1])

    return max(LCS_inverted(A, B[:-1]), LCS_inverted(A[:-1], B))

print(LCS_inverted("HYPERLINKING", "DOLPHINSPEAK"))


3
4
4
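The notes above suggest replacing slicing with indexes so that each subproblem costs O(1). A minimal sketch of that idea follows; `lcs_length` and the inner `solve(i, j)` are names chosen for this illustration, where `solve(i, j)` stands for the LCS length of `A[i:]` and `B[j:]`.

```python
import functools


def lcs_length(A: str, B: str) -> int:
    """Length of the longest common subsequence, memoized on (i, j)."""

    @functools.lru_cache(maxsize=None)
    def solve(i: int, j: int) -> int:
        # Base case: one of the two suffixes is empty
        if i == len(A) or j == len(B):
            return 0
        # First letters of the suffixes match: take them
        if A[i] == B[j]:
            return 1 + solve(i + 1, j + 1)
        # Otherwise skip one letter in A or in B and keep the best
        return max(solve(i, j + 1), solve(i + 1, j))

    return solve(0, 0)


print(lcs_length("ACE", "ABCDE"))                   # 3
print(lcs_length("HYPERLINKING", "DOLPHINSPEAK"))   # 4
```

There are (n+1) * (m+1) pairs (i, j) and each one does O(1) work, so this version is O(n * m). The cache key is also cheaper: two small integers instead of two strings.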


#### LCS - Return String:

In [None]:
import functools

@functools.cache
def LCS(A, B):
    if len(A) == 0 or len(B) == 0:
        return ""
    if A[-1] == B[-1]:
        return LCS(A[:-1], B[:-1]) + A[-1]
    guesses = [LCS(A, B[:-1]), LCS(A[:-1], B)]
    return max(guesses, key=len)

LCS("HYPERLINKING", "DOLPHINSPEAK")

In [7]:
# LCS + return the actual subsequence string, not just the length of the matched subsequences.
#
#
#
#
import functools
@functools.lru_cache(maxsize=None)
def LCS(A, B):
    # Base case:
    if len(A) == 0 or len(B) == 0: return ""           # O(1) - Return empty string

    # check if first letters match
    if A[0] == B[0]: return A[0] + LCS(A[1:], B[1:])   # O(concat) + O(slicing)

    else:
        l1 = LCS(A, B[1:])
        l2 = LCS(A[1:], B)
        return max([l1, l2], key=len)





A = "ACE"
B = "ABCDE"
# A = "HYPERLINKING"
# B = "DOLPHINSPEAK"
print(LCS(A, B))  # Output: ACE




# ======== COMPLEXITY =========
# 
#   Here, instead of returning lengths, we return actual subsequence strings.
# 
#   Idea :
#     At each call LCS(A,B):
#       - match first chars -> return matched char + one recursive call on (A[1:],B[1:])
#       - or mismatch -> two recursive calls: (A, B[1:]) and (A[1:], B)
# 
#   Example:
#       Iteration 1:
#           "ACE" vs. "ABCDE".
#           match 'A' -> return 'A' + LCS("CE", "BCDE")
#
#       Iteration 2:
#           "CE" vs. "BCDE".
#           no match ('C' != 'B')
#           -> return max(  LCS("CE", "CDE"),  LCS("E", "BCDE"),  key=len  )
#               eventually those calls return "CE" and "E" respectively.
#           So we return the longer of "CE" and "E" = "CE", and iteration 1 returns 'A' + "CE" = "ACE"
# 
# 
#   Important Note:
#       time complexity INCREASES with strings compared to just numbers.
#       previously : return 1    + LCS(...)
#       now        : return char + LCS(...)  (string concatenation)
#       
#       String concatenation copies the result: O(k) where k ≤ min(n−i, m−j), so O(min(n,m)) in the worst case.
#       So the cost per subproblem increases.
#       
#       Slicing takes Θ((n−i) + (m−j)) time, so O(n + m) in the worst case.
#       
#       Cost per subproblem = concat + slicing = O(min(n,m)) + O(n + m) = O(n + m)
#       
#       Same : Total unique subproblems = O(n * m)
#       
#       Total Cost : O(n * m) * O(n + m) = O(n * m * (n + m))
#       
#       if n ~ m :
#                -> O(n^3) - Cubic time.
#       
#
#
#

ACE
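One way to avoid paying for slicing and concatenation in every subproblem is to fill a table of lengths first and rebuild the string in a single walk afterwards. This is a swapped-in bottom-up technique, not the recursive method used in the cell above, and `lcs_string` is a hypothetical name for this sketch.

```python
def lcs_string(A: str, B: str) -> str:
    """Return one longest common subsequence of A and B."""
    n, m = len(A), len(B)

    # table[i][j] = LCS length of the suffixes A[i:] and B[j:]
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if A[i] == B[j]:
                table[i][j] = 1 + table[i + 1][j + 1]
            else:
                table[i][j] = max(table[i][j + 1], table[i + 1][j])

    # Walk from (0, 0), following the choices that produced the lengths
    out = []
    i = j = 0
    while i < n and j < m:
        if A[i] == B[j]:
            out.append(A[i])             # matched char belongs to the LCS
            i += 1
            j += 1
        elif table[i][j + 1] >= table[i + 1][j]:
            j += 1                       # skipping B[j] loses nothing
        else:
            i += 1                       # skipping A[i] loses nothing
    return "".join(out)                  # one O(length) join at the end


print(lcs_string("ACE", "ABCDE"))                        # ACE
print(len(lcs_string("HYPERLINKING", "DOLPHINSPEAK")))   # 4
```

Filling the table is O(n * m) and the walk is O(n + m), so the string version no longer needs the extra factor for slicing. When several subsequences have the same length, which one is returned depends on the tie-breaking rule in the walk.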


## 3. Greedy Algorithms

### Huffman Encoding:

For more details, theory and explanations, see `3_Greedy_Algorithms.ipynb`.

A heap is a binary tree where each node is ordered relative to its parent and children :
- **max-heap:** parent is always `greater than or equal to` its children in value.
- **min-heap:** parent is always `less than or equal to` its children in value.

#### Operations :
- Finding the minimum of a plain Python list is O(n) because we need to iterate over the complete list; in a min-heap it sits at the root, so reading it is O(1).
- Since the height of the tree is logarithmic, adding/removing an element takes at most log(n) steps of simple O(1) comparisons. --> O(log n)


- BUT creating a heap from an existing list (`heapify`) is O(n): the whole list has to be processed once.
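The operations listed above map directly onto Python's `heapq` module, which treats a plain list as a min-heap. A short sketch with made-up frequencies:

```python
from heapq import heapify, heappop, heappush

frequencies = [5, 1, 4, 2, 3]
heapify(frequencies)                 # O(n): rearranges the list in place
print(frequencies[0])                # 1 : the smallest element is always at index 0

smallest = heappop(frequencies)      # O(log n): removes 1
second = heappop(frequencies)        # O(log n): removes 2
heappush(frequencies, smallest + second)   # O(log n): inserts 3

print(frequencies[0])                # 3
print(sorted(frequencies))           # [3, 3, 4, 5]
```

Popping the two smallest values and pushing their sum back is exactly the step that Huffman encoding repeats until a single element is left.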


In [None]:
# Huffman Coding Algorithm
#
# Methodology :
#    Goal : To send text over a channel, we encode it in binary. To be more efficient, we can also compress the data.
#           given a text, assign binary codes to each character based on their frequencies,
#           such that more frequent characters have shorter codes, minimizing the total length of the encoded text
#
#
#   Pseudo-code :
#
#       1. Count the frequency of each character.
#
#       2. Create a Node class :
#                    - frequency
#                    - character
#                    - children (left, right)
#                    - comparison method (based on frequency)
#
#       3. Give this class a comparison method (__lt__) so the heap can order nodes by frequency.
#       
#       Since we build the tree from the bottom up :
#           - Priority Queue (Min-Heap) : 
#               keeps track of the minimum element in a collection.
#               lets us cheaply remove the two smallest elements,
#               form a new node with them as children, and add that node back to the queue.
#               The last element left in the queue is the root of the Huffman encoding tree.
# 
#       4. Priority Queue operations (repeat b-d until a single node is left)
#           a. create a heap of nodes {frequency, char}
#           b. remove the 2 least frequent nodes
#           c. create a parent node from them (frequency = sum of both)
#           d. add the created node back to the heap.
#           
#           
#       5. After the (bottom-up) tree creation is done,
#           build the codebook top-down (read only).
#           

#   Example :
#       text = "hello"
#       Frequencies : {h:1, e:1, l:2, o:1}
#       (...see the code cell in 3_Greedy_Algorithms.ipynb)


# =========================================
from heapq import heapify, heappop, heappush
from collections import Counter
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    # Our class defining each node in the Huffman tree. 
    freq: int
    char: Optional[str] = None
    children: Optional[list['Node']] = None
    
    def __lt__(self, other):
        return self.freq < other.freq           # O(1)


def huffman(text: str):
    # 1: Calculate the frequencies of each letter in text
    heap = [Node(char=c, freq=f) for c, f in Counter(text).items()]  # O(n) = 3 * O(n) = Counter() + .items() + list comprehension

    # 2: Heapify
    heapify(heap)                                           # O(n)

    # 3: Remove 2 least frequent letters, 
    #    create a new node with them as children,
    #    add created node back to the heap.
    while len(heap) > 1:                                    # O(n-1)  
        y, z = heappop(heap), heappop(heap)                 # O(log n) * 2
        w = Node(freq=y.freq + z.freq, children=[y, z])     #
        heappush(heap, w)                                   # O(log n)    

    # 4: Visual representation of the final coding {char: code}
    codebook = {}
    def build_code(letter: Node, prefix: str = ""):
        if letter.char is not None:                         # O(1) - leaf: store its code
            codebook[letter.char] = prefix                  # O(1)
        if letter.children:
            build_code(letter.children[0], prefix + "0")    # O(depth of node) - left branch
            build_code(letter.children[1], prefix + "1")    # O(depth of node) - right branch
    
    # 5: Build the codebook top-bottom (read only).
    build_code(heap[0])                                      # O(n) - visits all nodes (2n-1)

    return codebook

huffman('helllo')



# ======== COMPLEXITY =========
# 
# 
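#   Notation : n is used both for the length of the text and for the number of distinct characters.
#              The number of distinct characters is at most the length of the text, so every bound below is a safe upper bound.
#
#   Line-by-Line Complexity :
#       1. Frequency Calculation + Node Creation :
#           - Counter(text) : Scans entire input string of length n     - O(n)
#           - .items()      : Iterates over at most n distinct chars    - O(n)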
#           - List comp     : Creates one Node per distinct character   - O(n)
#       
#           Total :                                                     ~ O(n)
#       
#       2. Heapify :
#           - heapify(heap) : Builds a min-heap from n elements         ~ O(n)
#       
#
#       3. Huffman Tree :
#           - while     : (each it. reduces heap size by 1) ~O(n-1)     - O(n)
#           - heappop() : Each heappop = O(log n)                       - O(log n)
#           - Node      : Node creation and list creation of size 2     - O(1)
#           - heappush(): heap insertion O(log n)                       - O(log n)
#
#           -> Cost per iteration : O(log n)
#
#           Total : freq    + heapify + (while  * 2 heappop/heappush)       ~
#                   3* O(n) + O(n)    + (O(n-1) * 2 O(log n)        )       ~
#                   -> O(n) * O(log n)         
#                   = O(n log n)                                            ~ O(n log n)
#
#
#       4. Codebook Creation :
#           - build_code()   :  visiting each node once                  - O(n)
#           - recursive call : concat : prefix + "0" =                   - O(depth of node)
#           - recursive call : concat : prefix + "1" =                   - O(depth of node)
#           
#           -> Cost per visit : 2 * O(depth)                             ~ O(depth)
#           
#           
#           Total : 
#               - Each node is visited once → O(#nodes) = O(n)
#               - At each visit, we concatenate the prefix string → O(depth)
#
#               Worst Case : depth = n  ----> (#nodes * depth) =  n^2     ~ O(n^2)
#                   
#               Best Case  : depth = log n  ---->              = n log n  ~ O(n log n)
#       
#               (see below for details)
#       
#       
#       
#       
#       
#    Nodes - Binary Tree :
#       - node : anything that occupies a position in the tree (leaf/elements + internal nodes/not elements)
#       - total nodes = n_leaves + internal_nodes = n + (n-1) = 2n - 1
#
#
# 
# 
#   Cost of traversing the tree :
#
#       Best Case :
#           - tree is balanced (all leaves are at about the same depth).
#           - height of tree : log n  (classic binary tree)
#           
#           Iteration over nodes : O(n)
#           Building the prefix of a node : O(log n)
#           
#           Total cost : (#nodes * depth) = 
#           O(n) * O(log n) =
#           ~ O(n log n)
#           
# 
#       Worst Case :
#           - tree is completely skewed (every internal node has one leaf child and one internal child, like a chain)
#           - height of tree : n - 1
#           
#           Iteration over nodes : O(n)
#           Building the prefix of a node : O(n)
#           
#           Total cost : (#nodes * depth) = 
#           O(n) * O(n) =
#           ~ O(n^2)
#           
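Once a codebook exists, encoding and decoding are short. The sketch below uses a hand-written prefix-free codebook for "hello" (an assumption for illustration; `huffman()` may assign different but equally short codes), and `encode` / `decode` are hypothetical helper names.

```python
def encode(text: str, codebook: dict[str, str]) -> str:
    """Replace every character by its binary code."""
    return "".join(codebook[c] for c in text)


def decode(bits: str, codebook: dict[str, str]) -> str:
    """Read bits left to right; emit a char as soon as a full code is seen."""
    reverse = {code: char for char, code in codebook.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in reverse:           # safe because no code is a prefix of another
            out.append(reverse[current])
            current = ""
    return "".join(out)


# Hand-written example: the most frequent letter 'l' gets the shortest code
codebook = {"l": "0", "h": "10", "e": "110", "o": "111"}

bits = encode("hello", codebook)
print(bits)                              # 1011000111
print(decode(bits, codebook))            # hello
print(len(bits), "bits instead of", 8 * len("hello"))   # 10 bits instead of 40
```

Decoding works without separators because the codes come from the leaves of a binary tree, so no code is the beginning of another one.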








## 4. Unweighted Graphs

### Signals

### BFS - Breadth First Search :

In [8]:
# Breadth-First Search (BFS) :
#
#
# Methodology :
#       Goal : visit all nodes reachable from start node
#       Idea : visit all the neighbors of a node before going deeper in the graph (level by level).
# 
#
#
#   1. Keep track of visited nodes to avoid cycles.
#   2. Use a queue (FIFO) to explore nodes level by level.
#   
#
#   Example :
#       Graph : 0: {1, 2}, 1: {3, 4}, ...
#       
#       1. enqueue node 0.
#       
#       while the queue is not empty :
#           2. Dequeue node 0 and visit it.
#           3. Use the visited set to make sure no node is processed twice.
#           4. Enqueue ALL neighbors of node 0 that are NOT in visited : 1, 2
#           5. etc..
#           
#          
#       
#       
#       
#       
#
# ========================================
import collections
def BFS(adj, start):
    visited = {start}                               # O(1) : the start node is already discovered
    queue = collections.deque([start])              # O(1) deque: a data structure optimized for fast FIFO operation

    while queue:                                    # O(V) : each Vertex dequeued once
        node = queue.popleft()                      # O(1)

        print("Visiting: ", node)                   # O(1)
        
        not_visited_yet = adj[node] - visited       # set difference : O(k) k = number of neighbors
        visited |= not_visited_yet                  # mark on enqueue, so a node shared by two parents is never queued twice
        queue.extend(not_visited_yet)               # O(k) per node, O(E) in total : neighbors to visit next

BFS(
    { 
        0: {1, 2}, 
        1: {3, 4}, 
        2: {5, 6}, 
        3: set(), 4: set(), 5: set(), 6: set() 
    }, 
        0   # start node
    )



# ======== COMPLEXITY =========
#
#   Line-by-Line Complexity :
#       - create visited set()                 - O(1)
#       - create queue                         - O(1)
#       
#       - while queue:                         - O(V) : each Vertex dequeued once
#       
#           - popleft()                             - O(1)
#           - add to visited set                    - O(1)
#           - check if neighbor in visited          - O(k)  , k = number of neighbors
#           - enqueue                               - O(E),  over V-iterations, each edge examined once
#       
#       
#     Average Case :
#       
#       Vertex : each point/node we want to visit (cities, people, computers, etc)
#       Edges  : connections between vertices
#       
#       - While visits each Vertex once                       - O(V)
#       - extend(adj[node] -visited) explore each edge once   - O(E) 
#       
#       Total : O(V + E)
#       
#     Notes :
#       - Undirected graphs : each edge appears in two adjacency sets, so 2E neighbor checks, which is still O(E).
#       - Why not O(V*E) ? -> we don't re-process each Vertex for each neighbor (thanks to visited)
#
#       - if instead of deque.popleft() we used list.pop(0), each dequeue would cost O(len(queue)) due to shifting the elements,
#         so the total could grow to O(V^2 + E).


# Complexity : O(V + E)
    # O(V) : (while) due to dequeuing each Vertex
    # O(E) : (queue.extend) due to examining each neighbor of each Vertex
    



#! Exam : explain O(V + E)



Visiting:  0
Visiting:  1
Visiting:  2
Visiting:  3
Visiting:  4
Visiting:  5
Visiting:  6
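The example graph above is a tree, so the visited set never has to reject anything. Here is a small sketch on an undirected graph with a cycle, where it matters; `bfs_order` and the `square` graph are made up for this illustration, and adjacency lists are used instead of sets to make the order deterministic.

```python
import collections


def bfs_order(adj, start):
    """Return the nodes reachable from start in BFS order."""
    visited = {start}
    queue = collections.deque([start])
    order = []
    while queue:
        node = queue.popleft()
        order.append(node)
        for neighbor in adj[node]:
            if neighbor not in visited:
                visited.add(neighbor)    # mark when enqueued, not when dequeued
                queue.append(neighbor)
    return order


# Undirected square 0 - 1 - 2 - 3 - 0 : every edge is stored in both directions
square = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [0, 2]}
print(bfs_order(square, 0))   # [0, 1, 3, 2]
```

Node 2 is a neighbor of both 1 and 3, but it is enqueued only once because it is marked as soon as node 1 discovers it. This is what keeps the total work at O(V + E).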


### DFS - Depth First Search :

In [8]:
# Depth-First Search (DFS) :
#
#
# Methodology :
#       Goal : visit all nodes reachable from start node
#       Idea : explore as deep as possible before backtracking.
# 
#
#
#   1. For each node in the adj, explore its first neighbor, for this neighbor explore its first neighbor, etc.
#       - Keep track of visited nodes to avoid cycles.
#   
#
#   Example :
#       Graph : 0: {1, 2}, 1: {3, 4}, ...
#       
#       1. for node in graph: explore()
#       
#       explore() :
#           1. add node to visited list.
#           2. select neighbors to be visited (not in visited)
#           3. for each neighbor : explore(neighbor)
#           4. etc..
#           
#
# ========================================
from typing import Any

def DFS(adj: dict[Any, set]):

    visited = set()                             # O(1)

    # -------------
    def explore(start):
        visited.add(start)                      # O(1)
        print(f"Exploring: {start}")

        to_visit = adj[start] - visited         # set difference : O(k) k = number of neighbors of start

        for neighbor in to_visit:               # O(k) but executed O(E) times overall
            if neighbor not in visited:         # a deeper call may have visited it in the meantime
                explore(neighbor)    
        print("Finished exploring", start, ", backtracking")           
    # -------------

    for node in adj:                            # O(V)
        if node not in visited:
            explore(node)                       # O(E) - Recursion over each edge


DFS(
    { 
        0: {1, 2}, 
        1: {3, 4}, 
        2: {5, 6}, 
        3: set(), 4: set(), 5: set(), 6: set() 
    })




# ======== COMPLEXITY =========
#
#   Line-by-Line Complexity :
#       - create visited set()                    - O(1)
#
#       - for node in adj:                        - O(V)
#
#       - explore(node) : 
#               -> will ultimately iterate
#                               over each edge    - O(E)
#
#       - "to_visit = adj[start] - visited "
#                   -> set difference :           - O(k) k = number of neighbors of start
#
#       - for neighbor in to_visit:     
#               -> O(deg(node)) per call          - O(k) but executed O(E) times overall (called by explore())
#               -> summed over all nodes = O(E)
#
#
#
#   Average Case :
#       - No Best/Worst Case: the traversal always does the same amount of work on a given graph
#
#       Total Complexity : O(V + E)
#           - O(V) : Iterates over all vertices once (vertices given in adj)          - 0, 1, ..., V-1
#           - O(E) : Iterates over every adjacency set once, i.e. over every edge     - {1,2}, {3,4}, ...
#
#
#   Comparison BFS vs DFS :
#       - Both have the same time complexity O(V + E):
#           every vertex is visited once and every edge is examined once.
#       - The difference lies in the traversal order, not in the complexity.
#
#       BFS :
#           - finds shortest path in unweighted graphs
#           - explores all neighbors before going deeper (level by level)
#
#       DFS :
#           - Finds a path, not necessarily shortest
#           - Goes deep before trying alternatives (backtracking)


Exploring: 0
Exploring: 1
Exploring: 3
Finished exploring 3 , backtracking
Exploring: 4
Finished exploring 4 , backtracking
Finished exploring 1 , backtracking
Exploring: 2
Exploring: 5
Finished exploring 5 , backtracking
Exploring: 6
Finished exploring 6 , backtracking
Finished exploring 2 , backtracking
Finished exploring 0 , backtracking
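To see that BFS and DFS differ only in the order of exploration, here is a sketch of an iterative DFS that swaps the FIFO queue for a LIFO stack. `dfs_order` is a hypothetical helper, and the graph uses lists instead of sets so that the order is deterministic.

```python
def dfs_order(adj, start):
    """Return the nodes reachable from start in depth-first order."""
    visited = set()
    stack = [start]                      # LIFO: the last node pushed is explored first
    order = []
    while stack:
        node = stack.pop()               # O(1)
        if node in visited:
            continue                     # it was pushed twice, skip the second copy
        visited.add(node)
        order.append(node)
        # Push in reverse so that the first neighbor is popped first
        for neighbor in reversed(adj[node]):
            if neighbor not in visited:
                stack.append(neighbor)
    return order


tree = {0: [1, 2], 1: [3, 4], 2: [5, 6], 3: [], 4: [], 5: [], 6: []}
print(dfs_order(tree, 0))   # [0, 1, 3, 4, 2, 5, 6]
```

The order matches the "Exploring" lines of the recursive version above, while BFS on the same graph gives 0, 1, 2, 3, 4, 5, 6. The iterative form also avoids Python's recursion limit on very deep graphs.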


### Shortest Path Algorithm (BFS)

#### SP - Get distance to start:

In [None]:
# Shortest-Path algorithm using BFS :
# 
# Methodology :
#       Goal : compute the shortest distance (number of edges) from the start node to every reachable node.
# 
#       Idea :
#           - replace the visited set() by a dict dist, which also tracks the distance of each node to the start node.
# 
#   Parent Pointers :
#       - track the parent node that brought us to the current node (used to rebuild the actual path).
# 
#   Why BFS and not DFS ?
#       - BFS explores level by level : All nodes at distance k are visited before distance k+1
#       - First time you visit a node = shortest path found
#       - DFS goes deep first, not short first.
#           -> ex. start=0, dest=9 : DFS might go 0->1->3->7->9 (length 4) before exploring 0->2->9 (length 2)
#       - DFS does not explore by distance : First found path ≠ shortest path
#
#       -> BFS finds shortest paths in unweighted graphs because it explores vertices in increasing order of distance, 
#           whereas DFS may explore longer paths before shorter ones.
#
#   Example :
#       Graph : 0: {1, 2}, 1: {3, 4}, ...
#
#       1. enqueue node 0, dist[0] = 0
#       while :
#           2. Dequeue/pop node 0
#           3. dist[1] = dist[0] + 1 = 1  (distance of the parent node + 1)
#           4. append/enqueue neighbor 1 to queue
#
#
#
#       START : 
#           queue = [0], 
#           dist = {0:0}
#
#       POP queue : 0
#           dist[0] = 0
#           new neighbor : 1
#           neighbor 1 in dist ? No
#           add neighbor 1 to dist, with distance dist[1] = dist[0] + 1 = 1
#           queue = [1] ; add neighbor 1 to queue
#           
#           same for neighbor 2 : dist[2] = 1 and queue = [1,2]
#
#       POP queue : 1
#           dist[1] = 1
#           new neighbor : 3
#           neighbor 3 in dist ? No
#           add neighbor 3 to dist, with distance dist[3] = dist[1] + 1 = 2
#           queue = [2,3] ; add neighbor 3 to queue
# 
#           etc...
#
# ========================================
import collections
def shortest_path(adj, start):

    # store the shortest-known distance to a node, init with start
    dist = {start: 0}                               # O(1)
    queue = collections.deque([start])              # O(1)
    
    while queue:                                    # O(V) iterations - each node is enqueued at most once
        node = queue.popleft()                      # O(1)
        print("Visiting: ", node)

        for neighbor in adj[node]:                  # O(E) in total - each edge is scanned once, when its source node is dequeued
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1     # O(1)
                queue.append(neighbor)              # O(1)
    print(dist)

shortest_path(
    { 
        0: {1, 2}, 
        1: {3, 4}, 
        2: {5, 6}, 
        3: set(), 4: set(), 5: set(), 6: set() 
    }, 
        0
    )

# step by step:
    # queue = [0], dist = {0: 0}
    # node = 0, queue = []
    # neighbors 1 and 2 are not in dist yet : dist[1] = dist[2] = dist[0] + 1 = 1
    # queue = [1,2], dist = {0: 0, 1: 1, 2: 1}

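# ======== COMPLEXITY =========
#
#   Line-by-Line Complexity :
#       - init dist and queue :                 - O(1)
#       - while queue :                         - O(V) iterations, each node is enqueued at most once
#       - for neighbor in adj[node] :           - O(E) in total, each edge is scanned once
#
#   O(V) : Iterating over adj       -> vertices
#   O(E) : Iterating over adj[node] -> edges out of that node
#
#   Adding a queue does not add to the complexity : all operations are O(1)
#
#   Total Complexity : O(V + E)
#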


Visiting:  0
Visiting:  1
Visiting:  2
Visiting:  3
Visiting:  4
Visiting:  5
Visiting:  6
{0: 0, 1: 1, 2: 1, 3: 2, 4: 2, 5: 2, 6: 2}
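
The "Why BFS and not DFS ?" argument can be checked on a small graph. The sketch below is only an illustration: the graph `adj_demo` and the helpers `bfs_distance` and `dfs_first_path_length` are hypothetical names made up for this example (the graph mirrors the `0->1->3->7->9` versus `0->2->9` case from the comments). Neighbors are stored in lists, an assumption made so that the DFS visiting order is deterministic.

```python
import collections

# Hypothetical graph: a long route 0->1->3->7->9 and a short route 0->2->9.
adj_demo = {0: [1, 2], 1: [3], 2: [9], 3: [7], 7: [9], 9: []}

def bfs_distance(adj, start, target):
    """Number of edges on a shortest path start -> target, or None if unreachable."""
    dist = {start: 0}
    queue = collections.deque([start])
    while queue:
        node = queue.popleft()
        if node == target:
            return dist[node]
        for neighbor in adj[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return None

def dfs_first_path_length(adj, start, target):
    """Length of the FIRST path a recursive DFS finds (not necessarily the shortest)."""
    visited = set()

    def explore(node, depth):
        if node == target:
            return depth
        visited.add(node)
        for neighbor in adj[node]:
            if neighbor not in visited:
                found = explore(neighbor, depth + 1)
                if found is not None:
                    return found
        return None

    return explore(start, 0)

print(bfs_distance(adj_demo, 0, 9))           # 2  (0 -> 2 -> 9)
print(dfs_first_path_length(adj_demo, 0, 9))  # 4  (0 -> 1 -> 3 -> 7 -> 9)
```

DFS commits to neighbor 1 first and reaches 9 after four edges, while BFS finishes the whole distance-1 layer before going deeper and therefore reports 2.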


#### SP - Visualize:

In [11]:
# see 4_Unweighted_Graphs.ipynb for more details

#### SP - Parent Pointers:

In [10]:
# Parent Pointers : Obtain the ACTUAL shortest path 
# 
# 
# 
#   Idea :
#       - when we enqueue a neighbor, store who discovered it : 
#       - parent[child] = current_node
# 
#   Complexity :
#       - does not add to the complexity : O(V + E)
#       - but reconstruct has its own complexity.
# 
# ====================
import collections
def shortest_path_with_parents(adj, start):

    dist = {start: 0}                               # dist from start
    parent = {start: None}                          # parent pointers (who discovered who)
    queue = collections.deque([start])              # bfs queue
    
    while queue:                                    # 
        node = queue.popleft()                      # 
        print("Visiting: ", node)

        for neighbor in adj[node]:                  # 
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1     # 
                parent[neighbor] = node             # add parent "discovered this child" pointer
                queue.append(neighbor)              # 
    # print(dist)
    return dist, parent

# =========================================
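def reconstruct_path(parent, start, target):
    if target not in parent:        # target was never discovered by the BFS -> not reachable
        return None

    path = []                       # O(1)
    current = target                # O(1)

    while current is not None:      # O(L) , L = length of path from target back to start
        path.append(current)        # O(1)
        current = parent[current]   # O(1)

    path.reverse()                  # O(L)

    # the walk always ends at the BFS root ; it must be `start`, otherwise parent was built from another source
    return path if path[0] == start else None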

adj = { 
    0: {1, 2}, 
    1: {3, 4}, 
    2: {5, 6}, 
    3: set(), 4: set(), 5: set(), 6: set() 
}

dist, parent = shortest_path_with_parents(adj, 0)

print("Distances:", dist)
print("Parents:", parent)

print("Shortest path 0 -> 6:", reconstruct_path(parent, 0, 6))



# ======== COMPLEXITY =========
#
#
#   Line-by-Line Complexity :
#       - while :                                   - O(L) , L = length of path from target to start
#      
#       - path.reverse()                            - O(L)
#
#
#   Total Complexity : O(L) + O(L) = O(L)
#
#       Best Case : 
#           - target is start : L = 0
#       ~ O(1)
#
#       Worst Case :
#           - graph is one long chain and target is its last node : L = V - 1
#       ~ O(V)
#
#
#
#
#






Visiting:  0
Visiting:  1
Visiting:  2
Visiting:  3
Visiting:  4
Visiting:  5
Visiting:  6
Distances: {0: 0, 1: 1, 2: 1, 3: 2, 4: 2, 5: 2, 6: 2}
Parents: {0: None, 1: 0, 2: 0, 3: 1, 4: 1, 5: 2, 6: 2}
Shortest path 0 -> 6: [0, 2, 6]
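
When only one destination matters, the BFS and the parent-pointer walk can be merged and stopped as soon as the target is dequeued. The following is a minimal sketch under that assumption; `bfs_path` and `adj_demo` are hypothetical names, and node 7 is an extra isolated node added here only to show the unreachable case.

```python
import collections

def bfs_path(adj, start, target):
    """Shortest path start -> target as a list of nodes, or None if unreachable."""
    parent = {start: None}              # doubles as the "visited" structure
    queue = collections.deque([start])

    while queue:
        node = queue.popleft()
        if node == target:              # early exit: first dequeue of target is the shortest
            path = []
            while node is not None:     # follow parent pointers back to start
                path.append(node)
                node = parent[node]
            return path[::-1]
        for neighbor in adj[node]:
            if neighbor not in parent:
                parent[neighbor] = node
                queue.append(neighbor)

    return None                         # queue exhausted without meeting target

adj_demo = {
    0: {1, 2}, 1: {3, 4}, 2: {5, 6},
    3: set(), 4: set(), 5: set(), 6: set(),
    7: set(),                           # isolated node (assumption for the demo)
}

print(bfs_path(adj_demo, 0, 6))   # [0, 2, 6]
print(bfs_path(adj_demo, 0, 7))   # None : 7 is never discovered
print(bfs_path(adj_demo, 3, 0))   # None : edges are directed, 3 has no outgoing edge
```

The worst case stays O(V + E), but the search stops early whenever the target is close to the start.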


### Topological Sort

In [14]:
# see 4_Unweighted_Graphs.ipynb for more details
#   explanations and examples.
#

In [12]:
# Topological Sort - DFS implementation
#
#
#   Methodology :
#       Goal : 
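#           - find one ordering of the tasks that respects all “must happen before” constraints.
#           - i.e. every arrow u -> v points forward : u appears before v in the result.
#           - this does not find ALL the valid orderings, just one that works.
#
#   Why DFS and not BFS ?
#       - DFS naturally explores “dependencies first.”
#       - A node is appended to `ordered` only when everything reachable from it is finished.
#       - DFS: go as deep as possible, finish all the later tasks → then mark your task done. Reversing `ordered` puts each task before the ones that depend on it.
#       - BFS: visits everything layer by layer → a node can be reached before all of its prerequisites are scheduled
#           (ex. 6 is a direct neighbor of 1 but must come last).
#       - A BFS-style alternative exists (Kahn's algorithm, based on in-degrees), but it is not the one used here.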
#
#
#   Example : 
#       See DFS example above.
#
#
#!   Important Note :
#       - bug in teacher code : the (if node in visited) check must happen INSIDE the explore() function,
#           otherwise nodes already visited will be re-explored when calling the recursion explore(neighbor).
#           Before : [1, 6, 5, 4, 5, 3, 2, 6, 3, 6] (duplicates !)
#           After  : [1, 6, 5, 4, 3, 2]

# =================================================

#? models “must happen before” constraints: task scheduling, build systems, course prerequisites, etc.

from typing import Any

def topological_sort(adj: dict[Any, set]) -> list:
    visited = set()                             # O(1)
    ordered = []                                # O(1)

    def explore(start):
        """
            One call costs O(k), k = len(adj[start]) : building to_visit + looping over it.
            Summed over all the vertices, the k's add up to O(E).
        """
        if start in visited: return             # O(1)
        visited.add(start)                      # O(1) - mark as visited
        print(f"Visiting: {start}")             # 

        to_visit = adj[start] - visited         # O(k) - set difference

        for neighbor in to_visit:               # O(k) iterations, k = len(adj[start])
            explore(neighbor)                   # each edge triggers at most one call -> O(E) overall

        print(f"Finish exploring: {start}")     #
        ordered.append(start)                   # O(1)
        # ordered.insert(0, start)              # -alternative way, no need to reverse

    for node in adj:                            # O(V) - iterate over each vertex
        explore(node)                           # returns immediately if node was already visited

    return list(reversed(ordered))            # O(V) - reverse the ordered list



# Example : Morning Routine
routine  = { 
    1: {2,3,4,5,6},   # 1 - Wake up 
    2 : {3, 6},       # 2 - eat breakfast
    3 : {6},          # 3 - brush teeth
    4 : {5, 6},       # 4 - shower
    5 : {6},          # 5 - get dressed
    6 : set()         # 6 - goto school
}

res = topological_sort(routine)                 # already a list
print(res)


# ======== COMPLEXITY =========
#
#   Line-by-Line Complexity :
#
#       - for node in adj: explore()              - O(V)
#           
#       - for neighbor in to_visit: explore()     - O(E)
#
#
#
#   Total Complexity :  
#
#       - O(V) : iterating over all nodes
#
#       - O(E) : exploring each edge once
#
#       - O(V) : reversing the final list
# 
# 
#       Total : O(V) + O(E) + O(V)  = O(2V + E) 
#
#       ~ O(V + E)
# 
# 


Visiting: 1
Visiting: 2
Visiting: 3
Visiting: 6
Finish exploring: 6
Finish exploring: 3
Finish exploring: 2
Visiting: 4
Visiting: 5
Finish exploring: 5
Finish exploring: 4
Finish exploring: 1
[1, 4, 5, 2, 3, 6]
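
For comparison, here is a sketch of a different technique, Kahn's algorithm, which is the BFS-style way to get a topological order. It is not the DFS method used in this notebook. The names `kahn_topological_sort` and `is_valid_order` are hypothetical, and the sketch assumes every neighbor also appears as a key of `adj`. The checker is useful because several orderings can be valid, so two correct algorithms may print different lists.

```python
import collections

def kahn_topological_sort(adj):
    """Topological order via in-degrees (Kahn's algorithm). Raises ValueError on a cycle."""
    indegree = {node: 0 for node in adj}
    for node in adj:
        for neighbor in adj[node]:
            indegree[neighbor] += 1             # one more prerequisite for neighbor

    # tasks without prerequisites can be scheduled immediately
    queue = collections.deque(node for node in adj if indegree[node] == 0)
    ordered = []

    while queue:
        node = queue.popleft()
        ordered.append(node)
        for neighbor in adj[node]:
            indegree[neighbor] -= 1             # node is done: one prerequisite less
            if indegree[neighbor] == 0:
                queue.append(neighbor)

    if len(ordered) != len(adj):
        raise ValueError("graph has a cycle: no topological order exists")
    return ordered

def is_valid_order(adj, order):
    """True if every edge u -> v has u placed before v in order."""
    position = {node: i for i, node in enumerate(order)}
    return len(order) == len(adj) and all(
        position[u] < position[v] for u in adj for v in adj[u]
    )

routine = {1: {2, 3, 4, 5, 6}, 2: {3, 6}, 3: {6}, 4: {5, 6}, 5: {6}, 6: set()}

print(is_valid_order(routine, kahn_topological_sort(routine)))  # True
print(is_valid_order(routine, [1, 4, 5, 2, 3, 6]))              # True : the DFS result above
print(is_valid_order(routine, [1, 6, 5, 4, 3, 2]))              # False : 6 placed too early
```

Both approaches run in O(V + E). Kahn's version also detects cycles for free, since a node on a cycle never reaches in-degree 0.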


## 5. Weighted Graphs

### Dijkstra

In [None]:
# Dijkstra's Algorithm : 
#
#
#!   Note :
#       - O((V+E) log V) is the general worst-case bound (with a binary heap), not an average case.
#       - On a sparse graph it simplifies to O(V log V), ex. linear graph with E = V - 1.
#       - On a dense graph it simplifies to O(E log V) = O(V^2 log V), because E ~ V^2.
#
#
#
#   Methodology :
#       Goal : 
#           - Given a weighted graph (non-negative edge weights, e.g. travel time in minutes).
#           - from start `s`, find the shortest path between `s` and every other node (city).
#
#       Idea : 
#           - Cities = Nodes, 
#           - Roads= Edges with weights (travel time)
#           - You start in city `A`, what is the shortest path to every other city ?
#           
#           - Dijkstra's Rule : 
#                   "Always visit the closest unvisited city next."
#           
#           - Priority Queue :
#                   always gives the next city to visit (e.g. closest city, least cost).
#
#           Greedy Approach : 
#               - same idea as BFS, but edges have a weight (cost, ex. travel time). Always expand the city with the smallest total distance from the start.
#
#
#   Example :
#       1. Create a dictionary with all the distances initialized to infinity, and distance to start = 0.
#       2. Add city 'A' to the queue, so we explore its neighbors first.
# 
#       while queue :
#           3. Always extract the node with the smallest known distance (closest city).
#           4. Explore all the neighbors.
#           5. estimate the distance from the start to this neighbor, going through the current city.
#           6. if this estimate is better than the previously known distance, update it.
#                   and add it to the queue !
#                    
#        Note :
#           - a city can appear twice or more in the queue : it is pushed again each time a better distance is found.
#           - the older entries (with a larger distance) are outdated : popping them cannot improve any distance.
#           
#           
# =================================
import heapq
def dijkstra(adj, start):
    # Init
    dist = {u : float('inf') for u in adj}                  # O(V)
    dist[start] = 0
    queue = [(0, start)]

    while queue:                                            # at most E + 1 pops in total (one per push)
        distance, node = heapq.heappop(queue)               # O(log V) : heap operation
        if distance > dist[node]: continue                  # outdated entry : a shorter path to node was already processed
        print("Visiting: ", node)

        for neighbor in adj[node]:                          # O(E) in total : each edge is scanned once, when its source node is visited
            estimate = distance + adj[node][neighbor]
            if estimate < dist[neighbor]:
                dist[neighbor] = estimate
                heapq.heappush(queue, (estimate, neighbor)) # O(log V) : heap operation
                
    return dist
    
# Total : O((V + E) log V)

# dijkstra({
#     'A': {'B': 4, 'C': 1},
#     'B': {},
#     'C': {'B': 2}
# }, 'A')


# ======== COMPLEXITY =========
#
#   Line-by-Line Complexity :
#       - init dictionary dist :                            - O(V)    
# 
#       - while queue :                                     - O(V) useful pops, worst case : O(E) pops in total
#           as many pops as pushes !
#           same node can be pushed multiple times 
#           but some pops are outdated (distance > dist[node]) and do no useful work
#           Total  pops <= E + 1
#           Useful pops <= V
#
#       - heappop()                                         - O(log V)
#       ~ O(E * log V)
#
#
#       - for neighbor in adj[node]:                        - O(E) in total
#       - heappush()                                        - O(log V)
# 
#       ~ O(E * log V)
# 
# 
#   Best Case :
#       - Linear graph (linked list) : A -> B -> ... -> V
#       - V : vertices
#       - E = V - 1
# 
#       Total Complexity :
#           - O(V) : init dictionary
#           - O(V * log V) : vertices are popped only once (total V pops).
#           - O(E * log V) : edges    are pushed only once (total E pushes).
# 
#        ~ O((V + E) log V),  but since E = V - 1 
#        ~ O(V log V)
# 
# 
# 
#   Worst Case :
#       - Dense graph (complete graph) : each node connected to every other node.
#       - V : vertices
#       - E = V * (V - 1) / 2    ~ (V^2)
#    
#       Total Complexity :
#           - O(V) : init dictionary
#           - O(E * log V) : every push is eventually popped (at most E + 1 pops).
#               The test 'if estimate < dist[neighbor]:' can only reduce the number of pushes,
#               so E pushes is an upper bound, not an exact count.
#
#           - O(E * log V) : at most one push per edge (total E pushes).
# 
#        ~ O((V + E) log V), but since E ~ V^2
#           ~ O(E log V) 
#               ~ O(V^2 log V)
# 
# 
# 
#   Using Sort instead of Priority Queue :
# 
#       """ while queue:
#            queue.sort(key=lambda x: x[0])  # O(k log k), k = current queue size
#            distance, node = queue.pop(0)   # O(k) : every remaining element shifts one place left
#       """
# 
#       - At each step, we select the vertex with the smallest distance by sorting the entire list.
# 
#       Worst Case Complexity :
#           - Queue can grow up to size E.
#           - Sorting the queue takes O(E log E) time.
#           - This sorting happens E times (while queue).
#        - Total Complexity : O(E^2 log E)
# 
# 
#       Best Case Complexity :
#           - Each vertex has only one outgoing edge.
#           - Queue has never more than 1 elem.
#           - Sorting takes O(1) time. done O(V) times.
#           - Edges iterations done O(E) times.
#         - Total Complexity : O(V + E) = O(V) since E = V - 1
#
#
#
#
#
#
#

#! Exam : Why a priority queue (heap) and not a sort ?
    # we only need the smallest element at each step, not a fully sorted list
    # heapq implements a min-heap : push and pop cost O(log n) each, versus O(n log n) to re-sort the whole list every time
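
As with BFS, parent pointers can be added to recover the actual route and not only the distances. The sketch below is an assumption-laden variant, not the course implementation: `dijkstra_with_parents` and `path_to` are hypothetical names, and it assumes non-negative weights and that every neighbor is also a key of `adj`. It also skips outdated heap entries, which is the "ignored pops" idea from the complexity notes.

```python
import heapq

def dijkstra_with_parents(adj, start):
    """Return (dist, parent) for non-negative edge weights."""
    dist = {node: float('inf') for node in adj}
    dist[start] = 0
    parent = {start: None}
    queue = [(0, start)]

    while queue:
        distance, node = heapq.heappop(queue)
        if distance > dist[node]:
            continue                            # outdated entry, a better one was already handled
        for neighbor, weight in adj[node].items():
            estimate = distance + weight
            if estimate < dist[neighbor]:
                dist[neighbor] = estimate
                parent[neighbor] = node         # remember who gave the best distance so far
                heapq.heappush(queue, (estimate, neighbor))

    return dist, parent

def path_to(parent, target):
    """Rebuild the path from the start to target, or None if target was never reached."""
    if target not in parent:
        return None
    path = []
    while target is not None:
        path.append(target)
        target = parent[target]
    return path[::-1]

graph = {'A': {'B': 4, 'C': 1}, 'B': {}, 'C': {'B': 2}}
dist, parent = dijkstra_with_parents(graph, 'A')

print(dist)                  # {'A': 0, 'B': 3, 'C': 1}
print(path_to(parent, 'B'))  # ['A', 'C', 'B']
```

On this graph `B` is pushed twice, first with distance 4 (direct road) and then with distance 3 (via `C`); the entry with 4 is popped last and skipped.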





### Bellman-Ford

In [None]:
# Bellman-Ford Algorithm :
# 
#!   Note :
#       - Classic Bellman-Ford is O(V*E)
#       - We implement BF with memoization + recursion (dynamic programming).
# 
#
# Subproblems :
#   Find shortest path between city `s` and city `v` using at most k edges.
#
# Base-case : 
#   BF(v, 0) = (0 if v=s) , (+inf otherwise)
#
# Guess :
#   last stop on path
# 
# Recurrence :
#   last stop before v : w
#   BF(v, k) = min( BF(v, k-1) , min over w [ BF(w, k-1) + weight(w, v) ] )
# 
# Complexity :
#   = number of subproblems * time/sub = O(V^2) * O(V) = O(V^3)
# 
# =======================================================

# compare direct path s --- v with indirect path s --- u --- v:
    # best( s->v )   ?<=   best( s->u ) + weight( u->v ) 


import functools
def bellman_ford(adj, S):    # S ---- B or S --- prev_neighbor --- B
    @functools.cache
    def BF(B, steps_k: int):
        """Find the shortest distance from start S to node B using at most steps_k edges."""
        # Base-case
        if (steps_k == 0): return 0 if (B==S) else float('inf')         # O(1)

        # Recurrence (guess last step)
        best = BF(B, steps_k-1)                                         # O(1) - from caching      

        for prev_neighbor in adj:                                       # O(V)
            if B in adj[prev_neighbor] :
                weight = adj[prev_neighbor][B]
                candidate = BF(prev_neighbor, steps_k -1) + weight      # O(1) 
                if candidate < best:
                    best = candidate
        return best
        # return min([      # list comprehension version
        #     BF(v, k-1),
        #     *[BF(u, k-1) + adj[u][v] for u in adj if v in adj[u]]
        # ])

    return { B: BF(B, len(adj) -1) for B in adj}                        # V top-level calls ; together they fill O(V^2) memoized subproblems



adj = {
    'A': {'B': 2, 'C': 4},
    'B': {'C': -2},
    'C': {}
}
bellman_ford(adj , 'A')




# ========== COMPLEXITY ==========
#
#
#   Idea : 
#       - Make k=V-1 iterations, each time relaxing all edges.
#       - relaxing : improving the current best path.
#       - Why making multiple iterations/relaxations ? 
#           -> Improving u might later improve v
#           -> Improving v might later improve x
#           -> Etc.
#           -> Bellman–Ford repeats relaxation V−1 times so improvements can “ripple” across the graph.
#
#
#   Complexity :
#       Subfunction BF defines our subproblems : "shortest distance from S → B using at most steps_k edges ?"
#
#
#       - def BF(B, steps_k) : 
#                B : V possible nodes               ->  O(V)
#                k : from 0 to V-1 edges            ->  O(V)
#
#           Total subproblems =               ~  O(V^2)
#
#       - time/subproblem :
#           best = BF(B, steps_k - 1) :      ->  O(1)  (from caching)
#           for prev_neighbor in adj  :      ->  O(V)
#           candidate = ...           :      ->  O(1)  (from caching) 
#                        
#            Cost of ONE subproblem =        ~  O(V)
#           
#           
#       Total : O(V^3)
#
#
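#       Note : 
#           - Without memoization the same subproblems are recomputed over and over : exponential time, roughly O(V^V)
#           - Finding shortest path between any pair : V-cities * BF = O(V^4)
#           - Number of subproblems is always : O(V * V)
#           - But cost/subproblem can be :
#                   -> O(V) for Vertex scanning (this code)       -> Total : O(V * V * V)
#                   -> O(E) for Edge scanning                     -> Total : O(V * V * E)   
#                   -> O(in-degree of B) if predecessors are stored : the in-degrees sum to E for each k
#                                                                 -> Total : O(V * (V + E)), the same order as the classic O(V*E)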
#
#
#
#
#   Time Complexity:
#       - The function BF is called for each node v and each number of edges k from 0 up to |V| - 1, where |V| is the number of vertices.
#       - Since memoization ensures each (v, k) pair is computed once, the total number of unique calls is O(|V| * |V|) = O(|V|^2).
#       - Each call loops over every vertex of adj to find the predecessors of v, which always costs O(|V|).
#       - Therefore, the total time complexity is O(|V|^3).
#
#
#
#
#
#
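#   Comparison with Dijkstra :
#       - Dijkstra's O((V + E) log V) is typically faster than BF's O(V * E) (O(V^3) for this memoized version).
#
#       - BF can handle negative weights. The classic version can also detect negative cycles with one extra relaxation round (not implemented here).
#       - Dijkstra only works for non-negative weights, but is faster.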
#
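
The "relax all edges V-1 times" idea from the complexity notes can also be written as plain loops. This is a sketch of the classic iterative formulation, not the memoized recursion used above; `bellman_ford_iterative` is a hypothetical name, and the sketch assumes every neighbor is also a key of `adj`. One extra pass over the edges is the usual way to detect a negative cycle reachable from the start.

```python
def bellman_ford_iterative(adj, start):
    """Classic O(V * E) Bellman-Ford. Raises ValueError on a reachable negative cycle."""
    dist = {node: float('inf') for node in adj}
    dist[start] = 0
    edges = [(u, v, w) for u in adj for v, w in adj[u].items()]

    for _ in range(len(adj) - 1):               # a shortest path uses at most V - 1 edges
        changed = False
        for u, v, w in edges:                   # relax every edge once per round
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                changed = True
        if not changed:                         # nothing improved: distances are final
            break

    for u, v, w in edges:                       # extra round: any improvement means a negative cycle
        if dist[u] + w < dist[v]:
            raise ValueError("negative cycle reachable from start")

    return dist

graph = {'A': {'B': 2, 'C': 4}, 'B': {'C': -2}, 'C': {}}
print(bellman_ford_iterative(graph, 'A'))       # {'A': 0, 'B': 2, 'C': 0}
```

Each round costs O(E) and there are at most V - 1 rounds plus the check, which gives the O(V * E) bound quoted in the note at the top of the cell.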








### Floyd-Warshall

In [None]:
# Floyd - Warshall Algorithm :
# 
#   Instead of finding the shortest path from a single source (Bellman-Ford),
#   find shortest paths between ALL pairs of nodes.
# 
#   + faster to find all the pairs (gradually builds the answer for every pair, NOT the shortest paths of one city at a time)
#   - slower to find the path of a single pair (need to compute all pairs first)
# 
# 
#   Idea :
#       1. Get list of all vertices V = [V_1, V_2, ..., V_n]
#       
#       2. let u = V_1,... V_n and v = V_1,... V_n
#           for all pairs (u,v), find shortest path FW(u, v, k), starting with k = len(V)
#       
#       Note : k is only an index into V : FW(u, v, k) may use V_1..V_k as intermediate cities
#               - V[k-1] will ultimately iterate on all the cities from right to left.
#       
#       3. Subproblem :
#           - return if u and v are the same city, distance = 0
#           - if k = 0 (no intermediate cities allowed, direct path only) :
#               -> try and find if this path exist in adj (ex. A -> B : cost 3)
#               -> if not, cities unreachable with k=0, distance = +inf
#            
#           Recurrence :
#               - Idea : compare if  V_k is a shortcut between u and v
#               
#               intermediate = vertices[k - 1]
#                   -> select city from V=['A', 'B', 'C', ...] in reverse order
#               
#               - Case 1 : dont go through V_k
#                   -> start → end
#                   -> cost = FW(start, end, k-1)
#                
#               - Case 2 : go through V_k
#                   -> start → V_k → end
#                   -> cost = (start → V_k)       + (V_k → end)
#                   ->      = FW(start, V_k, k-1) + FW(V_k, end, k-1)
#       
#               - best = min( Case 1 , Case 2 )
# 
# 


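# Subproblems :
    # FW(u, v, k) = shortest distance from u to v using only V_1..V_k as intermediate cities

# Base-case : 
    # u and v are fixed by the caller, only k changes
    # FW(u, v, 0) = (0 if u=v) , (weight(u,v) if the edge u -> v exists) , (+inf otherwise)

# Guess :
    # does the shortest path go through V_k or not ?

# Recurrence :   
        # let path : u --(1)--- V_k ---(2)--- v
        # FW(u,v,k) = min( FW(u,v,k-1) , (1) + (2) )  with  (1) + (2) = FW(u,V_k, k-1) + FW(V_k, v, k-1)
    #

# Complexity :
    # O(V^3) subproblems * O(1) time/sub = O(V^3)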
# =======================================================

import functools
def floyd_warshall(adj):
    vertices = list(adj.keys())                                 # O(V)

    @functools.cache
    def FW(start, end, k):
        # Base case --------------------------------------------# O(1)
        if start == end: return 0                 # O(1)
        if k==0:                                  # O(1)
            if end in adj[start]: 
                cost = adj[start][end]
                return cost
            else:
                return float('inf')
        
        # Recurrence -------------------------------------------# O(1) per subproblem - three memoized lookups + one min
        intermediate = vertices[k - 1]

        # Case 1 : path doesnt go through V_k
        cost_without = FW(start, end, k - 1)      # O(1) - memo

        # Case 2 : path goes through V_k
        cost_to_intermediate = FW(start, intermediate, k - 1)  # O(1) - memo
        cost_from_intermediate = FW(intermediate, end, k - 1)  # O(1) - memo

        cost_with = cost_to_intermediate + cost_from_intermediate

        best = min(cost_without, cost_with)


        return best

    res = {}

    for u in vertices:                                          # O(V)                   
        for v in vertices:                                      # O(V)
            print(f"Computing shortest path from {u} to {v}")
            res[(u, v)] = FW(u, v, len(vertices))               # top-level call ; the recursion fills the memo for k = 0..V
    
    return res


adj = {                 # list from : 
    'A': {'C': -2},
    'B': {'A': 4, 'C': 3},
    'C': {'D': 2},
    'D': {'B': -1}
}

adj_prof = {
    'A': {'B': 2, 'C': 4},
    'B': {'C': -2},
    'C': {}
}

print(floyd_warshall(adj_prof))




# =========== COMPLEXITY ==========
#
#
#   Number of subproblems :
#           - for u in vertices:                                 - V choices
#           - for v in vertices:                                 - V choices
#           
#           - recursion inside FW(u,v,k) with k from 0 to V :    - V choices
#               this last O(V) comes from the fact that k 
#               iterates from 0 to V.
#           
#           ex. (A, A, 0), (A, A, 1), (A, A, 2), ..., (A, A, V)  
#               (A, B, 0), (A, B, 1), (A, B, 2), ..., (A, B, V)
#               (A, C, 0), (A, C, 1), (A, C, 2), ..., (A, C, V)
#               ...
#               (B, A, 0), (B, A, 1), (B, A, 2), ..., (B, A, V)
#               ...
#               (C, A, 0), (C, A, 1), (C, A, 2), ..., (C, A, V)
#               (C, B, 0), (C, B, 1), (C, B, 2), ..., (C, B, V)
#               (C, C, 0), (C, C, 1), (C, C, 2), ..., (C, C, V)
#           
#           
#           
#           Total subproblems = O(V^3)
#
#   Time/subproblem :
#       - memoization allows for each subproblem to cost only O(1) time.
#       - O(1)
#
#
#
#   No memoization :
#       - Each call to FW(start,end,k) makes 3 more recursive calls
#       - each of them calls again with k - 1.
#
#       - This leads to a ternary tree (3 children per call) with height k = V.
#       - With 3 branches created at each level, the total number of calls is O(3^V).
#       - 
#       - Total complexity without memoization : O(V^2 * 3^V)
#       ~ exponential in V
#








# ====== List comprehension =========
import functools

# u : start
# v : end (target)
# k : only V[0..k-1] may be used as intermediate vertices

def floyd_warshall(adj):
    V = list(adj.keys())

    @functools.cache
    def FW(u, v, k):
        # Base-case
        if (u == v): return 0
        if (k == 0): return adj[u][v] if v in adj[u] else float('inf')

        return min([
            # Fastest way doesn't go through V_k
            FW(u, v, k-1),
            # Fastest way goes through V_k : u -> V_k -> v
            FW(u, V[k - 1], k-1) + FW(V[k - 1], v, k-1)
        ])

    return {(u, v): FW(u, v, len(V)) for u in V for v in V}


# floyd_warshall({
#     'A': {'B': 2, 'C': 4},
#     'B': {'C': -2},
#     'C': {}
# })


# Complexity :
    # number of poss : u->V, v->V and k->V
    # O( V^3 ) subproblems
    # O(1) time/sub

    # total: O( V^3 )
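
The same recurrence is usually written bottom-up as three nested loops, with the intermediate vertex in the outermost loop. The sketch below shows that textbook form, not the memoized recursion from this notebook; `floyd_warshall_iterative` is a hypothetical name, and the sketch assumes the graph has no negative cycle and that every neighbor is also a key of `adj`.

```python
def floyd_warshall_iterative(adj):
    """All-pairs shortest distances, O(V^3) time, O(V^2) memory."""
    vertices = list(adj)

    # k = 0 : direct edges only (same base case as FW(u, v, 0))
    dist = {
        (u, v): 0 if u == v else adj[u].get(v, float('inf'))
        for u in vertices for v in vertices
    }

    for mid in vertices:                        # allow one more intermediate vertex per round
        for u in vertices:
            for v in vertices:
                through = dist[(u, mid)] + dist[(mid, v)]
                if through < dist[(u, v)]:      # is `mid` a shortcut between u and v ?
                    dist[(u, v)] = through

    return dist

graph = {'A': {'B': 2, 'C': 4}, 'B': {'C': -2}, 'C': {}}
result = floyd_warshall_iterative(graph)

print(result[('A', 'C')])   # 0    (A -> B -> C = 2 - 2)
print(result[('B', 'C')])   # -2
print(result[('C', 'A')])   # inf  (C has no outgoing edge)
```

This version needs no recursion and no cache: the dictionary `dist` plays the role of the memo, and round `mid` overwrites the values of the previous round in place.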

#### Detailed Example :

In [None]:
# Example :
#
#     'A': {'B': 2, 'C': 4},
#     'B': {'C': -2},
#     'C': { }
#
#      Vertices = [A, B, C]
#
#       1. From 'A' to 'A' :    -> FW(A, A, 3) = 0
#           - start == end  => return 0
#           res = { ('A', 'A'): 0 }
#
#       2. From 'A' to 'B' :    -> FW(A, B, 3) = 2
#           
#           - k = 3, intermediate = V[3-1] = 'C'
#
#           Case 1 : cost_without = FW(start, end, k - 1)
#               - FW(A, B, 2)
#                   - k = 2, intermediate = V[2-1] = 'B'
#
#                   - Case 1.1 : cost_without = FW(A, B, 1)
#                       - k = 1, intermediate = V[1-1] = 'A'
#                       - Case 1.1.1 : cost_without = FW(A, B, 0)
#                           - k = 0, is 'B' in adj['A'] ? yes, cost = 2
#                       - Case 1.1.2 : cost_with = FW(A, A, 0) + FW(A, B, 0)
#                           - 0 + 2 = 2
#                       - best = min(2, 2) = 2
#
#
#                   - Case 1.2 : cost_with = FW(A, B, 1) + FW(B, B, 1)
#                       - FW(A, B, 1) = 2 : (memo)
#                       - FW(B, B, 1) = 0 (start == end)
#                       - return 2 + 0 = 2
#
#                   - best = min(2, 2) = 2
#               
#               ~ Case 1 : ('A', 'B') = 2
#
#           Case 2 : cost_with = FW(A, C, 2) + FW(C, B, 2)
#               - FW(A, C, 2)
#                   - k = 2, intermediate = 'B'
#
#                   - Case 2.1 : cost_without = FW(A, C, 1)
#                       - k = 1, intermediate = 'A'    
#                       - Case 2.1.1 : cost_without = FW(A, C, 0)
#                           - k = 0, is 'C' in adj['A'] ? yes, cost = 4
#                           - FW(A, C, 0) = 4
#                       - Case 2.1.2 : cost_with = FW(A, A, 0) + FW(A, C, 0)
#                           - 0 + 4 = 4
#                       
#                       - best = min(4, 4) = 4
#
#                   - Case 2.2 : cost_with = FW(A, B, 1) + FW(B, C, 1)
#                       - FW(A, B, 1) =  2 : (memo)
#                       - FW(B, C, 1) = -2 : 
#                           - k = 1, intermediate = 'A'
#                           - Case 2.2.1 : cost_without = FW(B, C, 0)   
#                           - k = 0, is 'C' in adj['B'] ? yes, cost = -2
#                           - Case 2.2.2 : cost_with = FW(B, A, 0) + FW(A, C, 0)
#                           - inf + 4 = inf
#                          ~ best = min(-2, inf) = -2
#                       - return 2 + (-2) = 0
#
#                   - best = min(4, 0) = 0
#
#               ~ FW(A, C, 2) = 0
#
#               - FW(C, B, 2)
#                   - k = 2, intermediate = V[2-1] = 'B'            # Note: here intermediate = end ! thats ok
#
#                   - Case 2.3 : cost_without = FW(C, B, 1)
#                       - k = 1, intermediate = V[1-1] = 'A'
#                       - Case 2.3.1 : cost_without = FW(C, B, 0)
#                           - k = 0, is 'B' in adj['C'] ? No, cost = inf
#                       - Case 2.3.2 : cost_with = FW(C, A, 0) + FW(A, B, 0)
#                           - FW(C, A, 0) = inf (no path)
#                           - FW(A, B, 0) = 2 : (memo)
#                           ~ inf + 2 = inf
#
#                       - best = min(inf, inf) = inf    
#
#                   - Case 2.4 : cost_with = FW(C, B, 1) + FW(B, B, 1)
#                       - FW(C, B, 1) = inf : (memo)
#                       - FW(B, B, 1) = 0 : (start == end)
#                       ~ inf + 0 = inf
#                   
#                   ~ best = min(inf, inf) = inf
#
#               ~ FW(C, B, 2) = inf
#
#               ~ Case 2 : FW(A, C, 2) + FW(C, B, 2) = 0 + inf = inf
#
#
#
#           ~ best = min(Case 1, Case 2) = min(2, inf) = 2
#
#       res = { ('A', 'B'): 2 }
#       
#       
#       3. From 'A' to 'C' :    -> FW(A, C, 3) = 0
#           - etc ....
#
#
#   FINAL RESULT :
#       {
#       ('A','A'): 0,
#       ('A','B'): 2,
#       ('A','C'): 0,
#       ('B','A'): inf,
#       ('B','B'): 0,
#       ('B','C'): -2,
#       ('C','A'): inf,
#       ('C','B'): inf,
#       ('C','C'): 0
#       }
#
#   Actual output of the algorithm above :
#   {('A', 'A'): 0, ('A', 'B'): 2, 
#   ('A', 'C'): 0, ('B', 'A'): inf, 
#   ('B', 'B'): 0, ('B', 'C'): -2, 
#   ('C', 'A'): inf, ('C', 'B'): inf, 
#   ('C', 'C'): 0}
#
#
#
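
The intermediate values of the hand trace can be checked by exposing the memoized function and querying individual subproblems. This is only a verification sketch: `make_fw` is a hypothetical helper that wraps the same recurrence, with vertices ordered `['A', 'B', 'C']` as in the example.

```python
import functools

def make_fw(adj):
    """Return the memoized FW(u, v, k) function for the graph adj."""
    vertices = list(adj)

    @functools.cache
    def FW(u, v, k):
        if u == v:
            return 0
        if k == 0:
            return adj[u].get(v, float('inf'))  # direct edge or unreachable
        mid = vertices[k - 1]                   # k-th vertex may now be used as a stop
        return min(FW(u, v, k - 1), FW(u, mid, k - 1) + FW(mid, v, k - 1))

    return FW

FW = make_fw({'A': {'B': 2, 'C': 4}, 'B': {'C': -2}, 'C': {}})

# subproblems that appear in the trace of FW(A, B, 3)
for u, v, k in [('A', 'B', 1), ('A', 'C', 1), ('B', 'C', 1),
                ('A', 'C', 2), ('C', 'B', 2), ('A', 'B', 3), ('A', 'C', 3)]:
    print(f"FW({u}, {v}, {k}) = {FW(u, v, k)}")
```

The printed values should match the trace: 2, 4 and -2 for k = 1, then FW(A, C, 2) = 0, FW(C, B, 2) = inf, and finally FW(A, B, 3) = 2 and FW(A, C, 3) = 0.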





### Prim

In [None]:
# Prim's Algorithm :
#
#   Goal : 
#       - "How can I build the cheapest network connecting all cities ?"
#
#       - build a minimum spanning tree (MST) from a connected, undirected graph with weighted edges.
#       - we want to reach each city exactly once, always choosing the cheapest edge that leaves the tree built so far. 
#       - e.g. connect all cities, keep the cheapest connections, throw out the expensive ones.
#       
#       - this differs from Dijkstra's, which finds the shortest path from a source to all nodes.
#       - here we minimize the combined cost of all the edges in the tree, not the shortest path from a source to a target.
#       
#       - Here we don't reason on "cities" anymore, we operate on edges = tuple(start, end)
#       
#
#   Methodology :
#
#       1. Init
#           tree = []           : list of edges tuple(start, end) in the MST 
#           visited = set()     : to prevent cycles.
#           queue = [(0, s, s)] : min-heap priority queue of (weight, start, end)
#
#
#       2. While queue not empty :
#           - pop edge with smallest weight from queue (_, start, end) - we already know its the smallest weight from the heap queue
#           
#           - if end is already in visited : skip this edge (continue)
#           - otherwise add end to visited
#
#           - add edge to the tree if start != end
#
#           - for each neighbour of end :
#               - if neighbour not in visited : push this neighbour to the queue.
#
#
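#   Example : (graph passed to prim() below, s = 0)
#       pop (0, 0, 0) : visit 0,                        push (1, 0, 1) and (4, 0, 2)
#       pop (1, 0, 1) : visit 1, tree = [(0, 1)],       push (2, 1, 2) and (6, 1, 3)
#       pop (2, 1, 2) : visit 2, tree += (1, 2),        push (3, 2, 3)
#       pop (3, 2, 3) : visit 3, tree += (2, 3)
#       pop (4, 0, 2) and (6, 1, 3) : end already visited, skipped
#       result : [(0, 1), (1, 2), (2, 3)]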






import heapq

def prim(adj, s):
    tree = []
    visited = set()
    queue = [(0, s, s)]


    while queue:                                                    # O(E)
        _, start, end = heapq.heappop(queue)                        # O(log E)

        if end in visited: continue                                 # O(1)
        visited.add(end)                                            # O(1)
        if start != end: tree.append((start, end))                  # O(1) - O(V) total   


        for neighbour, weight in adj[end].items():                 # O(E)
            if neighbour not in visited:
                heapq.heappush(queue, (weight, end, neighbour))    # O(log E)

    return tree

prim({
    0: {1: 1, 2: 4},
    1: {0: 1, 2: 2, 3: 6},
    2: {0: 4, 1: 2, 3: 3},
    3: {1: 6, 2: 3},
}, 0)




# ========== COMPLEXITY ==========
#
#   Line-by-Line Complexity :
#       - init tree, visited, queue :                       - O(1)
#
#       - while queue :                                     - O(E) pops in total
#           heappop :                                       - O(log E)
#           visited checks and append :                     - O(1) each, O(V) appends in total
#
#           for neighbour in adj[end] :                     - O(E) in total
#               heappush :                                  - O(log E)
#
#
#
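#   ! The heap stores edges, so each heap operation is O(log E), not O(log V) !
#     (same order of magnitude : E <= V^2, so log E <= 2 log V)
#
#   Best Case :
#       - Linear graph (chain) : A - B - ... - V
#       - V : vertices
#       - E = V - 1
# 
#       Total Complexity :
#           - O(1) : inits
#           - O(E * log E) : each pushed edge is popped once (at most E + 1 pops, V of them add a vertex).
#           - O(E * log E) : edges    are pushed only once (total E pushes).
# 
#        ~ O(E log E)
# 
#   Note :
#       - The teacher states O((V+E) log E).
#       - His O(V) for `while queue` counts only the useful pops : the ones that add a new vertex and scan its neighbours.
#       - The other pops hit `continue` ; there are at most E of them, so they are covered by the E term.
#       - On a connected graph E >= V - 1, so O((V+E) log E) = O(E log E) : both answers agree.
#
#   Worst Case :
#       - Each Vertex is added to visited once.
#       - Each Edge   is pushed at most once (from whichever endpoint is visited first).
#       ~ O(E log E)
#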
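
The claim that each edge is pushed at most once can be checked by counting heap pushes. The sketch below is a lightly instrumented copy of the function above; `prim_with_stats` and the `pushes` counter are hypothetical additions for this illustration, and the tree entries also carry the edge weight so the total cost can be summed.

```python
import heapq

def prim_with_stats(adj, s):
    """Prim's algorithm returning (tree, pushes); tree holds (start, end, weight)."""
    tree = []
    visited = set()
    queue = [(0, s, s)]
    pushes = 1                                  # the initial (0, s, s) entry

    while queue:
        weight, start, end = heapq.heappop(queue)
        if end in visited:
            continue                            # edge leads back into the tree: skip
        visited.add(end)
        if start != end:
            tree.append((start, end, weight))

        for neighbour, w in adj[end].items():
            if neighbour not in visited:
                heapq.heappush(queue, (w, end, neighbour))
                pushes += 1

    return tree, pushes

graph = {
    0: {1: 1, 2: 4},
    1: {0: 1, 2: 2, 3: 6},
    2: {0: 4, 1: 2, 3: 3},
    3: {1: 6, 2: 3},
}
tree, pushes = prim_with_stats(graph, 0)

print(tree)                                     # [(0, 1, 1), (1, 2, 2), (2, 3, 3)]
print(sum(weight for _, _, weight in tree))     # 6
print(pushes)                                   # 6 = E + 1, with E = 5 undirected edges
```

An edge is pushed only from the endpoint that is visited first, because by the time the other endpoint is scanned the first one is already in `visited`. That caps the heap at E + 1 entries and gives the O(E log E) total.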
















