# Weighted Graphs:


https://learning.ecam.be/SA4T/slides/05-graphs

<img src="img/wg.png" width="40%">

### Dijkstra's:

1.  Given a (positively) weighted graph V and a start node `s`, find the (length of the) shortest path between `s` and all the other nodes.
2. What is the time complexity?
3. Change your algorithm to output the actual paths as well.

_Hint:_
This algorithm is simply BFS with a priority queue that decides which vertex to explore next. It is greedy: with non-negative weights, the smallest estimated distance in the queue is always the true length of the shortest path to that vertex.

Starting from the BFS_shortest_path algorithm (from 4_Unweighted_Graphs.ipynb), construct Dijkstra's algorithm.

#### Dijkstra

In [None]:
# Dijkstra's Algorithm :
#
#
#!   Note :
#       - O((V+E) log V) is the worst-case bound of the heap-based version (it is not an average case).
#       - On a sparse graph (e.g. a linear graph, E = V - 1) the bound becomes O(V log V).
#       - On a dense graph (E ~ V^2) the bound becomes O(E log V) = O(V^2 log V).
#
#
#
#   Methodology :
#       Goal : 
#           - Given a weighted graph V (positively weighted edges, e.g. travel time in minutes).
#           - From start `s`, find the shortest path between `s` and every other node (city).
#
#       Idea : 
#           - Cities = Nodes, 
#           - Roads = Edges with weights (travel time)
#           - You start in city `A`, what is the shortest path to every other city ?
#           
#           - Dijkstra's Rule : 
#                   "Always visit the closest unvisited city next."
#           
#           - Priority Queue :
#                   always hands back the next city to visit (the one with the smallest known distance).
#
#           Greedy Approach : 
#               - same as BFS, but each edge has a weight (cost, e.g. travel time). Always expand the cheapest city first.
#
#
#   Example :
#       1. Create a dictionary with all the distances initialised to infinity, and distance to start = 0.
#       2. Add city 'A' to the queue, so we explore its neighbors first.
# 
#       while queue :
#           3. Always extract the node with the smallest known distance (closest city).
#           4. Explore all its neighbors.
#           5. Estimate the distance from the start to each neighbor when going through the current city.
#           6. If this estimate is better than the previously known distance, update it
#                   and add the neighbor to the queue !
#                    
#        Note :
#           - cities can appear twice or more in the queue, because a city is pushed again each time a better distance is found.
#           - this is how better paths to cities already in the queue get recorded.
#           
#           
# =================================
import heapq
def dijkstra(adj, start):
    # Init
    dist = {u : float('inf') for u in adj}                  # O(V)
    dist[start] = 0
    queue = [(0, start)]

    while queue:                                            # at most E + 1 iterations : a node is pushed each time its distance improves
        distance, node = heapq.heappop(queue)               # O(log V) : heap operation
        print("Visiting: ", node)
        if distance > dist[node]: continue                  # stale entry : a shorter path to node was already processed

        
        for neighbor in adj[node]:                          # O(E) in total over the whole run
            estimate = distance + adj[node][neighbor]
            if estimate < dist[neighbor]:
                dist[neighbor] = estimate
                heapq.heappush(queue, (estimate, neighbor)) # O(log V) : heap operation
                
    return dist
    
# O((V + E) log V)

# dijkstra({
#     'A': {'B': 4, 'C': 1},
#     'B': {},
#     'C': {'B': 2}
# }, 'A')


# ======== COMPLEXITY =========
#
#   Line-by-Line Complexity :
#       - init dictionary dist :                            - O(V)
# 
#       - while queue :                                     - at most E + 1 iterations
#           as many pops as pushes !
#           the same node can be pushed several times (once per improvement of its distance),
#           but stale pops are ignored (if distance > dist[node]).
#           Total  pops <= E + 1
#           Useful pops <= V
#
#       - heappop()                                         - O(log V)
#       ~ O(E * log V)
#
#
#       - for neighbor in adj[node]:                        - O(E) in total over all useful pops
#       - heappush()                                        - O(log V)
# 
#       ~ O(E * log V)
#
#       Note : the heap can hold up to E entries, but log E <= 2 log V, so O(log E) = O(log V).
# 
# 
#   Sparse graph (smallest value of the bound) :
#       - Linear graph (linked list) : A -> B -> ... -> V
#       - V : vertices
#       - E = V - 1
# 
#       Total Complexity :
#           - O(V) : init dictionary
#           - O(V * log V) : vertices are popped only once (total V pops).
#           - O(E * log V) : each edge causes one push (total E pushes).
# 
#        ~ O((V + E) log V),  but since E = V - 1 
#        ~ O(V log V)
# 
# 
# 
#   Dense graph (largest value of the bound) :
#       - Complete graph : each node connected to every other node.
#       - V : vertices
#       - E = V * (V - 1) / 2    ~ (V^2)
#    
#       Total Complexity :
#           - O(V) : init dictionary
#           - O(E * log V) : at most one pop per push (total <= E + 1 pops).
#               The test 'if estimate < dist[neighbor]:' only reduces the number of pushes,
#               so E pushes stays an upper bound and the worst case is unchanged.
#
#           - O(E * log V) : a push happens at most once per edge (total <= E pushes).
# 
#        ~ O((E + E) log V), but since E ~ V^2
#           ~ O(E log V) 
#               ~ O(V^2 log V)
# 
# 
# 
#   Using Sort instead of Priority Queue :
# 
#       """ while queue:
#            queue.sort(key=lambda x: x[0])  # O(k log k), k = current queue size
#            distance, node = queue.pop(0)   # O(k) : list.pop(0) shifts every remaining element
#       """
# 
#       - At each step, we select the vertex with the smallest distance by sorting the entire list.
# 
#       Worst Case Complexity :
#           - Queue can grow up to size E.
#           - Sorting the queue takes O(E log E) time.
#           - This sorting happens up to E times (while queue).
#        - Total Complexity : O(E^2 log E)
# 
# 
#       Best Case Complexity :
#           - Linear graph : each vertex has only one outgoing edge (E = V - 1).
#           - Queue never holds more than 1 element.
#           - Sorting and pop(0) take O(1) time, done O(V) times.
#           - Edge iterations are done O(E) times in total.
#         - Total Complexity : O(V + E) = O(V) since E = V - 1
#
#
#
#
#
#
#

#! Exam : Why a priority queue and not a sort ?
    # We only need the minimum at each step, not a fully sorted list.
    # heapq implements a min-heap : push and pop cost O(log n), whereas re-sorting at every step costs O(n log n).
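
A small sketch of that point, using only the standard `heapq` module: a heap keeps just enough order to hand back the minimum, so the underlying list is generally not sorted, yet successive pops still come out in ascending order. The `entries` list is made-up sample data, not taken from the notebook's run.

```python
import heapq

# Hypothetical (distance, city) entries, pushed in arbitrary order.
entries = [(7, 'G'), (2, 'F'), (6, 'B'), (3, 'C'), (4, 'E')]

heap = []
for entry in entries:
    heapq.heappush(heap, entry)          # O(log n) per push

print(heap[0])                           # the smallest entry always sits at index 0
print(heap == sorted(heap))              # the list itself need not be fully sorted

popped = []
while heap:
    popped.append(heapq.heappop(heap))   # O(log n) per pop
print(popped == sorted(entries))         # pops come out in ascending order
```

Each pop only repairs one root-to-leaf branch of the heap, which is where the O(log n) cost comes from.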






#### Dijkstra - Visualisation :

<img src="img/city.png" width="70%">

In [None]:
# Visualize Dijkstra's Algorithm :
#
#   - see https://www.youtube.com/watch?v=EFg3u_E6eHU
#


city_map = {
    'A' : {'C' : 3, 'F' : 2},
    'C' : {'A' : 3, 'F' : 2, 'E' : 1, 'D' : 4},
    'F' : {'A' : 2, 'C' : 2, 'E' : 3, 'G' : 5, 'B' : 6},
    'E' : {'C' : 1, 'F' : 3, 'B' : 2},
    'D' : {'C' : 4, 'B' : 1},
    'G' : {'F' : 5, 'B' : 2},
    'B' : {'E' : 2, 'D' : 1, 'G' : 2}
}


#   Solution :
#       - The shortest path from 'A' to 'B' is A -> C -> E -> B with a total cost of 6.
#       - This algorithm doesn't return the path taken, only the cost.
#       
#       - Output : {'A': 0, 'C': 3, 'F': 2, 'E': 4, 'D': 7, 'G': 7, 'B': 6}
#       - Meaning : for every city, the shortest time/cost needed to reach it from 'A'.
#           -> entry B shows cost 6 to go from A to B.
#

dijkstra(city_map, start='A')





Visiting:  A
Visiting:  F
Visiting:  C
Visiting:  E
Visiting:  E
Visiting:  B
Visiting:  D
Visiting:  G
Visiting:  B


{'A': 0, 'C': 3, 'F': 2, 'E': 4, 'D': 7, 'G': 7, 'B': 6}
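
The third question of the exercise asks for the actual paths. One possible sketch, not necessarily the intended solution: remember for each node the neighbor through which its best distance was found, then walk those parent pointers backwards. The names `dijkstra_with_paths`, `parent` and `rebuild_path` are introduced here for illustration; the graph is the `city_map` from the cell above.

```python
import heapq

def dijkstra_with_paths(adj, start):
    """Return (dist, parent); parent[v] is the node preceding v on a shortest path."""
    dist = {u: float('inf') for u in adj}
    parent = {u: None for u in adj}
    dist[start] = 0
    queue = [(0, start)]

    while queue:
        distance, node = heapq.heappop(queue)
        if distance > dist[node]:
            continue                      # stale entry, a better one was already handled
        for neighbor, weight in adj[node].items():
            estimate = distance + weight
            if estimate < dist[neighbor]:
                dist[neighbor] = estimate
                parent[neighbor] = node   # remember how we got here
                heapq.heappush(queue, (estimate, neighbor))
    return dist, parent

def rebuild_path(parent, start, target):
    """Walk the parent pointers from target back to start."""
    path = []
    node = target
    while node is not None:
        path.append(node)
        node = parent[node]
    path.reverse()
    return path if path[0] == start else None   # None : target unreachable

city_map = {
    'A': {'C': 3, 'F': 2},
    'C': {'A': 3, 'F': 2, 'E': 1, 'D': 4},
    'F': {'A': 2, 'C': 2, 'E': 3, 'G': 5, 'B': 6},
    'E': {'C': 1, 'F': 3, 'B': 2},
    'D': {'C': 4, 'B': 1},
    'G': {'F': 5, 'B': 2},
    'B': {'E': 2, 'D': 1, 'G': 2},
}

dist, parent = dijkstra_with_paths(city_map, 'A')
print(rebuild_path(parent, 'A', 'B'))   # ['A', 'C', 'E', 'B']
```

Storing one parent per node costs O(V) extra memory and does not change the time complexity; rebuilding a single path is O(V).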

### Bellman-Ford:


Dijkstra is greedy and doesn't work on graphs with `negative` weights. 
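
To see the failure concretely, here is a minimal sketch. It assumes the textbook variant of Dijkstra that finalizes a node the first time it is popped (the `done` set), which is not exactly the `dijkstra` function above, and a made-up three-node graph with one negative edge. The true shortest path A -> C -> B costs 4 + (-5) = -1, but B is finalized at distance 1 before C is ever explored.

```python
import heapq

def dijkstra_finalizing(adj, start):
    """Textbook Dijkstra: a node's distance is frozen the first time it is popped."""
    dist = {u: float('inf') for u in adj}
    dist[start] = 0
    done = set()
    queue = [(0, start)]
    while queue:
        distance, node = heapq.heappop(queue)
        if node in done:
            continue
        done.add(node)                    # greedy step: assume this distance is final
        for neighbor, weight in adj[node].items():
            if neighbor not in done and distance + weight < dist[neighbor]:
                dist[neighbor] = distance + weight
                heapq.heappush(queue, (distance + weight, neighbor))
    return dist

# Hypothetical graph with a single negative edge C -> B.
graph = {
    'A': {'B': 1, 'C': 4},
    'B': {},
    'C': {'B': -5},
}

greedy = dijkstra_finalizing(graph, 'A')
true_cost = graph['A']['C'] + graph['C']['B']   # cost of A -> C -> B
print(greedy['B'], true_cost)                   # 1 -1
```

Variants that never finalize nodes and simply re-push on every improvement can still recover the right value on a graph like this one, but the O((V + E) log V) bound no longer holds for them, which is one more reason to switch algorithm.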

Let's use `dynamic programming` instead:
- Subproblems: find `BF(v, k)`, the length of the shortest path between `s` and `v` using at most `k` edges.
- Base cases:
- Guess:
- Recurrence:
- Complexity:

_Exercise:_
Implement Bellman-Ford and give the time complexity. How would you get
the paths themselves?

In [None]:
# Bellman-Ford Algorithm :
# 
#!   Note :
#       - Classic Bellman-Ford is O(V*E)
#       - We implement BF with memoization + recursion (dynamic programming).
# 
#
# Subproblems :
#   Find the shortest path between city `s` and city `v` using at most k edges (k trips).
#
# Base-case : 
#   BF(v, 0) = (0 if v=s) , (+inf otherwise)
#
# Guess :
#   the last stop w before reaching v
# 
# Recurrence :
#   last step : w -> v
#   BF(v, k) = min( BF(v, k-1), min over edges (w,v) of [ BF(w, k-1) + weight(w,v) ] )
# 
# Complexity :
#   = number of subproblems * time/subproblem
# 
# =======================================================

# compare the best path found so far s --- v with the path through a predecessor s --- u --- v:
    # best( s->v )   ?<=   best( s->u ) + weight( u->v ) 


import functools
def bellman_ford(adj, S):    # S ---- V or S --- U --- V
    @functools.cache
    def BF(B, steps_k: int):
        """Find the shortest distance from start S to node B using at most steps_k edges."""
        # Base-case
        if (steps_k == 0): return 0 if (B==S) else float('inf')         # O(1)

        # Recurrence (guess last step)
        best = BF(B, steps_k-1)                                         # O(1) once cached

        for prev_neighbor in adj:                                       # O(V) : scan every node for an edge into B
            if B in adj[prev_neighbor] :
                weight = adj[prev_neighbor][B]
                candidate = BF(prev_neighbor, steps_k -1) + weight      # O(1) once cached
                if candidate < best:
                    best = candidate
        return best
        # return min([      # list comprehension version
        #     BF(B, steps_k-1),
        #     *[BF(u, steps_k-1) + adj[u][B] for u in adj if B in adj[u]]
        # ])

    return { B: BF(B, len(adj) -1) for B in adj}                        # V top-level calls, each with k = V-1



adj = {
    'A': {'B': 2, 'C': 4},
    'B': {'C': -2},
    'C': {}
}
# bellman_ford(adj , 'A')
bellman_ford(adj , 'A')




# ========== COMPLEXITY ==========
#
#
#   Idea : 
#       - Make k=V-1 iterations, each time relaxing all edges.
#       - relaxing : improving the current best path.
#       - Why making multiple iterations/relaxations ? 
#           -> Improving u might later improve v
#           -> Improving v might later improve x
#           -> Etc.
#           -> Bellman–Ford repeats relaxation V−1 times so improvements can “ripple” across the graph.
#
#
#   Complexity :
#       Subfunction BF defines our subproblems : "shortest distance from S → B using at most steps_k edges ?"
#
#
#       - def BF(B, steps_k) : 
#                B : V possible nodes        ->  O(V)
#                k : from 0 to V-1 edges     ->  O(V)
#
#           Total subproblems =               ~  O(V^2)
#
#       - time/subproblem :
#           best = BF(B, steps_k - 1) :      ->  O(1)  (from caching)
#           for prev_neighbor in adj  :      ->  O(V)
#           candidate = ...           :      ->  O(1)  (from caching) 
#                        
#            Cost of ONE subproblem =        ~  O(V)
#           
#           
#       Total : O(V^3)
#
#
#       Note : 
#           - Without memoization the running time is exponential : up to O(V^V) calls.
#           - 
#           - Finding the shortest paths between all pairs : V start cities * BF = O(V^4)
#           - 
#           - Number of subproblems is always : O(V * V)
#           - 
#           - But the cost per round (fixed k, all V nodes) depends on how predecessors are found :
#                   -> scanning all vertices for each node : O(V * V) per round   -> Total : O(V * V * V)
#                   -> scanning each edge once             : O(E)     per round   -> Total : O(V * E)   (classic Bellman-Ford)
#
#
#   Time Complexity (summary) :
#       - BF is called for each node v and each number of edges k from 0 up to |V| - 1, where |V| is the number of vertices.
#       - Memoization ensures each (v, k) pair is computed once : O(|V| * |V|) = O(|V|^2) unique calls.
#       - Each call scans all the nodes to find the predecessors of v : O(|V|).
#       - Therefore, the total time complexity is O(|V|^3).
#
#
#   Comparison with Dijkstra :
#       - Dijkstra's O((V + E) log V) is typically faster than BF's O(V * E).
#
#       - BF can handle negative weights and even detect negative cycles.
#       - Dijkstra only works for non-negative weights, but is faster.
#








{'A': 0, 'B': 2, 'C': 0}

#### Visualizing Bellman-Ford :

<img src="img/bf1.png" width="70%">

In [None]:
# Example :

adj = {
    'A': {'B': 6, 'C': 4, 'D': 5},
    'B': {'E': -1},
    'C': {'B': -2, 'E': 3},
    'D': {'C': -2, 'F': -1},
    'E': {'F': 3},
    'F': {}
}

#   1. Init dist to inf except A = 0
#
#   According to Bellman-Ford, at most V-1 = 5 iterations are needed before converging (a shortest path uses at most V-1 edges).
#
#
#   calling bellman_ford(adj , 'A') :
#   for B in adj :
#       compute a dict {B: BF(B, len(adj) -1) ,...}
#       so we start with City 'A' and k = 6-1 (max edges)
#
#
#   It 1 :
#       From Node A :
#           - Edge A -> B : dist[B] = min(inf, 0 + 6) = 6
#           - set predecessor[B] = A
#
#           - Edge A -> C : dist[C] = min(inf, 0 + 4) = 4
#           - set predecessor[C] = A
#
#           - Edge A -> D : dist[D] = min(inf, 0 + 5) = 5
#           - set predecessor[D] = A
#
#       From Node B :
#           - Edge B -> E : dist[E] = min(inf, 6 + (-1)) = 5
#           - set predecessor[E] = B
#           
#       From Node C :
#           - Edge C -> B : dist[B] = min(6, 4 + (-2)) = 2
#           - set predecessor[B] = C
#           Note : we found a better path to B through C ! (prev was A -> B = 6, now A -> C -> B = 4 + (-2) = 2)
#           
#           - Edge C -> E : dist[E] = min(5, 4 + 3) = 5
#           Note : no update, previous path was better (A -> B -> E = 5) than (A -> C -> E = 7)
#           
#           
#       From Node D :
#           - Edge D -> C : dist[C] = min(4, 5 + (-2)) = 3
#           - set predecessor[C] = D
#           
#           - Edge D -> F : dist[F] = min(inf, 5 + (-1)) = 4
#           - set predecessor[F] = D
#           
#       From Node E :
#           - Edge E -> F : dist[F] = min(4, 5 + 3) = 4
#           
#           
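#   It 2 :
#       From Node A :
#           - Edges A -> B, A -> C, A -> D : no update
#
#       From Node B :
#           - Edge B -> E : dist[E] = min(5, 2 + (-1)) = 1
#           - set predecessor[E] = B
#
#       From Node C :
#           - Edge C -> B : dist[B] = min(2, 3 + (-2)) = 1
#           - set predecessor[B] = C
#           Note : better path A -> D -> C -> B = 5 - 2 - 2 = 1
#
#       ...
#
#   It 3 :
#       From Node B :
#           - Edge B -> E : dist[E] = min(1, 1 + (-1)) = 0
#           - set predecessor[E] = B
#           Note : better path A -> D -> C -> B -> E = 0
#
#       From Node E :
#           - Edge E -> F : dist[F] = min(4, 0 + 3) = 3
#           - set predecessor[F] = E
#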
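
A hand trace like this one is easy to get wrong, so it can help to run the relaxations mechanically. The sketch below assumes the edges are relaxed in the order of the adjacency dict (A, B, C, D, E), as in the trace, and records one snapshot of the distances per round. The name `relax_rounds` is introduced here for illustration.

```python
import math

def relax_rounds(adj, start):
    """Relax every edge, in adjacency order, for at most V-1 rounds.

    Returns one snapshot of the distances per round."""
    dist = {node: math.inf for node in adj}
    dist[start] = 0
    snapshots = []
    for _ in range(len(adj) - 1):
        changed = False
        for u in adj:
            for v, weight in adj[u].items():
                if dist[u] + weight < dist[v]:
                    dist[v] = dist[u] + weight
                    changed = True
        snapshots.append(dict(dist))
        if not changed:
            break                 # nothing improved : the distances are final
    return snapshots

adj = {
    'A': {'B': 6, 'C': 4, 'D': 5},
    'B': {'E': -1},
    'C': {'B': -2, 'E': 3},
    'D': {'C': -2, 'F': -1},
    'E': {'F': 3},
    'F': {},
}

snapshots = relax_rounds(adj, 'A')
for round_no, snapshot in enumerate(snapshots, start=1):
    print(f"It {round_no}: {snapshot}")
```

The last snapshot is a round in which nothing changed, which is how the loop knows it can stop before reaching V-1 rounds.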
#           

In [1]:
import functools

def bellman_ford_verbose(adj, S):
    """
    Bellman-Ford using recursion and memoization.
    Finds the shortest distance from start S to all nodes using at most k edges.
    Prints all intermediate steps for clarity.
    """

    @functools.cache
    def BF(node, steps_k):
        print(f"BF called: node={node}, steps_k={steps_k}")

        # -----------------
        # Base Case
        # -----------------
        if steps_k == 0:
            if node == S:
                print(f"Base case: node {node} is start {S}, distance = 0")
                return 0
            else:
                print(f"Base case: node {node} not start {S}, distance = inf")
                return float('inf')

        # -----------------
        # Option 1: don't use a new edge
        # -----------------
        best = BF(node, steps_k - 1)
        print(f"Distance to {node} using at most {steps_k-1} edges = {best}")

        # -----------------
        # Option 2: last step comes from a predecessor
        # -----------------
        for prev_node in adj:
            if node in adj[prev_node]:  # is there a road from prev_node → node?
                weight = adj[prev_node][node]
                print(f"Considering edge {prev_node} -> {node} with weight {weight}")

                candidate_distance = BF(prev_node, steps_k - 1) + weight
                print(f"Candidate distance via {prev_node} = {candidate_distance}")

                if candidate_distance < best:
                    print(f"Updating best distance for {node}: {best} → {candidate_distance}")
                    best = candidate_distance

        print(f"Return best distance to node {node} with steps_k={steps_k}: {best}")
        return best

    result = {}
    total_steps = len(adj) - 1
    print(f"Starting Bellman-Ford from node {S} with at most {total_steps} edges")
    for node in adj:
        print(f"\nComputing shortest distance to node {node}:")
        result[node] = BF(node, total_steps)
        print(f"Shortest distance to node {node} = {result[node]}")
    return result

# -----------------
# Example Graph
# -----------------
adj = {
    'A': {'B': 2, 'C': 4},
    'B': {'C': -2},
    'C': {}
}

# Run verbose Bellman-Ford
bellman_ford_verbose(adj, 'A')


Starting Bellman-Ford from node A with at most 2 edges

Computing shortest distance to node A:
BF called: node=A, steps_k=2
BF called: node=A, steps_k=1
BF called: node=A, steps_k=0
Base case: node A is start A, distance = 0
Distance to A using at most 0 edges = 0
Return best distance to node A with steps_k=1: 0
Distance to A using at most 1 edges = 0
Return best distance to node A with steps_k=2: 0
Shortest distance to node A = 0

Computing shortest distance to node B:
BF called: node=B, steps_k=2
BF called: node=B, steps_k=1
BF called: node=B, steps_k=0
Base case: node B not start A, distance = inf
Distance to B using at most 0 edges = inf
Considering edge A -> B with weight 2
Candidate distance via A = 2
Updating best distance for B: inf → 2
Return best distance to node B with steps_k=1: 2
Distance to B using at most 1 edges = 2
Considering edge A -> B with weight 2
Candidate distance via A = 2
Return best distance to node B with steps_k=2: 2
Shortest distance to node B = 2

Computi

{'A': 0, 'B': 2, 'C': 0}

#### BF with parent pointers:

In [3]:
# =========== With parent tracking ==========
import math
def BF(edges, start, nodes):
    distances = {node : math.inf for node in nodes}
    distances[start] = 0
    predecessors = {node : None for node in nodes}
    result =[]

    # relax all edges, for at most V-1 iterations
    for _ in range(len(nodes) - 1):                   # O(V)
        updated = False

        for u, v, weight in edges:                    # O(E)
            if distances[u] + weight < distances[v]:
                distances[v] = distances[u] + weight
                predecessors[v] = u
                updated = True
        result.append(distances.copy())

        if not updated: break

    # Detect negative cycles
    for u, v, weight in edges:
        if distances[u] + weight < distances[v]:
            raise ValueError("Graph contains a negative-weight cycle")

    return distances, predecessors, result

def get_shortest_path(predecessors, start, end):
    path = []
    current = end

    while current is not None:
        path.append(current)
        current = predecessors[current]

    path.reverse()

    if path[0] == start:
        return path
    else:
        return None  # No path found

edges = [
    ('A', 'B', 6),
    ('A', 'C', 4),
    ('A', 'D', -2),
    ('B', 'E', -1),
    ('C', 'B', -2),
    ('C', 'E', 3),
    ('D', 'C', -2),
    ('D', 'F', -1),
    ('E', 'F', 3),
]
nodes = ['A', 'B', 'C', 'D', 'E', 'F']
result = BF(edges, 'A', nodes)
distances, predecessors, iterations = result

print("Distances:", distances)
print("Predecessors:", predecessors)
print("Iterations:", iterations)


Distances: {'A': 0, 'B': -6, 'C': -4, 'D': -2, 'E': -7, 'F': -4}
Predecessors: {'A': None, 'B': 'C', 'C': 'D', 'D': 'A', 'E': 'B', 'F': 'E'}
Iterations: [{'A': 0, 'B': 2, 'C': -4, 'D': -2, 'E': 5, 'F': -3}, {'A': 0, 'B': -6, 'C': -4, 'D': -2, 'E': -1, 'F': -3}, {'A': 0, 'B': -6, 'C': -4, 'D': -2, 'E': -7, 'F': -4}, {'A': 0, 'B': -6, 'C': -4, 'D': -2, 'E': -7, 'F': -4}]
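
The cell above defines `get_shortest_path` but never calls it. As a sketch of how the predecessor table is meant to be used, the block below walks the printed `Predecessors` dict backwards from 'F' and re-adds the edge weights along the way. It copies the edge list and the printed output from above; `walk_back` is a name introduced here for illustration.

```python
# Edge list and predecessor table copied from the cell and its output above.
edges = [
    ('A', 'B', 6), ('A', 'C', 4), ('A', 'D', -2),
    ('B', 'E', -1), ('C', 'B', -2), ('C', 'E', 3),
    ('D', 'C', -2), ('D', 'F', -1), ('E', 'F', 3),
]
predecessors = {'A': None, 'B': 'C', 'C': 'D', 'D': 'A', 'E': 'B', 'F': 'E'}

def walk_back(predecessors, end):
    """Follow predecessor pointers from end until a node without predecessor."""
    path = []
    node = end
    while node is not None:
        path.append(node)
        node = predecessors[node]
    return path[::-1]

weight = {(u, v): w for u, v, w in edges}

path = walk_back(predecessors, 'F')
cost = sum(weight[(u, v)] for u, v in zip(path, path[1:]))
print(path, cost)   # ['A', 'D', 'C', 'B', 'E', 'F'] -4
```

The recomputed cost agrees with the `Distances` entry for 'F', which is a cheap consistency check between the two tables.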


### Floyd-Warshall:

- Faster when the shortest paths between all pairs are needed.
- Slower when only a single pair (or a single source) is needed.


What if we are interested in finding the shortest paths between any two nodes? If we apply Dijkstra/Bellman-Ford for each node as starting point, what would the complexity be?

To be quicker, use dynamic programming.
- Subproblems: find `FW(u, v, k)`, the length of the shortest path between `u` and
`v` only using the first `k` nodes as intermediate nodes.
- Base cases:
- Guess:
- Recurrence:

In [None]:
# Floyd - Warshall Algorithm :
# 
#   Instead of finding the shortest paths from a single source (Bellman-Ford),
#   find shortest paths between ALL pairs of nodes.
# 
#   + faster to find all the pairs (builds the answer gradually, NOT one source city at a time)
#   - slower to find the path of a single pair (need to compute all pairs first)
# 
# 
#   Idea :
#       1. Get list of all vertices V = [V_1, V_2, ..., V_n]
#       
#       2. let u = V_1,... V_n and v = V_1,... V_n
#           for all pairs (u,v), find shortest path FW(u, v, k), k = len(V)
#       
#       Note : k is only the index into V used to choose intermediate cities
#               - V[k-1] will ultimately iterate over all the cities from right to left.
#       
#       3. Subproblem :
#           - if u and v are the same city, return distance = 0
#           - if k = 0 (no intermediate cities allowed, direct path only) :
#               -> check whether this edge exists in adj (ex. A -> B : cost 3)
#               -> if not, the cities are unreachable with k=0, distance = +inf
#            
#           Recurrence :
#               - Idea : check whether V_k is a shortcut between u and v
#               
#               intermediate = vertices[k - 1]
#                   -> select city from V=['A', 'B', 'C', ...] in reverse order
#               
#               - Case 1 : don't go through V_k
#                   -> start → end
#                   -> cost = FW(start, end, k-1)
#                
#               - Case 2 : go through V_k
#                   -> start → V_k → end
#                   -> cost = (start → V_k)       + (V_k → end)
#                   ->      = FW(start, V_k, k-1) + FW(V_k, end, k-1)
#       
#               - best = min( Case 1 , Case 2 )
# 
# 
# 


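# Subproblems :
#   FW(u, v, k) = length of the shortest path from u to v using only the first k vertices as intermediates.

# Base-case : 
    # u and v are fixed, only k shrinks
    # FW(u, v, 0) = 0 if u = v, weight(u, v) if the edge u -> v exists, +inf otherwise

# Guess :
    # does the shortest path go through V_k or not ?

# Recurrence :   
        # let path : u --(1)--- V_k ---(2)--- v
        # through V_k : (1) + (2) = FW(u, V_k, k-1) + FW(V_k, v, k-1)
        # FW(u, v, k) = min( FW(u, v, k-1), FW(u, V_k, k-1) + FW(V_k, v, k-1) )

# Complexity :
    # O(V^3) subproblems * O(1) time/subproblem = O(V^3)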

# =======================================================

import functools
def floyd_warshall(adj):
    vertices = list(adj.keys())                                 # O(V)

    @functools.cache
    def FW(start, end, k):
        # Base case --------------------------------------------# O(1)
        if start == end: return 0                 # O(1)
        if k==0:                                  # O(1)
            if end in adj[start]: 
                cost = adj[start][end]
                return cost
            else:
                return float('inf')
        
        # Recurrence -------------------------------------------# O(1) per subproblem : three memoized lookups and one min
        intermediate = vertices[k - 1]

        # Case 1 : path doesn't go through V_k
        cost_without = FW(start, end, k - 1)      # O(1) - memo

        # Case 2 : path goes through V_k
        cost_to_intermediate = FW(start, intermediate, k - 1)  # O(1) - memo
        cost_from_intermediate = FW(intermediate, end, k - 1)  # O(1) - memo

        cost_with = cost_to_intermediate + cost_from_intermediate

        best = min(cost_without, cost_with)


        return best

    res = {}

    for u in vertices:                                          # O(V)                   
        for v in vertices:                                      # O(V)
            print(f"Computing shortest path from {u} to {v}")
            res[(u, v)] = FW(u, v, len(vertices))               # O(V) - k goes from 0 to V
    
    return res


adj = {                 # list from : 
    'A': {'C': -2},
    'B': {'A': 4, 'C': 3},
    'C': {'D': 2},
    'D': {'B': -1}
}

adj_prof = {
    'A': {'B': 2, 'C': 4},
    'B': {'C': -2},
    'C': {}
}

print(floyd_warshall(adj_prof))




# =========== COMPLEXITY ==========
#
#
#   Number of subproblems :
#           - for u in vertices:                                 - V choices
#           - for v in vertices:                                 - V choices
#           
#           - recursion inside FW(u,v,k) with k from 0 to V :    - V choices
#               this last O(V) comes from the fact that k 
#               iterates from 0 to V.
#           
#           ex. (A, A, 0), (A, A, 1), (A, A, 2), ..., (A, A, V)  
#               (A, B, 0), (A, B, 1), (A, B, 2), ..., (A, B, V)
#               (A, C, 0), (A, C, 1), (A, C, 2), ..., (A, C, V)
#               ...
#               (B, A, 0), (B, A, 1), (B, A, 2), ..., (B, A, V)
#               ...
#               (C, A, 0), (C, A, 1), (C, A, 2), ..., (C, A, V)
#               (C, B, 0), (C, B, 1), (C, B, 2), ..., (C, B, V)
#               (C, C, 0), (C, C, 1), (C, C, 2), ..., (C, C, V)
#           
#           
#           
#           Total subproblems = O(V^3)
#
#   Time/subproblem :
#       - memoization allows for each subproblem to cost only O(1) time.
#       - O(1)
#
#
#
#   No memoization :
#       - Each call to FW(start,end,k) makes 3 recursive calls,
#       - each of which calls again with k - 1.
#
#       - This leads to a ternary call tree with height k = V.
#       - With 3 branches created at each level, the total number of calls per pair is O(3^V).
#       - 
#       - Total complexity without memoization (V^2 pairs) : O(V^2 * 3^V)
#
#








# ====== List comprehension =========
import functools

# u : start of the path
# v : end of the path
# k : number of vertices of V allowed as intermediate stops

def floyd_warshall(adj):
    V = list(adj.keys())

    @functools.cache
    def FW(u, v, k):
        # Base-case
        if (u == v): return 0
        if (k == 0): return adj[u][v] if v in adj[u] else float('inf')

        return min([
            # Fastest way doesn't go through V_k
            FW(u, v, k-1),
            # Fastest way goes through V_k
            FW(u, V[k - 1], k-1) + FW(V[k - 1], v, k-1)
        ])

    return {(u, v): FW(u, v, len(V)) for u in V for v in V}


# floyd_warshall({
#     'A': {'B': 2, 'C': 4},
#     'B': {'C': -2},
#     'C': {}
# })


# Complexity :
    # number of possibilities : V choices for u, V for v and V + 1 for k
    # O( V^3 ) subproblems
    # O(1) time/subproblem

    # total: O( V^3 )

Computing shortest path from A to A
Computing shortest path from A to B
Computing shortest path from A to C
Computing shortest path from B to A
Computing shortest path from B to B
Computing shortest path from B to C
Computing shortest path from C to A
Computing shortest path from C to B
Computing shortest path from C to C
{('A', 'A'): 0, ('A', 'B'): 2, ('A', 'C'): 0, ('B', 'A'): inf, ('B', 'B'): 0, ('B', 'C'): -2, ('C', 'A'): inf, ('C', 'B'): inf, ('C', 'C'): 0}
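
The same recurrence can also be filled bottom-up with three nested loops, which is the form most references present. This is a sketch under the same assumptions as the recursive version (adjacency dict, no negative cycles); the name `floyd_warshall_iterative` is introduced here for illustration. With no negative cycle, overwriting an entry during round `k` never invalidates the other lookups of that round, so a single table can be updated in place.

```python
def floyd_warshall_iterative(adj):
    """Bottom-up Floyd-Warshall on an adjacency dict {u: {v: weight}}."""
    vertices = list(adj)

    # k = 0 : only direct edges (and distance 0 from a vertex to itself).
    dist = {
        (u, v): 0 if u == v else adj[u].get(v, float('inf'))
        for u in vertices for v in vertices
    }

    # Allow one more intermediate vertex per outer iteration.
    for mid in vertices:
        for u in vertices:
            for v in vertices:
                through_mid = dist[(u, mid)] + dist[(mid, v)]
                if through_mid < dist[(u, v)]:
                    dist[(u, v)] = through_mid
    return dist

adj_prof = {
    'A': {'B': 2, 'C': 4},
    'B': {'C': -2},
    'C': {},
}

print(floyd_warshall_iterative(adj_prof))
```

The three loops make the O(V^3) bound visible directly, and the iterative form avoids the recursion depth and cache memory of the memoized version.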


#### Detailed Example

In [None]:
# Example :
#
#     'A': {'B': 2, 'C': 4},
#     'B': {'C': -2},
#     'C': { }
#
#      Vertices = [A, B, C]
#
#       1. From 'A' to 'A' :    -> FW(A, A, 3) = 0
#           - start == end  => return 0
#           res = { ('A', 'A'): 0 }
#
#       2. From 'A' to 'B' :    -> FW(A, B, 3) = 2
#           
#           - k = 3, intermediate = V[3-1] = 'C'
#
#           Case 1 : cost_without = FW(start, end, k - 1)
#               - FW(A, B, 2)
#                   - k = 2, intermediate = V[2-1] = 'B'
#
#                   - Case 1.1 : cost_without = FW(A, B, 1)
#                       - k = 1, intermediate = V[1-1] = 'A'
#                       - Case 1.1.1 : cost_without = FW(A, B, 0)
#                           - k = 0, is 'B' in adj['A'] ? yes, cost = 2
#                       - Case 1.1.2 : cost_with = FW(A, A, 0) + FW(A, B, 0)
#                           - 0 + 2 = 2
#                       - best = min(2, 2) = 2
#
#
#                   - Case 1.2 : cost_with = FW(A, B, 1) + FW(B, B, 1)
#                       - FW(A, B, 1) = 2 : (memo)
#                       - FW(B, B, 1) = 0 (start == end)
#                       - return 2 + 0 = 2
#
#                   - best = min(2, 2) = 2
#               
#               ~ Case 1 : ('A', 'B') = 2
#
#           Case 2 : cost_with = FW(A, C, 2) + FW(C, B, 2)
#               - FW(A, C, 2)
#                   - k = 2, intermediate = 'B'
#
#                   - Case 2.1 : cost_without = FW(A, C, 1)
#                       - k = 1, intermediate = 'A'    
#                       - Case 2.1.1 : cost_without = FW(A, C, 0)
#                           - k = 0, is 'C' in adj['A'] ? yes, cost = 4
#                           - FW(A, C, 0) = 4
#                       - Case 2.1.2 : cost_with = FW(A, A, 0) + FW(A, C, 0)
#                           - 0 + 4 = 4
#                       
#                       - best = min(4, 4) = 4
#
#                   - Case 2.2 : cost_with = FW(A, B, 1) + FW(B, C, 1)
#                       - FW(A, B, 1) =  2 : (memo)
#                       - FW(B, C, 1) = -2 : 
#                           - k = 1, intermediate = 'A'
#                           - Case 2.2.1 : cost_without = FW(B, C, 0)   
#                           - k = 0, is 'C' in adj['B'] ? yes, cost = -2
#                           - Case 2.2.2 : cost_with = FW(B, A, 0) + FW(A, C, 0)
#                           - inf + 4 = inf
#                          ~ best = min(-2, inf) = -2
#                       - return 2 + (-2) = 0
#
#                   - best = min(4, 0) = 0
#
#               ~ FW(A, C, 2) = 0
#
#               - FW(C, B, 2)
#                   - k = 2, intermediate = V[2-1] = 'B'            # Note: here intermediate = end ! thats ok
#
#                   - Case 2.3 : cost_without = FW(C, B, 1)
#                       - k = 1, intermediate = V[1-1] = 'A'
#                       - Case 2.3.1 : cost_without = FW(C, B, 0)
#                           - k = 0, is 'B' in adj['C'] ? No, cost = inf
#                       - Case 2.3.2 : cost_with = FW(C, A, 0) + FW(A, B, 0)
#                           - FW(C, A, 0) = inf (no path)
#                           - FW(A, B, 0) = 2 : (memo)
#                           ~ inf + 2 = inf
#
#                       - best = min(inf, inf) = inf    
#
#                   - Case 2.4 : cost_with = FW(C, B, 1) + FW(B, B, 1)
#                       - FW(C, B, 1) = inf : (memo)
#                       - FW(B, B, 1) = 0 : (start == end)
#                       ~ inf + 0 = inf
#                   
#                   ~ best = min(inf, inf) = inf
#
#               ~ FW(C, B, 2) = inf, so Case 2 : FW(A, C, 2) + FW(C, B, 2) = 0 + inf = inf
#
#
#
#           ~ best = min(Case 1, Case 2) = min(2, inf) = 2
#
#       res = { ('A', 'B'): 2 }
#       
#       
#       3. From 'A' to 'C' :    -> FW(A, C, 3) = 0
#           - etc ....
#
#
#   FINAL RESULT :
#       {
#       ('A','A'): 0,
#       ('A','B'): 2,
#       ('A','C'): 0,
#       ('B','A'): inf,
#       ('B','B'): 0,
#       ('B','C'): -2,
#       ('C','A'): inf,
#       ('C','B'): inf,
#       ('C','C'): 0
#       }
#
#   Actual output from the algorithm above :
#   {('A', 'A'): 0, ('A', 'B'): 2, 
#   ('A', 'C'): 0, ('B', 'A'): inf, 
#   ('B', 'B'): 0, ('B', 'C'): -2, 
#   ('C', 'A'): inf, ('C', 'B'): inf, 
#   ('C', 'C'): 0}
#
#
#





### Prim's:

<img src="img/prims.png" width="60%">

In [None]:
# Prim's Algorithm :
#
#   Goal : 
#       - "How can I build the cheapest network connecting all cities ?"
#
#       - build a minimum spanning tree (MST) from a connected, undirected graph with weighted edges.
#       - we want to reach each city exactly once, always adding the cheapest edge that leaves the tree built so far.
#       - e.g. connect all cities, keep the cheapest connections, throw out the expensive ones.
#       
#       - this differs from Dijkstra's, which finds the shortest path from a source to all nodes.
#       - here we minimize the combined cost of all the edges in the tree, not the shortest path from a source to a target.
#       
#       - Here we don't queue "cities" anymore, we operate on edges = tuple(start, end)
#       
#
#   Methodology :
#
#       1. Init
#           tree = []           : list of edges tuple(start, end) in the MST 
#           visited = set()     : to prevent cycles.
#           queue = [(0, s, s)] : min-heap priority queue of (weight, start, end)
#
#
#       2. While queue not empty :
#           - pop the edge (_, start, end) with the smallest weight from the queue (the heap guarantees it is the smallest)
#           
#           - if end is already visited : continue (skip this edge)
#           - add end to visited
#
#           - add edge to the tree if start != end
#
#           - for each neighbour of end :
#               - if neighbour not in visited : push the edge (weight, end, neighbour) to the queue.
#
#
#   Example (graph passed to prim() below, start = 0) :
#       - pop (0, 0, 0) : visit 0,                  push (1, 0, 1), (4, 0, 2)
#       - pop (1, 0, 1) : visit 1, add edge (0, 1), push (2, 1, 2), (6, 1, 3)
#       - pop (2, 1, 2) : visit 2, add edge (1, 2), push (3, 2, 3)
#       - pop (3, 2, 3) : visit 3, add edge (2, 3)
#       - pop (4, 0, 2) and (6, 1, 3) : end already visited, skipped
#       - tree = [(0, 1), (1, 2), (2, 3)]
#






import heapq

def prim(adj, s):
    tree = []
    visited = set()
    queue = [(0, s, s)]


    while queue:                                                    # O(E)
        _, start, end = heapq.heappop(queue)                        # O(log E)

        if end in visited: continue                                 # O(1)
        visited.add(end)                                            # O(1)
        if start != end: tree.append((start, end))                  # O(1) - O(V) total   


        for neighbour, weight in adj[end].items():                 # O(E)
            if neighbour not in visited:
                heapq.heappush(queue, (weight, end, neighbour))    # O(log E)

    return tree

prim({
    0: {1: 1, 2: 4},
    1: {0: 1, 2: 2, 3: 6},
    2: {0: 4, 1: 2, 3: 3},
    3: {1: 6, 2: 3},
}, 0)




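# ========== COMPLEXITY ==========
#
#   Line-by-Line Complexity :
#       - init tree, visited, queue :                       - O(1)
#
#       - while queue :                                     - at most E + 1 iterations (one per pushed edge)
#           heappop :                                       - O(log E)
#           visited checks and append :                     - O(1) each, only V pops pass the visited check
#
#           for neighbour in adj[end] :                     - O(E) in total (runs once per visited vertex)
#               heappush :                                  - O(log E)
#
#
#
#   ! Not log V but log E : the heap stores edges. Since E <= V^2, log E <= 2 log V, so both give the same big-O.
#
#   Sparse graph :
#       - Linear graph (linked list) : A - B - ... - V
#       - V : vertices
#       - E = V - 1
# 
#       Total Complexity :
#           - O(1) : inits
#           - O(V * log E) : each vertex is visited only once (total V useful pops).
#           - O(E * log E) : each edge is pushed at most once (total E pushes, hence at most E + 1 pops).
# 
#        ~ O((V + E) log E) = O(E log E) for a connected graph (E >= V - 1)
# 
#   Note :
#       - The teacher's bound O((V + E) log E) counts the same work :
#           V useful iterations of `while queue` (the ones that visit a new vertex) and E pushes.
#       - The remaining pops hit an already visited vertex and cost only O(log E) each.
#
#
#   Worst Case :
#       - Each vertex is visited once.
#       - Each edge is pushed at most once (when its first endpoint is visited), so the heap never holds more than E entries.
#       - Dense graph (E ~ V^2) : O(E log E) = O(V^2 log V)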
#


















[(0, 1), (1, 2), (2, 3)]
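
The returned tree can be sanity-checked on a graph this small. The sketch below sums the weights of the edges printed above and compares the total with a brute-force search over every set of V-1 edges. All helper names are introduced here for illustration, the brute force is exponential and only meant for tiny graphs, and `undirected_edges` assumes the vertex labels can be ordered (true for the integers used here).

```python
from itertools import combinations

def tree_weight(adj, tree):
    """Sum the weights of the (start, end) edges of a tree."""
    return sum(adj[start][end] for start, end in tree)

def undirected_edges(adj):
    """Collapse the two directions of each edge into one (small, large) key."""
    return {(min(u, v), max(u, v)): w for u in adj for v, w in adj[u].items()}

def is_spanning_tree(vertices, edges):
    """V-1 edges that connect every vertex form a spanning tree."""
    if len(edges) != len(vertices) - 1:
        return False
    reached = {next(iter(vertices))}
    grew = True
    while grew:
        grew = False
        for u, v in edges:
            if (u in reached) != (v in reached):   # edge leaves the reached set
                reached |= {u, v}
                grew = True
    return reached == set(vertices)

def brute_force_mst_weight(adj):
    """Try every set of V-1 edges and keep the cheapest spanning tree."""
    edges = undirected_edges(adj)
    return min(
        sum(edges[e] for e in combo)
        for combo in combinations(edges, len(adj) - 1)
        if is_spanning_tree(adj, combo)
    )

graph = {
    0: {1: 1, 2: 4},
    1: {0: 1, 2: 2, 3: 6},
    2: {0: 4, 1: 2, 3: 3},
    3: {1: 6, 2: 3},
}
tree = [(0, 1), (1, 2), (2, 3)]      # output of prim(graph, 0) shown above

print(tree_weight(graph, tree), brute_force_mst_weight(graph))   # 6 6
```

Both totals agree, so no cheaper spanning tree exists for this graph.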