# Introduction

Throughout the course, we've seen how data storage and data processing mechanisms can be reduced to operations on graphs. In particular, we've learnt a few graph analysis algorithms, such as Dijkstra's algorithm, the PageRank algorithm and the Girvan-Newman algorithm. This tutorial will introduce you to two useful techniques, the Ford-Fulkerson algorithm and the Edmonds-Karp algorithm, for computing the maximum flow in a flow network. To get started, let's go over a few concepts first.


# Flow network

A flow network G = (V,E) is a directed graph in which each edge (u,v) ∈ E has a nonnegative capacity c(u,v) ≥ 0. We also require that if E contains an edge (u,v), then there is no edge (v,u) in the reverse direction. 

We pick out two vertices in a flow network: the source s and the sink t. We can assume that each vertex lies on some path from the source to the sink. That is, for each vertex v ∈ V , the flow network contains a path s -> v -> t. The graph is therefore connected and, since each vertex other than s has at least one entering edge, |E| ≥ |V| - 1.

Here's an example of a flow network:
![title](network_flow.png)
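
One simple way to hold such a network in Python is a dictionary that maps each edge (u, v) to its capacity. The sketch below is only an illustration: the vertex names, the capacities and the helpers `reachable` and `check_flow_network` are made up for this example and are not taken from the figure. It checks the assumptions stated above: nonnegative capacities, no reverse edges, and every vertex lying on some path from s to t.

```python
def reachable(start, neighbours):
    """Return the set of vertices reachable from `start` via `neighbours`."""
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for v in neighbours.get(u, ()):
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen


def check_flow_network(capacity, source, sink):
    """Return True if `capacity` satisfies the flow-network assumptions."""
    vertices = {x for edge in capacity for x in edge}
    out_nbrs, in_nbrs = {}, {}
    for (u, v), c in capacity.items():
        # capacities must be nonnegative and no edge may have a reverse twin
        if c < 0 or (v, u) in capacity:
            return False
        out_nbrs.setdefault(u, []).append(v)
        in_nbrs.setdefault(v, []).append(u)
    # every vertex must be reachable from s and must be able to reach t
    return (reachable(source, out_nbrs) >= vertices
            and reachable(sink, in_nbrs) >= vertices)


# Hypothetical example network (not the one in the figure)
capacity = {
    ("s", "a"): 3, ("s", "b"): 2,
    ("a", "b"): 1, ("a", "t"): 2,
    ("b", "t"): 3,
}
print(check_flow_network(capacity, "s", "t"))  # True
```

Here |V| = 4 and |E| = 5, so |E| ≥ |V| - 1 holds as expected.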


# Flow

A flow in G is a function f : V × V → ℝ that satisfies the following two constraints:

Capacity constraint: For all u, v ∈ V, 0 ≤ f(u, v) ≤ c(u, v).

Flow conservation: For all u ∈ V - {s,t}, ∑(v ∈ V) f(v, u) = ∑(v ∈ V) f(u, v). That is, the total flow into u is the same as the total flow out of u.

We call the nonnegative quantity f(u, v) the flow from vertex u to vertex v. The value |f| of a flow f is |f| = ∑(v ∈ V) f(s, v) - ∑(v ∈ V) f(v, s). That is, the total flow out of the source minus the total flow into the source.

In the maximum-flow problem, we are given a flow network G with source s and sink t, and we wish to find a flow of maximum value.
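
To make the two constraints concrete, here is a minimal sketch that checks them for a candidate flow and computes its value. The network, the flow and the function names `is_valid_flow` and `flow_value` are assumptions made for illustration only; edges are stored as a dictionary from (u, v) to a number.

```python
def is_valid_flow(capacity, flow, source, sink):
    """Check the capacity constraint and flow conservation."""
    # flow may only be placed on edges of the network
    if any(edge not in capacity for edge in flow):
        return False
    # capacity constraint: 0 <= f(u, v) <= c(u, v)
    for edge, c in capacity.items():
        if not 0 <= flow.get(edge, 0) <= c:
            return False
    # flow conservation at every vertex other than s and t
    vertices = {x for edge in capacity for x in edge}
    for u in vertices - {source, sink}:
        inflow = sum(f for (a, b), f in flow.items() if b == u)
        outflow = sum(f for (a, b), f in flow.items() if a == u)
        if inflow != outflow:
            return False
    return True


def flow_value(flow, source):
    """Total flow out of the source minus total flow into the source."""
    out_of_s = sum(f for (u, v), f in flow.items() if u == source)
    into_s = sum(f for (u, v), f in flow.items() if v == source)
    return out_of_s - into_s


# Hypothetical network and flow
capacity = {("s", "a"): 3, ("s", "b"): 2, ("a", "b"): 1,
            ("a", "t"): 2, ("b", "t"): 3}
flow = {("s", "a"): 2, ("s", "b"): 2, ("a", "b"): 0,
        ("a", "t"): 2, ("b", "t"): 2}

print(is_valid_flow(capacity, flow, "s", "t"))  # True
print(flow_value(flow, "s"))                    # 4
```

This flow is valid but not maximum: the same network also admits a flow of value 5.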


# The Ford-Fulkerson Algorithm

The Ford-Fulkerson algorithm solves the maximum-flow problem by iteratively increasing the value of the flow. We start with f(u, v) = 0 for all u, v ∈ V, giving an initial flow of value 0. At each iteration, we increase the flow value in G by finding an augmenting path in an associated residual network Gf. To understand this, we need to formally define the residual network and augmenting paths.

Residual network: Suppose we are given a flow network G = (V,E) with source s and sink t, and a flow f in G. The residual network Gf has the same vertices as G, and its edge capacities are defined as follows:
![title](capacity.png)

Augmenting path: The augmenting path is just a path from s to t in the residual network.
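
As a small illustration, the sketch below computes residual capacities for a flow. It assumes the standard textbook definition of residual capacity: c(u,v) - f(u,v) for an edge (u,v) of G, f(v,u) for the reverse of an edge of G, and 0 otherwise. The function name `residual_capacities` and the example data are hypothetical.

```python
def residual_capacities(capacity, flow):
    """Return the edges of the residual network with positive capacity."""
    residual = {}
    for (u, v), c in capacity.items():
        f = flow.get((u, v), 0)
        if c - f > 0:
            residual[(u, v)] = c - f  # room left to push more flow forward
        if f > 0:
            residual[(v, u)] = f      # flow that could be cancelled
    return residual


# Hypothetical network and flow
capacity = {("s", "a"): 3, ("s", "b"): 2, ("a", "b"): 1,
            ("a", "t"): 2, ("b", "t"): 3}
flow = {("s", "a"): 2, ("s", "b"): 2, ("a", "b"): 0,
        ("a", "t"): 2, ("b", "t"): 2}

residual = residual_capacities(capacity, flow)
for edge in sorted(residual):
    print(edge, residual[edge])
```

In this residual network, s -> a -> b -> t is an augmenting path: each of its three edges has residual capacity 1.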

Once we know the edges of an augmenting path in Gf, we push as much flow along the path as its smallest residual capacity allows, and update the flow on the corresponding edges of G. Although each iteration of Ford-Fulkerson increases the value of the flow, the flow on any particular edge of G may increase or decrease; decreasing the flow on some edges may be necessary in order to send more flow from the source to the sink. We repeatedly augment the flow until the residual network has no more augmenting paths.
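
The following sketch shows one augmentation step in isolation, including a case where the flow on an edge of G decreases. It assumes the standard residual capacities (leftover capacity on forward edges, current flow on reverse edges); the helper `augment` and the tiny network are made up for this example.

```python
def augment(capacity, flow, path):
    """Push the bottleneck amount along `path`, a list of residual edges.

    Returns the new flow and the amount pushed.
    """
    def residual(u, v):
        if (u, v) in capacity:              # forward edge of G
            return capacity[(u, v)] - flow.get((u, v), 0)
        return flow.get((v, u), 0)          # reverse of an edge of G

    bottleneck = min(residual(u, v) for u, v in path)
    new_flow = dict(flow)
    for u, v in path:
        if (u, v) in capacity:
            # forward edge: send more flow along it
            new_flow[(u, v)] = new_flow.get((u, v), 0) + bottleneck
        else:
            # reverse edge: cancel flow on the original edge (v, u)
            new_flow[(v, u)] = new_flow.get((v, u), 0) - bottleneck
    return new_flow, bottleneck


# Hypothetical network where every edge has capacity 1
capacity = {("s", "a"): 1, ("s", "b"): 1, ("a", "b"): 1,
            ("a", "t"): 1, ("b", "t"): 1}
# Current flow: one unit along s -> a -> b -> t
flow = {("s", "a"): 1, ("a", "b"): 1, ("b", "t"): 1}

# Augmenting path s -> b -> a -> t uses the reverse of edge (a, b)
new_flow, pushed = augment(capacity, flow, [("s", "b"), ("b", "a"), ("a", "t")])
print(pushed)                 # 1
print(new_flow[("a", "b")])   # 0
```

The flow value rises from 1 to 2, while the flow on edge (a, b) drops from 1 to 0.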

Here's a Python implementation of the algorithm:

In [None]:
class Edge(object):
    def __init__(self, u, v, w):
        # the edge goes from u to v and has weight w
        self.source = u
        self.sink = v  
        self.capacity = w
        
        # give a way to represent the edge
    def __repr__(self):
        return "%s->%s:%s" % (self.source, self.sink, self.capacity)

class FordFulkerson(object):
    def __init__(self):
        self.adj = {}
        self.flow = {}
        
    def num_vertex(self):
        return len(self.adj)
 
    def add_vertex(self, vertex):
        self.adj[vertex] = []
 
    def get_edges(self, v):
        return self.adj[v]
 
    def add_edge(self, u, v, w=0):
        if u == v:
            # we can't allow the source and destination to be the same
            raise ValueError("u == v")
        # initialize edges in both directions
        edge = Edge(u,v,w)
        redge = Edge(v,u,0)
        edge.redge = redge
        redge.redge = edge
        
        self.adj[u].append(edge)
        self.adj[v].append(redge)
        
        self.flow[edge] = 0
        self.flow[redge] = 0
 
    def find_path(self, source, sink, path):
        if source == sink:
            return path
        # perform a dfs based on the residual graph
        for edge in self.get_edges(source):
            # flow can be in same or opposite direction as path
            # residual can be positive in either case
            residual = edge.capacity - self.flow[edge]
            if residual > 0 and edge not in path:
                result = self.find_path(edge.sink, sink, path + [edge])
                if result is not None:
                    return result
        # no augmenting path starts from this vertex
        return None
 
    def max_flow(self, source, sink):
        path = self.find_path(source, sink, [])
        # check to see if there's an augmenting path
        # in the residual graph
        while path is not None:
            # the bottleneck is the smallest residual capacity on the path
            flow = min(edge.capacity - self.flow[edge] for edge in path)
            for edge in path:
                self.flow[edge] += flow
                self.flow[edge.redge] -= flow
            path = self.find_path(source, sink, [])
        flow = 0
        for edge in self.get_edges(source):
            flow += self.flow[edge]
        return flow

# Analysis of Ford-Fulkerson

Let's assume that all capacities are integers, so that the maximum flow value |f| is an integer too. Then the while loop used to find augmenting paths can be executed at most |f| times, because each iteration increases the flow value by at least 1. In each iteration, we need to perform a DFS, which takes O(V+E) time when visited vertices are marked. Since we know the graph is connected, |E| ≥ |V| - 1 and so O(V+E) = O(E). The total time used to run Ford-Fulkerson is therefore O(E*|f|).


# The Edmonds-Karp algorithm

We can improve the Ford-Fulkerson algorithm by using BFS instead of DFS to find the augmenting path, so that each augmenting path is a shortest path from s to t in the residual network. Specifically, we change the find_path and max_flow functions as follows:

In [None]:
class EdmondsKarp(object):
    def __init__(self): #same as FordFulkerson
        self.adj = {}
        self.flow = {}
        
    def num_vertex(self): #same as FordFulkerson
        return len(self.adj)
 
    def add_vertex(self, vertex): #same as FordFulkerson
        self.adj[vertex] = []
 
    def get_edges(self, v): #same as FordFulkerson
        return self.adj[v]
 
    def add_edge(self, u, v, w=0): #same as FordFulkerson
        if u == v:
            # we can't allow the source and destination to be the same
            raise ValueError("u == v")
        # initialize edges in both directions
        edge = Edge(u,v,w)
        redge = Edge(v,u,0)
        edge.redge = redge
        redge.redge = edge
        
        self.adj[u].append(edge)
        self.adj[v].append(redge)
        
        self.flow[edge] = 0
        self.flow[redge] = 0
 
    def find_path(self, source, sink): #DIFFERENT from FordFulkerson, using BFS
        # parent[v] is the edge used to reach v (None means not visited yet)
        parent = {v: None for v in self.adj}
        # prevent source from being visited by assigning it some value not None
        parent[source] = Edge(source, source, 0)

        # capacity[v] is the bottleneck residual capacity from source to v
        capacity = {v: 0 for v in self.adj}
        # there's no constraint on the flow out from source,
        # so make it infinite so that it doesn't interfere with
        # capacities of nodes down the graph
        capacity[source] = float('inf')
        
        # perform a bfs based on the residual graph
        queue = []
        queue.append(source)
        while len(queue)>0:
            u = queue.pop(0)
            for edge in self.get_edges(u):
                residual = edge.capacity - self.flow[edge]
                #check that the node hasn't been visited before
                #and that there's a positive capacity towards the node
                if residual > 0 and parent[edge.sink] is None:
                    parent[edge.sink] = edge
                    #check if the current edge is the critical edge
                    capacity[edge.sink] = min(capacity[edge.source], residual)
                    if edge.sink!=sink:
                        queue.append(edge.sink)
                    else:
                        return capacity[sink], parent
        return 0, parent
 
    def max_flow(self, source, sink): #DIFFERENT from FordFulkerson
        flow, parent = self.find_path(source, sink)
        # check to see if there's an augmenting path
        # in the residual graph
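        while flow != 0:
            # walk back from the sink to the source along the BFS tree,
            # updating only the edges on the augmenting path
            v = sink
            while v != source:
                edge = parent[v]
                self.flow[edge] += flow
                self.flow[edge.redge] -= flow
                v = edge.source
            flow, parent = self.find_path(source, sink)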
        flow = 0
        for edge in self.get_edges(source):
            flow += self.flow[edge]
        return flow

# Analysis of Edmonds-Karp

To analyze the complexity of Edmonds-Karp, let's first define a useful concept and acknowledge a lemma.

Let's define the distance of v from the source s as the number of edges on a shortest path from s to v in the residual network. This is the length of the path that BFS finds from s to v.

Lemma: If the Edmonds-Karp algorithm is run on a flow network G = (V,E) with source s and sink t, then for all vertices v ∈ V - {s,t}, the distance of v in the residual network Gf increases monotonically with each flow augmentation, that is, it never decreases. (The proof of this lemma can be found in Introduction to Algorithms chapter 26 under lemma 26.7.)

With this lemma let's construct the following argument.

In each flow augmentation, at least one edge on the augmenting path has its residual capacity used up entirely. Let's call such an edge a critical edge, and say edge e = (u,v) becomes critical in some flow augmentation. After this augmentation, e no longer exists in the residual graph because its residual capacity is zero. In order for e to become a critical edge again, its flow first needs to be relieved by a later augmentation that uses the reverse edge (v,u), and then e must again be filled to full capacity in another augmentation. Because every augmenting path is a shortest path and distances never decrease, the distance of u from the source s must have increased by at least 2 between these two events. Since the distance of any vertex from s is less than |V|, the total number of times edge e = (u,v) can become critical is at most |V|/2. The residual graph has at most 2|E| edges and there's at least one critical edge in each augmentation, so there can be at most 2|E| * |V|/2, or O(V*E), augmentations. We know that the time to run a BFS in a connected graph is O(E). So the total run time is O(E)*O(VE) = O(VE^2).

As we can see, Edmonds-Karp's complexity doesn't depend on the maximum flow value. It is therefore much more efficient than Ford-Fulkerson whenever the maximum flow value is large compared with the size of the graph. To see the advantage of Edmonds-Karp over Ford-Fulkerson, let's look at the following example.

![title](advantage.png)

In this figure, a through c show the residual network after each flow augmentation. The shaded path represents the augmenting path in each residual graph. If we run Ford-Fulkerson on this graph, it could take ϴ(E|f|) time, where |f| = 2,000,000. Specifically, depending on the order in which DFS explores the edges, Ford-Fulkerson could get stuck cycling through a->b->c->a 1,000,000 times.

However, running Edmonds-Karp on this graph takes at most O(VE^2) time, regardless of |f|. In this case |V| and |E| are extremely small compared with |f|, so we get a large improvement in running time.
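
To see this numerically, the sketch below counts the augmentations a BFS-based search needs. Since the figure is not reproduced in code, the network here is an assumption: the classic four-vertex example with two routes of capacity 1,000,000 and a cross edge of capacity 1, with hypothetical vertex names `u` and `v`. The function `edmonds_karp` is a compact stand-alone version written for this illustration, not the class above.

```python
from collections import deque


def edmonds_karp(capacity, source, sink):
    """Return (max flow value, number of augmentations) using BFS paths."""
    residual = dict(capacity)
    adj = {}
    for (u, v) in capacity:
        residual.setdefault((v, u), 0)      # reverse edge starts empty
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)

    total, augmentations = 0, 0
    while True:
        # BFS from the source over edges with positive residual capacity
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v in adj.get(u, ()):
                if v not in parent and residual[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return total, augmentations

        # recover the path by walking back from the sink
        path = []
        v = sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]

        bottleneck = min(residual[edge] for edge in path)
        for u, v in path:
            residual[(u, v)] -= bottleneck
            residual[(v, u)] += bottleneck
        total += bottleneck
        augmentations += 1


# Assumed network: two large routes joined by a capacity-1 cross edge
capacity = {
    ("s", "u"): 1_000_000, ("s", "v"): 1_000_000,
    ("u", "v"): 1,
    ("u", "t"): 1_000_000, ("v", "t"): 1_000_000,
}
print(edmonds_karp(capacity, "s", "t"))  # (2000000, 2)
```

On this assumed network BFS never routes flow through the cross edge, so two augmentations are enough, whereas a badly ordered DFS could need 2,000,000 augmentations of one unit each.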


# Real World Application of Maximum Flow

Let's consider the following real-world example. Some factories need to deliver food to some villages. A network of roads connects the factories and the villages. Each road has a certain capacity, the maximum amount of goods that can travel through it. We need to find out whether there is a circulation that satisfies the demand. This problem can be reduced to a simple maximum-flow problem in the following steps:

1. Add a source node s and add edges from it to every factory node f. The edges should have capacity p where p is the production rate of factory f.
2. Add a sink node t and add edges from all villages v to t.  The edges should have capacity d where d is the demand rate of village v.
3. Let G = (V, E) be this new network. There exists a circulation that satisfies the demand if and only if the maximum flow value is equal to the total demand of all villages.

If the demand can be satisfied, we can look at the flow to figure out the amount of goods that needs to be transferred along each road.
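
The reduction above can be sketched in a few lines. Everything concrete here is an assumption for illustration: the factory and village names, the numbers, the function names `max_flow_value` and `demand_is_satisfiable`, and the choice of "s" and "t" as labels for the added source and sink (they must not clash with real place names). The max-flow routine is a compact BFS-based one written for this example.

```python
from collections import deque


def max_flow_value(capacity, source, sink):
    """Compact BFS-based max flow; returns only the flow value."""
    residual = dict(capacity)
    adj = {}
    for (u, v) in capacity:
        residual.setdefault((v, u), 0)
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)
    total = 0
    while True:
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v in adj.get(u, ()):
                if v not in parent and residual[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return total
        path, v = [], sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(residual[edge] for edge in path)
        for u, v in path:
            residual[(u, v)] -= bottleneck
            residual[(v, u)] += bottleneck
        total += bottleneck


def demand_is_satisfiable(roads, production, demand):
    """Apply steps 1 to 3: add s and t, then compare max flow to demand."""
    capacity = dict(roads)
    for factory, rate in production.items():
        capacity[("s", factory)] = rate     # step 1: source -> factory
    for village, need in demand.items():
        capacity[(village, "t")] = need     # step 2: village -> sink
    # step 3: feasible iff the max flow equals the total demand
    return max_flow_value(capacity, "s", "t") == sum(demand.values())


# Hypothetical data
roads = {("F1", "V1"): 4, ("F1", "V2"): 2, ("F2", "V2"): 3}
production = {"F1": 5, "F2": 3}

print(demand_is_satisfiable(roads, production, {"V1": 4, "V2": 4}))  # True
print(demand_is_satisfiable(roads, production, {"V1": 4, "V2": 5}))  # False
```

In the second call the total demand of 9 exceeds the total production of 8, so no flow can saturate all the edges into t.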


# Further Resources

Many other graph problems, and even combinatorial problems, can be solved using maximum-flow algorithms. For further reading, I suggest the 26th chapter of the book Introduction to Algorithms. The book provides proofs of the correctness and time complexity of the algorithms in this tutorial and introduces many more interesting problems that can be solved with maximum flow, such as the maximum-bipartite-matching problem, the minimum-path-cover problem and many more.