<center style="font-size: 30px">

# DAA Assignment

</center>

- [Team Members](#team-members)
- [Introduction](#introduction)
- [Problem Statement](#problem-statement)
- [Algorithm](#algorithm)
  - [Ford Fulkerson](#Ford-Fulkerson)
  - [Segmented Least Squares](#Segmented-Least-Squares)
- [Helper Functions](#helper-function)
- [Analysis and Conclusions](#analysis-and-conclusions)

<a name="team-members"></a>

## Team Members

<style>
td, th {
   border: none!important;
   font-size: 20px;
}
</style>

| NAME                        | ID            |
| --------------------------- | ------------- |
| Milind Jain                 | 2020A7PS0153H |
| Mokshith Naidu Thakkilapati | 2020A7PS1885H |
| Anish Kumar Kallepalli      | 2020A7PS0282H |
| Sriram Srivatsan            | 2020A7PS0273H |


<a name="introduction"></a>

## Introduction

The maximum flow problem involves finding the maximum amount of flow that can be sent from a source node to a sink node in a directed graph, subject to capacity constraints on the edges. The Ford-Fulkerson algorithm is an algorithm for solving the maximum flow problem in a flow network.

The minimum s-t cut problem is closely related to the maximum flow problem: the value of a maximum flow is equal to the minimum capacity over all s-t cuts. This is known as the max-flow min-cut theorem.

A bipartite graph is a graph whose vertices can be divided into two disjoint sets such that every edge connects a vertex in one set to a vertex in the other set. The Bipartite Matching problem is a graph optimization problem that involves finding the largest possible matching between the two disjoint sets of vertices in a bipartite graph. A given bipartite graph can have more than one maximum matching.

Segmented Least Squares is the problem of finding the best piecewise linear approximation of a given set of data points. Dynamic programming is used to efficiently find the optimal partition and the coefficients of the linear functions. The technique is useful in a wide range of applications, including data compression, signal processing, and time series analysis.


<a name="problem-statement"></a>

## Problem Statements

### Problem Statement 1

We are supposed to implement the Ford-Fulkerson algorithm that was explained in class, and then implement the subroutine that finds the minimum s-t cut of a network flow graph. We are also supposed to use the Ford-Fulkerson algorithm to solve the Bipartite Matching problem, which can be stated as follows: given a bipartite graph G = (U, V, E), where U and V are the disjoint sets of vertices and E is the set of edges connecting vertices in U to vertices in V, find a maximum cardinality matching M, that is, a largest subset of E such that no two edges in M share a common endpoint. Finally, we are supposed to run the algorithm on different kinds of network flow graphs: smaller graphs to test the code and larger graphs to verify the robustness of the implementation.
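
The usual way to solve bipartite matching with a max-flow routine is to add a source connected to every vertex of U and a sink connected from every vertex of V, with all capacities equal to 1. The value of the maximum flow is then the size of a maximum matching. The sketch below only builds that network. It assumes vertices are numbered so that indices below `bi_index` form the left set, and the function name and the adjacency-matrix layout are illustrative rather than taken from the submitted code.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Builds a unit-capacity flow network for a bipartite graph.
// Assumed layout: vertices [0, bi_index) are the left set, [bi_index, n) the
// right set; two extra vertices are appended: source = n and sink = n + 1.
std::vector<std::vector<int>> build_matching_network(
    int n, int bi_index, const std::vector<std::pair<int, int>>& edges) {
    assert(0 <= bi_index && bi_index <= n);
    const int source = n;
    const int sink = n + 1;
    std::vector<std::vector<int>> cap(n + 2, std::vector<int>(n + 2, 0));

    // source -> every left vertex, capacity 1
    for (int u = 0; u < bi_index; ++u) cap[source][u] = 1;
    // every right vertex -> sink, capacity 1
    for (int v = bi_index; v < n; ++v) cap[v][sink] = 1;
    // left -> right for each edge of the bipartite graph, capacity 1
    for (const auto& e : edges) {
        assert(0 <= e.first && e.first < bi_index);
        assert(bi_index <= e.second && e.second < n);
        cap[e.first][e.second] = 1;
    }
    return cap;
}
```

Running a maximum flow computation from vertex `n` to vertex `n + 1` on the returned matrix gives the matching size, and the saturated left-to-right edges form the matching itself.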

### Problem Statement 2

Assume a set P of n points in the plane, labelled (x1,y1), (x2,y2), (x3,y3),..., (xn,yn), and suppose x1 < x2 < … < xn. We divide P into a number of segments, where a segment is a subset of P that covers a contiguous range of x-coordinates. For each segment S, we compute the line minimizing the error with respect to the points in S. Along with the error, we want to penalise having too many segments. The penalty of a partition is the sum of two terms: the number of segments into which P is divided times a constant multiplier C > 0, and, for each segment, the error value of the optimal line through that segment. Our aim is to find a partition with the least penalty.


<a name="algorithm"></a>

## Algorithm

<a name="Ford-Fulkerson"></a>

### Ford-Fulkerson

Starting with an initial flow of zero, the algorithm repeatedly finds augmenting paths in the residual graph, which is a graph that represents how much additional flow each edge can still carry given the current flow. In the residual graph, an augmenting path is a path from the source node to the sink node that has positive capacity on all of its edges. Once such a path has been identified, the algorithm increases the flow along it by the minimum capacity of the edges on the path (the bottleneck), effectively pushing more flow from the source node to the sink node. This process is repeated until no more augmenting paths can be found, at which point the flow is maximum.

There are several ways to find augmenting paths when using the Ford-Fulkerson algorithm, including breadth-first search and depth-first search. One important factor to take into account is that if the capacities are not integers (in particular, if they are irrational), the basic algorithm might not terminate or converge to the maximum flow. Hence we use an updated Ford-Fulkerson algorithm that uses the shortest augmenting path instead of an arbitrary augmenting path. This ensures that the algorithm terminates in a finite number of iterations, since the length of the shortest augmenting path never decreases from one iteration to the next, which bounds the total number of augmentations.

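The algorithm has a worst-case time complexity of $O(m^2 \log C)$, where $m$ is the number of edges in the flow network and $C$ is the largest edge capacity. This is the bound for the variant that only augments along edges whose residual capacity passes a threshold delta, which is halved over time. In practice, the algorithm tends to perform much better than this worst-case bound.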
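
The augmenting-path loop described in this section can be sketched as follows. This is a minimal illustration, not the submitted implementation: it assumes integer capacities stored in an adjacency matrix, distinct source and sink, and a threshold `delta` that starts at the largest power of two not exceeding the largest capacity. An edge is treated as usable when its residual capacity is at least `delta`, which is the usual capacity-scaling convention. The names `bfs_path` and `max_flow_scaling` are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <climits>
#include <cstddef>
#include <queue>
#include <vector>

// BFS from s to t using only residual edges with capacity >= delta.
// Returns the vertex sequence s..t, or an empty vector if t is unreachable.
std::vector<int> bfs_path(int s, int t, const std::vector<std::vector<int>>& res, int delta) {
    const int n = static_cast<int>(res.size());
    std::vector<int> parent(n, -1);
    std::queue<int> q;
    parent[s] = s;
    q.push(s);
    while (!q.empty() && parent[t] == -1) {
        int u = q.front();
        q.pop();
        for (int v = 0; v < n; ++v) {
            if (parent[v] == -1 && res[u][v] >= delta) {
                parent[v] = u;
                q.push(v);
            }
        }
    }
    if (parent[t] == -1) return {};
    std::vector<int> path;
    for (int v = t; v != s; v = parent[v]) path.push_back(v);
    path.push_back(s);
    std::reverse(path.begin(), path.end());
    return path;
}

// The capacity matrix is taken by value; the copy serves as the residual graph.
long long max_flow_scaling(std::vector<std::vector<int>> res, int s, int t) {
    assert(s != t);
    int max_cap = 0;
    for (const auto& row : res)
        for (int c : row) max_cap = std::max(max_cap, c);

    int delta = 1;
    while (delta <= max_cap / 2) delta *= 2;  // largest power of two <= max_cap

    long long flow = 0;
    for (; delta >= 1; delta /= 2) {
        while (true) {
            std::vector<int> path = bfs_path(s, t, res, delta);
            if (path.empty()) break;
            // Bottleneck: smallest residual capacity along the path.
            int bottleneck = INT_MAX;
            for (std::size_t k = 0; k + 1 < path.size(); ++k)
                bottleneck = std::min(bottleneck, res[path[k]][path[k + 1]]);
            // Decrease forward edges, increase backward edges.
            for (std::size_t k = 0; k + 1 < path.size(); ++k) {
                res[path[k]][path[k + 1]] -= bottleneck;
                res[path[k + 1]][path[k]] += bottleneck;
            }
            flow += bottleneck;
        }
    }
    return flow;
}
```

For example, on a 4-vertex network with capacities 0→1: 3, 0→2: 2, 1→2: 1, 1→3: 2 and 2→3: 3, calling `max_flow_scaling(cap, 0, 3)` returns 5, which matches the capacity of the cut separating vertex 0 from the rest.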


<a name="Segmented-Least-Squares"></a>

### Segmented Least Squares

The segmented least squares problem can be solved with dynamic programming.
First we calculate the error $e_{ij}$ of the best-fit line $y = ax + b$ through the points $p_i, p_{i+1}, \dots, p_j$ for every pair $i \le j$, using the formulas:

$$ e_{ij} = \sum_{k=i}^{j} (y_k - a x_k - b)^2 $$

$$ a = \frac{n_{ij}\sum_{k} x_k y_k - (\sum_{k} x_k)(\sum_{k} y_k)}{n_{ij}\sum_{k} x_k^2 - (\sum_{k} x_k)^2} $$

$$ b = \frac{\sum_{k} y_k - a\sum_{k} x_k}{n_{ij}} $$

Here every sum runs over $k = i, \dots, j$, and $n_{ij} = j - i + 1$ is the number of points in the segment.
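
As a minimal sketch of the error computation, the function below fits the least-squares line to the points with 0-based indices `i..j` and returns the slope, intercept and squared error. The struct and function names are illustrative. It assumes the x-coordinates are strictly increasing, so the denominator is non-zero whenever the segment has at least two points; a single-point segment is given a horizontal line through that point.

```cpp
#include <cassert>
#include <vector>

struct Fit {
    double a;      // slope
    double b;      // intercept
    double error;  // sum of squared residuals
};

// Least-squares line through points i..j (0-based, inclusive).
Fit fit_segment(const std::vector<double>& x, const std::vector<double>& y, int i, int j) {
    assert(x.size() == y.size());
    assert(0 <= i && i <= j && j < static_cast<int>(x.size()));
    const double n = j - i + 1;
    double sx = 0, sy = 0, sxy = 0, sxx = 0;
    for (int k = i; k <= j; ++k) {
        sx += x[k];
        sy += y[k];
        sxy += x[k] * y[k];
        sxx += x[k] * x[k];
    }
    double a = 0.0;  // single point: horizontal line through it
    if (j > i) a = (n * sxy - sx * sy) / (n * sxx - sx * sx);
    const double b = (sy - a * sx) / n;

    double error = 0.0;
    for (int k = i; k <= j; ++k) {
        const double r = y[k] - a * x[k] - b;
        error += r * r;
    }
    return {a, b, error};
}
```

For the collinear points (0,1), (1,3), (2,5) this gives slope 2, intercept 1 and zero error. For (0,0), (1,1), (2,0) the best line is horizontal at y = 1/3 with error 2/3.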

Let $OPT_j$ denote the minimum cost for points $p_1, p_2,\dots, p_j$, with $OPT_0 = 0$. Let $e_{ij}$ denote the minimum squared error for points $p_i, p_{i+1},\dots, p_j$. The crucial observation is that the last point $p_j$ for some subproblem $OPT_j$ belongs to a single segment in the optimal partition, and that segment begins at some earlier point $p_i$. Thus, if we knew $OPT_{i-1}$, we could compute $OPT_j$. This leads to the following recursive formulation:
$$ OPT_j = \min_{1 \le i \le j}(e_{ij} + C + OPT_{i-1}) $$
Here $C$ is the penalty for each segment and is taken as input. <br>
$OPT_n$ gives us the minimum penalty for all the $n$ points.
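
The recurrence can be evaluated bottom-up. The sketch below assumes the errors have already been computed and are passed in as a 0-based matrix `e`, where `e[i-1][j-1]` holds the error of the segment from point i to point j. The result type and function name are illustrative. Alongside each optimum it records the index i that achieved the minimum, so the segments can be recovered afterwards.

```cpp
#include <cassert>
#include <limits>
#include <vector>

struct SlsResult {
    std::vector<double> opt;  // opt[j] = minimum penalty for points 1..j, opt[0] = 0
    std::vector<int> first;   // first[j] = 1-based start of the last segment ending at j
};

// e is n x n and only entries with row <= column are read.
SlsResult segmented_least_squares(const std::vector<std::vector<double>>& e, double C) {
    const int n = static_cast<int>(e.size());
    SlsResult r{std::vector<double>(n + 1, 0.0), std::vector<int>(n + 1, 0)};
    for (int j = 1; j <= n; ++j) {
        assert(static_cast<int>(e[j - 1].size()) == n);
        r.opt[j] = std::numeric_limits<double>::infinity();
        for (int i = 1; i <= j; ++i) {
            // Last segment is i..j; everything before it costs opt[i-1].
            const double cost = e[i - 1][j - 1] + C + r.opt[i - 1];
            if (cost < r.opt[j]) {
                r.opt[j] = cost;
                r.first[j] = i;
            }
        }
    }
    return r;
}
```

With the error matrix given, the two nested loops take $O(n^2)$ time. The cost of filling the matrix depends on how each $e_{ij}$ is computed.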


<a name="helper-function"></a>

## Helper Functions


```
vector<int> get_path(int start, int end, vector<vector<int>> &res_adj, int del)
```

Function to find a path from the source vertex `start` to the sink vertex `end` using BFS, such that every edge on the path has a residual capacity greater than the threshold `del` (delta).
<br>Time complexity is $O(N+M')$, where $N$ is number of vertices and $M'$ is number of edges in the residual graph.


```
void augment(vector<int> &f, vector<int> &path, vector<vector<int>> &res_adj, map<array<int, 2>, int> &edg_to_i)
```

Function to update the adjacency matrix of the residual graph using the bottleneck of the path obtained: the weight of each forward edge on the path is reduced by the bottleneck, and the weight of the corresponding backward edge is increased by the bottleneck.
<br>Time complexity is $O(|P|)$, where $P$ is the path and $|P|$ is the length of the path.


```
void reach_from_source(int u, vector<bool> &vis, vector<vector<int>> &res_adj)
```

Function to mark all the vertices that can be reached from the source vertex. It iterates over all the vertices that are not yet visited and, whenever the residual graph has an edge from `u` to such a vertex, recursively calls itself on that vertex to find everything reachable from it. When run on the final residual graph, the visited vertices form the source side of a minimum s-t cut.
<br>Time complexity is $O(N+M')$, where $N$ is number of vertices and $M'$ is number of edges in the residual graph.
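
A minimal sketch of how this reachability search yields the cut edges, assuming both the original capacities and the final residual graph are stored as adjacency matrices. The names `mark_reachable` and `min_cut_edges` are illustrative and differ from the signature above.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Marks every vertex reachable from u through edges with positive residual capacity.
void mark_reachable(int u, const std::vector<std::vector<int>>& res, std::vector<bool>& vis) {
    vis[u] = true;
    for (int v = 0; v < static_cast<int>(res.size()); ++v) {
        if (!vis[v] && res[u][v] > 0) mark_reachable(v, res, vis);
    }
}

// Returns the original edges (u, v) that cross from the reachable side to the
// unreachable side. `res` must be the residual graph after a maximum flow.
std::vector<std::pair<int, int>> min_cut_edges(const std::vector<std::vector<int>>& cap,
                                               const std::vector<std::vector<int>>& res,
                                               int s) {
    assert(cap.size() == res.size());
    const int n = static_cast<int>(cap.size());
    std::vector<bool> vis(n, false);
    mark_reachable(s, res, vis);

    std::vector<std::pair<int, int>> cut;
    for (int u = 0; u < n; ++u) {
        for (int v = 0; v < n; ++v) {
            if (vis[u] && !vis[v] && cap[u][v] > 0) cut.emplace_back(u, v);
        }
    }
    return cut;
}
```

The capacities of the returned edges add up to the maximum flow value, which is the max-flow min-cut theorem in action. The recursion can become deep on long chains of vertices, so an explicit stack may be preferable for very large graphs.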


```
void find_segment(int j, vector<vector<int>> &segments, vector<int> &best_seg)
```

Function to recursively reconstruct the best set of segments for the first `j` points.
<br>Time complexity is $O(N)$, where $N$ is number of points.
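
One way such a reconstruction can work is sketched below. It assumes the dynamic program stored, for each j, the 1-based start index `first[j]` of the last segment in the optimal solution for points 1..j. The array name and the function name `collect_segments` are hypothetical and simpler than the signature above.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Appends the optimal segments for points 1..j to `out`, in left-to-right order.
// first[j] is the 1-based index where the last segment ending at j starts.
void collect_segments(int j, const std::vector<int>& first,
                      std::vector<std::pair<int, int>>& out) {
    if (j == 0) return;  // no points left
    assert(1 <= first[j] && first[j] <= j);
    collect_segments(first[j] - 1, first, out);  // segments before the last one
    out.emplace_back(first[j], j);               // then the last segment itself
}
```

Each call removes at least one point from consideration, so the recursion makes at most one call per point.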


<a name="analysis-and-conclusions"></a>

## Analysis and Conclusions


We have implemented the Ford-Fulkerson algorithm and used it to solve the min-cut problem and the bipartite matching problem. We also used dynamic programming to solve the Segmented Least Squares problem. For the latter, we used a visualizer program which allows us to view the points and the best-fit segments. These are some of the results we have gathered by running our code on many datasets obtained from multiple sources. For the sake of brevity, we do not present the results individually for every single dataset.

All the code was compiled with g++ (MinGW.org GCC Build-2) 9.2.0 and run on a PC with a 12th Gen Intel(R) Core(TM) i7-12700H processor at 2.30 GHz.
<br>


<a name="acp1"></a>

### Problem Statement 1


#### Example 1 Ford Fulkerson: <br>

<img src="images/q1.png" align="left" height="200" width="300"/>
<img src="images/sol1.png" align="right" height="200" width="300"/>
<div align="right">
 
</div>
<center> <-Question &emsp; Solution-><br><br>
Number of vertices: 4 <br>
Number of edges: 3 <br>
Maximum flow: 3 <br>
Time for running Ford Fulkerson: 0ms <br>

</center>
<br>
<br>


#### Example 2 Ford Fulkerson: <br>

<img src="images/q2.png" align="left" height="200" width="300"/>
<img src="images/sol2.png" align="right" height="200" width="300"/>
<div align="right">
 
</div>
<center> <-Question &emsp; Solution-><br><br>
Number of vertices: 4 <br>
Number of edges: 5 <br>
Maximum flow: 6 <br>
Time for running Ford Fulkerson: 0ms <br>

</center>
<br>
<br>


| Vertices | Edges | Avg Run Time  |
| -------- | ----- | ------------- |
| 4        | 3     | 0 ms          |
| 4        | 5     | 0 ms          |
| 67       | 154   | 13.549 ms     |
| 112      | 312   | 39.904 ms     |
| 233      | 478   | 87.476 ms     |
| 500      | 998   | 149.904 ms    |


#### Example 1 Bipartite Matching: <br>

<img src="images/qq1.png" align="left" height="300" width="300"/>
<img src="images/sols1.png" align="right" height="300" width="300"/>
<div align="right">
 
</div>
<center> <-Question &emsp; Solution-><br><br>
Number of vertices: 10 <br>
Number of edges: 9 <br>
Number of nodes in the left bipartite set (bi_index): 5 <br>
Time for running Bipartite Matching: 0ms <br><br>
<strong>If a vertex number is less than bi_index, the vertex is <br>
part of the left bipartite set; otherwise it is part of the right bipartite set.</strong><br>

</center>
<br>
<br>


#### Example 2 Bipartite Matching: <br>

<img src="images/qq2.png" align="left" height="300" width="300"/>
<img src="images/sols2.png" align="right" height="300" width="300"/>
<div align="right">
 
</div>
<center> <-Question &emsp; Solution-><br><br>
Number of vertices: 12 <br>
Number of edges: 8 <br>
bi_index: 6 <br>
Time for running Bipartite Matching: 0ms <br><br>
<strong>If a vertex number is less than bi_index, the vertex is <br>
part of the left bipartite set; otherwise it is part of the right bipartite set.</strong><br>

</center>
<br>
<br>


| Vertices | Edges | Avg Run time |
| -------- | ----- | ------------ |
| 10       | 9     | 0 ms         |
| 12       | 8     | 0 ms         |
| 67       | 39    | 11.7936 ms   |
| 112      | 96    | 31.055 ms    |
| 264      | 153   | 59.1638 ms   |
| 498      | 435   | 97.4668 ms   |


<br>
We can conclude from our observations that the time taken to run Ford-Fulkerson, or to solve Bipartite Matching, increases as the number of vertices or edges increases. The solutions shown in the example images are consistent with a correct implementation of the algorithms.


<a name="acp2"></a>

### Problem Statement 2


#### Example 1 Least Square Segments: <br>

<img src="images/1_10.png" align="right" height="200" width="300"/>
Number of Points: 10 <br>
Number of Segments: 2 <br>
Penalty per Segment: 200 <br>
Total Cost: 458.8139 <br>
Avg Run Time: 0.58ms <br>
<br>
<br>


#### Example 2 Least Square Segments: <br>

<img src="images/2_50.png" align="right" height="200" width="300"/>
Number of Points: 50 <br>
Number of Segments: 20 <br>
Penalty per Segment: 5 <br>
Total Cost: 106.7701 <br>
Avg Run Time: 2ms <br>
<br>
<br>


#### Example 3 Least Square Segments: <br>

<img src="images/3_50.png" align="right" height="200" width="300"/>
Number of Points: 50 <br>
Number of Segments: 9 <br>
Penalty per Segment: 50 <br>
Total Cost: 680.7989 <br>
Avg Run Time: 2.013 ms <br>
<br>
<br>


#### Example 4 Least Square Segments: <br>

<img src="images/4_200.png" align="right" height="200" width="300"/>
Number of Points: 200 <br>
Number of Segments: 70 <br>
Penalty per Segment: 10 <br>
Total Cost: 811.9039 <br>
Avg Run Time: 76.034 ms <br>
<br>
<br>


#### Example 5 Least Square Segments: <br>

<img src="images/5_500.png" align="right" height="200" width="300"/>
Number of Points: 500 <br>
Number of Segments: 177 <br>
Penalty per Segment: 10 <br>
Total Cost: 2045.6983 <br>
Avg Run Time: 1134.924 ms <br>
<br>
<br>


#### Example 6 Least Square Segments: <br>

<img src="images/6_1000.png" align="right" height="200" width="300"/>
Number of Points: 1000 <br>
Number of Segments: 109 <br>
Penalty per Segment: 100 <br>
Total Cost: 21490.8284 <br>
Avg Run Time: 8860.5 ms <br>
<br>
<br>


| Points | Segments | Penalty per Segment | Total Cost | Avg Run time |
| ------ | -------- | --------------- | ---------- | ------------ |
| 10     | 2        | 200             | 458.8139   | 0.58 ms      |
| 50     | 20       | 5               | 106.7701   | 2 ms         |
| 50     | 9        | 50              | 680.7989   | 2.013 ms     |
| 200    | 70       | 10              | 811.9039   | 76.034 ms    |
| 500    | 177      | 10              | 2045.6898  | 1134.924 ms  |
| 1000   | 109      | 100             | 21490.8284 | 8860.5 ms    |


<br>
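We can conclude from our observations that our dynamic programming based algorithm finds the minimum total penalty and can also be used to recover the segments themselves. The run time grows quickly with the number of points: going from 500 to 1000 points increases the average run time from about 1.1 s to about 8.9 s.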
