# Introduction to Computation and Python Programming

## Lecture 11

### Today
----------

- Knapsack and Graph Optimization Problems

### Optimization problems

- An optimization problem has two parts
    * An **objective function** that is to be maximized or minimized
    * A **set of constraints** (possibly empty) that must be honored

### Main lessons to learn here

- Many problems of real importance can be formulated in a simple way that leads naturally to a computational problem
- Reducing a seemingly new problem to an instance of a well-known problem allows one to use preexisting solutions
- Knapsack problems and graph problems are classes of problems to which other problems can often be reduced
- Exhaustive enumeration algorithms provide a simple, but often computationally intractable, way to search for optimal solutions
- A greedy algorithm is often a practical approach to finding a pretty good, but not always optimal, solution to an optimization problem

### Knapsack problems

- A burglar with a knapsack that can hold at most 20 kg breaks into a house

| Item | Value | Weight (kg) | Value/Weight |
|------|-------|-------------|--------------|
|Clock| 175 | 10 | 17.5 |
|Painting| 90 | 9 | 10 |
|Radio| 20 | 4 | 5 |
|Vase| 50 | 2 | 25 |
|Book| 10 | 1 | 10 |
|Computer| 200 | 20 | 10 |

- What does he take? What does he leave behind?


### Greedy Algorithm

- Best item first, next best next, and so on
- But what is "best"?
    * Most valuable? Least heavy? Highest value-weight ratio?
- Notice that, of the three, greedy-by-density gives the best result here (255, versus 200 for greedy-by-value and 170 for greedy-by-weight)
- BUT there is no guarantee that greedy-by-density will give a better solution than greedy-by-weight or greedy-by-value
- More generally, for the knapsack problem, a greedy algorithm is not guaranteed to find an optimal solution

##### Code
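
The three notions of "best" can be compared with a short sketch. The `Item` record, the `greedy` function, and its `key` parameter are names assumed here for illustration; the values and weights come from the burglar table. The sketch assumes that an item that no longer fits is skipped and the scan continues.

```python
from collections import namedtuple

# Hypothetical item record; values and weights are taken from the table.
Item = namedtuple("Item", ["name", "value", "weight"])

ITEMS = [
    Item("clock", 175, 10),
    Item("painting", 90, 9),
    Item("radio", 20, 4),
    Item("vase", 50, 2),
    Item("book", 10, 1),
    Item("computer", 200, 20),
]


def greedy(items, max_weight, key):
    """Take items in descending order of key(item), skipping any that do not fit."""
    taken, total_value, total_weight = [], 0, 0
    for item in sorted(items, key=key, reverse=True):
        if total_weight + item.weight <= max_weight:
            taken.append(item)
            total_value += item.value
            total_weight += item.weight
    return taken, total_value


by_value = greedy(ITEMS, 20, lambda it: it.value)                # most valuable first
by_weight = greedy(ITEMS, 20, lambda it: 1 / it.weight)          # lightest first
by_density = greedy(ITEMS, 20, lambda it: it.value / it.weight)  # best ratio first

print(by_value[1], by_weight[1], by_density[1])  # 200 170 255
```

Passing the ordering as a `key` function keeps the greedy loop itself identical across the three strategies; only the definition of "best" changes.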

### An optimal solution to the 0/1 Knapsack problem

Formally, the **0/1 knapsack problem** is:
- Each item is represented by a pair, $<value, weight>$
- A knapsack can accommodate items with a total weight of no more than $w$
- A vector, $I$, of length $n$, represents the set of available items. Each element of the vector is an item
- A vector, $V$, of length $n$, is used to indicate whether or not each item is taken by the burglar. If $V[i] = 1$, item $I[i]$ is taken. If $V[i] = 0$, item $I[i]$ is not taken
- Find a $V$ that maximizes

\begin{equation*}
\sum_{i=0}^{n-1} V[i] \cdot I[i].value
\end{equation*}

subject to the constraint that

\begin{equation*}
\sum_{i=0}^{n-1} V[i] \cdot I[i].weight \le w
\end{equation*}
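
As a small worked instance of the formulation, the sketch below evaluates the objective and the constraint for one particular choice of $V$ on the burglar example. The helper names (`total_value`, `total_weight`, `is_feasible`) are assumptions made for illustration, and items are stored as plain `(value, weight)` tuples rather than objects.

```python
# Items as (value, weight) pairs, in the table's order:
# clock, painting, radio, vase, book, computer.
I = [(175, 10), (90, 9), (20, 4), (50, 2), (10, 1), (200, 20)]
w = 20  # knapsack capacity


def total_value(V, I):
    """Objective: sum of V[i] * I[i].value."""
    return sum(v * value for v, (value, _) in zip(V, I))


def total_weight(V, I):
    """Left-hand side of the constraint: sum of V[i] * I[i].weight."""
    return sum(v * weight for v, (_, weight) in zip(V, I))


def is_feasible(V, I, w):
    """True if the selection honors the weight constraint."""
    return total_weight(V, I) <= w


V = [1, 1, 0, 0, 1, 0]  # take the clock, the painting, and the book
print(total_value(V, I), total_weight(V, I), is_feasible(V, I, w))  # 275 20 True
```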

### Enumeration

1. Enumerate all possible combinations of items, i.e., generate the power set of the set of items
2. Remove all of the combinations whose weight exceeds the allowed weight
3. From the remaining combinations choose any one whose value is the largest

##### Code
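
The three steps above can be sketched as follows. This is a minimal illustration, not necessarily the lecture's own code: `powerset` and `choose_best` are assumed names, items are `(name, value, weight)` tuples from the burglar table, and the power set is built with `itertools.combinations`.

```python
from itertools import combinations

# (name, value, weight) tuples from the burglar example.
ITEMS = [
    ("clock", 175, 10),
    ("painting", 90, 9),
    ("radio", 20, 4),
    ("vase", 50, 2),
    ("book", 10, 1),
    ("computer", 200, 20),
]


def powerset(items):
    """Step 1: yield every subset of items (2**n subsets in total)."""
    for size in range(len(items) + 1):
        yield from combinations(items, size)


def choose_best(items, max_weight):
    """Steps 2 and 3: discard overweight subsets, keep one of maximum value."""
    best, best_value = (), 0
    for combo in powerset(items):
        weight = sum(wt for _, _, wt in combo)
        if weight > max_weight:
            continue  # violates the constraint
        value = sum(val for _, val, _ in combo)
        if value > best_value:
            best, best_value = combo, value
    return best, best_value


best, best_value = choose_best(ITEMS, 20)
print([name for name, _, _ in best], best_value)
# ['clock', 'painting', 'book'] 275
```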

### Exhaustive Search is hopeless

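- $O(n \cdot 2^n)$ complexity: $n$ items have $2^n$ subsets, and each subset takes $O(n)$ time to evaluate
- Notice that enumeration gives us a better solution than the greedy algorithms (275 versus at best 255 for the burglar example)
- Greedy algorithms make **locally optimal** choices, but the solution needs to be **globally optimal**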

### Graph Optimization Problems

- Example: Given a list of the prices of all airline flights between each pair of cities
    * What is the smallest number of stops between some pair of cities?
    * What is the least expensive airfare between some pair of cities?
    * What is the least expensive airfare between some pair of cities involving no more than two stops?
    * What is the least expensive way to visit some collection of cities?
- Formally: A **graph** is a set of objects called **nodes** (or **vertices**) connected by a set of **edges** (or **arcs**). If the edges are unidirectional, the graph is called a **directed graph** or **digraph**
- In a directed graph, if there is an edge from $n1$ to $n2$, we refer to $n1$ as the **source** or **parent node** and $n2$ as the **destination** or **child node**

### Graph Data Structures

##### Code
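
A minimal sketch of the adjacency-list representation discussed in this section. The class names `Digraph` and `Graph` and the method `addEdge` come from the notes; `addNode`, `childrenOf`, and the internal `edges` dictionary are assumptions made for illustration.

```python
class Digraph:
    """Directed graph stored as an adjacency list: node -> list of children."""

    def __init__(self):
        self.edges = {}

    def addNode(self, node):
        if node in self.edges:
            raise ValueError("Duplicate node")
        self.edges[node] = []

    def addEdge(self, src, dest):
        if src not in self.edges or dest not in self.edges:
            raise ValueError("Node not in graph")
        self.edges[src].append(dest)

    def childrenOf(self, node):
        return self.edges[node]


class Graph(Digraph):
    """Undirected graph: each edge is recorded in both directions."""

    def addEdge(self, src, dest):
        Digraph.addEdge(self, src, dest)
        Digraph.addEdge(self, dest, src)


d = Digraph()
g = Graph()
for graph in (d, g):
    graph.addNode("a")
    graph.addNode("b")
    graph.addEdge("a", "b")

print(d.childrenOf("b"), g.childrenOf("b"))  # [] ['a']
```

Because `Graph` only overrides `addEdge`, any client code written against `Digraph` (for example, code that calls `childrenOf`) keeps working when handed a `Graph`.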

- A digraph is commonly represented using an $n \times n$ **adjacency matrix**, where $n$ is the number of nodes in the graph. Each cell of the matrix contains information (e.g. weights) about the edges connecting the pair of nodes $<i, j>$
- Another representation (shown in code) is an **adjacency list**
- Class ```Graph``` is a subclass of ```Digraph```
    * It overrides ```addEdge```
    * Why is ```Graph``` a subclass of ```Digraph``` and not the other way around? Because of the **substitution principle**: client code that works correctly using an instance of the supertype should also work correctly when an instance of the subtype is substituted for it

### Some Classic Graph-Theoretic Problems

- **Shortest path**: For some pair of nodes $n1$ and $n2$, find the shortest sequence of edges $<s_n, d_n>$ (source and destination node), such that
    * The source node in the first edge is $n1$
    * The destination node of the last edge is $n2$
    * For all edges $e1$ and $e2$ in the sequence, if $e2$ follows $e1$ in the sequence, the source node of $e2$ is the destination node of $e1$
    
- **Shortest weighted path**: Like the shortest path, except instead of choosing the shortest sequence of edges that connects two nodes, we define some function on the weights of edges in the sequence (e.g. their sum) and minimize that value

- **Maximum clique**: A **clique** is a set of nodes such that there is an edge between each pair of nodes in the set. A maximum clique is a clique of the largest size in a graph

- **Min cut**: Given two sets of nodes in a graph, a **cut** is a set of edges whose removal eliminates all paths from each node in one set to each node in the other. The minimum cut is the smallest set of edges whose removal accomplishes this


### Shortest Path: Depth-First Search (DFS)

- Choose one child of the start node
- Choose one child of that node and so on, going deeper and deeper
- Until you reach the goal node or a node with no children
- **Backtrack**, returning to the most recent node with children that you have not yet visited
- When all paths have been explored, choose the shortest path

##### Code
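
A minimal recursive sketch of the procedure above, assuming the graph is given as a plain dictionary mapping each node to a list of its children. The function name `dfs_shortest`, its parameters, and the example graph are hypothetical. Two refinements are assumed here: nodes already on the current path are skipped to avoid cycles, and a branch is abandoned once it cannot beat the shortest path found so far.

```python
def dfs_shortest(graph, start, end, path=None, shortest=None):
    """Return a path from start to end with the fewest edges, or None."""
    path = (path or []) + [start]
    if start == end:
        return path
    for child in graph.get(start, []):
        if child in path:
            continue  # already on the current path: skip to avoid a cycle
        # Explore only if this branch could still beat the best path so far.
        if shortest is None or len(path) < len(shortest):
            new_path = dfs_shortest(graph, child, end, path, shortest)
            if new_path is not None:
                shortest = new_path
    return shortest  # backtrack with the best path found so far


# Hypothetical example digraph, as an adjacency list.
example = {0: [1, 2], 1: [2, 0], 2: [3, 4], 3: [4, 5, 1], 4: [0], 5: []}

print(dfs_shortest(example, 0, 5))  # [0, 2, 3, 5]
```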

### Shortest Path: Breadth-First Search (BFS)

- Visit all children of the start node
- If none of those is the end node, visit all children of each of those nodes, and so on
- Unlike depth-first search, which is usually implemented recursively, breadth-first search is usually implemented iteratively
- Since it generates paths in ascending order of length, the first path found with the goal as its last node is guaranteed to have a minimum number of edges

##### Code
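
A minimal iterative sketch of breadth-first search, again assuming a dictionary-based adjacency list. The name `bfs_shortest` and the example graph are hypothetical. This version keeps a `visited` set so that each node is enqueued at most once; that is one possible design choice, not necessarily the one used in the lecture.

```python
from collections import deque


def bfs_shortest(graph, start, end):
    """Return a path from start to end with the fewest edges, or None."""
    queue = deque([[start]])  # queue of partial paths, shortest first
    visited = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == end:
            return path  # first path to reach the goal has the fewest edges
        for child in graph.get(node, []):
            if child not in visited:
                visited.add(child)
                queue.append(path + [child])
    return None


# Hypothetical example digraph, as an adjacency list.
example = {0: [1, 2], 1: [2, 0], 2: [3, 4], 3: [4, 5, 1], 4: [0], 5: []}

print(bfs_shortest(example, 0, 5))  # [0, 2, 3, 5]
```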