# Dynamic Programming (Part 2)

In this notebook, we'll explore solving the Traveling Salesperson problem using dynamic programming.

### If you're using Datahub:
* Run the cell below **and restart the kernel if needed**

### If you're running locally:
You'll need to perform some extra setup.
#### First-time setup
* Install Anaconda following the instructions here: https://www.anaconda.com/products/distribution 
* Create a conda environment: `conda create -n cs170 python=3.11`
* Activate the environment: `conda activate cs170`
    * For more details on creating conda environments, see https://conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html
* Install pip: `conda install pip`
* Install jupyter: `conda install jupyter`

#### Every time you want to work
* Make sure you've activated the conda environment: `conda activate cs170`
* Launch jupyter: `jupyter notebook` or `jupyter lab` 
* Run the cell below **and restart the kernel if needed**

In [None]:
# Install dependencies
!pip install -r requirements.txt --quiet

In [None]:
import otter
assert (otter.__version__ >= "5.5.0"), "Please reinstall the requirements and restart your kernel."

grader = otter.Notebook("hw08.ipynb")
import numpy.random as random
from networkx import Graph, draw
import string
import pylev
import tqdm
import time
import pickle 

from autograder_utils import validate_tour, handle_timeout, profile

# Use a context manager so the file handle is closed after loading
with open("public_data.pkl", "rb") as f:
    test_data = pickle.load(f)

rng_seed = 0

## Q1: Traveling Salesperson DP
Now, we'll implement the dynamic programming algorithm for the traveling salesperson problem (TSP). A brute-force solution will be hopelessly slow even for moderate-sized test cases, but dynamic programming gives a solution in more reasonable (though still exponential) time. For a refresher on the TSP algorithm, see Lecture 13 or https://people.eecs.berkeley.edu/~vazirani/algorithms/chap6.pdf#page=20.

As with previous problems, we want you to return the actual tour, not the cost of the tour. We can once again apply the same procedure of backtracking through our subproblem array to reconstruct this tour.

### Representing Subproblems
If we use a set as one of our subproblem parameters, we can't directly use a 2D array to store our subproblems. There are two common ways to work around this issue:

#### 1. Subproblem Dictionary
You could store subproblems in a dictionary whose keys are tuples of the form `(S, i)`. Here `S` is the set of cities visited so far and `i` is the last city visited before returning home.

To make this work, you need to ensure that the keys are hashable. One way is using Python's built-in `frozenset` class for `S`. `frozenset` is a built-in type so you can use it without any additional imports, and works just like a normal set, except that it is immutable (and hashable). You can read more about `frozenset` here: https://docs.python.org/3/library/stdtypes.html#frozenset.
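As a small illustration of the dictionary approach, here is a sketch of `frozenset`-based keys. The names `dp`, `start`, and `S`, and the stored costs, are made up for this example and are not part of the assignment's required interface.

```python
# Hypothetical subproblem table keyed by (S, i); the costs are made-up values.
dp = {}

start = frozenset([0])      # only the home city has been visited
dp[(start, 0)] = 0

# frozenset is immutable, so "adding" a city builds a new set with the | operator.
S = start | {2}
dp[(S, 2)] = 7

# Keys compare by content, so the order in which elements were added does not matter.
same_key = (frozenset([2, 0]), 2)
lookup = dp[same_key]
print(lookup)
```

Because `start | {2}` returns a new `frozenset`, the original key `(start, 0)` stays valid in the dictionary.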

#### 2. Bitmasking
Instead of a hash set, we actually *can* still use a 2D array to store subproblems, where `S` is represented as an $n$-bit unsigned integer, and the $i$-th bit of `S` would be set to 1 if and only if the $i$-th city is part of the set of visited cities. Since `S` is an integer, we can use it to index into our 2D array.

The bitmasking approach tends to be about twice as fast and much more memory-efficient than the `frozenset` approach, but both approaches will pass the autograder if implemented correctly.
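The basic bitmask operations can be sketched as follows, assuming cities are labeled `0` through `n - 1` so that bit `i` corresponds to city `i`. The variable names and the stored cost are illustrative only.

```python
n = 4  # number of cities in this toy example

# Bit i of S is 1 exactly when city i is in the visited set.
S = (1 << 0) | (1 << 2)              # visited = {0, 2}, i.e. 0b0101

contains_2 = bool(S & (1 << 2))      # membership test for city 2
contains_1 = bool(S & (1 << 1))      # membership test for city 1
with_3 = S | (1 << 3)                # add city 3 to the set
without_2 = S & ~(1 << 2)            # remove city 2 from the set

# Because S is an integer in [0, 2**n), it can index a 2D table directly.
INF = float("inf")
dp = [[INF] * n for _ in range(1 << n)]
dp[S][2] = 7                         # made-up cost, for illustration only
```

The table has `2**n` rows and `n` columns, which is where the exponential memory cost of the algorithm comes from.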

### Implementation tip:
As before, storing the entire tour at each step is too memory-intensive and will cause the autograder to fail. Instead, consider maintaining a separate dictionary or array which stores a smaller amount of information but can still help you reconstruct the tour (can the "shortest path in the DP DAG" idea help here?).
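To illustrate the general idea on a problem other than TSP, the sketch below finds a shortest path in a small made-up DAG while storing only one predecessor per node, then rebuilds the path by walking those pointers backwards. The graph, `parent`, and `dist` are assumptions for this toy example; adapting the idea to TSP subproblems is left to you.

```python
INF = float("inf")

# Toy DAG whose nodes are already in topological order;
# edges[u] maps each successor of u to the edge weight.
edges = [{1: 2, 2: 5}, {2: 1, 3: 6}, {3: 2}, {}]

dist = [INF] * len(edges)
parent = [None] * len(edges)   # one predecessor per node instead of a whole path
dist[0] = 0

for u in range(len(edges)):
    for v, w in edges[u].items():
        if dist[u] + w < dist[v]:
            dist[v] = dist[u] + w
            parent[v] = u      # remember which choice achieved the minimum

# Walk the predecessor pointers backwards from the target, then reverse.
path = []
node = 3
while node is not None:
    path.append(node)
    node = parent[node]
path.reverse()
```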

**Be careful with indexing!** The algorithm from the book assumes your cities are labeled $1, \dots, n$. If you are indexing into a Python list, will you need to adjust your indices?

Be careful with subproblem ordering! We need to ensure that whenever we go to solve a subproblem, all of the subproblems it depends on have already been solved.
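Two orderings that satisfy this requirement for subset-indexed subproblems are sketched below, assuming each subproblem on a set depends only on that set with one element removed. The names are illustrative.

```python
from itertools import combinations

n = 4
cities = range(n)

# Option A: enumerate subsets by increasing size, so every smaller subset comes first.
by_size = [
    frozenset(combo)
    for size in range(1, n + 1)
    for combo in combinations(cities, size)
]

# Option B: with bitmasks, plain increasing integer order already works,
# because clearing a set bit always produces a strictly smaller integer.
order_is_safe = True
for mask in range(1, 1 << n):
    for i in range(n):
        if mask & (1 << i):
            smaller = mask & ~(1 << i)   # the same set with city i removed
            if smaller >= mask:
                order_is_safe = False
```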

The graph is not necessarily complete! If no tour is possible, return an empty list.
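One possible way to handle missing edges is to treat them as having infinite weight, so that any cost involving a missing edge stays infinite. The helper `edge_weight` below is hypothetical and only sketches that idea on a toy graph in the format described in the next section.

```python
INF = float("inf")

# Toy graph: edges 0-1 (weight 4) and 1-2 (weight 3); there is no 0-2 edge.
graph = [{1: 4}, {0: 4, 2: 3}, {1: 3}]

def edge_weight(graph, u, v):
    """Weight of edge (u, v), or infinity if the edge is absent (hypothetical helper)."""
    return graph[u].get(v, INF)

present = edge_weight(graph, 0, 1)
missing = edge_weight(graph, 0, 2)

# A cost that includes a missing edge stays infinite, which signals "no tour".
tour_cost = edge_weight(graph, 0, 1) + edge_weight(graph, 1, 2) + edge_weight(graph, 2, 0)
no_tour = tour_cost == INF
```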

### Graph helpers
As in previous homeworks, we represent the graph as a weighted adjacency list. The format is similar to before, except that `graph[u]` is now a dictionary mapping each neighbor to its edge weight instead of a list of pairs.
 **For this assignment, graphs are undirected**, so if there is an (undirected) edge between nodes `u` and `v` with weight `w`, then `graph[u]` contains key `v` with value `w` and `graph[v]` contains key `u` with value `w`.
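For concreteness, here is a hand-written graph in this format. The `tour_length` helper is hypothetical: it assumes a tour lists each city once and closes back to its first city, which is this sketch's convention and not necessarily the autograder's.

```python
# Undirected triangle: 0-1 (weight 5), 1-2 (weight 2), 0-2 (weight 9).
graph = [
    {1: 5, 2: 9},
    {0: 5, 2: 2},
    {0: 9, 1: 2},
]

# Every edge appears in both endpoints' dictionaries with the same weight.
symmetric = all(
    graph[v][u] == w
    for u in range(len(graph))
    for v, w in graph[u].items()
)

def tour_length(graph, tour):
    """Total weight of visiting `tour` in order and returning to its first city.

    Hypothetical helper; assumes every consecutive pair of cities is an edge.
    """
    return sum(graph[u][v] for u, v in zip(tour, tour[1:] + tour[:1]))

length = tour_length(graph, [0, 1, 2])
```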

We provide the following code to help you test your implementation.

In [None]:
def generate_adj_list(n: int, edge_list: list[tuple[int, int, int]]) -> list[dict[int, int]]:
    """
    args:
        n:int = number of nodes in the graph. The nodes are labelled with integers 0 through n-1
        edge_list:list[tuple[int,int,int]] = edge list where each tuple (u,v,w) represents the 
        undirected and weighted edge (u,v,w) in the graph
    return:
        A list[dict[int, int]] representing the adjacency list 
    """
    adj_list = [{} for _ in range(n)] 
    for u, v, w in sorted(edge_list):
        adj_list[u][v] = w
        # undirected edges
        adj_list[v][u] = w
    return adj_list

def draw_graph(adj_list: list[dict[int, int]]):
    """Utility method for visualizing graphs

    args:
        adj_list: list[dict[int, int]] = adjacency list representation of the graph generated by generate_adj_list
    """
    G = Graph()
    for u in range(len(adj_list)):
        # iterate over (neighbor, weight) pairs; iterating the dict directly yields only keys
        for v, w in adj_list[u].items():
            G.add_edge(u, v, weight=w)
    draw(G, with_labels=True)

In [None]:
def tsp_dp(adj_list):
    """Compute the exact solution to the TSP using dynamic programming and return the optimal tour.

    Args:
        adj_list: Weighted undirected graph represented as an adjacency list. 

    Returns:
        List[int]: A list of city indices representing the optimal tour, or an empty list if no tour is possible.
    """
    ...


In [None]:
grader.check("TSP")

### Debugging
A simplified version of the otter tests is pasted here for your convenience. Feel free to add whatever print statements or assertions you'd like when debugging.

In [None]:
# tests on very small cases
for adj_list, expected_distance in tqdm.tqdm(test_data['TSP-1']):
    result = tsp_dp(adj_list)

    if expected_distance < 0: 
        # no tour is possible
        assert result == [], "You returned a tour when no tour is possible"
    else:
        assert set(result) == set(range(len(adj_list))), "Your output does not visit all cities"
        student_length = validate_tour(result, adj_list)
        assert student_length >= 0, "Your output is not a valid tour"
        assert student_length == expected_distance, "Your output is not a minimum distance tour"

## Submission

Make sure you have run all cells in your notebook in order before running the cell below, so that all images/graphs appear in the output. The cell below will generate a zip file for you to submit.

In [None]:
grader.export(pdf=False, force_save=True, run_tests=True)