# Algorithms 202: Coursework 1 Task 2: Dynamic Programming

Group-ID: XX

Group members: ADD NAMES HERE 

# Objectives

The aim of this coursework is to enhance your algorithmic skills by mastering the divide and conquer and dynamic programming strategies. You are asked to show that you can:

- implement dynamic programming algorithms
- run an experimental analysis to find the answer for a given problem

This notebook *is* the coursework. It contains cells with function definitions that you will need to complete. You will submit this notebook as your coursework.

Comparing different algorithms involves textual descriptions and graphical plots. For graphing you will use [matplotlib](http://matplotlib.org/index.html) to generate plots; working through [this tutorial](http://matplotlib.org/index.html) will get you up to speed. For the textual descriptions you may wish to use inline [LaTeX](http://en.wikipedia.org/wiki/LaTeX), for example $\mathcal{O}(n\log{}n)$. Double-click this cell to reveal the required markup, and [see here](http://texblog.org/2014/06/24/big-o-and-related-notations-in-latex/) for useful guidance on producing common symbols used in asymptotic run time analysis.

# Preliminaries: helper functions

Here we define a collection of functions that will be useful for the rest of the coursework. You'll need to run this cell to get started.

In [None]:
# so our plots get drawn in the notebook
%matplotlib inline
from matplotlib import pyplot as plt
import numpy as np
from pathlib import Path
from sys import maxsize
from time import perf_counter
from urllib.request import urlretrieve

# a timer - runs the provided function and reports the
# run time in seconds
def time_f(f):
    before = perf_counter()
    f()
    after = perf_counter()
    return after - before

# we can get a word list from here - we download it once
# to 'wordlist.txt' and then reuse this file.
url = 'http://www.doc.ic.ac.uk/~bglocker/teaching/wordlist.txt'
if not Path('wordlist.txt').exists():
    print("downloading word list...")
    urlretrieve(url, 'wordlist.txt')
    print('acquired word list.')
    
with open('wordlist.txt') as f:
    # here we use a *set* comprehension - just
    # like we've done with lists in the past but
    # the result is a set so each element is
    # guaranteed to be unique.
    # https://docs.python.org/3/tutorial/datastructures.html#sets
    # note that you can loop over a set just like you would a list
    wordlist = {l.strip() for l in f.readlines()}
    print("loaded set of words into 'wordlist' variable")
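
As a rough illustration of how a timer like `time_f` can be used, the sketch below averages several runs of a function to smooth out measurement noise. It assumes `time.perf_counter` as the clock; `time_avg` and its `repeats` parameter are hypothetical names that are not part of the coursework code.

```python
from time import perf_counter

def time_avg(f, repeats=5):
    # Hypothetical helper: run f several times and return the mean
    # wall-clock time of one run, in seconds.
    total = 0.0
    for _ in range(repeats):
        before = perf_counter()
        f()
        total += perf_counter() - before
    return total / repeats

# Example: time sorting a reversed list of 10,000 integers
elapsed = time_avg(lambda: sorted(range(10000, 0, -1)))
print('{:.6f} s'.format(elapsed))
```

Averaging is only one option; taking the minimum over the runs is another common choice when the goal is to reduce the influence of background activity.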

## Task 2: Dynamic Programming

### 2a. Implement `levenshtein_distance`

Complete the definition of `levenshtein_distance` below. Do not change the name of the function or its arguments.


Hints:

- You are given access to numpy (`np`). Numpy is the crown jewel of the scientific Python community - it provides a multidimensional array (`np.array()`) which can be very convenient to solve problems involving matrices.
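
To illustrate the hint, here is a minimal sketch of how a 2-D numpy array could serve as a dynamic programming table for two strings. It only sets up the table and its base cases, under the assumption that entry `[i, j]` will hold the distance between the first `i` characters of `x` and the first `j` characters of `y`. The helper name `init_table` is hypothetical, and filling in the remaining entries is left to your implementation.

```python
import numpy as np

def init_table(x, y):
    # One extra row and column for the empty prefix
    table = np.zeros((len(x) + 1, len(y) + 1), dtype=int)
    # Turning the first i characters of x into '' takes i deletions
    table[:, 0] = np.arange(len(x) + 1)
    # Turning '' into the first j characters of y takes j insertions
    table[0, :] = np.arange(len(y) + 1)
    return table

print(init_table('sun', 'snow'))
```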

In [None]:
def levenshtein_distance(x, y):
    # complete function without changing signature
    pass

Use this test to confirm your implementation is correct.

In [None]:
print(levenshtein_distance('sunny', 'snowy') == 3)
print(levenshtein_distance('algorithm', 'altruistic') == 6)
print(levenshtein_distance('imperial', 'empirical') == 3)
print(levenshtein_distance('weird', 'wired') == 2)

### 2b. Find the minimum Levenshtein distance

Use your `levenshtein_distance` function to find the `closest_match` between a `candidate` word and an iterable of `words`. Note that if multiple words from `words` share the minimal edit distance to the `candidate`, you should return the word which would come first in a dictionary. 

As a concrete example, `zark` has an edit distance of 1 with both `ark` and `bark`, but you would return `ark` as it comes lexicographically before `bark`.

Your function should return a tuple of two values - first the closest word match, and secondly the edit distance between this word and the candidate.

```python
closest, distance = closest_match('zark', ['ark', 'bark', ...])
assert closest == 'ark'
assert distance == 1
```
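
The tie-breaking rule can be expressed compactly with tuple comparison, since Python orders tuples element by element. The sketch below uses a hypothetical `distances` dictionary with hand-written values rather than calling `levenshtein_distance`, so it demonstrates only the ordering, not a full solution.

```python
# Hypothetical precomputed edit distances from the candidate 'zark'
distances = {'bark': 1, 'ark': 1, 'shark': 2}

# Compare by distance first, then alphabetically by word
best_distance, best_word = min((d, w) for w, d in distances.items())
print(best_word, best_distance)  # ark 1
```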

In [None]:
def closest_match(candidate, words):
    # complete function without changing signature
    pass

Run the cell below to test your implementation.

In [None]:
# A one liner that queries closest_match and then prints the result
print_closest = lambda w, wl: print('{}: {} ({})'.format(w, *closest_match(w, wl)))

print_closest('zilophone', wordlist)
print_closest('inconsidrable', wordlist)
print_closest('bisamfiguatd', wordlist)

**Discuss in a few lines the running time of `closest_match`. Can you propose any ideas for making it faster? (A discussion in words is enough; an implementation is optional.)**

*Replace with your discussion...*

### 2c. Coin change problem

Coin change is the problem of finding the least number of coins for a given amount of money.

For example, the UK coin set contains the following coins:
1p, 2p, 5p, 10p, 20p, 50p, £1, £2, and £5 (very uncommon).
For £2.82, the optimal change is £2, 50p, 20p, 10p, 2p.

i) Implement the `coin_change` function and answer the following questions by running an experimental analysis.

ii) How many coins are needed on average to represent an amount between £0.01 and £5.00 with the UK coin set?

iii) How many more coins are needed on average to represent an amount between £0.01 and £5.00 if we remove both the 10p and 20p coins from the UK coin set?

iv) If you had to decide whether to keep the 10p or the 20p coin in the UK coin set, which one would you choose?

In [None]:
# Memo tables, kept separately for each coin set so that results cached
# for one coin set are never reused for another:
#   sol[n]    - minimum number of coins needed for amount n
#   change[n] - first coin of one optimal solution for amount n
memo = {}

def coin_change(n, coins):
    sol, change = memo.setdefault(tuple(coins), ({0: 0}, {}))

    # Base case (n == 0) or already computed: just return it
    if n in sol:
        return sol[n], change

    # Try every possible coin and recursively compute the best solution.
    # An amount that cannot be formed keeps the value infinity.
    sol[n] = float('inf')
    for c in coins:
        if n - c >= 0:
            sol_t, _ = coin_change(n - c, coins)
            sol_t = sol_t + 1
            if sol_t < sol[n]:
                sol[n] = sol_t
                change[n] = c

    return sol[n], change

In [None]:
def print_change(n,coins):
    counts, change = coin_change(n,coins)
    while n > 0:
        print(change[n])
        n = n - change[n]

UK_coin_set = [1,2,5,10,20,50,100,200,500]
print_change(252,UK_coin_set)

*Do your experimental analysis here...*

Answer questions (ii)-(iv) here.
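
One possible way to set up the experiment is sketched below. It assumes amounts are expressed in pence and uses a separate bottom-up table, built by a hypothetical helper `min_coins_table`, instead of the memoised `coin_change` above, so that each coin set gets its own fresh table. The helper `average_coins` is likewise an illustrative name.

```python
def min_coins_table(max_amount, coins):
    # table[a] = minimum number of coins that sum to a (inf if impossible)
    table = [0] + [float('inf')] * max_amount
    for a in range(1, max_amount + 1):
        for c in coins:
            if c <= a and table[a - c] + 1 < table[a]:
                table[a] = table[a - c] + 1
    return table

def average_coins(max_amount, coins):
    # Mean coin count over all amounts 1..max_amount (in pence)
    table = min_coins_table(max_amount, coins)
    return sum(table[1:]) / max_amount

uk = [1, 2, 5, 10, 20, 50, 100, 200, 500]
without_10_20 = [c for c in uk if c not in (10, 20)]
print(average_coins(500, uk))
print(average_coins(500, without_10_20))
```

Building the table iteratively also avoids deep recursion, which can matter for larger amounts.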

In [None]:
data = range(1,20)
#data = range(1,30)
#data = range(1,1000,10)
#data = range(1,10000,100)
#data = range(1,100000,1000)

In [None]:
dp_top_down_res = []

for i in data:
    dp_top_down_res.append(time_f(lambda: coin_change(i, UK_coin_set)))

In [None]:
%matplotlib inline
from matplotlib import pyplot as plt

plt.scatter(data, dp_top_down_res, c='red')
#plt.scatter(data, tdp, c='blue')
plt.xlabel('n')
plt.ylabel('time (s)')
plt.xlim(0)
plt.ylim(0)