# Dynamic Programming

adapted from: https://www.amazon.com/Python-Algorithms-Mastering-Basic-Language/dp/148420056X

Python Algorithms: Mastering Basic Algorithms in the Python Language

![Bellman](https://upload.wikimedia.org/wikipedia/en/7/7a/Richard_Ernest_Bellman.jpg)

https://en.wikipedia.org/wiki/Richard_E._Bellman

## Dynamic Programming - not the programming in computer terms!
The term dynamic programming (or simply DP) can be a bit confusing to newcomers. Both of the words are
used in a different way than most might expect. Programming here refers to making a set of choices (as in “linear
programming”) and thus has more in common with the way the term is used in, say, television, than in writing
computer programs. Dynamic simply means that things change over time—in this case, that each choice depends
on the previous one. In other words, this “dynamicism” has little to do with the program you’ll write and is just a
description of the problem class. In Bellman’s own words, “I thought dynamic programming was a good name. It was
something not even a Congressman could object to. So I used it as an umbrella for my activities.”

* The core technique of DP -> caching
* Decompose your problem recursively/inductively (as usual)
* Allow overlap between the subproblems.
* A plain recursive solution may solve the same subproblems an exponential number of times -> caching trims away the waste
* The result is usually both an impressively efficient algorithm and a greater insight into the problem.


Commonly, DP algorithms turn the recursive formulation upside down, making it iterative and filling out some
data structure (such as a multidimensional array) step by step. 

* Another option, well suited to high-level languages such as Python, is to implement the recursive formulation directly but to cache the return values.
* If a call is made more than once with the same arguments, the result is simply returned directly from the cache. This is known as **memoization**.

## Little puzzle: Longest Increasing Subsequence

Say you have a sequence of numbers, and you want to find its
longest increasing (or, rather nondecreasing) subsequence—or one of them, if there are more. A subsequence consists
of a subset of the elements in their original order. So, for example, in the sequence [3, 1, 0, 2, 4], one solution
would be [1, 2, 4].
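
To make the definition concrete, here is a minimal sketch of a checker for candidate answers. The helper name `is_nondecreasing_subsequence` is made up for this illustration and is not part of the book's code.

```python
def is_nondecreasing_subsequence(candidate, seq):
    """Return True if candidate keeps seq's order and never decreases."""
    remaining = iter(seq)
    # `item in remaining` consumes the iterator up to the first match,
    # so each candidate item must appear after the previous one in seq.
    in_order = all(item in remaining for item in candidate)
    never_decreases = all(a <= b for a, b in zip(candidate, candidate[1:]))
    return in_order and never_decreases

print(is_nondecreasing_subsequence([1, 2, 4], [3, 1, 0, 2, 4]))  # True
print(is_nondecreasing_subsequence([0, 1, 4], [3, 1, 0, 2, 4]))  # False: 1 comes before 0
```

A checker like this only validates an answer; it says nothing about whether the answer is the longest one possible.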

# Brute force - when in doubt use brute force

In [1]:
from itertools import combinations # from standard library

In [2]:
my_list = [3, 1, 0, 2, 4]
my_list

[3, 1, 0, 2, 4]

In [3]:
list(combinations(my_list, 4))

[(3, 1, 0, 2), (3, 1, 0, 4), (3, 1, 2, 4), (3, 0, 2, 4), (1, 0, 2, 4)]

In [4]:
list(combinations(my_list, 3)) # 5 choose 3 - formula is n! / (k! * (n - k)!)
# n is the number of items in the list - 5
# k is the number of items in each combination - 3
# so here we have 5! / (3! * (5 - 3)!) = 5! / (3! * 2!) = 120 / (6 * 2) = 120 / 12 = 10

[(3, 1, 0),
 (3, 1, 2),
 (3, 1, 4),
 (3, 0, 2),
 (3, 0, 4),
 (3, 2, 4),
 (1, 0, 2),
 (1, 0, 4),
 (1, 2, 4),
 (0, 2, 4)]

In [5]:
list(combinations(my_list, 2))

[(3, 1),
 (3, 0),
 (3, 2),
 (3, 4),
 (1, 0),
 (1, 2),
 (1, 4),
 (0, 2),
 (0, 4),
 (2, 4)]

In [5]:
list(combinations(my_list, 1))
# gives us a list of single item tuples

[(3,), (1,), (0,), (2,), (4,)]

In [6]:

def naive_list(seq):
    for length in range(len(seq), 0, -1): # n, n-1, ... , 1 # worst case: only a single item qualifies as a solution
        for sub in combinations(seq, length): # Subsequences of given length
            # TODO find a faster way to check if the subsequence is sorted
            if list(sub) == sorted(sub): # Nondecreasing? Then sorting leaves it unchanged
                # possibly we could cache the sorted version of the subsequence
                return sub # Return it!
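
One way to address the TODO about the sortedness check, as a sketch: compare neighbouring elements instead of sorting. The names `is_nondecreasing` and `naive_lis_linear_check` are invented here; the search itself is unchanged and still exponential.

```python
from itertools import combinations

def is_nondecreasing(items):
    # Compare each element with its right neighbour: O(k), no sorting needed.
    return all(a <= b for a, b in zip(items, items[1:]))

def naive_lis_linear_check(seq):
    for length in range(len(seq), 0, -1):      # longest candidates first
        for sub in combinations(seq, length):  # subsequences keep the original order
            if is_nondecreasing(sub):
                return sub
    return ()  # empty input: nothing to choose from

print(naive_lis_linear_check([3, 1, 0, 2, 4]))  # (1, 2, 4)
```

This only trims the cost per candidate; the number of candidates is what dominates the running time.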

In [7]:
naive_list([3,1,0,2,4])

(1, 2, 4)

In [8]:
# naive_list started with length 5, then 4, and found the answer among the 3-number combinations
list(combinations(my_list, 3))

[(3, 1, 0),
 (3, 1, 2),
 (3, 1, 4),
 (3, 0, 2),
 (3, 0, 4),
 (3, 2, 4),
 (1, 0, 2),
 (1, 0, 4),
 (1, 2, 4),
 (0, 2, 4)]

In [12]:
naive_list([5,0,2,1,6,3,7,4,6])

(0, 2, 3, 4, 6)

In [None]:
# This works (on smaller sequences at least)



In [8]:
import random
random.seed(2024) # Set the seed so we all get the same pseudo-"random" results
r10 = [random.randint(1,100) for _ in range(10)]
r20 = [random.randint(1,100) for _ in range(20)]
r25 = [random.randint(1,100) for _ in range(25)]
r100 = [random.randint(1,1000) for _ in range(100)]
r1k = [random.randint(1,10_000) for _ in range(1000)]
r10

[61, 24, 94, 75, 39, 26, 93, 53, 97, 92]

In [9]:
naive_list(r10)

(61, 75, 93, 97)

In [10]:
%%timeit
naive_list(r10)

296 µs ± 11.7 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)


In [11]:
naive_list(r20)

(34, 46, 54, 68, 79, 91, 94)

In [12]:
%%timeit
naive_list(r20)

473 ms ± 11.5 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


In [13]:
naive_list(r25)

(27, 28, 53, 54, 60, 93, 96, 98)

In [None]:
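# how about complexity? 
# Two nested loops -> n^2 ?
# Hint: combinations is not O(1)... the inner loop runs once per subsequence, not once per element
# In the worst case (a strictly decreasing sequence) every nonempty subsequence is tried: 2^n - 1 of them
# Each check sorts up to n items, so the worst case is roughly O(2^n * n log n) - exponential, not polynomial

# TODO confirm this bound for naive_list by counting the candidates
# HINT: each C(n, k) is built from factorials, but summed over all k they add up to 2^n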
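
As a rough check on the growth rate, the number of candidates the brute-force search may have to examine can be counted directly. This is a sketch; `candidates_in_worst_case` is a hypothetical helper, and the worst case assumed is a strictly decreasing input, where only length-1 subsequences qualify.

```python
from math import comb

def candidates_in_worst_case(n):
    # The brute-force search tries every length k = n, n-1, ..., 1 and,
    # for each length, all C(n, k) subsequences of that length.
    return sum(comb(n, k) for k in range(1, n + 1))

for n in (5, 10, 20, 25):
    print(n, candidates_in_worst_case(n), 2**n - 1)
```

The two printed columns agree, which is the binomial identity: the C(n, k) summed over all k give 2^n, so the search is exponential in n.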

## Fibonacci numbers

In [15]:
## Fibonacci
def fib(i): # finding i-th member in our Fibonacci chain 1,1,2,3,5,8,13,21,34,55,89
    if i < 2: # base case - first two numbers are 1
        return 1
    else:
        return fib(i-1) + fib(i-2) # we have two recursive calls, but our problem space is not going down by half

In [16]:
fib(0),fib(1),fib(2),fib(3),fib(4),fib(5)

(1, 1, 2, 3, 5, 8)

In [17]:
# sidenote: Fibonacci ratios approach the golden ratio
for n in range(1,20):
    print(fib(n), fib(n-1), round(fib(n)/fib(n-1),8))

1 1 1.0
2 1 2.0
3 2 1.5
5 3 1.66666667
8 5 1.6
13 8 1.625
21 13 1.61538462
34 21 1.61904762
55 34 1.61764706
89 55 1.61818182
144 89 1.61797753
233 144 1.61805556
377 233 1.61802575
610 377 1.61803714
987 610 1.61803279
1597 987 1.61803445
2584 1597 1.61803381
4181 2584 1.61803406
6765 4181 1.61803396


![Fib Spiral](https://upload.wikimedia.org/wikipedia/commons/thumb/9/93/Fibonacci_spiral_34.svg/500px-Fibonacci_spiral_34.svg.png)

https://en.wikipedia.org/wiki/Golden_ratio
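
The limit of those ratios can be compared with the closed form of the golden ratio, (1 + √5) / 2. A small sketch; `fib_pair` is a hypothetical iterative helper that follows the 1, 1, 2, 3, ... numbering of `fib` above.

```python
import math

phi = (1 + math.sqrt(5)) / 2  # closed form of the golden ratio

def fib_pair(n):
    """Return (fib(n), fib(n-1)) for the 1, 1, 2, 3, 5, ... numbering, n >= 1."""
    prev, cur = 1, 1
    for _ in range(n - 1):
        prev, cur = cur, prev + cur
    return cur, prev

cur, prev = fib_pair(19)
print(cur, prev, round(phi, 8), abs(cur / prev - phi))
```

The difference shrinks quickly as n grows, which is why the printed ratios above settle after only a few rows.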

In [18]:
# so far so good?
fib(30)

1346269

In [19]:
fib(35)

14930352

In [24]:
fib(36)

24157817

https://stackoverflow.com/questions/35959100/explanation-on-fibonacci-recursion
![FibTree](https://i.stack.imgur.com/QVSdv.png)
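
The overlap visible in the call tree can be measured by counting how often the function body actually runs. A sketch under assumptions: `fib_plain`, `fib_cached` and the `calls` counter are illustrative names, and the cached variant uses the standard library's `functools.lru_cache`.

```python
import functools

calls = {"plain": 0, "cached": 0}

def fib_plain(i):
    calls["plain"] += 1
    return 1 if i < 2 else fib_plain(i - 1) + fib_plain(i - 2)

@functools.lru_cache(maxsize=None)
def fib_cached(i):
    calls["cached"] += 1  # runs only on a cache miss
    return 1 if i < 2 else fib_cached(i - 1) + fib_cached(i - 2)

fib_plain(20)
fib_cached(20)
print(calls)  # {'plain': 21891, 'cached': 21}
```

The plain version revisits the same subproblems thousands of times, while the cached version evaluates each argument from 0 to 20 exactly once.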

In [22]:
from functools import wraps
# memo is a function that takes a function as argument and returns a wrapped version of this function with caching
def memo(func):
    cache = {} # Stored subproblem solutions, this dictionary - Hashmap type - so O(1) lookup
    @wraps(func) # Make wrap look like func
    def wrap(*args): # The memoized wrapper
        if args not in cache: # Not already computed? O(1) lookup
            cache[args] = func(*args) # Compute & cache the solution, O(1) to store - func itself will of course have some cost
        return cache[args] # Return the cached solution - again an O(1) lookup
    return wrap # Return the wrapper
# PS Python's standard library has this built in (functools.lru_cache) - here we are learning how it works

In [23]:
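fib_memo = memo(fib) # functions are first-class citizens in Python
# careful: only the outermost call is cached here, the recursive calls inside fib still go to the plain fib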


In [28]:
fib_memo(30)

1346269

In [31]:
fib_memo(34)

9227465

In [35]:
fib_memo(35)

14930352

In [None]:
%%timeit
fib(35)

1 loop, best of 5: 4.08 s per loop


In [32]:
@memo # this is a decorator - meaning we wrap our fib_m function in memo function
def fib_m(i):
    if i < 2: 
        return 1
    else:
        return fib_m(i-1) + fib_m(i-2)

In [33]:
%%timeit
fib_m(35)

304 ns ± 8.07 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)


In [34]:
%%timeit
fib_m(38)

366 ns ± 79.1 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)


In [35]:
%%timeit
fib_m(45)

313 ns ± 33.8 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)


In [None]:
fib_m(36) # the decorated version does cache properly; it is fib_memo = memo(fib) above that caches only the outermost call

24157817

In [None]:
fib_m(40)

165580141

In [None]:
fib_m(60)

2504730781961

In [47]:
fib_m(200)

453973694165307953197296969697410619233826

In [20]:
import sys
sys.getrecursionlimit()

3000

In [52]:
fib_m(600) # CPython's default recursion limit is 1,000; this session reports 3000 above

178684461669052552311410692812805706249615844217278044703496837914086683543763273909969771627106004287604844670397177991379601

In [3]:
import sys
sys.setrecursionlimit(100_000)

In [4]:
# check recursion limit
sys.getrecursionlimit()

100000

In [None]:
fib_m(2000) 
# sadly memoization will not solve the stack overflow problem

6835702259575806647045396549170580107055408029365524565407553367798082454408054014954534318953113802726603726769523447478238192192714526677939943338306101405105414819705664090901813637296453767095528104868264704914433529355579148731044685634135487735897954629842516947101494253575869699893400976539545740214819819151952085089538422954565146720383752121972115725761141759114990448978941370030912401573418221496592822626

In [None]:
fib_m(5000) 
# sadly memoization will not help us with stack overflow, a good way to force a kernel restart :)

6276302800488957086035253108349684055478528702736457439025824448927937256811663264475883711527806250329984690249846819800648580083040107584710332687596562185073640422286799239932615797105974710857095487342820351307477141875012176874307156016229965832589137779724973854362777629878229505500260477136108363709090010421536915488632339240756987974122598603591920306874926755600361865354330444681915154695741851960071089944015319300128574107662757054790648152751366475529121877212785489665101733755898580317984402963873738187000120737824193162011399200547424034440836239726275765901190914513013217132050988064832024783370583789324109052449717186857327239783000020791777804503930439875068662687670678802914269784817022567088069496231111407908953313902398529655056082228598715882365779469902465675715699187225655878240668599547496218159297881601061923195562143932693324644219266564617042934227893371179832389642895285401263875342640468017378925921483580111278055044254198382265567395946431803304304326865077

In [21]:
import functools
# https://stackoverflow.com/questions/1988804/what-is-memoization-and-how-can-i-use-it-in-python
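@functools.lru_cache(maxsize=None) # a decorator; by default only the 128 most recent results are kept, maxsize=None caches everything
def fib(num):
    if num < 2:
        return num
    else:
        return fib(num-1) + fib(num-2)
# in Python 3.9+ there is a newer @functools.cache - same idea, equivalent to lru_cache(maxsize=None)
# note: this version starts from fib(0) = 0, fib(1) = 1, so its results are shifted by one position compared to the first fib above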
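
Functions wrapped by `functools.lru_cache` expose `cache_info()` and `cache_clear()`, which make it possible to see what the cache is doing. A short sketch; the function is named `fib_cached` here only to avoid clashing with the notebook's `fib`.

```python
import functools

@functools.lru_cache(maxsize=None)
def fib_cached(num):
    return num if num < 2 else fib_cached(num - 1) + fib_cached(num - 2)

fib_cached(35)
info = fib_cached.cache_info()
print(info)  # CacheInfo(hits=33, misses=36, maxsize=None, currsize=36)

fib_cached.cache_clear()  # empties the cache, e.g. before a fair timing run
```

The 36 misses are the distinct arguments 0 through 35; every other call was answered from the cache. This also matters for `%%timeit`: after the first run, repeated calls with the same argument are pure cache hits.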

In [22]:
fib(35)

9227465

In [23]:
%%timeit
fib(35)

81.6 ns ± 6.08 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)


In [24]:
%%timeit
fib(45)

82.7 ns ± 2.6 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)


In [42]:
%%timeit
fib(100)

67.4 ns ± 4.46 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)


In [25]:
fib(900)

54877108839480000051413673948383714443800519309123592724494953427039811201064341234954387521525390615504949092187441218246679104731442473022013980160407007017175697317900483275246652938800

In [26]:
fib(1_000)

43466557686937456435688527675040625802564660517371780402481729089536555417949051890403879840079255169295922593080322634775209689623239873322471161642996440906533187938298969649928516003704476137795166849228875

In [27]:
%%timeit
fib(1_000)

94.2 ns ± 2.44 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)


In [None]:
%%timeit
fib_m(100)  # so my memoization solution is about 2.5x slower than library solution but both are extremely fast

The slowest run took 44.83 times longer than the fastest. This could mean that an intermediate result is being cached.
1000000 loops, best of 5: 247 ns per loop


In [None]:
# so the official caching version is about 3 times faster than our self-made version; a factor of 3 is not a dealbreaker but nice to know

In [None]:
fib(40)

102334155

In [None]:
fib(100)

354224848179261915075

In [42]:
fib(200)

280571172992510140037611932413038677189525

In [41]:
fib(1200)

27269884455406270157991615313642198705000779992917725821180502894974726476373026809482509284562310031170172380127627214493597616743856443016039972205847405917634660750474914561879656763268658528092195715626073248224067794253809132219056382939163918400

In [None]:
fib(3000)

410615886307971260333568378719267105220125108637369252408885430926905584274113403731330491660850044560830036835706942274588569362145476502674373045446852160486606292497360503469773453733196887405847255290082049086907512622059054542195889758031109222670849274793859539133318371244795543147611073276240066737934085191731810993201706776838934766764778739502174470268627820918553842225858306408301661862900358266857238210235802504351951472997919676524004784236376453347268364152648346245840573214241419937917242918602639810097866942392015404620153818671425739835074851396421139982713640679581178458198658692285968043243656709796000

In [None]:
# so the problem with TOP-DOWN memoization is that we are still left with recursive calls going over our stack limit

## So how to solve the stack overflow problem?

In [None]:
# we could try the build up solution - meaning BOTTOM-UP method
# this will usually be an iterative(looping) solution so no worries about stack
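
Before switching to a loop, one workaround that is sometimes used with memoized recursion is to fill the cache in increasing order, so that no single call recurses deeply. This is a sketch and not from the book; `fib_cached`, `fib_warmed` and the `step` value of 200 are illustrative choices, picked to stay well below the default recursion limit.

```python
import functools

@functools.lru_cache(maxsize=None)
def fib_cached(num):
    return num if num < 2 else fib_cached(num - 1) + fib_cached(num - 2)

def fib_warmed(num, step=200):
    # Fill the cache in increasing order so that no single call
    # has to recurse more than about `step` levels deep.
    for i in range(0, num, step):
        fib_cached(i)
    return fib_cached(num)

print(len(str(fib_warmed(10_000))))  # number of digits in the result
```

This keeps the recursive formulation, but it is really a bottom-up fill in disguise, which is a hint that an explicit loop is the more natural tool.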


# Iterative version of Fibonacci with full table

In [28]:
# silly iterative version
# useful if you want to return ALL fibs
def fib_it(n): # so n will be 0 based
    fibs = [1,1] #so we are going to build our answers
    # lets pretend we do not know of any formulas and optimizations
#     n += 1 # fix this
    if n < 2:
        return fibs[n] # off by one errors
    ndx = 2
    while ndx <= n:
        fibs.append(fibs[ndx-1]+fibs[ndx-2]) # so I am building a 1-d table(array/list) of answers
        # so appending to list incurs some cost as well
        # in fib we could have of course just stored 2 previous values instead of all
        ndx+=1
    return fibs[n] # again off by one indexing 0 based in python and 1 based in our function


In [29]:
fib_it(2)

2

In [30]:
fib_it(4)

5

In [31]:
fib_it(100)

573147844013817084101

In [36]:
fib(0),fib(1),fib(2),fib(3)

(0, 1, 1, 2)

In [33]:
fib_it(0)

1

In [32]:
fib_it(999)

43466557686937456435688527675040625802564660517371780402481729089536555417949051890403879840079255169295922593080322634775209689623239873322471161642996440906533187938298969649928516003704476137795166849228875

In [37]:
%%timeit
fib(999)

98.9 ns ± 2.46 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)


In [38]:
%%timeit
fib_it(999)

186 µs ± 21.6 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)


In [48]:
%%timeit
fib_it(35)
# this looks slower than the memoized recursive solution, but the comparison is not quite fair:
# after the first call the memoized timing measures a single cache lookup, while fib_it rebuilds the whole list on every call

8.08 µs ± 488 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)


In [50]:
%%timeit
fib_it(100)

21.5 µs ± 3.44 µs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)


In [39]:
%%timeit
fib_it(10_000)

5.86 ms ± 294 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)


In [41]:
len(str(fib_it(10_000)))

2090

In [64]:
for n in range(0,10+1):
    print(fib_it(n))

1
1
2
3
5
8
13
21
34
55
89


In [None]:
fib_it(5),fib_it(6)

(8, 13)

In [None]:
fib_it(35),fib_it(36)

(14930352, 24157817)

In [None]:
%%timeit
fib_it(35)

15.1 µs ± 160 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)


In [None]:
# so one way we could improve is that we do not need to store all this knowledge about previous solutions, 
# (unless we were building a table of ALL solutions)

In [42]:
# so we only need to store 2 values; this improves the constant factor and fixes our space complexity
# instead of needing O(n) to store our fib(n), we just need O(1) - since we are only storing two values
def fib_v2(n):
    prev, cur = 1, 1 # so we save memory by not storing all previous values
    if n <= 1:
        return prev
    ndx = 2
    while ndx <= n:
        prev, cur = cur, prev+cur # python makes it easy to assign 2 values at once with tuple unpacking, right side eval first
        # if your language does not support it you can simply use a third temp variable 
        ndx += 1
    return cur

In [43]:
fib_v2(5),fib_v2(6)

(8, 13)

In [44]:
fib_v2(999)

43466557686937456435688527675040625802564660517371780402481729089536555417949051890403879840079255169295922593080322634775209689623239873322471161642996440906533187938298969649928516003704476137795166849228875

In [45]:
%%timeit
fib_v2(999)

83.2 µs ± 2.53 µs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)


In [56]:
%%timeit
fib_v2(100)

8.47 µs ± 683 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)


In [57]:
%%timeit
fib_v2(10_000)

3.35 ms ± 614 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)


In [59]:
len(str(fib_v2(10_000)))

2090

# Pascal's triangle
![Triangle](https://upload.wikimedia.org/wikipedia/commons/0/0d/PascalTriangleAnimated2.gif)

The combinatorial meaning of C(n,k) is the number of k-sized subsets you can get from a set of size n.

In mathematics, a combination is a selection of items from a collection, such that the order of selection does not matter (unlike permutations). https://en.wikipedia.org/wiki/Combination
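
The recurrence behind Pascal's triangle, C(n,k) = C(n-1,k-1) + C(n-1,k), follows from splitting the k-sized subsets by whether they contain one fixed element. The sketch below checks this by enumeration; `split_by_last` is a hypothetical helper and assumes the items are distinct.

```python
from itertools import combinations

def split_by_last(items, k):
    """Split the k-subsets of items by whether they contain the last item."""
    last = items[-1]
    subsets = list(combinations(items, k))
    with_last = [s for s in subsets if last in s]         # choose k-1 from the rest
    without_last = [s for s in subsets if last not in s]  # choose k from the rest
    return with_last, without_last

with_last, without_last = split_by_last([1, 2, 3, 4, 5], 3)
print(len(with_last), len(without_last))  # 6 4
```

Here 6 is C(4,2) and 4 is C(4,3), and together they give C(5,3) = 10.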

In [47]:
# this is horrible: again we have 2 recursive calls for each call
def C(n,k):
    # first base case
    if k == 0: return 1 # means we have chosen all k items
    if n == 0: return 0 # means we have no items to choose from
    return C(n-1,k-1) + C(n-1,k)

In [48]:
C(2,0),C(2,1),C(2,2)

(1, 2, 1)

In [49]:
C(3,0),C(3,1),C(3,2),C(3,3)

(1, 3, 3, 1)

In [50]:
C(4,0),C(4,1),C(4,2),C(4,3),C(4,4)

(1, 4, 6, 4, 1)

In [51]:
for k in range(6):
    print(C(5,k), end=" ")
print()

1 5 10 10 5 1 


In [52]:
C(20,12) # so choosing 12 items from 20

125970

In [53]:
%%timeit
C(20,12)

187 ms ± 6.56 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)


In [54]:
import functools

In [55]:
@functools.lru_cache(maxsize=None)
def C_mem(n,k):
    if k == 0: return 1
    if n == 0: return 0
    return C_mem(n-1,k-1) + C_mem(n-1,k)

In [56]:
C_mem(20,12)

125970

In [57]:
%%timeit
C_mem(20,12)

121 ns ± 6.94 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)


In [58]:
%%timeit
C_mem(25,15)

116 ns ± 3.49 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)


In [84]:
%%timeit
C(22,14)

1.56 s ± 8.06 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


In [38]:
!python --version  # functools.cache needs Python 3.9 or newer, so check which version your environment runs

Python 3.11.4


In [59]:
C_mem(50,30)

47129212243960

In [60]:
%%timeit
C_mem(50,30)

131 ns ± 5.35 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)


### lru_cache wrapper

Newer versions of Python have a decorator that does the same thing as our memo function. It’s called lru_cache.
There is even a maxsize parameter that allows you to limit the size of the cache. This is useful if you know that
you’ll only need the last n results. The name stands for “least recently used,” and it means that if the cache is full,
the least recently used entry is discarded to make room for a new one.
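
The effect of `maxsize` can be observed through `cache_info()`. A minimal sketch with a toy function (`square` is just an illustrative name) and a deliberately tiny cache:

```python
import functools

@functools.lru_cache(maxsize=2)
def square(x):
    return x * x

for x in (1, 2, 3, 1, 3):
    square(x)

print(square.cache_info())  # CacheInfo(hits=1, misses=4, maxsize=2, currsize=2)
```

Calling with 3 pushes out the entry for 1, so the second call with 1 is a miss again; only the final call with 3 is a hit. For the recursive functions in this notebook, `maxsize=None` avoids such evictions.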

In [61]:
C_mem(64,8) # so BRUTE FORCE for 8-queens problem would require checking this many

4426165368

In [62]:
2**32 # no relation to above number

4294967296

In [None]:
%%timeit
C_mem(20,12)

203 ns ± 30 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)


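So roughly a million times faster on a repeated call (187 ms versus about 200 ns), because the memoized call only has to look up a cached result.
The gap only gets wider with larger n and k values.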


In [None]:
C(22,10)  # keeps getting slower

646646

In [None]:
C_mem(30,22)

5852925

In [73]:
C_mem(500,127)

46437038934385738875060685074580974915827322393712987687159475139635977046776280972687848152763132582010929859049061968000

In [74]:
C_mem(2000,1555) # so we calculated how many different ways we can pick 1555 items out of set of 2000 items

RecursionError: maximum recursion depth exceeded

In [None]:
C_mem(49,6)

13983816

In [None]:
C_mem(55,7)  # so our odds in lotteries are not so great...

202927725

You may at times want to rewrite your code to make it iterative. This
can make it faster, and you avoid exhausting the stack if the recursion depth gets excessive. There’s another reason, too:
The iterative versions are often based on a specially constructed cache, rather than the generic “dict keyed by parameter
tuples” used in my @memo. 

This means that you can sometimes use more efficient structures, such as the multidimensional
arrays of NumPy (this is a C type array dense and efficient), or even just nested lists. 

This custom cache design
makes it possible to use DP in more low-level languages (such as C or C++), where general, abstract solutions such as our @memo decorator
are often not feasible.

Note that even though these two techniques often go hand in hand, you are certainly free to use an
iterative solution with a more generic cache or a recursive one with a tailored structure for your subproblem solutions.

Let’s reverse our algorithm, filling out Pascal’s triangle directly. 

In [63]:
from collections import defaultdict  # we could use a regular dict , just less work with defaultdict
# in your language check what collections are available, C#, Java have very nice collections libraries
# careful with C++ collections, check complexity on some of them, it might not be O(1) - map for one (which turns out to be ordered)
# you'd need to look up documentation for that particular structure - so in C++ you might need to use unordered map

In [66]:
def pascal_up(n,k):
    # so tabulated solution, we are going to build a table of solutions
    Cit = defaultdict(int)
    # Cit = {} # switching to a plain dictionary did not help at all, it was about 10% slower
    # careful with off-by-one errors
    for row in range(n+1):
        Cit[row,0] = 1 # we could have used previous row to calculate this one, technically just need first one
        for col in range(1,k+1): # looking like O(n*k) space and time complexity here right?
            # I am using get to avoid looking at edge cases (where we have no value to look up)
            Cit[row,col] = Cit.get((row-1,col-1),0) + Cit.get((row-1,col),0)
    return Cit[n,k]

### Pascal triangle with iterative DP - tabulation

In [None]:
pascal_up(20,12)

125970

In [None]:
%%timeit
pascal_up(20,12)

137 µs ± 4.18 µs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)


In [67]:
pascal_up(50,30)

47129212243960

In [68]:
%%timeit
pascal_up(50,30)

821 µs ± 29.3 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)


In [None]:
%%timeit
C_mem(20,12)

131 ns ± 6.31 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)


In [69]:
pascal_up(200,120)

1647278652451762678788128833110870712983038446517480945400

In [None]:
C_mem(200,120)

1647278652451762678788128833110870712983038446517480945400

In [71]:
pascal_up(2024,1223)

1978313580909322814404279742888220639083711979061915952790417145826485941327234894939258385780824686228079384862407027961433025056766438463773323839416940536781231172166724228912771210164695158929585213737757529910987728315522103983421673060914798337366517226742260910135847960153174204828837881201328813992496688218062280491920325152745752467152549279836765394857076973819686255193997158217396454096616616197813730448592629973705108771904401411312180534910702836373199696360608671550707240571399665331895768867458771375296689485866658969198724600312085207901774291234960477899482724866200

In [None]:
# C_mem(2000,1255) # so we see the need for iterative version - other way would be to allow tail call optimization in functional languages

In [None]:
# so we could further save memory for our pascal_up by only saving the needed information, meaning we only need the previous row
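
A sketch of that memory saving: keep a single row and update it in place from right to left. The name `pascal_row` is made up for this example, and `math.comb` (Python 3.8+) is used only to cross-check the result.

```python
from math import comb

def pascal_row(n, k):
    row = [1] + [0] * k          # row 0 of the triangle, truncated to columns 0..k
    for _ in range(n):           # advance one row at a time
        # Go right to left so row[col - 1] still holds the previous row's value.
        for col in range(k, 0, -1):
            row[col] += row[col - 1]
    return row[k]

print(pascal_row(20, 12), comb(20, 12))
```

Time is still O(n*k), but the space drops from O(n*k) table entries to O(k).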

In [None]:
pascal_up(2000,1255)

59109593628908414948382028796066195825322294401884421608567282370732677108447726193711784730302760990299603214712874545403948907216441186822204871444990120418511945136706510520055755556735802821658571978418751352273298267569828345746978809441973750241286775122448947529540564006233186974990594534845090378195822798749749716850238790136589693799060708520531078121496569151588079913232831392320592160010660575061801016625866394043795717433667107028907325906172066327055484845091441072359350112683890957329354388664535777738465246096416216139097702868890896959249389851651200

# Difference between TOP-DOWN (with memoization) and BOTTOM-UP (with filling up DP table)

Basically the same thing is going on. The main difference is that we need to figure out which cells in the cache
need to be filled out, and we need to find a safe order to do it in so that when we’re about to calculate C[row,col], the
cells C[row-1,col-1] and C[row-1,col] are already calculated. With the memoized function, we needn’t worry about
either issue: It will calculate whatever it needs recursively.

In [None]:
## Back to LIS

In [73]:
# so recursive memoized solution - TOP/DOWN
def rec_lis(seq): # Longest increasing subseq.
    @functools.lru_cache(maxsize=None)
    def L(cur): # Longest ending at seq[cur]
        res = 1 # Length is at least 1 - a single element is always a valid subsequence of a nonempty list
        # so for example a strictly decreasing list has the answer 1
        for pre in range(cur): # Potential predecessors
            if seq[pre] <= seq[cur]: # A valid (not larger) predecessor
                res = max(res, 1 + L(pre)) # Can we improve the solution?
        return res
    return max(L(i) for i in range(len(seq))) # The longest of them all

In [74]:
rec_lis([3,1,0,2,4])

3

In [76]:
naive_list([3,1,0,2,4,7,-6,9])

(1, 2, 4, 7, 9)

In [75]:
rec_lis([3,1,0,2,4,7,-6,9])  # so 5 should be answer here

5

In [54]:
# so recursive memoized solution - TOP/DOWN
# TODO add sequence passing
def rec_lis_full(seq): # Longest increasing subseq.
    @functools.lru_cache(maxsize=None)
    def L(cur): # Longest ending at seq[cur]
        res = 1 # Length is at least 1
        for pre in range(cur): # Potential predecessors
            if seq[pre] <= seq[cur]: # A valid (smaller) predec.
                res = max(res, 1 + L(pre)) # Can we improve the solution?
        return res
    return max(L(i) for i in range(len(seq))) # The longest of them all
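
One possible way to address the "sequence passing" TODO, as a sketch: let the memoized helper return the best predecessor index along with the length, then walk the predecessor links backwards. The names `rec_lis_sequence` and `best` are invented here, and ties are resolved in favour of the earliest predecessor.

```python
import functools

def rec_lis_sequence(seq):
    @functools.lru_cache(maxsize=None)
    def best(cur):
        # (length of the longest run ending at seq[cur], index of its predecessor)
        result = (1, None)
        for pre in range(cur):
            if seq[pre] <= seq[cur] and 1 + best(pre)[0] > result[0]:
                result = (1 + best(pre)[0], pre)
        return result

    if not seq:
        return []
    idx = max(range(len(seq)), key=lambda i: best(i)[0])
    out = []
    while idx is not None:       # follow predecessor links back to the start
        out.append(seq[idx])
        idx = best(idx)[1]
    return out[::-1]

print(rec_lis_sequence([3, 1, 0, 2, 4, 7, -6, 9]))  # [1, 2, 4, 7, 9]
```

Like `rec_lis`, this is still recursive, so very long inputs can run into the recursion limit.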

In [77]:
# tabulated solution
# tabulation is the classical dynamic programming approach - BOTTOM - UP
def basic_lis(seq, debug=True):
    L = [1] * len(seq)
    for cur, val in enumerate(seq):
        for pre in range(cur): # so two nested loops smells quadratic right?
            if seq[pre] <= val: # <= allows equal values (nondecreasing), matching rec_lis
                L[cur] = max(L[cur], 1 + L[pre])
                if debug:
                    print(L)
    return max(L) # linear lookup

In [None]:
# TODO add iterative sequence passing 
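
A sketch of how the tabulated version could return the sequence itself: store a predecessor index next to each length. `basic_lis_sequence` is a hypothetical name, and `<=` is used so that equal values count, in line with the nondecreasing definition of the puzzle.

```python
def basic_lis_sequence(seq):
    if not seq:
        return []
    length = [1] * len(seq)      # length[i]: longest run ending at seq[i]
    prev = [None] * len(seq)     # prev[i]: index of the element before seq[i]
    for cur, val in enumerate(seq):
        for pre in range(cur):
            if seq[pre] <= val and length[pre] + 1 > length[cur]:
                length[cur] = length[pre] + 1
                prev[cur] = pre
    idx = max(range(len(seq)), key=length.__getitem__)
    result = []
    while idx is not None:       # walk the predecessor links backwards
        result.append(seq[idx])
        idx = prev[idx]
    return result[::-1]

print(basic_lis_sequence([3, 1, 0, 2, 4, 7, 9, 6]))  # [1, 2, 4, 7, 9]
```

The extra `prev` list costs O(n) space and leaves the O(n^2) running time unchanged.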

In [78]:
basic_lis([3,1,2,0,4])

[1, 1, 2, 1, 1]
[1, 1, 2, 1, 2]
[1, 1, 2, 1, 2]
[1, 1, 2, 1, 3]
[1, 1, 2, 1, 3]


3

In [104]:
basic_lis([3,1,0,2,4])

[1, 1, 1, 2, 1]
[1, 1, 1, 2, 1]
[1, 1, 1, 2, 2]
[1, 1, 1, 2, 2]
[1, 1, 1, 2, 2]
[1, 1, 1, 2, 3]


3

In [None]:
basic_lis([3,1,0,2,4,7,-6,9]) 

5

In [89]:
for i,n in enumerate("Valdis",start=100):  # just a Pythonic way of adding index to sequences
    print(i,n)

100 V
101 a
102 l
103 d
104 i
105 s


In [90]:
basic_lis([3,1,0,2,4,7,9,6])

[1, 1, 1, 2, 1, 1, 1, 1]
[1, 1, 1, 2, 1, 1, 1, 1]
[1, 1, 1, 2, 2, 1, 1, 1]
[1, 1, 1, 2, 2, 1, 1, 1]
[1, 1, 1, 2, 2, 1, 1, 1]
[1, 1, 1, 2, 3, 1, 1, 1]
[1, 1, 1, 2, 3, 2, 1, 1]
[1, 1, 1, 2, 3, 2, 1, 1]
[1, 1, 1, 2, 3, 2, 1, 1]
[1, 1, 1, 2, 3, 3, 1, 1]
[1, 1, 1, 2, 3, 4, 1, 1]
[1, 1, 1, 2, 3, 4, 2, 1]
[1, 1, 1, 2, 3, 4, 2, 1]
[1, 1, 1, 2, 3, 4, 2, 1]
[1, 1, 1, 2, 3, 4, 3, 1]
[1, 1, 1, 2, 3, 4, 4, 1]
[1, 1, 1, 2, 3, 4, 5, 1]
[1, 1, 1, 2, 3, 4, 5, 2]
[1, 1, 1, 2, 3, 4, 5, 2]
[1, 1, 1, 2, 3, 4, 5, 2]
[1, 1, 1, 2, 3, 4, 5, 3]
[1, 1, 1, 2, 3, 4, 5, 4]


5

In [None]:
# so basic_lis is tabulating and using two nested loops - thus O(n^2) (again not all nested loops will be n^2 but most are)
# can we do better for this increasing subsequence length

A crucial insight is that if more than one predecessor terminate subsequences of length m, it doesn’t matter which
one of them we use—they’ll all give us an optimal answer. Say, we want to keep only one of them around; which one
should we keep? The only safe choice would be to keep the smallest of them, because that wouldn’t wrongly preclude
any later elements from building on it. So let’s say, inductively, that at a certain point we have a sequence end of
endpoints, where end[idx] is the smallest among the endpoints we’ve seen for increasing subsequences of length idx+1
(we’re indexing from 0). Because we’re iterating over the sequence, these will all have occurred earlier than our current
value, val. All we need now is an inductive step for extending end, finding out how to add val to it. If we can do that, at
the end of the algorithm len(end) will give us the final answer—the length of the longest increasing subsequence.

This devilishly clever little algorithm was first described by Michael L. Fredman in 1975.

## Optimal Longest Increasing Subsequence
https://www.sciencedirect.com/science/article/pii/0012365X7590103X

In [91]:
# https://docs.python.org/3/library/bisect.html - basically we use binary search to find insertion points 
from bisect import bisect  # provides support for maintaining a list in sorted order without having to sort the list after each insertion. 
def lis(seq): # Longest increasing subseq.
    end = [] # End-values for all lengths
    for val in seq: # Try every value, in order
        idx = bisect(end, val) # binary search: cannot be constant time, but it is O(log n)
        if idx == len(end): # Can we build on an end val?
            end.append(val) # Longest seq. extended
        else: 
            end[idx] = val # Prev. endpoint reduced
    return len(end) # The longest we found

In [92]:
lis([3,1,0,2,4,7,9,6])

5
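
To see how `end` evolves, the same loop can record a snapshot after each value. A sketch; `lis_trace` is a hypothetical helper, and `bisect_right` is the function that the name `bisect` refers to.

```python
from bisect import bisect_right

def lis_trace(seq):
    end = []
    snapshots = []
    for val in seq:
        idx = bisect_right(end, val)
        if idx == len(end):
            end.append(val)          # val extends the longest run found so far
        else:
            end[idx] = val           # val is a smaller endpoint for length idx + 1
        snapshots.append(list(end))
    return snapshots

for snapshot in lis_trace([3, 1, 0, 2, 4, 7, 9, 6]):
    print(snapshot)
# the last line printed is [0, 2, 4, 6, 9]
```

Note that `end` is not necessarily a valid subsequence: in the final snapshot 6 sits before 9, although 6 comes after 9 in the input. Only its length is meaningful.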

## So Top-Down
* Create a recursive solution
* memoize it - (cache it!)

## Bottom-Up
* figure out a way of storing previous results
* tabulate(calculate) up
* you might not need to store the full table maybe just previous row - can save space

In [None]:
## TODO tabulated solution to LIS problem that returns the actual sequence
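
One possible approach to this TODO that keeps the O(n log n) bound, as a sketch: alongside the endpoint values, remember the index of each endpoint and, for every element, the index of its predecessor. The names `lis_sequence`, `end_vals`, `end_idx` and `prev` are invented here.

```python
from bisect import bisect_right

def lis_sequence(seq):
    end_vals = []                # smallest end value for each length
    end_idx = []                 # index in seq of that end value
    prev = [None] * len(seq)     # predecessor index for each element
    for i, val in enumerate(seq):
        pos = bisect_right(end_vals, val)
        prev[i] = end_idx[pos - 1] if pos > 0 else None
        if pos == len(end_vals):
            end_vals.append(val)
            end_idx.append(i)
        else:
            end_vals[pos] = val
            end_idx[pos] = i
    out = []
    i = end_idx[-1] if end_idx else None
    while i is not None:         # walk back from the end of the longest run
        out.append(seq[i])
        i = prev[i]
    return out[::-1]

print(lis_sequence([3, 1, 0, 2, 4, 7, 9, 6]))  # [0, 2, 4, 7, 9]
```

The result may differ from the one found by other versions when several longest subsequences exist, since any of them is a valid answer.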

## Coin change solution using dynamic bottom-up programming

The idea is to use a table of previous results to calculate the next one.
The best solution for an amount is one coin more than the best solution for some smaller amount (the amount minus one of the coins), so we can build the table up from 0.

Let's remember how our greedy solution failed on some cases

* For example Greedy would fail on [1, 3, 4] and 6 - where correct solution is 3,3 but greedy would give 4,1,1
* Another fail would be on [1,10,25] and 31 - greedy would give 25,1,1,1,1,1,1 and correct solution is 10,10,10,1
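
For reference, a minimal sketch of a greedy strategy that reproduces the two failures listed above. The earlier greedy code is not shown in this section, so `greedy_coins` is a hypothetical stand-in rather than the original implementation.

```python
def greedy_coins(amount, coins):
    used = []
    for coin in sorted(coins, reverse=True):  # always try the largest coin first
        while amount >= coin:
            used.append(coin)
            amount -= coin
    return used  # assumes a 1 coin exists, otherwise some amount may be left over

print(greedy_coins(6, [1, 3, 4]))     # [4, 1, 1]
print(greedy_coins(31, [1, 10, 25]))  # [25, 1, 1, 1, 1, 1, 1]
```

In both cases grabbing the largest coin first locks in a choice that a better solution would avoid.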

Instead we will use a dynamic programming solution - we will use a table to store the results of the previous solutions and use them to calculate the next one

In [79]:
# we will use a list (table) to store our results
# table[i] holds the minimum number of coins needed to make amount i
# for each amount we try every coin that fits and keep the best result
def find_minimum_coins(amount, coins=(1,5,10,25)):
    table = [0] * (amount + 1) # table[0] = 0: no coins are needed for amount 0
    for i in range(1, amount + 1): # go through all amounts in increasing order
        # one more coin than the best earlier amount; min() raises ValueError if no coin fits (e.g. no 1 coin)
        table[i] = min(table[i - c] + 1 for c in coins if c <= i)
    return table[amount], table

# let's test it on 31
coin_cnt, full_solution = find_minimum_coins(31)
# print full solution with amounts
print(*list(enumerate(full_solution)), sep="\n")

(0, 0)
(1, 1)
(2, 2)
(3, 3)
(4, 4)
(5, 1)
(6, 2)
(7, 3)
(8, 4)
(9, 5)
(10, 1)
(11, 2)
(12, 3)
(13, 4)
(14, 5)
(15, 2)
(16, 3)
(17, 4)
(18, 5)
(19, 6)
(20, 2)
(21, 3)
(22, 4)
(23, 5)
(24, 6)
(25, 1)
(26, 2)
(27, 3)
(28, 4)
(29, 5)
(30, 2)
(31, 3)


In [80]:
# let's try it for amount 15, with coins (1,7,10)
coin_cnt, full_solution = find_minimum_coins(15, (1,7,10))
# print full solution with amounts
print(*list(enumerate(full_solution)), sep="\n")

(0, 0)
(1, 1)
(2, 2)
(3, 3)
(4, 4)
(5, 5)
(6, 6)
(7, 1)
(8, 2)
(9, 3)
(10, 1)
(11, 2)
(12, 3)
(13, 4)
(14, 2)
(15, 3)


In [61]:
# now let's test 31 but use only coins (1,10,25)
coin_cnt, full_solution = find_minimum_coins(31, (1,10,25))
# print full solution with amounts
print(*list(enumerate(full_solution)), sep="\n")

(0, 0)
(1, 1)
(2, 2)
(3, 3)
(4, 4)
(5, 5)
(6, 6)
(7, 7)
(8, 8)
(9, 9)
(10, 1)
(11, 2)
(12, 3)
(13, 4)
(14, 5)
(15, 6)
(16, 7)
(17, 8)
(18, 9)
(19, 10)
(20, 2)
(21, 3)
(22, 4)
(23, 5)
(24, 6)
(25, 1)
(26, 2)
(27, 3)
(28, 4)
(29, 5)
(30, 3)
(31, 4)


In [81]:
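# sys.maxsize is 2**63 - 1 on a 64-bit build of Python
# we use it below as an "infinity" placeholder for amounts we have not solved yet
import sys
sys.maxsize, 2**63-1 # 2**63 - 1 because one of the 64 bits is the sign bit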


(9223372036854775807, 9223372036854775807)

In [83]:
# let's modify the function to store the actual coins used
def find_minimum_coins(amount, coins=(1,5,10,25)):
    table = [sys.maxsize] * (amount + 1) # table[i]: fewest coins for amount i, sys.maxsize means "not solved yet"
    table[0] = 0 # no coins are needed for amount 0
    coins_used = [[] for _ in range(amount + 1)] # coins_used[i]: the coins that make up amount i
    for i in range(1, amount + 1): # go through all amounts in increasing order
        if i in coins: # exact match so only 1 coin needed
            table[i] = 1
            coins_used[i] = [i]
        else: # now we need to find a minimum by combining previous solutions
            for n in range(1, i//2+1): # split i into n + (i-n); going up to i//2 is enough because the split is symmetric
                # we check previous solutions for minimum
                if table[n] + table[i-n] < table[i]:
                    table[i] = table[n] + table[i-n]
                    coins_used[i] = coins_used[n] + coins_used[i-n]
    return table[amount], coins_used

# test it on 31
coin_cnt, coins_used = find_minimum_coins(31)
# print coin_cnt
print(coin_cnt)
# print coins used
print(coins_used)

3
[[], [1], [1, 1], [1, 1, 1], [1, 1, 1, 1], [5], [1, 5], [1, 1, 5], [1, 1, 1, 5], [1, 1, 1, 1, 5], [10], [1, 10], [1, 1, 10], [1, 1, 1, 10], [1, 1, 1, 1, 10], [5, 10], [1, 5, 10], [1, 1, 5, 10], [1, 1, 1, 5, 10], [1, 1, 1, 1, 5, 10], [10, 10], [1, 10, 10], [1, 1, 10, 10], [1, 1, 1, 10, 10], [1, 1, 1, 1, 10, 10], [25], [1, 25], [1, 1, 25], [1, 1, 1, 25], [1, 1, 1, 1, 25], [5, 25], [1, 5, 25]]


In [84]:
# now let's check with 1,10,25 on 31
coin_cnt, coins_used = find_minimum_coins(31, (1,10,25))
# print coin_cnt
print(coin_cnt)
# print coins used
print(*coins_used, sep="\n")

4
[]
[1]
[1, 1]
[1, 1, 1]
[1, 1, 1, 1]
[1, 1, 1, 1, 1]
[1, 1, 1, 1, 1, 1]
[1, 1, 1, 1, 1, 1, 1]
[1, 1, 1, 1, 1, 1, 1, 1]
[1, 1, 1, 1, 1, 1, 1, 1, 1]
[10]
[1, 10]
[1, 1, 10]
[1, 1, 1, 10]
[1, 1, 1, 1, 10]
[1, 1, 1, 1, 1, 10]
[1, 1, 1, 1, 1, 1, 10]
[1, 1, 1, 1, 1, 1, 1, 10]
[1, 1, 1, 1, 1, 1, 1, 1, 10]
[1, 1, 1, 1, 1, 1, 1, 1, 1, 10]
[10, 10]
[1, 10, 10]
[1, 1, 10, 10]
[1, 1, 1, 10, 10]
[1, 1, 1, 1, 10, 10]
[25]
[1, 25]
[1, 1, 25]
[1, 1, 1, 25]
[1, 1, 1, 1, 25]
[10, 10, 10]
[1, 10, 10, 10]
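
Storing a full list of coins for every amount takes a lot of memory. A common alternative, sketched here, is to remember only one coin per amount (the last coin of some optimal solution) and rebuild the answer by walking back through that table. The name `min_coins_with_choice` and its helper lists are made up for this example.

```python
def min_coins_with_choice(amount, coins=(1, 5, 10, 25)):
    """Return (count, coins used) using a table of 'last coin' choices."""
    INF = float("inf")
    count = [0] + [INF] * amount   # count[i]: fewest coins for amount i
    last = [None] * (amount + 1)   # last[i]: one coin of an optimal solution for i
    for i in range(1, amount + 1):
        for c in coins:
            if c <= i and count[i - c] + 1 < count[i]:
                count[i] = count[i - c] + 1
                last[i] = c
    if count[amount] == INF:
        return None, []            # amount cannot be formed with these coins
    used = []
    i = amount
    while i > 0:                   # follow the stored choices back down to 0
        used.append(last[i])
        i -= last[i]
    return count[amount], used

print(min_coins_with_choice(31, (1, 10, 25)))  # (4, [1, 10, 10, 10])
```

This version does one pass over the coins per amount instead of trying every split of the amount, and it also reports amounts that cannot be formed at all.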


## Algorithms that use Dynamic Programming

### Graph algorithms

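* Shortest path algorithms (for example Bellman-Ford and Floyd-Warshall)
* Longest path algorithms (in directed acyclic graphs)
* Minimum spanning tree algorithms (note: the classic Prim and Kruskal algorithms are greedy rather than DP)
* Maximum flow algorithms (note: the classic algorithms are based on augmenting paths rather than DP)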

### String and sequence algorithms

* Longest common subsequence
* Longest increasing subsequence
* Edit distance

### Combinatorial algorithms

* Binomial coefficients - Pascal's triangle
* Fibonacci numbers
* Factorials

### Geometric algorithms

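* Convex hull (note: the classic algorithms, such as Graham scan, sort and scan rather than use DP)
* Closest pair of points (note: the classic algorithm is divide and conquer rather than DP)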
* Largest empty rectangle