# Dynamic Programming

adapted from: https://www.amazon.com/Python-Algorithms-Mastering-Basic-Language/dp/148420056X

Python Algorithms: Mastering Basic Algorithms in the Python Language

![Bellman](https://upload.wikimedia.org/wikipedia/en/7/7a/Richard_Ernest_Bellman.jpg)

https://en.wikipedia.org/wiki/Richard_E._Bellman

## Dynamic Programming - not the programming in computer terms!
The term dynamic programming (or simply DP) can be a bit confusing to newcomers. Both of the words are
used in a different way than most might expect. Programming here refers to making a set of choices (as in “linear
programming”) and thus has more in common with the way the term is used in, say, television, than in writing
computer programs. Dynamic simply means that things change over time; in this case, that each choice depends
on the previous one. In other words, this “dynamism” has little to do with the program you’ll write and is just a
description of the problem class. In Bellman’s own words, “I thought dynamic programming was a good name. It was
something not even a Congressman could object to. So I used it as an umbrella for my activities.”

* The core technique of DP -> caching
* Decompose your problem recursively/inductively (as usual)
* Allow overlap between the subproblems
* A plain recursive solution may solve the same subproblems an exponential number of times -> caching trims away the waste
* The result is usually both an impressively efficient algorithm and a greater insight into the problem


Commonly, DP algorithms turn the recursive formulation upside down, making it iterative and filling out some
data structure (such as a multidimensional array) step by step. 

* Another option, well suited to high-level languages such as Python, is to implement the recursive formulation directly but to cache the return values.
* If a call is made more than once with the same arguments, the result is simply returned directly from the cache. This is known as **memoization**.

## Requirements for implementing Dynamic Programming

DP is primarily used for solving problems that exhibit two key properties: overlapping subproblems and optimal substructure. DP is particularly effective in reducing the computational complexity of problems by breaking them down into smaller, simpler subproblems and storing the solutions to avoid redundant calculations.

1. Overlapping subproblems: This property indicates that a problem can be broken down into smaller subproblems, which are solved independently. However, these subproblems may share common subproblems, leading to repetitive calculations. Dynamic programming exploits this property by caching the results of subproblems, which can be reused in solving the overall problem.

2. Optimal substructure: This property indicates that an optimal solution to the overall problem can be constructed by combining optimal solutions to its subproblems. Thus, the problem can be solved using a bottom-up approach, where smaller subproblems are solved first and their solutions are combined to solve larger problems.

## Fibonacci 

A classic example of a problem with overlapping subproblems is the Fibonacci number sequence. The Fibonacci sequence is defined as follows:

F(0) = 0,
F(1) = 1,
F(n) = F(n-1) + F(n-2) for n > 1

A naive approach to calculating the nth Fibonacci number would be to use the recursive formula directly. However, this approach takes exponential time (O(2^n) is a convenient upper bound) because the same subproblems are calculated over and over.
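To make the repeated work visible, here is a small sketch that counts how often each argument is evaluated by the plain recursion. The helper names `count_fib_calls` and `fib_plain` are made up for this illustration, and the code follows the definition above (F(0) = 0, F(1) = 1).

```python
from collections import Counter

def count_fib_calls(n):
    """Return (F(n), Counter of how often each argument was evaluated)."""
    calls = Counter()

    def fib_plain(i):
        calls[i] += 1          # record every evaluation, including repeats
        if i < 2:
            return i
        return fib_plain(i - 1) + fib_plain(i - 2)

    return fib_plain(n), calls

value, calls = count_fib_calls(20)
print(value)                  # F(20)
print(sum(calls.values()))    # total number of recursive calls
print(calls[1])               # how many times F(1) alone was recomputed
```

Only 21 distinct arguments (0 to 20) exist, yet the plain recursion makes many thousands of calls. That gap is exactly what caching removes.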

In [None]:
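## Fibonacci
# NOTE: this first version uses the shifted convention fib(0) = fib(1) = 1,
# so fib(i) here equals F(i+1) from the definition above
def fib(i): # finding the i-th member of our Fibonacci chain 1,1,2,3,5,8,13,21,34,55,89
    if i < 2: 
        return 1
    else:
        return fib(i-1) + fib(i-2) # two recursive calls, and the problem size shrinks only by 1 or 2, not by half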

In [None]:
fib(0),fib(1),fib(2),fib(3),fib(4),fib(5)

(1, 1, 2, 3, 5, 8)

In [None]:
fib(5)

8

In [None]:
fib(10)

89

In [None]:
for n in range(1,15):
    print(fib(n), fib(n-1), round(fib(n)/fib(n-1),5))

1 1 1.0
2 1 2.0
3 2 1.5
5 3 1.66667
8 5 1.6
13 8 1.625
21 13 1.61538
34 21 1.61905
55 34 1.61765
89 55 1.61818
144 89 1.61798
233 144 1.61806
377 233 1.61803
610 377 1.61804


![Fib Spiral](https://upload.wikimedia.org/wikipedia/commons/thumb/9/93/Fibonacci_spiral_34.svg/500px-Fibonacci_spiral_34.svg.png)

https://en.wikipedia.org/wiki/Golden_ratio

In [None]:
# so far so good?
fib(30)

1346269

In [None]:
%%timeit
fib(30)

408 ms ± 135 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


In [None]:
%%timeit
fib(31)

720 ms ± 224 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


In [None]:
fib(35)

14930352

In [None]:
fib(36)

24157817

Image source: https://stackoverflow.com/questions/35959100/explanation-on-fibonacci-recursion
![FibTree](https://i.stack.imgur.com/QVSdv.png)

### Fibonacci memoization

Using dynamic programming, we can optimize the calculation of Fibonacci numbers by storing the results of subproblems in a cache or table. Here's a top-down (memoization) approach:

In [1]:
def fibonacci_memo(n, memo={}):  # the mutable default dict is shared between calls, so it acts as a persistent cache
    if n == 0:
        return 0
    elif n == 1:
        return 1
    elif n not in memo:
        memo[n] = fibonacci_memo(n - 1, memo) + fibonacci_memo(n - 2, memo)
    return memo[n]

In [3]:
fibonacci_memo(10)

55

In [4]:
%%timeit
fibonacci_memo(30)

135 ns ± 1.34 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)


In [5]:
%%timeit
fibonacci_memo(100)

136 ns ± 0.53 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)


### Memoization with built in tools

Python offers built-in support for memoization in the standard library module functools. The lru_cache decorator caches the results of a function. Its maxsize argument (128 by default) sets the maximum number of cached entries; when the cache is full, the least recently used entry is discarded to make room for the new one, and maxsize=None lets the cache grow without bound. The hand-written memo decorator below shows the same idea without any eviction policy.
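As a quick illustration of the eviction behaviour, the sketch below uses a deliberately tiny cache and inspects it with `cache_info()`. The function `square` is just a stand-in chosen for this example.

```python
from functools import lru_cache

@lru_cache(maxsize=2)          # tiny cache so that eviction is easy to observe
def square(x):
    return x * x

for arg in (1, 2, 1, 3, 2):    # the final 2 was evicted when 3 was stored
    square(arg)

info = square.cache_info()     # named tuple: hits, misses, maxsize, currsize
print(info)
```

Only the second call with `1` is a cache hit. Storing `3` evicts `2` (the least recently used entry), so the final call with `2` is a miss again.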

In [None]:
from functools import wraps
def memo(func):
    cache = {} # Stored subproblem solutions; a dict is a hash map
    @wraps(func) # Make wrap look like func
    def wrap(*args): # The memoized wrapper
        if args not in cache: # Not already computed? O(1) lookup on average
            cache[args] = func(*args) # Compute & cache the solution; storing is O(1), but func itself has its own cost
        return cache[args] # Return the cached solution, again O(1)
    return wrap # Return the wrapper
# PS: Python has memoization in its standard library as well (functools); we write our own here to learn how it works

In [None]:
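fib_memo = memo(fib) # functions are first-class citizens in Python; note that only the outermost call is cached here, the recursive calls inside fib still go to the plain fib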


In [None]:
fib_memo(35)

14930352

In [None]:
fib_memo(30)

1346269

In [None]:
fib_memo(40)

KeyboardInterrupt: ignored
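Why does the wrapped call above take so long while a decorated function is instant? Wrapping an existing function with `memo(...)` caches only the outermost call, because the recursive calls inside the body still refer to the original, undecorated name. The sketch below counts calls to show the difference. The names `fib_plain`, `fib_outer_only`, `fib_decorated` and `calls` are introduced only for this illustration.

```python
from functools import wraps

def memo(func):
    cache = {}
    @wraps(func)
    def wrap(*args):
        if args not in cache:
            cache[args] = func(*args)
        return cache[args]
    return wrap

calls = {"plain": 0, "decorated": 0}

def fib_plain(i):
    calls["plain"] += 1
    return 1 if i < 2 else fib_plain(i - 1) + fib_plain(i - 2)

# Only the outer call goes through the cache; the recursion inside
# fib_plain still calls the undecorated fib_plain.
fib_outer_only = memo(fib_plain)

@memo
def fib_decorated(i):
    # The name fib_decorated now refers to the wrapper,
    # so every recursive call goes through the cache.
    calls["decorated"] += 1
    return 1 if i < 2 else fib_decorated(i - 1) + fib_decorated(i - 2)

fib_outer_only(20)
fib_decorated(20)
print(calls)
```

The decorated version evaluates each argument from 0 to 20 exactly once, while the outer-only version does the full exponential amount of work.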

In [None]:
%%timeit
fib(30)

317 ms ± 6.92 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


In [None]:
@memo # this is a decorator - meaning we wrap our fib_m function in memo function
def fib_m(i):
    if i < 2: 
        return 1
    else:
        return fib_m(i-1) + fib_m(i-2)

In [None]:
%%timeit
fib_m(30)

238 ns ± 107 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)


In [None]:
%%timeit
fib_m(35)

196 ns ± 4.38 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)


In [None]:
fib_m(36) # the decorated version caches the recursive calls too, unlike fib_memo = memo(fib) above, which caches only the outer call

24157817

In [None]:
fib_m(40)

165580141

In [None]:
fib_m(60)

2504730781961

In [None]:
fib_m(200)

453973694165307953197296969697410619233826

In [None]:
fib_m(1000) # on Google Colab the recursion limit is 1000 by default

RecursionError: ignored

In [None]:
import sys
sys.setrecursionlimit(10_000)

In [None]:
fib_m(2000) 
# sadly memoization does not solve the stack overflow problem

6835702259575806647045396549170580107055408029365524565407553367798082454408054014954534318953113802726603726769523447478238192192714526677939943338306101405105414819705664090901813637296453767095528104868264704914433529355579148731044685634135487735897954629842516947101494253575869699893400976539545740214819819151952085089538422954565146720383752121972115725761141759114990448978941370030912401573418221496592822626

In [None]:
fib_m(5000) 
# sadly memoization does not help with stack depth; recursing too deep is a good way to force a kernel restart :)

6276302800488957086035253108349684055478528702736457439025824448927937256811663264475883711527806250329984690249846819800648580083040107584710332687596562185073640422286799239932615797105974710857095487342820351307477141875012176874307156016229965832589137779724973854362777629878229505500260477136108363709090010421536915488632339240756987974122598603591920306874926755600361865354330444681915154695741851960071089944015319300128574107662757054790648152751366475529121877212785489665101733755898580317984402963873738187000120737824193162011399200547424034440836239726275765901190914513013217132050988064832024783370583789324109052449717186857327239783000020791777804503930439875068662687670678802914269784817022567088069496231111407908953313902398529655056082228598715882365779469902465675715699187225655878240668599547496218159297881601061923195562143932693324644219266564617042934227893371179832389642895285401263875342640468017378925921483580111278055044254198382265567395946431803304304326865077

## Built in solution for memoization in Python


In [None]:
import functools
# https://stackoverflow.com/questions/1988804/what-is-memoization-and-how-can-i-use-it-in-python
@functools.lru_cache(maxsize=None) # a decorator; the default keeps only the 128 most recent results, maxsize=None caches everything
def fib(num): # note: this redefinition uses the standard convention fib(0) = 0, fib(1) = 1
    if num < 2:
        return num
    else:
        return fib(num-1) + fib(num-2)
# Python 3.9 added functools.cache, which is equivalent to lru_cache(maxsize=None)

In [None]:
fib(35)

9227465

In [None]:
%%timeit
fib(35)

90.3 ns ± 15.9 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)


In [None]:
%%timeit
fib(100)

85.9 ns ± 1.45 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)


In [None]:
%%timeit
fib_m(100)  # so my memoization solution is about 2.5x slower than library solution but both are extremely fast

The slowest run took 44.83 times longer than the fastest. This could mean that an intermediate result is being cached.
1000000 loops, best of 5: 247 ns per loop


In [None]:
# so the official caching version is about 3 times faster than our self-made version; a factor of 3 is not a dealbreaker but nice to know

In [None]:
fib(40)

102334155

In [None]:
%%timeit
fib(100)

88 ns ± 2.44 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)


In [None]:
fib(200)

280571172992510140037611932413038677189525

In [None]:
fib(1200)

27269884455406270157991615313642198705000779992917725821180502894974726476373026809482509284562310031170172380127627214493597616743856443016039972205847405917634660750474914561879656763268658528092195715626073248224067794253809132219056382939163918400

In [None]:
fib(2000)

4224696333392304878706725602341482782579852840250681098010280137314308584370130707224123599639141511088446087538909603607640194711643596029271983312598737326253555802606991585915229492453904998722256795316982874482472992263901833716778060607011615497886719879858311468870876264597369086722884023654422295243347964480139515349562972087652656069529806499841977448720155612802665404554171717881930324025204312082516817125

In [None]:
fib(3000)

410615886307971260333568378719267105220125108637369252408885430926905584274113403731330491660850044560830036835706942274588569362145476502674373045446852160486606292497360503469773453733196887405847255290082049086907512622059054542195889758031109222670849274793859539133318371244795543147611073276240066737934085191731810993201706776838934766764778739502174470268627820918553842225858306408301661862900358266857238210235802504351951472997919676524004784236376453347268364152648346245840573214241419937917242918602639810097866942392015404620153818671425739835074851396421139982713640679581178458198658692285968043243656709796000

In [None]:
fib(5000)

RecursionError: maximum recursion depth exceeded

In [None]:
# so the problem with TOP-DOWN memoization is that we are still left with recursive calls going over our stack limit
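One workaround that keeps the recursive definition, shown here only as a sketch, is to warm the cache in increasing order so that no single call has to recurse very deep. The names `fib_cached` and `fib_warm` and the `step` parameter are assumptions made for this example. The step of 100 is simply a value comfortably below the default recursion limit.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def fib_cached(num):
    return num if num < 2 else fib_cached(num - 1) + fib_cached(num - 2)

def fib_warm(n, step=100):
    # Fill the cache in increasing order. Each call finds results at most
    # `step` positions below it already cached, so the recursion stays shallow.
    for i in range(0, n, step):
        fib_cached(i)
    return fib_cached(n)

print(len(str(fib_warm(5000))))   # number of digits, computed without raising the recursion limit
```

In effect this visits the subproblems in bottom-up order while still using the top-down code.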

## So how to solve the stack overflow problem?

In [None]:
# we could try the build up solution - meaning BOTTOM-UP method
# this will usually be an iterative(looping) solution so no worries about stack


# Iterative version of Fibonacci with full table

In [None]:
# simple iterative version that keeps the full table
def fib_it(n): # same shifted convention as the first fib: fib_it(0) = fib_it(1) = 1
    fibs = [1,1] # we are going to build our answers in a list
    # let's pretend we do not know of any formulas and optimizations
    if n < 2:
        return fibs[n]
    ndx = 2
    while ndx <= n:
        fibs.append(fibs[ndx-1]+fibs[ndx-2]) # building a 1-d table (array/list) of answers
        # for Fibonacci we could of course store just the 2 previous values instead of all of them
        ndx+=1
    return fibs[n] # the list is 0-indexed, so fibs[n] is exactly the entry for n


In [None]:
fib_it(2)

2

In [None]:
%%timeit
fib_it(100)

37.4 µs ± 10.2 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)


In [None]:
for n in range(0,10+1):
    print(fib_it(n))

1
1
2
3
5
8
13
21
34
55
89


In [None]:
fib_it(5),fib_it(6)

(8, 13)

In [None]:
fib_it(35),fib_it(36)

(14930352, 24157817)

In [None]:
%%timeit
fib_it(35)

15.1 µs ± 160 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)


In [None]:
# so one way we could improve is that we do not need to store all this knowledge about previous solutions, 
# (unless we were building a table of ALL solutions)

In [None]:
# we only need to store 2 values; this improves the constant factor and fixes our space complexity:
# instead of O(n) space for the table we need just O(1), since we are only storing two values
def fib_v2(n):
    prev, cur = 1, 1
    # base case: the start of our table, except that we keep just two values instead of a table
    if n <= 1:
        return prev
    ndx = 2
    while ndx <= n:
        prev, cur = cur, prev+cur # tuple unpacking assigns both values at once; the right side is evaluated first
        # if your language does not support this, you can simply use a third temp variable
        ndx += 1
    return cur

In [None]:
fib_v2(5),fib_v2(6)

(8, 13)

In [None]:
fib_v2(35)

14930352

In [None]:
%%timeit
fib_v2(100)

13.1 µs ± 3.69 µs per loop (mean ± std. dev. of 7 runs, 100000 loops each)


In [None]:
%%timeit
fib_it(1000)

301 µs ± 77.3 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)


In [None]:
%%timeit
fib_v2(1000)

243 µs ± 48.3 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)


In [None]:
%%timeit
fib_it(10000)

4.89 ms ± 153 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)


In [None]:
%%timeit
fib_v2(10_000)

2.55 ms ± 212 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)


In [None]:
fib_it(10)

89

In [None]:
fib_v2(10)

89

## Memoization by hand 

Let's see how we could have implemented memoization directly in our function

In [None]:
def fib_cache(num, cache):
    # base cases (0 and 1) are expected to be pre-seeded in the cache by the caller
    if num in cache:
        return cache[num]
    cache[num] = fib_cache(num-1, cache) + fib_cache(num-2, cache)
    return cache[num]
    

In [None]:
fib_cache(5, {0:1, 1:1})

8

In [None]:
fib_cache(30, {0:1, 1:1})

1346269

In [None]:
fib_cache(35, {0:1, 1:1})

14930352

In [None]:
%%timeit
fib_cache(100, {0:1, 1:1})

57.3 µs ± 18.7 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)


# Pascal's triangle
![Triangle](https://upload.wikimedia.org/wikipedia/commons/0/0d/PascalTriangleAnimated2.gif)

The combinatorial
meaning of C(n,k) is the number of k-sized subsets you can get from a set of size n.

In mathematics, a combination is a selection of items from a collection, such that the order of selection does not matter (unlike permutations). https://en.wikipedia.org/wiki/Combination
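As a small sanity check of this definition, the sketch below enumerates all 3-element subsets of a 5-element set and compares the count with the standard library's `math.comb` (available since Python 3.8). The list `items` is an arbitrary example.

```python
from itertools import combinations
from math import comb

items = ["a", "b", "c", "d", "e"]
subsets = list(combinations(items, 3))   # every 3-element selection, order ignored

print(len(subsets))   # counted by enumeration
print(comb(5, 3))     # counted by formula
```

Both numbers agree, and `("a", "b", "c")` appears only once because order does not matter.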

![Comb](https://upload.wikimedia.org/wikipedia/commons/thumb/6/65/Combinations_without_repetition%3B_5_choose_3.svg/440px-Combinations_without_repetition%3B_5_choose_3.svg.png)

In [None]:
# this is horrible again: we have 2 recursive calls for each call
def C(n,k):
    if k == 0: return 1
    if n == 0: return 0
    return C(n-1,k-1) + C(n-1,k)

In [None]:
C(2,0),C(2,1),C(2,2)

(1, 2, 1)

In [None]:
C(3,0),C(3,1),C(3,2),C(3,3)

(1, 3, 3, 1)

In [None]:
C(4,0),C(4,1),C(4,2),C(4,3),C(4,4)

(1, 4, 6, 4, 1)

In [None]:
C(6,3)

20

In [None]:
C(20,12)

125970

In [None]:
%%timeit
C(20,12)

1 loop, best of 5: 350 ms per loop


In [None]:
import functools

In [None]:
@functools.lru_cache(maxsize=None)
def C_mem(n,k):
    if k == 0: return 1
    if n == 0: return 0
    return C_mem(n-1,k-1) + C_mem(n-1,k)

In [None]:
C_mem(20,12)

125970

In [None]:
%%timeit
C_mem(20,12)

The slowest run took 21.83 times longer than the fastest. This could mean that an intermediate result is being cached.
10000000 loops, best of 5: 138 ns per loop


## Python Functools

The functools module is for higher-order functions: functions that act on or return other functions.

* https://docs.python.org/3/library/functools.html


### functools.lru_cache vs functools.cache

* functools.cache (Python 3.9+) is exactly the same as lru_cache with maxsize=None
* https://stackoverflow.com/questions/70301475/difference-between-functools-cache-and-lru-cache




In [None]:
@functools.cache
def C_cache(n,k):
    if k == 0: return 1
    if n == 0: return 0
    return C_cache(n-1,k-1) + C_cache(n-1,k)

AttributeError: ignored

In [86]:
!python --version  # functools.cache needs Python 3.9 or newer; check which version your runtime provides

Python 3.9.16


In [None]:
C_mem(64,8) # brute force for the 8-queens problem would have to check this many placements

4426165368

In [None]:
2**32

4294967296

In [None]:
%%timeit
C_mem(20,12)

203 ns ± 30 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)


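So the memoized version is on the order of a million times faster here (about 350 ms versus about 200 ns, although timeit mostly measures cache hits on the repeated calls).
The gap only grows with larger n and k values.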


In [None]:
C(22,10)  # keeps getting slower

646646

In [None]:
C_mem(30,22)

5852925

In [None]:
C_mem(500,127)

46437038934385738875060685074580974915827322393712987687159475139635977046776280972687848152763132582010929859049061968000

In [None]:
C_mem(2000,1555) # so we calculated how many different ways we can pick 1555 items out of set of 2000 items

537425311263286649045728031247848043926269386315713439394349281522386287881044189299660257402655939762820833577618210552868619936076461996950343651583167622419765426925933775006289767911413157852444651898746557509579506062674421789656825030627507059629199364370311140152613280678386654811275586813136941829060081126272639830187540208190808703691937737524007173610897100607592784614616627680955021048820164545102485799315069769979712239992887424317738387672000

In [None]:
C_mem(49,6)

13983816

In [None]:
C_mem(55,7)  # so our odds in lotteries are not so great...

202927725

You may at times want to rewrite your code to make it iterative. This
can make it faster, and you avoid exhausting the stack if the recursion depth gets excessive. There’s another reason, too:
The iterative versions are often based on a specially constructed cache, rather than the generic “dict keyed by parameter
tuples” used in my @memo. 

This means that you can sometimes use more efficient structures, such as the multidimensional
arrays of NumPy (dense and efficient C-style arrays), or even just nested lists. 

This custom cache design
makes it possible to use DP in more low-level languages (ahem, C and C++), where general, abstract solutions such as our @memo decorator
are often not feasible.

Note that even though these two techniques often go hand in hand, you are certainly free to use an
iterative solution with a more generic cache or a recursive one with a tailored structure for your subproblem solutions.

Let’s reverse our algorithm, filling out Pascal’s triangle directly. 

In [72]:
from collections import defaultdict  # a regular dict would also work, defaultdict is just less work
# in your language check which collections are available; C# and Java have very nice collections libraries
# careful with C++ collections: std::map is ordered and has O(log n) lookups, not O(1),
# so in C++ you might want std::unordered_map instead; always check the documented complexity

In [73]:
def pascal_up(n,k):
    Cit = defaultdict(int)
    # Cit = {} # a plain dictionary did not help at all here; it was about 10% slower
    for row in range(n+1):
        Cit[row,0] = 1 # the first column of every row is 1
        for col in range(1,k+1): # two nested loops over rows and columns: O(n*k) time and space
            Cit[row,col] = Cit.get((row-1,col-1),0) + Cit.get((row-1,col),0)
    return Cit[n,k]

In [74]:
pascal_up(20,12)

125970

In [None]:
%%timeit
pascal_up(20,12)

10000 loops, best of 5: 148 µs per loop


In [None]:
pascal_up(200,120)

1647278652451762678788128833110870712983038446517480945400

In [None]:
C_mem(200,120)

1647278652451762678788128833110870712983038446517480945400

In [None]:
pascal_up(2000,1255)

59109593628908414948382028796066195825322294401884421608567282370732677108447726193711784730302760990299603214712874545403948907216441186822204871444990120418511945136706510520055755556735802821658571978418751352273298267569828345746978809441973750241286775122448947529540564006233186974990594534845090378195822798749749716850238790136589693799060708520531078121496569151588079913232831392320592160010660575061801016625866394043795717433667107028907325906172066327055484845091441072359350112683890957329354388664535777738465246096416216139097702868890896959249389851651200

In [None]:
C_mem(2000,1255) # this only works because we raised the recursion limit earlier; with the default limit of 1000 the recursion would fail, which is one more reason to prefer the iterative version

59109593628908414948382028796066195825322294401884421608567282370732677108447726193711784730302760990299603214712874545403948907216441186822204871444990120418511945136706510520055755556735802821658571978418751352273298267569828345746978809441973750241286775122448947529540564006233186974990594534845090378195822798749749716850238790136589693799060708520531078121496569151588079913232831392320592160010660575061801016625866394043795717433667107028907325906172066327055484845091441072359350112683890957329354388664535777738465246096416216139097702868890896959249389851651200

In [None]:
# so we could further save memory in pascal_up by keeping only the needed information: we only need the previous row
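A minimal sketch of that idea follows, assuming we only want the single value C(n, k). The function name `pascal_row` is made up for this example. It keeps one row of length k+1 and updates it in place from right to left.

```python
def pascal_row(n, k):
    """C(n, k) computed bottom-up while keeping only one row of the triangle."""
    row = [1] + [0] * k               # row 0 of the triangle, truncated to columns 0..k
    for _ in range(n):
        # Go right to left so row[col - 1] still holds the previous row's value
        # at the moment it is added.
        for col in range(k, 0, -1):
            row[col] += row[col - 1]
    return row[k]

print(pascal_row(20, 12))
```

Time stays O(n*k), but the space drops from O(n*k) table entries to O(k).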

# Difference between TOP-DOWN (with memoization) and BOTTOM-UP (with filling up DP table)

Basically the same thing is going on. The main difference is that we need to figure out which cells in the cache
need to be filled out, and we need to find a safe order to do it in so that when we’re about to calculate C[row,col], the
cells C[row-1,col-1] and C[row-1,col] are already calculated. With the memoized function, we needn’t worry about
either issue: It will calculate whatever it needs recursively.

## Longest Increasing Subsequence

In [None]:


# Say you have a sequence of numbers, and you want to find its
# longest increasing (or, rather nondecreasing) subsequence—or one of them, if there are more. A subsequence consists
# of a subset of the elements in their original order. So, for example, in the sequence [3, 1, 0, 2, 4], one solution
# would be [1, 2, 4].

# Brute force - when in doubt use brute force
from itertools import combinations
def naive_lis(seq):
    for length in range(len(seq), 0, -1): # n, n-1, ... , 1
        for sub in combinations(seq, length): # Subsequences of given length
            if list(sub) == sorted(sub): # An increasing subsequence?
                return sub # Return it!
naive_lis([3,1,0,2,4])
naive_lis([5,0,2,1,6,3,7,4,6])
# This works (on smaller sequences at least)


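# how about complexity? 
# Two nested loops -> n^2? No: combinations is not O(1).
# Over all lengths it can yield up to 2^n - 1 subsequences, and each check sorts up to n items, so the worst case is exponential.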


In [75]:
# so recursive memoized solution - TOP/DOWN
def rec_lis(seq): # Longest increasing subseq.
    @functools.lru_cache(maxsize=None)
    def L(cur): # Longest ending at seq[cur]
        res = 1 # Length is at least 1: a single number is always a valid subsequence of a non-empty list
        # so, for example, a strictly decreasing list has the answer 1
        for pre in range(cur): # Potential predecessors
            if seq[pre] <= seq[cur]: # A valid (smaller) predec.
                res = max(res, 1 + L(pre)) # Can we improve the solution?
        return res
    return max(L(i) for i in range(len(seq))) # The longest of them all

In [76]:
rec_lis([3,1,0,2,4])  # we find the length of longest increasing subsequence

3

In [77]:
rec_lis([3,1,0,2,4,7,-6,9])  # so 5 should be the answer here

5

In [78]:
# recursive memoized solution - TOP-DOWN
# TODO: return the subsequence itself; for now this is identical to rec_lis and returns only the length
def rec_lis_full(seq): # Longest increasing subseq.
    @functools.lru_cache(maxsize=None)
    def L(cur): # Longest ending at seq[cur]
        res = 1 # Length is at least 1
        for pre in range(cur): # Potential predecessors
            if seq[pre] <= seq[cur]: # A valid (smaller) predec.
                res = max(res, 1 + L(pre)) # Can we improve the solution?
        return res
    return max(L(i) for i in range(len(seq))) # The longest of them all

In [79]:
rec_lis_full([3,1,0,2,4,7,-6,9]) 

5

### Solving LIS using tabulation

To solve the LIS problem using dynamic programming using tabulation (BOTTOM-UP), we can follow these steps:

1. Initialize a list dp with the same length as the input array, and set all elements to 1. Each element of dp represents the length of the longest increasing subsequence ending at the corresponding position in the input array.

2. Iterate through the input array from left to right. For each element, compare it with all previous elements. If the current element is greater than a previous element and the length of the longest increasing subsequence ending at the previous element plus one is greater than the current length stored in dp, update the dp value at the current position.

3. After iterating through the entire input array, the maximum value in the dp list represents the length of the longest increasing subsequence.

In [11]:
def longest_increasing_subsequence_tab(arr):
    n = len(arr)
    dp = [1] * n

    for i in range(1, n):
        for j in range(0, i):
            if arr[i] > arr[j] and dp[i] < dp[j] + 1:
                dp[i] = dp[j] + 1

    return max(dp)

longest_increasing_subsequence_tab([3,1,0,2,4,7,-6,9])

5

In [12]:
longest_increasing_subsequence_tab([10, 22, 9, 33, 21, 50, 41, 60, 80])

6

In [8]:
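# tabulated solution rewritten to be more pythonic
# note: the <= comparison accepts equal neighbours, so this finds the longest nondecreasing subsequence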

def basic_lis(seq):
    L = [1] * len(seq)
    for cur, val in enumerate(seq):
        for pre in range(cur):
            if seq[pre] <= val:
                L[cur] = max(L[cur], 1 + L[pre])
    return max(L)

### Time and Space complexity of tabulated LIS solution

The time complexity of this solution is O(n^2), where n is the length of the input array. This is because, in the worst case, we have to compare each element with all previous elements. The space complexity is O(n) due to the dp list.

In [9]:
basic_lis([3,1,2,0,4])

3

In [83]:
basic_lis([5,4,3,2]) # should return 1

1

In [10]:
basic_lis([3,1,0,2,4,7,-6,9]) 

5

In [None]:
for i,n in enumerate("Valdis",start=100):  # just a Pythonic way of adding index to sequences
    print(i,n)

100 V
101 a
102 l
103 d
104 i
105 s


In [None]:
basic_lis([3,1,0,2,4,7,9,6])

5

In [None]:
# so basic_lis is tabulating and using two nested loops, thus O(n^2) (not all nested loops are n^2, but most are)
# can we do better when all we need is the length of the increasing subsequence?

### Returning actual LIS

To return the actual longest increasing subsequence, we can modify the previous code by adding an auxiliary list to keep track of the subsequences at each position. After calculating the lengths of the longest increasing subsequences, we can find the maximum value in the dp list and retrieve the corresponding subsequence from the auxiliary list. Note that copying a list at every improvement can cost up to O(n^2) extra space on top of the copying time.

In [6]:
def longest_increasing_subsequence(arr):
    n = len(arr)
    dp = [1] * n
    subsequences = [[x] for x in arr]

    for i in range(1, n):
        for j in range(0, i):
            if arr[i] > arr[j] and dp[i] < dp[j] + 1:
                dp[i] = dp[j] + 1
                subsequences[i] = subsequences[j] + [arr[i]]

    max_index = dp.index(max(dp))
    return subsequences[max_index]

# Example usage:
arr = [10, 22, 9, 33, 21, 50, 41, 60, 80]
print(longest_increasing_subsequence(arr))
# Output: [10, 22, 33, 50, 60, 80]

[10, 22, 33, 50, 60, 80]
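A common alternative, sketched here rather than taken from the original notebook, is to store only a predecessor index per position and walk those links backwards at the end. This keeps the extra space at O(n). The names `lis_with_parents` and `parent` are assumptions for this illustration.

```python
def lis_with_parents(arr):
    """Return one longest strictly increasing subsequence using O(n) extra space."""
    n = len(arr)
    if n == 0:
        return []
    dp = [1] * n            # dp[i]: length of the best subsequence ending at i
    parent = [-1] * n       # parent[i]: index of the previous element, -1 if none
    for i in range(1, n):
        for j in range(i):
            if arr[j] < arr[i] and dp[j] + 1 > dp[i]:
                dp[i] = dp[j] + 1
                parent[i] = j
    # Start at the end of a longest subsequence and follow the links back.
    i = dp.index(max(dp))
    result = []
    while i != -1:
        result.append(arr[i])
        i = parent[i]
    return result[::-1]

print(lis_with_parents([10, 22, 9, 33, 21, 50, 41, 60, 80]))
```

The running time is still O(n^2); only the bookkeeping for the reconstruction changes.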


A crucial insight is that if more than one predecessor terminate subsequences of length m, it doesn’t matter which
one of them we use—they’ll all give us an optimal answer. Say, we want to keep only one of them around; which one
should we keep? The only safe choice would be to keep the smallest of them, because that wouldn’t wrongly preclude
any later elements from building on it. So let’s say, inductively, that at a certain point we have a sequence end of
endpoints, where end[idx] is the smallest among the endpoints we’ve seen for increasing subsequences of length idx+1
(we’re indexing from 0). Because we’re iterating over the sequence, these will all have occurred earlier than our current
value, val. All we need now is an inductive step for extending end, finding out how to add val to it. If we can do that, at
the end of the algorithm len(end) will give us the final answer—the length of the longest increasing subsequence.

This devilishly clever little algorithm was first described by Michael L. Fredman in 1975.

## Optimal Longest Increasing Subsequence
https://www.sciencedirect.com/science/article/pii/0012365X7590103X

O(n log n) optimal solution

![Vis LIS](https://upload.wikimedia.org/wikipedia/commons/thumb/1/1d/LISDemo.gif/800px-LISDemo.gif)

In [85]:
# https://docs.python.org/3/library/bisect.html - basically we use binary search to find insertion points 
from bisect import bisect  # provides support for maintaining a list in sorted order without having to sort the list after each insertion. 
def lis(seq): # Longest increasing subseq.
    end = [] # End-values for all lengths
    for val in seq: # Try every value, in order
        idx = bisect(end, val) # Can we build on an end val?
        if idx == len(end): 
            end.append(val) # Longest seq. extended
        else: 
            end[idx] = val # Prev. endpoint reduced
    return len(end) # The longest we found

In [None]:
lis([3,1,0,2,4,7,9,6])

5
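To see how the list of endpoints evolves, the sketch below records a snapshot after each value. The function `lis_trace` and its `strict` flag are additions for illustration only: `bisect_right` (what `bisect` is an alias for) gives the nondecreasing variant used above, while `bisect_left` would give a strictly increasing one.

```python
from bisect import bisect_left, bisect_right

def lis_trace(seq, strict=False):
    """Return the state of the endpoint list after processing each value."""
    find = bisect_left if strict else bisect_right
    end, snapshots = [], []
    for val in seq:
        idx = find(end, val)
        if idx == len(end):
            end.append(val)      # val extends the longest subsequence so far
        else:
            end[idx] = val       # val is a smaller endpoint for length idx + 1
        snapshots.append(list(end))
    return snapshots

seq = [3, 1, 0, 2, 4, 7, 9, 6]
snapshots = lis_trace(seq)
for val, snap in zip(seq, snapshots):
    print(val, snap)

print(len(lis_trace([2, 2, 2])[-1]), len(lis_trace([2, 2, 2], strict=True)[-1]))
```

Note that the final endpoint list is generally not itself a subsequence of the input: here 6 replaces 7 even though 6 comes after 9 in `seq`. Only its length is meaningful.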

## Practical applications of LIS

The Longest Increasing Subsequence (LIS) problem has various practical applications in diverse domains. Some examples include:

1. Bioinformatics: In computational biology, the LIS problem is used for identifying conserved patterns in DNA, RNA, or protein sequences. By comparing sequences, researchers can better understand evolutionary relationships among species and identify functional elements within the sequences.

2. Scheduling: The LIS problem can be applied to scheduling problems where tasks have different priorities and deadlines. By finding the longest increasing subsequence based on priorities, one can create a schedule that maximizes the execution of high-priority tasks without violating deadlines.

3. Data compression: In data compression algorithms like Burrows-Wheeler Transform (BWT), the LIS problem can be used to find patterns within data that can be represented more efficiently. By identifying and encoding these patterns, it is possible to reduce the size of the data without losing any information.

4. Data mining and analysis: The LIS problem can be used for mining sequential patterns in time-series data or transaction data. In finance, for example, it can help identify periods of increasing stock prices or other trends. In retail, it can help identify sequences of items frequently bought together.

5. Performance analysis: In computer systems, the LIS problem can be used to analyze and optimize the performance of caching, paging, or resource allocation policies. By identifying sequences of increasing resource usage or access patterns, system designers can make better decisions about resource allocation and management.

6. Document comparison and analysis: The LIS problem can be used in text analysis and document comparison tasks, where the goal is to identify common substructures or patterns in two or more documents. For example, it can help detect plagiarism or determine the similarity of two texts.

# Conclusion

## So Top-Down
* Create a recursive solution
* memoize it - (cache it!)

## Bottom-Up
* figure out a way of storing previous results
* tabulate(calculate) up
* you might not need to store the full table maybe just previous row - can save space

## References

* https://www.geeksforgeeks.org/longest-increasing-subsequence-dp-3/
* Cormen, T. H., Leiserson, C. E., Rivest, R. L., & Stein, C. Introduction to Algorithms (3rd ed.). Chapter 15, Dynamic Programming, pp. 359-365.
* Hetland, M. L. (2010). Python Algorithms: Mastering Basic Algorithms in the Python Language (1st ed.). Apress. Chapter 8, Tangled Dependencies and Memoization.