When we discussed recursion, we saw two ways to avoid repeated work: a top-down (memoization) approach that stores results in a dictionary, and a bottom-up (tabulation) approach that fills an array by index. Both are valid and both can be quite fast if done correctly. Memoization tends to be easier to read and write, while tabulation tends to require less memory. Both should have approximately the same Big O runtime as long as neither is resource constrained.
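As a quick reminder of the two styles, here is a minimal sketch using Fibonacci. The function names `fib_memo` and `fib_tab` are made up for this illustration, and the convention F(0) = 0, F(1) = 1 is an assumption.

```python
def fib_memo(n, cache=None):
    """Top-down: recurse from n and store each result in a dictionary."""
    if cache is None:
        cache = {0: 0, 1: 1}  # base cases (assumed convention F(0)=0, F(1)=1)
    if n not in cache:
        cache[n] = fib_memo(n - 1, cache) + fib_memo(n - 2, cache)
    return cache[n]


def fib_tab(n):
    """Bottom-up: fill a table by index, starting from the base cases."""
    if n < 2:
        return n
    table = [0] * (n + 1)
    table[1] = 1
    for i in range(2, n + 1):
        table[i] = table[i - 1] + table[i - 2]
    return table[n]


print(fib_memo(30), fib_tab(30))
```

Both versions do one addition per index, so they share the same O(n) runtime. The tabulated version could also keep only the last two entries, which is where the memory savings of tabulation usually come from.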

# Dynamic Programs
Let's recall our Bellman equation and the pieces that go into it.

## Anatomy of a Dynamic Program
To make writing dynamic programs easier, we should identify the following:
1. Base Case: a case for which the problem is trivial. In Fibonacci this was the first two terms.
2. State: We want to organize our problem in such a way that we can have unique states. In the case of the egg drop we identified this as the number of floors we have remaining and the number of eggs we have left. States are independent of path, because once you get to a state the optimal behavior for the future does not depend on how you got there. Someone who won a lottery and now has \$1 million in their bank account should act the same as a billionaire who lost their company and now has only \$1 million remaining.
3. Overall Value function: This is the recursive call. The value of a current policy decision is always going to be determined by the immediate value gained from a policy decision plus all future value that can be obtained by acting optimally. This is represented by the Bellman Equation.
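$$\underbrace{V^*(s_t)}_{\text{optimal value}} = \max_{a}\left \{ \underbrace{R(a,s_t)}_{\text{Immediate Value Function}} +
   \underbrace{V^*(s_{t+1})}_{\text{Optimal Value of the next state}} \right\}$$
where $s_{t+1}$ is the state reached by taking decision $a$ in state $s_t$. All value beyond the next step is already included in $V^*(s_{t+1})$, so it is not summed again.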
4. Immediate Value function: This is the value that is immediately obtained from a single decision given a single state.
5. Policy Decision: Policy decisions are the options available to you at a given state. We maximize our objective by choosing the optimal policy decisions.
6. Lookup/cache: To prevent repeated calculations, you should have some way to store results that were already computed and retrieve them later.
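To see how the six pieces fit together, here is a hedged sketch of the egg drop problem with each piece marked in a comment. The function name `min_drops` and the exact base cases are assumptions made for this illustration. Note that this problem minimizes the worst-case number of drops, so `min` takes the place of `max` in the Bellman equation.

```python
from functools import lru_cache


@lru_cache(maxsize=None)  # 6. Lookup/cache, keyed on the state
def min_drops(floors, eggs):
    """Fewest drops that guarantee finding the critical floor (assumes eggs >= 1)."""
    # 2. State: (floors still to check, eggs left)
    # 1. Base cases: nothing left to check, or a single egg forces
    #    a floor-by-floor search
    if floors == 0:
        return 0
    if eggs == 1:
        return floors
    best = float("inf")
    # 5. Policy decision: which floor k to drop from next
    for k in range(1, floors + 1):
        breaks = min_drops(k - 1, eggs - 1)     # egg broke: search below k
        survives = min_drops(floors - k, eggs)  # egg survived: search above k
        # 4. Immediate value: this drop costs 1
        # 3. Overall value: immediate cost plus the optimal value of
        #    the worse of the two outcomes
        best = min(best, 1 + max(breaks, survives))
    return best


print(min_drops(100, 2))
```

The state is passed as plain function arguments, so `lru_cache` can serve as the lookup without a hand-written dictionary.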

# Top Down Memoization
So we have our equation as listed above. We can use memoization, that is, recursion with a dictionary, to create simple code that performs the task beautifully.

In [1]:
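# possible_decisions, transition, and immediate_value are problem-specific
# and must be defined for the problem being solved.
# Seed the cache with the base cases before calling OptimalValue, otherwise
# the recursion never stops, e.g. cache[base_state] = (base_value, None)
# (base_state and base_value are hypothetical names).
cache = {}

def OptimalValue(state):
    # Return the stored (value, decision) pair if this state was already solved
    if state in cache:
        return cache[state]
    optimal_decision = None
    max_value = float("-inf")  # also works when every value is negative
    for decision in possible_decisions:
        new_state = transition(state, decision)
        # OptimalValue returns (value, decision); only the value is added
        value = immediate_value(state, decision) + OptimalValue(new_state)[0]
        if value > max_value:
            max_value = value
            optimal_decision = decision
    # Store the best pair found, not the last one tried
    cache[state] = (max_value, optimal_decision)
    return cache[state]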

If everything was done correctly, the end result is a lookup table that gives you the optimal value and decision for every state that was reached along the way. Usually the answer to the original problem sits in one of the corners of that table.
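To make the template concrete, here is a minimal sketch applied to a rod-cutting problem, which is not part of this text and is used only as an illustration. The state is the length of rod remaining, a decision is the length of the next piece to cut off and sell, and the price list `PRICES` is made up. Here `possible_decisions` is written as a function of the state, which is an assumption about how the template would be filled in.

```python
# Hypothetical price list: PRICES[length] is the sale price of one piece.
PRICES = {1: 1, 2: 5, 3: 8, 4: 9, 5: 10, 6: 17, 7: 17, 8: 20}

# Base case: a rod of length 0 has nothing left to sell.
cache = {0: (0, None)}


def possible_decisions(state):
    """Piece lengths that can still be cut from the remaining rod."""
    return [piece for piece in PRICES if piece <= state]


def transition(state, decision):
    """Cutting off a piece leaves a shorter rod."""
    return state - decision


def immediate_value(state, decision):
    """Revenue from selling the piece that was just cut."""
    return PRICES[decision]


def optimal_value(state):
    """Return (best revenue, first piece to cut) for a rod of this length."""
    if state in cache:
        return cache[state]
    best_value, best_decision = float("-inf"), None
    for decision in possible_decisions(state):
        new_state = transition(state, decision)
        value = immediate_value(state, decision) + optimal_value(new_state)[0]
        if value > best_value:
            best_value, best_decision = value, decision
    cache[state] = (best_value, best_decision)
    return cache[state]


print(optimal_value(8))
print(sorted(cache))
```

After the call, `cache` holds an entry for every length from 0 to 8, and the answer to the original problem is the entry for the largest length.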