# Backward Induction and Dynamic Programming

## One Period Model

\begin{align}
v_{T}(m_{T})    & = \max_{c_{T}} u(c_{T}) \\
c_{T} & \leq m_{T} \\
u'(c)          & > 0
\end{align}
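
Since $u'(c) > 0$, more consumption is always better, so the constraint binds. A one-line worked solution of the problem above:

$$
c_{T} = m_{T} \quad \Longrightarrow \quad v_{T}(m_{T}) = u(m_{T})
$$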


## Two Period Model (Separable)

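$$
v_{T-1}(m_{T-1}) = \max_{\{c_{T-1},c_{T}\}}~~ u(c_{T-1}) + \beta u(c_{T})
$$

Because the period-$T$ choice is already summarized by $v_{T}$, this can be written as a choice over $c_{T-1}$ alone:

$$
v_{T-1}(m_{T-1}) = \max_{c_{T-1}}~~ u(c_{T-1}) + \beta v_{T}(m_{T})
$$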
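
Linking the two periods requires a rule for how resources evolve, which is not stated above. As an illustrative assumption only, suppose unspent resources carry over one-for-one, $m_{T} = m_{T-1} - c_{T-1}$, and that $u$ is differentiable and concave. Using $v_{T}(m_{T}) = u(m_{T})$ from the one period model, the first order condition for an interior optimum would then be:

$$
u'(c_{T-1}) = \beta u'(m_{T-1} - c_{T-1}) = \beta u'(c_{T})
$$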


## Many Period Model (Geometric, Separable)

$$
v_{t}(m_{t}) = \max_{\{c_{t},c_{t+1},...,c_{T}\}}~~\sum_{n = 0}^{T-t} \beta^{n} u(c_{t+n})
$$

## Bellman Equation
$$
v_{t}(m_{t}) = \max_{c_{t}}~~u(c_{t}) + \beta v_{t+1}(m_{t+1})
$$
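
A minimal sketch of how the Bellman equation can be solved by backward induction on a grid. Everything specific here is an assumption made for illustration and is not part of the notes: log utility, the transition $m_{t+1} = m_{t} - c_{t}$, the grid bounds, and the hypothetical function name `solve_backward`.

```python
import numpy as np

def solve_backward(T=3, beta=0.95, n_m=400, n_share=500):
    """Backward induction for v_t(m) = max_c log(c) + beta * v_{t+1}(m - c).

    Assumptions (illustrative only): u(c) = log(c) and m_{t+1} = m_t - c_t.
    Returns the grid for m and a list of consumption policies, one per period.
    """
    grid = np.linspace(0.05, 10.0, n_m)            # grid for the state m_t
    shares = np.linspace(0.002, 0.998, n_share)    # candidate values of c_t / m_t

    v = np.log(grid)                # terminal period: c_T = m_T, so v_T(m) = u(m)
    policies = [grid.copy()]        # c_T(m_T) = m_T

    for _ in range(T - 1):          # step back through periods T-1, T-2, ..., 1
        c = grid[:, None] * shares[None, :]        # candidate consumption levels
        m_next = grid[:, None] - c                 # implied next-period resources
        # Value of each candidate: utility today plus discounted continuation value,
        # with v_{t+1} linearly interpolated between grid points.
        values = np.log(c) + beta * np.interp(m_next, grid, v)
        best = values.argmax(axis=1)
        rows = np.arange(n_m)
        v = values[rows, best]                     # v_t on the grid
        policies.append(c[rows, best])             # c_t on the grid

    policies.reverse()              # policies[0] is now the earliest period
    return grid, policies
```

Under these particular assumptions the exact policy is $c_{t} = m_{t} / (1 + \beta + \ldots + \beta^{T-t})$, which gives a convenient check on the numerical answer away from the edges of the grid.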




## Terminology:

1. "Backward induction"
1. "Dynamic programming"

## Requirements for Bellman Solution

#### State Variables Characterize Everything

* History does not matter
    - How you got to the state does not affect your optimal choices
* Problem is "1st order Markov"
* Specifically, it is a "Markov Decision Process"
    - This is the framework studied in the "reinforcement learning" branch of "machine learning"
    - A Markov Decision Process can be paired with many different decision rules, not all of them optimal
* Bellman problems are the subset where the decision is *optimal*

#### Time Consistency

* The choice that seems optimal to you at any given date _also_ seems optimal from the perspective of any _other_ date.

#### Example of Time Inconsistency

"Principle of Optimality" is another term for time consistency

Suppose that when you are young, you feel strongly that you want to leave a bequest to your kids.

$$
v_{T-1}(m_{T-1}) = u(c_{T-1}) + \beta u(c_{T}) + w(\text{bequest})
$$

From the perspective of period $T-1$, the optimal choice in period $T$ is to allocate whatever resources you have such that the marginal utility of $c_{T}$ equals the marginal welfare from the bequest.

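Now, when you are old, you decide you don't care about the kids after all. Your period-$T$ self then prefers to consume everything and leave no bequest, so the plan made at $T-1$ is not carried out.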

An interesting implication is that the choices people make depend on the "commitment technologies" available to them.

"Naifs" vs "Sophisticates"

- Naifs: Do not realize their future behavior will differ from what they want now
- Sophisticates: Understand their future selves and strategize accordingly

Problems with solving for the sophisticate's behavior:

1. There may be multiple equilibria
1. The solutions tend to be very complicated

This kind of time-inconsistent behavior has been a key underpinning of much of "behavioral economics".
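
The bequest example can be made concrete with a small numerical sketch. All functional forms and parameter values here are assumptions for illustration, not taken from the notes: $u(c) = \log c$, $w(b) = \gamma \log(1 + b)$, resources carrying over one-for-one ($m_{T} = m_{T-1} - c_{T-1}$), and the hypothetical function name `bequest_example`. The naif chooses $c_{T-1}$ as if the old self will follow the young self's plan; the sophisticate knows the old self will consume everything.

```python
import numpy as np

def bequest_example(m=10.0, beta=0.95, gamma=0.5, n=2001):
    """Period T-1 consumption chosen by a naif and by a sophisticate.

    Assumptions (illustrative only): u(c) = log(c), w(b) = gamma * log(1 + b),
    and m_T = m_{T-1} - c_{T-1}. The old self does not value the bequest.
    """
    c1_grid = np.linspace(0.01, m - 0.01, n)    # candidate values of c_{T-1}
    naif_value = np.empty(n)
    soph_value = np.empty(n)
    for i, c1 in enumerate(c1_grid):
        m_T = m - c1
        # Naif: assumes m_T will be split between c_T and the bequest
        # in the way the young self would like.
        c2 = np.linspace(1e-6, m_T, 400)
        planned = beta * np.log(c2) + gamma * np.log1p(m_T - c2)
        naif_value[i] = np.log(c1) + planned.max()
        # Sophisticate: knows the old self sets c_T = m_T and leaves nothing,
        # so the bequest term is w(0) = 0.
        soph_value[i] = np.log(c1) + beta * np.log(m_T)
    return c1_grid[naif_value.argmax()], c1_grid[soph_value.argmax()]

naif_c1, soph_c1 = bequest_example()
print(f"naif c_(T-1) = {naif_c1:.2f}, sophisticate c_(T-1) = {soph_c1:.2f}")
```

Under these assumed parameter values the naif consumes roughly 4.5 in period $T-1$ and the sophisticate roughly 5.1: the naif saves for a bequest that the old self never makes.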



#### Habits?

Habits can be handled in this framework if you assume that the habit itself constitutes a state variable.

So, the rich-until-yesterday person will have a different "habit stock" from a person who was poor-until-yesterday.

Simplest case: 

$$
h_{t} = c_{t-1}
$$

More plausible case: For $\lambda < 1$

$$
h_{t} = c_{t-1} + \lambda c_{t-2} + \lambda^{2} c_{t-3} + \ldots
$$
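
A geometrically weighted habit stock (weights $1, \lambda, \lambda^{2}, \ldots$ on past consumption) satisfies the recursion $h_{t} = c_{t-1} + \lambda h_{t-1}$, which is why a single number $h_{t}$ can serve as the state instead of the whole consumption history. A small sketch checking this, with hypothetical function names and an assumed starting habit stock of zero:

```python
def habit_direct(c_history, lam):
    """Habit stock as the weighted sum c_{t-1} + lam*c_{t-2} + lam**2*c_{t-3} + ...

    c_history lists past consumption, oldest first.
    """
    return sum(lam**k * c for k, c in enumerate(reversed(c_history)))

def habit_recursive(c_history, lam, h0=0.0):
    """Same habit stock built one period at a time: h_new = c + lam * h_old."""
    h = h0
    for c in c_history:
        h = c + lam * h
    return h

history = [1.0, 2.0, 3.0]    # consumption in t-3, t-2, t-1
print(habit_direct(history, 0.5), habit_recursive(history, 0.5))  # → 4.25 4.25
```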


You have to include the habit's effect in the utility function itself:

$$
u(c,h)
$$

If $h_{T-1} \neq h_{T}$ then, writing $u^{c}$ for the marginal utility of consumption, in general
$$
u^{c}(100,h_{T-1}) \neq u^{c}(100,h_{T})
$$
