## Consumption problem example

Consider this version of the portfolio-choice consumption-saving problem.

Variable | Equation | Operation | Stage | Utility | Constraints
-- | -- | -- | -- | -- | --
$\eta$| ~Dist | Shock | c | -- | --
$\theta$ | ~Dist | Shock | c | -- | --
$\psi$ | ~Dist | Shock | c | -- | --
$\hat{R}$ | $$\hat{R} = \alpha \eta + (1 - \alpha) R$$ | Update | c | -- | --
$b$ | $$b_{t} = a_{t-1} \hat{R}$$ | Update | c | -- | --
$p$ | $$p_{t}=p_{t-1}\psi_{t}$$ | Update | c | -- | --
$y$ | $$y_{t} = p_{t}\theta_{t}$$ | Update | c | -- | --
$m$ | $$m_{t} = b_{t} + y_{t}$$ | Update | c | -- | --
$c$ | $$c$$ | Control | c | U(c) | $c \leq m$
$a$ | $$a_{t} = m_{t} - c_{t}$$ | Update | $\alpha$ | -- | --
$\alpha$| $$\alpha$$ | Control | $\alpha$ | 0 | $0 \leq \alpha \leq 1$

This can be decomposed into three "stages".

Here, $A$ is the action space, $X$ is the input state space, and $Y$ is the output state space. $\Gamma$ denotes the restrictions on actions. $F: X \times A \rightarrow \mathbb{R}$ is the stage reward. $T: X \times A \rightarrow Y$ is the transition function. $\beta$ is the discount factor.

The consumption stage:

* $c \in A_0 = \mathbb{R}$
* $m \in X_0 = \mathbb{R}$
* $a \in Y_0 = \mathbb{R}$
* $\Gamma_0$ ... restricts consumption $c \leq m$
* $F_0(m,c) = CRRA(c)$
* $T_0(m,c) = m - c$ 
* $\beta_0 = \beta$

The allocation stage. Note that its transition function is trivial:

* $\alpha \in A_1 = \mathbb{R}$
* $a \in X_1 = \mathbb{R}$
* $(a, \alpha) \in Y_1 = \mathbb{R}^2$
* $\Gamma_1$ ... restricts allocation $0 \leq \alpha \leq 1$
* $F_1(a,\alpha) = 0$
* $T_1(a,\alpha) = (a, \alpha)$
* $\beta_1 = 1$

The growth stage:

* $A_2 = \emptyset$
* $(a, \alpha) \in X_2 = \mathbb{R}^2$
* $F_2(a,\alpha) = 0$
* $T_2(a,\alpha) = \frac{(\alpha \eta + (1 - \alpha) R) a + \theta}{\psi G}$

When decomposed in this way, it is clear that the allocation stage can be removed if $\alpha$ is set exogenously as a parameter or shock.

**Why are the allocation and growth stages separated?** If shocks are realized at the _beginning_ of a stage, then agents know the outcomes of those shocks before they act in that stage. The allocation $\alpha$ must be chosen before the return shock $\eta$ is known, so the allocation decision needs to happen in a different stage from the one in which $\eta$ is realized.

The stages have mutually recursive Bellman equations, each giving a stage's beginning-of-stage value in terms of the next stage's value:

$$v_0(m) = \max_{c \leq m} F_0(m, c) + \beta v_1(m - c)$$

$$v_1(a) = \max_{0 \leq \alpha \leq 1} v_2(a, \alpha)$$

$$v_2(a, \alpha) = \mathbb{E}[v_0(T_2(a, \alpha))]$$
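
As a rough illustration, one backward pass through these three equations can be sketched in Python. Everything numeric here is an assumption made for the sketch: the parameter values, the two-point equiprobable (and independent) shock distributions, the grids for $c$ and $\alpha$, and the initial guess for $v_0$. The equations are implemented as written above, with no further normalization terms.

```python
import numpy as np

# Hypothetical parameter values, chosen only for illustration.
RHO, BETA, R, G = 2.0, 0.96, 1.02, 1.0
# Assumed two-point, equiprobable, independent stand-ins for the shocks.
ETA = np.array([0.9, 1.2])      # risky return
THETA = np.array([0.8, 1.2])    # transitory income shock
PSI = np.array([0.95, 1.05])    # permanent income shock
ALPHA_GRID = np.linspace(0.0, 1.0, 11)


def crra(c):
    """CRRA utility, used here as the stage reward F_0."""
    return c ** (1.0 - RHO) / (1.0 - RHO)


def T2(a, alpha, eta, theta, psi):
    """Growth-stage transition T_2, as written in the text."""
    return ((alpha * eta + (1.0 - alpha) * R) * a + theta) / (psi * G)


def v2(a, alpha, v0):
    """v_2(a, alpha) = E[v_0(T_2(a, alpha))] over all shock combinations."""
    eta, theta, psi = np.meshgrid(ETA, THETA, PSI, indexing="ij")
    return float(np.mean(v0(T2(a, alpha, eta, theta, psi))))


def v1(a, v0):
    """v_1(a) = max over alpha in [0, 1] of v_2(a, alpha), on a grid."""
    return max(v2(a, alpha, v0) for alpha in ALPHA_GRID)


def v0_update(m, v0, n_c=50):
    """v_0(m) = max over c <= m of F_0(m, c) + beta * v_1(m - c), on a grid."""
    c_grid = np.linspace(m / n_c, m, n_c)
    return max(crra(c) + BETA * v1(m - c, v0) for c in c_grid)


# One backward pass, starting from the guess v_0 = CRRA ("consume everything").
print(v0_update(1.0, crra))
```

Grid search is used for both maximizations only because it is the simplest thing that works; a real solver would likely use a continuous optimizer or first-order conditions.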

# Staging a problem: subproblems and composability

Consider the smallest problem to be solved by a consumer agent.

The agent:
 - begins in some input states $\vec{x} \in \vec{X}$
 - experiences some exogenous shocks $\vec{k} \in \vec{K}$
 - can choose some actions $\vec{a} \in \vec{A}$
 - experiences a reward $F: \vec{X} \times \vec{K} \times \vec{A} \rightarrow \mathbb{R}$
 - together, these determine some output states $\vec{y} \in \vec{Y}$ via...
 - a **deterministic** transition function $T: \vec{X} \times \vec{K} \times \vec{A} \rightarrow \vec{Y}$
   - _This is deterministic because shocks have been isolated to the beginning of the stage._
   - CDC thinks there needs to be an additional between-stage transition function.
 - The agent has a discount factor $\beta$ for future utility.
 
(The use of the _vector_ annotation here is to indicate that each set is potentially multidimensional, composed of more than one variable. Thus, if there are multiple shocks, $\vec{k} = \langle k_1, k_2, \ldots \rangle$.)
 
By _stage_, we mean a tuple, $g = (\vec{X}, \vec{K}, \vec{A}, \vec{Y}, F, T, \beta)$.
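
One possible way to hold such a tuple in code is a small dataclass. This is only a sketch: the class name, the field names, the convention of passing values as label-to-number dictionaries, and the numeric values of $\rho$ and $\beta$ are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

Values = Dict[str, float]  # maps a variable label to its value


@dataclass(frozen=True)
class Stage:
    """A stage g = (X, K, A, Y, F, T, beta). Field names are illustrative."""
    inputs: Tuple[str, ...]    # labels of the input states X
    shocks: Tuple[str, ...]    # labels of the shocks K
    actions: Tuple[str, ...]   # labels of the actions A
    outputs: Tuple[str, ...]   # labels of the output states Y
    reward: Callable[[Values, Values, Values], float]       # F(x, k, a)
    transition: Callable[[Values, Values, Values], Values]  # T(x, k, a)
    beta: float                # discount factor


RHO = 2.0  # hypothetical CRRA coefficient

# The consumption stage from the example: m -> (choose c) -> a = m - c.
consumption_stage = Stage(
    inputs=("m",),
    shocks=(),
    actions=("c",),
    outputs=("a",),
    reward=lambda x, k, a: a["c"] ** (1.0 - RHO) / (1.0 - RHO),
    transition=lambda x, k, a: {"a": x["m"] - a["c"]},
    beta=0.96,  # hypothetical value
)

print(consumption_stage.transition({"m": 2.0}, {}, {"c": 0.5}))  # {'a': 1.5}
```

Keeping the variable labels explicit makes it possible to check later whether the outputs of one stage line up with the inputs of the next.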

## Solving one stage

For any stage, consider two value functions.
 - $v_x$ is the value of its input states
 - $v_y$ is the value of its output states. Others might write this $\mathfrak{v}$.
 
The stage is solved with respect to a value function $v_y : \vec{Y} \rightarrow \mathbb{R}$ over the output states. The function $q: \vec{X} \times \vec{K} \times \vec{A} \rightarrow \mathbb{R}$ gives the value of a state, shock, and action combination.

$$q(\vec{x}, \vec{k}, \vec{a}) = F(\vec{x}, \vec{k}, \vec{a}) + \beta v_y(T(\vec{x}, \vec{k}, \vec{a}))$$

where $\beta$ is the agent's discount factor for that stage. Note that no expectation is taken in this operation because $T$ is deterministic.

The optimal policy $\pi^*: \vec{X} \times \vec{K} \rightarrow \vec{A}$ is:

$$\pi^*(\vec{x}, \vec{k}) = \mathrm{argmax}_{\vec{a} \in \vec{A}} q(\vec{x}, \vec{k}, \vec{a})$$

The optimal policy $\pi^*$ can then be used to derive the value function over the input states $v_x: \vec{X} \rightarrow \mathbb{R}$.

$$v_x(\vec{x}) = \mathbb{E}_{\vec{k} \in \vec{K}}[q(\vec{x}, \vec{k}, \pi^*(\vec{x}, \vec{k}))]$$

Note that this requires no optimization, but does require taking expectations over the probability distribution of the shocks.
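
The three steps above (compute $q$, maximize over actions, take expectations over shocks) can be sketched for scalar states, shocks, and actions on finite grids. The function name `solve_stage`, its signature, the `feasible` argument, and the toy stage used to exercise it are all assumptions made for this sketch.

```python
import numpy as np


def solve_stage(x_grid, shocks, probs, actions, F, T, v_y, beta=1.0,
                feasible=lambda x, k, a: True):
    """Return (policy, v_x) on x_grid for scalar states, shocks, and actions.

    policy[i, j] is the best action at x_grid[i] after shock shocks[j];
    v_x[i] is the expected value of q under that policy.
    """
    policy = np.empty((len(x_grid), len(shocks)))
    v_x = np.zeros(len(x_grid))
    for i, x in enumerate(x_grid):
        for j, (k, p) in enumerate(zip(shocks, probs)):
            # q(x, k, a) = F(x, k, a) + beta * v_y(T(x, k, a)); no expectation here.
            q = np.array([
                F(x, k, a) + beta * v_y(T(x, k, a)) if feasible(x, k, a) else -np.inf
                for a in actions
            ])
            best = int(np.argmax(q))      # pi*(x, k) by grid search
            policy[i, j] = actions[best]
            v_x[i] += p * q[best]         # expectation over shocks
    return policy, v_x


# Toy stage (all choices hypothetical): log reward, additive income shock k,
# output state y = x + k - a, and continuation value v_y = log.
policy, v_x = solve_stage(
    x_grid=np.array([1.0]),
    shocks=np.array([0.0, 1.0]),
    probs=np.array([0.5, 0.5]),
    actions=np.linspace(0.1, 3.0, 30),
    F=lambda x, k, a: np.log(a),
    T=lambda x, k, a: x + k - a,
    v_y=np.log,
    feasible=lambda x, k, a: a < x + k,
)
print(policy, v_x)
```

In this toy stage the analytic optimum is $a = (x + k)/2$, which the grid search recovers because those points lie on the action grid.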

### Degenerate stage forms

There are several ways in which a stage can be simple or degenerate. This is not bad; the simpler the stage, the easier it is to solve.

- If there is no explicit discount factor, it is assumed to be $1$.
- If there is no explicit reward function, it is assumed that $F(\cdot) = 0$
- If there are no shocks $\emptyset = \vec{K}$ then expectations need not be taken when solving the stage.
- If there are no actions $\emptyset = \vec{A}$ then there is no optimization for the stage.
- If there is no explicit transition function, it is assumed to be the identity, so that $\vec{y} = T(\vec{x}) = \vec{x}$.
- If the transition function is invertible, that's especially good (it may enable endogenous gridpoints?).


## Sequencing problems

Stages can be sequenced according to certain conditions and rules.

Consider two stages, $g_a$ and $g_b$. When is the sequence $(g_a, g_b)$ a valid sequence?

Let $Y_a$ and $X_b$ be the sets of _labeled dimensions_ or 'variables' in the output and input spaces of the two stages, respectively.

Simply, any intersection $Y_a \cap X_b$ is the basis for the sequence; the output of $g_a$ becomes the input of $g_b$ where possible.

If there is an element $z \in X_b$ and $z \notin Y_a$, then the sequence $(g_a, g_b)$ requires an additional parameterization $p_b$ that assigns a value to $z$; we write the parameterized sequence as $(g_a, (g_b, p_b))$. Likewise, any parameter in $g_b$ can in principle be set as a state variable when preceded by $g_a$.

**What if $Y_a$ contains all the information needed for $X_b$ but has no exact intersection with $X_b$?** You can still concatenate the two stages, but you need to either:
 - modify the transition equation of $g_a$ so that it outputs something that matches $g_b$ or...
 - build an 'adapter' stage $g_{ab}$ that maps $Y_a$ to $X_b$.

The adapter stage likely has no shocks and no actions, and so is a trivial step in the backwards induction solution process, especially if the transition function is invertible.
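
A minimal sketch of the sequencing check, working only on variable labels. The function names `link` and `adapter` and the value `0.5` are hypothetical; the variable names follow the consumption example.

```python
def link(outputs_a, inputs_b, params_b=None):
    """Check whether stage a can feed stage b; names are illustrative.

    Returns the sorted variables passed from a to b and the parameters
    used for the rest. Raises ValueError if an input of b is unassigned.
    """
    params_b = dict(params_b or {})
    passed = set(outputs_a) & set(inputs_b)            # Y_a intersect X_b
    missing = set(inputs_b) - passed - set(params_b)   # inputs with no source
    if missing:
        raise ValueError(f"unassigned inputs: {sorted(missing)}")
    return sorted(passed), params_b


def adapter(y):
    """Hypothetical adapter stage g_ab: maps outputs (b, y) to the input m."""
    return {"m": y["b"] + y["y"]}


# Allocation -> growth: every growth input is an allocation output.
print(link(("a", "alpha"), ("a", "alpha")))
# Consumption -> growth, skipping allocation: alpha is set as a parameter.
print(link(("a",), ("a", "alpha"), params_b={"alpha": 0.5}))
# An adapter with no shocks and no actions, using m = b + y.
print(adapter({"b": 1.0, "y": 0.5}))
```

The second call corresponds to treating $\alpha$ as an exogenous parameter rather than a control.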

## Solving for a sequence of stages

 - Start with the last stage $g_T$.
 - Use a trivial terminal value function $v^T_y$.
 - Solve $g_T$ for $v^T_x$

 - Repeat for $t = T-1, \ldots, 0$:
   - Let $v^t_y = v^{t+1}_x$
   - Solve for $v^t_x$
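
The loop above can be sketched as follows. The single-stage solver is passed in as a placeholder, and the toy stages used to exercise it (no shocks, no actions) are assumptions made purely so the example runs.

```python
def solve_sequence(stages, solve_stage, v_terminal):
    """Backward induction over a list of stages [g_0, ..., g_T].

    solve_stage(stage, v_y) must return v_x for that stage; it is a
    placeholder for whatever single-stage solver is used.
    """
    v_y = v_terminal
    values = []
    for stage in reversed(stages):
        v_x = solve_stage(stage, v_y)
        values.append(v_x)
        v_y = v_x          # the next stage solved uses this as its v_y
    return values[::-1]    # values[t] is v_x^t


def solve_degenerate(stage, v_y):
    """Toy solver for stages with no shocks and no actions."""
    F, T, beta = stage
    return lambda x: F(x) + beta * v_y(T(x))


# Three identical toy stages: F(x) = x, T(x) = x + 1, beta = 0.5.
toy = (lambda x: x, lambda x: x + 1.0, 0.5)
values = solve_sequence([toy] * 3, solve_degenerate, v_terminal=lambda y: 0.0)
print(values[0](0.0))  # 1.0
```

Because each solved $v_x$ is simply handed to the preceding stage as its $v_y$, the stages never need to know about one another beyond the variables they share.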

### Other algorithmic considerations

Assuming continuous spaces approximated on grids of values:

- $v_y$ is continuously interpolated over a grid $\tilde{Y}$
- Shocks are discretized over a grid $\tilde{K}$
- Endogenous gridpoints for $\tilde{X}$ are possible under some conditions
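
The first two points can be illustrated with NumPy, using `np.interp` for piecewise-linear interpolation. The grids, the linear stand-in for $v_y$, and the transition used in the example are hypothetical.

```python
import numpy as np

y_grid = np.linspace(0.0, 10.0, 21)      # grid for the output state
v_y_vals = 2.0 * y_grid + 1.0            # hypothetical values of v_y on the grid
k_grid = np.array([0.5, 1.5])            # discretized shock grid
k_probs = np.array([0.5, 0.5])           # probability of each shock point


def v_y(y):
    """Piecewise-linear interpolation of v_y over the grid."""
    return np.interp(y, y_grid, v_y_vals)


def expected_value(x, T):
    """E_k[v_y(T(x, k))] as a probability-weighted sum over the shock grid."""
    return float(np.dot(k_probs, v_y(T(x, k_grid))))


print(expected_value(1.0, lambda x, k: x + k))  # 5.0
```

Note that `np.interp` holds the end values constant outside the grid rather than extrapolating, so the grid should cover every output state the transition can reach.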