# Distributed Value Function Iteration

Presented by Spencer Lyon

**Outline**:

* Dynamic programming basics
* Theoretical extension
* Computational implementation

## Dynamic Programming Basics

> An optimal policy has the property that, whatever the initial state and decision [i.e. control] are, the remaining decisions must constitute an optimal policy with regard to the state resulting from the first decision. (Bellman, 1957)

### Contraction Mapping Theorem

Definition of contraction mapping: 

> Let $(S, \rho)$ be a metric space and $T: S \rightarrow S$ be a function mapping $S$ into itself. $T$ is a contraction mapping (with modulus $\beta$) if for some $\beta \in (0, 1)$, $\rho(Tx, Ty) \le \beta \rho(x, y), \; \forall x, y \in S$

Contraction mapping Theorem:

> If $(S, \rho)$ is a complete metric space and $T: S \rightarrow S$ is a contraction mapping with modulus $\beta$, then
> 1. $T$ has exactly one fixed point $v \in S$
> 2. for any $v_0 \in S, \; \rho(T^n v_0, v) \le \beta^n \rho(v_0, v)$, where $T^n$ denotes $T$ applied $n$ times.
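
A minimal numerical illustration of both parts of the theorem, assuming a hypothetical affine map $T(x) = \beta x + 1$ on $\mathbb{R}$ with the usual distance $|x - y|$ (the map, the modulus $0.5$, and the starting point are illustrative choices, not from the talk):

```python
def iterate(T, v0, n):
    """Return T^n v0, i.e. T applied n times starting from v0."""
    v = v0
    for _ in range(n):
        v = T(v)
    return v


beta = 0.5  # hypothetical modulus


def T(x):
    # |T(x) - T(y)| = beta * |x - y|, so T is a contraction with modulus beta
    return beta * x + 1.0


v_star = 1.0 / (1.0 - beta)  # unique fixed point: solves v = beta * v + 1
v0 = 10.0                    # arbitrary initial guess

for n in (1, 5, 10, 20):
    distance = abs(iterate(T, v0, n) - v_star)
    bound = beta ** n * abs(v0 - v_star)
    print(f"n={n:2d}  distance={distance:.3e}  bound={bound:.3e}")
```

For an affine map the distance equals the bound; for a general contraction the bound holds as an inequality.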


Corollary:

> If $S'$ is a closed subset of $S$, such that $TS' \subseteq S'$, then $v \in S'$. If in addition, $TS' \subseteq S'' \subset S'$ then $v \in S''$

### Blackwell's Theorem

Theorem:


> Let $X \subseteq \mathbb{R}^l$ and let $B(X)$ be the space of bounded functions $f:X \rightarrow \mathbb{R}$ with the sup norm. Let $T: B(X) \rightarrow B(X)$. $T$ is a contraction mapping with modulus $\beta$ if it satisfies the following:
> 1. **Monotonicity:** $f, g \in B(X)$ and $f(x) \le g(x) \; \forall x \in X$ implies that $Tf(x) \le T g(x)\; \forall x \in X$
> 2. **Discounting:** There exists some $\beta \in (0, 1)$ such that $[T(f + a)](x) \le (Tf)(x)  + \beta a \; \forall f \in B(X), a \ge 0, x \in X$
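
A sketch of the standard argument for why these two conditions suffice, writing $\|\cdot\|$ for the sup norm on $B(X)$. For any $f, g \in B(X)$ we have $f \le g + \|f - g\|$ pointwise, so monotonicity followed by discounting (with $a = \|f - g\| \ge 0$) gives

$$
Tf \le T(g + \|f - g\|) \le Tg + \beta \|f - g\|
$$

Exchanging the roles of $f$ and $g$ gives $Tg \le Tf + \beta \|f - g\|$. Combining the two inequalities,

$$
\|Tf - Tg\| \le \beta \|f - g\|
$$

which is the definition of a contraction with modulus $\beta$ under the sup-norm metric.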

### Example -- Neoclassical growth model

#### Household problem

* Time is discrete: $t = 0, 1, \dots$
* In each period $t$, consumers must split current wealth between
    * Consumption in current period: $c_t$
    * Investment in risky capital to be part of wealth tomorrow: $k_{t+1}$
* Capital saved in period $t$ ($k_{t+1}$) earns a rate of return $r_t$ and depreciates at a rate $\delta$
* Households order non-negative streams of consumption according to
$$
\sum_{t=0}^{\infty} \beta^t U(c_t), \quad \beta \in (0, 1), \quad \text{where}
$$
* $U(x) = \frac{x^{1 -\gamma}}{1 - \gamma}$

#### Firm Problem

* A representative firm uses capital as input and produces a consumption/savings good (all goods in units of consumption)
* Must pay a rental rate $r$ for each unit of capital
* Solves static problem to maximize revenue less cost:
$$\max_{k} F(k) - r k$$
* Solution pins down interest rate: the first-order condition gives $r = F'(k)$
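
As a quick check of the firm's first-order condition $F'(k) = r$, here is a minimal sketch assuming a Cobb-Douglas technology $F(k) = k^\alpha$. The functional form, $\alpha = 0.3$, and $r = 0.1$ are illustrative assumptions; the talk leaves $F$ general.

```python
ALPHA = 0.3  # hypothetical capital share


def output(k):
    """Assumed Cobb-Douglas technology F(k) = k^alpha."""
    return k ** ALPHA


def marginal_product(k):
    """F'(k) = alpha * k^(alpha - 1)."""
    return ALPHA * k ** (ALPHA - 1.0)


def profit(k, r):
    """Revenue less rental cost: F(k) - r * k."""
    return output(k) - r * k


def capital_demand(r):
    """Solve F'(k) = r for k under the Cobb-Douglas assumption."""
    return (ALPHA / r) ** (1.0 / (1.0 - ALPHA))


r = 0.1
k_star = capital_demand(r)
print(f"k* = {k_star:.4f}, F'(k*) = {marginal_product(k_star):.4f}")
```

Because $F$ is strictly concave here, the first-order condition identifies the unique profit-maximizing capital stock, so any nearby $k$ yields lower profit.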


Applying Bellman's principle of optimality we can write these preferences recursively:

$$V(k_t) = \max_{c_t, k_{t+1}} u(c_t)  + \beta V(k_{t+1}) $$

* Solution approach is value function iteration (fixed point)
* Works when the natural operator (mapping a candidate $V$ into an updated $V$) is a contraction mapping
* Question: How to prove that this is a contraction mapping?

#### Applying Blackwell's conditions

* Let $V \in B(\mathbb{R})$ and define a map $T: B(\mathbb{R}) \rightarrow B(\mathbb{R})$ by:
$$T V = \max_{c, k} u(c) + \beta V(k)$$
* Need to show monotonicity and discounting:
    * Monotonicity: Let $f, g \in B(\mathbb{R})$ such that $f(x) \le g(x) \; \forall x \in \mathbb{R}$. Then
    $$Tf = \max_{c,k} u(c) + \beta f(k) \le \max_{c, k} u(c) + \beta g(k) = T g \quad \blacksquare$$
    * Discounting: Let $f \in B(\mathbb{R})$ and $a \ge 0$. Then
    $$[T(f + a)](x) = \max_{c, k} u(c) + \beta [f(k) + a] = \max_{c, k} u(c) + \beta f(k) + \beta a = (Tf)(x) + \beta a \quad \blacksquare$$
* So, by Blackwell's theorem, $T$ is a contraction mapping
* Can now apply the standard discretize-and-iterate algorithm
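
The discretize-and-iterate step can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation from the talk: the resource constraint $c + k' = k^\alpha + (1 - \delta) k$, the Cobb-Douglas technology, all parameter values, and the grid bounds are hypothetical. The defaults (log utility, i.e. $\gamma = 1$, and full depreciation) are chosen because that case has the known closed-form policy $k' = \alpha \beta k^\alpha$, which gives something to compare against.

```python
import math


def crra(c, gamma):
    """CRRA utility; gamma = 1 is treated as the log-utility limit."""
    if gamma == 1.0:
        return math.log(c)
    return c ** (1.0 - gamma) / (1.0 - gamma)


def solve_growth_vfi(alpha=0.3, beta=0.9, delta=1.0, gamma=1.0,
                     n=100, tol=1e-6, max_iter=1000):
    # Evenly spaced capital grid from 0.2 to 2 times the steady state.
    k_ss = (alpha / (1.0 / beta - 1.0 + delta)) ** (1.0 / (1.0 - alpha))
    grid = [k_ss * (0.2 + 1.8 * i / (n - 1)) for i in range(n)]

    # u[i][j]: flow utility in state grid[i] when choosing k' = grid[j].
    # Infeasible choices (non-positive consumption) get -inf.
    u = []
    for k in grid:
        y = k ** alpha + (1.0 - delta) * k  # assumed resources
        u.append([crra(y - kp, gamma) if y > kp else -math.inf
                  for kp in grid])

    V = [0.0] * n
    errors = []  # sup-norm distance between successive iterates
    for _ in range(max_iter):
        V_new, policy = [], []
        for i in range(n):
            best_j = max(range(n), key=lambda j: u[i][j] + beta * V[j])
            V_new.append(u[i][best_j] + beta * V[best_j])
            policy.append(grid[best_j])
        errors.append(max(abs(a - b) for a, b in zip(V_new, V)))
        V = V_new
        if errors[-1] < tol:
            break
    return grid, V, policy, errors
```

Calling `solve_growth_vfi()` returns the grid, the converged value function, the policy on the grid, and the history of sup-norm errors. Successive errors shrink by at least the factor $\beta$, as the contraction property requires.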

## Extension: Distributed VFI

* Value function iteration (VFI) is a widely applicable solution method
* Hinges on established theory of contraction mapping
* Applicable to many dynamic macroeconomic models
* But, it is relatively inefficient -- converges at a rate bounded by the model parameter $\beta$
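
To see what this rate implies, note that the contraction mapping theorem gives $\rho(T^n v_0, v) \le \beta^n \rho(v_0, v)$, so guaranteeing a distance of at most $\varepsilon$ requires $n \ge \ln(\varepsilon / \rho(v_0, v)) / \ln \beta$. A small sketch of that calculation (the function name and the default initial distance of 1 are illustrative assumptions):

```python
import math


def iterations_needed(beta, tol, initial_distance=1.0):
    """Smallest n with beta**n * initial_distance <= tol (worst-case bound)."""
    if tol >= initial_distance:
        return 0
    return math.ceil(math.log(tol / initial_distance) / math.log(beta))


for beta in (0.90, 0.95, 0.99):
    print(beta, iterations_needed(beta, tol=1e-6))
```

The count grows quickly as $\beta$ approaches 1, which is the case for models calibrated at high frequency.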

### Alternatives to VFI

* Under certain conditions, more efficient algorithms exist:
    * Policy function iteration: iterate directly on function defining optimal control, instead of optimal value
    * Endogenously update grid points each iteration (removes the root-finding step from many problems)
    * Simulation-based methods (potentially more efficient depending on dimensionality and composition of state space)
    * Perturbation methods
* Problems with these methods: 
    * Usually more technical
    * Usually less widely applicable
 

### More efficient VFI

* Instead of a different algorithm, I propose a more efficient implementation
* Distribute the value function across multiple processes
* Sketch of basic algorithm:
    1. Discretize state space and choose a number of processes
    2. Scatter initial guess for value function equally across processes
    3. Have each process do one iteration of updating its section of the value function
    4. On the root process, gather the entire value function at the end of each iteration, check convergence, and update the user
    5. Re-scatter value function and proceed with next iteration
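
The steps above can be sketched with Python's standard-library process pool standing in for a message-passing scatter/gather layer. This is an illustration under assumptions, not the talk's implementation: the model (log utility, full depreciation, $F(k) = k^\alpha$), the parameter values, and every function name are hypothetical. Although each worker updates only its own slice of states, it receives the full current value function, because the continuation value $V(k')$ is needed at every candidate $k'$.

```python
import math
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor

ALPHA, BETA = 0.3, 0.9  # hypothetical technology and discount parameters


def update_chunk(task):
    """One Bellman update for the states in grid[lo:hi]."""
    lo, hi, grid, V = task
    updated = []
    for k in grid[lo:hi]:
        y = k ** ALPHA  # resources; assumes at least one feasible k' < y
        updated.append(max(math.log(y - kp) + BETA * v
                           for kp, v in zip(grid, V) if kp < y))
    return updated


def split(n, parts):
    """Bounds of `parts` nearly equal contiguous chunks of range(n)."""
    step, extra = divmod(n, parts)
    bounds, lo = [], 0
    for p in range(parts):
        hi = lo + step + (1 if p < extra else 0)
        bounds.append((lo, hi))
        lo = hi
    return bounds


def distributed_vfi(grid, n_workers=4, tol=1e-6, max_iter=1000,
                    executor_cls=ProcessPoolExecutor):
    V = [0.0] * len(grid)               # step 1-2: initial guess
    chunks = split(len(grid), n_workers)
    with executor_cls(max_workers=n_workers) as pool:
        for sweep in range(1, max_iter + 1):
            # "Scatter": one task per worker, carrying its slice bounds
            # and the full current value function.
            tasks = [(lo, hi, grid, V) for lo, hi in chunks]
            # Step 3 and "gather": map returns the pieces in task order.
            pieces = pool.map(update_chunk, tasks)
            V_new = [v for piece in pieces for v in piece]
            # Step 4: convergence check on the root (calling) process.
            err = max(abs(a - b) for a, b in zip(V_new, V))
            V = V_new
            if err < tol:
                break
    return V, sweep
```

A call such as `distributed_vfi([0.03 + 0.27 * i / 99 for i in range(100)])` runs the sweeps across four worker processes; on platforms that spawn rather than fork, place the call under an `if __name__ == "__main__":` guard. Passing `executor_cls=ThreadPoolExecutor` runs the same logic in threads, which is convenient for debugging. Since the whole value function is shipped to every worker on each sweep, communication cost grows with the grid size, so the speedup depends on the per-state update being expensive relative to that transfer.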