Commit 31079a2

Minor edits to code and markdown

1 parent e2dcab9

File tree

3 files changed: 61 additions, 39 deletions

lectures/ifp_discrete.md

Lines changed: 26 additions & 9 deletions
@@ -100,11 +100,14 @@ Here
 
 * $c_t$ is consumption and $c_t \geq 0$,
 * $a_t$ is assets and $a_t \geq 0$,
-* $R > 0$ is a gross rate of return, and
-* $(y_t)$ is labor income.
+* $R = 1 + r$ is a gross rate of return, and
+* $(y_t)_{t \geq 0}$ is labor income, taking values in some finite set $\mathsf Y$.
 
 We assume below that labor income dynamics follow a discretized AR(1) process.
 
+We set $\mathsf S := \mathbb{R}_+ \times \mathsf Y$, which represents the state
+space.
+
 The **value function** $V \colon \mathsf S \to \mathbb{R}$ is defined by
 
 ```{math}
@@ -116,6 +119,9 @@ V(a, y) := \max \, \mathbb{E}
 \right\}
 ```
 
+where the maximization is over all feasible consumption sequences given $(a_0,
+y_0) = (a, y)$.
+
 The Bellman equation is
 
 $$
@@ -157,15 +163,18 @@ class Model(NamedTuple):
     Q: jnp.ndarray          # Markov matrix for income
 
 
-def create_consumption_model(R=1.01,      # Gross interest rate
-                             β=0.98,      # Discount factor
-                             γ=2,         # CRRA parameter
-                             a_min=0.01,  # Min assets
-                             a_max=5.0,   # Max assets
-                             a_size=150,  # Grid size
-                             ρ=0.9, ν=0.1, y_size=100):  # Income parameters
+def create_consumption_model(
+        R=1.01,      # Gross interest rate
+        β=0.98,      # Discount factor
+        γ=2,         # CRRA parameter
+        a_min=0.01,  # Min assets
+        a_max=5.0,   # Max assets
+        a_size=150,  # Grid size
+        ρ=0.9, ν=0.1, y_size=100  # Income parameters
+    ):
     """
     Creates an instance of the consumption-savings model.
+
     """
     a_grid = jnp.linspace(a_min, a_max, a_size)
     mc = qe.tauchen(n=y_size, rho=ρ, sigma=ν)
@@ -175,6 +184,10 @@ def create_consumption_model(R=1.01, # Gross interest rate
 
 Now we define the right hand side of the Bellman equation.
 
+We'll use a vectorized coding style reminiscent of Matlab and NumPy (avoiding all loops).
+
+You are invited to explore an alternative style based around `jax.vmap` in the Exercises.
+
 ```{code-cell} ipython3
 @jax.jit
 def B(v, model):
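As an aside for readers skimming the diff, the vectorized, loop-free style referred to in the added lines can be sketched in plain NumPy. Everything below (grids, Markov matrix, zero value-function guess) is a tiny illustration, not the lecture's JAX code:

```python
import numpy as np

# Sketch of the vectorized Bellman right-hand side:
# W[i, j, k] = u(c) + β * E[v(a'_k, y') | y_j], where i indexes current
# assets, j current income, k next-period assets.
β, R, γ = 0.98, 1.01, 2
a_grid = np.linspace(0.01, 5.0, 6)
y_grid = np.array([0.9, 1.1])
Q = np.array([[0.8, 0.2],
              [0.3, 0.7]])                  # Markov matrix for income
v = np.zeros((len(a_grid), len(y_grid)))    # current value function guess

# Consumption implied by each (a, y, a') triple, via broadcasting
c = R * a_grid[:, None, None] + y_grid[None, :, None] - a_grid[None, None, :]

# Expected continuation value EV[j, k] = Σ_{j'} Q[j, j'] v[k, j']
EV = (Q @ v.T)[None, :, :]                  # shape (1, y_size, a_size)

safe_c = np.where(c > 0, c, 1.0)            # guard against 0**negative
W = np.where(c > 0, safe_c**(1 - γ) / (1 - γ) + β * EV, -np.inf)

σ = W.argmax(axis=2)                        # greedy next-asset index per (a, y)
```

With `v` identically zero, utility is increasing in consumption, so the greedy choice is the lowest next-period asset level in every state.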
@@ -233,6 +246,7 @@ def get_greedy(v, model):
     return jnp.argmax(B(v, model), axis=2)
 ```
 
+
 ### Value function iteration
 
 Now we define a solver that implements VFI.
@@ -260,6 +274,7 @@ def value_function_iteration_python(model, tol=1e-5, max_iter=10_000):
 Next we write a version that uses `jax.lax.while_loop`.
 
 ```{code-cell} ipython3
+@jax.jit
 def value_function_iteration(model, tol=1e-5, max_iter=10_000):
     """
     Implements VFI using successive approximation.
@@ -341,11 +356,13 @@ print(f"Relative speed = {python_time / jax_without_compile:.2f}")
 In this exercise, we explore an alternative approach to implementing value function iteration using `jax.vmap`.
 
 For this simple optimal savings problem, direct vectorization is relatively easy.
+
 In particular, it's straightforward to express the right hand side of the
 Bellman equation as an array that stores evaluations of the function at every
 state and control.
 
 However, for more complex models, direct vectorization can be much harder.
+
 For this reason, it helps to have another approach to fast JAX implementations
 up our sleeves.
 

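The successive-approximation loop behind `value_function_iteration` can be sketched in plain NumPy on a toy problem with a single income state and log utility. All names and numbers here are illustrative stand-ins, not the lecture's JAX implementation:

```python
import numpy as np

# Toy VFI: iterate v_{k+1} = T v_k until the sup-norm change is below tol.
R, β = 1.01, 0.98
a_grid = np.array([0.5, 1.0, 1.5])
n = len(a_grid)

def B(v):
    # B[i, j] = reward of choosing a_grid[j] tomorrow given a_grid[i] today
    c = R * a_grid[:, None] - a_grid[None, :]
    safe_c = np.where(c > 0, c, 1.0)
    return np.where(c > 0, np.log(safe_c) + β * v[None, :], -np.inf)

def T(v):
    return B(v).max(axis=1)                 # Bellman operator

def value_function_iteration(tol=1e-5, max_iter=10_000):
    v = np.zeros(n)
    for _ in range(max_iter):
        v_new = T(v)
        if np.max(np.abs(v_new - v)) < tol:
            break
        v = v_new
    return v_new

v_star = value_function_iteration()
σ_star = B(v_star).argmax(axis=1)           # greedy policy indices
```

Since $T$ is a contraction of modulus $\beta$, the returned `v_star` is an approximate fixed point of $T$.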
lectures/ifp_opi.md

Lines changed: 34 additions & 29 deletions
@@ -16,17 +16,19 @@ kernelspec:
 
 ## Overview
 
-In {doc}`ifp_discrete` we studied the income fluctuation problem and solved it using value function iteration (VFI).
+In {doc}`ifp_discrete` we studied the income fluctuation problem and solved it
+using value function iteration (VFI).
 
-In this lecture we'll solve the same problem using **optimistic policy iteration** (OPI), which is a faster alternative to VFI.
+In this lecture we'll solve the same problem using **optimistic policy
+iteration** (OPI), which is very general, typically faster than VFI and only
+slightly more complex.
 
 OPI combines elements of both value function iteration and policy iteration.
 
-The algorithm can be found in [this book](https://dp.quantecon.org), where a PDF is freely available.
+A detailed discussion of the algorithm can be found in [DP1](https://dp.quantecon.org).
 
-We will show that OPI provides significant speed improvements over standard VFI for the income fluctuation problem.
-
-For details on the income fluctuation problem, see {doc}`ifp_discrete`.
+Here our aim is to implement OPI and test whether or not it yields significant
+speed improvements over standard VFI for the income fluctuation problem.
 
 In addition to Anaconda, this lecture will need the following libraries:
 
@@ -48,11 +50,6 @@ from time import time
 ```
 
 
-We'll use 64 bit floats to gain extra precision.
-
-```{code-cell} ipython3
-jax.config.update("jax_enable_x64", True)
-```
 
 ## Model and Primitives
 
@@ -86,15 +83,18 @@ class Model(NamedTuple):
     Q: jnp.ndarray          # Markov matrix for income
 
 
-def create_consumption_model(R=1.01,      # Gross interest rate
-                             β=0.98,      # Discount factor
-                             γ=2,         # CRRA parameter
-                             a_min=0.01,  # Min assets
-                             a_max=5.0,   # Max assets
-                             a_size=150,  # Grid size
-                             ρ=0.9, ν=0.1, y_size=100):  # Income parameters
+def create_consumption_model(
+        R=1.01,      # Gross interest rate
+        β=0.98,      # Discount factor
+        γ=2,         # CRRA parameter
+        a_min=0.01,  # Min assets
+        a_max=5.0,   # Max assets
+        a_size=150,  # Grid size
+        ρ=0.9, ν=0.1, y_size=100  # Income parameters
+    ):
     """
     Creates an instance of the consumption-savings model.
+
     """
     a_grid = jnp.linspace(a_min, a_max, a_size)
     mc = qe.tauchen(n=y_size, rho=ρ, sigma=ν)
@@ -104,9 +104,9 @@ def create_consumption_model(R=1.01, # Gross interest rate
 
 ## Operators and Policies
 
-We need to define several operators for implementing OPI.
+We repeat some functions from {doc}`ifp_discrete`.
 
-First, the right hand side of the Bellman equation:
+Here is the right hand side of the Bellman equation:
 
 ```{code-cell} ipython3
 @jax.jit
@@ -139,7 +139,7 @@ def B(v, model):
     return jnp.where(c > 0, c**(1-γ)/(1-γ) + β * EV, -jnp.inf)
 ```
 
-The Bellman operator:
+Here's the Bellman operator:
 
 ```{code-cell} ipython3
 @jax.jit
@@ -148,7 +148,7 @@ def T(v, model):
     return jnp.max(B(v, model), axis=2)
 ```
 
-The greedy policy:
+Here's the function that computes a $v$-greedy policy:
 
 ```{code-cell} ipython3
 @jax.jit
@@ -157,7 +157,8 @@ def get_greedy(v, model):
     return jnp.argmax(B(v, model), axis=2)
 ```
 
-Now we define the policy operator $T_\sigma$, which is the Bellman operator with policy $\sigma$ fixed.
+Now we define the policy operator $T_\sigma$, which is the Bellman operator with
+policy $\sigma$ fixed.
 
 For a given policy $\sigma$, the policy operator is defined by
 
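To make the OPI step concrete, here is a NumPy sketch: compute a greedy policy, apply the policy operator $T_\sigma$ $m$ times, then update the policy again. The toy single-income-state problem and log utility are illustrative stand-ins for the lecture's JAX code:

```python
import numpy as np

R, β = 1.01, 0.98
a_grid = np.array([0.5, 1.0, 1.5])
n = len(a_grid)

def B(v):
    c = R * a_grid[:, None] - a_grid[None, :]
    safe_c = np.where(c > 0, c, 1.0)
    return np.where(c > 0, np.log(safe_c) + β * v[None, :], -np.inf)

def get_greedy(v):
    return B(v).argmax(axis=1)

def T_σ(v, σ):
    # Policy operator: the Bellman RHS evaluated at σ's action in each state
    return B(v)[np.arange(n), σ]

def opi(m=10, tol=1e-5, max_iter=10_000):
    v = np.zeros(n)
    for _ in range(max_iter):
        σ = get_greedy(v)
        v_new = v
        for _ in range(m):                  # m rounds of policy evaluation
            v_new = T_σ(v_new, σ)
        if np.max(np.abs(v_new - v)) < tol:
            break
        v = v_new
    return v_new, get_greedy(v_new)

v_opi, σ_opi = opi()
```

Setting `m=1` recovers plain VFI, while larger `m` trades extra policy-evaluation steps for fewer (relatively expensive) greedy updates.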
@@ -381,22 +382,26 @@ ax.set_title('OPI execution time vs step size m')
 plt.show()
 ```
 
-The results show interesting behavior across different values of m:
+Here's a summary of the results:
 
-* When m=1, OPI is actually slower than VFI, even though they should be mathematically equivalent. This is because the OPI implementation has overhead from computing the greedy policy and calling the policy operator, making it less efficient than the direct VFI approach for m=1.
+* When $m=1$, OPI is slightly slower than VFI, even though they should be mathematically equivalent, due to small inefficiencies associated with extra function calls.
 
-* The optimal performance occurs around m=25-50, where OPI achieves roughly 3x speedup over VFI.
+* OPI outperforms VFI for a very large range of $m$ values.
 
-* For very large m (200, 400), performance degrades as we spend too much time iterating the policy operator before updating the policy.
+* For very large $m$, OPI performance begins to degrade as we spend too much
+time iterating the policy operator.
 
-This demonstrates that there's a "sweet spot" for the OPI step size m that balances between policy updates and value function iterations.
 
 ## Exercises
 
 ```{exercise}
 :label: ifp_opi_ex1
 
-Experiment with different parameter values for the income process ($\rho$ and $\nu$) and see how they affect the relative performance of VFI vs OPI.
+The speed gains achieved by OPI are quite robust to parameter changes.
+
+Confirm this by experimenting with different parameter values for the income process ($\rho$ and $\nu$).
+
+Measure how they affect the relative performance of VFI vs OPI.
 
 Try:
 * $\rho \in \{0.8, 0.9, 0.95\}$

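Since the exercise varies $\rho$ and $\nu$, it may help to see what the income discretization does. Here is a minimal NumPy sketch of Tauchen-style AR(1) discretization, the idea behind `qe.tauchen` (a simplified illustration, not the quantecon implementation):

```python
import numpy as np
from math import erf, sqrt

def norm_cdf(x):
    # Standard normal CDF via the error function
    return 0.5 * (1 + erf(x / sqrt(2)))

def tauchen(n, ρ, ν, m=3):
    # Grid spans ±m unconditional standard deviations of the AR(1) process
    σ_y = ν / np.sqrt(1 - ρ**2)
    y = np.linspace(-m * σ_y, m * σ_y, n)
    step = y[1] - y[0]
    P = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            lo = (y[j] - ρ * y[i] - step / 2) / ν
            hi = (y[j] - ρ * y[i] + step / 2) / ν
            if j == 0:
                P[i, j] = norm_cdf(hi)          # mass below first midpoint
            elif j == n - 1:
                P[i, j] = 1 - norm_cdf(lo)      # mass above last midpoint
            else:
                P[i, j] = norm_cdf(hi) - norm_cdf(lo)
    return y, P

y, P = tauchen(5, 0.9, 0.1)
```

Higher $\rho$ widens the grid (larger unconditional variance) and concentrates each row of $P$ near its diagonal.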
lectures/os.md

Lines changed: 1 addition & 1 deletion
@@ -264,7 +264,7 @@ Now that we have the value function, it is straightforward to calculate the opti
 We should choose consumption to maximize the right hand side of the Bellman equation {eq}`bellman-cep`.
 
 $$
-c^* = \argmax_{0 \leq c \leq x} \{u(c) + \beta v(x - c)\}
+c^* = \arg \max_{0 \leq c \leq x} \{u(c) + \beta v(x - c)\}
 $$
 
 We can think of this optimal choice as a *function* of the state $x$, in which case we call it the **optimal policy**.
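On a grid, the argmax in the corrected equation can be computed by direct search. A short NumPy sketch, where the value function below is a hypothetical stand-in rather than a solution of the lecture's model:

```python
import numpy as np

# Grid-search computation of c* = argmax_{0 <= c <= x} { u(c) + β v(x - c) }
# with u = log and an illustrative stand-in for the value function v.
β = 0.96
x_grid = np.linspace(1e-3, 2.0, 200)
v_vals = 2 * np.log(x_grid + 0.1)          # stand-in value function on a grid

def v(x):
    return np.interp(x, x_grid, v_vals)    # linear interpolation off-grid

def greedy_c(x, num_c=500):
    c_grid = np.linspace(1e-6, x, num_c)   # candidate consumption levels
    vals = np.log(c_grid) + β * v(x - c_grid)
    return c_grid[np.argmax(vals)]

c_star = greedy_c(1.0)
```

Evaluating `greedy_c` at each grid point of $x$ traces out the optimal policy function.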
