Here's the function that computes a $v$-greedy policy:
```{code-cell} ipython3
@jax.jit
def get_greedy(v, model):
    return jnp.argmax(B(v, model), axis=2)
```
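To make the shape conventions concrete, here is a toy sketch; the hard-coded array below is an assumption standing in for the output of `B(v, model)`, with actions on the last axis:

```python
import jax.numpy as jnp

# Stand-in values (an assumption, not the lecture's model): one entry for
# each (wealth, income, action) triple, so actions live on axis 2.
vals = jnp.array([[[1.0, 3.0], [2.0, 0.0]],
                  [[0.5, 0.5], [4.0, 1.0]]])   # shape (2, 2, 2)

# A v-greedy policy picks, at each state pair, the action with the
# highest value.
sigma = jnp.argmax(vals, axis=2)
print(sigma.tolist())  # [[1, 0], [0, 0]]
```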
Now we define the policy operator $T_\sigma$, which is the Bellman operator with
policy $\sigma$ fixed.
162
162
163
For a given policy $\sigma$, the policy operator is defined by
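As a minimal sketch of how $T_\sigma$ can be implemented in this setting: the placeholder `B` below is an assumption that stands in for the lecture's `B`, which (as in `get_greedy` above) returns values with actions on axis 2. The policy operator then selects along, rather than maximizes over, the action axis:

```python
import jax
import jax.numpy as jnp

def B(v, model):
    # Placeholder (assumption): stands in for the lecture's B, which returns
    # one value for each (wealth, income, action) triple. Here "model" is
    # simply a precomputed (n_w, n_y, n_a) array of such values.
    return model

@jax.jit
def T_sigma(v, sigma, model):
    # Apply the Bellman operator with the action fixed by sigma: pick out
    # the entry that sigma dictates on the action axis (axis 2) instead
    # of maximizing over it.
    return jnp.take_along_axis(B(v, model),
                               sigma[..., None], axis=2).squeeze(axis=2)
```

With this convention, `T_sigma(v, get_greedy(v, model), model)` coincides with applying the Bellman operator itself, since the greedy policy attains the maximum over actions.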
```{code-cell} ipython3
ax.set_title('OPI execution time vs step size m')
plt.show()
```
Here's a summary of the results:
* When $m=1$, OPI is slightly slower than VFI, even though they should be mathematically equivalent, due to small inefficiencies associated with extra function calls.
* OPI outperforms VFI for a very large range of $m$ values.
* For very large $m$, OPI performance begins to degrade as we spend too much
  time iterating the policy operator.
This demonstrates that there's a "sweet spot" for the OPI step size $m$ that balances policy updates against value function iterations.
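The trade-off can be seen in miniature in a self-contained sketch. The two-state, two-action MDP below is an assumption for illustration only (not the lecture's income fluctuation model), and `opi` follows the same pattern as the lecture's algorithm: one greedy policy update followed by $m$ applications of the policy operator:

```python
import jax.numpy as jnp

# Toy two-state, two-action MDP (an assumption for illustration only,
# not the lecture's income fluctuation model).
R = jnp.array([[1.0, 0.0],                  # R[s, a]: rewards
               [0.0, 2.0]])
P = jnp.array([[[1.0, 0.0], [0.0, 1.0]],    # P[s, a, s']: transitions
               [[1.0, 0.0], [0.0, 1.0]]])
beta = 0.9                                  # discount factor

def B(v):
    # Bellman right-hand side for every (state, action) pair
    return R + beta * (P @ v)

def get_greedy(v):
    return jnp.argmax(B(v), axis=1)

def T_sigma(v, sigma):
    # Policy operator: evaluate B(v) at the actions chosen by sigma
    return jnp.take_along_axis(B(v), sigma[:, None], axis=1).squeeze(axis=1)

def opi(v, m, tol=1e-8, max_iter=10_000):
    # Optimistic policy iteration: one greedy update, then m applications
    # of the policy operator, repeated until the value change is small.
    for _ in range(max_iter):
        sigma = get_greedy(v)
        v_new = v
        for _ in range(m):
            v_new = T_sigma(v_new, sigma)
        if jnp.max(jnp.abs(v_new - v)) < tol:
            return v_new, sigma
        v = v_new
    return v, get_greedy(v)

v_star, sigma_star = opi(jnp.zeros(2), m=10)
```

Setting `m=1` recovers VFI, while larger `m` spends more work evaluating each fixed policy; in this toy problem the iterates converge to $v^* = (18, 20)$ with the second action optimal in both states.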
## Exercises
```{exercise}
:label: ifp_opi_ex1
The speed gains achieved by OPI are quite robust to parameter changes.

Confirm this by experimenting with different parameter values for the income process ($\rho$ and $\nu$).

Measure how they affect the relative performance of VFI vs OPI.
```