@@ -98,12 +98,12 @@ sum of earnings
9898``` {math}
9999:label: obj_model
100100
101- {\mathbb E} \sum_{t=0}^\infty \beta^t u( y_t)
101+ {\mathbb E} \sum_{t=0}^\infty \beta^t y_t
102102```
103103
104104The constant $\beta$ lies in $(0, 1)$ and is called a **discount factor**.
105105
106- The smaller is $\beta$, the more the agent discounts future utility relative to current utility .
106+ The smaller is $\beta$, the more the agent discounts future earnings relative to current earnings.
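For example, with $\beta = 0.95$, a unit of earnings received 10 periods from now contributes only $0.95^{10} \approx 0.60$ to the sum in {eq}`obj_model`.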
107107
108108The variable $y_t$ is income, equal to
109109
@@ -118,7 +118,7 @@ The worker faces a trade-off:
118118* Waiting too long for a good offer is costly, since the future is discounted.
119119* Accepting too early is costly, since better offers might arrive in the future.
120120
121- To decide optimally in the face of this trade-off, we use [ dynamic programming] ( https://dp.quantecon.org/ ) .
121+ To decide the optimal wait time in the face of this trade-off, we use [ dynamic programming] ( https://dp.quantecon.org/ ) .
122122
123123Dynamic programming can be thought of as a two-step procedure that
124124
@@ -137,25 +137,26 @@ In order to optimally trade-off current and future rewards, we need to think abo
137137To weigh these two aspects of the decision problem, we need to assign *values*
138138to states.
139139
140- To this end, let $v^* (w)$ be the total lifetime * value* accruing to an
140+ To this end, let $v^* (w)$ be the total lifetime value accruing to an
141141unemployed worker who enters the current period unemployed when the wage is
142142$w \in \mathbb{W}$.
143143
144144(In particular, the agent has wage offer $w$ in hand and can accept or reject it.)
145145
146- More precisely, $v^* (w)$ denotes the value of the objective function
147- {eq}` obj_model ` when an agent in this situation makes * optimal* decisions now
148- and at all future points in time.
146+ More precisely, $v^* (w)$ denotes the total expected discounted sum of earnings
147+ when an agent in this situation always behaves optimally, now and at all future points in time.
149148
150149Of course $v^* (w)$ is not trivial to calculate because we don't yet know
151150what decisions are optimal and what aren't!
152151
153- But think of $v^* $ as a function that assigns to each possible wage
154- $s$ the maximal lifetime value that can be obtained with that offer in
155- hand.
152+ If we don't know which choices are optimal, it might seem impossible to calculate
153+ $v^* (w)$.
156154
157- A crucial observation is that this function $v^* $ must satisfy the
158- recursion
155+ But let's put this aside for now and think of $v^* $ as a function that assigns
156+ to each possible wage $w$ the maximal lifetime value $v^* (w)$ that can be
157+ obtained with that offer in hand.
158+
159+ A crucial observation is that this function $v^* $ must satisfy
159160
160161``` {math}
161162:label: odu_pv
@@ -175,6 +176,7 @@ ubiquitous in economic dynamics and other fields involving planning over time.
175176The intuition behind it is as follows:
176177
177178* the first term inside the max operation is the lifetime payoff from accepting the current offer, since
179+ such a worker works forever at $w$ and values this income stream as
178180
179181$$
180182 \frac{w}{1 - \beta} = w + \beta w + \beta^2 w + \cdots
@@ -189,16 +191,24 @@ lifetime value from today, given current offer $w$.
189191
190192But this is precisely $v^* (w)$, which is the left-hand side of {eq}` odu_pv ` .
191193
194+ Putting this all together, we see that {eq}` odu_pv ` is valid for all $w$.
195+
192196
193197### The Optimal Policy
194198
195- Suppose for now that we are able to solve {eq}` odu_pv ` for the unknown function $v^* $.
199+ We still don't know how to compute $v^* $ (although {eq}` odu_pv ` gives us hints
200+ we'll return to below).
196201
197- Once we have this function in hand we can behave optimally (i.e., make the
198- right choice between accept and reject).
202+ But suppose for now that we do know $v^* $.
203+
204+ Once we have this function in hand, we can easily make optimal choices (i.e., make the
205+ right choice between accept and reject given any $w$).
199206
200207All we have to do is select the maximal choice on the right-hand side of {eq}` odu_pv ` .
201208
209+ In other words, we make the best choice between stopping and continuing, given
210+ the information provided to us by $v^* $.
211+
202212The optimal action is best thought of as a **policy**, which is, in general, a map from
203213states to actions.
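
For example, once $v^*$ is known, one natural way to express such a policy is as an indicator function (a sketch with notation of our own choosing for $\sigma$; the other symbols are as in {eq}`odu_pv`):

```{math}
\sigma(w)
:= \mathbf{1}
\left\{
    \frac{w}{1 - \beta} \geq c + \beta \sum_{w' \in \mathbb{W}} v^*(w') q(w')
\right\}
```

Here $\sigma(w) = 1$ means "accept the offer $w$" and $\sigma(w) = 0$ means "reject it and keep searching."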
204214
@@ -262,7 +272,7 @@ In view of {eq}`odu_pv`, this vector satisfies the nonlinear system of equations
262272
263273v^*(i)
264274= \max \left\{
265- \frac{w(i)}{1 - \beta}, \, c + \beta \sum_{1 \leq j \leq n}
275+ \frac{w(i)}{1 - \beta}, \, c + \beta \sum_{j=1}^n
266276 v^*(j) q (j)
267277 \right\}
268278\quad
@@ -284,7 +294,7 @@ Step 2: compute a new vector $v' \in \mathbb R^n$ via
284294
285295v'(i)
286296= \max \left\{
287- \frac{w(i)}{1 - \beta}, \, c + \beta \sum_{1 \leq j \leq n}
297+ \frac{w(i)}{1 - \beta}, \, c + \beta \sum_{j=1}^n
288298 v(j) q (j)
289299 \right\}
290300\quad
@@ -312,7 +322,7 @@ First, one defines a mapping $T$ from $\mathbb R^n$ to itself via
312322
313323(Tv)(i)
314324= \max \left\{
315- \frac{w(i)}{1 - \beta}, \, c + \beta \sum_{1 \leq j \leq n}
325+ \frac{w(i)}{1 - \beta}, \, c + \beta \sum_{j=1}^n
316326 v(j) q (j)
317327 \right\}
318328\quad
@@ -341,8 +351,7 @@ $\{ T^k v \}$ converges to the fixed point $v^*$ regardless of $v$.
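Before turning to the implementation, here is a generic sketch of this successive-approximation idea (our own illustration, not part of the lecture code): starting from any $v$, apply $T$ repeatedly until successive iterates are close.

```{code-cell} ipython3
import jax.numpy as jnp

def successive_approx(T, v, tol=1e-6, max_iter=1_000):
    """Iterate v <- T(v) until the sup-norm change falls below tol."""
    for _ in range(max_iter):
        v_new = T(v)
        if jnp.max(jnp.abs(v_new - v)) < tol:
            break
        v = v_new
    return v_new

# Quick check on a simple contraction with fixed point (2, 2, 2)
successive_approx(lambda x: 0.5 * x + 1, jnp.zeros(3))
```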
341351
342352### Implementation
343353
344- Our default for $q$, the distribution of the state process, will be
345- [ Beta-binomial] ( https://en.wikipedia.org/wiki/Beta-binomial_distribution ) .
354+ Our default for $q$, the wage offer distribution, will be [ Beta-binomial] ( https://en.wikipedia.org/wiki/Beta-binomial_distribution ) .
346355
347356``` {code-cell} ipython3
348357n, a, b = 50, 200, 100 # default parameters
@@ -381,7 +390,23 @@ class McCallModel(NamedTuple):
381390 q: jnp.ndarray = q_default # array of probabilities
382391```
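
Since every field has a default value, `McCallModel()` gives the baseline parameterization, and individual parameters can be overridden by keyword (a small usage sketch; the instance names are ours):

```{code-cell} ipython3
# Baseline model and a less patient variant (illustrative usage only)
model = McCallModel()
impatient_model = McCallModel(β=0.96)   # lower discount factor
```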
383392
384- We implement the Bellman operator $T$ from {eq}` odu_pv3 ` as follows
393+ We implement the Bellman operator $T$ from {eq}` odu_pv3 ` , which we can write in
394+ terms of array operations as
395+
396+ ``` {math}
397+ :label: odu_pv4
398+
399+ Tv
400+ = \max \left\{
401+ \frac{w}{1 - \beta}, \, c + \beta \sum_{j=1}^n v(j) q (j)
402+ \right\}
404+ ```
405+
406+ (The first term inside the max is an array and the second is a scalar;
407+ the comparison against this scalar is applied element by element across the array.)
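
As a quick standalone check of this element-by-element behavior (not part of the model code), `jnp.maximum` broadcasts a scalar against an array:

```{code-cell} ipython3
import jax.numpy as jnp

# The scalar 2.0 is compared against each element of the array
jnp.maximum(jnp.array([1.0, 5.0, 3.0]), 2.0)   # Array([2., 5., 3.], dtype=float32)
```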
408+
409+ We can code $T$ up as follows.
385410
386411``` {code-cell} ipython3
387412def T(model: McCallModel, v: jnp.ndarray):
@@ -416,7 +441,7 @@ plt.show()
416441You can see that convergence is occurring: successive iterates are getting closer together.
417442
418443Here's a more serious iteration effort to compute the limit, which continues
419- until measured deviation between successive iterates is below tol.
444+ until measured deviation between successive iterates is below ` tol ` .
420445
421446Once we obtain a good approximation to the limit, we will use it to calculate
422447the reservation wage.
@@ -459,8 +484,56 @@ print(res_wage)
459484Now that we know how to compute the reservation wage, let's see how it varies with
460485parameters.
461486
462- In particular, let's look at what happens when we change $\beta$ and
463- $c$.
487+ Here we compare the reservation wage at two values of $\beta$.
488+
489+ The reservation wages will be plotted alongside the wage offer distribution, so
490+ that we can get a sense of what fraction of offers will be accepted.
491+
492+ ``` {code-cell} ipython3
493+ fig, ax = plt.subplots()
494+
495+ # Get the default color cycle
496+ prop_cycle = plt.rcParams['axes.prop_cycle']
497+ colors = prop_cycle.by_key()['color']
498+
499+ # Plot the wage offer distribution
500+ ax.plot(w, q, '-', alpha=0.6, lw=2,
501+ label='wage offer distribution',
502+ color=colors[0])
503+
504+ # Compute reservation wage with default beta
505+ model_default = McCallModel()
506+ v_init = model_default.w / (1 - model_default.β)
507+ v_default, res_wage_default = compute_reservation_wage(
508+ model_default, v_init
509+ )
510+
511+ # Compute reservation wage with lower beta
512+ β_new = 0.96
513+ model_low_beta = McCallModel(β=β_new)
514+ v_init_low = model_low_beta.w / (1 - model_low_beta.β)
515+ v_low, res_wage_low = compute_reservation_wage(
516+ model_low_beta, v_init_low
517+ )
518+
519+ # Plot vertical lines for reservation wages
520+ ax.axvline(x=res_wage_default, color=colors[1], lw=2,
521+ label=f'reservation wage (β={model_default.β})')
522+ ax.axvline(x=res_wage_low, color=colors[2], lw=2,
523+ label=f'reservation wage (β={β_new})')
524+
525+ ax.set_xlabel('wage', fontsize=12)
526+ ax.set_ylabel('probability', fontsize=12)
527+ ax.tick_params(axis='both', which='major', labelsize=11)
528+ ax.legend(loc='upper left', frameon=False, fontsize=11)
529+ plt.show()
530+ ```
531+
532+ We see that the reservation wage is higher when $\beta$ is higher.
533+
534+ This is not surprising, since higher $\beta$ is associated with more patience.
535+
536+ Now let's look more systematically at what happens when we change $\beta$ and $c$.
464537
465538As a first step, given that we'll use it many times, let's create a more
465538efficient, jit-compiled version of the function that computes the reservation
@@ -596,7 +669,6 @@ The big difference here, however, is that we're iterating on a scalar $h$, rathe
596669Here's an implementation:
597670
598671``` {code-cell} ipython3
599- @jax.jit
600672def compute_reservation_wage_two(
601673 model: McCallModel, # instance containing default parameters
602674 tol: float=1e-5, # error tolerance
@@ -711,7 +783,6 @@ And here's a solution using JAX.
711783``` {code-cell} ipython3
712784cdf = jnp.cumsum(q_default)
713785
714- @jax.jit
715786def compute_stopping_time(w_bar, key):
716787 """
717788 Compute stopping time by drawing wages until one exceeds `w_bar`.
@@ -734,7 +805,6 @@ def compute_stopping_time(w_bar, key):
734805 return t_final
735806
736807
737- @partial(jax.jit, static_argnames=('num_reps',))
738808def compute_mean_stopping_time(w_bar, num_reps=100000, seed=1234):
739809 """
740810 Generate a mean stopping time over `num_reps` repetitions by
@@ -753,6 +823,7 @@ def compute_mean_stopping_time(w_bar, num_reps=100000, seed=1234):
753823
754824c_vals = jnp.linspace(10, 40, 25)
755825
826+ @jax.jit
756827def compute_stop_time_for_c(c):
757828 """Compute mean stopping time for a given compensation value c."""
758829 model = McCallModel(c=c)