Commit f18b093 ("updates")
1 parent 5c62e88

19 files changed: 853 additions & 750 deletions

lectures/_static/quant-econ.bib (4 additions & 3 deletions)

@@ -127,7 +127,7 @@ @article{BakshiChabiYo2012
   number = {1},
   pages = {191--208},
   year = {2012},
-  doi = {10.1016/j.jfineco.2011.10.004}
+  doi = {10.1016/j.jfineco.2012.01.003}
 }
 
 @article{BackusGregoryZin1989,
@@ -138,7 +138,7 @@ @article{BackusGregoryZin1989
   number = {3},
   pages = {371--399},
   year = {1989},
-  doi = {10.1016/0304-3932(89)90033-X}
+  doi = {10.1016/0304-3932(89)90027-5}
 }
 
 @article{Hansen2012,
@@ -172,7 +172,8 @@ @article{Borovicka2020
   number = {1},
   pages = {206--251},
   year = {2020},
-  publisher = {University of Chicago Press}
+  publisher = {University of Chicago Press},
+  doi = {10.1086/704072}
 }
 
 @article{Sandroni2000Markets,

lectures/blackwell_kihlstrom.md (1 addition & 1 deletion)

@@ -962,7 +962,7 @@ The Blackwell order says that, absent costs, more information is always better f
 
 With costs, the consumer chooses quality investment $\theta$ to maximize *net value*.
 
-If quality investment translates into experiment accuracy with diminishing returns say, accuracy $\phi(\theta) = 1 - e^{-a\theta}$ for a rate parameter $a$ then the marginal value of information eventually decreases in $\theta$.
+If quality investment translates into experiment accuracy with diminishing returns -- say, accuracy $\phi(\theta) = 1 - e^{-a\theta}$ for a rate parameter $a$ -- then the marginal value of information eventually decreases in $\theta$.
 
 With a convex cost $c(\theta) = c \, \theta^2$, the increasing marginal cost eventually overtakes the declining marginal value, producing an interior optimum.
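The interior-optimum claim in the hunk above is easy to verify numerically. The sketch below is not part of the lecture; the value-of-information scale `V` and the parameter values are illustrative assumptions.

```python
import numpy as np

# Net value of quality investment: V * phi(theta) - c * theta^2, with
# phi(theta) = 1 - exp(-a * theta) as in the text.
# V, a, and c are illustrative stand-ins, not the lecture's calibration.
V, a, c = 1.0, 2.0, 0.5

theta = np.linspace(0.0, 3.0, 3001)
net = V * (1.0 - np.exp(-a * theta)) - c * theta**2
theta_star = theta[np.argmax(net)]

# At an interior optimum the first-order condition
# V * a * exp(-a * theta) = 2 * c * theta holds approximately.
print(theta_star)
```

With these numbers the maximizer is interior (around $\theta^* \approx 0.6$): the declining marginal value of accuracy meets the rising marginal cost strictly inside the grid.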

lectures/cass_fiscal.md (1 addition & 1 deletion)

@@ -1133,7 +1133,7 @@ and capital stock across time:
 - The jump in $\tau_c$ depresses $\bar{R}$ below $1$, causing a *sharp drop in consumption*.
 - After $T = 10$:
   - The effects of anticipated distortion are over, and the economy gradually adjusts to the lower capital stock.
-  - Capital must now rise, requiring *austerity* consumption plummets after $t = T$, indicated by lower levels of consumption.
+  - Capital must now rise, requiring *austerity* -- consumption plummets after $t = T$.
 - The interest rate gradually declines, and consumption grows at a diminishing rate along the path to the terminal steady-state.
 
 +++

lectures/cass_fiscal_2.md (1 addition & 1 deletion)

@@ -498,7 +498,7 @@ This means that foreign households begin repaying part of their external debt by
 
 We now explore the impact of an increase in capital taxation in the domestic economy $10$ periods after its announcement at $t = 1$.
 
-Because the change is anticipated, households in both countries adjust immediatelyeven though the tax does not take effect until period $t = 11$.
+Because the change is anticipated, households in both countries adjust immediately -- even though the tax does not take effect until period $t = 11$.
 
 ```{code-cell} ipython3
 shocks_global = {

lectures/chow_business_cycles.md (3 additions & 3 deletions)

@@ -351,9 +351,9 @@ The second equation is the discrete Lyapunov equation for $\Gamma_0$.
 > But in reality the cycles ... are generally not damped.
 > How can the maintenance of the swings be explained?
 > ... One way which I believe is particularly fruitful and promising is to study what would become of the solution of a determinate dynamic system if it were exposed to a stream of erratic shocks ...
-> Thus, by connecting the two ideas: (1) the continuous solution of a determinate dynamic system and (2) the discontinuous shocks intervening and supplying the energy that may maintain the swingswe get a theoretical setup which seems to furnish a rational interpretation of those movements which we have been accustomed to see in our statistical time data.
+> Thus, by connecting the two ideas: (1) the continuous solution of a determinate dynamic system and (2) the discontinuous shocks intervening and supplying the energy that may maintain the swings -- we get a theoretical setup which seems to furnish a rational interpretation of those movements which we have been accustomed to see in our statistical time data.
 >
-> Ragnar Frisch (1933) {cite}`frisch33`
+> -- Ragnar Frisch (1933) {cite}`frisch33`
 
 Chow's main insight is that oscillations in the deterministic system are *neither necessary nor sufficient* for producing "cycles" in the stochastic system.
@@ -1408,7 +1408,7 @@ plt.show()
 
 As $v$ increases, eigenvalues approach the unit circle: oscillations become more persistent in the time domain (left), and the spectral peak becomes sharper in the frequency domain (right).
 
-Complex roots produce a pronounced peak at interior frequenciesthe spectral signature of business cycles.
+Complex roots produce a pronounced peak at interior frequencies -- the spectral signature of business cycles.
 
 ```{solution-end}
 ```
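The interior-spectral-peak claim in the second hunk above can be illustrated in a few lines. This is a standalone sketch, not the lecture's code; the root modulus and angle are illustrative choices.

```python
import numpy as np

# AR(2) with complex roots rho * exp(+/- i*theta):
# x_t = phi1 * x_{t-1} + phi2 * x_{t-2} + e_t.
# rho and theta are illustrative choices, not the lecture's calibration.
rho, theta = 0.9, np.pi / 6
phi1, phi2 = 2 * rho * np.cos(theta), -rho**2

# Spectral density (up to a constant scale) on (0, pi)
omega = np.linspace(1e-3, np.pi - 1e-3, 1000)
z = np.exp(-1j * omega)
spec = 1.0 / np.abs(1 - phi1 * z - phi2 * z**2) ** 2

peak = omega[np.argmax(spec)]   # peak lies at an interior frequency near theta
print(peak)
```

The peak sits near (not exactly at) the root angle $\theta$; pushing $\rho$ toward $1$ sharpens it, matching the persistence pattern described above.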

lectures/hansen_singleton_1982.md (1 addition & 1 deletion)

@@ -225,7 +225,7 @@ The vector $z_t$ plays the role of **instruments**.
 
 The conditional Euler equation $E_t[M_{t+1}R_{t+1}^i - 1] = 0$ says that the pricing error is unpredictable given *everything* in the agent's time-$t$ information set.
 
-That is a very strong restriction it says the pricing error is orthogonal to every time-$t$ measurable random variable.
+That is a very strong restriction -- it says the pricing error is orthogonal to every time-$t$ measurable random variable.
 
 We cannot use the entire information set in practice, but we can pick any finite collection of time-$t$ observable variables $z_t$ and the orthogonality must still hold.
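The instrument logic in this hunk can be made concrete with a simulated example. The series below are synthetic stand-ins, not the consumption and return data used in the lecture.

```python
import numpy as np

# If the pricing error u_{t+1} = M_{t+1} R_{t+1} - 1 is orthogonal to every
# time-t measurable variable, then for any observed instrument z_t the
# sample moment (1/T) * sum_t u_{t+1} z_t should be near zero.
# Both series here are simulated, not the lecture's data.
rng = np.random.default_rng(0)
T = 100_000
z = rng.normal(size=T)                 # a time-t observable instrument
u = rng.normal(scale=0.1, size=T)      # pricing error, mean zero given time t
moment = np.mean(u * z)                # sample analog of E[u_{t+1} z_t]
print(moment)
```

GMM estimation, as in the lecture, stacks a finite vector of such sample moments and chooses parameters to push them jointly toward zero.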

lectures/hansen_singleton_1983.md (2 additions & 2 deletions)

@@ -36,7 +36,7 @@ kernelspec:
 > rational expectations econometrics. A rational expectations equilibrium is a
 > likelihood function. Maximize it.
 >
-> An Interview with Thomas J. Sargent {cite}`evans2005interview`
+> -- An Interview with Thomas J. Sargent {cite}`evans2005interview`
 
 ## Overview
@@ -1869,7 +1869,7 @@ Our estimates reproduce the pattern that {cite:t}`MehraPrescott1985` later calle
 
 - *Low estimated risk aversion:* The estimated $\hat\alpha$ values (and thus risk aversion $-\hat\alpha$) from the table above are similar to those in {cite:t}`hansen1983stochastic`, who report $\hat\alpha$ between $-0.32$ and $-1.25$.
 
-- *Tiny return predictability:* The unrestricted-VAR $R_R^2$ values are comparable to the 0.02 to 0.06 range in {cite:t}`hansen1983stochastic` the predictable component of stock returns is small relative to the unpredictable component.
+- *Tiny return predictability:* The unrestricted-VAR $R_R^2$ values are comparable to the 0.02 to 0.06 range in {cite:t}`hansen1983stochastic` -- the predictable component of stock returns is small relative to the unpredictable component.
 
 - *Strong rejection for Treasury bills:* The Euler-equation restrictions are decisively rejected for the nominally risk-free Treasury bill return, just as in Table 4 of {cite:t}`hansen1983stochastic`.
lectures/inventory_q.md (10 additions & 10 deletions)

@@ -35,7 +35,7 @@ A firm must decide how much stock to order each period, facing uncertain demand
 We approach the problem in two ways.
 
 First, we solve it exactly using dynamic programming, assuming full knowledge of
-the model the demand distribution, cost parameters, and transition dynamics.
+the model -- the demand distribution, cost parameters, and transition dynamics.
 
 Second, we show how a manager can learn the optimal policy from experience alone, using [Q-learning](https://en.wikipedia.org/wiki/Q-learning).
@@ -475,15 +475,15 @@ All the manager needs to observe at each step is:
 4. the discount factor $\beta$, which is determined by the interest rate, and
 5. the next inventory level $X_{t+1}$ (which they can read off the warehouse).
 
-These are all directly observable quantities no model knowledge is required.
+These are all directly observable quantities -- no model knowledge is required.
 
 
 ### The Q-table and the role of the max
 
 It is important to understand how the update rule relates to the manager's
 actions.
 
-The manager maintains a **Q-table** a lookup table storing an estimate $q_t(x,
+The manager maintains a **Q-table** -- a lookup table storing an estimate $q_t(x,
 a)$ for every state-action pair $(x, a)$.
 
 At each step, the manager is in some state $x$ and must choose a specific action
@@ -492,7 +492,7 @@ and next state $X_{t+1}$, and updates *that one entry* $q_t(x, a)$ of the
 table using the rule above.
 
 It is tempting to read the $\max_{a'}$ in the update rule as prescribing the
-manager's next action that is, to interpret the update as saying "move to
+manager's next action -- that is, to interpret the update as saying "move to
 state $X_{t+1}$ and take an action in $\argmax_{a'} q_t(X_{t+1}, a')$."
 
 But the $\max$ plays a different role.
@@ -512,7 +512,7 @@ The rule governing how the manager chooses actions is called the **behavior poli
 
 Because the $\max$ in the update target always points toward $q^*$
 regardless of how the manager selects actions, the behavior policy affects only
-which $(x, a)$ entries get visited and hence updated over time.
+which $(x, a)$ entries get visited -- and hence updated -- over time.
 
 In the reinforcement learning literature, this property is called **off-policy**
 learning: the convergence target ($q^*$) does not depend on the behavior policy.
@@ -521,8 +521,8 @@ As long as every $(x, a)$ pair is visited infinitely often (so that every entry
 of the Q-table receives infinitely many updates) and the learning rates satisfy
 standard conditions (see below), the Q-table converges to $q^*$.
 
-The behavior policy affects the *speed* of convergence visiting important
-state-action pairs more frequently leads to faster learning but not the
+The behavior policy affects the *speed* of convergence -- visiting important
+state-action pairs more frequently leads to faster learning -- but not the
 *limit*.
 
 In practice, we want the manager to mostly take good actions (to earn reasonable
@@ -555,11 +555,11 @@ The stochastic demand shocks naturally drive the manager across different invent
 
 A simple but powerful technique for accelerating learning is **optimistic initialization**: instead of starting the Q-table at zero, we initialize every entry to a value above the true optimum.
 
-Because every untried action looks optimistically good, the agent is "disappointed" whenever it tries one the update pulls that entry down toward reality. This drives the agent to try other actions (which still look optimistically high), producing broad exploration of the state-action space early in training.
+Because every untried action looks optimistically good, the agent is "disappointed" whenever it tries one -- the update pulls that entry down toward reality. This drives the agent to try other actions (which still look optimistically high), producing broad exploration of the state-action space early in training.
 
 This idea is sometimes called **optimism in the face of uncertainty** and is widely used in both bandit and reinforcement learning settings.
 
-In our problem, the value function $v^*$ ranges from about 13 to 18. We initialize the Q-table at 20 modestly above the true maximum to ensure optimistic exploration without being so extreme as to distort learning.
+In our problem, the value function $v^*$ ranges from about 13 to 18. We initialize the Q-table at 20 -- modestly above the true maximum -- to ensure optimistic exploration without being so extreme as to distort learning.
@@ -581,7 +581,7 @@ def greedy_policy_from_q(q, K):
     return σ
 ```
 
-The Q-learning loop runs for `n_steps` total steps in a single continuous trajectory just as a real manager would learn from the ongoing stream of data.
+The Q-learning loop runs for `n_steps` total steps in a single continuous trajectory -- just as a real manager would learn from the ongoing stream of data.
 
 At specified step counts (given by `snapshot_steps`), we record the current greedy policy.
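The pieces discussed in these hunks -- the Q-table, the off-policy max in the update target, an eps-greedy behavior policy, and optimistic initialization -- fit together as in the following self-contained sketch. It is not the lecture's implementation: the capacity, prices, demand distribution, and constant learning rate are all illustrative assumptions.

```python
import numpy as np

# Tabular Q-learning for a toy inventory problem.
# K, the prices, the geometric demand, and the constant learning rate
# alpha are illustrative assumptions, not the lecture's calibration.
rng = np.random.default_rng(0)
K, beta = 10, 0.9            # warehouse capacity and discount factor
p, c = 2.0, 0.5              # sale price and unit order cost
alpha, eps = 0.1, 0.1        # learning rate and exploration probability

# Optimistic initialization: every entry starts above the crude upper
# bound p * K / (1 - beta) = 200 on attainable discounted value.
q = np.full((K + 1, K + 1), 250.0)   # rows: inventory x, columns: order a

x = 0
for t in range(100_000):
    n_feas = K - x + 1               # feasible orders satisfy x + a <= K
    # Behavior policy (eps-greedy): it affects which entries get visited,
    # not the limit the table converges to (off-policy learning).
    if rng.random() < eps:
        a = int(rng.integers(n_feas))
    else:
        a = int(np.argmax(q[x, :n_feas]))
    d = int(min(rng.geometric(0.4) - 1, K))   # demand this period
    sales = min(x + a, d)
    r = p * sales - c * a                     # observable one-period reward
    x_next = x + a - sales
    # Q-learning update: the max evaluates the *next state*, regardless of
    # which action the manager actually takes there.
    target = r + beta * q[x_next, :K - x_next + 1].max()
    q[x, a] += alpha * (target - q[x, a])
    x = x_next

# Greedy policy implied by the learned table, restricted to feasible orders
greedy = np.array([int(np.argmax(q[s, :K - s + 1])) for s in range(K + 1)])
print(greedy)
```

Because every starting value exceeds anything attainable, each update "disappoints" the tried entry downward, producing the broad early exploration described in the optimistic-initialization hunk.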

lectures/lqcontrol.md (1 addition & 1 deletion)

@@ -1267,7 +1267,7 @@ The parameters are $r = 0.05, \beta = 1 / (1 + r), \bar c = 1.5, \mu = 2, \sigm
 
 Here’s one solution.
 
-We use some fancy plot commands to get a certain style feel free to
+We use some fancy plot commands to get a certain style -- feel free to
 use simpler ones.
 
 The model is an LQ permanent income / life-cycle model with hump-shaped

lectures/markov_perf.md (5 additions & 1 deletion)

@@ -140,7 +140,10 @@ v_i(q_i, q_{-i}) = \max_{\hat q_i}
 \left\{\pi_i (q_i, q_{-i}, \hat q_i) + \beta v_i(\hat q_i, f_{-i}(q_{-i}, q_i)) \right\}
 ```
 
-**Definition** A **Markov perfect equilibrium** of the duopoly model is a pair of value functions $(v_1, v_2)$ and a pair of policy functions $(f_1, f_2)$ such that, for each $i \in \{1, 2\}$ and each possible state,
+```{prf:definition} Markov Perfect Equilibrium
+:label: def-markov-perfect-equilibrium
+
+A **Markov perfect equilibrium** of the duopoly model is a pair of value functions $(v_1, v_2)$ and a pair of policy functions $(f_1, f_2)$ such that, for each $i \in \{1, 2\}$ and each possible state,
 
 * The value function $v_i$ satisfies Bellman equation {eq}`game4`.
 * The maximizer on the right side of {eq}`game4` equals $f_i(q_i, q_{-i})$.
@@ -150,6 +153,7 @@ The adjective "Markov" denotes that the equilibrium decision rules depend only o
 "Perfect" means complete, in the sense that the equilibrium is constructed by backward induction and hence builds in optimizing behavior for each firm at all possible future states.
 
 * These include many states that will not be reached when we iterate forward on the pair of equilibrium strategies $f_i$ starting from a given initial state.
+```
 
 ### Computation
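The two conditions in the definition added above suggest a computational strategy: hold one firm's policy fixed, solve the other firm's Bellman equation, and alternate until neither policy changes. The sketch below does this on a coarse grid; the linear inverse demand and quadratic adjustment cost are illustrative assumptions, and this schematic grid method stands in for the lecture's linear-quadratic solution.

```python
import numpy as np

# Best-response iteration on the two Bellman equations in the definition.
# Inverse demand p = A - q1 - q2 and adjustment cost gamma * (q' - q)^2
# are illustrative assumptions, not the lecture's exact model.
A, beta, gamma = 10.0, 0.95, 1.0
grid = np.linspace(0.0, 5.0, 15)
n = len(grid)

# pi[x, y, k]: payoff with own q = grid[x], rival q = grid[y], next own q = grid[k]
own, riv, nxt = np.meshgrid(grid, grid, grid, indexing="ij")
pi = own * (A - own - riv) - gamma * (nxt - own) ** 2

v = [np.zeros((n, n)), np.zeros((n, n))]   # v[i][x, y]: own index x, rival index y
f = [np.zeros((n, n), dtype=int) for _ in range(2)]

for _ in range(100):                        # alternate best responses
    f_old = [f[0].copy(), f[1].copy()]
    for i in range(2):
        j = 1 - i
        for _ in range(600):                # value iteration given rival policy f[j]
            y_next = f[j].T                 # rival's next index, seen from state (x, y)
            cont = v[i][:, y_next]          # cont[k, x, y] = v_i(k, y_next[x, y])
            targets = pi + beta * np.transpose(cont, (1, 2, 0))
            v_new = targets.max(axis=2)
            f[i] = targets.argmax(axis=2)
            done = np.max(np.abs(v_new - v[i])) < 1e-6
            v[i] = v_new
            if done:
                break
    if all(np.array_equal(f[k], f_old[k]) for k in range(2)):
        break                               # policies are mutual best responses

print(grid[f[0][n // 2, n // 2]])           # firm 1's move from a symmetric state
```

At a fixed point, each $v_i$ satisfies its own Bellman equation given the rival's rule and each $f_i$ attains the maximum -- exactly the two defining conditions. The lecture instead exploits the linear-quadratic structure and solves coupled Riccati equations, which is far more efficient.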
