
Commit 02f9731

Merge branch 'main' into fix-codefromfile
2 parents: e65ab10 + 609a775

File tree

5 files changed: +40 −41 lines

lectures/exchangeable.md

Lines changed: 8 additions & 10 deletions

@@ -140,16 +140,14 @@ and partial history $W_{t-1}, \ldots, W_0$ contains no information about the pro
 
 So in the IID case, there is **nothing to learn** about the densities of future random variables from past data.
 
-In the general case, there is something go learn from past data.
+In the general case, there is something to learn from past data.
 
-We turn next to an instance of this general case in which there is something to learn from past data.
+We turn next to an instance of this general case.
 
 Please keep your eye out for **what** there is to learn from past data.
 
 ## A Setting in Which Past Observations Are Informative
 
-We now turn to a setting in which there **is** something to learn.
-
 Let $\{W_t\}_{t=0}^\infty$ be a sequence of nonnegative
 scalar random variables with a joint probability distribution
 constructed as follows.
@@ -174,7 +172,7 @@ of them once and for all and then drew an IID sequence of draws from that distri
 
 But our decision maker does not know which of the two distributions nature selected.
 
-The decision maker summarizes his ignorance about this by picking a **subjective probability**
+The decision maker summarizes his ignorance with a **subjective probability**
 $\tilde \pi$ and reasons as if nature had selected $F$ with probability
 $\tilde \pi \in (0,1)$ and
 $G$ with probability $1 - \tilde \pi$.
@@ -276,7 +274,7 @@ as a **prior probability** that nature selected probability distribution $F$.
 DeFinetti {cite}`definetti` established a related representation of an exchangeable process created by mixing
 sequences of IID Bernoulli random variables with parameters $\theta$ and mixing probability $\pi(\theta)$
 for a density $\pi(\theta)$ that a Bayesian statistician would interpret as a prior over the unknown
-Bernoulli paramter $\theta$.
+Bernoulli parameter $\theta$.
 
 ## Bayes' Law
 
@@ -287,7 +285,7 @@ But how can we learn?
 
 And about what?
 
-The answer to the *about what* question is about $\tilde pi$.
+The answer to the *about what* question is about $\tilde \pi$.
 
 The answer to the *how* question is to use Bayes' Law.
 
@@ -302,7 +300,7 @@
 \pi = \mathbb{P}\{q = f \}
 $$
 
-where we regard $\pi$ as the decision maker's **subjective probability** (also called a **personal probability**.
+where we regard $\pi$ as the decision maker's **subjective probability** (also called a **personal probability**).
 
 Suppose that at $t \geq 0$, the decision maker has observed a history
 $w^t \equiv [w_t, w_{t-1}, \ldots, w_0]$.
@@ -486,12 +484,12 @@ learning_example()
 Please look at the three graphs above created for an instance in which $f$ is a uniform distribution on $[0,1]$
 (i.e., a Beta distribution with parameters $F_a=1, F_b=1$), while $g$ is a Beta distribution with the default parameter values $G_a=3, G_b=1.2$.
 
-The graph in the left plots the likehood ratio $l(w)$ on the coordinate axis against $w$ on the coordinate axis.
+The graph on the left plots the likelihood ratio $l(w)$ on the ordinate axis against $w$ on the coordinate axis.
 
 The middle graph plots both $f(w)$ and $g(w)$ against $w$, with the horizontal dotted lines showing values
 of $w$ at which the likelihood ratio equals $1$.
 
-The graph on the right side plots arrows to the right that show when Bayes' Law makes $\pi$ increase and arrows
+The graph on the right plots arrows to the right that show when Bayes' Law makes $\pi$ increase and arrows
 to the left that show when Bayes' Law makes $\pi$ decrease.
 
 Notice how the length of the arrows, which show the magnitude of the force from Bayes' Law impelling $\pi$ to change,
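The Bayes' Law update of $\pi$ that these `exchangeable.md` edits revolve around can be sketched in a few lines (an editorial sketch, not part of the commit; it assumes the lecture's example Beta parameters $F_a=1, F_b=1$ and $G_a=3, G_b=1.2$, and the helper name `bayes_update` is hypothetical):

```python
from scipy.stats import beta

# Beta densities from the lecture's example: f is uniform on [0, 1]
f = lambda w: beta.pdf(w, 1, 1)        # F_a = 1, F_b = 1
g = lambda w: beta.pdf(w, 3, 1.2)      # G_a = 3, G_b = 1.2

def bayes_update(pi, w):
    """One step of Bayes' Law: posterior probability that q = f
    after observing w, starting from prior pi."""
    num = pi * f(w)
    return num / (num + (1 - pi) * g(w))

pi = 0.5
print(bayes_update(pi, 0.9))   # < 0.5, since g(0.9) > f(0.9)
print(bayes_update(pi, 0.1))   # > 0.5, since f(0.1) > g(0.1)
```

The direction of the update matches the arrows in the right-hand graph: draws where $g$ puts more mass than $f$ push $\pi$ down, and conversely.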

lectures/likelihood_bayes.md

Lines changed: 2 additions & 3 deletions

@@ -59,15 +59,14 @@ We begin by reviewing the setting in {doc}`this lecture <likelihood_ratio_proces
 A nonnegative random variable $W$ has one of two probability density functions, either
 $f$ or $g$.
 
-Before the beginning of time, nature once and for all decides whether she will draw a sequence of IID draws from either
-$f$ or $g$.
+Before the beginning of time, nature once and for all decides whether she will draw a sequence of IID draws from $f$ or from $g$.
 
 We will sometimes let $q$ be the density that nature chose once and for all, so
 that $q$ is either $f$ or $g$, permanently.
 
 Nature knows which density it permanently draws from, but we the observers do not.
 
-We do know both $f$ and $g$ but we don’t know which density nature
+We do know both $f$ and $g$, but we don’t know which density nature
 chose.
 
 But we want to know.
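The setting this hunk rewrites — nature choosing $q$ once and for all, then drawing IID from it forever — can be simulated directly (a minimal sketch, not part of the commit; the fifty-fifty choice probability and the Beta parameters are illustrative assumptions):

```python
import numpy as np
from scipy.stats import beta

rng = np.random.default_rng(42)

# Nature decides once and for all (a hypothetical 1/2 probability each)
q_is_f = rng.random() < 0.5

# Then nature draws an IID sequence from the chosen density, permanently
if q_is_f:
    draws = beta.rvs(1, 1, size=100, random_state=rng)    # q = f (uniform)
else:
    draws = beta.rvs(3, 1.2, size=100, random_state=rng)  # q = g
```

The observer sees only `draws`, never `q_is_f` — which is exactly the inference problem the lecture sets up.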

lectures/likelihood_ratio_process.md

Lines changed: 12 additions & 11 deletions

@@ -38,10 +38,10 @@ This lecture describes likelihood ratio processes and some of their uses.
 
 We'll use a setting described in {doc}`this lecture <exchangeable>`.
 
-Among the things that we'll learn about are
+Among the things we'll learn are
 
 * A peculiar property of likelihood ratio processes
-* How a likelihood ratio process is the key ingredient in frequentist hypothesis testing
+* How a likelihood ratio process is a key ingredient in frequentist hypothesis testing
 * How a **receiver operating characteristic curve** summarizes information about a false alarm probability and power in frequentist hypothesis testing
 * How during World War II the United States Navy devised a decision rule that Captain Garret L. Schyler challenged and asked Milton Friedman to justify to him, a topic to be studied in {doc}`this lecture <wald_friedman>`
 
@@ -111,8 +111,8 @@ Pearson {cite}`Neyman_Pearson`.
 
 To help us appreciate how things work, the following Python code evaluates $f$ and $g$ as two different
 beta distributions, then computes and simulates an associated likelihood
-ratio process by generating a sequence $w^t$ from *some*
-probability distribution, for example, a sequence of IID draws from $g$.
+ratio process by generating a sequence $w^t$ from one of the two
+probability distributions, for example, a sequence of IID draws from $g$.
 
 ```{code-cell} python3
 # Parameters in the two beta distributions.
@@ -322,7 +322,7 @@ Denote $q$ as the data generating process, so that
 $q=f \text{ or } g$.
 
 Upon observing a sample $\{W_i\}_{i=1}^t$, we want to decide
-which one is the data generating process by performing a (frequentist)
+whether nature is drawing from $g$ or from $f$ by performing a (frequentist)
 hypothesis test.
 
 We specify
@@ -341,7 +341,7 @@ where $c$ is a given discrimination threshold, to be chosen in a way we'll soon
 This test is *best* in the sense that it is a **uniformly most powerful** test.
 
 To understand what this means, we have to define probabilities of two important events that
-allow us to characterize a test associated with given
+allow us to characterize a test associated with a given
 threshold $c$.
 
 The two probabilities are:
@@ -370,7 +370,7 @@ alarm.
 Another way to say the same thing is that among all possible tests, a likelihood ratio test
 maximizes **power** for a given **significance level**.
 
-To have made a confident inference, we want a small probability of
+To have made a good inference, we want a small probability of
 false alarm and a large probability of detection.
 
 With sample size $t$ fixed, we can change our two probabilities by
@@ -412,7 +412,8 @@ moves toward $-\infty$ when $g$ is the data generating
 process; while log$(L(w^t))$ goes to
 $\infty$ when data are generated by $f$.
 
-This diverse behavior is what makes it possible to distinguish
+That disparate behavior of log$(L(w^t))$ under $f$ and $g$
+is what makes it possible to distinguish
 $q=f$ from $q=g$.
 
 ```{code-cell} python3
@@ -499,9 +500,9 @@ of detection and a smaller probability of false alarm associated with
 a given discrimination threshold $c$.
 
 As $t \rightarrow + \infty$, we approach the perfect detection
-curve that is indicated by a right angle hinging on the green dot.
+curve that is indicated by a right angle hinging on the blue dot.
 
-For a given sample size $t$, a value discrimination threshold $c$ determines a point on the receiver operating
+For a given sample size $t$, the discrimination threshold $c$ determines a point on the receiver operating
 characteristic curve.
 
 It is up to the test designer to trade off probabilities of
@@ -540,7 +541,7 @@ plt.show()
 The United States Navy evidently used a procedure like this to select a sample size $t$ for doing quality
 control tests during World War II.
 
-A Navy Captain who had been ordered to perform tests of this kind had second thoughts about it that he
+A Navy Captain who had been ordered to perform tests of this kind had doubts about it that he
 presented to Milton Friedman, as we describe in {doc}`this lecture <wald_friedman>`.
 
 ## Sequels
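The divergence of log $L(w^t)$ discussed in this file's changes — toward $-\infty$ under $g$ and toward $+\infty$ under $f$ — can be checked with a short simulation (an editorial sketch, not part of the commit; the Beta parameters mirror the lecture's example values):

```python
import numpy as np
from scipy.stats import beta

np.random.seed(0)
F_a, F_b, G_a, G_b = 1, 1, 3, 1.2   # parameters as in the lecture's example

# Likelihood ratio l(w) = f(w) / g(w)
l = lambda w: beta.pdf(w, F_a, F_b) / beta.pdf(w, G_a, G_b)

T = 5000
w_from_g = beta.rvs(G_a, G_b, size=T)   # nature permanently draws from g
w_from_f = beta.rvs(F_a, F_b, size=T)   # nature permanently draws from f

log_L_g = np.cumsum(np.log(l(w_from_g)))  # log L(w^t) when q = g
log_L_f = np.cumsum(np.log(l(w_from_f)))  # log L(w^t) when q = f

print(log_L_g[-1])   # large and negative: L(w^t) -> 0 under g
print(log_L_f[-1])   # large and positive: L(w^t) -> +infinity under f
```

The opposite drifts reflect the two Kullback–Leibler divergences: the mean of log $l(w)$ is negative under $g$ and positive under $f$, which is what lets a threshold test separate $q=f$ from $q=g$.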

lectures/navy_captain.md

Lines changed: 12 additions & 11 deletions

@@ -106,8 +106,8 @@ impose on him.
 The decision maker pays a cost $c$ for drawing
 another $z$.
 
-We mainly borrow parameters from the quantecon lecture “A Problem that
-Stumped Milton Friedman except that we increase both $\bar L_{0}$
+We mainly borrow parameters from the quantecon lecture
+{doc}`A Problem that Stumped Milton Friedman <wald_friedman>` except that we increase both $\bar L_{0}$
 and $\bar L_{1}$ from $25$ to $100$ to encourage the
 frequentist Navy Captain to take more draws before deciding.
 
@@ -270,7 +270,7 @@ Here
 not rejecting $H_0$ when $H_1$ is true
 
 For a given sample size $t$, the pairs $\left(PFA,PD\right)$
-lie on a receiver operating characteristic curve and can be uniquely
+lie on a **receiver operating characteristic curve** and can be uniquely
 pinned down by choosing $d$.
 
 To see some receiver operating characteristic curves, please see this
@@ -297,7 +297,7 @@ plt.legend()
 plt.show()
 ```
 
-We can compute sequneces of likelihood ratios using simulated samples.
+We can compute sequences of likelihood ratios using simulated samples.
 
 ```{code-cell} python3
 l = lambda z: wf.f0(z) / wf.f1(z)
@@ -312,7 +312,7 @@ L1_arr = np.cumprod(l1_arr, 1)
 ```
 
 With an empirical distribution of likelihood ratios in hand, we can draw
-receiver operating characteristic curves by enumerating
+**receiver operating characteristic curves** by enumerating
 $\left(PFA,PD\right)$ pairs given each sample size $t$.
 
 ```{code-cell} python3
@@ -450,7 +450,7 @@ plt.title('$\overline{V}_{fre}$')
 plt.show()
 ```
 
-The following shows how do optimal sample size $t$ and targeted
+The following shows how optimal sample size $t$ and targeted
 $\left(PFA,PD\right)$ change as $\pi^{*}$ varies.
 
 ```{code-cell} python3
@@ -471,7 +471,7 @@ plt.show()
 
 ## Bayesian Decision Rule
 
-In this lecture {doc}`A Problem that Stumped Milton Friedman <wald_friedman>`,
+In {doc}`A Problem that Stumped Milton Friedman <wald_friedman>`,
 we learned how Abraham Wald confirmed the Navy
 Captain’s hunch that there is a better decision rule.
 
@@ -603,7 +603,7 @@ plt.legend(borderpad=1.1)
 plt.show()
 ```
 
-The above figure portrays the value function plotted against decision
+The above figure portrays the value function plotted against the decision
 maker’s Bayesian posterior.
 
 It also shows the probabilities $\alpha$ and $\beta$.
@@ -641,6 +641,7 @@
 
 where
 $\pi^{\prime}=\frac{\pi f_{0}\left(z^{\prime}\right)}{\pi f_{0}\left(z^{\prime}\right)+\left(1-\pi\right)f_{1}\left(z^{\prime}\right)}$.
+
 Given a prior probability $\pi_{0}$, the expected loss for the
 Bayesian is
 
@@ -843,7 +844,7 @@ It is always positive.
 
 ## More details
 
-We can provide more insights by focusing soley the case in which
+We can provide more insights by focusing on the case in which
 $\pi^{*}=0.5=\pi_{0}$.
 
 ```{code-cell} python3
@@ -853,7 +854,7 @@
 Recall that when $\pi^*=0.5$, the frequentist decision rule sets a
 sample size `t_optimal` **ex ante**.
 
-For our parameter settings, we can compute it’s value:
+For our parameter settings, we can compute its value:
 
 ```{code-cell} python3
 t_optimal
@@ -870,7 +871,7 @@ t_idx = t_optimal - 1
 
 By using simulations, we compute the frequency distribution of time to
 deciding for the Bayesian decision rule and compare that time to the
-frequentist rule’sfixed $t$.
+frequentist rule’s fixed $t$.
 
 The following Python code creates a graph that shows the frequency
 distribution of times to decide of the Bayesian decision maker,
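The $(PFA, PD)$ enumeration that this file's changes describe can be sketched without the lecture's `wf` object (an editorial sketch, not part of the commit; the two Beta densities standing in for `f0` and `f1`, and the threshold values `d`, are hypothetical):

```python
import numpy as np
from scipy.stats import beta

np.random.seed(1)
f0 = lambda z: beta.pdf(z, 1, 1)     # hypothetical null density
f1 = lambda z: beta.pdf(z, 3, 1.2)   # hypothetical alternative density

N, t = 10000, 10
z0 = beta.rvs(1, 1, size=(N, t))     # simulated samples when H0 (f0) is true
z1 = beta.rvs(3, 1.2, size=(N, t))   # simulated samples when H1 (f1) is true

l = lambda z: f0(z) / f1(z)
L0 = np.cumprod(l(z0), axis=1)[:, -1]   # likelihood ratios L(z^t) under H0
L1 = np.cumprod(l(z1), axis=1)[:, -1]   # likelihood ratios L(z^t) under H1

# Reject H0 when L(z^t) < d; sweeping d traces out the ROC curve
for d in [0.5, 1.0, 2.0]:
    PFA = np.mean(L0 < d)   # false alarm: reject H0 when H0 is true
    PD = np.mean(L1 < d)    # detection: reject H0 when H1 is true
    print(f"d={d}: PFA={PFA:.3f}, PD={PD:.3f}")
```

Each threshold `d` pins down one $(PFA, PD)$ point, and a good test keeps detection well above false alarm at every threshold.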

lectures/wald_friedman.md

Lines changed: 6 additions & 6 deletions

@@ -376,12 +376,12 @@ c + \int \min \{ (1 - \kappa(z', \pi) ) L_0, \kappa(z', \pi) L_1, h(\kappa(z',
 can be understood as a functional equation, where $h$ is the unknown.
 
 Using the functional equation, {eq}`funceq`, for the continuation value, we can back out
-optimal choices using the RHS of {eq}`optdec`.
+optimal choices using the right side of {eq}`optdec`.
 
 This functional equation can be solved by taking an initial guess and iterating
-to find the fixed point.
+to find a fixed point.
 
-In other words, we iterate with an operator $Q$, where
+Thus, we iterate with an operator $Q$, where
 
 $$
 Q h(\pi) =
@@ -529,7 +529,7 @@ def solve_model(wf, tol=1e-4, max_iter=1000):
 
 ## Analysis
 
-Let's inspect the model's solutions.
+Let's inspect outcomes.
 
 We will be using the default parameterization with distributions like so
 
@@ -747,7 +747,7 @@ simulation_plot(wf)
 
 Increased cost per draw has induced the decision-maker to take fewer draws before deciding.
 
-Because he decides with less, the percentage of time he is correct drops.
+Because he decides with fewer draws, the percentage of time he is correct drops.
 
 This leads to him having a higher expected loss when he puts equal weight on both models.
 
@@ -939,4 +939,4 @@ We'll dig deeper into some of the ideas used here in the following lectures:
 * {doc}`this lecture <likelihood_ratio_process>` describes **likelihood ratio processes** and their role in frequentist and Bayesian statistical theories
 * {doc}`this lecture <likelihood_bayes>` discusses the role of likelihood ratio processes in **Bayesian learning**
 * {doc}`this lecture <navy_captain>` returns to the subject of this lecture and studies whether the Captain's hunch that the (frequentist) decision rule
-that the Navy had ordered him to use can be expected to be better or worse than the rule sequential rule that Abraham Wald designed
+that the Navy had ordered him to use can be expected to be better or worse than the sequential rule that Abraham Wald designed
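The fixed-point iteration with the operator $Q$ mentioned in this file's changes can be sketched on a grid (an editorial sketch, not the lecture's `solve_model`; the draw cost `c`, losses `L0`, `L1`, and Beta densities are hypothetical stand-ins):

```python
import numpy as np
from scipy.stats import beta

# Hypothetical stand-ins for the lecture's primitives
c, L0, L1 = 1.2, 25.0, 25.0
f0 = lambda z: beta.pdf(z, 1, 1)     # density under one hypothesis
f1 = lambda z: beta.pdf(z, 3, 1.2)   # density under the other

pi_grid = np.linspace(0, 1, 101)
z_grid = np.linspace(1e-4, 1 - 1e-4, 400)   # quadrature nodes on (0, 1)
dz = z_grid[1] - z_grid[0]

def kappa(z, pi):
    """Bayes' Law update of pi after observing z."""
    num = pi * f0(z)
    return num / (num + (1 - pi) * f1(z))

def Q(h):
    """One application of the operator Q to a continuation-value guess h."""
    h_new = np.empty_like(h)
    for i, pi in enumerate(pi_grid):
        pz = pi * f0(z_grid) + (1 - pi) * f1(z_grid)   # predictive density of z'
        kap = kappa(z_grid, pi)                        # updated belief pi'
        h_interp = np.interp(kap, pi_grid, h)
        # min over: accept one hypothesis, accept the other, or continue
        integrand = np.minimum.reduce([(1 - kap) * L0, kap * L1, h_interp]) * pz
        h_new[i] = c + integrand.sum() * dz
    return h_new

# Iterate h <- Q h until (approximately) a fixed point
h = np.zeros_like(pi_grid)
for _ in range(200):
    h_new = Q(h)
    err = np.max(np.abs(h_new - h))
    h = h_new
    if err < 1e-6:
        break
```

At the endpoints of the belief grid the continuation value collapses to the draw cost `c`, since an extreme belief makes one of the stopping losses zero; in the interior, continuing is worth more than one draw's cost.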
