# Problems 2,3,4,5

## Laplace's inference

Laplace noted that, in Paris between 1745 and 1770, the numbers of female and male births were 241,945 and 251,527, respectively.

Let $x = 241945$ and $n=241945+251527 =493472$, noting that $\frac{x}{n} \approx .4903$, so that about 49% of the observed births were female.
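The arithmetic above can be checked in a couple of lines. This is only a sketch; the variable names `males` and `proportion_female` are illustrative and do not come from the course notebook.

```python
# Birth counts for Paris, 1745-1770, as quoted in the text above.
x = 241945          # female births (the "successes")
males = 251527      # male births
n = x + males       # total number of observed births

# Proportion of observed births that were female.
proportion_female = x / n
print(n, round(proportion_female, 4))
```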

Laplace applied Bayes' Theorem and concluded that, given that one observed $x$-many female births (successes) in $n$-many (ostensibly) independent Bernoulli trials with success probability $\theta$, one could infer that the probability that $\theta>\frac{1}{2}$ was such an

> excessively small number, that one could regard it as morally certain that the differences observed in Paris between the births of males and females is due to a much greater possibility of a male birth ({cite}`Laplace1778-km` p. 431; cited in {cite}`Stigler1990-bs` p. 134).

Many of the below problems, particularly the mathematical parts of them, are mentioned and discussed in a standard text on Bayesianism, namely {cite}`Gelman2013-yi` pp. 31-32, 56, which is following {cite}`Stigler1990-bs` as far as the historical discussion goes.  Problems 2,3,4,5 are my attempt to make this kind of problem more accessible to philosophers.

*Note 1*: if you don't like the female-male example, replace it with the number of successes in some other kind of trial.

*Note 2*: this example was historically one of the first applications of Bayes' Theorem. From a cursory glance at [summaries based on more recent studies](https://ourworldindata.org/gender-ratio#the-sex-ratio-at-birth), Laplace's conclusion still holds up.

## Problem 2

Here is the formal setup that we work with:

- Suppose that $n,m\geq 1$ are fixed.

- Suppose that $X\sim p_{\theta}$, where $p_{\theta}$ is $\mathrm{Binom}(n,\theta)$. 

- Suppose that $\Theta= \{\frac{1}{m+1}, \ldots, \frac{m}{m+1}\}$, so that it has $m$ elements.

- Suppose that the prior is uniform, that is, $p(\theta)=\frac{1}{m}$.

To make it a little more concrete, consider the following special case:

- Suppose that $m=3$ and $n\geq 1$ is fixed.

- Then $\Theta = \{\frac{1}{4}, \frac{1}{2}, \frac{3}{4}\}$, where $p_{\frac{1}{4}}$ is $\mathrm{Binom}(n,\frac{1}{4})$, $p_{\frac{1}{2}}$ is $\mathrm{Binom}(n,\frac{1}{2})$, and $p_{\frac{3}{4}}$ is $\mathrm{Binom}(n,\frac{3}{4})$.

- Hence, counting heads as successes, we are dealing with $n$ flips of either a coin biased towards tails, a fair coin, or a coin equally biased towards heads.

- In general, when $m$ is odd, the fair coin will be in the parameter space, and will be depicted with its pdf centered in a bell-like shape at $\frac{n}{2}$.
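The setup can be sketched in a few lines of Python. The helper names `parameter_space`, `uniform_prior`, and `binom_pmf` are hypothetical and are not taken from the course notebook; exact fractions are used so that no rounding is involved.

```python
from fractions import Fraction
from math import comb

def parameter_space(m):
    """Return Theta = {1/(m+1), ..., m/(m+1)} as exact fractions."""
    return [Fraction(k, m + 1) for k in range(1, m + 1)]

def uniform_prior(m):
    """Uniform prior: each of the m parameters gets probability 1/m."""
    return {theta: Fraction(1, m) for theta in parameter_space(m)}

def binom_pmf(x, n, theta):
    """Probability of x successes in n trials under Binom(n, theta)."""
    return comb(n, x) * theta**x * (1 - theta)**(n - x)

# The concrete case m = 3: the fair coin 1/2 sits in the middle of Theta.
thetas = parameter_space(3)
prior = uniform_prior(3)
print(thetas, sum(prior.values()))
```

Calling `binom_pmf(x, n, theta)` for each `x` in `range(n + 1)` gives the values that a plot of $p_{\theta}$ would display.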

This problem has two parts. 

**First**, show that, since the prior is uniform, one can derive the following simplified version of Bayes' Theorem:

$$p(\theta \mid x) = \frac{p(x\mid \theta)}{\sum_{\theta\in \Theta} p(x\mid \theta)}$$

Your answer should just be a proof with 1-2 lines.
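Before writing the proof, it may help to see the identity hold numerically. The following is a sanity check on one small example, not a substitute for the proof; the function names are hypothetical and the values $m=3$, $n=10$, $x=4$ are chosen only for illustration.

```python
from fractions import Fraction
from math import comb

def binom_pmf(x, n, theta):
    """Probability of x successes in n trials under Binom(n, theta)."""
    return comb(n, x) * theta**x * (1 - theta)**(n - x)

def posterior_full(x, n, thetas, prior):
    """Bayes' Theorem with an explicit prior p(theta)."""
    evidence = sum(prior[t] * binom_pmf(x, n, t) for t in thetas)
    return {t: prior[t] * binom_pmf(x, n, t) / evidence for t in thetas}

def posterior_simplified(x, n, thetas):
    """Simplified form: the likelihood divided by the sum of the likelihoods."""
    total = sum(binom_pmf(x, n, t) for t in thetas)
    return {t: binom_pmf(x, n, t) / total for t in thetas}

m, n, x = 3, 10, 4
thetas = [Fraction(k, m + 1) for k in range(1, m + 1)]
prior = {t: Fraction(1, m) for t in thetas}

# With exact fractions the two computations can be compared for equality.
agree = posterior_full(x, n, thetas, prior) == posterior_simplified(x, n, thetas)
print(agree)
```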

Laplace described this version of Bayes' Theorem as follows:

> If an event can be produced by a number of different causes, then the probabilities of these causes  given the event  are to each other as the probabilities of the event  given the causes , and the probability of the existence of each of these is equal to the probability of the event given the cause, divided by sum of each of these causes ({cite}`Laplace1774-st`, translation from {cite}`Stigler1990-bs` p. 102).

**Second**, rewrite this quotation word-for-word but include the mathematical expressions $\theta$, $\Theta$, $x$, $p(x\mid \theta)$, $p(\theta\mid x)$, and $\sum_{\theta\in \Theta} p(x\mid \theta)$ where appropriate. The aim is simply to make sure that we see the correspondence between the mathematical formalism of Bayes' Theorem (in the case of the uniform prior) and what its original developers thought it meant: they were clearly thinking of the elements of the parameter space as partially indicative of causes which reveal their effects through the sample space.



## Before proceeding further

Before proceeding further, please follow these instructions:

1. Click on the <i class="fa fa-rocket" aria-hidden="true"></i> icon at the top center-right, which will launch the page in a binder (experience suggests that Firefox and Chrome work best)

2. After it loads (it takes about 1-2 minutes), use the menu bars to access:

- Run | Run All Cells


## Problem 3

The below graph to the left shows two pieces of information:

- For each element $\theta$ of the parameter space, it displays its pdf, drawn in different colored lines. If a colored line is associated to $\theta$, then for a value $x$ on the $x$-axis, the value of the colored line answers the question: if the random variable were distributed according to $p_{\theta}$, then how probable is it that the random variable would take value $x$? Since we are dealing with binomial distributions $\mathrm{Binom}(n,\theta)$, the possible values of $x$ are $0,1,2,\ldots, n$, and the probability should be clustered in a bell-like shape around $n\theta$.

- The dotted purple line displays the evidence, that is $p(x)=\sum_{\theta\in \Theta} p(\theta)\cdot p(x\mid \theta)$.
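The evidence can be computed directly from its definition as the prior-weighted sum of the likelihoods. The sketch below assumes the uniform prior $p(\theta)=\frac{1}{m}$ from Problem 2; the function name `evidence` is hypothetical and is not the notebook's own code.

```python
from fractions import Fraction
from math import comb

def binom_pmf(x, n, theta):
    """Probability of x successes in n trials under Binom(n, theta)."""
    return comb(n, x) * theta**x * (1 - theta)**(n - x)

def evidence(x, n, m):
    """p(x) = sum over theta of p(theta) * p(x | theta), with p(theta) = 1/m."""
    thetas = [Fraction(k, m + 1) for k in range(1, m + 1)]
    return sum(Fraction(1, m) * binom_pmf(x, n, t) for t in thetas)

# One value of the evidence for each possible x = 0, 1, ..., n.
n, m = 10, 3
evidence_values = [evidence(x, n, m) for x in range(n + 1)]
print(sum(evidence_values))
```

Since the evidence is itself a probability distribution over $x$, its values sum to one, which is a quick way to check a computation like this.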

The graph in the center displays three pieces of interlinked information:

- For each $x$ on the $x$-axis, it displays immediately above it the values of the posterior $p(\theta\mid x)$, as $\theta$ ranges over the parameter space. Note that the values right above each $x$ thus sum to one. However, when one focuses on a fixed colored line associated to a specific parameter $\theta$ and follows it from left to right as the values of $x$ get bigger, one is considering how the posterior $p(\theta\mid x)$ changes as a function of $x$.

- It allows you to select a specific value of $x$ and focus on what is directly above it, marked with a dotted black line. This allows one to home in on the posterior $p(\theta\mid x)$ as $x$ is fixed and $\theta$ is allowed to vary.

- For the value of $x$ which you chose, it displays the top two highest values of $p(\theta\mid x)$ by building a horizontal line that takes one from this value over to the $y$-axis, so you can quickly tell by looking what these top two values are. The top value is in gold, and the second top value is in silver.
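What the center graph reports for a fixed $x$ can be sketched as follows: compute the posterior over the whole parameter space, then pick out its two largest values. The names `posterior` and `top_two` are hypothetical, the sketch assumes the uniform prior and $m\geq 2$, and it does not reproduce the notebook's plotting.

```python
from fractions import Fraction
from math import comb

def binom_pmf(x, n, theta):
    """Probability of x successes in n trials under Binom(n, theta)."""
    return comb(n, x) * theta**x * (1 - theta)**(n - x)

def posterior(x, n, m):
    """Posterior p(theta | x) under the uniform prior, keyed by theta."""
    thetas = [Fraction(k, m + 1) for k in range(1, m + 1)]
    total = sum(binom_pmf(x, n, t) for t in thetas)
    return {t: binom_pmf(x, n, t) / total for t in thetas}

def top_two(x, n, m):
    """Return the (theta, posterior) pairs with the two highest posteriors."""
    ranked = sorted(posterior(x, n, m).items(), key=lambda item: item[1], reverse=True)
    return ranked[0], ranked[1]

# gold is the highest posterior value at this x, silver the second highest.
gold, silver = top_two(x=4, n=10, m=3)
print(gold, silver)
```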

The graph to the right displays one piece of information:

- In addition to the $x$ that you already chose, it allows you to vary the choice of the parameter $\iota=\frac{i+1}{m+1}$ for $i=0, \ldots, m-1$, by varying $i$: as you choose $i$ closer to $m$ you are choosing a parameter with a higher bias towards heads. The graph then displays the value $\sum_{\theta> \iota} p(\theta\mid x)$, which is the probability, conditional on having observed that the random variable took value $x$, that the parameter is $>\iota$, which we hence abbreviate as $Pr(\theta> \iota\mid x)$.
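The quantity $Pr(\theta>\iota\mid x)$ is just a partial sum of the posterior. A minimal sketch, assuming the uniform prior and using the hypothetical names `posterior` and `tail_probability` (the argument `i` is the index from which $\iota=\frac{i+1}{m+1}$ is built):

```python
from fractions import Fraction
from math import comb

def binom_pmf(x, n, theta):
    """Probability of x successes in n trials under Binom(n, theta)."""
    return comb(n, x) * theta**x * (1 - theta)**(n - x)

def posterior(x, n, m):
    """Posterior p(theta | x) under the uniform prior, keyed by theta."""
    thetas = [Fraction(k, m + 1) for k in range(1, m + 1)]
    total = sum(binom_pmf(x, n, t) for t in thetas)
    return {t: binom_pmf(x, n, t) / total for t in thetas}

def tail_probability(x, n, m, i):
    """Pr(theta > iota | x), where iota = (i + 1) / (m + 1)."""
    iota = Fraction(i + 1, m + 1)
    return sum(p for theta, p in posterior(x, n, m).items() if theta > iota)

# With m = 3 and i = 1 the threshold iota is 1/2, so only theta = 3/4 counts.
tail = tail_probability(x=4, n=10, m=3, i=1)
print(tail)
```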

```
# Import everything from the backing notebook Problem02back.
from ipynb.fs.full.Problem02back import *

# Build the interactive graphs, driven by the sliders for n, m, x, and ι.
interact(bayes_binomials_uniform_prior, n=n_slider, m=m_slider, x=x_slider, ι=ι_slider)
```

Answer the following questions in 1-2 complete English sentences each:

1. Why does increasing the value of $n$ make the bell-like shapes of the posterior change into more rectangle-like shapes? 

2. Why does increasing the value of $m$ lower the posterior? 

3. For small values of $n,m$ (say $n=100$ and $m=9$), as one increases the value of $x$ the difference between the gold and silver horizontal lines fluctuates. Why is this? 

(Note: you will only be able to change the values once you have moved to the binder).

## Problem 4

For each of the following three settings, write down $Pr(\theta>\iota\mid x)$ and $Binom(n,\theta)$ from the rightmost graph:

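1. $n=500$, $m=9$, $x=244$ (approximately 49% of 500), and $\iota = 4$ on the slider (that is, $i=4$, so that the threshold is $\frac{5}{10}=\frac{1}{2}$).

2. $n=5000$, $m=9$, $x=2450$ (approximately 49% of 5000), and $\iota = 4$ on the slider (again the threshold $\frac{1}{2}$).

3. $n=5000$, $m=99$, $x=2450$ (approximately 49% of 5000), and $\iota = 49$ on the slider (that is, $i=49$, so that the threshold is $\frac{50}{100}=\frac{1}{2}$).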

In a paragraph (4-5 sentences), describe whether this on the whole supports Laplace's conclusion or not.

*Note 1*: we used $n=5000$ as our maximum because, if we tried Laplace's actual number, our machine would take about 5-10 minutes to do it. If you want to try, just type the following in a new cell, press Shift+Return, and wait 5-10 minutes (note that if you do this, the above graph is going to become non-responsive):

```
bayes_binomials_uniform_prior(241945+251527, 99, 241945, 49)
```
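If one only wants the number $Pr(\theta>\iota\mid x)$ and not the graphs, a possible alternative is to work with logarithms, which avoids computing enormous binomial coefficients. The following is a sketch under that assumption and is not the notebook's `bayes_binomials_uniform_prior`; the names `log_binom_pmf`, `posterior_log_space`, and `tail_probability` are hypothetical, and parameters are indexed by their numerator $k$, so that $\theta=\frac{k}{m+1}$.

```python
from math import exp, lgamma, log

def log_binom_pmf(x, n, theta):
    """Log of the Binom(n, theta) probability of x, via the log-gamma function."""
    log_comb = lgamma(n + 1) - lgamma(x + 1) - lgamma(n - x + 1)
    return log_comb + x * log(theta) + (n - x) * log(1 - theta)

def posterior_log_space(x, n, m):
    """Posterior under the uniform prior, keyed by k where theta = k / (m + 1)."""
    logs = {k: log_binom_pmf(x, n, k / (m + 1)) for k in range(1, m + 1)}
    peak = max(logs.values())
    # Subtract the largest log-likelihood before exponentiating, so that the
    # biggest weight is exactly 1 and the others underflow harmlessly to 0.
    weights = {k: exp(v - peak) for k, v in logs.items()}
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}

def tail_probability(x, n, m, i):
    """Pr(theta > iota | x) with iota = (i + 1) / (m + 1)."""
    post = posterior_log_space(x, n, m)
    return sum(p for k, p in post.items() if k > i + 1)

# Laplace's numbers, with m = 99 and threshold iota = 50/100 = 1/2.
post = posterior_log_space(241945, 241945 + 251527, 99)
laplace_tail = tail_probability(241945, 241945 + 251527, 99, 49)
print(max(post, key=post.get), laplace_tail)
```

Because this evaluates only $m$ likelihoods rather than drawing every curve, it should run almost instantly; the printed tail probability is positive but extremely small, and nearly all of the posterior sits on $\theta=\frac{49}{100}$.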

*Note 2*: For all these numbers $n,m,x,\iota$, the toggles might only allow you to get within a small range of these values; don't worry about that, just get as close as you can.

*Note 3*: note that

- the value $Pr(\theta>\iota\mid x)$ is the place where the red line meets the $y$-axis; just eyeball it.

- the values of $n,\theta$ are displayed in the title $\iota \sim Binom(n,\theta)$.


## Problem 5

Laplace's inference can be compactly expressed as follows:

1. Since 49% of observed births in a large sample are female, it is morally certain that there is a much greater possibility of male birth.

Many of our inferences are contrastive in nature. Consider the following variations:

2. Since 49% of observed births in a large sample are female, it is significantly more plausible that there is a much greater possibility of male birth than that our data was generated by a fluke of measurement.

3. Since 49% of observed births in a large sample are female, it is significantly more plausible that there is a much greater possibility of male birth than that there is a much greater possibility of female birth.

4. Since 49% of *observed past* births in a large sample are female, it is significantly plausible that 49% of *unobserved future* births are female.

In a paragraph (4-5 sentences), indicate whether you think that any of 2, 3, or 4 (and if so which one) is a good representation of Laplace's inference in 1.