# TWO MEANINGS OF PROBABILITY

# Content
- a frequentist confidence interval  
- a Bayesian coverage interval  

## Overview

This notebook illustrates two distinct interpretations of a **probability distribution**:
- A frequentist interpretation as **relative frequencies** anticipated to occur in a large i.i.d. sample  
- A Bayesian interpretation as a **personal opinion** (about a parameter or list of parameters) after seeing a collection of observations  

We do this by inviting you to write some Python code.

It would be especially useful if you tried doing this after each question that we pose for you. We provide our own answers as the notebook unfolds, but you’ll learn more if you try writing your own code before reading and running ours.

**Code for answering questions:**

This notebook uses the following libraries:

```python
import numpy as np
import pandas as pd
import prettytable as pt
import matplotlib.pyplot as plt
from scipy.stats import binom
import scipy.stats as st
```

## Frequentist Interpretation

Consider the following classic example. The random variable $ X $ takes on possible values $ k = 0, 1, 2, \ldots, n $ with probabilities

$$
\textrm{Prob}(X =  k | \theta) =
\left(\frac{n!}{k! (n-k)!} \right) \theta^k (1-\theta)^{n-k}
$$

where the fixed parameter $ \theta \in (0,1) $. This is called the **binomial distribution**.

Here
- $ \theta $ is the probability that one toss of a coin will be a head, an outcome that we encode as  $ Y = 1 $.  
- $ 1 -\theta $ is the probability that one toss of the coin will be a tail, an outcome that we denote $ Y = 0 $.  
- $ X $ is the total number of heads that came up after flipping the coin $ n $ times.  
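As a quick check on the formula, here is a minimal sketch that evaluates $ \textrm{Prob}(X = k | \theta) $ directly from the definition. The function name `binomial_pmf` and the values of `n` and `theta` are illustrative assumptions, not part of the original notebook.

```python
from math import comb


def binomial_pmf(k, n, theta):
    """Prob(X = k | theta), computed term by term from the binomial formula."""
    return comb(n, k) * theta**k * (1 - theta) ** (n - k)


# Illustrative values (assumptions, not taken from the text)
n, theta = 10, 0.3

probs = [binomial_pmf(k, n, theta) for k in range(n + 1)]

for k, p in enumerate(probs):
    print(f"k = {k:2d}   Prob(X = k | theta) = {p:.6f}")

# The probabilities over k = 0, ..., n should sum to one
print("sum of probabilities:", sum(probs))
```

The same numbers should be returned by `binom.pmf(k, n, theta)` from `scipy.stats`, which the notebook imports above.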


Consider the following experiment: Take $ I $ **independent** sequences of $ n $  **independent** flips of the coin. Notice the repeated use of the adjective **independent**:
- we use it once to say that we draw $ n $ times independently from a **Bernoulli** distribution with parameter $ \theta $ to arrive at one draw from a **binomial** distribution with parameters $ \theta, n $.  
- we use it again to say that we then draw $ I $ such sequences of $ n $ coin flips independently of one another.  
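The two layers of independence can be sketched with NumPy as follows. This is only one possible way to set up the experiment; the variable names, the seed, and the values of `theta`, `n`, and `I` are assumptions made for illustration.

```python
import numpy as np

# Illustrative values (assumptions, not taken from the text)
theta, n, I = 0.4, 20, 1_000

rng = np.random.default_rng(seed=0)

# An I x n array of Bernoulli(theta) draws:
# row i holds the n independent flips of the i-th sequence (1 = head, 0 = tail)
flips = (rng.random((I, n)) < theta).astype(int)

# Summing each row gives the number of heads in that sequence,
# that is, I independent draws from a binomial distribution with parameters theta, n
heads = flips.sum(axis=1)

# The same distribution can be sampled directly, skipping the individual flips
heads_direct = rng.binomial(n, theta, size=I)

print("shape of flips:", flips.shape)
print("first ten head counts:", heads[:10])
```

Keeping the full `flips` array makes the two uses of "independent" visible: independence across columns within a row, and independence across rows.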


Let $ y_h^i \in \{0, 1\} $ be the realized value of $ Y $ on the $ h $-th flip during the $ i $-th sequence of flips. Let $ \sum_{h=1}^n y_h^i $ denote the total number of times heads come up during the $ i $-th sequence of $ n $ independent coin flips. Let $ f_k^I $ record the fraction of the $ I $ samples of length $ n $ for which $ \sum_{h=1}^n y_h^i = k $:

$$
f_k^I = \frac{\textrm{number of samples of length } n \textrm{ for which } \sum_{h=1}^n y_h^i = k}{I}
$$
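To make the definition concrete, here is a tiny worked example with hand-picked numbers. The head counts below are hypothetical, chosen only so that the arithmetic can be checked by eye; they are not simulated and not taken from the original notebook.

```python
import numpy as np

# Hypothetical head counts from I = 8 sequences of n = 3 flips each
n = 3
heads = np.array([0, 1, 1, 2, 3, 2, 1, 2])
I = len(heads)

# np.bincount tallies how many sequences produced k heads, for k = 0, ..., n;
# minlength keeps an entry for every k even if some value never occurs
counts = np.bincount(heads, minlength=n + 1)

# Relative frequencies f_k^I
f = counts / I

for k in range(n + 1):
    print(f"f_{k}^I = {counts[k]}/{I} = {f[k]:.3f}")
```

By construction the fractions $ f_k^I $ are nonnegative and sum to one across $ k = 0, \ldots, n $, just like the probabilities they are meant to approximate.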

The probability $ \textrm{Prob}(X = k | \theta) $ answers the following question:
- As $ I $ becomes large, in what fraction of $ I $ independent draws of $ n $ coin flips should we anticipate $ k $ heads to occur?  


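As usual, a law of large numbers justifies this answer: as $ I \rightarrow \infty $, the fraction $ f_k^I $ converges to $ \textrm{Prob}(X = k | \theta) $.

## Exercise 10.1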

1. Please write a Python class to compute $ f_k^I $  
1. Please use your code to compute $ f_k^I, k = 0, \ldots , n $ and compare them to
  $ \textrm{Prob}(X =  k | \theta) $ for various values of $ \theta, n $ and $ I $  
1. With the law of large numbers in mind, use your code to say something  