In [1]:
import random

faces = ["H", "T"]

# First Statistical Tests

Now let's focus on finding how often 6 or fewer heads occurs. Let's print information about those extreme results. To do this, we will use an `if` statement inside our simulation loop to check if the number of heads observed is less than or equal to 6 and call a `print` statement if that condition is satisfied. 

In [2]:
num_sims = 100
flips = 20

threshold = 6  # best practice is to put any thresholds outside the simulation loop

for sim in range(num_sims):
    coins = random.choices(faces, k=flips)
    num_heads = coins.count("H")
    if num_heads <= threshold:
        print(sim, ": ", num_heads, "Heads")

4 :  6 Heads
12 :  5 Heads
14 :  5 Heads
69 :  6 Heads
78 :  5 Heads
92 :  4 Heads


We are going to be using variations on this basic simulation with different parameters later. Whenever you anticipate doing this, it is a good idea to turn the code block into a function. This is very easy in Jupyter:
1. Copy the last version of the simulation and paste it into a new code cell
2. Select the entire code block in the new input cell, using the mouse or Command-A (Mac) or Control-A (Windows/Linux)
3. Indent the selected block by pressing Command-] (Mac) or Control-] (Windows/Linux)
4. Click at the very left edge of the first line of the cell and type `def coinsim_print():` and press Enter
5. Now finish populating the function by cutting the parameters from the code block and pasting them into the function signature as parameters with default values. Put the parameters in the order shown
6. Finally, add a docstring that explains what the function does

In [3]:
def coinsim_print(num_sims=100, threshold=6, flips=20):
    """
    Simulate multiple experiments, where each experiment involves flipping a coin
    a specified number of times and printing the results of the experiment if the
    number of heads observed is <= a threshold
    
    Inputs:
    num_sims: the total number of experiments to simulate
    threshold: will print if the number of heads observed is <= this value
    flips: the number of coin flips in one experiment
    """

    for sim in range(num_sims):
        coins = random.choices(faces, k=flips)
        num_heads = coins.count("H")
        if num_heads <= threshold:
            print(sim, ": ", num_heads, "Heads")

In [4]:
coinsim_print()

8 :  6 Heads
9 :  6 Heads
12 :  5 Heads
14 :  5 Heads
60 :  6 Heads
70 :  6 Heads
74 :  6 Heads
75 :  6 Heads
81 :  5 Heads
91 :  6 Heads


We really don't care about the particular experiments on which those extreme results occur. Instead, our goal is to estimate the probability of seeing a result that satisfies our threshold condition, and this probability can be estimated using the relative frequency of those results: the number of experiments in which such a result occurs divided by the total number of experiments simulated.

Let's create a new function that calculates the relative frequency of getting 6 or fewer heads on 20 flips of a fair coin. The primary changes are to:
1. Add an "event counter" to count how many times we see an experimental result that matches our criterion
2. Instead of printing information about the experiments that meet the criterion, increment the event counter
3. After the simulation loop is completed, calculate and print the relative frequency

```{warning}
Note that we give the new function a different name. It is possible to reuse the same function name, but this will produce ambiguities in Jupyter. When you call the function, the function definition that is used is the last one to be run. You can go back and rerun cells in Jupyter, which can then result in you not knowing which version of a function you are running. 

**Best practice:** Do not reuse function names unless you are completely deleting the previous function definition.
```
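
As a minimal illustration of why reusing names is risky, the sketch below defines a hypothetical function `report` twice; the name and the returned messages are invented for this example only. Python silently rebinds the name, so any later call uses whichever definition was run most recently.

```python
def report():
    """First version (hypothetical example)."""
    return "version 1: prints each extreme experiment"


first = report()  # uses the first definition


def report():  # same name: silently replaces the first definition
    """Second version (hypothetical example)."""
    return "version 2: prints the relative frequency"


second = report()  # now uses the second definition

print(first)
print(second)
```

In a notebook, going back and rerunning the cell that contains the first definition would make `report()` return the first message again, even though that cell appears earlier in the notebook.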

In [5]:
num_sims = 1000
flips = 20

threshold = 6

event_count = 0  # count how many experiments satisfy the given criteria
for sim in range(num_sims):
    coins = random.choices(faces, k=flips)
    num_heads = coins.count("H")
    if num_heads <= threshold:
        event_count += 1

print("Relative frequency of", threshold, "or fewer heads is", event_count / num_sims)

Relative frequency of 6 or fewer heads is 0.056


In [6]:
def coinsim(num_sims=1000, threshold=6, flips=20):
    """
    Simulate multiple experiments, where each experiment involves flipping a coin
    a specified number of times, and print the relative frequency of experiments
    in which the number of heads observed is <= a threshold

    Inputs:
    num_sims: the total number of experiments to simulate
    threshold: an experiment is counted if the number of heads observed is <= this value
    flips: the number of coin flips in one experiment
    """

    event_count = 0  # count how many experiments satisfy the given criteria
    for sim in range(num_sims):
        coins = random.choices(faces, k=flips)
        num_heads = coins.count("H")
        if num_heads <= threshold:
            event_count += 1

    print(
        "Relative frequency of", threshold, "or fewer heads is", event_count / num_sims
    )

In [7]:
coinsim()

Relative frequency of 6 or fewer heads is 0.073


Note that the relative frequency can change when the simulation is rerun. How much it changes depends on the experiment, the criterion that defines the result we are looking for, and the number of experiments simulated. To provide an accurate estimate of the probability, the number of experiments simulated should be sufficiently large that the relative frequency does not change significantly when the simulation is rerun. (For this experiment with a threshold of 6 heads, one million simulation experiments is sufficient.)
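
One way to see this variability is to repeat the whole simulation several times for different numbers of experiments and compare the spread of the results. The sketch below is only an illustration: `coinsim_estimate` is a hypothetical variant of `coinsim` that returns the relative frequency instead of printing it, and the seed, the number of repeats, and the list of simulation sizes are arbitrary choices.

```python
import random

faces = ["H", "T"]


def coinsim_estimate(num_sims=1000, threshold=6, flips=20):
    """Return (rather than print) the relative frequency of <= threshold heads."""
    event_count = 0
    for sim in range(num_sims):
        coins = random.choices(faces, k=flips)
        if coins.count("H") <= threshold:
            event_count += 1
    return event_count / num_sims


random.seed(0)  # arbitrary seed, only so that rerunning this cell gives the same numbers

repeats = 5
results = {}  # maps number of experiments -> list of repeated estimates
for num_sims in [100, 1_000, 20_000]:
    estimates = [coinsim_estimate(num_sims) for _ in range(repeats)]
    results[num_sims] = estimates
    spread = max(estimates) - min(estimates)
    print(num_sims, "experiments:", estimates, "spread:", round(spread, 4))
```

The spread between repeated estimates should typically shrink as the number of experiments grows, which is the behavior the paragraph above relies on.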

In [8]:
coinsim(1_000_000)

Relative frequency of 6 or fewer heads is 0.057744


**So could the 6 heads be reasonable with a fair coin?** 

With 1,000,000 experiments in the simulation, the relative frequency is approximately 0.058. That means that with a fair coin, we will see 6 or fewer heads about 6% of the time. Since we previously decided to use the criterion that the probability must be less than 0.01, we cannot reject the possibility that the coin is fair.

**If we got 4 heads, could that be reasonable with a fair coin?**

In [None]:
coinsim(1_000_000, threshold=4)

In this case, the estimate of the probability is about $6 \times 10^{-3}$, which falls below the 0.01 threshold we selected. So, we would reject the possibility that the coin is fair. 

We would believe that the coin is biased against heads (that is, towards tails), but at this point we have no estimate of the size of that bias.
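
For this particular experiment, both simulation estimates can be cross-checked against an exact calculation, although this goes beyond what the section has covered so far. For a fair coin, each of the $2^{20}$ possible sequences of 20 flips is equally likely, and `math.comb(20, k)` counts the sequences with exactly $k$ heads. The helper name `prob_at_most` is introduced here only for illustration.

```python
from math import comb


def prob_at_most(threshold, flips=20):
    """Exact probability of <= threshold heads in `flips` tosses of a fair coin."""
    # Count the sequences with 0, 1, ..., threshold heads
    favorable = sum(comb(flips, k) for k in range(threshold + 1))
    # Divide by the total number of equally likely sequences
    return favorable / 2**flips


p6 = prob_at_most(6)
p4 = prob_at_most(4)
print("P(6 or fewer heads) =", p6)
print("P(4 or fewer heads) =", p4)
```

The exact values are roughly 0.0577 and 0.0059, in line with the relative frequencies from the one-million-experiment simulations.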

```{index} simulation; to estimate a probability
```

## Basic Simulation to Estimate a Probability 


Let's use $R$ to denote some result for which we are trying to estimate the probability via computer simulation. As before, we will actually calculate the relative frequency of $R$ and use that as an estimate of the probability of $R$. 

Then the basic simulation structure is as follows:
1. Initialize two counters to zero:
    * an event counter, $N_R=0$, and 
    * a loop counter, $i=0$; in Python, the loop counter can be implicitly initialized and tracked using a `for ... in range()` statement
    
1. Simulate the outcome of the experiment
1. If $R$ occurred, increment the event counter: $N_R=N_R+1$
1. Increment the loop counter: $i=i+1$
1. If $i$ matches the target number of iterations, then calculate and print the relative frequency; otherwise go to step 2.

**We will be using variations on this basic computer simulation structure throughout this course!**
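
The steps above can be sketched as one generic Python function. This is a minimal sketch rather than a definitive implementation: the names `estimate_probability`, `simulate`, and `occurred` are hypothetical, and the function returns the relative frequency instead of printing it so that the value can be reused.

```python
import random


def estimate_probability(simulate, occurred, num_sims=100_000):
    """Estimate the probability of R as its relative frequency.

    simulate: function with no arguments that returns one experiment's outcome
    occurred: function that takes an outcome and returns True if R occurred
    num_sims: the total number of experiments to simulate
    """
    event_count = 0  # step 1: event counter N_R
    for i in range(num_sims):  # range() initializes and increments the loop counter i
        outcome = simulate()  # step 2: simulate the experiment
        if occurred(outcome):  # step 3: check whether R occurred
            event_count += 1
    return event_count / num_sims  # step 5: relative frequency of R


# Example: R is "6 or fewer heads in 20 flips of a fair coin"
def flip_20():
    return random.choices(["H", "T"], k=20)


def six_or_fewer_heads(coins):
    return coins.count("H") <= 6


estimate = estimate_probability(flip_20, six_or_fewer_heads)
print("Relative frequency:", estimate)
```

Passing the experiment and the criterion as functions keeps the counting loop unchanged when either one is swapped out.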