In [None]:
# Initialize Otter
import otter
grader = otter.Notebook("pre07.ipynb")

<table style="width: 100%;">
<tr style="background-color: transparent;">
<td width="100px"><img src="https://cs104williams.github.io/assets/cs104-logo.png" width="90px" style="text-align: center"/></td>
<td>
  <p style="margin-bottom: 0px; text-align: left; font-size: 18pt;"><strong>CSCI 104: Data Science and Computing for All</strong><br>
                Williams College<br>
                Fall 2025</p>
</td>
</tr>
</table>


# Prelab 7: Hypothesis Testing and P-values

**Instructions**
- Before you begin, execute the cell at the TOP of the notebook to load the provided tests, as well as the following cell to set up the notebook by importing some helpful libraries. Each time you start your server, you will need to execute these cells again.
- Be sure to consult your [Python Reference](https://cs104williams.github.io/assets/python-library-ref.html)!
- Complete this notebook by filling in the cells provided. 
- Please be sure not to reassign variables throughout the notebook.  For example, if you use `max_temperature` in your answer to one question, do not reassign it later on. Otherwise, you will fail tests that you thought you were passing previously.
- There are no hidden tests in prelabs.

<hr/>
<h2>Setup</h2>


In [None]:
# Run this cell to set up the notebook.
# These lines import the numpy, datascience, and cs104 libraries.

import numpy as np
from datascience import *
from cs104 import *
%matplotlib inline

<hr style="margin-bottom: 0px; padding:0; border: 2px solid #500082;"/>


## 1. Incubating green sea turtle eggs (20 pts)



<font color='#B1008E'>
    
##### Learning objectives
- Construct an inference problem using a null and alternative hypothesis.
- Use `simulate_sample_statistic` to evaluate the model for the null hypothesis.
- Determine whether observed data is consistent with the null model.
    
</font>

Green sea turtles exhibit temperature-dependent sex determination.  That is, the temperature at which their eggs are incubated affects the sex of the hatchlings.  Incubating at 29.3° Celsius leads to a 50% chance that each hatchling is female, but higher temperatures increase the chance of a hatchling being female.  Even an increase of a couple of degrees can cause the vast majority of eggs to hatch as females.

**The scenario:** Steve and Katie put 100 green sea turtle eggs in an incubator set to 29.3°.
When the eggs hatch, they discover they have 67 females and 33 males.
They are suspicious that their incubator's temperature sensor is not accurate.  Should they send it out for repairs?

**The Null and Alt hypotheses:**  Let's decide whether the incubator is broken via hypothesis testing.  We begin with the following two hypotheses:
* **Null hypothesis**: The incubator's temperature control properly maintains 29.3°, and each hatchling has a 50% chance of being female.  
* **Alt. hypothesis**: The incubator's temperature control is broken and does not maintain 29.3°, and each hatchling does not have a 50% chance of being female.

**Test Statistic:**  Under the null hypothesis, we would expect 50% of our incubated hatchlings to be female, with any variation due to random chance.  We'll design a statistic to measure how close a sample is to that expectation:

    abs(percent-female-in-sample - 50%)
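
As a rough illustration of how this statistic behaves, here is a minimal plain-Python sketch. The function name `abs_percent_difference` and the percentages are hypothetical; they are not part of the `datascience` or `cs104` libraries, and they are not the data used in this prelab.

```python
def abs_percent_difference(percent_in_sample, expected_percent=50):
    """Distance, in percentage points, between a sample percent and the expected percent."""
    return abs(percent_in_sample - expected_percent)

# Hypothetical broods: one with too few females, one with too many.
print(abs_percent_difference(44))  # 6
print(abs_percent_difference(56))  # 6
```

Both hypothetical broods get the same value, because the absolute value ignores the direction of the deviation. That fits an alternative hypothesis that only says the chance is not 50%, without saying whether it is higher or lower.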

#### Part 1.1 Sampling under null hypothesis (5 pts)


We begin the model for the null hypothesis by creating an array containing the proportions of female and male hatchlings incubated at 29.3°.

In [None]:
hatchling_proportions = make_array(0.5, 0.5)

Write the following function `sample_hatchlings_under_null` to simulate samples from the null hypothesis. 

In this function, use `sample_proportions` to create a sample of a given size using the `hatchling_proportions` (the null hypothesis proportions). The output of this function is an array with two items: the **percent** of females in the sample and the **percent** of males in the sample.
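
For intuition about what it means to draw a sample from a set of category proportions, the sketch below uses only Python's standard library. It is an illustration under assumptions, not the `datascience` implementation: `draw_percents` is a hypothetical helper, and the seed is fixed only to make the sketch repeatable.

```python
import random

def draw_percents(sample_size, proportions, rng):
    """Draw sample_size items from categories with the given proportions
    and return the percent of the sample that landed in each category."""
    categories = range(len(proportions))
    draws = rng.choices(categories, weights=proportions, k=sample_size)
    return [100 * draws.count(c) / sample_size for c in categories]

rng = random.Random(104)  # fixed seed, an arbitrary choice for this sketch
percents = draw_percents(100, [0.5, 0.5], rng)
print(percents)
```

The two percents always sum to 100, but they vary from one draw to the next. That variation is the random chance the null hypothesis describes.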

In [None]:
def sample_hatchlings_under_null(sample_size):
    """
    Returns a sample of size sample_size as 
    an array of [% female, % male].
    """
    ...

# One sample with sample size 100 
# Run this a few times to verify that your samples have some variability.
# If they are all [50,50], there is something wrong!
sample_hatchlings_under_null(100)

In [None]:
grader.check("p1.1")

#### Part 1.2 Implement the test statistic (5 pts)


Now, implement the function `abs_difference_from_null_parameter`. This function returns a statistic that measures how far a sample's female percentage is from the null model's parameter, which says that 50% of hatchlings will be female.

In [None]:
def abs_difference_from_null_parameter(sample):
    """
    Takes a sample as an array [% female, % male] and returns
    the absolute difference between % female in the sample and 50%
    """    
    ...

# Should be 10:
abs_difference_from_null_parameter(make_array(40,60))

In [None]:
grader.check("p1.2")

#### Part 1.3 Statistic for observed sample (5 pts)


Use the function you just wrote to calculate this statistic for Steve and Katie's observed brood of 67% female and 33% male.

In [None]:
steve_and_katie_brood_percents = make_array(67, 33)
steve_and_katie_brood_statistic = ...

steve_and_katie_brood_statistic

In [None]:
grader.check("p1.3")

#### Part 1.4 Favor null or alt hypothesis? (5 pts)


We are now ready to simulate the null hypothesis and evaluate whether Steve and Katie's brood is consistent with its assumptions.  The following code does that using our two helper functions, a sample size of 100, and 10,000 trials.

In [None]:
hatchling_statistics = simulate_sample_statistic(sample_hatchlings_under_null, 100, abs_difference_from_null_parameter, 10000)

plot = Table().with_columns('Statistic: abs(percent female - 50)', hatchling_statistics).hist()
plot.set_title('Null hypothesis empirical distribution')

# A red dot for Steve and Katie's observed brood.
plot.dot(steve_and_katie_brood_statistic)

Set `reject_null` to `True` or `False` to indicate whether we can reject the null hypothesis based on Steve and Katie's brood.  In this case, the answer should be obvious from the plot, without using p-values.

In [None]:
reject_null = ...

In [None]:
grader.check("p1.4")

<hr style="margin-bottom: 0px; padding:0; border: 2px solid #500082;"/>


## 2. Calculating p-values (30 pts)



<font color='#B1008E'>
    
##### Learning objectives
- Build intuition for a p-value 
- Implement the function that calculates a p-value.
    
</font>

#### Part 2.1 Definitions (5 pts)


Assign the variable `answer` below to the integer that corresponds to the **true** statement. 

1. A **p-value** is the probability that the null hypothesis is true.

2. A **p-value** is the probability that the alternative is true. 

3. A **p-value** is the probability under the null hypothesis of obtaining a statistic at least as extreme as the observed statistic. 

4. A **p-value** is the probability the observed statistic is produced by random chance alone. 


In [None]:
answer = ...

In [None]:
grader.check("p2.1")

#### Part 2.2 Intuition of "extreme" values (5 pts)


Here, we present a tiny subset of data from a sea turtle egg simulation (similar to the large simulation we ran above).  

This data reflects that:
- We simulated statistics from the null hypothesis for ten trials.  These are recorded in `tiny_simulated_statistics`.  
- In this new observed sample, Steve and Katie observed 60% females. Since the null hypothesis parameter is 50% females, this gives a statistic of 10 (recorded in `tiny_observed_statistic`). 
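
A tiny set of simulated statistics like this could be produced along the following lines. This is a standard-library sketch under assumptions, not the `simulate_sample_statistic` function from `cs104`: the helper `simulate_abs_differences` and the seed are hypothetical, and the values it prints will not match the array used in this part.

```python
import random

def simulate_abs_differences(num_trials, sample_size, rng):
    """Collect abs(percent female - 50) for num_trials simulated broods,
    where each hatchling is female with probability 0.5."""
    statistics = []
    for _ in range(num_trials):
        # Count the females in one simulated brood.
        females = sum(rng.random() < 0.5 for _ in range(sample_size))
        percent_female = 100 * females / sample_size
        statistics.append(abs(percent_female - 50))
    return statistics

rng = random.Random(7)  # fixed seed, an arbitrary choice for this sketch
ten_statistics = simulate_abs_differences(10, 100, rng)
print(sorted(ten_statistics))
```

With only ten trials the picture of the null distribution is very coarse, which is why the earlier simulation used 10,000 trials.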

In [None]:
# No need to change anything in this cell, just run
tiny_simulated_statistics = make_array(4, 4, 6, 8, 8, 9, 9, 10, 11, 12) 
tiny_observed_statistic = 10

Without using any code, look at the array above and count how many elements of `tiny_simulated_statistics` are the same as or more extreme than (that is, greater than or equal to) `tiny_observed_statistic`. 

Assign your answer to `count_more_extreme` below.

In [None]:
count_more_extreme = ...

In [None]:
grader.check("p2.2")

#### Part 2.3 Calculating p-values (5 pts)


The p-value is a proportion. Specifically, it is the proportion of `tiny_simulated_statistics` that are the same as or more extreme than `tiny_observed_statistic`.  

Use `count_more_extreme` and the `tiny_simulated_statistics` array to calculate the p-value. 
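
To see the arithmetic on different numbers, here is a small worked sketch with made-up values. The list `simulated` and the value `observed` are hypothetical and unrelated to the arrays in this prelab.

```python
simulated = [1, 2, 2, 3, 5, 7, 8, 8]  # hypothetical null statistics
observed = 7                           # hypothetical observed statistic

# Count the simulated statistics that are at least as large as the observed one.
at_least_as_extreme = sum(1 for s in simulated if s >= observed)

# Divide by the number of simulated statistics to get a proportion.
proportion = at_least_as_extreme / len(simulated)
print(at_least_as_extreme, proportion)  # 3 0.375
```

Three of the eight hypothetical statistics (7, 8, and 8) are at least 7, so the proportion is 3/8.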

In [None]:
p_value = ...
print("The p-value is", p_value)

In [None]:
grader.check("p2.3")

#### Part 2.4 Writing a generic p-value function (5 pts)


Congrats! You calculated your first p-value! Now that you understand what a p-value is, let's use a fundamental principle of computing, *abstraction*. Write a function below, called `empirical_pvalue`,  that calculates a p-value given the following two arguments: 
- `null_statistics`: An array where each item in the array is the statistic for a single sample simulated from the null hypothesis
- `observed_statistic`: A float that is the statistic for the observed data. 

You will use this function many times in this lab and future labs. 

In [None]:
def empirical_pvalue(null_statistics, observed_statistic): 
    """
    Return the proportion of the null statistics that are greater than 
    or equal to the observed statistic.
    """
    ...

In [None]:
# Check that your function gives the same p-value you calculated in Part 2.3
empirical_pvalue(tiny_simulated_statistics, tiny_observed_statistic)

In [None]:
grader.check("p2.4")

#### Part 2.5  P-value for Steve and Katie's brood (5 pts)


Let's now compute the p-value for Steve and Katie's observed brood in Question 1.  We repeat the following code to show the null hypothesis empirical distribution and the observed data:

In [None]:
plot = Table().with_columns('Statistic: abs(percent female - 50)', hatchling_statistics).hist(left_end=steve_and_katie_brood_statistic)
plot.set_title('Null hypothesis empirical distribution')

# A red dot for Steve and Katie's observed brood.
plot.dot(steve_and_katie_brood_statistic)

Using your `empirical_pvalue` function, compute the p-value for `steve_and_katie_brood_statistic`.

In [None]:
p_value = ...
p_value

In [None]:
grader.check("p2.5")

In this case, we can easily reject the null hypothesis with a 5% (or even a 1%) p-value cutoff.

#### Part 2.6  P-value for a different brood (5 pts)


Suppose Steve and Katie ended up with 59 females and 41 males.  The following computes the statistic for this observation and plots it as before.

In [None]:
steve_and_katie_brood_statistic2 = abs_difference_from_null_parameter(make_array(0.59, 0.41) * 100)

plot = Table().with_columns('Statistic: abs(percent female - 50)', hatchling_statistics).hist(left_end=steve_and_katie_brood_statistic2)
plot.set_title('Null hypothesis empirical distribution')

# A red dot for Steve and Katie's observed brood.
plot.dot(steve_and_katie_brood_statistic2)

Using your `empirical_pvalue` function, compute the p-value for `steve_and_katie_brood_statistic2`.

In [None]:
p_value2 = ...
p_value2

In [None]:
grader.check("p2.6")

In this case, we cannot reject the null hypothesis because the p-value is above the conventional 5% cutoff.
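
The decision rule used in the last two parts can be sketched as a comparison between a p-value and a cutoff. The helper `rejects_null` and the p-values below are hypothetical, and treating a p-value exactly equal to the cutoff as "do not reject" is an assumption of this sketch.

```python
def rejects_null(p_value, cutoff=0.05):
    """True when the p-value falls below the chosen cutoff."""
    return p_value < cutoff

# Hypothetical p-values, not the ones computed in this prelab.
print(rejects_null(0.002))        # True
print(rejects_null(0.002, 0.01))  # True
print(rejects_null(0.08))         # False
```

Note that the cutoff should be chosen before looking at the data. Picking it afterward makes it possible to get whichever conclusion one prefers.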

<hr class="m-0" style="border: 3px solid #500082;"/>

# You're Done!
Follow these steps to submit your work:
* Run the tests and verify that they pass as you expect. 
* Choose **Save Notebook** from the **File** menu.
* **Run the final cell** and click the link below to download the zip file. 

Once you have downloaded that file, go to [Gradescope](https://www.gradescope.com/) and submit the zip file to 
the corresponding assignment. For Prelab N, the assignment will be called "Prelab N Autograder".

Once you have submitted, your Gradescope assignment should show you passing all the tests you passed in your assignment notebook.


## Submission

Make sure you have run all cells in your notebook in order before running the cell below, so that all images/graphs appear in the output. The cell below will generate a zip file for you to submit. **Please save before exporting!**

In [None]:
# Save your notebook first, then run this cell to export your submission.
grader.export(pdf=False, run_tests=True)