In [None]:
# Initialize Otter
import otter
grader = otter.Notebook("lab08.ipynb")

<table style="width: 100%;">
<tr style="background-color: transparent;">
<td width="100px"><img src="https://cs104williams.github.io/assets/cs104-logo.png" width="90px" style="text-align: center"/></td>
<td>
  <p style="margin-bottom: 0px; text-align: left; font-size: 18pt;"><strong>CSCI 104: Data Science and Computing for All</strong><br>
                Williams College<br>
                Fall 2023</p>
</td>
</tr>
</table>


# Lab 8: Estimation and Confidence Intervals

<hr style="margin: 0px; border: 3px solid #500082;"/>

<h2>Instructions</h2>

- Before you begin, execute the cell at the TOP of the notebook to load the provided tests, as well as the following cell to set up the notebook by importing some helpful libraries. Each time you start your server, you will need to execute these cells again.  
- Be sure to consult your [Python Reference](https://cs104williams.github.io/assets/python-library-ref.html)!
- Complete this notebook by filling in the cells provided. For problems asking you to write explanations, you **must** provide your answer in the designated space. 
- Please be sure to not re-assign variables throughout the notebook.  For example, if you use `max_temperature` in your answer to one question, do not reassign it later on. Otherwise, you will fail tests that you thought you were passing previously.
- This lab has hidden tests. Even if the visible tests report 100% passed, your final grade may not be 100%: we will run additional correctness tests once everyone turns in the lab.
- To use one or more late days on this lab, please fill out our [late day form](https://forms.gle/4sD16h3hN1xRqQM27) **before** the due date.

<hr/>
<h2>Setup</h2>


In [None]:
# Run this cell to set up the notebook.
# These lines import the numpy, datascience, and cs104 libraries.

import numpy as np
from datascience import *
from cs104 import *
%matplotlib inline

<hr style="margin-bottom: 0px; padding:0; border: 2px solid #500082;"/>


## 1. Warm-Up and Bootstrapping (20 pts)



<font color='#B1008E'>
    
##### Learning objectives
- Practice gathering outcomes of a simulation in an array.
- Implement a function for bootstrapping.

</font>

#### Part 1.1 Review loops to gather outcomes in an array (5 pts)


Let's revisit one of our central computation tools from the second half of the semester: *simulation loops to gather an array of outcomes*.  

Here's the basic structure of these loops: 
- Initialize an empty array that collects results.
- Then within a for-loop, "do something" many times and add the result to the array.
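
As a rough illustration of this structure, here is a sketch that uses a different "do something" step from the one in this problem: rolling two dice and adding them. The names `roll_two_dice` and `sums` are hypothetical, and the sketch uses plain `numpy` (`np.array([])` instead of the course's `make_array`) so that it runs on its own.

```python
import numpy as np

# Hypothetical "do something": roll two six-sided dice and add them up.
def roll_two_dice():
    return np.sum(np.random.randint(1, 7, size=2))

num_trials = 500
sums = np.array([])                  # start with an empty results array
for i in np.arange(num_trials):
    outcome = roll_two_dice()        # do something once
    sums = np.append(sums, outcome)  # add its result to the array

print("Num trials =", len(sums))
```

Note that `np.append` returns a new array rather than changing `sums` in place, which is why the result is assigned back to `sums` on every iteration.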

For this problem, you will implement the "do something" step as randomly choosing one integer from an array of four integers: 1, 2, 3, 4. 

Below, write a line that implements this. *Hint*: You should use the `numpy (np)` library for this. 

In [None]:
integer_array = make_array(1, 2, 3, 4)
choice = ...
choice

Now let's write a simulation loop to repeat 1000 times: randomly choosing from integers 1, 2, 3, 4 and saving these results in an array. 

*Note:* Do *not* call `simulate` here -- write the loop from scratch.

In [None]:
results = ...
for i in np.arange(...):
    choice = ...
    results = ... 


print("Num trials=", len(results))

In [None]:
grader.check("p1.1")

#### Part 1.2 Bootstrapping (5 pts)


Bootstrapping uses the same loop idiom as above. However, for bootstrapping, the "do something" consists of two steps: 

- **Step 1**: Resample with replacement from the original data. The resample should have the same sample size as the original data. 
- **Step 2**: Calculate the statistic of interest on the resample. 

For **Step 1**, we will use the `np.random.choice` function again.

In [None]:
# Run this cell 
# This is the "original sample"
original_sample = make_array(1, 2, 3, 4, 5)
original_sample

In [None]:
# This is the command to 'sample with replacement' 
# Run this cell many times to get different results
resample = np.random.choice(original_sample, 5)
resample

Now, let's write code to test our code! 

Write a `check` to make sure that our requirement for bootstrapping is satisfied: that the length of the `resample` array is the same as the length of `original_sample`. 

In [None]:
...

For **Step 2**, calculating the statistic of interest, we need a function that computes the statistic. In this case, we could just use the `np.mean` function directly. However, since you'll need to write new statistic functions below, we'll create a new function `sample_mean` that we'll use to compute our statistic of interest.

In [None]:
def sample_mean(sample_data):
    return np.mean(sample_data)

check(sample_mean(original_sample) == 3)

Now, let's put it together! Complete the function below for bootstrapping. This should combine our loop idiom for gathering results into an array with the two steps we outlined above.  

You will find it **very** useful to review our bootstrapping lecture's [notebook](https://www.cs.williams.edu/~cs104/lectures/24-bootstrapping.html).

In [None]:
def bootstrap_statistic(observed_sample, compute_statistic, num_trials): 
    """
    Creates num_trials resamples of the initial sample.
    Returns an array of the provided statistic for those samples.

    * observed_sample: the initial sample, as an array.
    
    * compute_statistic: a function that takes a sample as 
                         an array and returns the statistic for that
                         sample. 
    
    * num_trials: the number of bootstrap samples to create.

    """
    bootstrap_statistics = ...
    
    for i in np.arange(...):
        simulated_resample = ...
        resample_statistic = ...
        bootstrap_statistics = np.append(bootstrap_statistics, ...)

    return bootstrap_statistics
    

# Run a small bootstrap and verify the results are reasonable.
tiny_bootstrapped_statistics = bootstrap_statistic(original_sample, sample_mean, 5)
tiny_bootstrapped_statistics

In [None]:
grader.check("p1.2")

#### Part 1.3 A bigger bootstrap (5 pts)


We'll now run a much larger bootstrap on the same sample.

In [None]:
bootstrapped_statistics = bootstrap_statistic(original_sample, sample_mean, 2000)
results = Table().with_column('Bootstrap Samples Mean', bootstrapped_statistics)
plot = results.hist()
original_statistic = sample_mean(original_sample)
plot.dot(original_statistic)

Here are a couple checks to ensure it is working properly.

In [None]:
grader.check("p1.3")

<!-- BEGIN QUESTION -->

#### Part 1.4 Interpreting the results (5 pts)


If you implemented bootstrapping correctly, you should see the sample mean (the red dot) at the center of your bootstrap empirical distribution (the blue histogram). 

Why is the bootstrap distribution *centered* around the sample mean? 

<hr style="margin:0; border: 1px solid #FFBE0A;"/><font color='#FFBE0A'>Written Answer:</font>

_Type your answer here, replacing this text._


<hr style="margin:0; border: 1px solid #FFBE0A;"/>

<!-- END QUESTION -->

<hr style="margin-bottom: 0px; padding:0; border: 2px solid #500082;"/>


## 2. Spring Street Restaurants (45 pts)



<font color='#B1008E'>
    
##### Learning objectives
- Use bootstrapping to estimate confidence intervals for a data sample. 
    
</font>

Our goal is to estimate the "most popular" Williamstown restaurant on Spring Street. 

We surveyed 1,500 Williams students, faculty, and community members selected uniformly at random and asked each person which of the following four restaurants is the best. (*Note: This data is entirely fabricated for the purposes of this homework -- we actually love eating at all these restaurants!*) The choices of restaurants are Pera, Blue Mango, Spring St. Market, and Taste of India. After compiling the results, we release the following percentages from our sample:

| Restaurant  | Percentage|
|:------------ |:------------:|
|Pera | 8.2% |
|Blue Mango | 52.8% |
|Spring St. Market | 25% |
|Taste of India | 14% |

Now, we will attempt to estimate the corresponding *parameters*, or the percentage of the votes that each restaurant would receive if we asked every member of the population (i.e. all Williams students, faculty, and community members). We will use confidence intervals to compute a range of values that reflects the uncertainty of our estimates.

The array `observed_votes` contains the results of this survey. 

In [None]:
# Just run this cell
votes_table = Table().read_table('votes.csv')
observed_votes = votes_table.column('Vote')

np.random.choice(observed_votes, 10) #sampling to look at the variety

#### Part 2.1 (5 pts)


We have given you the function `percent_of_vote` below. It returns the **percentage** of votes in the given array of `votes`. 
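
The counting idea behind this function can be tried in isolation. The sketch below is only an illustration with a made-up array (`toy_votes` is hypothetical and is not taken from `votes.csv`): comparing an array to a single value produces an array of `True`/`False` entries, and summing that array counts the `True` entries.

```python
import numpy as np

# Hypothetical mini-survey, not taken from votes.csv.
toy_votes = np.array(['tea', 'coffee', 'tea', 'juice', 'tea'])

is_tea = toy_votes == 'tea'       # array of True/False, one entry per vote
num_tea = np.sum(is_tea)          # True counts as 1, False counts as 0
percent_tea = num_tea / len(toy_votes) * 100

print(is_tea)
print(percent_tea)
```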

In [None]:
def percent_of_vote(votes, restaurant):
    single_percentage = (sum(votes == restaurant) / len(votes)) * 100
    return single_percentage

percent_of_vote(observed_votes, 'Pera')

Complete the function `percent_blue_mango` that uses the `percent_of_vote` function to return the percent of votes for just Blue Mango. 

In [None]:
def percent_blue_mango(votes): 
    ...

In [None]:
grader.check("p2.1")

#### Part 2.2 (5 pts)


Now, let's use the `bootstrap_statistic` function you implemented previously. 

Below, complete the arguments of the function such that it simulates and returns an array of bootstrapped estimates of the percentage of voters who will vote for **Blue Mango**. 

In [None]:
num_trials = 5 #start with just a few resamples
tiny_bootstrap_votes_blue_mango = bootstrap_statistic(...,
                                                      ...,
                                                      num_trials)
    

tiny_bootstrap_votes_blue_mango

In [None]:
grader.check("p2.2")

In the following cell, run the same `bootstrap_statistic()` function again, but this time with 5,000 bootstrap resamples. 

*Note:* This might take a few seconds to run.

#### Part 2.3 (5 pts)


Now run the bootstrap for 5,000 trials.

In [None]:
bootstrap_votes_blue_mango = bootstrap_statistic(...,
                                       ...,
                                       5000)
    

# Plot the histogram (no need to change this line) 
Table().with_column('Estimated Percentage', bootstrap_votes_blue_mango).hist("Estimated Percentage")

#### Part 2.4 (5 pts)


Using the array `bootstrap_votes_blue_mango`, find the values at the two edges of the middle 95% of the bootstrapped percentage estimates. That is, compute the lower and upper ends of the interval, named `blue_mango_lower_bound` and `blue_mango_upper_bound`, respectively.  You should use the `percentile` function.
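
To see how the "middle X%" of an array relates to percentiles, here is a small sketch that uses a different level (80%) and made-up data. It is written with `np.percentile` as a stand-in so that it runs on its own. The course's `percentile` function may differ in argument order and in how it handles values between data points, so check the Python Reference before adapting this. The names `toy_estimates`, `level`, and `tail` are hypothetical.

```python
import numpy as np

# Hypothetical bootstrap estimates: just the numbers 0 through 100.
toy_estimates = np.arange(101)

level = 80                    # width of the middle region, in percent
tail = (100 - level) / 2      # percent left out on each side
lower = np.percentile(toy_estimates, tail)
upper = np.percentile(toy_estimates, 100 - tail)

print(lower, upper)
```

The key step is splitting what is left over (100 minus the level) evenly between the two tails.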

In [None]:
blue_mango_lower_bound = ...
blue_mango_upper_bound = ...
print('Bootstrapped 95% confidence interval for the percentage of Blue Mango voters in the population:\n',
      np.round(make_array(blue_mango_lower_bound, blue_mango_upper_bound), 2))

In [None]:
grader.check("p2.4")

Here is the plot from above, this time with your 95% confidence interval shown.

In [None]:
# Plot the histogram (no need to change this line) 
plot = Table().with_column('Estimated Percentage', bootstrap_votes_blue_mango).hist("Estimated Percentage")
plot.interval(blue_mango_lower_bound, blue_mango_upper_bound)

<!-- BEGIN QUESTION -->

#### Part 2.5 (5 pts)


Below is a visualization that lets you experiment with different sample sizes and numbers of trials.  Run the cell and adjust the parameters to get a sense of how important sample size and number of trials are to this bootstrap.  Try to predict what happens when the sample size becomes small. Is that what you see?

In [None]:
def visualize_bootstrap(sample_size, num_trials):
    # start with a subset of the original votes, drawn at random
    sample = np.random.choice(observed_votes, sample_size)
    resampled_percentages = bootstrap_statistic(sample, percent_blue_mango, num_trials)
    plot = Table().with_column('Estimated Percentage', resampled_percentages).hist("Estimated Percentage")
    plot.set_title('sample_size=' + str(sample_size) + '; num_trials=' + str(num_trials))
    plot.interval(confidence_interval(95,resampled_percentages))

interact(visualize_bootstrap, sample_size=Slider(1,500), num_trials=Slider(10,2010, 100))

Describe what you observe.  What changes when you increase or decrease the sample size?  What changes when you increase the number of trials?

<hr style="margin:0; border: 1px solid #FFBE0A;"/><font color='#FFBE0A'>Written Answer:</font>

_Type your answer here, replacing this text._


<hr style="margin:0; border: 1px solid #FFBE0A;"/>

<!-- END QUESTION -->

#### Part 2.6 (5 pts)


The survey results seem to indicate that Blue Mango is beating all the other restaurants combined among voters. We would like to use confidence intervals to determine a range of likely values for Blue Mango's true lead over all the other restaurants combined. The calculation for Blue Mango's lead over Pera, Spring St. Market, and Taste of India combined is:

$$ \text{Blue Mango's % of the vote} - (\text{100 %} - \text{Blue Mango's % of the vote})$$
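
As a worked example, plugging in the sample percentage for Blue Mango from the survey table above (52.8%) gives:

$$ 52.8\% - (100\% - 52.8\%) = 52.8\% - 47.2\% = 5.6\% $$

A restaurant with less than 50% of the vote would get a negative lead from the same calculation.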

Define the function `percent_blue_mango_lead` that returns **exactly one value**: Blue Mango's percentage lead over Pera, Spring St. Market, and Taste of India combined in the given array of `votes`. 

*Hints:* 
- Blue Mango's lead can be negative.
- Your solution should use `percent_of_vote` or `percent_blue_mango` from one of the previous questions. 

In [None]:
def percent_blue_mango_lead(votes):
    ...
    
percent_blue_mango_lead(observed_votes)

In [None]:
grader.check("p2.6")

<!-- BEGIN QUESTION -->

#### Part 2.7 (5 pts)


Now use `bootstrap_statistic()` to compute bootstrapped estimates of Blue Mango's lead over Pera, Spring St. Market, and Taste of India combined. Plot a histogram of the resulting samples.  

*Hint:* Your call to `bootstrap_statistic` should use `percent_blue_mango_lead`.

In [None]:
num_trials = 2000
bootstrap_blue_mango_leads = bootstrap_statistic(...,
                                                 ...,
                                                 num_trials)
    

Table().with_column('Estimated Lead', bootstrap_blue_mango_leads).hist("Estimated Lead")

In [None]:
grader.check("p2.7")

<!-- END QUESTION -->

#### Part 2.8 (5 pts)


Use the simulated data in `bootstrap_blue_mango_leads` from the previous question and the function `confidence_interval` from [our library](https://www.cs.williams.edu/~cs104/auto/inference-library-ref.html) to compute an approximate 90% confidence interval for Blue Mango's true lead over Pera, Spring St. Market, and Taste of India combined. 

In [None]:
lead_ci = ...
lead_lower_bound, lead_upper_bound = lead_ci
print("Bootstrapped 90% confidence interval for Blue Mango's true lead over Pera, Spring St. Market, and Taste of India combined:\n", 
      np.round(make_array(lead_lower_bound, lead_upper_bound), 2))

In [None]:
grader.check("p2.8")

<!-- BEGIN QUESTION -->

#### Part 2.9 (5 pts)


Suppose you were consulting with Blue Mango and they wanted to know how popular they are as a business compared to their competition. 

How would you explain this to them?

Use all the previous parts you calculated (including the confidence intervals). 

<hr style="margin:0; border: 1px solid #FFBE0A;"/><font color='#FFBE0A'>Written Answer:</font>

_Type your answer here, replacing this text._


<hr style="margin:0; border: 1px solid #FFBE0A;"/>

<!-- END QUESTION -->

<hr style="margin-bottom: 0px; padding:0; border: 2px solid #500082;"/>


## 3. Interpreting Confidence Intervals (25 pts)



<font color='#B1008E'>
    
##### Learning objectives
- Practice interpreting confidence intervals. 
- Reason about the factors that make a confidence interval narrower (more certainty) or wider (less certainty). 
</font>

We computed the following 95% confidence interval for the percentage of Blue Mango voters: 

$$[50.40, 55.40]$$

(Your answer may have been a bit different due to randomness; that doesn't mean it was wrong!)

<!-- BEGIN QUESTION -->

#### Part 3.1 (5 pts)


The staff also created 70%, 90%, and 99% confidence intervals from the same sample, but we forgot to label which interval corresponds to which confidence level! First, match each confidence level (70%, 90%, 99%) with its corresponding interval in the cell below (e.g., if we had an interval D at the 80% level, we would assign `D=80`). **Then**, explain your thought process and how you came up with your answers. 

The intervals are below:

* A: [51.47, 54.20]
* B: [49.60, 56.13]
* C: [50.80, 55.00]

*Hint*: It may be helpful to draw these out with pen and paper. 

<hr style="margin:0; border: 1px solid #FFBE0A;"/><font color='#FFBE0A'>Written Answer:</font>

Type your answer here:

* A is ... [one of 70% CI, 90% CI, 99% CI]
* B is ... [one of 70% CI, 90% CI, 99% CI]
* C is ... [one of 70% CI, 90% CI, 99% CI]

Justification: ...

<hr style="margin:0; border: 1px solid #FFBE0A;"/>

<!-- END QUESTION -->

#### Part 3.2 (5 pts)


 Suppose we produced 5,000 new samples (each one a uniform random sample of 1,500 voters) from the population and created a 95% confidence interval from each one. Roughly how many of those 5,000 intervals do you expect will actually contain the true percentage of the population? 

Assign your answer to `num_intervals_with_true`.
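
The thought experiment above can be mimicked on a computer when the population is known. The sketch below is only an illustration under made-up assumptions: a hypothetical population in which 30% answer "yes", smaller samples, fewer repetitions, and an 80% level rather than the numbers in this question. It uses `np.percentile` and a vectorized resampling step instead of the lab's `bootstrap_statistic`.

```python
import numpy as np

rng = np.random.default_rng(104)   # seeded so the sketch is repeatable

true_percent = 30      # hypothetical population parameter
sample_size = 100
num_samples = 200      # fresh samples drawn from the population
num_resamples = 300    # bootstrap resamples per sample
level = 80             # confidence level, in percent

covered = 0
for i in np.arange(num_samples):
    # Draw a new sample: True means "yes", False means "no".
    sample = rng.random(sample_size) < true_percent / 100
    # Each row is one resample (with replacement) of this sample.
    resamples = rng.choice(sample, size=(num_resamples, sample_size))
    percents = resamples.mean(axis=1) * 100
    lower = np.percentile(percents, (100 - level) / 2)
    upper = np.percentile(percents, 100 - (100 - level) / 2)
    # Count this interval if it contains the true parameter.
    if lower <= true_percent <= upper:
        covered = covered + 1

print("Fraction of intervals containing the truth:", covered / num_samples)
```

Compare the printed fraction with the chosen `level`, and then think about what the same reasoning implies for the numbers in this question.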



In [None]:
num_intervals_with_true = ...

In [None]:
grader.check("p3.2")

#### Part 3.3 (5 pts)


Recall the second bootstrap confidence interval you created for estimating Blue Mango's lead over Pera, Spring St. Market, and Taste of India combined. Among
voters in the sample, Blue Mango's lead was 6%. Our 95% confidence interval for the true lead (in the population of all voters) was:

$$[0.933\%, 10.933\%]$$

Suppose we are interested in testing a simple yes-or-no question:

> "Is the percentage of votes for Blue Mango equal to the percentage of votes for Pera, Spring St. Market, and Taste of India combined?"

Our null hypothesis is that the percentages are equal, or equivalently, that Blue Mango's lead is exactly 0. Our alternative hypothesis is that Blue Mango's lead is not equal to 0.  In the questions below, don't compute any confidence interval yourself - use only the staff's 95% confidence interval.

 Say we use a 5% p-value cutoff. Do we reject the null hypothesis, fail to reject the null hypothesis, or are we unable to tell using the staff's confidence interval? 

Assign `restaurants_equal` to the number corresponding to the correct answer.

1. Reject the null hypothesis / Data is consistent with the alternative hypothesis
2. Fail to reject the null hypothesis / Data is consistent with the null hypothesis
3. Unable to tell using the confidence interval given. 

*Hint:* Consider the relationship between the p-value cutoff and confidence. If you're confused, take a look at [this chapter](https://inferentialthinking.com/chapters/13/4/Using_Confidence_Intervals.html) of the textbook.
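
The rule described in that chapter can be written as a one-line check. In the sketch below, the function name `null_value_in_interval` and the example intervals are hypothetical and are not the lab's numbers. It assumes the interval's confidence level matches the cutoff (a 95% interval for a 5% cutoff).

```python
def null_value_in_interval(lower, upper, null_value):
    """Return True if null_value lies inside the interval [lower, upper]."""
    return lower <= null_value <= upper

# Hypothetical 95% intervals for some difference, not the lab's numbers.
print(null_value_in_interval(-1.5, 4.0, 0))  # True: 0 is a plausible value
print(null_value_in_interval(2.0, 7.5, 0))   # False: 0 is not a plausible value
```

When the null value falls outside a 95% interval, the test with a 5% cutoff rejects the null hypothesis. When it falls inside, the test fails to reject. The check says nothing by itself about cutoffs that do not match the interval's level.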



In [None]:
restaurants_equal = ...

In [None]:
grader.check("p3.3")

#### Part 3.4 (5 pts)


 What if, instead, we use a p-value cutoff of 1%? Do we reject the null, fail to reject the null, or are we unable to tell using our staff confidence interval? 

Assign `cutoff_one_percent` to the number corresponding to the correct answer.

1. Reject the null / Data is consistent with the alternative hypothesis
2. Fail to reject the null / Data is consistent with the null hypothesis
3. Unable to tell using our staff confidence interval



In [None]:
cutoff_one_percent = ...

In [None]:
grader.check("p3.4")

#### Part 3.5 (5 pts)


 What if we use a p-value cutoff of 10%? Do we reject, fail to reject, or are we unable to tell using our confidence interval? 

Assign `cutoff_ten_percent` to the number corresponding to the correct answer.

1. Reject the null / Data is consistent with the alternative hypothesis
2. Fail to reject the null / Data is consistent with the null hypothesis
3. Unable to tell using our staff confidence interval



In [None]:
cutoff_ten_percent = ...

In [None]:
grader.check("p3.5")

<hr class="m-0" style="border: 3px solid #500082;"/>

# You're Done!
Follow these steps to submit your work:
* Run the tests and verify that they pass as you expect. 
* Choose **Save Notebook** from the **File** menu.
* **Run the final cell** and click the link below to download the zip file. 

Once you have downloaded that file, go to [Gradescope](https://www.gradescope.com/) and submit the zip file to 
the corresponding assignment. For Lab N, the assignment will be called "Lab N Autograder".

Once you have submitted, your Gradescope assignment should show you passing all the tests you passed in your assignment notebook.


## Submission

Make sure you have run all cells in your notebook in order before running the cell below, so that all images/graphs appear in the output. The cell below will generate a zip file for you to submit. **Please save before exporting!**

In [None]:
# Save your notebook first, then run this cell to export your submission.
grader.export(run_tests=True)