### Problem 1: Extending the Lady Tasting Tea Experiment

In this task, we extend the classic Lady Tasting Tea experiment by increasing the number of cups from 8 (4 tea-first, 4 milk-first) to 12 (8 tea-first, 4 milk-first). We calculate the probability of correctly identifying all cups by chance, verify it with NumPy simulations, and compare the result with the original 8-cup experiment to assess how extreme a perfect score is and whether the p-value threshold could be adjusted.

---


### Problem Breakdown:
 This problem is split into five parts for a step-by-step approach:
   1. Libraries: Import the necessary libraries.
   2. Part A: Simulate the Original Tea Experiment (8 cups of tea).
   3. Part B: Simulate the Extended Tea Experiment (12 cups of tea).
   4. Part C: Compare the Original Tea Experiment vs. Extended Tea Experiment.
   5. Conclusion: Interpret the results and discuss the implications for statistical significance.
   

   ---


### Libraries:

---


In [87]:
import numpy as np  # For numerical operations and simulations

import math  # For mathematical functions

import matplotlib.pyplot as plt  # For visualizing results

import random  # For random selections

### Part A: Original Tea Experiment (8 Cups of Tea)
The original Lady Tasting Tea experiment involved 8 cups, 4 tea-first and 4 milk-first. We will calculate the probability of correctly identifying the tea-first cups by chance.




---


In [88]:
# Number of cups in the original experiment:
no_cups = 8
no_cups_milk_first = 4
no_cups_tea_first = 4

In [89]:
# Calculate the number of combinations (order does not matter, no element is selected more than once):
comb = math.comb(no_cups, no_cups_milk_first)
print("The number of possible combinations is:", comb)

The number of possible combinations is: 70


In [90]:
# Calculate the probability of selecting the correct cups
prob = 1 / comb
print('Probability of correctly identifying the Original Tea Experiment (8 Cups of Tea) is:', prob)


Probability of correctly identifying the Original Tea Experiment (8 Cups of Tea) is: 0.014285714285714285
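
The count and probability printed above can be checked by hand. The worked arithmetic below is only a restatement of what `math.comb(8, 4)` computes, assuming every choice of 4 cups is equally likely under pure guessing:

$$
\binom{8}{4} = \frac{8!}{4!\,4!} = \frac{8 \cdot 7 \cdot 6 \cdot 5}{4 \cdot 3 \cdot 2 \cdot 1} = \frac{1680}{24} = 70,
\qquad
P(\text{all correct}) = \frac{1}{70} \approx 0.0143
$$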


In [91]:
# Simulation using NumPy to verify the theoretical probability
# Set up the array representing the cups (1 = tea-first, 0 = milk-first)
cups_8 = np.array([1]*4 + [0]*4)

# Number of simulations
trials = 150000
successes = 0

# Simulation loop
for _ in range(trials):
    np.random.shuffle(cups_8)  # Shuffle the cups randomly
    guessed_cups = cups_8[np.random.choice(range(8), size=4, replace=False)]  # Randomly select 4 cups
    
    if np.sum(guessed_cups) == 4:  # Check if all selected cups are tea-first
        successes += 1

# Estimate probability from the simulation
simulated_probability = successes / trials
print("Estimated probability from simulation:", simulated_probability)


Estimated probability from simulation: 0.01398


> **Simulation Using NumPy to Verify the Theoretical Probability**
>
> To verify the theoretical probability of correctly identifying all tea-first cups by chance, we will simulate the **Lady Tasting Tea** experiment using **NumPy**. The simulation process will randomly shuffle the cups and estimate the probability based on numerous trials. Here's a step-by-step breakdown of how the simulation works:
>
> ---
>
> 1. **Setting Up the Cups**:  
>    We represent the cups as an array where:
>    - `1` represents a **tea-first** cup, and
>    - `0` represents a **milk-first** cup.
>    
>    In the original experiment, we have 4 tea-first cups and 4 milk-first cups, so our array will look like this:
>    ```python
>    cups_8 = np.array([1]*4 + [0]*4)
>    ```
>    This creates an array with 4 tea-first cups (`1`s) and 4 milk-first cups (`0`s).
>
> 2. **Simulation Setup**:  
>    We run the simulation for a large number of trials (in this case, 150,000) so that the sampling error of the estimate is small.  
>    The variable `trials` is set to 150,000, and we initialize the `successes` counter to 0, which will keep track of how many times the participant guesses the cups correctly.
>    
>    ```python
>    trials = 150000
>    successes = 0
>    ```
>
> 3. **Simulation Loop**:  
>    For each trial, the following steps occur:
>    - **Shuffle the Cups**:  
>      We randomly shuffle the cups array to simulate a random order of cups. This step represents the random arrangement of cups the participant is presented with.
>      ```python
>      np.random.shuffle(cups_8)
>      ```
>    - **Randomly Select 4 Cups**:  
>      The participant then randomly selects 4 cups from the 8 available cups. The `np.random.choice()` function is used to randomly select indices from the array.
>      ```python
>      guessed_cups = cups_8[np.random.choice(range(8), size=4, replace=False)]
>      ```
>    - **Check If All Selected Cups Are Tea-First**:  
>      After selecting the 4 cups, we check whether all of them are tea-first. Because tea-first cups are coded as `1` and milk-first cups as `0`, this is the case exactly when the sum of the selected cups equals 4.
>      ```python
>      if np.sum(guessed_cups) == 4:
>          successes += 1
>      ```
>
> 4. **Estimating the Probability**:  
>    After running all 150,000 simulations, we estimate the probability of guessing all tea-first cups correctly by dividing the number of successes by the total number of trials.
>    ```python
>    simulated_probability = successes / trials
>    print("Estimated probability from simulation:", simulated_probability)
>    ```
>    The result will provide an empirical estimate of the probability of correctly identifying all tea-first cups by random chance.
>
> ---
>
> ### **Conclusion**
> This simulation allows us to estimate the probability of a perfect score under random guessing and compare it with the theoretical probability calculated earlier. Here the simulated estimate (0.01398) is close to the theoretical value (1/70 ≈ 0.01429), which indicates that the simulation is a valid representation of the theoretical model. The sampling error of the estimate shrinks roughly in proportion to $1/\sqrt{N}$ for $N$ trials, so more trials give a more precise estimate.
>
> By running this simulation, we gain insight into the likelihood of randomly identifying all tea-first cups in the experiment, further confirming the low probability of such an outcome occurring by chance.
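
The loop described above can also be written without an explicit Python loop. The following is a minimal sketch, not the notebook's own method: it assumes NumPy 1.20 or later (for `Generator.permuted`), and the seed value and the names `rng`, `arrangements`, `all_correct`, and `estimate` are illustrative choices.

```python
import numpy as np

# Assumption: a seeded Generator for reproducibility
# (the notebook itself uses the legacy np.random functions).
rng = np.random.default_rng(seed=42)

trials = 150_000
cups_8 = np.array([1] * 4 + [0] * 4)  # 1 = tea-first, 0 = milk-first

# Each row is one trial: an independent random arrangement of the 8 cups.
arrangements = rng.permuted(np.tile(cups_8, (trials, 1)), axis=1)

# Because each arrangement is uniformly random, a fixed guess
# (always the first 4 positions) is equivalent to a random guess.
all_correct = arrangements[:, :4].sum(axis=1) == 4

estimate = all_correct.mean()
print("Vectorized estimate:", estimate)
print("Theoretical value:  ", 1 / 70)
```

Shuffling all rows at once trades some memory (a 150,000 × 8 integer array) for speed. The estimate should land near 1/70, as the loop version does.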


---

### Part B: Extended Tea Experiment (12 Cups of Tea)
In this part, we extend the experiment by increasing the total number of cups to 12, with 8 tea-first cups and 4 milk-first cups. We will calculate the new probability and compare it with the original experiment.

---

In [92]:
# Number of cups in the extended experiment
no_cups_ext = 12
no_cups_milk_first_ext = 4
no_cups_tea_first_ext = 8

In [93]:
# Calculate the number of combinations for the extended experiment
comb_ext = math.comb(no_cups_ext, no_cups_milk_first_ext)
print("The number of possible combinations is:", comb_ext)


The number of possible combinations is: 495


In [94]:
# Calculate the probability of correctly identifying the cups
prob_ext = 1 / comb_ext
print('Probability of correctly identifying the Extended Tea Experiment (12 Cups of Tea) is:', prob_ext)

Probability of correctly identifying the Extended Tea Experiment (12 Cups of Tea) is: 0.00202020202020202
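
The theoretical count above chooses the 4 milk-first cups, while the simulation that follows selects the 8 tea-first cups. The two views give the same count because binomial coefficients are symmetric, as this short worked check shows:

$$
\binom{12}{4} = \frac{12 \cdot 11 \cdot 10 \cdot 9}{4 \cdot 3 \cdot 2 \cdot 1} = \frac{11880}{24} = 495 = \binom{12}{8},
\qquad
P(\text{all correct}) = \frac{1}{495} \approx 0.0020
$$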


In [95]:

# Simulation using NumPy to verify the extended experiment
# Set up the array representing the cups in the extended experiment
cups_12 = np.array([1]*8 + [0]*4)

# Counter for perfect guesses (the number of simulations, `trials`, is reused from Part A)
successes_12 = 0

# Simulation loop for the extended experiment
for _ in range(trials):
    np.random.shuffle(cups_12)  # Shuffle the cups randomly
    guessed_cups = cups_12[np.random.choice(range(12), size=8, replace=False)]  # Select 8 cups
    
    if np.sum(guessed_cups) == 8:  # Check if all selected cups are tea-first
        successes_12 += 1

# Estimate probability for the extended experiment
simulated_probability_12 = successes_12 / trials
print("Estimated probability from simulation (12 cups):", simulated_probability_12)

Estimated probability from simulation (12 cups): 0.0019866666666666665
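
One way to judge whether the simulated values are "close enough" to theory is to express the gap in units of the Monte Carlo standard error. The sketch below assumes the usual binomial standard error, `sqrt(p * (1 - p) / trials)`, and takes the two simulated estimates from the outputs reported above. The names `results` and `z_scores` are illustrative.

```python
import math

trials = 150_000

# (label, theoretical probability, simulated estimate reported in the notebook)
results = [
    ("8 cups", 1 / math.comb(8, 4), 0.01398),
    ("12 cups", 1 / math.comb(12, 4), 0.0019866666666666665),
]

z_scores = {}
for label, p, p_hat in results:
    se = math.sqrt(p * (1 - p) / trials)  # standard error of the estimate
    z = (p_hat - p) / se                  # gap in standard-error units
    z_scores[label] = z
    print(f"{label}: theory={p:.6f}, simulated={p_hat:.6f}, SE={se:.6f}, z={z:+.2f}")
```

Both estimates fall within about one standard error of the theoretical value, which is what ordinary sampling noise would produce.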


### Part C: Comparing the Original Tea Experiment vs. the Extended Tea Experiment
The extended 12-cup experiment makes it substantially harder to guess all cups correctly by chance. The probability drops from 1 in 70 (≈ 0.0143) in the original experiment to 1 in 495 (≈ 0.0020) in the extended experiment, a roughly 7-fold decrease (495 / 70 ≈ 7.07) in the chance of success.
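
To look beyond a perfect score, the full chance distribution of the number of correctly identified milk-first cups can be computed for both designs. This is a hedged sketch: it assumes the taster always labels exactly 4 cups as milk-first, which makes the count hypergeometric. The helper name `null_distribution` is hypothetical and not part of the notebook.

```python
import math

def null_distribution(n_cups, n_milk_first):
    """P(k of the n_milk_first guesses are correct) under pure guessing."""
    total = math.comb(n_cups, n_milk_first)
    n_tea_first = n_cups - n_milk_first
    return [
        math.comb(n_milk_first, k) * math.comb(n_tea_first, n_milk_first - k) / total
        for k in range(n_milk_first + 1)
    ]

for n_cups in (8, 12):
    dist = null_distribution(n_cups, 4)
    print(f"{n_cups} cups: P(all 4 correct) = {dist[4]:.4f}, "
          f"P(at least 3 of 4 correct) = {dist[3] + dist[4]:.4f}")
```

Getting at least 3 of the 4 right has chance probability 17/70 ≈ 0.243 with 8 cups and 33/495 ≈ 0.067 with 12 cups. In both designs, only a perfect score falls below the conventional 0.05 threshold.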


---

### Conclusion:

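- The comparison of the original and extended experiments shows a large difference in the likelihood of randomly guessing all cups correctly: 1/70 ≈ 0.014 versus 1/495 ≈ 0.002.
- A perfect score in the extended experiment (12 cups) is much harder to achieve by chance, so it is stronger evidence against random guessing. It would remain significant even under a stricter threshold such as 0.01, whereas a perfect score in the 8-cup experiment passes the conventional 0.05 threshold but not 0.01.
- The simulations (0.01398 and about 0.00199) agree closely with the theoretical calculations, which supports the view that the extended experiment is the more stringent test.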

---

### Problem 2: Normal Distribution

In this problem, we examine how sampling variability affects estimates of the standard deviation. Specifically, we generate 100,000 samples, each consisting of 10 values drawn from a standard normal distribution. For each of these samples, we compute two different types of standard deviation:

- Sample standard deviation (using `ddof=1`)
- Population standard deviation (using `ddof=0`)

We then plot the histograms of these two standard deviations on the same axes to visualize their differences and explain the results. We’ll also discuss how the gap between these two estimators changes if the sample size is increased.
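
For reference, the two estimators differ only in the divisor, since NumPy's `std` divides by `n - ddof`. The formulas below are the standard definitions, written with $n$ for the sample size and $\bar{x}$ for the sample mean (notation assumed here, not taken from the notebook):

$$
s_{\text{ddof}=1} = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(x_i-\bar{x})^2},
\qquad
s_{\text{ddof}=0} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(x_i-\bar{x})^2}
$$

$$
\frac{s_{\text{ddof}=1}}{s_{\text{ddof}=0}} = \sqrt{\frac{n}{n-1}},
\qquad
n = 10:\ \sqrt{\tfrac{10}{9}} \approx 1.054
$$

For every sample, the `ddof=1` value is therefore about 5.4% larger than the `ddof=0` value when $n = 10$. The ratio tends to 1 as $n$ grows, so the gap between the two histograms should narrow with larger samples.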

In [1]:
import numpy as np  # For numerical operations
import matplotlib.pyplot as plt  # For plotting
