# Confidence Intervals
This notebook helps you get comfortable with confidence intervals on proportion (percentage of total) metrics such as sensitivity.

To run a code block, press `shift`+`enter`.

In [None]:
from IPython.display import clear_output
from time import sleep

from helpers import run_many_proportion_trials, display_trials, run_single_proportion_trial, display_single_trial, within_confidence_interval

# Variables

| Variable | Description |
|-|-|
| true_sensitivity | The true sensitivity is the sensitivity one would get if they sampled the entire population. |
| num_trials | The number of trials is how many independent trials are simulated. |
| num_observations | The number of observations per trial. |
| confidence | The confidence level used to generate the confidence intervals. If the trial were repeated many times, about this fraction of the resulting intervals would contain the true value. |
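
To make the link between `confidence`, `num_observations`, and interval width concrete, here is a minimal sketch that uses the normal approximation for a proportion. The function names `z_score` and `margin_of_error` are hypothetical and are not part of `helpers`, which may compute its intervals differently.

```python
from math import sqrt
from statistics import NormalDist


def z_score(confidence: float) -> float:
    """Two-sided standard normal critical value, e.g. 0.95 -> about 1.96."""
    return NormalDist().inv_cdf(0.5 + confidence / 2)


def margin_of_error(sensitivity: float, num_observations: int, confidence: float) -> float:
    """Half-width of a normal-approximation interval for a proportion."""
    # Assumption: the normal approximation is adequate for this sample size.
    standard_error = sqrt(sensitivity * (1 - sensitivity) / num_observations)
    return z_score(confidence) * standard_error


for confidence in (0.90, 0.95, 0.99):
    for num_observations in (50, 200, 800):
        half_width = margin_of_error(0.70, num_observations, confidence)
        print(f"confidence={confidence:.2f} n={num_observations:>3} margin=+/-{half_width:.3f}")
```

Under this approximation, raising the confidence level widens the interval, while quadrupling the number of observations halves it.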

Run the code block below to initialize the default variables.

In [None]:
default_true_sensitivity = 0.70
default_num_trials = 50000
default_num_observations = 200
default_confidence = 0.95

# Trial Distribution
The code below runs `num_trials` trials each with `num_observations` observations. These are then plotted in a histogram where the x-axis is the sensitivity observed in the simulated trial and the y-axis is the count of simulated trials with that sensitivity.
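
The implementation of `run_many_proportion_trials` lives in `helpers` and is not shown here. As a rough sketch of what such a simulation could look like, the standard-library example below treats each observation as a coin flip that succeeds with probability `true_sensitivity`. The name `simulate_trials` and the text histogram are illustrative assumptions, and the trial count is reduced to keep the pure-Python loop quick.

```python
import random
from collections import Counter
from math import sqrt
from statistics import mean, pstdev


def simulate_trials(true_sensitivity, num_trials, num_observations, seed=0):
    """Return the observed sensitivity of each simulated trial."""
    rng = random.Random(seed)
    observed = []
    for _ in range(num_trials):
        # Each observation is a true positive with probability true_sensitivity.
        hits = sum(rng.random() < true_sensitivity for _ in range(num_observations))
        observed.append(hits / num_observations)
    return observed


observed = simulate_trials(0.70, num_trials=2000, num_observations=200)

# Text histogram: group observed sensitivities into bins 0.02 wide.
buckets = Counter(round(value * 50) / 50 for value in observed)
for bucket in sorted(buckets):
    print(f"{bucket:.2f} {'#' * (buckets[bucket] // 10)}")

print(f"mean of observed sensitivities: {mean(observed):.3f}")
print(f"spread (std dev):               {pstdev(observed):.3f}")
print(f"sqrt(p * (1 - p) / n):          {sqrt(0.70 * 0.30 / 200):.3f}")
```

The last two printed lines compare the simulated spread with the theoretical standard error of a proportion, which should be close to each other for these settings.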

Run the code below then change the variables around and explore what happens.

In [None]:
# Variables to play with. Note: percentages are expressed as ratios (e.g. 95% => 0.95)
true_sensitivity = default_true_sensitivity
num_trials = default_num_trials
num_observations = default_num_observations
percentile = default_confidence

# Run the trials
trials = run_many_proportion_trials(
    true_sensitivity,
    num_trials=num_trials,
    num_observations=num_observations,
)

# Display trials
display_trials(
    trials,
    true_value=true_sensitivity,
    percentile_bounds=percentile,
)

## Questions
* Are the red bars in the above graph confidence intervals?
* Is the above graph a normal distribution? What if you change `true_sensitivity` to 0.99?

# Run Trial With Observed Value
The code below runs a single trial with true sensitivity `single_trial_true_sensitivity` and `single_trial_num_observations` observations. This single trial is displayed on top of the distribution from the previous section.
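
For intuition about the interval drawn around a single trial, the sketch below builds a normal-approximation (Wald) interval from one observed result and then reruns the trial many times to see how often the interval contains the true sensitivity. This is an assumption about one common way to build such intervals; `display_single_trial` may use a different method, and `normal_interval` and `coverage` are hypothetical names.

```python
import random
from math import sqrt
from statistics import NormalDist


def normal_interval(hits, num_observations, confidence):
    """Normal-approximation (Wald) interval around an observed proportion."""
    observed = hits / num_observations
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    half_width = z * sqrt(observed * (1 - observed) / num_observations)
    # Clamp to [0, 1] because a proportion cannot leave that range.
    return max(0.0, observed - half_width), min(1.0, observed + half_width)


def coverage(true_sensitivity, num_observations, confidence, reruns, seed=0):
    """Fraction of rerun trials whose interval contains the true sensitivity."""
    rng = random.Random(seed)
    covered = 0
    for _ in range(reruns):
        hits = sum(rng.random() < true_sensitivity for _ in range(num_observations))
        low, high = normal_interval(hits, num_observations, confidence)
        covered += low <= true_sensitivity <= high
    return covered / reruns


low, high = normal_interval(hits=140, num_observations=200, confidence=0.95)
print(f"140 of 200 detected: interval = ({low:.3f}, {high:.3f})")
print(f"share of reruns covering 0.70: {coverage(0.70, 200, 0.95, reruns=2000):.3f}")
```

The interval is centered on the observed value, so it moves every time the trial is rerun, while the true sensitivity stays fixed.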

Run the code below then change the variables around and explore what happens.

In [None]:
# Variables to play with. Note: percentages are expressed as ratios (e.g. 95% => 0.95)
single_trial_true_sensitivity = default_true_sensitivity
single_trial_num_observations = default_num_observations
confidence = default_confidence

# Run new trial
single_trial = run_single_proportion_trial(
    single_trial_true_sensitivity,
    num_observations=single_trial_num_observations,
)

# Display new trial along with previously generated trials
display_single_trial(
    single_trial,
    confidence=confidence,
    true_value=true_sensitivity,
    distribution=trials,
)

## Questions
* What happens when you rerun the new trial?
* What happens when the single trial's sensitivity is set to a different number?

# P-Hacking
P-hacking refers to running many statistical tests in a row until one eventually succeeds. For the code below, assume `true_sensitivity` is the null hypothesis and we want to show that our trial performs differently (higher or lower). The code keeps running trials until one eventually "disproves" the null hypothesis.
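
A quick back-of-the-envelope calculation shows why this loop is bound to stop. The sketch assumes the trials are independent and that each one falls outside its interval with probability exactly `1 - confidence` when the null hypothesis is true. With a discrete proportion the real rate differs slightly, and the function names are hypothetical.

```python
def chance_of_false_alarm(confidence: float, num_trials: int) -> float:
    """Probability that at least one of num_trials independent trials lands
    outside its interval even though the null hypothesis is true."""
    return 1 - confidence ** num_trials


def expected_trials_until_false_alarm(confidence: float) -> float:
    """Mean of a geometric distribution with success probability 1 - confidence."""
    return 1 / (1 - confidence)


for num_trials in (1, 5, 14, 20, 50):
    chance = chance_of_false_alarm(0.95, num_trials)
    print(f"{num_trials:>2} trials: {chance:.1%} chance of at least one false alarm")

print(f"expected trials until the first false alarm: {expected_trials_until_false_alarm(0.95):.0f}")
```

At 95% confidence the chance of at least one spurious "success" passes 50% after 14 trials, and on average the first one shows up after about 20.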

Run the code below and change up variables to see what happens.

In [None]:
# Variables to play with. Note: percentages are expressed as ratios (e.g. 95% => 0.95)
trial_sensitivity = default_true_sensitivity
confidence = default_confidence

# Initialize variables
single_trial = (true_sensitivity, 0.1)
count = 0
# Keep running while observed sensitivity is within confidence intervals
while within_confidence_interval(single_trial, true_sensitivity, confidence):
    # Run new trial
    single_trial = run_single_proportion_trial(
        trial_sensitivity,
        num_observations=num_observations,
    )
    sleep(0.5)
    clear_output()
    print(f'trial {count+1}')
    # Display trial
    display_single_trial(
        single_trial,
        confidence=confidence,
        true_value=true_sensitivity,
        distribution=trials,
    )
    count += 1

# Display last trial
clear_output()
print(f'Ran {count} trials')
display_single_trial(
    single_trial,
    confidence=confidence,
    true_value=true_sensitivity,
    distribution=trials,
)

## Questions
* Does the "model" we derived by running the code above actually have a different sensitivity?
* How might we see this occur in real life?