# Hypothesis Testing for Population Mean

* The goal is to evaluate the validity of a claim about a population, for example:
1. Testing whether a vaccine is effective
2. Testing whether frogs at higher latitudes have higher infection rates

We have a pair of hypotheses that are mutually exclusive and exhaustive:


1. Null Hypothesis $(H_0)$
    - The hypothesis we attempt to disprove
    - Must always include an equality sign ($=$, $\leq$, or $\geq$), because it must allow us to generate a single probability distribution
    - $H_0: \mu = 100$
    - $H_0: \mu \leq 100$
    
2. Alternative Hypothesis $(H_a)$
    - Opposite of $H_0$
    - The claim we hope to find evidence for
    - $H_a: \mu \ne 100$
    - $H_a: \mu \gt 100$

The goal is not to prove $H_a$ but to disprove $H_0$.

We either reject $H_0$ or fail to reject $H_0$.

## Steps
1. Assume $H_0$ is true
2. Collect sample data
3. Ask: what is the chance of obtaining data at least this extreme if the null hypothesis is true?
4. If the chance is small, reject $H_0$ at the chosen significance level
5. If the chance is not small, fail to reject $H_0$ (this is not the same as accepting $H_0$)
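
The steps above can be sketched in a few lines for a lower-tailed z-test. This is a minimal illustration, assuming the standard deviation can be treated as known and using only the standard library's `statistics.NormalDist`. The helper name `z_test_lower_tail` and the default significance level of 0.05 are assumptions made for this example; they are not part of the `SingleMean` class used in the cells below.

```python
from math import sqrt
from statistics import NormalDist
from typing import Tuple


def z_test_lower_tail(claimed_mean: float, sample_mean: float,
                      sample_sd: float, sample_size: int,
                      significance_level: float = 0.05) -> Tuple[float, float, bool]:
    """Test H0: mu >= claimed_mean against Ha: mu < claimed_mean (hypothetical helper)."""
    # Step 1: assume H0 is true, so the sampling distribution is centred on claimed_mean.
    standard_error = sample_sd / sqrt(sample_size)
    # Step 2: summarise the collected sample as a standardised test statistic.
    z = (sample_mean - claimed_mean) / standard_error
    # Step 3: chance of a statistic at least this low if H0 is true (the p-value).
    p_value = NormalDist().cdf(z)
    # Steps 4-5: reject H0 only when that chance is below the significance level.
    return z, p_value, p_value < significance_level


z, p_value, reject = z_test_lower_tail(claimed_mean=75, sample_mean=68,
                                       sample_sd=10, sample_size=16)
print(f"{z=:.2f} {p_value=:.4f} {reject=}")
```

Here the p-value plays the role of the "chance" in step 3; comparing the test statistic with a critical value, as the notebook function does, leads to the same decision.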


In [18]:
from ConfidenceInterval.SingleMean import SingleMean
from typing import Tuple

In [24]:
def is_in_confidence_interval(confidence_interval: Tuple[float, float], value: float) -> bool:
    lower, upper = confidence_interval
    return lower < value < upper
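
For a two-tailed test, rejecting $H_0$ at significance level $\alpha$ is equivalent to the claimed mean falling outside the $(1 - \alpha)$ confidence interval, which is what this helper checks. The sketch below illustrates the idea with a z-based interval built from the standard library. `z_confidence_interval` is a hypothetical stand-in for `SingleMean.confidence_interval`, whose implementation is not shown here.

```python
from math import sqrt
from statistics import NormalDist
from typing import Tuple


def is_in_confidence_interval(confidence_interval: Tuple[float, float], value: float) -> bool:
    lower, upper = confidence_interval
    return lower < value < upper


def z_confidence_interval(sample_mean: float, sample_sd: float,
                          sample_size: int, confidence_level: float) -> Tuple[float, float]:
    """Two-sided z interval (hypothetical stand-in for SingleMean.confidence_interval)."""
    # Critical value that leaves (1 - confidence_level) / 2 in each tail.
    z_star = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    margin = z_star * sample_sd / sqrt(sample_size)
    return sample_mean - margin, sample_mean + margin


interval = z_confidence_interval(sample_mean=130.1, sample_sd=21.21,
                                 sample_size=100, confidence_level=0.95)
# H0: mu = 120 is rejected when 120 lies outside the interval.
reject_h0 = not is_in_confidence_interval(interval, 120)
print(interval, reject_h0)
```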

In [50]:
def hypothesis_testing_for_mean(
    claimed_mean: float,
    sample_mean: float,
    sample_variance: float,
    confidence_level: float,
    sample_size: int,
    tails: str,
    score: str
) -> bool:
    single_mean = SingleMean(mean=sample_mean, 
                             variance=sample_variance,
                             sample_size=sample_size,
                             score=score, 
                             confidence_level=confidence_level,
                             tails=tails)
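    # Alternative approach: test using the confidence interval
    print(f"{single_mean.confidence_interval=}")
    print(f"{single_mean.critical_value=}")
    # return not is_in_confidence_interval(single_mean.confidence_interval, claimed_mean)

    # Standardized distance between the sample mean and the claimed mean
    test_statistic = (sample_mean - claimed_mean) / single_mean.standard_error
    # abs() keeps the comparisons below valid whichever sign convention
    # SingleMean uses for one-tailed critical values (not shown in this notebook)
    test_star = abs(single_mean.critical_value)

    print(f"{test_statistic=}")
    # `tails` is taken to name the relation in H0; return True when H0 is rejected
    if tails == "=":  # H0: mu = claimed_mean (two-tailed)
        return abs(test_statistic) > test_star
    elif tails == ">=":  # H0: mu >= claimed_mean (reject in the lower tail)
        return test_statistic < -test_star
    # H0: mu <= claimed_mean (reject in the upper tail)
    return test_statistic > test_star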
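
For reference, the standard textbook form of the test statistic and rejection rules that a function like this is meant to implement is given below. The symbols $\bar{x}$ (sample mean), $\mu_0$ (claimed mean), $s$ (sample standard deviation), $n$ (sample size), and $t^*$ (a positive critical value) are notation introduced here, and the mapping of the `tails` strings to $H_0$ is inferred from the example calls rather than from `SingleMean` documentation.

$$
t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}
$$

$$
\text{reject } H_0 \text{ when }
\begin{cases}
|t| > t^*_{\alpha/2} & \text{if } H_0: \mu = \mu_0 \\
t < -t^*_{\alpha} & \text{if } H_0: \mu \geq \mu_0 \\
t > t^*_{\alpha} & \text{if } H_0: \mu \leq \mu_0
\end{cases}
$$

For example, with $\bar{x} = 68$, $\mu_0 = 75$, $s = 10$, and $n = 16$:

$$
t = \frac{68 - 75}{10 / \sqrt{16}} = \frac{-7}{2.5} = -2.8
$$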
        
    

In [33]:
hypothesis_testing_for_mean(claimed_mean=120, 
                            sample_mean=130.1,
                            sample_variance=21.21**2,
                            confidence_level=0.95,
                            sample_size=100,
                            tails="=",
                            score="t")

test_statistic=4.761904761904758


True

In [35]:
hypothesis_testing_for_mean(claimed_mean=75, 
                            sample_mean=68,
                            sample_variance=10**2,
                            confidence_level=0.95,
                            sample_size=16,
                            tails=">=",
                            score="z")

test_statistic=-2.8


True

In [36]:
hypothesis_testing_for_mean(claimed_mean=65000, 
                            sample_mean=64000,
                            sample_variance=4000**2,
                            confidence_level=0.95,
                            sample_size=64,
                            tails=">=",
                            score="z")

test_statistic=-2.0


True

In [39]:
hypothesis_testing_for_mean(claimed_mean=60,
                            sample_mean=62.75,
                            sample_variance=10**2,
                            confidence_level=0.95,
                            sample_size=52,
                            tails="<=",
                            score="z")

single_mean.confidence_interval=(65.0310015740794, inf)
test_statistic=1.9830532015051943


True

In [51]:
hypothesis_testing_for_mean(claimed_mean=4,
                            sample_mean=4.3,
                            sample_variance=1.2**2,
                            confidence_level=0.9,
                            sample_size=9,
                            tails="=",
                            score="t")

single_mean.confidence_interval=(3.556180784990863, 5.043819215009137)
single_mean.critical_value=1.8595480375228424
test_statistic=0.7499999999999996


False
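
The numbers printed for this last example can be checked by hand. The sketch below recomputes the standard error, test statistic, and confidence interval using plain arithmetic; the critical value is copied from the printed output above rather than derived, and the variable names are chosen for this illustration.

```python
from math import sqrt

sample_mean, claimed_mean = 4.3, 4
sample_sd, sample_size = 1.2, 9
critical_value = 1.8595480375228424  # copied from the printed output above

# Standard error of the mean: s / sqrt(n)
standard_error = sample_sd / sqrt(sample_size)
# Standardized distance between the sample mean and the claimed mean
test_statistic = (sample_mean - claimed_mean) / standard_error
# Two-sided interval: sample mean plus or minus critical value times standard error
margin = critical_value * standard_error
confidence_interval = (sample_mean - margin, sample_mean + margin)
# Two-tailed decision: reject only if the statistic is beyond the critical value
reject = abs(test_statistic) > critical_value

print(f"{standard_error=:.4f} {test_statistic=:.4f}")
print(f"{confidence_interval=}")
print(f"{reject=}")
```

The test statistic of about 0.75 is well inside the critical value of about 1.86, and the claimed mean of 4 lies inside the interval, so both approaches agree that $H_0$ is not rejected.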