## Learning objectives

By the end of this train, you should be able to:
* Calculate a test statistic.
* Determine critical values based on a chosen significance level.
* Make a decision on whether to reject the null hypothesis.

## Exercises

The average weight of a species of butterfly is believed to be 150 mg, with a population standard deviation of 20 mg. You collect a sample of 30 butterflies and find their average weight to be 145 mg. Using a 5% significance level, test the null hypothesis that the population mean weight is 150 mg against the alternative hypothesis that it is not 150 mg. In other words, determine whether the observed sample mean deviates significantly from the presumed population mean.

In the exercises below, we will develop different functions, each responsible for a specific part of the hypothesis-testing process.

### Import libraries

In [30]:
from scipy.stats import norm

### Exercise 1

State the null and alternative hypotheses.

In [31]:
# Your solution here...
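# H0: mu = 150  (the population mean weight is 150 mg)
# H1: mu != 150 (the population mean weight is not 150 mg; two-sided test)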

### Exercise 2

Write a function that calculates the test statistic (z-value) based on the sample mean, population mean, population standard deviation, and sample size.

The test statistic can be calculated using the formula for a one-sample z-test:

> z = (x̄ - μ) / (σ / √n)

where:

- `x̄` is the sample mean.
- `μ` is the population mean under the null hypothesis.
- `σ` is the population standard deviation.
- `n` is the sample size.

In [32]:
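# Your solution here...
sample_mean = 145  # sample mean weight in mg
pop_mean = 150  # population mean under the null hypothesis, in mg
bf_std = 20  # population standard deviation in mg
sample_size = 30  # number of butterflies in the sample


def calculate_z_value(sample_mean, pop_mean, pop_std, sample_size):
    # Standard error of the mean: sigma / sqrt(n)
    standard_error = pop_std / (sample_size**0.5)
    # Distance of the sample mean from the hypothesised mean, in standard errors
    return (sample_mean - pop_mean) / standard_error

print("Z value: ", calculate_z_value(sample_mean, pop_mean, bf_std, sample_size))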

Z value:  -1.3693063937629153
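
As a hand check of the printed value, the one-sample z-test formula can be evaluated step by step with the numbers from the problem statement. The intermediate values here are rounded to three decimal places, so the last digits differ slightly from the code output.

```latex
$$
\begin{aligned}
\frac{\sigma}{\sqrt{n}} &= \frac{20}{\sqrt{30}} \approx 3.651 \\
z &= \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \frac{145 - 150}{3.651} \approx -1.369
\end{aligned}
$$
```

The sample mean therefore sits about 1.37 standard errors below the hypothesised mean.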


### Exercise 3

Write a function that returns the lower and upper critical z-values for a two-sided test at a specific significance level.

In [33]:
# Your solution here...
significance = 0.05

def calculate_critical_z_values(significance_level):
    # Calculate the critical z-value for a two-sided test
    # For a two-sided test, we divide the significance level by 2
    # The lower critical value is the quantile at alpha/2
    # The upper critical value is the quantile at 1 - alpha/2
    lower_critical = norm.ppf(significance_level / 2)
    upper_critical = norm.ppf(1 - significance_level / 2)
    
    return lower_critical, upper_critical

lower_z, upper_z = calculate_critical_z_values(significance)
print(f"Lower critical z-value: {lower_z}")
print(f"Upper critical z-value: {upper_z}")

Lower critical z-value: -1.9599639845400545
Upper critical z-value: 1.959963984540054
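
One way to sanity-check these critical values is to confirm that they are symmetric about zero and that the area between them equals 1 minus the significance level. The sketch below is an independent cross-check rather than part of the exercise solution: it assumes Python 3.8 or later and uses the standard library's `statistics.NormalDist` instead of `scipy.stats.norm`. The variable names are illustrative.

```python
from statistics import NormalDist

alpha = 0.05  # significance level used in this exercise
standard_normal = NormalDist()  # mean 0, standard deviation 1

# Critical values for a two-sided test: quantiles at alpha/2 and 1 - alpha/2
lower = standard_normal.inv_cdf(alpha / 2)
upper = standard_normal.inv_cdf(1 - alpha / 2)

# Probability mass between the critical values (the non-rejection region)
central_area = standard_normal.cdf(upper) - standard_normal.cdf(lower)

print(f"Lower: {lower:.4f}, Upper: {upper:.4f}")
print(f"Area between critical values: {central_area:.4f}")
```

If the area between the two values came out as anything other than 0.95, the significance level would not have been split evenly across the two tails.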


### Exercise 4

Write a function that decides whether to reject or fail to reject the null hypothesis, based on the test statistic and the critical values.

In [34]:
# Your solution here...
def decide_null_hypothesis(z, lower_z, upper_z):
    # Reject when the test statistic falls in either tail, beyond the critical values
    if z < lower_z or z > upper_z:
        return "Reject the null hypothesis"
    return "Fail to reject the null hypothesis"
# For the butterfly data, z (about -1.37) lies between the critical values
# (about -1.96 and 1.96), so we fail to reject the null hypothesis

### Exercise 5

Run the following code, which combines the functions created in the previous exercises to perform the hypothesis test on the given butterfly population data.

> What does our decision mean in the context of the butterfly population?

In [35]:
# Given data
sample_mean = 145  # sample mean weight in mg
population_mean = 150  # hypothesised population mean weight in mg
population_std = 20  # population standard deviation in mg
n = 30  # sample size
alpha = 0.05  # significance level

# Calculate the z-value
z = calculate_z_value(sample_mean, population_mean, population_std, n)

# Get critical z-values
lower_z, upper_z = calculate_critical_z_values(alpha)

# Make a decision
decision = decide_null_hypothesis(z, lower_z, upper_z)

print(f"Test Statistic (z-value): {z}")
print(f"Critical z-value (lower): {lower_z}, Upper: {upper_z}")
print(f"Decision: {decision}")

Test Statistic (z-value): -1.3693063937629153
Critical z-value (lower): -1.9599639845400545, Upper: 1.959963984540054
Decision: Fail to reject the null hypothesis
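
Failing to reject the null hypothesis means that, at the 5% significance level, the sample does not provide sufficient evidence that the mean butterfly weight differs from 150 mg. It does not prove that the mean is exactly 150 mg. The same conclusion can be cross-checked with a two-sided p-value, which is an equivalent way of framing the critical-value comparison. The sketch below is not part of the original exercises: it assumes Python 3.8 or later, uses the standard library's `statistics.NormalDist` rather than `scipy.stats.norm`, and its variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Values from the problem statement
sample_mean, pop_mean, pop_std, n, alpha = 145, 150, 20, 30, 0.05

# One-sample z statistic
z = (sample_mean - pop_mean) / (pop_std / sqrt(n))

# Two-sided p-value: probability of a statistic at least as extreme as |z| in either tail
p_value = 2 * NormalDist().cdf(-abs(z))

# p_value < alpha is equivalent to z falling outside the critical values
decision = "Reject" if p_value < alpha else "Fail to reject"
print(f"p-value: {p_value:.3f} -> {decision} the null hypothesis")
```

The p-value here is roughly 0.17, well above 0.05, which agrees with the critical-value approach used in the exercises.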
