# Hypothesis Testing in Python
This project explores the application of statistical hypothesis testing techniques using Python in a Jupyter Notebook environment. The goal is to analyze data and draw meaningful inferences through statistical methods.

**The project covers:**

- Z-tests: For comparing sample and population means under known variance.
- T-tests: For small sample sizes or unknown population variance.
- Chi-square tests: For categorical data analysis and independence testing.
- ANOVA tests: For comparing means across multiple groups.
- Confidence Intervals and Margin of Error: For estimating the precision and reliability of sample statistics.

In [1]:
# Import libraries
import pandas as pd
import numpy as np
from statsmodels.stats.weightstats import ztest
import math
import scipy.stats as stats
from scipy.stats import f

## Z-Test

### Question 1:
The average height of all residents in a city is 168 cm, with a population standard deviation of σ = 3.9 cm.
A doctor believes the mean is different. He measured the heights of 36 individuals and found the average height to be 169.5 cm.  
a) State the null and alternative hypotheses.  
b) At the 5% significance level (95% confidence), is there enough evidence to reject the null hypothesis?

In [2]:
# Two-tailed z-test

# given data
mu = 168
sigma = 3.9
n = 36
x_bar = 169.5
alpha = 0.05

z_test = np.round((x_bar - mu) / (sigma / math.sqrt(n)), 2)  # formula for Z-test statistic

z_critical = np.round(stats.norm.ppf(alpha / 2), 2)          # formula for z table critical value

# print results
print(f"-z critical: {z_critical}")
print(f"z critical: {-z_critical}")
print(f"z test: {z_test}")

if z_critical < z_test < -z_critical:   # z_critical is the negative (lower-tail) value
    print("Fail to reject the Null Hypothesis")
else:
    print("Reject the Null Hypothesis")

-z critical: -1.96
z critical: 1.96
z test: 2.31
Reject the Null Hypothesis


### Conclusion
Null Hypothesis (H₀): μ = 168 cm  
Alternative Hypothesis (H₁): μ ≠ 168 cm

Since the calculated z statistic (2.31) exceeds the upper critical value of 1.96, it falls in the rejection region outside ±1.96, and we reject the null hypothesis at the 5% significance level. There is sufficient evidence to conclude that the population mean height differs from 168 cm.
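The same decision can be reached through a p-value instead of a critical value. The sketch below is not part of the original notebook; it reuses the given figures and assumes the two-sided p-value is computed from the standard normal survival function, without rounding the statistic first.

```python
import math
from scipy import stats

# Given data from Question 1
mu, sigma, n, x_bar, alpha = 168, 3.9, 36, 169.5, 0.05

# z statistic, left unrounded
z = (x_bar - mu) / (sigma / math.sqrt(n))

# Two-sided p-value: probability of a statistic at least this extreme in either tail
p_value = 2 * stats.norm.sf(abs(z))

print(f"z = {z:.2f}, p-value = {p_value:.4f}")
print("Reject H0" if p_value < alpha else "Fail to reject H0")
```

A p-value below alpha leads to the same rejection as comparing the statistic with ±1.96.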

### Question 2:
A factory manufactures bulbs with an average warranty of 5 years and a standard deviation of 0.50 years. A worker believes the light bulbs will malfunction in less than 5 years. He tests a sample of 40 bulbs and finds the average time to be 4.8 years.  
a) State the null and alternative hypotheses.  
b) At a 2% significance level, is there enough evidence to support the idea that the warranty should be revised?

In [3]:
# left-tailed z-test

# Given data
mu = 5          # Population mean
sigma = 0.5     # Population standard deviation
x_bar = 4.8     # Sample mean
n = 40          # Sample size
alpha = 0.02    # Significance level

z_critical = np.round(stats.norm.ppf(alpha), 2)  # find critical value for left tail

z_test = np.round((x_bar - mu) / (sigma / math.sqrt(n)), 2)  # formula for Z-test statistic

# print results
print(f"z critical: {z_critical}")
print(f"z test: {z_test}")

if z_test < z_critical:
    print("reject the null hypothesis")
else:
    print("fail to reject the null hypothesis")

z critical: -2.05
z test: -2.53
reject the null hypothesis


### Conclusion
Hypotheses:
- Null Hypothesis (H₀): The average warranty time of the bulbs is 5 years.  
H0: μ = 5  

- Alternative Hypothesis (H₁): The average warranty time of the bulbs is less than 5 years.  
H1 : μ < 5  

Decision:
- The calculated z-test value of -2.53 is less than the critical z-value of -2.05.
- Since -2.53 falls in the rejection region (i.e., less than -2.05), we reject the null hypothesis.

At the 2% significance level, there is enough evidence to support the worker's belief that the bulbs malfunction in less than 5 years. Therefore, the warranty may need to be revised.
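The rejection region can also be expressed in the original units (years), which some readers find easier to interpret. This is a minimal sketch, not part of the original notebook; `x_critical` is a name introduced here for the smallest sample mean that would not lead to rejection.

```python
import math
from scipy import stats

# Given data from Question 2
mu, sigma, n, x_bar, alpha = 5, 0.5, 40, 4.8, 0.02

se = sigma / math.sqrt(n)                     # standard error of the mean

# Critical sample mean: any sample mean below this value rejects H0 (left tail)
x_critical = mu + stats.norm.ppf(alpha) * se

# Left-tailed p-value for the observed sample mean
p_value = stats.norm.cdf((x_bar - mu) / se)

print(f"critical sample mean: {x_critical:.4f} years")
print(f"p-value: {p_value:.4f}")
print("Reject H0" if x_bar < x_critical else "Fail to reject H0")
```

Comparing the sample mean with `x_critical` is equivalent to comparing the z statistic with the critical z-value.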

## T-Test

### Question
In the population, the average IQ is 100. A team of researchers wants to test a new medication to see if it has either a positive or negative effect on intelligence or no effect at all. A sample of 30 participants who have taken the medication has a mean IQ of 140 with a standard deviation of 20. Did the medication affect intelligence? Test at the 5% significance level (95% confidence).

In [4]:
# Two-tailed t-test

# Given data
mu = 100
s = 20
n = 30
x_bar = 140
alpha = 0.05
df = n-1

t_critical = np.round(stats.t.ppf(alpha / 2, df), 3)           # lower-tail t critical value (negative)

t_test = np.round((x_bar - mu) / (s / np.sqrt(n)), 2)          # t-test formula

# print results
print(f"-t critical: {t_critical}")
print(f"t critical: {-t_critical}")
print(f"t-test: {t_test}")

if (t_test > -t_critical) or (t_test < t_critical):
    print("Reject the Null Hypothesis")
else:
    print("Fail to reject the Null Hypothesis")

-t critical: -2.045
t critical: 2.045
t-test: 10.95
Reject the Null Hypothesis


### Conclusion
Based on the results of the hypothesis test, the medication appears to have a significant effect on intelligence. The calculated t-statistic of 10.95 exceeds the upper critical value of 2.045 (t distribution with 29 degrees of freedom), so we reject the null hypothesis that the mean IQ is 100. The sample mean of 140 is well above the population average of 100, so the effect is in the positive direction. Therefore, we conclude that the medication likely has a positive effect on intelligence.
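As a cross-check, the t statistic, critical value, and two-sided p-value can be computed together from the summary statistics. This sketch is an addition to the notebook and uses only `scipy.stats.t`; the raw IQ scores are not available, so a function that needs raw data (such as `ttest_1samp`) is not used here.

```python
import math
from scipy import stats

# Given summary statistics
mu, s, n, x_bar, alpha = 100, 20, 30, 140, 0.05
df = n - 1                                        # degrees of freedom

t_stat = (x_bar - mu) / (s / math.sqrt(n))        # one-sample t statistic
t_crit = stats.t.ppf(1 - alpha / 2, df)           # upper two-tailed critical value
p_value = 2 * stats.t.sf(abs(t_stat), df)         # two-sided p-value

print(f"t = {t_stat:.2f}, critical = ±{t_crit:.3f}, p-value = {p_value:.2e}")
print("Reject H0" if abs(t_stat) > t_crit else "Fail to reject H0")
```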

## Confidence Interval and Margin of Error

### Question
On the verbal section of the CAT exam, the population standard deviation is known to be 100. A sample of 25 test takers has a mean score of 520. Construct the 95% confidence interval around the mean.

In [5]:
# Given data
sigma = 100     # Population standard deviation
x_bar = 520     # Sample mean
n = 25          # Sample size
alpha = 0.05    # Significance level
ci = 0.95       # Confidence level (1 - alpha)

z_critical = np.round(stats.norm.ppf(1 - alpha / 2), 2)                    # z critical formula

se = sigma / np.sqrt(n)                                                    # Standard error (se) formula

MoE = z_critical * se                                                      # Margin of error (MoE) formula

lower_ci = np.round(x_bar - MoE, 2)                                        # Lower-bound CI formula
upper_ci = np.round(x_bar + MoE, 2)                                        # Upper-bound CI formula

# print results
print(f"Standard Error: {se}")
print(f"Margin of Error: {MoE}")
print(f"Lower-bound CI: {lower_ci}")
print(f"Upper-bound CI: {upper_ci}")

Standard Error: 20.0
Margin of Error: 39.2
Lower-bound CI: 480.8
Upper-bound CI: 559.2


### Answer
We can be 95% confident that the population mean score on the CAT verbal section lies between 480.8 and 559.2 (520 ± 39.2).
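SciPy can produce the same interval directly. The sketch below is an addition, assuming `stats.norm.interval` with the sample mean as `loc` and the standard error as `scale`; the confidence level is passed positionally because the keyword name differs between SciPy versions.

```python
import math
from scipy import stats

# Given data
sigma, n, x_bar = 100, 25, 520

se = sigma / math.sqrt(n)                          # standard error of the mean

# 95% interval of a normal distribution centred on the sample mean
lower, upper = stats.norm.interval(0.95, loc=x_bar, scale=se)

print(f"95% CI: ({lower:.2f}, {upper:.2f})")
```

The bounds agree with the manual calculation to one decimal place; small differences beyond that come from rounding the critical value to 1.96 in the manual version.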

## Chi-Square Test

### Question
In the 2010 census of a small city, the weights of individuals were distributed as follows:  
<50=100, 50-75=150, >75=250  

In 2020, the weights of 500 individuals were sampled. Below are the results:  
<50=140, 50-75=160, >75=200  

Using a significance level of 0.05, would you conclude that the population's weight distribution has changed in the last 10 years?

In [6]:
exp_freq = pd.DataFrame({'<50': [100],
                        '50-75': [150],
                        '>75': [250]})

obs_freq = pd.DataFrame({'<50': [140],
                        '50-75': [160],
                        '>75': [200]})

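alpha = 0.05
df = exp_freq.shape[1] - 1                                   # degrees of freedom = categories - 1

chi_critical = np.round(stats.chi2.ppf(1 - alpha, df), 3)    # right-tail critical value

# chi-square statistic: sum of (observed - expected)^2 / expected
chi_square_test = round((((obs_freq - exp_freq)**2) / exp_freq).values.sum(), 2)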

print(f"chi critical: {chi_critical}")
print(f"chi-square: {chi_square_test}")

if chi_square_test > chi_critical:
    print("Reject the Null Hypothesis")
else:
    print("Fail to reject the Null Hypothesis")

chi critical: 5.991
chi-square: 26.67
Reject the Null Hypothesis


### Conclusion
Hypotheses:
- The null hypothesis (H₀) states that the weight distribution in 2020 is the same as in 2010.
- The alternative hypothesis (H₁) states that the weight distribution has changed between the two years.

The chi-square statistic calculated is 26.67, which is greater than the critical value of 5.991 (2 degrees of freedom).

Decision Rule: If the chi-square statistic is greater than the critical value (5.991), we reject the null hypothesis.

Since 26.67 > 5.991, we reject the null hypothesis at the 0.05 significance level.

There is sufficient evidence to conclude that the population's weight distribution has significantly changed in the last 10 years (from 2010 to 2020).
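The goodness-of-fit test above can be cross-checked with `scipy.stats.chisquare`, which also returns a p-value. This is a minimal sketch added for illustration, using the 2010 counts as expected frequencies and the 2020 sample as observed frequencies (both sum to 500).

```python
from scipy import stats

# Counts per weight category: <50, 50-75, >75
observed = [140, 160, 200]   # 2020 sample
expected = [100, 150, 250]   # 2010 census

result = stats.chisquare(f_obs=observed, f_exp=expected)
chi2_stat, p_value = result.statistic, result.pvalue

print(f"chi-square = {chi2_stat:.2f}, p-value = {p_value:.2e}")
print("Reject H0" if p_value < 0.05 else "Fail to reject H0")
```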

## ANOVA Test

### Question 1:
Doctors want to test a new medication that reduces headaches. They split the participants into 3 conditions [15mg, 30mg, 45mg]. Afterwards, the doctors ask each patient to rate their headache on a scale of [1-10]. Are there any differences between the 3 conditions using alpha=0.05?

In [7]:
# Data
df = pd.DataFrame({'15mg': [9, 8, 7, 8, 8, 9, 8],
                   '30mg': [7, 6, 6, 7, 8, 7, 6],
                   '45mg': [4, 3, 2, 3, 4, 3, 2]})

# Parameters
alpha = 0.05
a = 3
n = 7
N = a * n
dfbet = a - 1
dfwit = N - a

# Sum of Squares
ssbet = round((df.sum()**2).sum() / n - df.values.sum()**2 / N, 2)
sswit = round((df**2).sum().sum() - (df.sum()**2).sum() / n, 2)

# Mean Squares
msbet = round(ssbet / dfbet, 2)
mswit = round(sswit / dfwit, 2)

# F-statistic
f_stat = round(msbet / mswit, 2)

# F-critical value
f_critical = round(f.ppf(1 - alpha, dfbet, dfwit), 4)

# Decision
print(f"f_critical: {f_critical}")
print(f"f: {f_stat}")
if f_stat < f_critical:
    print("Fail to reject the Null Hypothesis")
else:
    print("Reject Null Hypothesis. Yes, there is a difference, and I am 95% confident.")

f_critical: 3.5546
f: 86.56
Reject Null Hypothesis. Yes, there is a difference, and I am 95% confident.


### Conclusion

Based on the ANOVA results, the F-statistic **(F = 86.56)** is much greater than the critical value **(Fcritical = 3.5546)** at **α = 0.05**.
Therefore, we reject the null hypothesis and conclude that there is a statistically significant difference in mean headache ratings among the three dosage groups (15mg, 30mg, 45mg).
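The manual computation can be checked against `scipy.stats.f_oneway`. The sketch below is an addition to the notebook. Because it does not round the sums of squares or mean squares along the way, its F value (about 86.33) differs slightly from the 86.56 obtained above; the conclusion is unchanged.

```python
from scipy import stats

# Headache ratings per dosage group (same data as the cell above)
dose_15mg = [9, 8, 7, 8, 8, 9, 8]
dose_30mg = [7, 6, 6, 7, 8, 7, 6]
dose_45mg = [4, 3, 2, 3, 4, 3, 2]

# One-way ANOVA without intermediate rounding
f_stat, p_value = stats.f_oneway(dose_15mg, dose_30mg, dose_45mg)

print(f"F = {f_stat:.2f}, p-value = {p_value:.2e}")
print("Reject H0" if p_value < 0.05 else "Fail to reject H0")
```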

### Question 2:
A nutritionist wants to test the effects of three different diet plans on weight loss. The participants are split into three groups, each following a specific diet plan: Diet A, Diet B, and Diet C. After 4 weeks, the nutritionist records the weight loss (in kg) for each participant:

- **Diet A**: [4, 5, 3, 6, 5, 4, 5]
- **Diet B**: [7, 6, 5, 6, 7, 8, 6]
- **Diet C**: [10, 11, 9, 10, 10, 11, 12]

Using an alpha level of 0.05, determine if there is a significant difference in the mean weight loss among the three diet plans.

In [8]:
# Data
df = pd.DataFrame({'Diet A': [4, 5, 3, 6, 5, 4, 5],
                   'Diet B': [7, 6, 5, 6, 7, 8, 6],
                   'Diet C': [10, 11, 9, 10, 10, 11, 12]})

# Parameters
alpha = 0.05
a = 3
n = 7
N = a * n
dfbet = a - 1
dfwit = N - a

# Sum of Squares
ssbet = round((df.sum()**2).sum() / n - df.values.sum()**2 / N, 2)
sswit = round((df**2).sum().sum() - (df.sum()**2).sum() / n, 2)

# Mean Squares
msbet = round(ssbet / dfbet, 2)
mswit = round(sswit / dfwit, 2)

# F-statistic
f_stat = round(msbet / mswit, 2)

# F-critical value
f_critical = round(f.ppf(1 - alpha, dfbet, dfwit), 4)

# Decision
print(f"f_critical: {f_critical}")
print(f"f: {f_stat}")
if f_stat < f_critical:
    print("Fail to reject the Null Hypothesis")
else:
    print("Reject Null Hypothesis. Yes, there is a difference, and I am 95% confident.")

f_critical: 3.5546
f: 66.02
Reject Null Hypothesis. Yes, there is a difference, and I am 95% confident.


### Conclusion
Based on the ANOVA results, the F-statistic **(F = 66.02)** is much greater than the critical value **(Fcritical = 3.5546)** at **α = 0.05**.
Therefore, we reject the null hypothesis. 
This indicates that there is a statistically significant difference in the mean weight loss among the three diet plans (Diet A, Diet B, and Diet C).
The nutritionist can conclude that at least one diet plan differs from the others in mean weight loss; the ANOVA alone does not identify which plans differ.
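A significant F-statistic does not say which pairs of diets differ. One common follow-up, which is not part of the original notebook, is Tukey's HSD test. The sketch below assumes a SciPy release that provides `scipy.stats.tukey_hsd` (added in SciPy 1.8, to my knowledge); the `labels` list is introduced here only for printing.

```python
from scipy import stats

# Weight loss (kg) per diet plan (same data as the cell above)
diet_a = [4, 5, 3, 6, 5, 4, 5]
diet_b = [7, 6, 5, 6, 7, 8, 6]
diet_c = [10, 11, 9, 10, 10, 11, 12]

# Pairwise comparison of group means with family-wise error control
res = stats.tukey_hsd(diet_a, diet_b, diet_c)

labels = ["Diet A", "Diet B", "Diet C"]
for i in range(len(labels)):
    for j in range(i + 1, len(labels)):
        print(f"{labels[i]} vs {labels[j]}: "
              f"mean diff = {res.statistic[i, j]:.2f}, p = {res.pvalue[i, j]:.4f}")
```

Each pair with a p-value below 0.05 can be reported as significantly different.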

### Question 3:
An e-commerce company wants to test whether three different marketing strategies have significantly different effects on the number of daily website visits. The company runs a 2-week campaign with the following strategies:
- **Strategy A**: Social media ads
- **Strategy B**: Email marketing
- **Strategy C**: Search engine optimization (SEO)

Using an alpha level of 0.05, determine if there is a statistically significant difference in the mean daily visits among the three marketing strategies.

In [9]:
# Data
df = pd.DataFrame({'Strategy A': [120, 130, 125, 128, 135, 140, 132],
                   'Strategy B': [150, 155, 145, 152, 158, 160, 148],
                   'Strategy C': [180, 190, 185, 188, 192, 195, 200]})

# Parameters
alpha = 0.05
a = 3
n = 7
N = a * n
dfbet = a - 1
dfwit = N - a

# Sum of Squares
ssbet = round((df.sum()**2).sum() / n - df.values.sum()**2 / N, 2)
sswit = round((df**2).sum().sum() - (df.sum()**2).sum() / n, 2)

# Mean Squares
msbet = round(ssbet / dfbet, 2)
mswit = round(sswit / dfwit, 2)

# F-statistic
f_stat = round(msbet / mswit, 2)

# F-critical value
f_critical = round(f.ppf(1 - alpha, dfbet, dfwit), 4)

# Decision
print(f"f_critical: {f_critical}")
print(f"f: {f_stat}")
if f_stat < f_critical:
    print("Fail to reject the Null Hypothesis")
else:
    print("Reject Null Hypothesis. Yes, there is a difference, and I am 95% confident.")

f_critical: 3.5546
f: 167.28
Reject Null Hypothesis. Yes, there is a difference, and I am 95% confident.


### Conclusion
For the e-commerce study, the F-statistic **(F = 167.28)** is also much greater than the critical value **(Fcritical = 3.5546)** at **α = 0.05**. Thus, we reject the null hypothesis. This suggests that there is a statistically significant difference in the mean number of daily website visits among the three marketing strategies (Strategy A: Social Media Ads, Strategy B: Email Marketing, Strategy C: SEO). The e-commerce company can conclude that the marketing strategies differ significantly in their effectiveness.
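The three ANOVA cells repeat the same steps, so the computation could be wrapped in a single function. The helper below, `one_way_anova`, is a hypothetical name introduced for this sketch and is not part of the original notebook. It uses the deviation form of the sums of squares and does not round intermediate values, so its F value can differ in the second decimal from the cells above.

```python
import numpy as np
from scipy import stats

def one_way_anova(groups, alpha=0.05):
    """Hypothetical helper: one-way ANOVA for a collection of samples."""
    groups = [np.asarray(g, dtype=float) for g in groups]
    a = len(groups)                                  # number of groups
    N = sum(len(g) for g in groups)                  # total observations
    grand_mean = np.concatenate(groups).mean()

    # Between-group and within-group sums of squares
    ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
    ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)

    df_between, df_within = a - 1, N - a
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    f_critical = stats.f.ppf(1 - alpha, df_between, df_within)
    p_value = stats.f.sf(f_stat, df_between, df_within)
    return f_stat, f_critical, p_value

# Daily visits per marketing strategy (same data as Question 3)
visits = {
    "Strategy A": [120, 130, 125, 128, 135, 140, 132],
    "Strategy B": [150, 155, 145, 152, 158, 160, 148],
    "Strategy C": [180, 190, 185, 188, 192, 195, 200],
}

f_stat, f_critical, p_value = one_way_anova(visits.values())
print(f"F = {f_stat:.2f}, F critical = {f_critical:.4f}, p-value = {p_value:.2e}")
print("Reject H0" if f_stat > f_critical else "Fail to reject H0")
```

The same function can be called with the headache or diet data, and it also handles groups of unequal size.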