# Hypothesis Testing in Python

## Chapter 1: Introduction to Hypothesis Testing

In [10]:
import numpy as np
import pandas as pd
from scipy.stats import norm

In [4]:
late_shipments = pd.read_feather("late_shipments.feather")
late_shipments

Unnamed: 0,id,country,managed_by,fulfill_via,vendor_inco_term,shipment_mode,late_delivery,late,product_group,sub_classification,...,line_item_quantity,line_item_value,pack_price,unit_price,manufacturing_site,first_line_designation,weight_kilograms,freight_cost_usd,freight_cost_groups,line_item_insurance_usd
0,36203.0,Nigeria,PMO - US,Direct Drop,EXW,Air,1.0,Yes,HRDT,HIV test,...,2996.0,266644.00,89.00,0.89,"Alere Medical Co., Ltd.",Yes,1426.0,33279.83,expensive,373.83
1,30998.0,Botswana,PMO - US,Direct Drop,EXW,Air,0.0,No,HRDT,HIV test,...,25.0,800.00,32.00,1.60,"Trinity Biotech, Plc",Yes,10.0,559.89,reasonable,1.72
2,69871.0,Vietnam,PMO - US,Direct Drop,EXW,Air,0.0,No,ARV,Adult,...,22925.0,110040.00,4.80,0.08,Hetero Unit III Hyderabad IN,Yes,3723.0,19056.13,expensive,181.57
3,17648.0,South Africa,PMO - US,Direct Drop,DDP,Ocean,0.0,No,ARV,Adult,...,152535.0,361507.95,2.37,0.04,"Aurobindo Unit III, India",Yes,7698.0,11372.23,expensive,779.41
4,5647.0,Uganda,PMO - US,Direct Drop,EXW,Air,0.0,No,HRDT,HIV test - Ancillary,...,850.0,8.50,0.01,0.00,Inverness Japan,Yes,56.0,360.00,reasonable,0.01
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
995,13608.0,Uganda,PMO - US,Direct Drop,DDP,Air,0.0,No,ARV,Adult,...,121.0,9075.00,75.00,0.62,"Janssen-Cilag, Latina, IT",Yes,43.0,199.00,reasonable,12.72
996,80394.0,"Congo, DRC",PMO - US,Direct Drop,EXW,Air,0.0,No,HRDT,HIV test,...,292.0,9344.00,32.00,1.60,"Trinity Biotech, Plc",Yes,99.0,2162.55,reasonable,13.10
997,61675.0,Zambia,PMO - US,Direct Drop,EXW,Air,1.0,Yes,HRDT,HIV test,...,2127.0,170160.00,80.00,0.80,"Alere Medical Co., Ltd.",Yes,881.0,14019.38,expensive,210.49
998,39182.0,South Africa,PMO - US,Direct Drop,DDP,Ocean,0.0,No,ARV,Adult,...,191011.0,861459.61,4.51,0.15,"Aurobindo Unit III, India",Yes,16234.0,14439.17,expensive,1421.41


In [11]:
# Calculate the proportion of late shipments
late_prop_samp = (late_shipments["late"]=="Yes").mean()

# Print the results
print(late_prop_samp)

0.061


In [6]:
# Build a bootstrap distribution of the late-shipment proportion
late_shipments_boot_distn = []
for i in range(5000):
    late_shipments_boot_distn.append(
        np.mean(
            late_shipments.sample(frac=1, replace=True)["late"] == "Yes"
        )
    )

In [7]:
# Hypothesize that the proportion is 6%
late_prop_hyp = 6/100

# Calculate the standard error
std_error = np.std(late_shipments_boot_distn, ddof=1)

# Find z-score of late_prop_samp
z_score = (late_prop_samp - late_prop_hyp)/std_error

# Print z_score
print(z_score)

0.13219357400759651


Hypothesis Testing
- Either $H_A$ or $H_0$ is true (not both)
- Initially, $H_0$ is assumed to be true
- The test ends in either "reject $H_0$" or "fail to reject $H_0$"

One-tailed and Two-tailed Tests
- Two-tailed: Alternative DIFFERENT from null
- Left-tailed: Alternative LESS than null
- Right-tailed: Alternative GREATER than null

**p-values**: the probability of obtaining a result at least as extreme as the one observed, assuming the null hypothesis is true

- Large p-value, little evidence against $H_0$
    - Statistic likely not in the tail of the null distribution
- Small p-value, strong evidence against $H_0$
    - Statistic likely in the tail of the null distribution
- Large p-value → fail to reject null hypothesis
- Small p-value → reject null hypothesis

Calculating the p-value:
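- ```norm.cdf()``` is the normal CDF from ```scipy.stats``` (standard normal by default)
- Left-tailed test → use ```norm.cdf(z_score)```
- Right-tailed test → use ```1 - norm.cdf(z_score)```
- Two-tailed test → use ```2 * (1 - norm.cdf(abs(z_score)))```, which equals ```norm.cdf(-z_score) + 1 - norm.cdf(z_score)``` when ```z_score``` is non-negative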
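
The tail rules above can be sketched as one small helper. This is a minimal sketch, not part of the original notebook: the function name `p_value_from_z` and the `alternative` labels are assumptions made for illustration, and the z-score is the value printed earlier. The two-tailed branch takes the absolute value so that it also behaves for negative z-scores.

```python
from scipy.stats import norm


def p_value_from_z(z_score, alternative="two-sided"):
    """Return the p-value for a z-score under a standard normal null.

    `alternative` is a hypothetical label: "less" (left-tailed),
    "greater" (right-tailed), or "two-sided".
    """
    if alternative == "less":
        return norm.cdf(z_score)
    if alternative == "greater":
        return 1 - norm.cdf(z_score)
    if alternative == "two-sided":
        # Double the upper-tail area beyond |z|
        return 2 * (1 - norm.cdf(abs(z_score)))
    raise ValueError(f"unknown alternative: {alternative!r}")


# z-score of the sample proportion, as printed earlier in the notebook
z_score = 0.13219357400759651

for alt in ("less", "greater", "two-sided"):
    print(alt, p_value_from_z(z_score, alt))
```

The left-tailed and right-tailed p-values always sum to 1, which is a quick sanity check on the chosen tail.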

Types of Errors:
- Type 1: False positives
- Type 2: False negatives


|              | actual $H_0$                 | actual $H_A$                 |
|--------------|------------------------------|------------------------------|
| chosen $H_0$ | correct                      | false negative<br/> (Type 2) |
| chosen $H_A$ | false positive<br/> (Type 1) | correct                      |

In [8]:
# Calculate the p-value
p_value = 1 - norm.cdf(z_score, loc=0, scale=1)

# Print the p-value
print(p_value)

0.4474155918577094


In [9]:
# Calculate 95% confidence interval using quantile method
lower = np.quantile(late_shipments_boot_distn, 0.025)
upper = np.quantile(late_shipments_boot_distn, 0.975)

# Print the confidence interval
print((lower, upper))


(0.046, 0.076)


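When the confidence level of the interval equals one minus the significance level (for example, a 95% interval with a significance level of 0.05), and the hypothesized population parameter lies within the confidence interval, you should fail to reject the null hypothesis. Here the hypothesized proportion of 0.06 lies inside (0.046, 0.076).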
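
As a rough cross-check, the two decision routes can be compared using the values printed above. This sketch assumes a significance level of `alpha = 0.05`, which the notebook does not state explicitly but which pairs with the 95% interval. The p-value above came from a right-tailed test while the interval is two-sided, so the agreement is indicative rather than exact.

```python
# Assumed significance level (not stated in the notebook); pairs with a 95% CI
alpha = 0.05

# Values printed in the cells above
p_value = 0.4474155918577094
late_prop_hyp = 0.06
lower, upper = 0.046, 0.076

# Route 1: compare the p-value with the significance level
reject_by_p_value = p_value <= alpha

# Route 2: check whether the hypothesized value lies inside the interval
hyp_inside_ci = lower <= late_prop_hyp <= upper

print("Reject H0 by p-value:", reject_by_p_value)
print("Hypothesized value inside CI:", hyp_inside_ci)
```

Both routes point the same way: the p-value exceeds `alpha` and the hypothesized proportion sits inside the interval, so the null hypothesis is not rejected.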


- Type 1, False Positive: Rejecting the null hypothesis when in fact the null hypothesis is true
- Type 2, False Negative: Failing to reject the null hypothesis when in fact the null hypothesis is false
- No Error: Rejecting the null hypothesis when in fact the null hypothesis is false
- No Error: Failing to reject the null hypothesis when in fact the null hypothesis is true
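
The four outcomes can be written as a small lookup. This is an illustrative sketch only: the helper name `classify_outcome` and its arguments are hypothetical, and in practice the truth of the null hypothesis is unknown, so this is a way to reason about the cases rather than something to run on real test results.

```python
def classify_outcome(null_is_true, reject_null):
    """Label a test outcome given the (normally unknown) truth about H0."""
    if null_is_true and reject_null:
        return "Type 1 error (false positive)"
    if not null_is_true and not reject_null:
        return "Type 2 error (false negative)"
    return "No error"


# Enumerate all four combinations of truth and decision
for null_is_true in (True, False):
    for reject_null in (True, False):
        print(null_is_true, reject_null, classify_outcome(null_is_true, reject_null))
```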
