## 1. Introduction
This Jupyter Notebook performs a two-sample t-test to investigate whether the mean weight of on-time shipments is less than the mean weight of late shipments using the `late_shipments.feather` dataset.
### 1.1 Problem Statement
While trying to determine why some shipments are late, you may wonder if the weight of the shipments that were on time is less than the weight of the shipments that were late.
- **Null Hypothesis:** The mean weight of shipments that weren't late is the same as the mean weight of shipments that were late.
- **Alternative Hypothesis:** The mean weight of shipments that weren't late is **less than** the mean weight of shipments that were late.
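
In symbols, the two hypotheses can be written as follows. The notation $\mu_{\text{no}}$ and $\mu_{\text{yes}}$ for the population mean weights of on-time and late shipments is introduced here for convenience and does not appear elsewhere in the notebook.

$$
H_0: \mu_{\text{no}} - \mu_{\text{yes}} = 0
\qquad \text{versus} \qquad
H_A: \mu_{\text{no}} - \mu_{\text{yes}} < 0
$$

Because $H_A$ points in one direction only, the test is left-tailed.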

## 2. Import Libraries and Load Data
This step imports the necessary Python libraries, loads the dataset into a pandas DataFrame, and sets the significance level to 0.05.
- `pandas` is imported as `pd` for data manipulation and analysis.
- `numpy` is imported as `np` for numerical operations, particularly for calculating the square root.
- `scipy.stats.t` is imported to access functions related to the t-distribution, specifically for calculating the p-value.
- `alpha` is set to 0.05. This is the chosen significance level, which will be used to compare against the p-value to decide whether to reject the null hypothesis.


In [21]:
import pandas as pd
import numpy as np
from scipy.stats import t
# Set the significance level
alpha = 0.05
# Load the data
late_shipments = pd.read_feather('../../data/late_shipments.feather')

## 3. Group Data and Calculate Descriptive Statistics
This step groups the shipment data based on whether they were late or not and then calculates the mean, standard deviation, and count of 'weight_kilograms' for each group.
- `late_shipments.groupby('late')['weight_kilograms']` creates a grouped object (`grp_by_late`) that allows operations on `weight_kilograms` to be run separately for each value of `late` (`Yes` or `No`).
- `xbar = grp_by_late.mean()` calculates the mean weight for both the 'No' (on-time) and 'Yes' (late) groups.
- `numerator = xbar['No'] - xbar['Yes']` computes the difference between the mean weight of on-time shipments and late shipments. This forms the numerator of the t-statistic formula.
- `s = grp_by_late.std()` calculates the standard deviation of weights for both groups. `s_yes` and `s_no` store these values for late and on-time shipments, respectively.
- `count = grp_by_late.count()` calculates the number of observations (sample size) for each group. `count_yes` and `count_no` store these values.
- `denominator = np.sqrt(s_yes ** 2 / count_yes + s_no ** 2 / count_no)` calculates the **standard error** of the difference between the two sample means. This value is used as the denominator in the t-statistic formula.
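
Taken together, the numerator and denominator described above form the test statistic. Written out, with $\bar{x}$, $s$, and $n$ denoting each group's sample mean, sample standard deviation, and sample size (the subscripts are notation chosen here, matching the `No` and `Yes` groups):

$$
t = \frac{\bar{x}_{\text{no}} - \bar{x}_{\text{yes}}}
         {\sqrt{\dfrac{s_{\text{no}}^2}{n_{\text{no}}} + \dfrac{s_{\text{yes}}^2}{n_{\text{yes}}}}}
$$

The square-root term is the standard error of the difference in means, computed without pooling the two variances.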




In [24]:
# Group the data by 'late' column and select 'weight_kilograms'
grp_by_late = late_shipments.groupby('late')['weight_kilograms']

# Calculate the mean weight for each group
xbar = grp_by_late.mean()
numerator = xbar['No'] - xbar['Yes']
print(f"Difference in means (On-time - Late): {numerator}")

# Calculate the standard deviation for each group
s = grp_by_late.std()
s_yes = s['Yes'] # Standard deviation of late shipments
s_no = s['No']   # Standard deviation of on-time shipments

# Calculate the count (sample size) for each group
count = grp_by_late.count()
count_yes = count['Yes'] # Count of late shipments
count_no = count['No']   # Count of on-time shipments

# Calculate the denominator for the t-statistic
denominator = np.sqrt(s_yes ** 2 / count_yes + s_no ** 2 / count_no)
print(f"Standard Error of the Difference: {denominator}")


Difference in means (On-time - Late): -817.8808638418964
Standard Error of the Difference: 341.68543274794337
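
The same per-group statistics can also be gathered in a single call with `agg`. The sketch below is only an illustration: it runs on a small synthetic DataFrame (`demo`, with made-up weights) rather than on `late_shipments`, and the names `summary`, `diff_in_means`, and `std_error` are hypothetical.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for late_shipments (made-up values, not the real data)
rng = np.random.default_rng(0)
demo = pd.DataFrame({
    "late": ["No"] * 60 + ["Yes"] * 40,
    "weight_kilograms": np.concatenate([
        rng.normal(1800, 500, 60),  # hypothetical on-time weights
        rng.normal(2600, 700, 40),  # hypothetical late weights
    ]),
})

# One pass: mean, sample standard deviation (ddof=1), and count per group
summary = demo.groupby("late")["weight_kilograms"].agg(["mean", "std", "count"])

# Difference in means (on-time minus late) and its unpooled standard error
diff_in_means = summary.loc["No", "mean"] - summary.loc["Yes", "mean"]
std_error = np.sqrt((summary["std"] ** 2 / summary["count"]).sum())

print(summary)
print(f"Difference in means (On-time - Late): {diff_in_means}")
print(f"Standard Error of the Difference: {std_error}")
```

Keeping the three statistics in one table makes it easier to check that the `Yes` and `No` rows line up before they are combined.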


## 4. Calculate T-statistic and Degrees of Freedom
This step computes the t-statistic based on the calculated means, standard deviations, and sample sizes, and then determines the degrees of freedom for the t-distribution.
- `t_stat = numerator / denominator` calculates the t-statistic by dividing the difference in means (`numerator`) by the standard error of the difference (`denominator`).
- `df = count_yes + count_no - 2` calculates the degrees of freedom. The sum of the sample sizes minus 2 is exactly the degrees of freedom of the pooled (equal-variance) two-sample t-test; it is used here as a simple approximation, even though the standard error is computed without pooling the variances.


In [27]:
# Calculate the t-statistic
t_stat = numerator / denominator
print(f"T-statistic: {t_stat}")

# Calculate the degrees of freedom
# n1 + n2 - 2 is the df of the pooled (equal-variance) t-test; it is used here as a simple approximation.
df = count_yes + count_no - 2
print(f"Degrees of Freedom: {df}")

T-statistic: -2.3936661778766433
Degrees of Freedom: 998
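
When the standard error is computed without pooling the variances, as it is in this notebook, the Welch-Satterthwaite approximation is a commonly used alternative to `n1 + n2 - 2` for the degrees of freedom. The sketch below is not part of the original analysis: `welch_df` is a hypothetical helper, and the inputs are made-up numbers chosen only to show how the formula behaves.

```python
def welch_df(s1, n1, s2, n2):
    """Welch-Satterthwaite degrees of freedom for two independent samples.

    s1, s2: sample standard deviations; n1, n2: sample sizes.
    """
    v1 = s1 ** 2 / n1  # variance of the first sample mean
    v2 = s2 ** 2 / n2  # variance of the second sample mean
    return (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))


# Equal spreads and equal sizes: the result coincides with n1 + n2 - 2 = 18
print(welch_df(2.0, 10, 2.0, 10))

# Unequal spreads: the result falls below n1 + n2 - 2
print(welch_df(1.0, 10, 3.0, 10))
```

In the notebook's own variables, the call would be `welch_df(s_no, count_no, s_yes, count_yes)`. A smaller value gives a heavier-tailed t-distribution and therefore a slightly larger p-value for the same t-statistic.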


## 5. Calculate P-value
This step calculates the p-value using the cumulative distribution function (CDF) of the t-distribution.
- `p_value = t.cdf(t_stat, df=df)` calculates the p-value. Because the alternative hypothesis is left-tailed, the p-value is the probability, under the null hypothesis, of observing a t-statistic at or below the calculated `t_stat`. That probability is the value of the t-distribution's cumulative distribution function (`t.cdf`) at `t_stat`, using the degrees of freedom calculated in the previous step.

In [30]:
# Left-tailed p-value: P(T <= t_stat) under the null hypothesis
p_value = t.cdf(t_stat, df=df)
print(f"P-value: {p_value}")

P-value: 0.008432382146249525
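
As a sanity check on a manual calculation like this one, SciPy's `ttest_ind` can run a comparable test directly. The sketch below uses synthetic arrays (`on_time` and `late`, filled with made-up values) rather than the real dataset, and it assumes SciPy 1.6 or later, where the `alternative` argument is available. With `equal_var=False` the function performs Welch's t-test, which uses the same unpooled standard error, so the t-statistic should match the manual one, while the p-value can differ slightly because Welch's test uses different degrees of freedom.

```python
import numpy as np
from scipy.stats import t, ttest_ind

# Synthetic samples (made-up values, not the real shipment weights)
rng = np.random.default_rng(42)
on_time = rng.normal(1800, 500, 60)
late = rng.normal(2600, 700, 40)

# Manual calculation, following the same steps as the notebook
std_error = np.sqrt(on_time.var(ddof=1) / on_time.size + late.var(ddof=1) / late.size)
t_manual = (on_time.mean() - late.mean()) / std_error
p_manual = t.cdf(t_manual, df=on_time.size + late.size - 2)

# Welch's t-test, left-tailed: H_A is mean(on_time) < mean(late)
result = ttest_ind(on_time, late, equal_var=False, alternative="less")

print(f"Manual: t = {t_manual}, p = {p_manual}")
print(f"SciPy:  t = {result.statistic}, p = {result.pvalue}")
```

If the two t-statistics disagree, the usual culprits are the order of the two groups or a population standard deviation (`ddof=0`) used in place of the sample one.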


## 6. Make a decision and interpret results
This step compares the calculated p-value to the pre-defined significance level (alpha) to decide whether to reject the null hypothesis and then interprets the result in the context of the problem.
- If `p_value < alpha`: The output will be **Reject Null Hypothesis**. This means there is sufficient statistical evidence at the 5% significance level to conclude that the mean weight of shipments that were on time is less than the mean weight of shipments that were late.
- If `p_value >= alpha`: The output will be **Failed to reject null hypothesis**. This means there is not enough statistical evidence at the 5% significance level to conclude that the mean weight of shipments that were on time is less than the mean weight of shipments that were late.

In [33]:
if p_value < alpha:
    print("Reject Null Hypothesis")
else:
    print("Failed to reject null hypothesis")

Reject Null Hypothesis


## 7. Result Interpretation
In this run, the calculated p-value was approximately 0.0084. Since $0.0084 < 0.05$, the null hypothesis is rejected. There is statistically significant evidence, at the 5% significance level, that the mean weight of on-time shipments is less than the mean weight of late shipments.