**TOST (two one-sided tests, or equivalence test):**

TOST is an equivalence test based on the classical t-test, which is normally used to test the hypothesis of equality between two means. It requires two samples, a theoretical difference between their means, and a range within which the sample means are considered equivalent.

To run TOST, first set up an **equivalence interval** EI (usually +/- 5% or 10%) and **calculate the mean and standard deviation** of each of the two groups of data. Next, calculate the standard error with the formula *SE = sqrt(SD1^2/n1 + SD2^2/n2)*. Finally, calculate the lower and upper t values with *tlower = ((M1-M2)-(-EI))/SE* and *tupper = ((M1-M2)-(EI))/SE*, and **compare each with the critical point of the t distribution**: the groups are declared equivalent only if tlower is above the critical value and tupper is below its negative, i.e. both one-sided tests are significant.
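The steps above can be sketched by hand with NumPy and SciPy. This is only a minimal illustration on made-up data: the variable names, the absolute equivalence interval of 0.5, and the use of Welch-Satterthwaite degrees of freedom are assumptions, since the text does not say which degrees of freedom to use.

```python
import numpy as np
from scipy import stats

# Illustrative data (assumed for this sketch)
group1 = np.array([10.2, 9.8, 10.5, 10.1, 9.9])
group2 = np.array([10.1, 9.9, 10.4, 10.2, 9.8])
ei = 0.5  # half-width of the equivalence interval, in data units (assumed)

# Means, sample variances (SD^2) and sample sizes
m1, m2 = group1.mean(), group2.mean()
v1, v2 = group1.var(ddof=1), group2.var(ddof=1)
n1, n2 = len(group1), len(group2)

# SE = sqrt(SD1^2/n1 + SD2^2/n2)
se = np.sqrt(v1 / n1 + v2 / n2)

# tlower = ((M1-M2)-(-EI))/SE ; tupper = ((M1-M2)-(EI))/SE
t_lower = ((m1 - m2) - (-ei)) / se
t_upper = ((m1 - m2) - ei) / se

# Welch-Satterthwaite degrees of freedom (assumption: unequal variances)
df = (v1 / n1 + v2 / n2) ** 2 / (
    (v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1)
)

# One-sided p-values: upper tail for the lower bound, lower tail for the upper bound
p_lower = stats.t.sf(t_lower, df)
p_upper = stats.t.cdf(t_upper, df)

# The TOST p-value is the larger of the two one-sided p-values
p_tost = max(p_lower, p_upper)

print(f"SE = {se:.4f}, df = {df:.2f}")
print(f"t_lower = {t_lower:.4f}, p_lower = {p_lower:.4f}")
print(f"t_upper = {t_upper:.4f}, p_upper = {p_upper:.4f}")
print(f"TOST p-value = {p_tost:.4f}")
```

Taking the maximum of the two one-sided p-values is equivalent to requiring that both one-sided tests reject at the chosen level.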

**Python modules that can be used:**

1. statsmodels.stats.weightstats (2025)
2. scipy.stats

**Comparison:**

The traditional t-test and ANOVA test whether groups of data show a "significant difference" (p-value < 0.05). A non-significant result, however, does not show that the groups are the same, so a t-test or ANOVA alone is not enough to confirm that there is no meaningful difference; an equivalence test such as TOST is needed for that.
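The point can be illustrated with a small sketch. The data, the bound of 0.5, and the variable names are made up for illustration, and the one-sided tests assume SciPy 1.6 or later (for the `alternative` argument of `ttest_ind`). With noisy samples, the t-test fails to find a difference, yet TOST also fails to show equivalence.

```python
import numpy as np
from scipy import stats

# Two small, noisy samples whose means differ by 1.0 (illustrative data)
a = np.array([8.0, 12.0, 10.0, 14.0, 6.0])
b = np.array([9.0, 13.0, 11.0, 15.0, 7.0])
bound = 0.5  # equivalence bound in data units (assumed)

# Classical Welch t-test: H0 is "the means are equal"
p_ttest = stats.ttest_ind(a, b, equal_var=False).pvalue

# TOST as two one-sided Welch tests; shifting `a` by the bound moves the
# null value from 0 to -bound (lower test) or +bound (upper test)
p_low = stats.ttest_ind(a + bound, b, equal_var=False, alternative="greater").pvalue
p_high = stats.ttest_ind(a - bound, b, equal_var=False, alternative="less").pvalue
p_tost = max(p_low, p_high)

print(f"t-test p-value: {p_ttest:.4f}")  # above 0.05: no significant difference
print(f"TOST p-value:   {p_tost:.4f}")   # above 0.05: equivalence not shown either
```

Both p-values are above 0.05 here, so the data neither demonstrate a difference nor demonstrate equivalence.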

ref: https://www.xlstat.com/en/solutions/features/equivalence-test-tost

In [4]:
!pip install statsmodels



In [14]:
import numpy as np
from statsmodels.stats.weightstats import ttost_ind

# Assume two groups of data
group1 = np.array([10.2, 9.8, 10.5, 10.1, 9.9])
group2 = np.array([10.1, 9.9, 10.4, 10.2, 9.8])

# Set up the equivalence bounds (e.g. ±0.5)
low_eqbound = -0.5
high_eqbound = 0.5

# Run TOST with Welch's correction for unequal variances
p_value, (t_stat_low, p_value_low, df_low), (t_stat_high, p_value_high, df_high) = ttost_ind(
    group1, group2, low_eqbound, high_eqbound, usevar='unequal'
)

print("===== TOST results =====")
print(f"TOST p-value: {p_value:.4f}")
print(f"lower t value: {t_stat_low:.4f}, lower p-value: {p_value_low:.4f}, degrees of freedom: {df_low:.2f}")
print(f"upper t value: {t_stat_high:.4f}, upper p-value: {p_value_high:.4f}, degrees of freedom: {df_high:.2f}")


if p_value < 0.05:
    print("\n✅ The mean difference lies within the equivalence bounds; the two groups can be considered equivalent.")
else:
    print("\n❌ Equivalence could not be demonstrated at the 5% level; this does not by itself show that the groups differ.")

===== TOST results =====
TOST p-value: 0.0093
lower t value: 3.2004, lower p-value: 0.0065, degrees of freedom: 7.85
upper t value: -2.9542, upper p-value: 0.0093, degrees of freedom: 7.85

✅ The mean difference lies within the equivalence bounds; the two groups can be considered equivalent.


In [15]:
import numpy as np
import scipy.stats as stats
import pandas as pd

# Input the data
soft_group1_modulus = np.array([7.77, 8.23, 8.29])  
soft_group2_modulus = np.array([10.62, 8.1, 13.83, 11.41, 7.19, 13.36]) 

soft_group1_hardness = np.array([0.3095, 0.3166, 0.3671])
soft_group2_hardness = np.array([0.48, 0.32, 0.898, 0.71853, 0.302, 0.91668])

# Set the equivalence bound to ±10% of the pooled mean of both groups
soft_modulus_mean = np.mean(np.concatenate([soft_group1_modulus, soft_group2_modulus]))
soft_hardness_mean = np.mean(np.concatenate([soft_group1_hardness, soft_group2_hardness]))

soft_modulus_eq_bound = 0.1 * soft_modulus_mean
soft_hardness_eq_bound = 0.1 * soft_hardness_mean

# Welch's two-sample t-test (unequal variances)
t_stat_soft_modulus, p_value_soft_modulus = stats.ttest_ind(soft_group1_modulus, soft_group2_modulus, equal_var=False)
t_stat_soft_hardness, p_value_soft_hardness = stats.ttest_ind(soft_group1_hardness, soft_group2_hardness, equal_var=False)

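# TOST, approximated with one-sample t-tests of (group 1 - mean of group 2) against each bound.
# Note: ttest_1samp returns two-sided p-values by default and treats the group-2 mean
# as a fixed constant, so these are not the one-sided two-sample TOST p-values.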
tost_p_value_soft_modulus_low = stats.ttest_1samp(soft_group1_modulus - np.mean(soft_group2_modulus), -soft_modulus_eq_bound)[1]
tost_p_value_soft_modulus_high = stats.ttest_1samp(soft_group1_modulus - np.mean(soft_group2_modulus), soft_modulus_eq_bound)[1]

tost_p_value_soft_hardness_low = stats.ttest_1samp(soft_group1_hardness - np.mean(soft_group2_hardness), -soft_hardness_eq_bound)[1]
tost_p_value_soft_hardness_high = stats.ttest_1samp(soft_group1_hardness - np.mean(soft_group2_hardness), soft_hardness_eq_bound)[1]

# Collect the results
soft_results = {
    "Property": ["Reduced Modulus", "Hardness"],
    "Mean Group 1": [np.mean(soft_group1_modulus), np.mean(soft_group1_hardness)],
    "Mean Group 2": [np.mean(soft_group2_modulus), np.mean(soft_group2_hardness)],
    "Equivalence Bound": [soft_modulus_eq_bound, soft_hardness_eq_bound],
    "t-test p-value": [p_value_soft_modulus, p_value_soft_hardness],
    "TOST p-value Lower": [tost_p_value_soft_modulus_low, tost_p_value_soft_hardness_low],
    "TOST p-value Upper": [tost_p_value_soft_modulus_high, tost_p_value_soft_hardness_high],
}

# Transform to DataFrame
df_soft_results = pd.DataFrame(soft_results)

# Show the result
print(df_soft_results)

          Property  Mean Group 1  Mean Group 2  Equivalence Bound  \
0  Reduced Modulus      8.096667     10.751667           0.986667   
1         Hardness      0.331067      0.605868           0.051427   

   t-test p-value  TOST p-value Lower  TOST p-value Upper  
0        0.060855            0.009554            0.002028  
1        0.059465            0.006525            0.003075  
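
The `ttest_1samp` calls above return two-sided p-values and ignore the variability of the second group, so small values in the "TOST p-value" columns do not by themselves indicate equivalence. A possible two-sample alternative is sketched below using only `scipy.stats.ttest_ind`. It assumes SciPy 1.6 or later (for the `alternative` argument), and the helper name `welch_tost` is hypothetical, introduced only for this illustration.

```python
import numpy as np
from scipy import stats


def welch_tost(x, y, bound):
    """Hypothetical helper: two one-sided Welch t-tests for |mean(x) - mean(y)| < bound."""
    # Lower test: H0 mean(x) - mean(y) <= -bound, H1 mean(x) - mean(y) > -bound
    p_low = stats.ttest_ind(x + bound, y, equal_var=False, alternative="greater").pvalue
    # Upper test: H0 mean(x) - mean(y) >= +bound, H1 mean(x) - mean(y) < +bound
    p_high = stats.ttest_ind(x - bound, y, equal_var=False, alternative="less").pvalue
    # The TOST p-value is the larger of the two one-sided p-values
    return max(p_low, p_high), p_low, p_high


# Same data as in the cell above
soft_group1_modulus = np.array([7.77, 8.23, 8.29])
soft_group2_modulus = np.array([10.62, 8.1, 13.83, 11.41, 7.19, 13.36])
soft_group1_hardness = np.array([0.3095, 0.3166, 0.3671])
soft_group2_hardness = np.array([0.48, 0.32, 0.898, 0.71853, 0.302, 0.91668])

# Equivalence bound: ±10% of the pooled mean, as in the cell above
modulus_bound = 0.1 * np.mean(np.concatenate([soft_group1_modulus, soft_group2_modulus]))
hardness_bound = 0.1 * np.mean(np.concatenate([soft_group1_hardness, soft_group2_hardness]))

p_modulus, p_modulus_low, p_modulus_high = welch_tost(
    soft_group1_modulus, soft_group2_modulus, modulus_bound
)
p_hardness, p_hardness_low, p_hardness_high = welch_tost(
    soft_group1_hardness, soft_group2_hardness, hardness_bound
)

print(f"Reduced Modulus: TOST p = {p_modulus:.4f} (lower {p_modulus_low:.4f}, upper {p_modulus_high:.4f})")
print(f"Hardness:        TOST p = {p_hardness:.4f} (lower {p_hardness_low:.4f}, upper {p_hardness_high:.4f})")
```

For both properties the observed mean difference (Group 1 minus Group 2) lies below the lower equivalence bound, so the lower one-sided test statistic is negative and its p-value exceeds 0.5. Under this sketch, equivalence within ±10% cannot be claimed for either property.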
