In [1]:
import pandas as pd
from scipy import stats
import numpy as np

# 1. Data
method_a_scores = np.array([85, 88, 90, 78, 92, 80, 83, 87, 89, 79, 91, 84, 86, 77, 93, 81, 85, 88, 90, 78, 82, 86, 89, 91, 79, 83, 87, 85, 90, 80])
method_b_scores = np.array([75, 78, 80, 72, 81, 70, 73, 76, 79, 68, 82, 74, 77, 65, 83, 71, 75, 78, 80, 72, 76, 79, 81, 83, 69, 74, 77, 75, 80, 70, 72, 78, 79, 81, 73])

print(f"Method A Scores (n={len(method_a_scores)}): {method_a_scores.tolist()}")
print(f"Method B Scores (n={len(method_b_scores)}): {method_b_scores.tolist()}")
print("\n")

# Optional: Check descriptive statistics for each group
print("Descriptive Statistics for Method A:")
print(pd.Series(method_a_scores).describe())
print("\nDescriptive Statistics for Method B:")
print(pd.Series(method_b_scores).describe())
print("\n")

# 2. Check Homogeneity of Variances using Levene's Test
# H0: Variances are equal
# H1: Variances are not equal
levene_statistic, levene_p_value = stats.levene(method_a_scores, method_b_scores)

print(f"Levene's Test Statistic: {levene_statistic:.3f}")
print(f"Levene's Test P-value: {levene_p_value:.3f}")

alpha = 0.05
if levene_p_value < alpha:
    print("Conclusion: Variances are significantly different (p < 0.05). Will use Welch's t-test (equal_var=False).")
    equal_variances = False
else:
    print("Conclusion: Variances are not significantly different (p >= 0.05). Will use standard t-test (equal_var=True).")
    equal_variances = True

print("\n")

# 3. Perform the Independent Samples t-test
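# equal_var=True runs the standard pooled (Student's) t-test; equal_var=False runs Welch's t-test.
# ttest_ind does not choose between them itself, so we pass the result of Levene's test from step 2.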
t_statistic, p_value = stats.ttest_ind(method_a_scores, method_b_scores, equal_var=equal_variances)

print(f"Independent Samples T-statistic: {t_statistic:.3f}")
print(f"P-value: {p_value:.3f}")
print("\n")

# 4. Set the Significance Level (already defined as alpha = 0.05)

# 5. Make a Decision and Draw a Conclusion
print(f"Significance Level (alpha): {alpha}")

if p_value < alpha:
    print(f"Since p-value ({p_value:.3f}) < alpha ({alpha}), we reject the null hypothesis.")
    print("Conclusion: There is a statistically significant difference in the average test scores between Method A and Method B.")
else:
    print(f"Since p-value ({p_value:.3f}) >= alpha ({alpha}), we fail to reject the null hypothesis.")
    print("Conclusion: There is no statistically significant difference in the average test scores between Method A and Method B.")

# Additional context
print(f"\nMean score for Method A: {np.mean(method_a_scores):.2f}")
print(f"Mean score for Method B: {np.mean(method_b_scores):.2f}")
print(f"Difference in means: {np.mean(method_a_scores) - np.mean(method_b_scores):.2f}")

Method A Scores (n=30): [85, 88, 90, 78, 92, 80, 83, 87, 89, 79, 91, 84, 86, 77, 93, 81, 85, 88, 90, 78, 82, 86, 89, 91, 79, 83, 87, 85, 90, 80]
Method B Scores (n=35): [75, 78, 80, 72, 81, 70, 73, 76, 79, 68, 82, 74, 77, 65, 83, 71, 75, 78, 80, 72, 76, 79, 81, 83, 69, 74, 77, 75, 80, 70, 72, 78, 79, 81, 73]


Descriptive Statistics for Method A:
count    30.000000
mean     85.200000
std       4.686003
min      77.000000
25%      81.250000
50%      85.500000
75%      89.000000
max      93.000000
dtype: float64

Descriptive Statistics for Method B:
count    35.000000
mean     75.885714
std       4.555355
min      65.000000
25%      72.500000
50%      76.000000
75%      79.500000
max      83.000000
dtype: float64


Levene's Test Statistic: 0.070
Levene's Test P-value: 0.793
Conclusion: Variances are not significantly different (p >= 0.05). Will use standard t-test (equal_var=True).


Independent Samples T-statistic: 8.110
P-value: 0.000


Significance Level (alpha): 0.05
Since p-value (0

**Explanation of the Output:**

* The output first shows the result of Levene's test for equality of variances. Its p-value determines which version of the t-test to run:

    * If levene_p_value is less than alpha (0.05 here), the variances differ significantly, so you should pass equal_var=False to ttest_ind, which runs Welch's t-test.

    * If levene_p_value is greater than or equal to alpha, there is no evidence that the variances differ, so you can pass equal_var=True (the standard pooled t-test). In this run the p-value is 0.793, so the pooled test is used.

* The output then shows the result of the independent samples t-test:

    * T-statistic: The difference between the two group means divided by the standard error of that difference, so it measures the gap relative to the variability within the groups. Its sign gives the direction: a positive value means the mean of the first group (method_a_scores) is higher than that of the second group (method_b_scores). In this run it is 8.110.

    * P-value: The probability of observing a t-statistic at least as extreme as the one calculated, assuming there is no true difference between the population means. The printed value of 0.000 is rounded to three decimals; it means p < 0.0005, not that the probability is exactly zero.

* Finally, the p-value is compared with the chosen alpha (0.05):

    * If p_value < alpha, you reject the null hypothesis and conclude that there is a statistically significant difference between the average test scores of students taught by Method A and Method B.

    * If p_value >= alpha, you fail to reject the null hypothesis: there is not enough evidence to claim a statistically significant difference between the average test scores of students taught by Method A and Method B.
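To make the T-statistic and P-value less of a black box, the pooled t-test can be reproduced by hand. The following is a minimal sketch, assuming the equal-variance case selected by Levene's test above. The names `pooled_var`, `se_diff`, `t_manual` and `p_manual` are illustrative and do not appear in the notebook.

```python
import numpy as np
from scipy import stats

# Same data as in the notebook cell above
method_a_scores = np.array([85, 88, 90, 78, 92, 80, 83, 87, 89, 79, 91, 84, 86, 77, 93,
                            81, 85, 88, 90, 78, 82, 86, 89, 91, 79, 83, 87, 85, 90, 80])
method_b_scores = np.array([75, 78, 80, 72, 81, 70, 73, 76, 79, 68, 82, 74, 77, 65, 83,
                            71, 75, 78, 80, 72, 76, 79, 81, 83, 69, 74, 77, 75, 80, 70,
                            72, 78, 79, 81, 73])

n_a, n_b = len(method_a_scores), len(method_b_scores)
mean_diff = method_a_scores.mean() - method_b_scores.mean()

# Pooled variance: sample variances (ddof=1) weighted by their degrees of freedom
var_a = method_a_scores.var(ddof=1)
var_b = method_b_scores.var(ddof=1)
df = n_a + n_b - 2
pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df

# Standard error of the difference between the two means
se_diff = np.sqrt(pooled_var * (1 / n_a + 1 / n_b))

# t-statistic: difference in means relative to its standard error
t_manual = mean_diff / se_diff

# Two-sided p-value from the t distribution with n_a + n_b - 2 degrees of freedom
p_manual = 2 * stats.t.sf(abs(t_manual), df)

# Cross-check against SciPy's pooled t-test
t_scipy, p_scipy = stats.ttest_ind(method_a_scores, method_b_scores, equal_var=True)

print(f"Manual t = {t_manual:.3f}, SciPy t = {t_scipy:.3f}")
print(f"Manual p = {p_manual:.3e}, SciPy p = {p_scipy:.3e}")
```

Printing the p-value in scientific notation (`.3e`) rather than with `.3f` avoids the rounded `0.000` seen in the output and shows how small the value actually is.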