## Z-Score and T-Score

## Q-1

1. Create a random sample of 50 exam scores (between 40 and 100).
2. Calculate the sample mean and standard deviation.
3. Use scipy.stats.t.interval() to compute the 95% confidence interval.
4. Increase the sample size to 500 and recalculate the confidence interval.
5. Compare how the interval changes with the larger sample size.

In [5]:
import numpy as np
from scipy import stats

# Sample of 50 exam scores
np.random.seed(0)
sample_50 = np.random.randint(40, 101, size=50)
mean_50 = np.mean(sample_50)
std_50 = np.std(sample_50, ddof=1)

# 95% CI for sample of 50
ci_50 = stats.t.interval(0.95, df=49, loc=mean_50, scale=std_50/np.sqrt(50))

# Sample of 500 exam scores
sample_500 = np.random.randint(40, 101, size=500)
mean_500 = np.mean(sample_500)
std_500 = np.std(sample_500, ddof=1)

# 95% CI for sample of 500
ci_500 = stats.t.interval(0.95, df=499, loc=mean_500, scale=std_500/np.sqrt(500))

print("CI for n=50:", ci_50)
print("CI for n=500:", ci_500)

CI for n=50: (np.float64(62.027838994510084), np.float64(71.7321610054899))
CI for n=500: (np.float64(68.45626218216972), np.float64(71.57173781783027))
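
Step 5 asks how the interval changes with the larger sample. A minimal sketch of that comparison follows, assuming the same seed and draw order as the cell above. The helper name `t_interval` is hypothetical and only wraps `stats.t.interval` so the degrees of freedom follow the sample size instead of being hard-coded.

```python
import numpy as np
from scipy import stats


def t_interval(sample, confidence=0.95):
    """Return the (low, high) t-based confidence interval for the mean."""
    sample = np.asarray(sample, dtype=float)
    n = sample.size
    sem = sample.std(ddof=1) / np.sqrt(n)  # standard error of the mean
    return stats.t.interval(confidence, df=n - 1, loc=sample.mean(), scale=sem)


# Same seed and draw order as the cell above (assumption)
np.random.seed(0)
sample_50 = np.random.randint(40, 101, size=50)
sample_500 = np.random.randint(40, 101, size=500)

low_50, high_50 = t_interval(sample_50)
low_500, high_500 = t_interval(sample_500)

width_50 = high_50 - low_50
width_500 = high_500 - low_500

print("Width for n=50: ", width_50)
print("Width for n=500:", width_500)
print("Width ratio:", width_50 / width_500, "vs sqrt(10):", np.sqrt(10))
```

From the intervals printed above, the widths are roughly 9.70 (n=50) and 3.12 (n=500), a ratio of about 3.11. That is close to sqrt(10) ≈ 3.16, which is what the `1/sqrt(n)` term in the standard error predicts when the sample size grows tenfold.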


## Q-2

1. Assume the population mean is 70 and the standard deviation is 10.
2. Generate a random sample of 100.
3. Use the z-test formula manually or statsmodels.stats.weightstats.ztest() to check whether the sample mean is significantly different from the population mean.
4. Change the sample mean slightly and re-run the z-test.
5. Interpret the z-statistic and p-value.

In [8]:
# !pip install statsmodels
from statsmodels.stats.weightstats import ztest

# Population parameters
pop_mean = 70
pop_std = 10

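# Sample of 100, drawn with a true mean of 72 so it differs from pop_mean
np.random.seed(1)
sample = np.random.normal(loc=72, scale=pop_std, size=100)

# Z-test (ztest estimates the standard deviation from the sample; pop_std is not passed in)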
z_stat, p_value = ztest(sample, value=pop_mean)
print("Z-statistic:", z_stat)
print("P-value:", p_value)

# Slightly change sample mean
sample_shifted = sample - 2  # shift mean closer to 70
z_stat2, p_value2 = ztest(sample_shifted, value=pop_mean)
print("Z-statistic (shifted):", z_stat2)
print("P-value (shifted):", p_value2)

Z-statistic: 2.929162786062353
P-value: 0.003398763736912441
Z-statistic (shifted): 0.6810004356008119
P-value (shifted): 0.4958712148029991
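
Step 3 also allows the z-test formula to be applied manually. The sketch below does that with the assumed population standard deviation of 10, using z = (sample mean - population mean) / (sigma / sqrt(n)). The function name `manual_z_test` is hypothetical. Its result will differ somewhat from `ztest()`, because statsmodels estimates the standard deviation from the sample rather than using the known value.

```python
import numpy as np
from scipy import stats


def manual_z_test(sample, pop_mean, pop_std):
    """Two-sided one-sample z-test with a known population standard deviation."""
    sample = np.asarray(sample, dtype=float)
    n = sample.size
    z = (sample.mean() - pop_mean) / (pop_std / np.sqrt(n))
    p = 2 * stats.norm.sf(abs(z))  # two-sided p-value from the standard normal
    return z, p


# Same seed and parameters as the cell above (assumption)
np.random.seed(1)
sample = np.random.normal(loc=72, scale=10, size=100)

z_manual, p_manual = manual_z_test(sample, pop_mean=70, pop_std=10)
print("Manual z-statistic:", z_manual)
print("Manual p-value:", p_manual)

# Decision at the 5% significance level
if p_manual < 0.05:
    print("Reject the null hypothesis: sample mean differs from 70.")
else:
    print("Fail to reject the null hypothesis.")
```

For the statsmodels run above, the first p-value (about 0.003) is below 0.05, so the null hypothesis of a mean of 70 is rejected. After shifting the sample by 2, the p-value (about 0.50) is well above 0.05, so there is no evidence of a difference.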


## Q-3

1. Create a sample of 30 weights of people.
2. Assume the population mean is 70.
3. Use scipy.stats.ttest_1samp() to perform the one-sample t-test.
4. Interpret the result:
    - p-value < 0.05 --> reject the null hypothesis
    - p-value >= 0.05 --> fail to reject the null hypothesis

In [9]:
# Sample of 30 weights
np.random.seed(2)
weights = np.random.uniform(50, 90, size=30)

# One-sample t-test
t_stat, p_val = stats.ttest_1samp(weights, popmean=70)
print("T-statistic:", t_stat)
print("P-value:", p_val)

# Interpretation
if p_val < 0.05:
    print("Reject the null hypothesis: sample mean is significantly different from 70.")
else:
    print("Fail to reject the null hypothesis: no significant difference.")

T-statistic: -2.8249948961378486
P-value: 0.008467626846853473
Reject the null hypothesis: sample mean is significantly different from 70.
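
To show what `ttest_1samp()` computes, the t-statistic can be reproduced by hand as t = (sample mean - 70) / (s / sqrt(n)) with n - 1 degrees of freedom. This is a minimal sketch, assuming the same seed as the cell above. The function name `manual_one_sample_t` is hypothetical.

```python
import numpy as np
from scipy import stats


def manual_one_sample_t(sample, popmean):
    """Two-sided one-sample t-test computed from the textbook formula."""
    sample = np.asarray(sample, dtype=float)
    n = sample.size
    s = sample.std(ddof=1)  # sample standard deviation
    t = (sample.mean() - popmean) / (s / np.sqrt(n))
    p = 2 * stats.t.sf(abs(t), df=n - 1)  # two-sided p-value
    return t, p


# Same seed and distribution as the cell above (assumption)
np.random.seed(2)
weights = np.random.uniform(50, 90, size=30)

t_manual, p_manual = manual_one_sample_t(weights, popmean=70)
t_scipy, p_scipy = stats.ttest_1samp(weights, popmean=70)

print("Manual: t =", t_manual, " p =", p_manual)
print("SciPy:  t =", t_scipy, " p =", p_scipy)
```

Note that `np.random.uniform(50, 90)` has a true mean of 70, so the rejection above comes from this particular random draw rather than from a real difference. At the 5% level, such false rejections are expected in roughly 1 of 20 samples.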
