In [9]:
import scipy.stats as ss
import numpy as np
import math

A 95% confidence interval
If we repeatedly draw samples and compute the interval with our formula, about 95% of those intervals should contain the true mean.
We will check by simulation whether this really holds.

1. Pick our distribution (normal, mean = 10, standard deviation = 2)

In [2]:
mean = 10
stdev = 2
rv = ss.norm(loc=mean,scale=stdev)

2. Check that roughly 16% of values fall at least one standard deviation below the mean

In [7]:
vals = []
for i in range(100):
    r = rv.rvs(size=1000)
    # fraction of this sample at or below (mean - stdev)
    ratio = np.mean(r <= mean - stdev)
    vals.append(ratio)



print(f"Ratio: {np.mean(vals)}")

Ratio: 0.15997


The above seems to work correctly. For a normal distribution we expect about 15.9% of values ($\Phi(-1) \approx 0.1587$) to lie at least one standard deviation below the mean.
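As a cross-check on the simulated ratio, the expected value can be computed directly from the normal CDF. This is a minimal sketch; the name `p_below` is introduced here for illustration and is not part of the original notebook.

```python
import scipy.stats as ss

# Probability that a normal value lies at or below mean - 1 * stdev.
# After standardizing this is P(Z <= -1), independent of mean and stdev.
p_below = ss.norm.cdf(-1)
print(f"Theoretical ratio: {p_below:.4f}")  # prints "Theoretical ratio: 0.1587"
```

The simulated value of about 0.160 is close to this theoretical figure.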

3. Calculate a 95% confidence interval from samples of size 4 and test whether it behaves as expected

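Formula (for a known population standard deviation $\sigma$):
- sample size $n = 4$
- sample mean $\bar{x}$
- critical value $z$ of the standard normal distribution; for 95% confidence, $z \approx 1.96$
- then we take the interval as $(\bar{x} - z \cdot \frac{\sigma}{\sqrt{n}},\ \bar{x} + z \cdot \frac{\sigma}{\sqrt{n}})$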
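The critical value $z$ in the formula can be obtained from the inverse normal CDF rather than typed in by hand. The sketch below assumes a two-sided interval and the values from step 1 ($\sigma = 2$, $n = 4$); the names `confidence`, `sigma`, and `half_width` are illustrative.

```python
import math
import scipy.stats as ss

confidence = 0.95
sigma = 2  # known population standard deviation (from step 1)
n = 4      # sample size

# Two-sided interval: (1 - confidence) / 2 of the probability goes in each tail.
z = ss.norm.ppf(1 - (1 - confidence) / 2)
half_width = z * sigma / math.sqrt(n)
print(f"z = {z:.3f}, half-width = {half_width:.3f}")  # z = 1.960, half-width = 1.960
```

With $\sigma = 2$ and $n = 4$ the factor $\sigma / \sqrt{n}$ equals 1, so the half-width equals $z$ itself.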

In [95]:
num_samples = 4
trials = 2000
succ = 0
for i in range(trials):
    r = rv.rvs(size=num_samples)
    s_mean = np.mean(r)

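    # z multiplier used in this run; 2 * Phi(1.15) - 1 is about 0.75,
    # whereas a 95% interval needs z of about 1.96
    z = 1.15
    half_width = z * stdev / math.sqrt(num_samples)
    c_i = (s_mean - half_width, s_mean + half_width)
    if c_i[0] <= mean <= c_i[1]:
        succ += 1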

print(f"Ratio: {succ/trials}")

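Ratio: 0.752

The observed coverage of about 75% is what the multiplier 1.15 in the code should produce: $2\Phi(1.15) - 1 \approx 0.75$. The interval computed here is therefore a 75% confidence interval, not a 95% one. A 95% interval requires $z \approx 1.96$.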
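For comparison, here is a sketch of the same experiment using the multiplier for a 95% interval, `ss.norm.ppf(0.975)` (about 1.96). It is vectorized with NumPy instead of a Python loop, and the fixed seed is an assumption added for reproducibility. Neither is part of the original notebook.

```python
import numpy as np
import scipy.stats as ss

mean, stdev = 10, 2
num_samples, trials = 4, 2000
rng = np.random.default_rng(0)  # fixed seed: an assumption, for reproducibility

z = ss.norm.ppf(0.975)  # two-sided 95% critical value
# One row per trial, one column per observation in the sample.
samples = rng.normal(loc=mean, scale=stdev, size=(trials, num_samples))
s_means = samples.mean(axis=1)
half_width = z * stdev / np.sqrt(num_samples)

# The interval contains the true mean exactly when |s_mean - mean| <= half_width.
coverage = np.mean(np.abs(s_means - mean) <= half_width)
print(f"Ratio: {coverage}")
```

With 2000 trials the printed ratio should land near 0.95, give or take roughly one percentage point of sampling noise.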
