All exercises are from [Khan Academy: Sampling Distributions](https://www.khanacademy.org/math/statistics-probability/sampling-distributions-library)

# 1. Sampling distribution of a sample proportion

Standard deviation of the sampling distribution of a sample proportion:

$\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}$

where:

$\sigma_{\hat{p}}$ is the standard deviation of the sampling distribution of the sample proportion

**p** is the population proportion

**n** is the sample size
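
The formula is normally paired with a normal approximation, which is usually trusted only when the expected numbers of successes and failures are both large enough. The following is a minimal sketch, assuming the common rule of thumb of at least 10 of each. The helper names `proportion_standard_error` and `normal_approximation_ok` and the `min_count` parameter are illustrative and not part of the exercises.

```python
import math


def proportion_standard_error(p, n):
    """Standard deviation of the sampling distribution of a sample proportion."""
    if not 0 < p < 1:
        raise ValueError("p must be strictly between 0 and 1")
    if n <= 0:
        raise ValueError("n must be positive")
    return math.sqrt(p * (1 - p) / n)


def normal_approximation_ok(p, n, min_count=10):
    """Rule-of-thumb check (assumed threshold): expected successes and failures >= min_count."""
    return n * p >= min_count and n * (1 - p) >= min_count


print(round(proportion_standard_error(0.63, 600), 4))  # 0.0197
print(normal_approximation_ok(0.63, 600))               # True
print(normal_approximation_ok(0.63, 20))                # False: only 7.4 expected failures
```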

![](img/sample_distributions_p1.png)

In [1]:
import math
import scipy.stats as ss

SAMPLE_SIZE = 600
# population proportion p; it is also the mean of the sampling distribution of the sample proportion
POPULATION_MU = 0.63
# bounds for the sample proportion; as proportions, both must lie between 0 and 1
SAMPLE_LOWER_BOUND = 0.61
SAMPLE_UPPER_BOUND = 0.65

# standard deviation of the sampling distribution of the sample proportion
sigma = math.sqrt(POPULATION_MU*(1-POPULATION_MU)/SAMPLE_SIZE)

# normal approximation: P(lower < sample proportion < upper) = CDF(upper) - CDF(lower)
cdf_lower = ss.norm.cdf(SAMPLE_LOWER_BOUND, POPULATION_MU, sigma)
cdf_upper = ss.norm.cdf(SAMPLE_UPPER_BOUND, POPULATION_MU, sigma)

print("Probability of %.2f < sample proportion < %.2f for a sample of %d, when the population proportion is %.2f: %.2f"
      % (SAMPLE_LOWER_BOUND, SAMPLE_UPPER_BOUND, SAMPLE_SIZE, POPULATION_MU, cdf_upper-cdf_lower))

Probability of 0.61 < sample proportion < 0.65 for a sample of 600, when the population proportion is 0.63: 0.69
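
The result above relies on a normal approximation. As a cross-check, the same probability can be computed from the binomial distribution of the number of successes. This is only a sketch: it assumes the 600 observations are independent (for example, a population much larger than the sample), and the variable names are illustrative.

```python
import math
import scipy.stats as ss

n, p = 600, 0.63
lower, upper = 0.61, 0.65

# Normal approximation, as in the cell above
sigma = math.sqrt(p * (1 - p) / n)
approx = ss.norm.cdf(upper, p, sigma) - ss.norm.cdf(lower, p, sigma)

# Exact: the number of successes X is Binomial(n, p) and the sample proportion is X / n.
# Both bounds correspond to whole counts here (366 and 390), so the strict
# inequality 0.61 < X/n < 0.65 means 367 <= X <= 389.
lower_count = round(lower * n)
upper_count = round(upper * n)
exact = ss.binom.cdf(upper_count - 1, n, p) - ss.binom.cdf(lower_count, n, p)

print("normal approximation: %.3f" % approx)
print("exact binomial:       %.3f" % exact)
```

The two values should be close but not identical, because the binomial distribution is discrete while the normal curve is continuous.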


# 2. Sampling distribution of a sample mean

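Standard deviation of the sampling distribution of the sample mean:

$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$

where:

$\sigma_{\bar{x}}$ is the standard deviation of the sampling distribution of the sample mean

$\sigma$ is the population standard deviation

**n** is the sample size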
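
One consequence of the formula is that the spread of the sample means shrinks with the square root of the sample size: quadrupling **n** halves it. A small sketch, reusing the population standard deviation of 2 from the exercise below; the sample sizes are arbitrary illustrative choices.

```python
import math

POPULATION_SIGMA = 2  # same population standard deviation as in the exercise below

# standard deviation of the sample means for a few illustrative sample sizes
standard_errors = {n: POPULATION_SIGMA / math.sqrt(n) for n in (25, 100, 400)}

for n, standard_error in standard_errors.items():
    print("n = %3d -> standard deviation of sample means = %.2f" % (n, standard_error))

# n =  25 -> standard deviation of sample means = 0.40
# n = 100 -> standard deviation of sample means = 0.20
# n = 400 -> standard deviation of sample means = 0.10
```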

![](img/sample_distributions_p2.png)

In [2]:
import math
import scipy.stats as ss

SAMPLE_SIZE = 100
POPULATION_MU = 10
POPULATION_SIGMA = 2

# the normal model for the sample mean applies only if the population is normally distributed
# or the sample is large enough (rule of thumb: n >= 30)
# for a one-sided probability, use -math.inf or math.inf as the missing bound
MU_SAMPLE_LOWER_BOUND = 9.6
MU_SAMPLE_UPPER_BOUND = 10.4

# the mean of the sampling distribution of the sample mean equals the population mean
print("Mean of the sample means is %.2f" % POPULATION_MU)

sample_sigma = POPULATION_SIGMA / math.sqrt(SAMPLE_SIZE)
print("Standard deviation of the sample means is %.2f" % sample_sigma)

# compare against None so that a bound of 0 is not silently skipped
if MU_SAMPLE_LOWER_BOUND is not None and MU_SAMPLE_UPPER_BOUND is not None:
    cdf_lower = ss.norm.cdf(MU_SAMPLE_LOWER_BOUND, POPULATION_MU, sample_sigma)
    cdf_upper = ss.norm.cdf(MU_SAMPLE_UPPER_BOUND, POPULATION_MU, sample_sigma)
    print("Probability of %.2f < sample mean < %.2f for a sample of %d: %.2f"
          % (MU_SAMPLE_LOWER_BOUND, MU_SAMPLE_UPPER_BOUND, SAMPLE_SIZE, cdf_upper-cdf_lower))

Mean of the sample means is 10.00
Standard deviation of the sample means is 0.20
Probability of 9.60 < sample mean < 10.40 for a sample of 100: 0.95
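
The numbers above can be checked empirically by simulating many samples and looking at how their means are distributed. This is a rough sketch, not part of the exercise: it assumes a normally distributed population and NumPy being available, and the number of simulated samples and the seed are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # fixed seed so the run is repeatable

SAMPLE_SIZE = 100
POPULATION_MU = 10
POPULATION_SIGMA = 2
N_SAMPLES = 20_000  # number of simulated samples (arbitrary choice)

# Assumption: the population is normal; each row is one sample of SAMPLE_SIZE values.
samples = rng.normal(POPULATION_MU, POPULATION_SIGMA, size=(N_SAMPLES, SAMPLE_SIZE))
sample_means = samples.mean(axis=1)

# share of simulated sample means that fall strictly between the two bounds
inside = np.mean((sample_means > 9.6) & (sample_means < 10.4))

print("Mean of the sample means: %.2f" % sample_means.mean())
print("Standard deviation of the sample means: %.2f" % sample_means.std())
print("Share of sample means between 9.60 and 10.40: %.2f" % inside)
```

With enough simulated samples the three printed values should land close to 10, 0.20 and 0.95, matching the formula-based results.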
