## Sampling Distribution of a Sample Proportion

### Standard Error
If we know the population proportion $p$, we can find the standard deviation of the sampling distribution of the sample proportion $\hat{p}$ (also called the standard error) like this:

$$ \sigma_{ \hat{p} }^2 = \frac{ p(1-p) }{ n } $$

$$ \sigma_{ \hat{p} } = \sqrt{ \frac{ p(1-p) }{ n } }$$

Where:
- $\sigma_{ \hat{p} }$ -> Standard Error (standard deviation of the sampling distribution of the sample proportion)
- $ \sigma_{ \hat{p} }^2 $ -> Variance of the sampling distribution
- $p$ -> Population Proportion
- $n$ -> Sample size

In [32]:
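import math

p = .6   # population proportion
n = 30   # sample size

# Standard error of the sample proportion: sqrt( p(1-p) / n )
standard_error = math.sqrt( p * ( 1 - p ) / n )

print( 'Standard Error: %6.4f' % standard_error )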

Standard Error: 0.0894
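
As a rough sanity check on the formula, the sketch below simulates many samples of size $n$ and compares the standard deviation of the simulated sample proportions with the value the formula gives. The helper name `simulate_sample_proportions`, the number of simulated samples, and the seed are illustrative assumptions and not part of the original notebook.

```python
import math
import random
import statistics

def simulate_sample_proportions(p, n, num_samples=20_000, seed=0):
    """Return the sample proportion of each of num_samples simulated samples.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    proportions = []
    for _sample in range(num_samples):
        # Each draw is a "success" with probability p.
        successes = sum(1 for _draw in range(n) if rng.random() < p)
        proportions.append(successes / n)
    return proportions

p = 0.6   # population proportion
n = 30    # sample size

p_hats = simulate_sample_proportions(p, n)
empirical_se = statistics.pstdev(p_hats)       # spread of the simulated proportions
theoretical_se = math.sqrt(p * (1 - p) / n)    # value given by the formula

print('Empirical standard error:   %6.4f' % empirical_se)
print('Theoretical standard error: %6.4f' % theoretical_se)
```

With enough simulated samples the two numbers should agree closely, although the empirical value will vary slightly with the seed.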


If the sampling distribution of the sample proportion is approximately normal (which generally requires a large enough sample), then we can find the probability of getting a particular outcome in the sampling distribution.
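
A common rule of thumb for treating the sampling distribution of a sample proportion as approximately normal is that both the expected number of successes $np$ and the expected number of failures $n(1-p)$ are at least 10. The sketch below encodes that check. The function name `is_approximately_normal` and the default threshold of 10 are assumptions made for illustration; some textbooks use a different cutoff.

```python
def is_approximately_normal(p, n, threshold=10):
    """Rule-of-thumb check: expected successes and failures both >= threshold.

    Hypothetical helper; the threshold of 10 is a common convention,
    not a hard rule.
    """
    expected_successes = n * p
    expected_failures = n * (1 - p)
    return expected_successes >= threshold and expected_failures >= threshold

print(is_approximately_normal(0.09, 350))  # np is about 31.5, n(1-p) about 318.5
print(is_approximately_normal(0.09, 50))   # np is about 4.5, which is below 10
```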

For example, let's say the normal conditions are met and we want to calculate the probability that a sample proportion falls below a value, above it, or within a range. We can calculate the standard error, then use it as the standard deviation of a normal distribution centered on $p$ and evaluate the cumulative distribution function (CDF).

In [38]:
# See https://stackoverflow.com/a/33824283/254046

import math
from scipy.stats import norm

p = .09        # population proportion
n = 350        # sample size
range1 = .12   # lower bound of interest
range2 = .68   # upper bound of interest

# norm.cdf(x, loc, scale) gives P(X < x) for a normal distribution
# with mean loc and standard deviation scale
standard_error = math.sqrt( p * ( 1 - p ) / n )
cdf1 = norm.cdf(range1, p, standard_error)
cdf2 = norm.cdf(range2, p, standard_error)

print( 'p=%.2f, samples=%d' % ( p, n ) )
print( 'standard_error=%.3f' % ( standard_error ) )
print()
print( 'P(x < %.2f) = %.3f' % ( range1, cdf1 ) )
print( 'P(x > %.2f) = %.3f' % ( range1, 1 - cdf1 ) )
print( 'P(%.2f < x < %.2f) = %.3f' % ( range1, range2, cdf2 - cdf1 ) )

p=0.09, samples=350
standard_error=0.015

P(x < 0.12) = 0.975
P(x > 0.12) = 0.025
P(0.12 < x < 0.68) = 0.025
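
The first probability in this output can also be checked by hand by standardizing the cutoff to a z-score. This is a worked sketch using the rounded standard error $0.0153$; the symbol $\Phi$ (the standard normal CDF) is notation introduced here and is not used elsewhere in the notebook.

$$ z = \frac{ \hat{p} - p }{ \sigma_{ \hat{p} } } = \frac{ 0.12 - 0.09 }{ 0.0153 } \approx 1.96 $$

$$ P( \hat{p} < 0.12 ) = \Phi( 1.96 ) \approx 0.975 $$

The upper bound $0.68$ lies so many standard errors above $p$ that its CDF value is essentially 1, which is why the probability of the range matches the upper-tail probability of about $0.025$.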


## Sampling Distribution of a Sample Mean



### Standard Error
If we know the population standard deviation $\sigma$, we can find the standard deviation of the sampling distribution of the sample mean $\bar{x}$ (also called the standard error) like this:

$$ \sigma_{ \bar{x} }^2 = \frac{ \sigma^2 }{ n } $$

$$ \sigma_{ \bar{x} } = \frac{ \sigma }{ \sqrt{n} } $$

Where:
- $\sigma_{ \bar{x} }$ -> Standard Error (standard deviation of the sampling distribution of the sample mean)
- $ \sigma_{ \bar{x} }^2 $ -> Variance of the sampling distribution
- $\sigma^2$ -> Population Variance
- $\sigma$ -> Population Standard Deviation
- $n$ -> Sample size

In [2]:
import math

sigma = 1.5   # population standard deviation
n = 4         # sample size

# Standard error of the sample mean: sigma / sqrt(n)
standard_error = sigma / math.sqrt( n )

print( 'Standard Error: %6.4f' % standard_error )

Standard Error: 0.7500
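
Because $n$ sits under a square root, the standard error shrinks slowly as the sample grows: quadrupling the sample size only halves it. The short sketch below illustrates this with the same $\sigma$ as the cell above; the particular sample sizes and the name `standard_errors` are arbitrary choices made for illustration.

```python
import math

sigma = 1.5  # population standard deviation, as in the cell above

# Each sample size is four times the previous one.
standard_errors = {n: sigma / math.sqrt(n) for n in (4, 16, 64)}

for n, standard_error in standard_errors.items():
    print('n=%2d  standard error=%.4f' % (n, standard_error))
```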


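If the population is normally distributed, or the sample size is large enough for the central limit theorem to apply (a common rule of thumb is $n \ge 30$), then the sampling distribution of the sample mean is approximately normal and we can find the probability of getting a particular outcome in it.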

For example, let's say the normal conditions are met and we want to calculate the probability that a sample mean falls below a value, above it, or within a range. We can calculate the standard error, then use it as the standard deviation of a normal distribution centered on $\mu$ and evaluate the cumulative distribution function (CDF).

In [40]:
# See https://stackoverflow.com/a/33824283/254046

import math
from scipy.stats import norm

mu = 8        # population mean
sigma = 6     # population standard deviation
n = 35        # sample size
range1 = 10   # lower bound of interest
range2 = 11   # upper bound of interest

# norm.cdf(x, loc, scale) gives P(X < x) for a normal distribution
# with mean loc and standard deviation scale
standard_error = sigma / math.sqrt( n )
cdf1 = norm.cdf(range1, mu, standard_error)
cdf2 = norm.cdf(range2, mu, standard_error)

print( 'mean=%.2f, stddev=%.2f, samples=%d' % ( mu, sigma, n ) )
print( 'standard_error=%.3f' % ( standard_error ) )
print()
print( 'P(x < %.1f) = %.3f' % ( range1, cdf1 ) )
print( 'P(x > %.1f) = %.3f' % ( range1, 1 - cdf1 ) )
print( 'P(%.1f < x < %.1f) = %.3f' % ( range1, range2, cdf2 - cdf1 ) )

mean=8.00, stddev=6.00, samples=35
standard_error=1.014

P(x < 10.0) = 0.976
P(x > 10.0) = 0.024
P(10.0 < x < 11.0) = 0.023
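
The same probabilities can be reproduced by converting each cutoff to a z-score and looking it up in the standard normal distribution. The sketch below does this with the standard library's `statistics.NormalDist` (Python 3.8+) instead of `scipy.stats.norm`, purely to keep the example dependency-free. The helper name `sample_mean_z_score` is hypothetical.

```python
import math
from statistics import NormalDist

def sample_mean_z_score(x_bar, mu, sigma, n):
    """Number of standard errors that x_bar lies from the population mean.

    Hypothetical helper for illustration only.
    """
    standard_error = sigma / math.sqrt(n)
    return (x_bar - mu) / standard_error

mu, sigma, n = 8, 6, 35

z1 = sample_mean_z_score(10, mu, sigma, n)
z2 = sample_mean_z_score(11, mu, sigma, n)

standard_normal = NormalDist()  # mean 0, standard deviation 1

p_below = standard_normal.cdf(z1)
p_above = 1 - p_below
p_between = standard_normal.cdf(z2) - standard_normal.cdf(z1)

print('z1=%.3f, z2=%.3f' % (z1, z2))
print('P(x < 10.0) = %.3f' % p_below)
print('P(x > 10.0) = %.3f' % p_above)
print('P(10.0 < x < 11.0) = %.3f' % p_between)
```

Standardizing first and then using the standard normal CDF is equivalent to passing the mean and standard error directly to the CDF, so the results should match the cell above up to rounding.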


## References
- [Sampling distributions - Khan Academy](https://www.khanacademy.org/math/statistics-probability/sampling-distributions-library)