# [Confidence Interval](statistics.md#confidence-intervals)

A confidence interval is an interval estimate of a population parameter: a range of values, computed from sample data, that is expected to contain the true parameter at a stated confidence level (for example, 95%). The level describes the procedure rather than any single interval: over repeated sampling, about 95% of the intervals built this way would contain the parameter.

This section covers:

- mean estimation (population and sample)
- proportion estimation (population and sample)
- variance and quasi-variance estimation
- mean, variance and proportion difference estimation
- standard deviation estimation
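
A short simulation can make the meaning of the confidence level concrete. The sketch below is illustrative only: the population parameters (`mu`, `sigma`), the sample size and the number of trials are made-up values, and the data are assumed to be normally distributed. It builds a 95% t-interval for the mean in each trial and counts how often the interval contains the true mean.

```python
import numpy as np
from scipy import stats

# Hypothetical population and simulation settings, chosen for illustration.
mu, sigma = 10.0, 2.0
n, trials, confidence = 30, 2000, 0.95

rng = np.random.default_rng(0)
samples = rng.normal(mu, sigma, size=(trials, n))  # one row per simulated sample

means = samples.mean(axis=1)
sems = samples.std(axis=1, ddof=1) / np.sqrt(n)    # standard error of each mean
h = sems * stats.t.ppf((1 + confidence) / 2, n - 1)

# An interval "covers" when the true mean lies between its bounds.
covered = (means - h <= mu) & (mu <= means + h)
coverage = covered.mean()
print(f"Empirical coverage: {coverage:.3f}")
```

The printed fraction should land close to the nominal 0.95, which is what the confidence level promises about the procedure as a whole.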

In [2]:
import seaborn as sns
import numpy as np
from scipy import stats

### Mean Estimation

...

In [7]:
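def mean_ci(data, confidence=0.95) -> tuple:
    # Two-sided t-interval for the population mean; assumes roughly normal data.
    a = np.asarray(data, dtype=float)
    n = len(a)
    m, se = np.mean(a), stats.sem(a)  # sample mean and its standard error (ddof=1)
    # Half-width: standard error times the t critical value with n-1 degrees of freedom.
    h = se * stats.t.ppf((1 + confidence) / 2., n-1)
    return m, m-h, m+h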
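
As a sanity check, the same interval can be computed by hand and with SciPy's `stats.t.interval`. The sketch below is self-contained and uses made-up sample values; the confidence level is passed positionally because the name of that argument has changed between SciPy versions.

```python
import numpy as np
from scipy import stats

# Hypothetical sample values, made up for illustration.
sample = np.array([4.9, 5.1, 5.0, 5.3, 4.8, 5.2, 5.0, 4.7])
confidence = 0.95
n = len(sample)
mean = sample.mean()
se = stats.sem(sample)  # standard error of the mean (ddof=1)

# Manual t-interval: mean +/- t critical value * standard error.
h = se * stats.t.ppf((1 + confidence) / 2, n - 1)
manual = (mean - h, mean + h)

# SciPy's built-in equivalent, with n-1 degrees of freedom.
builtin = stats.t.interval(confidence, n - 1, loc=mean, scale=se)

print(manual)
print(builtin)
```

Both tuples should agree to floating-point precision.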

### Proportion Estimation

...

In [6]:
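def proportion_ci(data_1, data_2, confidence=0.95) -> tuple:
    # Wald interval for the difference p1 - p2 between two proportions.
    # data_1 and data_2 are sequences of 0/1 outcomes.
    n1 = len(data_1)
    n2 = len(data_2)
    p1 = sum(data_1) / n1
    p2 = sum(data_2) / n2
    # A confidence interval uses the unpooled standard error; the pooled
    # estimate belongs to the hypothesis test of p1 == p2.
    se = np.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z = stats.norm.ppf((1 + confidence) / 2)
    h = z * se
    return p1 - p2, p1 - p2 - h, p1 - p2 + h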
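
The function above works on two samples and returns an interval for the difference between their proportions. For a single sample, a minimal sketch of the normal-approximation (Wald) interval might look like the following. The name `single_proportion_ci` and the example data are hypothetical, and the approximation is only reasonable when the sample is large and the proportion is not close to 0 or 1.

```python
import numpy as np
from scipy import stats

def single_proportion_ci(data, confidence=0.95):
    """Wald interval for one proportion; data is a sequence of 0/1 outcomes."""
    n = len(data)
    p = sum(data) / n                      # sample proportion
    se = np.sqrt(p * (1 - p) / n)          # standard error of the proportion
    z = stats.norm.ppf((1 + confidence) / 2)
    return p, p - z * se, p + z * se

# Hypothetical data: 40 successes in 100 trials.
outcomes = [1] * 40 + [0] * 60
p_hat, lower, upper = single_proportion_ci(outcomes)
print(p_hat, lower, upper)
```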

### Variance and Quasi-Variance Estimation

...

In [5]:
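def variance(data, confidence=0.95) -> tuple:
    # Chi-square interval for the population variance; assumes normal data.
    a = np.asarray(data, dtype=float)
    n = len(a)
    v = np.var(a)  # biased sample variance (ddof=0)
    ss = n * v  # sum of squared deviations from the sample mean
    lower = ss / stats.chi2.ppf((1 + confidence) / 2., n-1)
    upper = ss / stats.chi2.ppf((1 - confidence) / 2., n-1)
    return v, lower, upper

def quasi_variance(data, confidence=0.95) -> tuple:
    # Same interval, reported around the unbiased quasi-variance (ddof=1).
    a = np.asarray(data, dtype=float)
    n = len(a)
    s2 = np.var(a, ddof=1)
    lower = (n - 1) * s2 / stats.chi2.ppf((1 + confidence) / 2., n-1)
    upper = (n - 1) * s2 / stats.chi2.ppf((1 - confidence) / 2., n-1)
    return s2, lower, upper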
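
For reference, the standard interval for a variance under the assumption of normally distributed data is based on the chi-square distribution, not on the t distribution used for the mean. In the notation assumed here, $s^2$ is the quasi-variance (divisor $n-1$), $c$ is the confidence level, and $\chi^2_{q,\,n-1}$ is the $q$ quantile of the chi-square distribution with $n-1$ degrees of freedom:

```latex
\frac{(n-1)\,s^2}{\chi^2_{(1+c)/2,\;n-1}}
\;\le\; \sigma^2 \;\le\;
\frac{(n-1)\,s^2}{\chi^2_{(1-c)/2,\;n-1}}
```

Because $(n-1)\,s^2$ equals $n$ times the variance computed with divisor $n$, both estimators lead to the same bounds. Taking square roots of the bounds gives an interval for the standard deviation. The interval is not symmetric around the point estimate.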

### Mean, Variance and Proportion Difference Estimation

...
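
As one possible illustration of the mean-difference case, the sketch below computes a Welch interval for the difference between two independent sample means, which does not assume equal variances. The function name `mean_diff_ci` and the two groups of values are hypothetical, and the data are assumed to be roughly normal.

```python
import numpy as np
from scipy import stats

def mean_diff_ci(data_1, data_2, confidence=0.95):
    """Welch t-interval for mean(data_1) - mean(data_2)."""
    a = np.asarray(data_1, dtype=float)
    b = np.asarray(data_2, dtype=float)
    va = a.var(ddof=1) / len(a)   # squared standard error of each mean
    vb = b.var(ddof=1) / len(b)
    se = np.sqrt(va + vb)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    h = se * stats.t.ppf((1 + confidence) / 2, df)
    diff = a.mean() - b.mean()
    return diff, diff - h, diff + h

# Hypothetical measurements for two independent groups.
group_a = [5.1, 4.9, 5.6, 5.0, 5.3, 5.2]
group_b = [4.6, 4.8, 4.5, 5.0, 4.7]
diff, lower, upper = mean_diff_ci(group_a, group_b)
print(diff, lower, upper)
```

If the resulting interval excludes zero, the data are inconsistent with equal means at the chosen confidence level.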

### Standard Deviation Estimation

...

In [8]:
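def std(data, confidence=0.95) -> tuple:
    # Interval for the population standard deviation: square roots of the
    # chi-square variance bounds; assumes normal data.
    a = np.asarray(data, dtype=float)
    n = len(a)
    s = np.std(a, ddof=1)  # sample standard deviation
    lower = s * np.sqrt((n - 1) / stats.chi2.ppf((1 + confidence) / 2., n-1))
    upper = s * np.sqrt((n - 1) / stats.chi2.ppf((1 - confidence) / 2., n-1))
    return s, lower, upper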
