A confidence interval is a range of values that is likely to contain the true population parameter at a stated level of confidence. For the population mean, the equation below uses the sample mean $\hat{\mu}$, computed from the observed data, as the estimate of µ.

$$\hat{\mu} = \frac{1}{n} \sum \limits_{i=1}^{n} X_{i} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$$

$\hat{µ}$: This represents the estimate of the population mean µ. The hat symbol (\hat{}) is often used to denote an estimate.

$X_i$: These are the individual observations or values in the sample. The subscript "i" indicates the i-th observation.

n: This represents the size of the sample, or the number of observations.

$N(µ, σ^2/n)$: This represents the distribution of the sample mean. The sample mean is assumed to follow a normal distribution with mean µ and variance σ^2/n.

The equation therefore describes both how the estimate is computed and how it is distributed. Because the variance $\sigma^2/n$ shrinks as the sample size $n$ grows, larger samples estimate the population mean more precisely.
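The shrinking variance can be checked with a small simulation. The sketch below is illustrative only: the function name `variance_of_sample_means`, the choice of µ = 0 and σ = 2, the number of trials, and the seed are all assumptions made for the example. It draws many samples of size n from a normal distribution and compares the empirical variance of their means with $\sigma^2/n$.

```python
import random
import statistics

def variance_of_sample_means(n, mu=0.0, sigma=2.0, trials=5000, seed=0):
    """Empirical variance of the sample mean across many simulated samples of size n.

    mu, sigma, trials and seed are illustrative defaults, not values from the text.
    """
    rng = random.Random(seed)
    means = []
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        means.append(statistics.fmean(sample))
    return statistics.pvariance(means)

for n in (5, 20, 80):
    empirical = variance_of_sample_means(n)
    theoretical = 2.0 ** 2 / n  # sigma^2 / n with the default sigma = 2
    print(f"n={n:3d}  empirical={empirical:.4f}  theoretical={theoretical:.4f}")
```

The empirical values should track $\sigma^2/n$ closely and fall as n increases, though the exact figures depend on the seed.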

The confidence interval is found as:

$$\hat{\mu} \pm Z_{value} \cdot \frac{\sigma}{\sqrt{n}}$$

$\hat{µ}$: This represents the sample mean, which is an estimate of the population mean.

Z-value: This is the critical value from the standard normal distribution corresponding to the desired confidence level (about 1.96 for 95%). Together with the standard error, it determines the width of the confidence interval.

σ: This is the population standard deviation, which is assumed to be known.

n: This represents the sample size, i.e., the number of observations in the sample.

This equation applies when the population standard deviation is known and you want to estimate the population mean from a sample. It is commonly used in statistical inference and hypothesis testing to draw conclusions about the population mean from sample data.
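As a minimal sketch of the known-σ case, the function below applies the margin of error z · σ/√n directly. The name `z_confidence_interval` and the input numbers (a sample mean of 50, σ = 10, n = 25) are hypothetical and chosen only to make the arithmetic easy to follow. It uses `statistics.NormalDist` from the standard library to obtain the critical value.

```python
import math
from statistics import NormalDist

def z_confidence_interval(sample_mean, sigma, n, confidence_level=0.95):
    """Confidence interval for the mean when the population sigma is known."""
    # Two-sided critical value of the standard normal distribution
    z_value = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    # Margin of error: z times the standard error sigma / sqrt(n)
    margin = z_value * sigma / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin

# Hypothetical inputs: standard error is 10 / sqrt(25) = 2
lower, upper = z_confidence_interval(sample_mean=50.0, sigma=10.0, n=25)
print(f"95% CI: ({lower:.2f}, {upper:.2f})")  # 95% CI: (46.08, 53.92)
```

With a standard error of 2 and a critical value of about 1.96, the margin of error is about 3.92 on either side of the sample mean.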

In [1]:
import scipy.stats as stats
import math

# Compute a z-based confidence interval for the population mean µ.
# The point estimate is the sample mean: the sum of all observations divided by the sample size.

def calculate_confidence_interval(data, confidence_level):
    n = len(data)  # Sample size
    sample_mean = sum(data) / n  # Sample mean

    # Sample standard deviation (n - 1 denominator), used here in place of
    # the known population σ that the formula assumes
    sample_std = math.sqrt(sum([(x - sample_mean) ** 2 for x in data]) / (n - 1))

    # Z-value corresponding to the desired confidence level
    z_value = stats.norm.ppf(1 - (1 - confidence_level) / 2)

    # Margin of error
    margin_of_error = z_value * (sample_std / math.sqrt(n))

    # Confidence interval bounds
    lower_bound = sample_mean - margin_of_error
    upper_bound = sample_mean + margin_of_error

    return (lower_bound, upper_bound)

# Example usage
data = [2.5, 3.7, 4.1, 2.8, 3.9, 4.5, 3.3, 4.8, 3.7, 2.9]
confidence_level = 0.95

lower, upper = calculate_confidence_interval(data, confidence_level)

# Report the interval. Its midpoint is the sample mean, which is the point estimate of the population mean.

print(f"Confidence Interval ({confidence_level * 100}%): ({lower}, {upper})")


Confidence Interval (95.0%): (3.156371956784613, 4.083628043215388)
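The function above substitutes the sample standard deviation for σ while still using a Z-value. With only 10 observations and σ unknown, a common alternative is an interval based on Student's t distribution with n - 1 degrees of freedom. The following is a sketch of that alternative, not part of the original calculation. The name `t_confidence_interval` is hypothetical, and the block assumes SciPy is installed, as in the cell above.

```python
import math
import statistics
import scipy.stats as stats

def t_confidence_interval(data, confidence_level=0.95):
    """t-based confidence interval for the mean when sigma is estimated from the sample."""
    n = len(data)
    sample_mean = statistics.fmean(data)
    sample_std = statistics.stdev(data)  # sample standard deviation, n - 1 denominator
    # Two-sided critical value of Student's t with n - 1 degrees of freedom
    t_value = stats.t.ppf(1 - (1 - confidence_level) / 2, df=n - 1)
    margin = t_value * sample_std / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin

data = [2.5, 3.7, 4.1, 2.8, 3.9, 4.5, 3.3, 4.8, 3.7, 2.9]
t_lower, t_upper = t_confidence_interval(data, 0.95)
print(f"t-based 95% CI: ({t_lower:.3f}, {t_upper:.3f})")
```

Because the t critical value is larger than the corresponding Z-value for small samples, this interval is somewhat wider than the z-based one printed above. The two converge as n grows.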
