# Point Estimation

**Point Estimation** is the process of using a single "best guess" value (a statistic) computed from a sample to estimate an unknown population parameter.

While a **Confidence Interval** provides a range of plausible values, a **Point Estimate** provides a single value on the number line.

---

## 1. Key Formulas & Notation

In statistics, we use "hat" notation (for example, $\hat{\theta}$ ) to denote an estimate of a true population parameter $\theta$.

| Parameter to Estimate | Population (Truth) | Point Estimate (Sample) | Formula |
| :--- | :--- | :--- | :--- |
| **Mean** | $\mu$ | $\hat{\mu} = \bar{x}$ | $\frac{\sum x_i}{n}$ |
| **Variance** | $\sigma^2$ | $\hat{\sigma}^2 = s^2$ | $\frac{\sum (x_i - \bar{x})^2}{n-1}$ |
| **Proportion** | $p$ | $\hat{p}$ | $\frac{\text{Successes}}{n}$ |
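
The three formulas in the table can be checked by hand on a tiny dataset. The sketch below uses made-up values (both `x` and `outcomes` are hypothetical, chosen only so the arithmetic is easy to follow) and computes each estimate directly from its formula.

```python
import numpy as np

# Hypothetical sample of 8 measurements (illustrative values only)
x = np.array([4.0, 7.0, 5.0, 6.0, 8.0, 5.0, 6.0, 7.0])
n = x.size

# Sample mean: sum of the observations divided by n
mean_hat = x.sum() / n

# Sample variance: squared deviations from the mean, divided by n - 1
var_hat = ((x - mean_hat) ** 2).sum() / (n - 1)

# Sample proportion: hypothetical binary outcomes where 1 marks a success
outcomes = np.array([1, 0, 1, 1, 0, 1, 0, 1])
p_hat = outcomes.sum() / outcomes.size

print(f"mean_hat = {mean_hat:.4f}")  # 48 / 8
print(f"var_hat  = {var_hat:.4f}")   # 12 / 7
print(f"p_hat    = {p_hat:.4f}")     # 5 / 8
```

The hand-written variance should match `np.var(x, ddof=1)`, which applies the same $n-1$ divisor.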



---

## 2. Criteria for a "Good" Point Estimator

Not all statistics are good estimators. Data scientists look for two main properties:

1.  **Unbiasedness:** The expected value of the estimator equals the true population parameter, so averaged over many samples it neither overshoots nor undershoots. For example, dividing by $n-1$ instead of $n$ makes the sample variance unbiased.
2.  **Efficiency (Low Variance):** If you take many samples, the point estimates should be tightly clustered together rather than widely spread out.
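
Both properties can be illustrated with a small simulation. The sketch below assumes a normal population with variance 4 and a sample size of 5 (both arbitrary choices for illustration). It compares the $n-1$ and $n$ divisors for variance, then compares the spread of the sample mean against the sample median as estimators of the center.

```python
import numpy as np

rng = np.random.default_rng(0)

true_var = 4.0          # assumed population variance (sigma = 2)
n, trials = 5, 100_000  # small samples, many repetitions

# Each row is one independent sample of size n
samples = rng.normal(loc=0.0, scale=2.0, size=(trials, n))

# Unbiasedness: average each variance estimator over all samples
avg_unbiased = samples.var(axis=1, ddof=1).mean()  # divides by n - 1
avg_biased = samples.var(axis=1, ddof=0).mean()    # divides by n

# Efficiency: spread of two estimators of the population center
spread_mean = samples.mean(axis=1).var()
spread_median = np.median(samples, axis=1).var()

print(f"True variance:             {true_var:.3f}")
print(f"Average of n-1 estimator:  {avg_unbiased:.3f}")
print(f"Average of n estimator:    {avg_biased:.3f}")
print(f"Variance of sample mean:   {spread_mean:.3f}")
print(f"Variance of sample median: {spread_median:.3f}")
```

Under these assumptions the $n$ divisor should average close to $\frac{n-1}{n}\sigma^2 = 3.2$, while the $n-1$ divisor should average close to 4. For normal data the sample mean clusters more tightly than the median, which makes it the more efficient of the two.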

---

## 3. Data Science & ML Use Cases

* **Model Weights:** In Linear Regression, the fitted coefficients ( $\hat{\beta}_1, \hat{\beta}_2, ...$ ) are point estimates of the "true" coefficients ( $\beta_1, \beta_2, ...$ ) that describe the relationship between variables.
* **Probability Predictions:** When a Random Forest outputs a `0.85` probability for a class, that is a point estimate of the probability that the observation belongs to that class.
* **KPI Tracking:** When a company reports an "Average Revenue Per User" (ARPU), they are using a point estimate from their current database to represent their entire customer base's behavior.
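
To make the model-weights case concrete, the sketch below simulates data from a known linear relationship and recovers the coefficients with ordinary least squares via `np.linalg.lstsq`. The "true" intercept and slope (2.0 and 3.0), the noise level, and the sample size are all assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical "true" relationship: y = 2 + 3x + noise
true_beta0, true_beta1 = 2.0, 3.0
n = 500
x = rng.uniform(0, 10, size=n)
y = true_beta0 + true_beta1 * x + rng.normal(0, 1.0, size=n)

# Design matrix with a column of ones for the intercept
X = np.column_stack([np.ones(n), x])

# Least-squares fit: beta_hat holds the point estimates of the coefficients
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

print(f"Intercept: true = {true_beta0:.2f}, estimate = {beta_hat[0]:.4f}")
print(f"Slope:     true = {true_beta1:.2f}, estimate = {beta_hat[1]:.4f}")
```

The estimates land near the true values but not exactly on them, because they are computed from one noisy sample.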

---

## 4. Python Implementation: Estimating Population Mean
This script draws a single sample from a simulated population and compares its point estimate of the mean with the true population value.


```python
import numpy as np

# 1. Create a "true" population (unknown in real life)
np.random.seed(42)
population = np.random.normal(loc=55, scale=10, size=100000)
true_mu = np.mean(population)

# 2. Take a sample (np.random.choice samples with replacement by default)
sample_size = 100
sample = np.random.choice(population, sample_size)

# 3. Calculate point estimates
point_est_mean = np.mean(sample)
point_est_std = np.std(sample, ddof=1)  # ddof=1 divides by n-1 (sample standard deviation)

# The gap between one sample's estimate and the truth is sampling error, not bias
print(f"True Population Mean: {true_mu:.4f}")
print(f"Point Estimate (Sample Mean): {point_est_mean:.4f}")
print(f"Absolute Error: {abs(true_mu - point_est_mean):.4f}")
```

Output:

```
True Population Mean: 55.0097
Point Estimate (Sample Mean): 55.7202
Absolute Error: 0.7105
```
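
To see how the estimate behaves as the sample size grows, one option is to repeat the sampling many times at several sizes and average the absolute error. The sketch below assumes the same kind of simulated population as above but uses NumPy's `default_rng` generator, so its numbers will not match the script's output. The sample sizes and the 500 repetitions are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(42)

# Simulated population, as in the script above
population = rng.normal(loc=55, scale=10, size=100_000)
true_mu = population.mean()

repetitions = 500
mean_abs_error = {}

for n in (10, 100, 1_000, 10_000):
    # Draw `repetitions` samples of size n at once; each row is one sample
    samples = rng.choice(population, size=(repetitions, n))
    estimates = samples.mean(axis=1)
    mean_abs_error[n] = np.abs(estimates - true_mu).mean()
    print(f"n = {n:>6}: mean absolute error = {mean_abs_error[n]:.4f}")
```

The average error should shrink roughly in proportion to $1/\sqrt{n}$, so a 100-fold increase in sample size buys about a 10-fold reduction in error.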


### Pro Tip:
A point estimate is almost **never** exactly equal to the population parameter due to **sampling error**. In professional Data Science reports, it is best practice to provide the point estimate (e.g., "The model accuracy is 85%") followed by its margin of error or a confidence interval.
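
A minimal sketch of that reporting pattern, assuming a hypothetical sample of 100 values and a normal-approximation 95% interval (critical value of about 1.96):

```python
import numpy as np

rng = np.random.default_rng(7)

# Hypothetical sample; in practice this would be the observed data
sample = rng.normal(loc=55, scale=10, size=100)

n = sample.size
mean_hat = sample.mean()                     # point estimate of the mean
std_err = sample.std(ddof=1) / np.sqrt(n)    # estimated standard error
margin = 1.96 * std_err                      # approx. 95% margin of error (normal approximation)

ci_low, ci_high = mean_hat - margin, mean_hat + margin
print(f"Mean: {mean_hat:.2f} +/- {margin:.2f} (95% CI: {ci_low:.2f} to {ci_high:.2f})")
```

For small samples, a t-distribution critical value would usually be a better choice than 1.96.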