# Statistical Inference with Python

## What Is Statistical Inference?
Statistical inference is the process of drawing conclusions about a population based on a sample of data. It's a powerful tool that allows us to move from observations to generalizations. In the world of machine learning, statistical inference provides the bedrock for:
* Estimating model parameters and quantifying their uncertainty
* Evaluating the performance of machine learning models and making comparisons
* Making predictions on new, unseen data with quantified uncertainty (for example, prediction intervals)
* Understanding the underlying relationships between variables

## Key Concepts and Notation
Before we proceed, let's establish some essential terminology and notation:
* **Population:** The entire collection of individuals or objects about which we want to draw conclusions. We often use Greek letters to denote population parameters such as:
    * µ (mu) for the population mean
    * σ (sigma) for the population standard deviation
    * σ² (sigma squared) for the population variance
* **Sample:** A subset of the population from which we collect data. Sample statistics are usually denoted with Roman letters:
    * x̄ (x-bar) for the sample mean
    * s for the sample standard deviation
    * s² for the sample variance
    * n for the sample size
* **Random Variable:** A variable whose value is the numerical outcome of a random phenomenon. We often use uppercase letters (e.g., X, Y) to denote random variables.
* **Probability Distribution:** A function that describes the likelihood of different outcomes for a random variable. Common examples include the normal distribution, binomial distribution, and Poisson distribution. 
* **Sampling Distribution:** The probability distribution of a statistic across repeated samples (e.g., the distribution of the sample means obtained from many different samples of the same size). This concept is central to understanding how well a sample statistic estimates a population parameter.
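
The idea of a sampling distribution can be made concrete with a small simulation. The sketch below is illustrative only: it assumes a hypothetical normal population (mean 170, standard deviation 10) and arbitrary choices for the sample size and the number of repeated samples. It draws many samples, computes the mean of each, and compares the spread of those means with the theoretical standard error σ/√n.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # fixed seed so the sketch is reproducible

# Hypothetical population parameters (assumed for illustration)
mu, sigma = 170.0, 10.0
n = 25              # size of each sample
n_samples = 10_000  # number of repeated samples

# Each row is one sample of size n; take the mean of every row
samples = rng.normal(loc=mu, scale=sigma, size=(n_samples, n))
sample_means = samples.mean(axis=1)

# The sample means should cluster around mu,
# with a spread close to sigma / sqrt(n)
mean_of_means = sample_means.mean()
sd_of_means = sample_means.std(ddof=1)
theoretical_se = sigma / np.sqrt(n)

print(f"Mean of sample means: {mean_of_means:.2f} (population mean: {mu})")
print(f"SD of sample means:   {sd_of_means:.2f} (sigma / sqrt(n): {theoretical_se:.2f})")
```

With these settings, the standard deviation of the sample means should land near 2, which is the population standard deviation divided by √25.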

## Estimation with Python
The goal of estimation is to use sample data to approximate unknown population parameters. 

### Point Estimation
A point estimate is a single value that serves as our "best guess" for a population parameter. For example, the sample mean (x̄) is a common point estimator for the population mean (µ).

**Desirable Properties of Estimators:**
* **Unbiasedness:** An estimator is unbiased if its expected value is equal to the true parameter value (i.e., on average, it hits the target). Mathematically, for an estimator θ̂ (theta-hat) of a parameter θ (theta), E(θ̂)=θ.
* **Consistency:** An estimator is consistent if it converges to the true parameter value as the sample size increases (i.e., more data leads to better accuracy).
* **Efficiency:** An estimator is efficient if it has the smallest variance among all unbiased estimators (i.e., it's the most precise).

**Example with Python:**

```python
import numpy as np

# Generate 10 random integer heights (in cm) from 150 to 199
# (the upper bound of np.random.randint, 200, is exclusive)
data = np.random.randint(150, 200, size=10)

# Calculate the sample mean (point estimate for the population mean)
sample_mean = np.mean(data)
print(f"Sample Mean: {sample_mean}")
```

Example output (the data are random, so the value varies between runs):

```
Sample Mean: 168.6
```
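
Two of the estimator properties listed above, unbiasedness and consistency, can also be checked by simulation. The following is a minimal sketch under assumed settings (a hypothetical normal population with mean 170 and variance 100; the sample sizes and trial counts are arbitrary). It compares the variance estimator that divides by n (`ddof=0`) with the one that divides by n − 1 (`ddof=1`), and then shows how the spread of the sample mean shrinks as n grows.

```python
import numpy as np

rng = np.random.default_rng(seed=1)  # fixed seed so the sketch is reproducible

# Hypothetical population parameters (assumed for illustration)
mu, true_var = 170.0, 100.0
sigma = np.sqrt(true_var)

# --- Unbiasedness: average each variance estimator over many small samples ---
n, n_trials = 5, 50_000
samples = rng.normal(loc=mu, scale=sigma, size=(n_trials, n))

avg_var_ddof0 = samples.var(axis=1, ddof=0).mean()  # divides by n (biased low)
avg_var_ddof1 = samples.var(axis=1, ddof=1).mean()  # divides by n - 1 (unbiased)

print(f"True variance:                {true_var:.1f}")
print(f"Average estimate with ddof=0: {avg_var_ddof0:.1f}")
print(f"Average estimate with ddof=1: {avg_var_ddof1:.1f}")

# --- Consistency: the sample mean concentrates around mu as n increases ---
spreads = {}
for size in (10, 100, 1000):
    means = rng.normal(loc=mu, scale=sigma, size=(2_000, size)).mean(axis=1)
    spreads[size] = means.std(ddof=1)
    print(f"n = {size:4d}: SD of sample means = {spreads[size]:.3f}")
```

On average, the `ddof=0` estimator should come out near (n − 1)/n times the true variance, while the `ddof=1` estimator should come out near the true variance itself. This is why `ddof=1` is the usual choice when computing s² from a sample.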


### Interval Estimation
