# Day 6: Confidence Intervals and Hypothesis Testing Basics

Today’s focus on **Confidence Intervals** and **Hypothesis Testing** marks an essential step in understanding how to draw conclusions from data. Both techniques are fundamental to inferential statistics: they let us generalize from sample data to a population and judge whether the data are consistent with a stated hypothesis.

---

## Theory Overview:

### Confidence Intervals (CI):
- A **confidence interval** provides a range within which we expect a population parameter (like the mean) to lie, based on our sample data, with a specified level of confidence.
- **Confidence Level**: A 95% confidence level means that if we drew 100 independent samples and computed a 95% CI from each, we would expect approximately 95 of those intervals to contain the true population mean.
- **Margin of Error**: The margin of error is the half-width of the interval (the critical value times the standard error). It represents the uncertainty around the sample estimate, with smaller margins indicating greater precision.
- The interval is typically calculated as:

  $$
  \text{CI} = \text{sample mean} \pm (\text{critical value}) \times \text{standard error}
  $$

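- The **critical value** depends on the desired confidence level and on which distribution we use: a z-score (when the sample is large or the population standard deviation is known) or a t-score (when the sample is small and the standard deviation is estimated from the sample).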
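The formula above can be sketched by hand in a few lines. This is a minimal illustration, not a prescribed method: the sample values and variable names are made up for the example, and the standard error is taken to be the sample standard deviation divided by √n.

```python
import numpy as np
from scipy import stats

# Hypothetical sample of 8 heights in cm (illustrative values only)
sample = np.array([168.0, 172.5, 165.0, 170.0, 174.5, 169.0, 171.0, 166.0])

n = len(sample)
sample_mean = sample.mean()
# Standard error: sample standard deviation (ddof=1) divided by sqrt(n)
standard_error = sample.std(ddof=1) / np.sqrt(n)

confidence_level = 0.95
tail_prob = (1 - confidence_level) / 2  # probability left in each tail

# Critical values: t with n-1 degrees of freedom vs. standard normal z
t_critical = stats.t.ppf(1 - tail_prob, df=n - 1)
z_critical = stats.norm.ppf(1 - tail_prob)

# CI = sample mean ± critical value × standard error
margin_of_error = t_critical * standard_error
ci_lower = sample_mean - margin_of_error
ci_upper = sample_mean + margin_of_error

print(f"t critical: {t_critical:.3f}, z critical: {z_critical:.3f}")
print(f"95% CI: ({ci_lower:.2f}, {ci_upper:.2f})")
```

With a sample this small, the t critical value is noticeably larger than the z critical value, so the t-based interval is wider. The gap shrinks as the sample size grows.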

### Hypothesis Testing:
- **Hypothesis testing** helps us decide if there’s enough evidence in our sample data to infer that a certain condition is true for the entire population.
  
  #### Null Hypothesis (H₀) vs. Alternative Hypothesis (H₁):
  - The **null hypothesis (H₀)** generally represents a default position (e.g., no effect, no difference).
  - The **alternative hypothesis (H₁)** is what we seek to support with our data.

- **P-Value**: This is the probability of observing a result at least as extreme as the one in our sample, assuming the null hypothesis is true. A small p-value (commonly < 0.05) indicates that the observed result would be unlikely under H₀, suggesting we may reject it in favor of H₁. It is not the probability that H₀ itself is true.
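To make the p-value definition concrete, here is a small sketch that computes a two-sided p-value by hand for a one-sample t-test. The sample values, the hypothesized mean of 170, and the 0.05 threshold are illustrative assumptions.

```python
import numpy as np
from scipy import stats

# Hypothetical sample (illustrative values only) and hypothesized mean under H0
sample = np.array([168.0, 172.5, 165.0, 170.0, 174.5, 169.0, 171.0, 166.0])
hypothesized_mean = 170.0

n = len(sample)
standard_error = sample.std(ddof=1) / np.sqrt(n)

# t statistic: distance of the sample mean from the H0 mean, in standard errors
t_statistic = (sample.mean() - hypothesized_mean) / standard_error

# Two-sided p-value: probability under H0 of a |t| at least this large
p_value = 2 * stats.t.sf(abs(t_statistic), df=n - 1)

alpha = 0.05  # assumed significance level
reject_null = p_value < alpha

print(f"t statistic: {t_statistic:.3f}, p-value: {p_value:.3f}")
print("Reject H0" if reject_null else "Fail to reject H0")
```

The survival function `stats.t.sf` gives the upper-tail probability. Doubling it accounts for deviations in both directions, which matches the two-sided alternative "the mean differs from 170".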

---

## Practical: Calculating Confidence Intervals and Conducting Hypothesis Tests in Python
Let's see how to calculate a confidence interval and perform a hypothesis test using Python.

### 1. Confidence Interval Calculation
Suppose we have sample data representing the heights of individuals (in cm). We’ll calculate a 95% confidence interval for the population mean, centered on the sample mean.

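```python
import numpy as np
import scipy.stats as stats

# Simulate 30 heights drawn from a normal distribution (mean 170, sd 10)
data = np.random.normal(170, 10, 30)
mean = np.mean(data)
std_error = stats.sem(data)  # standard error of the mean

# t-based interval with n - 1 degrees of freedom
confidence_level = 0.95
confidence_interval = stats.t.interval(confidence_level, len(data)-1, loc=mean, scale=std_error)

print(f"Sample Mean: {mean}")
print(f"95% Confidence Interval: {confidence_interval}")
```

Example output (values vary between runs because the data are random):

```
Sample Mean: 167.98812604753738
95% Confidence Interval: (164.51451746987308, 171.46173462520167)
```
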
Here, `stats.t.interval` computes the interval from the t-distribution with `len(data) - 1` degrees of freedom. We use the t-distribution because the sample is small and the standard deviation is estimated from the sample itself.
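To see what the 95% confidence level means in practice, we can repeat the calculation on many simulated samples and count how often the interval contains the true mean. This is only a sketch: the population parameters, the seed, and the number of trials are assumptions chosen for illustration.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)  # fixed seed so the run is repeatable

true_mean, true_sd = 170.0, 10.0  # assumed population parameters
sample_size, n_trials = 30, 1000
confidence_level = 0.95

covered = 0
for _ in range(n_trials):
    sample = rng.normal(true_mean, true_sd, sample_size)
    lower, upper = stats.t.interval(
        confidence_level,
        sample_size - 1,
        loc=sample.mean(),
        scale=stats.sem(sample),
    )
    # Count the trial if the interval captured the true mean
    if lower <= true_mean <= upper:
        covered += 1

coverage = covered / n_trials
print(f"Fraction of intervals containing the true mean: {coverage:.3f}")
```

The printed fraction should land close to 0.95. Any single interval either contains the true mean or it does not; the 95% describes the long-run behavior of the procedure.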

### 2. Hypothesis Testing: One-Sample t-Test
Suppose we want to test if our sample mean height significantly differs from a hypothesized population mean, say 170 cm.
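One way to run this test is with `scipy.stats.ttest_1samp`. The sketch below regenerates simulated heights in the same way as the confidence interval example. The seed, the significance level of 0.05, and the variable names are assumptions made for illustration.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)  # fixed seed so the run is repeatable
data = rng.normal(170, 10, 30)  # simulated heights in cm

hypothesized_mean = 170  # H0: the population mean is 170 cm

# Two-sided one-sample t-test of the sample mean against the hypothesized mean
t_statistic, p_value = stats.ttest_1samp(data, popmean=hypothesized_mean)

alpha = 0.05  # assumed significance level
print(f"t statistic: {t_statistic:.3f}")
print(f"p-value: {p_value:.3f}")

if p_value < alpha:
    print("Reject H0: the sample mean differs significantly from 170 cm.")
else:
    print("Fail to reject H0: no significant difference from 170 cm.")
```

Because the data here are drawn from a population whose mean really is 170, we would usually expect the test to fail to reject H₀, although about 5% of such samples will produce a p-value below 0.05 by chance.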

