# Statistics Assignment: Theoretical Solutions

---

## **1. Hypothesis Testing in Statistics**  
Hypothesis testing is a method to determine if there is enough statistical evidence to support a claim about a population parameter. It involves:  
- **Null Hypothesis (\(H_0\))**: Assumes no effect/difference.  
- **Alternative Hypothesis (\(H_1\))**: Represents the researcher’s claim.  
- **Test Statistic**: Computed from sample data (e.g., \(Z\), \(T\)).  
- **Decision Rule**: Compare the test statistic to a critical value or use a **P-value**.  

**Example**:  
Testing if a new drug lowers blood pressure:  
- \(H_0\): Drug has no effect (\(\mu = \mu_0\)).  
- \(H_1\): Drug reduces blood pressure (\(\mu < \mu_0\)).  

---

## **2. Null Hypothesis vs. Alternative Hypothesis**  
- **Null Hypothesis (\(H_0\))**: Default assumption (e.g., \(\mu = 50\)).  
- **Alternative Hypothesis (\(H_1\))**: Contradicts \(H_0\) (e.g., \(\mu \neq 50\) or \(\mu > 50\)).  

**Key Difference**:  
- \(H_0\) is tested for rejection; \(H_1\) is the claim to validate.  

---

## **3. Significance Level (\(\alpha\))**  
- The probability of rejecting \(H_0\) when it is true (**Type I error**).  
- Common values: \(\alpha = 0.05\) or \(\alpha = 0.01\).  
- **Importance**: Controls the risk of false positives.  

---

## **4. P-value**  
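- The probability, computed **assuming \(H_0\) is true**, of observing a test statistic at least as extreme as the one obtained from the sample.  
- **Formula** (right-tailed test; use \(\leq\) for a left-tailed test, and double the smaller tail probability for a two-tailed test with a symmetric distribution):  
  \[
  P\text{-value} = P(\text{Test Statistic} \geq \text{Observed Value} \mid H_0)
  \]  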
**Example**:  
If \(P = 0.03\) and \(\alpha = 0.05\), reject \(H_0\).  
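
The definition can be sketched in code for the special case of a Z statistic. This is a minimal illustration using only Python's standard library; the function name `p_value_from_z` and the observed value `1.88` are assumptions made for the example, not part of the assignment.

```python
from statistics import NormalDist

def p_value_from_z(z, tail="two-sided"):
    """Return the P-value for an observed Z statistic, assuming H0 is true."""
    std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1
    if tail == "right":
        return 1 - std_normal.cdf(z)           # P(Z >= z)
    if tail == "left":
        return std_normal.cdf(z)               # P(Z <= z)
    if tail == "two-sided":
        return 2 * (1 - std_normal.cdf(abs(z)))  # both tails
    raise ValueError("tail must be 'left', 'right', or 'two-sided'")

alpha = 0.05
p = p_value_from_z(1.88, tail="right")  # hypothetical observed statistic
decision = "reject H0" if p < alpha else "fail to reject H0"
print(f"P-value = {p:.4f}: {decision}")
```

With these illustrative numbers the P-value is close to 0.03, so the decision matches the example above.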

---

## **5. Interpreting the P-value**  
- **Low \(P\)-value** (\(P < \alpha\)): The data are unlikely under \(H_0\); reject \(H_0\).  
- **High \(P\)-value** (\(P \geq \alpha\)): Insufficient evidence to reject \(H_0\); this does not prove \(H_0\) is true.  

---

## **6. Type I and Type II Errors**  
- **Type I Error**: Rejecting \(H_0\) when it is true (\(\alpha\)).  
  - Example: Concluding a drug works when it does not.  
- **Type II Error (\(\beta\))**: Failing to reject \(H_0\) when it is false.  
  - Example: Failing to detect a drug’s effect.  
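
A small simulation can make the Type I error rate concrete. The sketch below assumes a two-tailed Z-test with known \(\sigma\) and draws every sample from a population where \(H_0\) is actually true, so each rejection is a false positive. The function name and the values of `mu0`, `sigma`, `n`, `trials`, and `seed` are illustrative assumptions.

```python
import random
from statistics import NormalDist, fmean

def type_i_error_rate(mu0=50.0, sigma=10.0, n=25, alpha=0.05,
                      trials=4000, seed=0):
    """Estimate how often a two-tailed Z-test rejects a true H0."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed critical value
    standard_error = sigma / n ** 0.5
    rejections = 0
    for _ in range(trials):
        # H0 is true here: the population mean really is mu0.
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        z = (fmean(sample) - mu0) / standard_error
        if abs(z) > z_crit:
            rejections += 1  # a Type I error
    return rejections / trials

rate = type_i_error_rate()
print(f"Estimated Type I error rate: {rate:.3f}")
```

The estimated rate should land near the chosen \(\alpha\), up to simulation noise.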

---

## **7. One-Tailed vs. Two-Tailed Tests**  
- **One-Tailed**: Tests for an effect in **one direction** (e.g., \(\mu > 50\)).  
  - Critical region in one tail of the distribution.  
- **Two-Tailed**: Tests for effects in **both directions** (e.g., \(\mu \neq 50\)).  
  - Critical regions in both tails.  
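
The difference in critical regions can be illustrated numerically for a Z-test. This sketch uses the standard library's `NormalDist`; the helper name `critical_values` is an assumption made for the example.

```python
from statistics import NormalDist

def critical_values(alpha=0.05):
    """Return Z critical values for one-tailed and two-tailed tests."""
    std_normal = NormalDist()
    return {
        # All of alpha sits in the upper tail.
        "right-tailed": std_normal.inv_cdf(1 - alpha),
        # All of alpha sits in the lower tail.
        "left-tailed": std_normal.inv_cdf(alpha),
        # alpha is split evenly; reject when |Z| exceeds this value.
        "two-tailed": std_normal.inv_cdf(1 - alpha / 2),
    }

for name, value in critical_values(0.05).items():
    print(f"{name}: {value:.3f}")
```

At \(\alpha = 0.05\) the one-tailed cutoff is about 1.645 while the two-tailed cutoff is about 1.96, because the two-tailed test splits \(\alpha\) across both tails.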

---

## **8. Z-Test**  
- A hypothesis test using the **standard normal distribution** (\(Z\)-distribution).  
- **Use Case**:  
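  - Population variance (\(\sigma^2\)) is known.  
  - The population is normal, or the sample is large (\(n \geq 30\) is a common rule of thumb) so that the sample mean is approximately normal.  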

**Formula**:  
\[
Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}
\]  
Where:  
- \(\bar{X}\) = Sample mean  
- \(\mu\) = Population mean hypothesized under \(H_0\)  
- \(\sigma\) = Population standard deviation  
- \(n\) = Sample size  
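
As a rough sketch, the formula translates directly into code. The function name `z_test` and the input numbers are hypothetical, chosen only to make the arithmetic easy to follow; the two-tailed P-value is one possible choice of alternative.

```python
from math import sqrt
from statistics import NormalDist

def z_test(sample_mean, mu0, sigma, n):
    """Return the Z statistic and its two-tailed P-value."""
    standard_error = sigma / sqrt(n)
    z = (sample_mean - mu0) / standard_error
    p_two_tailed = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_two_tailed

# Hypothetical data: sample mean 52 from n = 100, testing mu = 50, sigma = 10.
z, p = z_test(sample_mean=52.0, mu0=50.0, sigma=10.0, n=100)
print(f"Z = {z:.2f}, two-tailed P-value = {p:.4f}")
```

Here the standard error is \(10 / \sqrt{100} = 1\), so \(Z = 2\) and the P-value falls just below 0.05.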

---

## **9. Z-Score Calculation**  
- Measures how many standard errors the sample mean deviates from the population mean.  
\[
Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}
\]  
**Interpretation**:  
- \(Z = 1.96\) → Sample mean is 1.96 standard errors above \(\mu\).  

---

## **10. T-Distribution**  
- A distribution with **heavier tails** than the normal distribution.  
- **Use Case**:  
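  - Population variance unknown (estimated by the sample variance \(s^2\)).  
  - Especially important for small samples (\(n < 30\)); for large \(n\) it is nearly identical to the normal distribution.  
- **Degrees of Freedom**: \(df = n - 1\) for a one-sample test.  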

---

## **11. Z-Test vs. T-Test**  
| **Z-Test** | **T-Test** |  
|------------|------------|  
| Uses \(\sigma\) (population standard deviation known). | Uses \(s\) (sample standard deviation; \(\sigma\) unknown). |  
| Suitable for large samples. | Suitable for small samples. |  

---

## **12. T-Test**  
- Compares means using the **T-distribution**.  
- **Types**:  
  1. **One-Sample**: Compare sample mean to a known \(\mu\).  
  2. **Independent**: Compare two independent groups.  
  3. **Paired**: Compare paired observations (e.g., pre-test vs. post-test).  

**Formula (One-Sample)**:  
\[
t = \frac{\bar{X} - \mu}{s / \sqrt{n}}
\]  
Where \(s\) = sample standard deviation.  
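
The one-sample formula can be sketched as follows. The function name `one_sample_t` and the `scores` data are made up for illustration. The standard library has no t-distribution, so the final comparison against a critical value would come from a t-table or a statistics package.

```python
from math import sqrt
from statistics import mean, stdev

def one_sample_t(sample, mu0):
    """Return the one-sample t statistic and its degrees of freedom."""
    n = len(sample)
    s = stdev(sample)  # sample standard deviation (divides by n - 1)
    t = (mean(sample) - mu0) / (s / sqrt(n))
    return t, n - 1

scores = [52, 48, 55, 51, 49, 53, 50, 54]  # hypothetical sample
t, df = one_sample_t(scores, mu0=50)
print(f"t = {t:.3f} with df = {df}")
```

For this sample \(\bar{X} = 51.5\) and \(s^2 = 6\), giving \(t \approx 1.732\) with \(df = 7\). The two-tailed critical value from a t-table at \(\alpha = 0.05\) is about 2.365, so \(H_0\) would not be rejected.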

---

## **13. Relationship Between Z-Test and T-Test**  
- Both compare means, but the T-test accounts for uncertainty in estimating \(\sigma\).  
- As \(n \to \infty\), the T-distribution converges to the Z-distribution.  

---

## **14. Confidence Interval (CI)**  
- A range of values, computed from the sample, that is constructed to contain the population parameter with a stated confidence level.  
- **Formula for Mean** (\(\sigma\) known):  
\[
\text{CI} = \bar{X} \pm Z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}}
\]  
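**Example**: 95% CI = \([45, 55]\) → the interval comes from a procedure that captures the true mean in 95% of repeated samples; informally, we are 95% confident the true mean lies in this interval.  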
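
A minimal sketch of the interval formula, assuming \(\sigma\) is known. The function name and the inputs (`sample_mean=50`, `sigma=10`, `n=16`) are illustrative assumptions chosen to give an interval close to the one in the example.

```python
from math import sqrt
from statistics import NormalDist

def z_confidence_interval(sample_mean, sigma, n, confidence=0.95):
    """Return (lower, upper) bounds of a Z-based CI for the mean."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # Z_{alpha/2}
    margin = z * sigma / sqrt(n)
    return sample_mean - margin, sample_mean + margin

low, high = z_confidence_interval(sample_mean=50.0, sigma=10.0, n=16)
print(f"95% CI = [{low:.1f}, {high:.1f}]")
```

With these numbers the margin is about \(1.96 \times 10 / 4 = 4.9\), so the interval is roughly \([45.1, 54.9]\).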

---

## **15. Margin of Error**  
- Half the width of the confidence interval.  
\[
\text{Margin of Error} = Z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}}
\]  
- **Impact**: Larger \(n\) reduces the margin of error.  
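
The effect of sample size can be checked numerically. In this sketch (function name and inputs are assumptions for illustration), quadrupling \(n\) halves the margin of error because \(n\) enters through \(\sqrt{n}\).

```python
from math import sqrt
from statistics import NormalDist

def margin_of_error(sigma, n, confidence=0.95):
    """Return Z_{alpha/2} * sigma / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * sigma / sqrt(n)

# Hypothetical sigma = 10; each step multiplies n by 4.
margins = {n: margin_of_error(10.0, n) for n in (25, 100, 400)}
for n, m in margins.items():
    print(f"n = {n:>3}: margin of error = {m:.2f}")
```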

---

## **16. Bayes’ Theorem**  
\[
P(A|B) = \frac{P(B|A) \cdot P(A)}{P(B)}
\]  
- **Application**: Updates prior beliefs (\(P(A)\)) with data (\(P(B|A)\)).  
- **Example**: Spam detection (updating spam probability based on keywords).  
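
The spam example can be worked through with made-up numbers. In this sketch, \(A\) is "message is spam" and \(B\) is "message contains a given keyword"; all three probabilities are hypothetical. \(P(B)\) is expanded with the law of total probability.

```python
def posterior(prior, likelihood, likelihood_given_not):
    """Return P(A|B) from P(A), P(B|A), and P(B|not A)."""
    # Law of total probability: P(B) = P(B|A)P(A) + P(B|not A)P(not A)
    evidence = likelihood * prior + likelihood_given_not * (1 - prior)
    return likelihood * prior / evidence

# Hypothetical inputs: 20% of mail is spam, the keyword appears
# in 60% of spam and in 5% of legitimate mail.
p_spam_given_word = posterior(prior=0.20, likelihood=0.60,
                              likelihood_given_not=0.05)
print(f"P(spam | keyword) = {p_spam_given_word:.2f}")
```

Here \(P(B) = 0.12 + 0.04 = 0.16\), so seeing the keyword raises the spam probability from 0.20 to \(0.12 / 0.16 = 0.75\).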

---

## **17. Chi-Square Distribution**  
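- A right-skewed distribution (that of a sum of squared independent standard normal variables) whose shape depends on its degrees of freedom; widely used in **categorical data analysis**.  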
- **Use Cases**:  
  - Goodness-of-fit tests.  
  - Tests of independence.  

---

## **18. Chi-Square Goodness-of-Fit Test**  
- Tests if observed data matches an expected distribution.  
- **Formula**:  
\[
\chi^2 = \sum \frac{(O_i - E_i)^2}{E_i}
\]  
Where:  
- \(O_i\) = Observed frequency  
- \(E_i\) = Expected frequency  

**Example**: Testing if a die is fair (\(E_i = \frac{n}{6}\)).  
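
The die example might be computed as below. The roll counts are invented for illustration, and the critical value is the standard table entry for \(df = 6 - 1 = 5\) at \(\alpha = 0.05\).

```python
def chi_square_statistic(observed, expected):
    """Return the sum of (O_i - E_i)^2 / E_i over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

rolls = [8, 12, 9, 11, 14, 6]   # hypothetical counts for faces 1 to 6
n = sum(rolls)                  # 60 rolls in total
expected = [n / 6] * 6          # a fair die gives n/6 per face

chi2 = chi_square_statistic(rolls, expected)
CRITICAL_VALUE = 11.070         # chi-square table: df = 5, alpha = 0.05
fair = chi2 <= CRITICAL_VALUE
print(f"chi-square = {chi2:.2f}; "
      f"{'fail to reject' if fair else 'reject'} H0 (die is fair)")
```

With these counts \(\chi^2 = 42 / 10 = 4.2\), well below 11.07, so the data give no reason to doubt the die.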

---

## **19. F-Distribution**  
- The distribution of the ratio of two independent chi-square variables, each divided by its degrees of freedom.  
- **Use Cases**:  
  - ANOVA (comparing group means via a ratio of variances).  
  - Testing equality of variances.  

---

## **20. ANOVA Test**  
- **Analysis of Variance (ANOVA)** tests differences between group means.  
- **Assumptions**:  
  1. Normality.  
  2. Homogeneity of variances.  
  3. Independence.  

**Formula (F-statistic)**:  
\[
F = \frac{\text{Between-group variance}}{\text{Within-group variance}}
\]  

---

## **21. Types of ANOVA**  
1. **One-Way ANOVA**: Tests one factor (e.g., effect of fertilizer on plant growth).  
2. **Two-Way ANOVA**: Tests two factors and their interaction.  
3. **MANOVA**: Tests group differences across multiple dependent variables simultaneously.  

---

## **22. F-Test**  
- Compares variances using the **F-distribution**.  
- **In ANOVA**: Compares variability between groups to variability within groups.  

**Formula**:  
\[
F = \frac{MS_{\text{between}}}{MS_{\text{within}}}
\]  
Where \(MS\) = Mean Square.  
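
For a one-way ANOVA, the mean squares can be sketched as follows. The function name and the three small groups are hypothetical; the sketch takes \(MS_{\text{between}} = SS_{\text{between}} / (k - 1)\) and \(MS_{\text{within}} = SS_{\text{within}} / (N - k)\) for \(k\) groups and \(N\) total observations.

```python
from statistics import fmean

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = fmean([x for g in groups for x in g])

    # Variability of group means around the grand mean.
    ss_between = sum(len(g) * (fmean(g) - grand_mean) ** 2 for g in groups)
    # Variability of observations around their own group mean.
    ss_within = sum((x - fmean(g)) ** 2 for g in groups for x in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within, k - 1, n_total - k

groups = [[4, 5, 6], [6, 7, 8], [8, 9, 10]]  # hypothetical measurements
f_stat, df1, df2 = one_way_anova_f(groups)
print(f"F = {f_stat:.1f} with df = ({df1}, {df2})")
```

Here \(MS_{\text{between}} = 24 / 2 = 12\) and \(MS_{\text{within}} = 6 / 6 = 1\), so \(F = 12\). The F-table critical value for \((2, 6)\) degrees of freedom at \(\alpha = 0.05\) is about 5.14, so the group means would be judged different.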

---