# Question 1: Explain the properties of the F-distribution.

The **F-distribution** is a continuous probability distribution that arises as the ratio of two independent chi-square random variables, each divided by its degrees of freedom. Key properties include:

- It is **asymmetric and right-skewed**.  
- Defined by two degrees of freedom parameters: \(d_1\) (numerator) and \(d_2\) (denominator).  
- Values are always **non-negative** (≥ 0).  
- Used to compare variances by examining the ratio of two sample variances.  
- The mean of the F-distribution is \(\frac{d_2}{d_2 - 2}\) for \(d_2 > 2\).  
- As both degrees of freedom grow large, the distribution becomes less skewed and concentrates around 1, where it is approximately normal.
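
The mean formula, non-negativity, and right skew listed above can be checked numerically. The following is a minimal sketch using `scipy.stats`; the degrees of freedom `d1 = 5` and `d2 = 10`, the seed, and the sample size are arbitrary values chosen for illustration.

```python
import numpy as np
from scipy.stats import chi2, f

d1, d2 = 5, 10  # illustrative degrees of freedom (assumed values)

# Theoretical mean d2 / (d2 - 2), valid only for d2 > 2
theoretical_mean = d2 / (d2 - 2)
scipy_mean = f.mean(d1, d2)

# Build F variates as the ratio of two independent chi-square variables,
# each divided by its own degrees of freedom
rng = np.random.default_rng(0)
numerator = chi2.rvs(d1, size=200_000, random_state=rng) / d1
denominator = chi2.rvs(d2, size=200_000, random_state=rng) / d2
ratio = numerator / denominator

print(f"Theoretical mean: {theoretical_mean:.3f}, scipy mean: {scipy_mean:.3f}")
print(f"Simulated mean:   {ratio.mean():.3f}")
print(f"All values non-negative: {ratio.min() >= 0}")
# For a right-skewed distribution the median lies below the mean
print(f"Median below mean: {np.median(ratio) < ratio.mean()}")
```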

---

# Question 2: In which types of statistical tests is the F-distribution used, and why is it appropriate for these tests?

The F-distribution is primarily used in:

- **F-tests** to compare variances of two populations.  
- **Analysis of Variance (ANOVA)** to test differences between group means by comparing between-group variance to within-group variance.

It is appropriate because, when the populations are normal, each sample variance is proportional to a chi-square variable, so the ratio of two independent sample variances follows an F-distribution under the null hypothesis of equal variances.
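
A small simulation can illustrate this claim. The sketch below assumes two normal populations with the same standard deviation; the sample sizes, means, standard deviation, and seed are hypothetical values, not taken from the questions.

```python
import numpy as np
from scipy.stats import f

rng = np.random.default_rng(1)
n1, n2 = 8, 12      # hypothetical sample sizes
n_sim = 50_000      # number of simulated pairs of samples

# Two normal populations with equal variance (the null hypothesis holds)
x = rng.normal(loc=0.0, scale=3.0, size=(n_sim, n1))
y = rng.normal(loc=5.0, scale=3.0, size=(n_sim, n2))

# Ratio of unbiased sample variances for each simulated pair
ratios = x.var(axis=1, ddof=1) / y.var(axis=1, ddof=1)

# Compare the simulated 95th percentile with the F(n1 - 1, n2 - 1) quantile
empirical_q95 = np.quantile(ratios, 0.95)
theoretical_q95 = f.ppf(0.95, n1 - 1, n2 - 1)
exceed_rate = (ratios > theoretical_q95).mean()

print(f"Empirical 95th percentile:   {empirical_q95:.3f}")
print(f"Theoretical 95th percentile: {theoretical_q95:.3f}")
print(f"Share of ratios above it:    {exceed_rate:.4f}")
```

If the F-distribution is the right model, the share of simulated ratios above the theoretical quantile should be close to 0.05.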

---

# Question 3: What are the key assumptions required for conducting an F-test to compare the variances of two populations?

Key assumptions for the F-test include:

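- Samples are drawn randomly and are independent of each other.  
- Populations are normally distributed; the F-test for variances is sensitive to departures from normality.  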
- Samples come from populations with continuous distributions.  
- The data are measured on at least an interval scale.
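
One possible way to screen the normality assumption is a Shapiro-Wilk test. The sketch below applies `scipy.stats.shapiro` to the income data from Question 8; choosing this particular test is an assumption of the example, and with only five observations per group it has little power, so a large p-value is weak evidence at best.

```python
import numpy as np
from scipy.stats import shapiro

# Income data from Question 8
prof_a = np.array([48, 52, 55, 60, 62])
prof_b = np.array([45, 50, 55, 52, 47])

results = {}
for name, sample in (("A", prof_a), ("B", prof_b)):
    # Null hypothesis: the sample comes from a normal distribution
    statistic, p_value = shapiro(sample)
    results[name] = (statistic, p_value)
    print(f"Profession {name}: W = {statistic:.3f}, p = {p_value:.3f}")
```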

---

# Question 4: What is the purpose of ANOVA, and how does it differ from a t-test?

- **Purpose of ANOVA:** To determine if there are statistically significant differences among the means of three or more groups.  
- **Difference from t-test:**  
  - A t-test compares means between two groups only.  
  - ANOVA compares all groups in a single test, so the Type I error rate stays at the chosen significance level. Running a separate t-test for each pair of groups would inflate the overall (family-wise) Type I error rate.

---

# Question 5: Explain when and why you would use a one-way ANOVA instead of multiple t-tests when comparing more than two groups.

One-way ANOVA is used when comparing the means of **three or more independent groups** to test if at least one group mean differs significantly.  

Running a separate t-test for every pair of groups inflates the chance of at least one Type I error (false positive) across the set of tests. ANOVA avoids this by testing all group means in a single test at the chosen significance level.
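
The size of this inflation can be sketched with a short calculation. It assumes the pairwise tests are independent, which is only an approximation because pairwise comparisons share data; the helper name `familywise_error` is made up for this example.

```python
from math import comb


def familywise_error(k, alpha=0.05):
    """Approximate chance of at least one false positive across all
    pairwise tests among k groups, assuming independent tests."""
    m = comb(k, 2)  # number of pairwise comparisons
    return 1 - (1 - alpha) ** m


for k in (3, 4, 5):
    print(f"{k} groups, {comb(k, 2)} t-tests: "
          f"family-wise error rate ~ {familywise_error(k):.3f}")
```

With three groups, three pairwise tests at \(\alpha = 0.05\) already give roughly a 14% chance of at least one false positive under this approximation.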

---

# Question 6: Explain how variance is partitioned in ANOVA into between-group variance and within-group variance. How does this partitioning contribute to the calculation of the F-statistic?

- **Between-group variance:** Variability due to differences between the group means and the overall mean.  
- **Within-group variance:** Variability of the observations within each group around that group's own mean (random error).  

The **F-statistic** is the ratio:

\[
F = \frac{\text{Between-group variance}}{\text{Within-group variance}}
\]

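Each variance here is a mean square: a sum of squares divided by its degrees of freedom (\(k - 1\) between groups and \(N - k\) within groups, for \(k\) groups and \(N\) observations in total). A larger F-value indicates that between-group differences are large relative to within-group variation, suggesting significant group differences.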
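
The partition can be worked through by hand on a small example. The three groups below are hypothetical data invented for illustration; the result is cross-checked against `scipy.stats.f_oneway`.

```python
import numpy as np
from scipy.stats import f_oneway

# Hypothetical data: three groups of three observations each
groups = [
    np.array([4.0, 5.0, 6.0]),
    np.array([7.0, 8.0, 9.0]),
    np.array([10.0, 11.0, 12.0]),
]

all_values = np.concatenate(groups)
grand_mean = all_values.mean()
k = len(groups)          # number of groups
n_total = all_values.size

# Between-group sum of squares: group means around the grand mean
ss_between = sum(g.size * (g.mean() - grand_mean) ** 2 for g in groups)
# Within-group sum of squares: observations around their own group mean
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)
# Total sum of squares: observations around the grand mean
ss_total = ((all_values - grand_mean) ** 2).sum()

ms_between = ss_between / (k - 1)
ms_within = ss_within / (n_total - k)
F_manual = ms_between / ms_within

F_scipy, _ = f_oneway(*groups)

print(f"SS between + SS within = {ss_between + ss_within:.1f}, SS total = {ss_total:.1f}")
print(f"F (manual) = {F_manual:.3f}, F (scipy) = {F_scipy:.3f}")
```

The total sum of squares equals the sum of the between-group and within-group parts, which is the identity the F-statistic is built on.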

---

# Question 7: Compare the classical (frequentist) approach to ANOVA with the Bayesian approach. What are the key differences in terms of how they handle uncertainty, parameter estimation, and hypothesis testing?

- **Frequentist ANOVA:**  
  - Relies on sampling distributions and p-values.  
  - Hypothesis testing focuses on rejecting or failing to reject null hypotheses.  
  - Parameters are fixed but unknown quantities.

- **Bayesian ANOVA:**  
  - Incorporates prior beliefs about parameters, updating these with data to get posterior distributions.  
  - Provides probabilistic interpretations of parameters and model comparisons.  
  - Handles uncertainty explicitly through the posterior distribution.
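
The contrast can be sketched on a two-group comparison, which is simpler than a full Bayesian ANOVA. The example assumes a noninformative prior \(p(\mu, \sigma^2) \propto 1/\sigma^2\) for each group, under which the posterior of a group mean is a shifted and scaled t-distribution; the data, the helper name `posterior_mean_draws`, and the seed are illustrative choices.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Hypothetical measurements for two groups
group_a = np.array([160, 162, 165, 158, 164], dtype=float)
group_b = np.array([172, 175, 170, 168, 174], dtype=float)


def posterior_mean_draws(sample, n_draws, rng):
    """Draw from the posterior of the mean under the prior 1/sigma^2:
    mean ~ sample_mean + (s / sqrt(n)) * t with n - 1 degrees of freedom."""
    n = sample.size
    scale = sample.std(ddof=1) / np.sqrt(n)
    return sample.mean() + scale * rng.standard_t(df=n - 1, size=n_draws)


# Bayesian view: a probability statement about the parameters themselves
mu_a = posterior_mean_draws(group_a, 100_000, rng)
mu_b = posterior_mean_draws(group_b, 100_000, rng)
prob_b_greater = (mu_b > mu_a).mean()

# Frequentist view: a p-value for the null hypothesis of equal means
t_stat, p_value = stats.ttest_ind(group_b, group_a, equal_var=False)

print(f"Posterior P(mean B > mean A) ~ {prob_b_greater:.4f}")
print(f"Welch t-test p-value: {p_value:.2e}")
```

The posterior probability describes the parameters directly, whereas the p-value describes how surprising the data would be if the null hypothesis were true.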

---


# Question 8: You have two sets of data representing the incomes of two different professions:

- Profession A: [48, 52, 55, 60, 62]  
- Profession B: [45, 50, 55, 52, 47]  

Perform an F-test to determine if the variances of the two professions' incomes are equal. What are your conclusions based on the F-test?


In [2]:

import numpy as np
from scipy.stats import f

# Data
prof_a = np.array([48, 52, 55, 60, 62])
prof_b = np.array([45, 50, 55, 52, 47])

# Sample variances
var_a = np.var(prof_a, ddof=1)
var_b = np.var(prof_b, ddof=1)

# F-statistic (larger variance / smaller variance)
if var_a > var_b:
    F = var_a / var_b
    dfn = len(prof_a) - 1  # degrees of freedom numerator
    dfd = len(prof_b) - 1  # degrees of freedom denominator
else:
    F = var_b / var_a
    dfn = len(prof_b) - 1
    dfd = len(prof_a) - 1

# Two-tailed p-value: F >= 1 by construction (larger variance on top),
# so double the upper-tail probability and cap the result at 1
p_value = min(1.0, 2 * f.sf(F, dfn, dfd))

print(f"Variance Profession A: {var_a:.3f}")
print(f"Variance Profession B: {var_b:.3f}")
print(f"F-statistic: {F:.3f}")
print(f"P-value: {p_value:.4f}")

alpha = 0.05
if p_value < alpha:
    print("Reject the null hypothesis: variances are significantly different.")
else:
    print("Fail to reject the null hypothesis: no significant difference in variances.")

Variance Profession A: 32.800
Variance Profession B: 15.700
F-statistic: 2.089
P-value: 0.4930
Fail to reject the null hypothesis: no significant difference in variances.
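
The same conclusion can be reached with a critical value instead of a p-value. The sketch below reuses the sample variances reported above (32.8 and 15.7) and assumes a two-sided test at \(\alpha = 0.05\); because the larger variance is in the numerator, only the upper \(\alpha/2\) critical value is needed.

```python
from scipy.stats import f

# Sample variances reported in the output above
var_a, var_b = 32.8, 15.7
F_observed = var_a / var_b   # larger variance in the numerator
dfn, dfd = 4, 4              # n - 1 for each sample of five

alpha = 0.05
# Upper critical value for a two-sided test
upper_crit = f.ppf(1 - alpha / 2, dfn, dfd)
reject = F_observed > upper_crit

print(f"Observed F: {F_observed:.3f}")
print(f"Critical value at alpha/2: {upper_crit:.3f}")
print("Reject H0" if reject else "Fail to reject H0")
```

With samples this small the critical value is large, so only a very big difference in variances would be detected.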


# Question 9: Conduct a one-way ANOVA to test whether there are any statistically significant differences in average heights between three different regions with the following data:

In [3]:
from scipy.stats import f_oneway

region_a = [160, 162, 165, 158, 164]
region_b = [172, 175, 170, 168, 174]
region_c = [180, 182, 179, 185, 183]

# Perform one-way ANOVA
F_statistic, p_value = f_oneway(region_a, region_b, region_c)

print(f"F-statistic: {F_statistic:.3f}")
print(f"P-value: {p_value:.4f}")

alpha = 0.05
if p_value < alpha:
    print("Reject the null hypothesis: at least one group mean is significantly different.")
else:
    print("Fail to reject the null hypothesis: no significant difference between group means.")


F-statistic: 67.873
P-value: 0.0000
Reject the null hypothesis: at least one group mean is significantly different.
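
A p-value printed as 0.0000 only shows that it is below 0.00005. The sketch below repeats the test and prints the p-value in scientific notation, then cross-checks it against the upper tail of the F-distribution with \(k - 1 = 2\) and \(N - k = 12\) degrees of freedom.

```python
from scipy.stats import f, f_oneway

region_a = [160, 162, 165, 158, 164]
region_b = [172, 175, 170, 168, 174]
region_c = [180, 182, 179, 185, 183]

F_statistic, p_value = f_oneway(region_a, region_b, region_c)

k = 3                                                   # number of groups
n_total = len(region_a) + len(region_b) + len(region_c)

# Upper-tail probability of F with (k - 1, N - k) degrees of freedom
p_from_sf = f.sf(F_statistic, k - 1, n_total - k)

print(f"F-statistic: {F_statistic:.3f}")
print(f"P-value (scientific notation): {p_value:.3e}")
print(f"P-value from f.sf:             {p_from_sf:.3e}")
```

The p-value is on the order of \(10^{-7}\), far below \(\alpha = 0.05\), which agrees with the decision to reject the null hypothesis.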
