**1. Explain the properties of the F-distribution.**

The F-distribution has these key properties:

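Shape: Asymmetric and right-skewed; the skew lessens as both degrees of freedom grow.
Degrees of Freedom (df): Defined by two values, df1 (numerator) and df2 (denominator).
Range: 0 to positive infinity; the distribution takes only non-negative values.
Mean: Exists only if df2 > 2, in which case it equals df2 / (df2 - 2).
Used in: ANOVA, testing variances, and regression analysis.
Right-Tail Tests: Significance tests usually use the right tail, because a large F-value signals that the numerator variance exceeds the denominator variance.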
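The properties above can be checked numerically. The following is a minimal sketch using `scipy.stats.f`; the degrees of freedom (5 and 10) are illustrative values chosen here, not taken from any dataset.

```python
from scipy import stats

# Illustrative degrees of freedom (assumed values, not from the text)
dfn, dfd = 5, 10
f_dist = stats.f(dfn, dfd)

# Mean equals dfd / (dfd - 2) and is defined only when dfd > 2
mean_formula = dfd / (dfd - 2)
mean_scipy = f_dist.mean()

# Right skew: the median lies below the mean
median = f_dist.median()

# Right-tail critical value for a test at the 5% level
critical_value = f_dist.ppf(0.95)

print(f"mean (formula): {mean_formula:.4f}")
print(f"mean (scipy):   {mean_scipy:.4f}")
print(f"median:         {median:.4f}")
print(f"5% critical F:  {critical_value:.4f}")
```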

**2. In which types of statistical tests is the F-distribution used, and why is it appropriate for these tests?**

The F-distribution is used in:

ANOVA: To test if multiple group means are equal by comparing variances between and within groups.
Regression Analysis: To test the significance of a model by comparing explained to unexplained variance.
Variance Comparison: To test if two independent sample variances are significantly different.
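It is appropriate because each of these tests reduces to a ratio of two independent variance estimates, and when the data are normally distributed and the null hypothesis holds, that ratio follows an F-distribution.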
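A small simulation can illustrate why the F-distribution is the right reference here. This sketch draws pairs of normal samples with equal variances, forms the ratio of their sample variances, and checks how often the ratio exceeds the 5% critical value of F(n1 - 1, n2 - 1). The sample sizes, seed, and number of simulations are arbitrary choices made for illustration.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)  # fixed seed so the run is reproducible
n1, n2 = 6, 11                  # assumed sample sizes
n_sims = 100_000

# Two independent normal samples per simulation, with equal true variances
x = rng.normal(0.0, 1.0, size=(n_sims, n1))
y = rng.normal(0.0, 1.0, size=(n_sims, n2))

# Ratio of unbiased sample variances for each simulated pair
ratios = x.var(axis=1, ddof=1) / y.var(axis=1, ddof=1)

# If the ratio follows F(n1 - 1, n2 - 1), about 5% of ratios exceed this value
critical_value = stats.f.ppf(0.95, n1 - 1, n2 - 1)
rejection_rate = np.mean(ratios > critical_value)

print(f"Share of ratios above the 5% critical value: {rejection_rate:.4f}")
```

The printed share should land close to 0.05, which is what the F-distribution predicts under the null hypothesis.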

**3. What are the key assumptions required for conducting an F-test to compare the variances of two populations?**

Key assumptions for an F-test to compare variances are:

Normality: Both populations should be normally distributed.
Independence: The samples must be independent of each other.
Random Sampling: Data should be collected randomly from the populations.
Measurement Scale: The data should be continuous, measured on an interval or ratio scale.
These assumptions ensure the validity of the F-test for comparing variances. The test is particularly sensitive to departures from normality.
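One way to probe the normality assumption, and to fall back on a more robust test if it looks doubtful, is sketched below. It uses the Shapiro-Wilk test and the median-centered Levene (Brown-Forsythe) test from SciPy on the two profession samples from question 8. With only five observations per group, these checks have little power, so treat the output as a rough guide rather than a verdict.

```python
import numpy as np
from scipy import stats

profession_a = np.array([48, 52, 55, 60, 62])
profession_b = np.array([45, 50, 55, 52, 47])

# Shapiro-Wilk: a small p-value suggests a departure from normality
shapiro_a = stats.shapiro(profession_a)
shapiro_b = stats.shapiro(profession_b)

# Levene's test centered on the median is a variance comparison
# that is less sensitive to non-normality than the F-test
levene_result = stats.levene(profession_a, profession_b, center="median")

print(f"Shapiro-Wilk p-value (A): {shapiro_a.pvalue:.3f}")
print(f"Shapiro-Wilk p-value (B): {shapiro_b.pvalue:.3f}")
print(f"Levene p-value:           {levene_result.pvalue:.3f}")
```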

**4. What is the purpose of ANOVA, and how does it differ from a t-test?**

The purpose of ANOVA (Analysis of Variance) is to test if there are significant differences between the means of three or more groups. It compares the variance between groups to the variance within groups.

Difference from t-test:

A t-test compares means of only two groups, while ANOVA can handle three or more groups.
ANOVA keeps the overall Type I error (false positive) rate at the chosen significance level, whereas running multiple t-tests on the same data inflates it.
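The two tests are closely related: with exactly two groups, one-way ANOVA and the equal-variance t-test give the same p-value, and the F-statistic equals the square of the t-statistic. A short sketch, using the Region A and Region B heights from question 9 as the two groups:

```python
from scipy import stats

group_1 = [160, 162, 165, 158, 164]
group_2 = [172, 175, 170, 168, 174]

# Student's t-test with pooled variance (equal_var=True)
t_result = stats.ttest_ind(group_1, group_2, equal_var=True)

# One-way ANOVA on the same two groups
f_result = stats.f_oneway(group_1, group_2)

print(f"t squared:     {t_result.statistic ** 2:.4f}")
print(f"F-statistic:   {f_result.statistic:.4f}")
print(f"t-test p:      {t_result.pvalue:.6g}")
print(f"ANOVA p:       {f_result.pvalue:.6g}")
```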

**5. Explain when and why you would use a one-way ANOVA instead of multiple t-tests when comparing more than two groups.**

You would use a one-way ANOVA instead of multiple t-tests when comparing more than two groups to:

Avoid Increased Type I Error: Conducting multiple t-tests increases the chance of false positives (Type I error) as each test carries its own error rate. ANOVA controls this by performing a single test.

Compare All Groups Simultaneously: ANOVA tests if there’s at least one significant difference among multiple group means in a single analysis, providing an overall view rather than multiple pairwise comparisons.

Efficiency: It’s more efficient and straightforward to use one ANOVA test rather than running multiple t-tests for each group combination.
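The inflation of the Type I error rate can be made concrete. The sketch below computes the chance of at least one false positive across all pairwise t-tests, under the simplifying assumption that the tests are independent (pairwise tests on shared groups are not strictly independent, so this is an approximation). The function name `familywise_error_rate` is introduced here for illustration.

```python
from math import comb

def familywise_error_rate(n_groups, alpha=0.05):
    """Approximate chance of at least one false positive across all pairwise tests.

    Assumes the pairwise tests are independent, which is a simplification.
    """
    n_tests = comb(n_groups, 2)  # number of distinct group pairs
    return 1 - (1 - alpha) ** n_tests

for k in (3, 4, 5, 10):
    print(f"{k} groups: {comb(k, 2)} tests, "
          f"familywise error rate ~ {familywise_error_rate(k):.3f}")
```

With three groups and a 5% level per test, the approximate familywise rate is already about 14%.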

**6. Explain how variance is partitioned in ANOVA into between-group variance and within-group variance. How does this partitioning contribute to the calculation of the F-statistic?**

In ANOVA:

Between-Group Variance: Measures how much group means differ from the overall mean, showing group effects.
Within-Group Variance: Measures variability within each group, reflecting random error.
The F-statistic is calculated by dividing between-group variance by within-group variance:

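F = Between-Group Variance / Within-Group Variance

Here the between-group variance is the between-group sum of squares divided by (k - 1), and the within-group variance is the within-group sum of squares divided by (N - k), for k groups and N total observations. A high F-value means the group means vary more than random within-group variation would explain, which suggests that at least one group mean differs.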
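The partitioning can be worked through by hand. This sketch computes the between-group and within-group sums of squares for the region heights from question 9, converts them to mean squares, and compares the resulting F-statistic with `scipy.stats.f_oneway`.

```python
import numpy as np
from scipy import stats

groups = [
    np.array([160, 162, 165, 158, 164]),  # Region A
    np.array([172, 175, 170, 168, 174]),  # Region B
    np.array([180, 182, 179, 185, 183]),  # Region C
]

all_values = np.concatenate(groups)
grand_mean = all_values.mean()
k = len(groups)            # number of groups
n_total = all_values.size  # total number of observations

# Between-group sum of squares: spread of group means around the grand mean
ss_between = sum(g.size * (g.mean() - grand_mean) ** 2 for g in groups)

# Within-group sum of squares: spread of observations around their own group mean
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)

# Mean squares are the sums of squares divided by their degrees of freedom
ms_between = ss_between / (k - 1)
ms_within = ss_within / (n_total - k)

f_manual = ms_between / ms_within
p_manual = stats.f.sf(f_manual, k - 1, n_total - k)

f_scipy, p_scipy = stats.f_oneway(*groups)
print(f"Manual F: {f_manual:.2f}, SciPy F: {f_scipy:.2f}")
```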

**7. Compare the classical (frequentist) approach to ANOVA with the Bayesian approach. What are the key differences in terms of how they handle uncertainty, parameter estimation, and hypothesis testing?**

In classical (frequentist) ANOVA:

Uncertainty: Based on sampling variability and fixed parameters.
Parameter Estimation: Uses sample data to estimate group means and variances.
Hypothesis Testing: Tests if group means differ by calculating p-values, with results depending on sample size and significance level.

In Bayesian ANOVA:

Uncertainty: Treated as a probability distribution for parameters.
Parameter Estimation: Uses prior distributions combined with data to estimate parameters, yielding a posterior distribution.
Hypothesis Testing: Assesses the probability of hypotheses (e.g., differences in means) directly from the posterior, so results can be stated as probability statements about the parameters themselves.
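To make the Bayesian side concrete, here is a deliberately simplified sketch rather than a full Bayesian ANOVA. It assumes normal data and a noninformative reference prior for each group separately, in which case the posterior of a group mean is a shifted and scaled t-distribution with n - 1 degrees of freedom. The helper `posterior_mean_draws` is a name introduced here, and the data are the Region B and Region C heights from question 9.

```python
import numpy as np

rng = np.random.default_rng(42)  # fixed seed for reproducibility

region_b = np.array([172, 175, 170, 168, 174])
region_c = np.array([180, 182, 179, 185, 183])

def posterior_mean_draws(sample, n_draws, rng):
    """Sample the posterior of a group mean under a noninformative prior.

    Assumes normal data; the posterior is then
    mean + (s / sqrt(n)) * t with n - 1 degrees of freedom.
    """
    n = sample.size
    standard_error = sample.std(ddof=1) / np.sqrt(n)
    return sample.mean() + standard_error * rng.standard_t(df=n - 1, size=n_draws)

draws_b = posterior_mean_draws(region_b, 50_000, rng)
draws_c = posterior_mean_draws(region_c, 50_000, rng)

# Posterior of the difference in means, summarized directly
diff = draws_c - draws_b
prob_c_greater = np.mean(diff > 0)
interval = np.percentile(diff, [2.5, 97.5])

print(f"P(mean C > mean B | data): {prob_c_greater:.4f}")
print(f"95% credible interval for the difference: {interval.round(2)}")
```

Unlike a p-value, the printed probability is a direct statement about the parameters given the data and the assumed prior.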

In [1]:
'''8. Question: You have two sets of data representing the incomes of two different professions:
- Profession A: [48, 52, 55, 60, 62]
- Profession B: [45, 50, 55, 52, 47]
Perform an F-test to determine if the variances of the two professions' incomes are equal. What are your conclusions based on the F-test?

Task: Use Python to calculate the F-statistic and p-value for the given data.

Objective: Gain experience in performing F-tests and interpreting the results in terms of variance comparison.'''
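import numpy as np
import scipy.stats as stats

profession_a = np.array([48, 52, 55, 60, 62])
profession_b = np.array([45, 50, 55, 52, 47])

# Sample variances (ddof=1 gives the unbiased estimator)
var_a = np.var(profession_a, ddof=1)
var_b = np.var(profession_b, ddof=1)

# F-statistic: ratio of the two sample variances
f_statistic = var_a / var_b

# Degrees of freedom for the numerator and denominator
df_a = len(profession_a) - 1
df_b = len(profession_b) - 1

# One-sided (right-tail) p-value: P(F >= f_statistic) under equal variances
p_value = stats.f.sf(f_statistic, df_a, df_b)

f_statistic, p_value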
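# Since the one-sided p-value (about 0.25) is greater than 0.05, we fail to reject the
# null hypothesis of equal variances. Doubling it for a two-sided test (about 0.49)
# leads to the same conclusion. The data show no significant difference between the
# variances of the incomes of Profession A and Profession B, although failing to
# reject the null hypothesis does not prove that the variances are equal.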

(2.089171974522293, 0.24652429950266966)

**9. Question: Conduct a one-way ANOVA to test whether there are any statistically significant differences in average heights between three different regions, using the following data:**

Region A: [160, 162, 165, 158, 164]

Region B: [172, 175, 170, 168, 174]

Region C: [180, 182, 179, 185, 183]

Task: Write Python code to perform the one-way ANOVA and interpret the results.

Objective: Learn how to perform one-way ANOVA using Python and interpret the F-statistic and p-value.

In [3]:
import numpy as np
import scipy.stats as stats

region_a = np.array([160, 162, 165, 158, 164])
region_b = np.array([172, 175, 170, 168, 174])
region_c = np.array([180, 182, 179, 185, 183])

f_statistic, p_value = stats.f_oneway(region_a, region_b, region_c)

print(f"F-statistic: {f_statistic:.2f}")
print(f"P-value: {p_value:.4e}")

if p_value < 0.05:
    print("Reject the null hypothesis: There are statistically significant differences in average heights between regions.")
else:
    print("Fail to reject the null hypothesis: There are no statistically significant differences in average heights between regions.")


F-statistic: 67.87
P-value: 2.8707e-07
Reject the null hypothesis: There are statistically significant differences in average heights between regions.
