### Q1
Shape: The F-distribution is positively (right) skewed and becomes more symmetrical as both degrees of freedom increase.

Range: The values of the F-distribution are never negative; they range from 0 to infinity.

Parameters: It is characterized by two sets of degrees of freedom: the numerator degrees of freedom (df1) and the denominator degrees of freedom (df2).
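
The shape and range properties above can be checked numerically. The following is a minimal sketch using `scipy.stats.f`; the degrees-of-freedom pairs are arbitrary values chosen for illustration, not part of the original answer.

```python
from scipy.stats import f

# (numerator df, denominator df) pairs chosen for illustration only.
# The skewness of the F-distribution is defined only for denominator df > 6.
df_pairs = [(5, 10), (20, 40), (100, 200)]

# Theoretical skewness for each pair; it stays positive but shrinks as df grow.
skews = [float(f.stats(dfn, dfd, moments="s")) for dfn, dfd in df_pairs]
for (dfn, dfd), skew in zip(df_pairs, skews):
    print(f"df1={dfn:>3}, df2={dfd:>3}: skewness = {skew:.3f}")

# The CDF at 0 is 0 for every pair, i.e. no probability mass lies below zero.
mass_below_zero = [float(f.cdf(0, dfn, dfd)) for dfn, dfd in df_pairs]
print(mass_below_zero)
```

The skewness values should decrease from one pair to the next while remaining positive, which matches the "right-skewed, more symmetrical with larger df" description.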

### Q2
ANOVA (Analysis of Variance): It compares the variance between groups to the variance within groups to determine whether there are any statistically significant differences among the group means.

F-test for equality of variances: It tests if two populations have equal variances.

### Q3
Independence: The samples must be independent of each other.

Normality: The populations from which the samples are drawn must be normally distributed.

Homogeneity of variance: The populations must have equal variances.
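
Two of these assumptions can be screened with standard tests. The sketch below reuses the three region samples from Q9 and applies the Shapiro-Wilk test (normality) and Levene's test (equal variances) from SciPy. The choice of these two tests is an assumption of this example, and independence cannot be verified from the data alone; it depends on how the samples were collected.

```python
import numpy as np
from scipy.stats import levene, shapiro

# Region samples reused from Q9 purely as example input.
groups = {
    "region_A": np.array([160, 162, 165, 158, 164]),
    "region_B": np.array([172, 175, 170, 168, 174]),
    "region_C": np.array([180, 182, 179, 185, 183]),
}

# Normality: Shapiro-Wilk per group (a small p-value suggests non-normality).
normality_p = {name: shapiro(values).pvalue for name, values in groups.items()}
for name, p in normality_p.items():
    print(f"{name}: Shapiro-Wilk p-value = {p:.3f}")

# Homogeneity of variance: Levene's test across all groups
# (a small p-value suggests the variances differ).
levene_stat, levene_p = levene(*groups.values())
print(f"Levene statistic = {levene_stat:.3f}, p-value = {levene_p:.3f}")
```

With only five observations per group, these tests have little power, so a large p-value is weak evidence that the assumptions hold.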

### Q4
ANOVA: Used to compare means of three or more groups simultaneously to see if at least one group mean is different from the others.

t-test: Compares the means of two groups.

The key difference is the number of groups being compared. ANOVA can handle more than two groups, whereas t-tests are limited to two.
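
For exactly two groups the two procedures agree: the one-way ANOVA F-statistic equals the square of the pooled-variance t-statistic, and the p-values match. A minimal sketch, using the two profession samples from Q8 as example input:

```python
import numpy as np
from scipy.stats import f_oneway, ttest_ind

# Two samples taken from Q8, used here only as example input.
profession_A = np.array([48, 52, 55, 60, 62])
profession_B = np.array([45, 50, 55, 52, 47])

# Student's t-test with pooled variance (the default, equal_var=True).
t_stat, t_p = ttest_ind(profession_A, profession_B)

# One-way ANOVA on the same two groups.
f_stat, f_p = f_oneway(profession_A, profession_B)

print(f"t^2 = {t_stat ** 2:.4f}, F = {f_stat:.4f}")
print(f"t-test p-value = {t_p:.4f}, ANOVA p-value = {f_p:.4f}")
```

This equivalence holds only for the equal-variance t-test; Welch's t-test (`equal_var=False`) would generally give a different result.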

### Q5
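One-way ANOVA is preferred when comparing more than two groups because it controls the Type I error rate. Running a separate t-test for every pair of groups, each at significance level α, inflates the overall (family-wise) risk of at least one Type I error, whereas ANOVA tests all group means in a single test at level α.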
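
The inflation can be approximated with a short calculation. The sketch below assumes the pairwise tests are independent, which is not strictly true when they share groups, so the numbers are a rough guide rather than exact rates. The function name `familywise_error_rate` is introduced here for illustration.

```python
from math import comb


def familywise_error_rate(n_groups: int, alpha: float = 0.05) -> float:
    """Approximate chance of at least one Type I error across all pairwise tests.

    Assumes the tests are independent (a simplification).
    """
    n_tests = comb(n_groups, 2)  # number of distinct group pairs
    return 1 - (1 - alpha) ** n_tests


rates = {k: familywise_error_rate(k) for k in (3, 4, 5)}
for k, rate in rates.items():
    print(f"{k} groups -> {comb(k, 2)} t-tests -> family-wise error rate ~ {rate:.3f}")
```

Under this approximation, three groups already push the error rate from 0.05 to about 0.14, and five groups to about 0.40.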

### Q6
Between-group variance: Variance due to the differences among group means.

Within-group variance: Variance within each group.

ANOVA partitions the total variance into these two components. The ratio of between-group variance to within-group variance forms the F-statistic, which helps in determining if the group means are significantly different.
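
The partition can be computed by hand and compared with SciPy's result. This sketch uses the region samples from Q9; the variable names are illustrative.

```python
import numpy as np
from scipy.stats import f_oneway

# Region samples from Q9.
groups = [
    np.array([160, 162, 165, 158, 164]),
    np.array([172, 175, 170, 168, 174]),
    np.array([180, 182, 179, 185, 183]),
]

all_values = np.concatenate(groups)
grand_mean = all_values.mean()

# Between-group sum of squares: group means around the grand mean.
ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
# Within-group sum of squares: observations around their own group mean.
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)
# Total sum of squares: observations around the grand mean.
ss_total = ((all_values - grand_mean) ** 2).sum()

df_between = len(groups) - 1
df_within = len(all_values) - len(groups)

# F is the ratio of the two mean squares.
f_manual = (ss_between / df_between) / (ss_within / df_within)
f_scipy, _ = f_oneway(*groups)

print(f"SS_between = {ss_between:.1f}, SS_within = {ss_within:.1f}, SS_total = {ss_total:.1f}")
print(f"F (manual) = {f_manual:.4f}, F (scipy) = {f_scipy:.4f}")
```

The between and within sums of squares add up to the total, and the manual F-statistic agrees with `f_oneway`.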

### Q7
Classical (Frequentist) Approach: Relies on p-values and null hypothesis significance testing. Parameters are treated as fixed but unknown, and probability is interpreted as long-run frequency over repeated sampling.

Bayesian Approach: Treats parameters as random quantities described by probability distributions. It starts from a prior belief and updates it with the observed data to obtain a posterior distribution that quantifies uncertainty directly.
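
The contrast can be illustrated on a single small problem. The data below (7 successes in 10 trials) and the uniform Beta(1, 1) prior are hypothetical choices made for this sketch only.

```python
from scipy.stats import beta, binom

# Hypothetical data: 7 successes in 10 trials.
successes, trials = 7, 10

# Frequentist: two-sided p-value for H0: p = 0.5. Doubling the upper tail is
# valid here because the null distribution is symmetric around 5.
p_value = 2 * binom.sf(successes - 1, trials, 0.5)

# Bayesian: a uniform Beta(1, 1) prior updated with the data gives a
# Beta(1 + successes, 1 + failures) posterior for the success probability.
posterior = beta(1 + successes, 1 + trials - successes)
posterior_mean = posterior.mean()
ci_low, ci_high = posterior.interval(0.95)  # equal-tailed 95% credible interval

print(f"Frequentist two-sided p-value: {p_value:.5f}")
print(f"Posterior mean: {posterior_mean:.4f}")
print(f"95% credible interval: ({ci_low:.3f}, {ci_high:.3f})")
```

The frequentist output is a statement about how surprising the data would be under the null hypothesis, while the Bayesian output is a probability distribution over the parameter itself.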

### Q8

import numpy as np
from scipy.stats import f

# Given data
profession_A = np.array([48, 52, 55, 60, 62])
profession_B = np.array([45, 50, 55, 52, 47])

# Compute sample variances
var_A = np.var(profession_A, ddof=1)  
var_B = np.var(profession_B, ddof=1)

# Compute F-statistic with A in the numerator so it matches (df1, df2) below;
# the two-tailed critical values cover both F < 1 and F > 1
F_statistic = var_A / var_B

# Degrees of freedom
df1 = len(profession_A) - 1
df2 = len(profession_B) - 1

# Critical values at the 5% significance level (two-tailed test)
alpha = 0.05
F_critical_low = f.ppf(alpha / 2, df1, df2)
F_critical_high = f.ppf(1 - (alpha / 2), df1, df2)

# Conclusion
reject_null = F_statistic < F_critical_low or F_statistic > F_critical_high

# Print results
print(f"Variance of Profession A: {var_A}")
print(f"Variance of Profession B: {var_B}")
print(f"F-statistic: {F_statistic}")
print(f"Critical F-values (95% confidence, two-tailed): ({F_critical_low}, {F_critical_high})")
print("Conclusion: Reject Null Hypothesis" if reject_null else "Conclusion: Fail to Reject Null Hypothesis")

Variance of Profession A: 32.8
Variance of Profession B: 15.7
F-statistic: 2.089171974522293
Critical F-values (95% confidence, two-tailed): (0.10411753745392768, 9.60452988472286)
Conclusion: Fail to Reject Null Hypothesis
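
The same decision can be reached with a p-value instead of critical values. The sketch below repeats the variance-ratio calculation and doubles the smaller tail probability, one common convention for a two-tailed F-test; it is an alternative formulation, not part of the original solution.

```python
import numpy as np
from scipy.stats import f

profession_A = np.array([48, 52, 55, 60, 62])
profession_B = np.array([45, 50, 55, 52, 47])

# Sample variances and their ratio (A in the numerator).
var_A = np.var(profession_A, ddof=1)
var_B = np.var(profession_B, ddof=1)
F_statistic = var_A / var_B
df1, df2 = len(profession_A) - 1, len(profession_B) - 1

# Two-tailed p-value: double the smaller tail probability, capped at 1.
lower_tail = f.cdf(F_statistic, df1, df2)
upper_tail = f.sf(F_statistic, df1, df2)
p_value = min(1.0, 2 * min(lower_tail, upper_tail))

alpha = 0.05
print(f"F-statistic: {F_statistic:.4f}")
print(f"Two-tailed p-value: {p_value:.4f}")
print("Reject H0" if p_value < alpha else "Fail to reject H0")
```

Because the F-statistic lies inside the critical interval, the p-value exceeds 0.05 and the conclusion is unchanged.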


### Q9

import numpy as np
from scipy.stats import f_oneway

# Given data
region_A = np.array([160, 162, 165, 158, 164])
region_B = np.array([172, 175, 170, 168, 174])
region_C = np.array([180, 182, 179, 185, 183])

# Perform one-way ANOVA
F_statistic, p_value = f_oneway(region_A, region_B, region_C)

# Print results
print(f"F-statistic: {F_statistic}")
print(f"p-value: {p_value}")

# Interpret results
alpha = 0.05
if p_value < alpha:
    print("Conclusion: Reject the null hypothesis - There is a statistically significant difference in average heights between the regions.")
else:
    print("Conclusion: Fail to reject the null hypothesis - No statistically significant difference in average heights between the regions.")


F-statistic: 67.87330316742101
p-value: 2.870664187937026e-07
Conclusion: Reject the null hypothesis - There is a statistically significant difference in average heights between the regions.
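
A significant ANOVA result says that at least one region differs, not which ones. One possible follow-up is a set of pairwise t-tests with a Bonferroni correction, sketched below. This technique is swapped in for illustration and is not part of the original solution; Tukey's HSD is a common alternative.

```python
from itertools import combinations

import numpy as np
from scipy.stats import ttest_ind

regions = {
    "A": np.array([160, 162, 165, 158, 164]),
    "B": np.array([172, 175, 170, 168, 174]),
    "C": np.array([180, 182, 179, 185, 183]),
}

pairs = list(combinations(regions, 2))
alpha = 0.05

# Bonferroni: multiply each raw p-value by the number of comparisons (cap at 1).
adjusted = {}
for first, second in pairs:
    _, p = ttest_ind(regions[first], regions[second])
    adjusted[(first, second)] = min(1.0, p * len(pairs))

for (first, second), p_adj in adjusted.items():
    verdict = "different" if p_adj < alpha else "not distinguishable"
    print(f"Region {first} vs {second}: adjusted p = {p_adj:.5f} -> {verdict}")
```

For this data every pairwise comparison remains significant after the correction, so all three regions appear to differ from one another.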
