Q1. Write a Python function that takes in two arrays of data and calculates the F-value for a variance ratio
test. The function should return the F-value and the corresponding p-value for the test.

Answer:-

In [1]:
import numpy as np
from scipy.stats import f

def variance_ratio_test(data1, data2):
    # Calculate the variances of the two datasets
    var1 = np.var(data1, ddof=1)
    var2 = np.var(data2, ddof=1)

    # Calculate the F-value
    if var1 > var2:
        F = var1 / var2
        df1 = len(data1) - 1
        df2 = len(data2) - 1
    else:
        F = var2 / var1
        df1 = len(data2) - 1
        df2 = len(data1) - 1

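    # Calculate the one-tailed (upper-tail) p-value. Because the larger variance
    # is always placed in the numerator, double this value for a two-sided test.
    p_value = 1 - f.cdf(F, df1, df2)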

    return F, p_value

# Example usage
data1 = [5, 7, 8, 9, 10, 12, 15]
data2 = [6, 6, 7, 8, 8, 9, 11]

F_value, p_value = variance_ratio_test(data1, data2)
print("F-value:", F_value)
print("p-value:", p_value)


F-value: 3.4848484848484858
p-value: 0.07708556610982875
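
The function above reports an upper-tail p-value for the folded ratio (larger variance over smaller). When the question is simply whether the two variances differ, a two-sided p-value is often reported instead. The following is a minimal sketch of that variant, not a replacement for the answer above; the name `two_sided_variance_ratio_test` is hypothetical, and `f.sf` is SciPy's survival function (1 - cdf).

```python
import numpy as np
from scipy.stats import f

def two_sided_variance_ratio_test(data1, data2):
    """Hypothetical variant: F = var(data1) / var(data2) with a two-sided p-value."""
    var1 = np.var(data1, ddof=1)
    var2 = np.var(data2, ddof=1)
    F = var1 / var2
    df1, df2 = len(data1) - 1, len(data2) - 1
    # Double the smaller tail probability and cap the result at 1
    p_two_sided = min(1.0, 2 * min(f.cdf(F, df1, df2), f.sf(F, df1, df2)))
    return F, p_two_sided

data1 = [5, 7, 8, 9, 10, 12, 15]
data2 = [6, 6, 7, 8, 8, 9, 11]
F_value, p_two_sided = two_sided_variance_ratio_test(data1, data2)
print(f"F-value: {F_value:.4f}, two-sided p-value: {p_two_sided:.4f}")
```

For these two samples the degrees of freedom are equal, so the two-sided p-value is exactly twice the one-tailed value printed above.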


Q2. Given a significance level of 0.05 and the degrees of freedom for the numerator and denominator of an
F-distribution, write a Python function that returns the critical F-value for a two-tailed test.

Answer:-

In [2]:
from scipy.stats import f

def critical_f_value(alpha, df_numerator, df_denominator):
    # Calculate the critical F-value for a two-tailed test
    # Since it's a two-tailed test, we divide the significance level by 2
    critical_value = f.ppf(1 - alpha / 2, df_numerator, df_denominator)
    return critical_value

# Example usage:
alpha = 0.05
df_numerator = 4  # Example degrees of freedom for the numerator
df_denominator = 5  # Example degrees of freedom for the denominator

critical_value = critical_f_value(alpha, df_numerator, df_denominator)
print(f"Critical F-value: {critical_value}")

Critical F-value: 7.387885751267751
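
The function above returns only the upper critical value. A two-tailed rejection region also has a lower bound, which can be obtained either directly from `f.ppf(alpha / 2, ...)` or from the reciprocal of the upper critical value with the degrees of freedom swapped. The sketch below illustrates both routes; the helper name `two_tailed_critical_values` is an assumption made for this example.

```python
from scipy.stats import f

def two_tailed_critical_values(alpha, df_numerator, df_denominator):
    """Hypothetical helper: return (lower, upper) critical values."""
    lower = f.ppf(alpha / 2, df_numerator, df_denominator)
    upper = f.ppf(1 - alpha / 2, df_numerator, df_denominator)
    return lower, upper

alpha = 0.05
lower, upper = two_tailed_critical_values(alpha, 4, 5)

# Reciprocal identity: the lower critical value of F(4, 5) equals
# 1 / (upper critical value of F(5, 4))
lower_via_identity = 1 / f.ppf(1 - alpha / 2, 5, 4)

print(f"Lower critical value: {lower:.4f}")
print(f"Upper critical value: {upper:.4f}")
print(f"Lower via reciprocal identity: {lower_via_identity:.4f}")
```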


Q3. Write a Python program that generates random samples from two normal distributions with known variances and uses an F-test to determine if the variances are equal. The program should output the F-value, degrees of freedom, and p-value for the test.

Answer:-

Performing a two-tailed F-test

In [3]:
import numpy as np
from scipy.stats import f

# Set seed for reproducibility
np.random.seed(456)

# Generate random samples from two normal distributions with known variances
n1 = 30
n2 = 40
mean1 = 10
mean2 = 20
var1 = 6
var2 = 4

sample1 = np.random.normal(mean1, np.sqrt(var1), n1)
sample2 = np.random.normal(mean2, np.sqrt(var2), n2)

# Calculate the F-value and p-value for the variance ratio test
F = np.var(sample1, ddof=1) / np.var(sample2, ddof=1)
dfn = n1 - 1
dfd = n2 - 1
p_value = 2 * min(f.cdf(F, dfn, dfd), 1 - f.cdf(F, dfn, dfd))

# Output the results
print("Sample 1 mean: {:.2f}, variance: {:.2f}".format(np.mean(sample1), np.var(sample1, ddof=1)))
print("Sample 2 mean: {:.2f}, variance: {:.2f}".format(np.mean(sample2), np.var(sample2, ddof=1)))
print("F-value: {:.2f}".format(F))
print("Degrees of freedom: ({}, {})".format(dfn, dfd))
print("p-value: {:.4f}".format(p_value))

print('\n===================================================================\n')

# Null hypothesis and alternative hypothesis
null_hypothesis = "The two population variances are equal"
alternate_hypothesis = "The two population variances are different"

# Assuming alpha value of 0.05
alpha = 0.05

# Calculate critical values for the two-tailed F-test
F_crit1 = f.ppf(alpha/2, dfn, dfd)
F_crit2 = f.ppf(1-alpha/2, dfn, dfd)

# Print Critical F values
print(f'Significance Level : {alpha}')
print(f'Numerator dof : {dfn}')
print(f'Denominator dof : {dfd}')
print(f"Critical F-values are {F_crit1:.4f} and {F_crit2:.4f}")

# Conclusion
if (F < F_crit1) or (F > F_crit2):
    print('Reject the Null Hypothesis')
    print(f'Conclusion : {alternate_hypothesis}')
else:
    print('FAILED to reject the Null Hypothesis')
    print(f'Conclusion : {null_hypothesis}')

Sample 1 mean: 10.48, variance: 5.47
Sample 2 mean: 19.90, variance: 2.94
F-value: 1.86
Degrees of freedom: (29, 39)
p-value: 0.0711

===================================================================

Significance Level : 0.05
Numerator dof : 29
Denominator dof : 39
Critical F-values are 0.4920 and 1.9619
FAILED to reject the Null Hypothesis
Conclusion : The two population variances are equal
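
As a sanity check on the two-tailed procedure above, one can simulate many pairs of normal samples that really do share the same variance and count how often the test rejects; the rejection rate should land close to alpha. This is a rough sketch under assumed settings: the seed, the number of trials, and the common standard deviation of 2 are arbitrary choices for illustration, not part of the original answer.

```python
import numpy as np
from scipy.stats import f

rng = np.random.default_rng(0)  # arbitrary seed for this sketch
n1, n2 = 30, 40                 # same sample sizes as the cell above
alpha = 0.05
n_trials = 2000                 # arbitrary number of simulated experiments

dfn, dfd = n1 - 1, n2 - 1
lower = f.ppf(alpha / 2, dfn, dfd)
upper = f.ppf(1 - alpha / 2, dfn, dfd)

rejections = 0
for _ in range(n_trials):
    # Both populations share the same variance, so the null hypothesis is true
    s1 = rng.normal(10, 2, n1)
    s2 = rng.normal(20, 2, n2)
    F = np.var(s1, ddof=1) / np.var(s2, ddof=1)
    if F < lower or F > upper:
        rejections += 1

rejection_rate = rejections / n_trials
print(f"Empirical rejection rate under equal variances: {rejection_rate:.3f}")
```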


Q4. The variances of two populations are known to be 10 and 15. A sample of 12 observations is taken from
each population. Conduct an F-test at the 5% significance level to determine if the variances are
significantly different.

Answer:-

In [4]:
from scipy.stats import f

# Given data
var1 = 10  # Variance of population 1
var2 = 15  # Variance of population 2
n1 = 12    # Sample size from population 1
n2 = 12    # Sample size from population 2
alpha = 0.05  # Significance level

# Calculate the F-statistic (the given variances are treated as sample variances)
f_statistic = var1 / var2

# Degrees of freedom
df1 = n1 - 1  # Numerator degrees of freedom (sample 1)
df2 = n2 - 1  # Denominator degrees of freedom (sample 2)

# Critical F-values for a two-tailed test, since the question asks whether
# the variances differ in either direction
critical_lower = f.ppf(alpha / 2, df1, df2)
critical_upper = f.ppf(1 - alpha / 2, df1, df2)

# Output results
print(f"F-statistic: {f_statistic:.4f}")
print(f"Critical F-values at alpha={alpha}: {critical_lower:.4f} and {critical_upper:.4f}")

# Decision
if f_statistic < critical_lower or f_statistic > critical_upper:
    print("Reject the null hypothesis: The variances are significantly different.")
else:
    print("Fail to reject the null hypothesis: The variances are not significantly different.")

F-statistic: 0.6667
Critical F-values at alpha=0.05: 0.2879 and 3.4737
Fail to reject the null hypothesis: The variances are not significantly different.


Q5. A manufacturer claims that the variance of the diameter of a certain product is 0.005. A sample of 25
products is taken, and the sample variance is found to be 0.006. Conduct an F-test at the 1% significance
level to determine if the claim is justified.

Answer:

In [4]:
import scipy.stats as stats

# Set the significance level
alpha = 0.01

# Set the claimed population variance and sample variance
sigma2 = 0.005  # Claimed population variance
s2 = 0.006      # Sample variance

# Set the sample size
n = 25

# Null Hypothesis and Alternate Hypothesis
null_hypothesis = "The variance of the diameter of the product is 0.005"
alternate_hypothesis = "The variance of the diameter of the product is NOT 0.005."

# Ratio of the sample variance to the claimed variance
F = s2 / sigma2

# Degrees of freedom for the sample variance
df = n - 1

# The claimed variance is a fixed constant, not an estimate, so the denominator
# has no finite degrees of freedom (using dfd = 1 would be wrong). Under
# normality, df * F = (n - 1) * s2 / sigma2 follows a chi-square distribution
# with df degrees of freedom; this is the F(df, infinity) limit rescaled by df.
chi2_stat = df * F

# Calculate the critical values for the two-tailed test
crit1 = stats.chi2.ppf(alpha / 2, df)
crit2 = stats.chi2.ppf(1 - alpha / 2, df)

# Calculate the p-value
p_value = 2 * min(stats.chi2.cdf(chi2_stat, df), 1 - stats.chi2.cdf(chi2_stat, df))

# Print the results
print(f"Variance ratio (F): {F:.4f}")
print(f"Chi-square statistic: {chi2_stat:.4f}")
print(f"Critical values: {crit1:.4f} and {crit2:.4f}")
print(f"P-value: {p_value:.4f}")

if p_value < alpha:
    print("Reject null hypothesis.")
    print(f"Conclusion: {alternate_hypothesis}")
else:
    print("FAIL to reject null hypothesis.")
    print(f"Conclusion: {null_hypothesis}")

Variance ratio (F): 1.2000
Chi-square statistic: 28.8000
Critical values: 9.8862 and 45.5585
P-value: 0.4555
FAIL to reject null hypothesis.
Conclusion: The variance of the diameter of the product is 0.005
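
A note on the denominator degrees of freedom: a claimed population variance is a fixed constant rather than an estimate, so it behaves like a variance estimated with infinitely many degrees of freedom, not one. The sketch below, which assumes only SciPy's `f.ppf` and `chi2.ppf`, shows how the upper critical value of F(24, dfd) shrinks toward the chi-square quantile divided by 24 as `dfd` grows. The variable names and the particular `dfd` values are illustrative choices.

```python
from scipy.stats import chi2, f

k = 24      # numerator degrees of freedom (n - 1 for n = 25)
q = 0.995   # upper quantile for a two-tailed test at alpha = 0.01

# Limiting critical value when the denominator variance is known exactly
chi2_limit = chi2.ppf(q, k) / k

for dfd in (1, 10, 100, 10_000):
    print(f"dfd = {dfd:>6}: F critical value = {f.ppf(q, k, dfd):.4f}")
print(f"chi-square limit: {chi2_limit:.4f}")

f_small = f.ppf(q, k, 1)
f_large = f.ppf(q, k, 10_000)
```

With `dfd = 1` the critical value is so large that almost no sample variance could ever lead to rejection, which is why the choice of denominator degrees of freedom matters here.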


Q6. Write a Python function that takes in the degrees of freedom for the numerator and denominator of an
F-distribution and calculates the mean and variance of the distribution. The function should return the
mean and variance as a tuple.

Answer:-

The mean and variance of an F-distribution with numerator degrees of freedom df_1 and denominator degrees of freedom df_2 can be calculated using the following formulas:

1. Mean of the F-distribution: \[ \text{Mean} = \frac{df_2}{df_2 - 2} \quad \text{for } df_2 > 2 \]

2. Variance of the F-distribution: \[ \text{Variance} = \frac{2 \cdot (df_2)^2 \cdot (df_1 + df_2 - 2)}{df_1 \cdot (df_2 - 2)^2 \cdot (df_2 - 4)} \quad \text{for } df_2 > 4 \]

In [5]:
def f_distribution_stats(df1, df2):
    """
    Calculate the mean and variance of an F-distribution.

    Parameters:
    df1 (int): Degrees of freedom for the numerator
    df2 (int): Degrees of freedom for the denominator

    Returns:
    tuple: A tuple containing the mean and variance of the F-distribution
    """
    # Calculate mean (it depends only on the denominator degrees of freedom)
    if df2 > 2:
        mean = df2 / (df2 - 2)
    else:
        mean = None  # Mean is undefined for df2 <= 2

    # Calculate variance
    if df2 > 4:
        variance = (2 * (df2 ** 2) * (df1 + df2 - 2)) / (df1 * (df2 - 2) ** 2 * (df2 - 4))
    else:
        variance = None  # Variance is undefined for df2 <= 4

    return mean, variance

# Example usage:
df1 = 5  # Degrees of freedom for the numerator
df2 = 10  # Degrees of freedom for the denominator
mean, variance = f_distribution_stats(df1, df2)

print(f"Mean: {mean}, Variance: {variance}")

Mean: 1.25, Variance: 1.3541666666666667
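
As a cross-check, SciPy can report these moments directly through `scipy.stats.f.stats`. The sketch below compares its output for 5 and 10 degrees of freedom with the closed forms d2 / (d2 - 2) and 2 d2^2 (d1 + d2 - 2) / (d1 (d2 - 2)^2 (d2 - 4)); the variable names are chosen for this example only.

```python
from scipy.stats import f

df1, df2 = 5, 10

# Mean ("m") and variance ("v") as computed by SciPy
mean_scipy, var_scipy = f.stats(df1, df2, moments="mv")

# Closed-form expressions for the same quantities
mean_formula = df2 / (df2 - 2)
var_formula = 2 * df2**2 * (df1 + df2 - 2) / (df1 * (df2 - 2) ** 2 * (df2 - 4))

print(f"SciPy:   mean = {float(mean_scipy):.4f}, variance = {float(var_scipy):.4f}")
print(f"Formula: mean = {mean_formula:.4f}, variance = {var_formula:.4f}")
```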


Q7. A random sample of 10 measurements is taken from a normal population with unknown variance. The
sample variance is found to be 25. Another random sample of 15 measurements is taken from another
normal population with unknown variance, and the sample variance is found to be 20. Conduct an F-test
at the 10% significance level to determine if the variances are significantly different.

Answer:-

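The cell below performs a one-tailed (upper-tail) F-test with the larger sample variance in the numerator. Because the question asks whether the variances differ, a two-tailed test is the closer match: it doubles the one-tailed p-value (to about 0.68 here), so the conclusion is the same.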

In [6]:
import numpy as np
from scipy.stats import f

# Set significance level and degrees of freedom
alpha = 0.10
n1 = 10
n2 = 15
df1 = n1-1
df2 = n2-1

# Variance for each group
var1 = 25
var2 = 20

# Null and Alternate hypothesis
null_hypothesis = "Variances are similar"
alternate_hypothesis =  "Variances are significantly different"

# Calculate F-statistic
f_statistic = var1/var2
p_value = 1 - f.cdf(f_statistic, df1, df2)

# Calculate F-critical
F_crit = f.ppf(1-alpha,df1, df2)

# print the results
print(f"F-statistic: {f_statistic:.4f}")
print(f"F Critical value: {F_crit:.4f}")
print(f"P-value: {p_value:.4f}")

# Determine if null hypothesis should be rejected
if p_value < alpha:
    print("Reject null hypothesis.")
    print(f"Conclusion : {alternate_hypothesis}")
else:
    print("FAIL to reject null hypothesis. ")
    print(f"Conclusion : {null_hypothesis}")

F-statistic: 1.2500
F Critical value: 2.1220
P-value: 0.3416
FAIL to reject null hypothesis. 
Conclusion : Variances are similar
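
Since the question asks whether the variances are different rather than whether one is larger, the same numbers can also be run through a two-tailed test. The following is a minimal sketch of that variant; the variable names `p_two_tailed`, `crit_lower`, and `crit_upper` are introduced here for illustration.

```python
from scipy.stats import f

alpha = 0.10
n1, n2 = 10, 15
var1, var2 = 25, 20
df1, df2 = n1 - 1, n2 - 1

f_statistic = var1 / var2

# One-tailed (upper) p-value, as in the cell above
p_one_tailed = f.sf(f_statistic, df1, df2)

# Two-tailed p-value: double the smaller tail probability, capped at 1
p_two_tailed = min(1.0, 2 * min(f.cdf(f_statistic, df1, df2), p_one_tailed))

# Two-tailed critical values split alpha evenly between both tails
crit_lower = f.ppf(alpha / 2, df1, df2)
crit_upper = f.ppf(1 - alpha / 2, df1, df2)
reject = f_statistic < crit_lower or f_statistic > crit_upper

print(f"F-statistic: {f_statistic:.4f}")
print(f"Two-tailed critical values: {crit_lower:.4f} and {crit_upper:.4f}")
print(f"Two-tailed p-value: {p_two_tailed:.4f}")
print("Reject null hypothesis." if reject else "FAIL to reject null hypothesis.")
```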


Q8. The following data represent the waiting times in minutes at two different restaurants on a Saturday
night: Restaurant A: 24, 25, 28, 23, 22, 20, 27; Restaurant B: 31, 33, 35, 30, 32, 36. Conduct an F-test at the 5%
significance level to determine if the variances are significantly different.

Answer:-

Let's conduct an F-test to determine if the variances of the waiting times at the two restaurants are significantly different at the 5% significance level.

Given Data:

Restaurant A: 24, 25, 28, 23, 22, 20, 27

Restaurant B: 31, 33, 35, 30, 32, 36

Significance level (α): 0.05

Steps:

1. Calculate the sample variances.

2. Calculate the F-value.

3. Determine the critical value for the F-distribution.

4. Compare the F-value with the critical value to make a decision.

In [10]:
import numpy as np
from scipy.stats import f

# Given data
restaurant_A = [24, 25, 28, 23, 22, 20, 27]
restaurant_B = [31, 33, 35, 30, 32, 36]

# Calculate the sample variances
s1_sq = np.var(restaurant_A, ddof=1)
s2_sq = np.var(restaurant_B, ddof=1)

# Calculate the F-value with the larger variance in the numerator,
# and order the degrees of freedom to match
if s1_sq >= s2_sq:
    F_value = s1_sq / s2_sq
    df1, df2 = len(restaurant_A) - 1, len(restaurant_B) - 1
else:
    F_value = s2_sq / s1_sq
    df1, df2 = len(restaurant_B) - 1, len(restaurant_A) - 1

# Significance level
alpha = 0.05

# Calculate the critical values for a two-tailed test
critical_value_upper = f.ppf(1 - alpha / 2, df1, df2)
critical_value_lower = f.ppf(alpha / 2, df1, df2)

# Output the results
print(f"F-value: {F_value:.4f}")
print(f"Degrees of freedom: ({df1}, {df2})")
print(f"Critical value (lower): {critical_value_lower:.4f}")
print(f"Critical value (upper): {critical_value_upper:.4f}")

# Conclusion
if F_value < critical_value_lower or F_value > critical_value_upper:
    print("We reject the null hypothesis. The variances are significantly different.")
else:
    print("We fail to reject the null hypothesis. The variances are not significantly different.")

F-value: 1.4552
Degrees of freedom: (6, 5)
Critical value (lower): 0.1670
Critical value (upper): 6.9777
We fail to reject the null hypothesis. The variances are not significantly different.
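
The same comparison can be packaged as a small reusable function that also reports a p-value. This is a sketch under the usual normality assumption; `two_sided_f_test` is a hypothetical helper name, and the p-value doubles the upper-tail probability of the folded ratio (larger variance over smaller), capped at 1.

```python
import numpy as np
from scipy.stats import f

def two_sided_f_test(x, y):
    """Hypothetical helper: folded F ratio, its degrees of freedom, two-sided p-value."""
    vx, vy = np.var(x, ddof=1), np.var(y, ddof=1)
    # Put the larger variance on top and keep the degrees of freedom in step
    if vx >= vy:
        F, dfn, dfd = vx / vy, len(x) - 1, len(y) - 1
    else:
        F, dfn, dfd = vy / vx, len(y) - 1, len(x) - 1
    # Double the upper-tail probability of the folded ratio and cap at 1
    p_value = min(1.0, 2 * f.sf(F, dfn, dfd))
    return F, (dfn, dfd), p_value

restaurant_A = [24, 25, 28, 23, 22, 20, 27]
restaurant_B = [31, 33, 35, 30, 32, 36]

F_value, dfs, p_value = two_sided_f_test(restaurant_A, restaurant_B)
print(f"F-value: {F_value:.4f}, degrees of freedom: {dfs}, p-value: {p_value:.4f}")
```

Because the function orders the variances itself, swapping the two arguments gives the same result.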


Q9. The following data represent the test scores of two groups of students: Group A: 80, 85, 90, 92, 87, 83;
Group B: 75, 78, 82, 79, 81, 84. Conduct an F-test at the 1% significance level to determine if the variances
are significantly different.

Answer:-

Let's conduct an F-test to determine if the variances of the test scores for the two groups of students are significantly different at the 1% significance level.

Given Data:
Group A: 80, 85, 90, 92, 87, 83

Group B: 75, 78, 82, 79, 81, 84

Significance level (α): 0.01

Steps:

1. Calculate the sample variances.

2. Calculate the F-value.

3. Determine the critical value for the F-distribution.

4. Compare the F-value with the critical value to make a decision.

In [12]:
import numpy as np
from scipy.stats import f

# Given data
group_A = [80, 85, 90, 92, 87, 83]
group_B = [75, 78, 82, 79, 81, 84]

# Calculate the sample variances
s1_sq = np.var(group_A, ddof=1)
s2_sq = np.var(group_B, ddof=1)

# Calculate the F-value with the larger variance in the numerator,
# and order the degrees of freedom to match
if s1_sq >= s2_sq:
    F_value = s1_sq / s2_sq
    df1, df2 = len(group_A) - 1, len(group_B) - 1
else:
    F_value = s2_sq / s1_sq
    df1, df2 = len(group_B) - 1, len(group_A) - 1

# Significance level
alpha = 0.01

# Calculate the critical values for a two-tailed test
critical_value_upper = f.ppf(1 - alpha / 2, df1, df2)
critical_value_lower = f.ppf(alpha / 2, df1, df2)

# Output the results
print(f"F-value: {F_value:.4f}")
print(f"Degrees of freedom: ({df1}, {df2})")
print(f"Critical value (lower): {critical_value_lower:.4f}")
print(f"Critical value (upper): {critical_value_upper:.4f}")

# Conclusion
if F_value < critical_value_lower or F_value > critical_value_upper:
    print("We reject the null hypothesis. The variances are significantly different.")
else:
    print("We fail to reject the null hypothesis. The variances are not significantly different.")


F-value: 1.9443
Degrees of freedom: (5, 5)
Critical value (lower): 0.0669
Critical value (upper): 14.9396
We fail to reject the null hypothesis. The variances are not significantly different.
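
The reported F-value can be checked by hand from the two groups of scores. The worked arithmetic below uses the sample variance with n - 1 = 5 in the denominator, matching `ddof=1` in the code; the intermediate sums of squared deviations are rounded to four decimals.

```latex
\begin{aligned}
\bar{x}_A &= \frac{517}{6} \approx 86.1667, &
s_A^2 &= \frac{\sum (x_i - \bar{x}_A)^2}{5} \approx \frac{98.8333}{5} \approx 19.7667 \\
\bar{x}_B &= \frac{479}{6} \approx 79.8333, &
s_B^2 &= \frac{\sum (x_i - \bar{x}_B)^2}{5} \approx \frac{50.8333}{5} \approx 10.1667 \\
F &= \frac{s_A^2}{s_B^2} \approx \frac{19.7667}{10.1667} \approx 1.9443
\end{aligned}
```

Since 1.9443 lies between the two critical values 0.0669 and 14.9396, the hand calculation agrees with the decision printed above.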
