Q1. Write a Python function that takes in two arrays of data and calculates the F-value for a variance ratio
test. The function should return the F-value and the corresponding p-value for the test.

In [1]:
import numpy as np
from scipy.stats import f

def variance_ratio_test(array1, array2):
    """
    Calculate F-value and p-value for a variance ratio test.

    Parameters:
    - array1, array2: Arrays of data for the two groups.

    Returns:
    - F-value, p-value
    """
    # Variance ratio test: F is the ratio of the two sample variances (ddof=1)
    var1 = np.var(array1, ddof=1)
    var2 = np.var(array2, ddof=1)
    df_num = len(array1) - 1
    df_denom = len(array2) - 1
    f_statistic = var1 / var2

    # Two-tailed p-value from the F-distribution
    cdf = f.cdf(f_statistic, df_num, df_denom)
    p_value = 2 * min(cdf, 1 - cdf)

    return f_statistic, p_value

# Example usage:
# The samples are unseeded, so the printed values change from run to run.
group1_data = np.random.normal(loc=5, scale=2, size=50)
group2_data = np.random.normal(loc=6, scale=2, size=50)

f_value, p_value = variance_ratio_test(group1_data, group2_data)

print(f"F-value: {f_value}")
print(f"P-value: {p_value}")


Q2. Given a significance level of 0.05 and the degrees of freedom for the numerator and denominator of an
F-distribution, write a Python function that returns the critical F-value for a two-tailed test.

In [2]:
from scipy.stats import f

def critical_f_value(alpha, df_num, df_denom):
    """
    Calculate the critical F-value for a two-tailed test.

    Parameters:
    - alpha: Significance level (e.g., 0.05 for a 5% significance level).
    - df_num: Degrees of freedom for the numerator.
    - df_denom: Degrees of freedom for the denominator.

    Returns:
    - Upper critical F-value (the upper alpha / 2 tail cut-off)
    """
    alpha_over_2 = alpha / 2

    # Percent point function (inverse of the cumulative distribution function)
    f_critical = f.ppf(1 - alpha_over_2, df_num, df_denom)

    return f_critical

# Example usage:
alpha_value = 0.05
df_numerator = 2
df_denominator = 30

critical_f = critical_f_value(alpha_value, df_numerator, df_denominator)

print(f"Critical F-value: {critical_f}")


Critical F-value: 4.18206059099611
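A two-tailed test has a rejection region in each tail, so it can help to see both cut-offs side by side. The sketch below is only an illustration: the function name `two_tailed_critical_values` is hypothetical, and it assumes the usual equal split of alpha between the two tails.

```python
from scipy.stats import f

def two_tailed_critical_values(alpha, df_num, df_denom):
    """Return (lower, upper) critical F-values, with alpha / 2 in each tail."""
    lower = f.ppf(alpha / 2, df_num, df_denom)
    upper = f.ppf(1 - alpha / 2, df_num, df_denom)
    return lower, upper

# Same inputs as the example above
lower, upper = two_tailed_critical_values(0.05, 2, 30)

print(f"Lower critical F-value: {lower}")
print(f"Upper critical F-value: {upper}")
```

When the larger sample variance is always placed in the numerator, the statistic is at least 1 and only the upper value is needed for the comparison.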


Q3. Write a Python program that generates random samples from two normal distributions with known
variances and uses an F-test to determine if the variances are equal. The program should output the
F-value, degrees of freedom, and p-value for the test.

In [3]:
import numpy as np
from scipy.stats import f

def equal_variances_f_test(sample1, sample2):
    """
    Perform an F-test to determine if the variances of two samples are equal.

    Parameters:
    - sample1, sample2: Arrays representing the two samples.

    Returns:
    - F-value, degrees of freedom numerator, degrees of freedom denominator, p-value
    """
    var1 = np.var(sample1, ddof=1)  # variance of sample 1
    var2 = np.var(sample2, ddof=1)  # variance of sample 2

    # Put the larger variance in the numerator and keep the degrees of
    # freedom in the same order as the variances
    if var1 >= var2:
        f_value = var1 / var2
        df_num, df_denom = len(sample1) - 1, len(sample2) - 1
    else:
        f_value = var2 / var1
        df_num, df_denom = len(sample2) - 1, len(sample1) - 1

    # Two-tailed p-value
    cdf = f.cdf(f_value, df_num, df_denom)
    p_value = 2 * min(cdf, 1 - cdf)

    return f_value, df_num, df_denom, p_value

# Example usage:
np.random.seed(123)  # for reproducibility

# Generate random samples from normal distributions with known variances
sample1 = np.random.normal(loc=5, scale=2, size=30)
sample2 = np.random.normal(loc=5, scale=2, size=30)

# Perform F-test
f_value, df_num, df_denom, p_value = equal_variances_f_test(sample1, sample2)

# Output results
print(f"F-value: {f_value}")
print(f"Degrees of Freedom (Numerator): {df_num}")
print(f"Degrees of Freedom (Denominator): {df_denom}")
print(f"P-value: {p_value}")


F-value: 1.0780353340129458
Degrees of Freedom (Numerator): 29
Degrees of Freedom (Denominator): 29
P-value: 0.8410413689831162
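One way to sanity-check a test like this is to repeat the experiment many times with truly equal variances and count how often it rejects. The sketch below is an illustration under assumptions of my own: the seed, the 2000 repetitions, and the 5% level are arbitrary choices, not part of the exercise. If the test behaves as intended, the share of rejections should land near alpha.

```python
import numpy as np
from scipy.stats import f

rng = np.random.default_rng(0)  # arbitrary seed, for reproducibility only
n_sims, n1, n2, alpha = 2000, 30, 30, 0.05

# Each row is one simulated experiment; both populations have scale=2,
# so the null hypothesis of equal variances is true in every row.
x = rng.normal(loc=5, scale=2, size=(n_sims, n1))
y = rng.normal(loc=5, scale=2, size=(n_sims, n2))

# Variance ratio per experiment, with matching degrees of freedom
ratio = x.var(axis=1, ddof=1) / y.var(axis=1, ddof=1)
cdf = f.cdf(ratio, n1 - 1, n2 - 1)
p_values = 2 * np.minimum(cdf, 1 - cdf)

rejection_rate = np.mean(p_values < alpha)
print(f"Share of false rejections: {rejection_rate:.3f}")
```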


Q4. The variances of two populations are known to be 10 and 15. A sample of 12 observations is taken from
each population. Conduct an F-test at the 5% significance level to determine if the variances are
significantly different.

In [4]:
from scipy.stats import f

# Given information
variance_population1 = 10
variance_population2 = 15
sample_size = 12
alpha = 0.05

# Degrees of freedom
df_num = sample_size - 1  # degrees of freedom numerator
df_denom = sample_size - 1  # degrees of freedom denominator

# F-statistic for a two-tailed test: larger variance in the numerator
# (both samples have the same size, so the degrees of freedom are the same either way)
f_statistic = max(variance_population1, variance_population2) / min(variance_population1, variance_population2)

# P-value for a two-tailed test
p_value = 2 * min(f.cdf(f_statistic, df_num, df_denom), 1 - f.cdf(f_statistic, df_num, df_denom))

# Critical F-value
critical_f = f.ppf(1 - alpha / 2, df_num, df_denom)

# Print results
print(f"F-statistic: {f_statistic}")
print(f"Critical F-value: {critical_f}")
print(f"P-value: {p_value}")

# Check for significance based on p-value and critical F-value
if p_value < alpha:
    print("Reject the null hypothesis. The variances are significantly different.")
else:
    print("Fail to reject the null hypothesis. There is not enough evidence to suggest a significant difference in variances.")


F-statistic: 1.5
Critical F-value: 3.473699051085809
P-value: 0.5123897987357995
Fail to reject the null hypothesis. There is not enough evidence to suggest a significant difference in variances.


Q5. A manufacturer claims that the variance of the diameter of a certain product is 0.005. A sample of 25
products is taken, and the sample variance is found to be 0.006. Conduct an F-test at the 1% significance
level to determine if the claim is justified.

In [5]:
from scipy.stats import chi2

# Given information
claimed_variance = 0.005
sample_size = 25
sample_variance = 0.006
alpha = 0.01

# Degrees of freedom
df_num = sample_size - 1  # degrees of freedom of the sample variance
# The claimed variance is a fixed constant, so the denominator has infinite
# degrees of freedom: F = s^2 / sigma0^2 is distributed as chi2(df_num) / df_num.

# F-statistic for a one-tailed (upper-tail) test
f_statistic = sample_variance / claimed_variance

# P-value for a one-tailed test
p_value = chi2.sf(f_statistic * df_num, df_num)

# Critical F-value
critical_f = chi2.ppf(1 - alpha, df_num) / df_num

# Print results
print(f"F-statistic: {f_statistic}")
print(f"Critical F-value: {critical_f:.2f}")
print(f"P-value: {p_value:.2f}")

# Check for significance based on p-value and critical F-value
if p_value < alpha:
    print("Reject the null hypothesis. The claimed variance is not justified.")
else:
    print("Fail to reject the null hypothesis. The sample does not contradict the claimed variance at the 1% significance level.")


F-statistic: 1.2
Critical F-value: 1.79
P-value: 0.23
Fail to reject the null hypothesis. The sample does not contradict the claimed variance at the 1% significance level.


Q6. Write a Python function that takes in the degrees of freedom for the numerator and denominator of an
F-distribution and calculates the mean and variance of the distribution. The function should return the
mean and variance as a tuple.

In [6]:
def f_distribution_mean_variance(df_num, df_denom):
    """
    Calculate the mean and variance of an F-distribution.

    Parameters:
    - df_num: Degrees of freedom for the numerator.
    - df_denom: Degrees of freedom for the denominator.

    Returns:
    - Mean, Variance (as a tuple)
    """
    if df_denom <= 2:
        raise ValueError("Degrees of freedom for the denominator must be greater than 2.")

    # The mean exists only for df_denom > 2, which the check above guarantees
    mean = df_denom / (df_denom - 2)

    if df_denom <= 4:
        raise ValueError("Degrees of freedom for the denominator must be greater than 4 for variance calculation.")

    variance = (2 * (df_denom**2 * (df_num + df_denom - 2))) / (df_num * (df_denom - 2)**2 * (df_denom - 4))

    return mean, variance

# Example usage:
df_num_example = 3
df_denom_example = 6

mean, variance = f_distribution_mean_variance(df_num_example, df_denom_example)

print(f"Mean: {mean}")
print(f"Variance: {variance}")


Mean: 1.5
Variance: 5.25
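The closed-form values can be cross-checked against SciPy, which exposes the moments of the F-distribution through `f.stats`. The snippet below is a small sketch that repeats the formulas inline for the example's degrees of freedom (3 and 6) so that it runs on its own; the variable names are illustrative.

```python
from scipy.stats import f

df_num, df_denom = 3, 6

# Closed-form mean and variance of F(df_num, df_denom), valid for df_denom > 4
mean_closed = df_denom / (df_denom - 2)
var_closed = (2 * df_denom**2 * (df_num + df_denom - 2)) / (
    df_num * (df_denom - 2) ** 2 * (df_denom - 4)
)

# SciPy's values for the same distribution ("mv" = mean and variance)
mean_scipy, var_scipy = f.stats(df_num, df_denom, moments="mv")

print(f"Closed form: mean={mean_closed}, variance={var_closed}")
print(f"SciPy:       mean={float(mean_scipy)}, variance={float(var_scipy)}")
```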


Q7. A random sample of 10 measurements is taken from a normal population with unknown variance. The
sample variance is found to be 25. Another random sample of 15 measurements is taken from another
normal population with unknown variance, and the sample variance is found to be 20. Conduct an F-test
at the 10% significance level to determine if the variances are significantly different.

In [7]:
from scipy.stats import f

# Given information
sample_size1 = 10
sample_variance1 = 25
sample_size2 = 15
sample_variance2 = 20
alpha = 0.10

# Degrees of freedom
df1 = sample_size1 - 1  # degrees of freedom for sample 1
df2 = sample_size2 - 1  # degrees of freedom for sample 2

# F-statistic for a two-tailed test: larger variance in the numerator,
# with the degrees of freedom following the same order
if sample_variance1 >= sample_variance2:
    f_statistic, df_num, df_denom = sample_variance1 / sample_variance2, df1, df2
else:
    f_statistic, df_num, df_denom = sample_variance2 / sample_variance1, df2, df1

# P-value for a two-tailed test
p_value = 2 * min(f.cdf(f_statistic, df_num, df_denom), 1 - f.cdf(f_statistic, df_num, df_denom))

# Critical F-value
critical_f = f.ppf(1 - alpha / 2, df_num, df_denom)

# Print results
print(f"F-statistic: {f_statistic}")
print(f"Critical F-value: {critical_f:.2f}")
print(f"P-value: {p_value:.2f}")

# Check for significance based on p-value and critical F-value
if p_value < alpha:
    print("Reject the null hypothesis. The variances are significantly different.")
else:
    print("Fail to reject the null hypothesis. There is not enough evidence to suggest a significant difference in variances.")


F-statistic: 1.25
Critical F-value: 2.65
P-value: 0.68
Fail to reject the null hypothesis. There is not enough evidence to suggest a significant difference in variances.


Q8. The following data represent the waiting times in minutes at two different restaurants on a Saturday
night: Restaurant A: 24, 25, 28, 23, 22, 20, 27; Restaurant B: 31, 33, 35, 30, 32, 36. Conduct an F-test at the 5%
significance level to determine if the variances are significantly different.

In [8]:
from scipy.stats import f

# Given data
data_restaurant_A = [24, 25, 28, 23, 22, 20, 27]
data_restaurant_B = [31, 33, 35, 30, 32, 36]

# Sample sizes and variances
sample_size_A = len(data_restaurant_A)
sample_variance_A = sum((x - sum(data_restaurant_A) / sample_size_A) ** 2 for x in data_restaurant_A) / (sample_size_A - 1)

sample_size_B = len(data_restaurant_B)
sample_variance_B = sum((x - sum(data_restaurant_B) / sample_size_B) ** 2 for x in data_restaurant_B) / (sample_size_B - 1)

# Degrees of freedom
df_A = sample_size_A - 1  # degrees of freedom for sample A
df_B = sample_size_B - 1  # degrees of freedom for sample B

# F-statistic for a two-tailed test: larger variance in the numerator,
# with the degrees of freedom following the same order
if sample_variance_A >= sample_variance_B:
    f_statistic, df_num, df_denom = sample_variance_A / sample_variance_B, df_A, df_B
else:
    f_statistic, df_num, df_denom = sample_variance_B / sample_variance_A, df_B, df_A

# P-value for a two-tailed test
p_value = 2 * min(f.cdf(f_statistic, df_num, df_denom), 1 - f.cdf(f_statistic, df_num, df_denom))

# Critical F-value
alpha = 0.05
critical_f = f.ppf(1 - alpha / 2, df_num, df_denom)

# Print results
print(f"F-statistic: {f_statistic}")
print(f"Critical F-value: {critical_f:.2f}")
print(f"P-value: {p_value:.2f}")

# Check for significance based on p-value and critical F-value
if p_value < alpha:
    print("Reject the null hypothesis. The variances are significantly different.")
else:
    print("Fail to reject the null hypothesis. There is not enough evidence to suggest a significant difference in variances.")


F-statistic: 1.4551907719609583
Critical F-value: 6.98
P-value: 0.70
Fail to reject the null hypothesis. There is not enough evidence to suggest a significant difference in variances.


Q9. The following data represent the test scores of two groups of students: Group A: 80, 85, 90, 92, 87, 83;
Group B: 75, 78, 82, 79, 81, 84. Conduct an F-test at the 1% significance level to determine if the variances
are significantly different.

In [9]:
from scipy.stats import f

# Given data
data_group_A = [80, 85, 90, 92, 87, 83]
data_group_B = [75, 78, 82, 79, 81, 84]

# Sample sizes and variances
sample_size_A = len(data_group_A)
sample_variance_A = sum((x - sum(data_group_A) / sample_size_A) ** 2 for x in data_group_A) / (sample_size_A - 1)

sample_size_B = len(data_group_B)
sample_variance_B = sum((x - sum(data_group_B) / sample_size_B) ** 2 for x in data_group_B) / (sample_size_B - 1)

# Degrees of freedom
df_A = sample_size_A - 1  # degrees of freedom for group A
df_B = sample_size_B - 1  # degrees of freedom for group B

# F-statistic for a two-tailed test: larger variance in the numerator,
# with the degrees of freedom following the same order
if sample_variance_A >= sample_variance_B:
    f_statistic, df_num, df_denom = sample_variance_A / sample_variance_B, df_A, df_B
else:
    f_statistic, df_num, df_denom = sample_variance_B / sample_variance_A, df_B, df_A

# P-value for a two-tailed test
p_value = 2 * min(f.cdf(f_statistic, df_num, df_denom), 1 - f.cdf(f_statistic, df_num, df_denom))

# Critical F-value
alpha = 0.01
critical_f = f.ppf(1 - alpha / 2, df_num, df_denom)

# Print results
print(f"F-statistic: {f_statistic}")
print(f"Critical F-value: {critical_f}")
print(f"P-value: {p_value}")

# Check for significance based on p-value and critical F-value
if p_value < alpha:
    print("Reject the null hypothesis. The variances are significantly different.")
else:
    print("Fail to reject the null hypothesis. There is not enough evidence to suggest a significant difference in variances.")


F-statistic: 1.9442622950819677
Critical F-value: 14.939605459912224
P-value: 0.4831043549070688
Fail to reject the null hypothesis. There is not enough evidence to suggest a significant difference in variances.
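The last few answers repeat the same steps on raw data, so they could be folded into one helper. The following is a minimal sketch, not a definitive implementation: the name `two_sample_f_test` is hypothetical, and it assumes the convention used above of placing the larger sample variance in the numerator, with the degrees of freedom following the same order.

```python
import numpy as np
from scipy.stats import f

def two_sample_f_test(sample_a, sample_b):
    """Two-tailed F-test for equal variances.

    Returns (F-value, df numerator, df denominator, p-value).
    """
    a = np.asarray(sample_a, dtype=float)
    b = np.asarray(sample_b, dtype=float)
    var_a, var_b = a.var(ddof=1), b.var(ddof=1)

    # Larger variance on top; the degrees of freedom follow the same order
    if var_a >= var_b:
        f_stat, df_num, df_denom = var_a / var_b, a.size - 1, b.size - 1
    else:
        f_stat, df_num, df_denom = var_b / var_a, b.size - 1, a.size - 1

    # With F >= 1, doubling the upper tail gives the two-tailed p-value
    p_value = min(1.0, 2 * f.sf(f_stat, df_num, df_denom))
    return f_stat, df_num, df_denom, p_value

# Test scores from Q9
group_a = [80, 85, 90, 92, 87, 83]
group_b = [75, 78, 82, 79, 81, 84]

f_stat, df_num, df_denom, p_value = two_sample_f_test(group_a, group_b)
print(f"F-value: {f_stat}")
print(f"Degrees of freedom: ({df_num}, {df_denom})")
print(f"P-value: {p_value}")
```

Keeping the degrees of freedom tied to the variances matters only when the two samples differ in size, which is where swapping them silently changes the p-value and the critical value.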
