Q1. Write a Python function that takes in two arrays of data and calculates the F-value for a variance ratio
test. The function should return the F-value and the corresponding p-value for the test.

In [1]:
import numpy as np
import scipy.stats as stats

def calculate_f_value_and_p_value(data1, data2):
    """
    Calculate the F-value and corresponding p-value for a two-sample variance ratio test.

    Parameters:
    data1 (array-like): Data for group 1
    data2 (array-like): Data for group 2

    Returns:
    float: F-value
    float: p-value
    """
    # Calculate the variances
    var1 = np.var(data1, ddof=1)  # Variance of group 1
    var2 = np.var(data2, ddof=1)  # Variance of group 2
    
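    # Calculate the F-value with the larger variance in the numerator,
    # keeping the degrees of freedom paired with the matching sample
    if var1 >= var2:
        F_value = var1 / var2
        df1, df2 = len(data1) - 1, len(data2) - 1
    else:
        F_value = var2 / var1
        df1, df2 = len(data2) - 1, len(data1) - 1

    # Calculate the one-tailed (upper-tail) p-value
    p_value = stats.f.sf(F_value, df1, df2)  # Survival function (1 - cdf)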

    return F_value, p_value

# Example usage
data_group1 = [15, 20, 25, 30, 35]
data_group2 = [18, 22, 30, 28, 32]
F_value, p_value = calculate_f_value_and_p_value(data_group1, data_group2)
print("F-value:", F_value)
print("p-value:", p_value)


F-value: 1.838235294117647
p-value: 0.2849375098848153
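
The p-value above covers the upper tail only. If the alternative is simply that the variances differ, one common convention is to double the upper-tail probability once the larger variance is in the numerator. The following is a minimal sketch under that assumption; `two_sided_f_test` is a hypothetical helper name introduced here for illustration.

```python
import numpy as np
import scipy.stats as stats

def two_sided_f_test(data1, data2):
    """Variance ratio test with a two-sided p-value (hypothetical helper)."""
    var1 = np.var(data1, ddof=1)
    var2 = np.var(data2, ddof=1)

    # Put the larger variance in the numerator and keep its degrees of freedom with it
    if var1 >= var2:
        f_value, dfn, dfd = var1 / var2, len(data1) - 1, len(data2) - 1
    else:
        f_value, dfn, dfd = var2 / var1, len(data2) - 1, len(data1) - 1

    # Double the upper-tail area and cap the result at 1
    p_two_sided = min(1.0, 2 * stats.f.sf(f_value, dfn, dfd))
    return f_value, p_two_sided

f_value, p_two_sided = two_sided_f_test([15, 20, 25, 30, 35], [18, 22, 30, 28, 32])
print("F-value:", f_value)
print("Two-sided p-value:", p_two_sided)
```

For the example data both samples have 4 degrees of freedom, so the two-sided value is exactly twice the one-tailed p-value printed above.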


Q2. Given a significance level of 0.05 and the degrees of freedom for the numerator and denominator of an
F-distribution, write a Python function that returns the critical F-value for a two-tailed test.

In [2]:
import scipy.stats as stats

def get_critical_f_value(alpha, df1, df2):
    """
    Calculate the upper critical F-value for a two-tailed test.

    Parameters:
    alpha (float): Significance level (e.g., 0.05 for a 95% confidence level).
    df1 (int): Degrees of freedom for the numerator.
    df2 (int): Degrees of freedom for the denominator.

    Returns:
    float: Upper critical F-value, leaving alpha / 2 in the upper tail.
    """
    # Use the percent point function (inverse CDF) at 1 - alpha / 2
    critical_f_value = stats.f.ppf(1 - alpha / 2, df1, df2)
    return critical_f_value

# Example usage
alpha = 0.05
df1 = 3  # degrees of freedom for the numerator
df2 = 20  # degrees of freedom for the denominator
critical_f_value = get_critical_f_value(alpha, df1, df2)
print("Critical F-value:", critical_f_value)


Critical F-value: 3.8586986662732143
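
A two-tailed F-test also has a lower critical value, which matters when the F-value is not arranged to be at least 1. The sketch below computes both bounds; `get_two_tailed_critical_values` is a hypothetical name used only for illustration. It also checks the reciprocal identity: the lower critical value for (df1, df2) equals one over the upper critical value for (df2, df1).

```python
import scipy.stats as stats

def get_two_tailed_critical_values(alpha, df1, df2):
    """Return (lower, upper) critical F-values for a two-tailed test (hypothetical helper)."""
    lower = stats.f.ppf(alpha / 2, df1, df2)
    upper = stats.f.ppf(1 - alpha / 2, df1, df2)
    return lower, upper

lower, upper = get_two_tailed_critical_values(0.05, 3, 20)
print("Lower critical F-value:", lower)
print("Upper critical F-value:", upper)

# Reciprocal identity: swapping the degrees of freedom inverts the tail
reciprocal = 1 / stats.f.ppf(1 - 0.05 / 2, 20, 3)
print("1 / upper critical value with swapped df:", reciprocal)
```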


Q3. Write a Python program that generates random samples from two normal distributions with known
variances and uses an F-test to determine if the variances are equal. The program should output the
F-value, degrees of freedom, and p-value for the test.

In [3]:
import numpy as np
import scipy.stats as stats

def f_test(sample1, sample2):
    """
    Perform an F-test to determine if the variances of two samples are equal.

    Parameters:
    sample1 (numpy array): First sample data.
    sample2 (numpy array): Second sample data.

    Returns:
    float: F-value
    tuple: Degrees of freedom (df1, df2)
    float: p-value
    """
    # Calculate variances
    var1 = np.var(sample1, ddof=1)  # variance of sample 1
    var2 = np.var(sample2, ddof=1)  # variance of sample 2

    # Calculate F-value
    f_value = var1 / var2 if var1 >= var2 else var2 / var1

    # Calculate degrees of freedom
    df1 = len(sample1) - 1  # degrees of freedom for sample 1
    df2 = len(sample2) - 1  # degrees of freedom for sample 2

    # Calculate p-value
    p_value = 2 * min(stats.f.cdf(f_value, df1, df2), 1 - stats.f.cdf(f_value, df1, df2))

    return f_value, (df1, df2), p_value

# Generate random samples from two normal distributions
np.random.seed(0)  # for reproducibility
sample1 = np.random.normal(loc=0, scale=5, size=30)  # sample from first normal distribution
sample2 = np.random.normal(loc=0, scale=7, size=25)  # sample from second normal distribution

# Perform the F-test
f_value, degrees_of_freedom, p_value = f_test(sample1, sample2)

# Print the results
print("F-value:", f_value)
print("Degrees of freedom:", degrees_of_freedom)
print("p-value:", p_value)


F-value: 1.5554662872695164
Degrees of freedom: (29, 24)
p-value: 0.2726436400056713
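
When the larger variance goes in the numerator, the numerator degrees of freedom should come from that same sample. The function above keeps the degrees of freedom in sample order, which matters whenever the second sample has the larger variance and the sample sizes differ. Below is a hedged sketch of a variant that keeps the pairing; `f_test_ordered` is a hypothetical name, and the data reuse the seed and parameters from the example.

```python
import numpy as np
import scipy.stats as stats

def f_test_ordered(sample1, sample2):
    """F-test whose degrees of freedom follow the numerator and denominator (hypothetical helper)."""
    var1 = np.var(sample1, ddof=1)
    var2 = np.var(sample2, ddof=1)
    n1, n2 = len(sample1), len(sample2)

    # The sample with the larger variance supplies the numerator and its df
    if var1 >= var2:
        f_value, dfs = var1 / var2, (n1 - 1, n2 - 1)
    else:
        f_value, dfs = var2 / var1, (n2 - 1, n1 - 1)

    # Two-tailed p-value by doubling the upper tail, capped at 1
    p_value = min(1.0, 2 * stats.f.sf(f_value, *dfs))
    return f_value, dfs, p_value

np.random.seed(0)  # same seed as the example above
sample1 = np.random.normal(loc=0, scale=5, size=30)
sample2 = np.random.normal(loc=0, scale=7, size=25)

f_value, dfs, p_value = f_test_ordered(sample1, sample2)
print("F-value:", f_value)
print("Degrees of freedom (numerator, denominator):", dfs)
print("p-value:", p_value)
```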


Q4. The variances of two populations are known to be 10 and 15. A sample of 12 observations is taken from
each population. Conduct an F-test at the 5% significance level to determine if the variances are
significantly different.

In [4]:
import scipy.stats as stats

# Given variances and sample sizes
variance_population1 = 10
variance_population2 = 15
sample_size_population1 = 12
sample_size_population2 = 12

# Calculate the F-value with the larger variance in the numerator, so the
# statistic can be compared against the upper critical value
F_value = max(variance_population1, variance_population2) / min(variance_population1, variance_population2)

# Degrees of freedom
df1 = sample_size_population1 - 1
df2 = sample_size_population2 - 1

# Significance level (alpha)
alpha = 0.05

# Calculate critical F-value
critical_F_value = stats.f.ppf(1 - alpha / 2, df1, df2)

# Compare with critical F-value
if F_value > critical_F_value:
    print("Reject null hypothesis. Variances are significantly different.")
else:
    print("Fail to reject null hypothesis. Variances are not significantly different.")

print("Calculated F-value:", F_value)
print("Critical F-value:", critical_F_value)
print("Degrees of freedom (numerator, denominator):", df1, df2)


Fail to reject null hypothesis. Variances are not significantly different.
Calculated F-value: 1.5
Critical F-value: 3.473699051085809
Degrees of freedom (numerator, denominator): 11 11
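
The same decision can be reached with a p-value instead of a critical value. This is a minimal sketch, assuming the two given variances are treated as sample variances from samples of 12 and that the larger one is placed in the numerator.

```python
import scipy.stats as stats

variance1, variance2 = 10, 15
n1, n2 = 12, 12
alpha = 0.05

# variance2 is the larger one, so it supplies the numerator and its df
f_value = max(variance1, variance2) / min(variance1, variance2)
dfn, dfd = n2 - 1, n1 - 1

# Two-tailed p-value by doubling the upper tail, capped at 1
p_value = min(1.0, 2 * stats.f.sf(f_value, dfn, dfd))

print("F-value:", f_value)
print("Two-tailed p-value:", p_value)
print("Reject null hypothesis" if p_value < alpha else "Fail to reject null hypothesis")
```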


Q5. A manufacturer claims that the variance of the diameter of a certain product is 0.005. A sample of 25
products is taken, and the sample variance is found to be 0.006. Conduct an F-test at the 1% significance
level to determine if the claim is justified.

In [5]:
import scipy.stats as stats

# Given data
claimed_variance = 0.005
sample_variance = 0.006
sample_size = 25

# Calculate the F-value
F_value = sample_variance / claimed_variance

# Degrees of freedom
# The claimed variance is a fixed constant rather than an estimate, so the
# denominator has effectively infinite degrees of freedom. In that limit,
# df1 * F follows a chi-square distribution with df1 degrees of freedom.
df1 = sample_size - 1

# Significance level (alpha)
alpha = 0.01

# Calculate the two-tailed critical F-values from chi-square quantiles
lower_critical_F = stats.chi2.ppf(alpha / 2, df1) / df1
upper_critical_F = stats.chi2.ppf(1 - alpha / 2, df1) / df1

# Compare with the critical F-values
if F_value < lower_critical_F or F_value > upper_critical_F:
    print("Reject null hypothesis. Sample variance is significantly different from the claimed population variance.")
else:
    print("Fail to reject null hypothesis. Sample variance is not significantly different from the claimed population variance.")

print("Calculated F-value:", F_value)
print("Critical F-values (lower, upper):", round(lower_critical_F, 3), round(upper_critical_F, 3))
print("Degrees of freedom (numerator, denominator):", df1, "inf")


Fail to reject null hypothesis. Sample variance is not significantly different from the claimed population variance.
Calculated F-value: 1.2
Critical F-values (lower, upper): 0.412 1.898
Degrees of freedom (numerator, denominator): 24 inf
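
A claimed population variance is a fixed number, not an estimate from a second sample, so this comparison is usually carried out as a chi-square test on (n - 1) * s^2 / sigma0^2 with n - 1 degrees of freedom. The sketch below shows that form as a cross-check. The variable names are illustrative.

```python
import scipy.stats as stats

claimed_variance = 0.005
sample_variance = 0.006
n = 25
alpha = 0.01

# Chi-square statistic for a single-sample variance test
df = n - 1
chi2_stat = df * sample_variance / claimed_variance

# Two-tailed critical values
lower = stats.chi2.ppf(alpha / 2, df)
upper = stats.chi2.ppf(1 - alpha / 2, df)

# Two-tailed p-value: twice the smaller tail area
p_value = 2 * min(stats.chi2.cdf(chi2_stat, df), stats.chi2.sf(chi2_stat, df))

print("Chi-square statistic:", chi2_stat)
print("Critical values (lower, upper):", lower, upper)
print("p-value:", p_value)
print("Reject null hypothesis" if p_value < alpha else "Fail to reject null hypothesis")
```

The statistic falls between the two critical values, so the claim is not rejected at the 1% level.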


Q6. Write a Python function that takes in the degrees of freedom for the numerator and denominator of an
F-distribution and calculates the mean and variance of the distribution. The function should return the
mean and variance as a tuple.

In [6]:
def calculate_f_distribution_mean_and_variance(df1, df2):
    """
    Calculate the mean and variance of an F-distribution given degrees of freedom.

    Parameters:
    df1 (int): Degrees of freedom for the numerator.
    df2 (int): Degrees of freedom for the denominator.

    Returns:
    tuple: Mean and variance of the F-distribution (mean, variance).
    """
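    # The mean is defined only for df2 > 2 and the variance only for df2 > 4
    if df2 <= 4:
        raise ValueError("df2 must be greater than 4 for the variance to be defined")

    # Calculate mean
    mean = df2 / (df2 - 2)

    # Calculate variance
    variance = (2 * df2 ** 2 * (df1 + df2 - 2)) / (df1 * (df2 - 2) ** 2 * (df2 - 4))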

    return mean, variance

# Example usage
df1 = 5  # degrees of freedom for the numerator
df2 = 10  # degrees of freedom for the denominator
mean, variance = calculate_f_distribution_mean_and_variance(df1, df2)
print("Mean of F-distribution:", mean)
print("Variance of F-distribution:", variance)


Mean of F-distribution: 1.25
Variance of F-distribution: 1.3541666666666667
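
The closed-form expressions can be cross-checked against SciPy, whose `stats.f.stats` method returns the moments of the F-distribution directly. This is a small sketch using the same degrees of freedom as the example.

```python
import scipy.stats as stats

df1, df2 = 5, 10

# moments="mv" requests the mean and the variance
mean, variance = stats.f.stats(df1, df2, moments="mv")

print("Mean from scipy:", float(mean))
print("Variance from scipy:", float(variance))
```

Both values should agree with the formula-based results above up to floating-point rounding.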


Q7. A random sample of 10 measurements is taken from a normal population with unknown variance. The
sample variance is found to be 25. Another random sample of 15 measurements is taken from another
normal population with unknown variance, and the sample variance is found to be 20. Conduct an F-test
at the 10% significance level to determine if the variances are significantly different.

In [7]:
import scipy.stats as stats

# Given sample variances and sample sizes
sample_variance1 = 25
sample_variance2 = 20
sample_size1 = 10
sample_size2 = 15

# Determine larger and smaller sample variances
larger_variance = max(sample_variance1, sample_variance2)
smaller_variance = min(sample_variance1, sample_variance2)

# Calculate the F-value
F_value = larger_variance / smaller_variance

# Degrees of freedom
df1 = sample_size1 - 1
df2 = sample_size2 - 1

# Significance level (alpha)
alpha = 0.10

# Calculate critical F-value
critical_F_value = stats.f.ppf(1 - alpha / 2, df1, df2)

# Compare with critical F-value
if F_value > critical_F_value:
    print("Reject null hypothesis. Variances are significantly different.")
else:
    print("Fail to reject null hypothesis. Variances are not significantly different.")

print("Calculated F-value:", F_value)
print("Critical F-value:", critical_F_value)
print("Degrees of freedom (numerator, denominator):", df1, df2)


Fail to reject null hypothesis. Variances are not significantly different.
Calculated F-value: 1.25
Critical F-value: 2.6457907352338195
Degrees of freedom (numerator, denominator): 9 14
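
An equivalent way to read this result is through a confidence interval for the ratio of the two population variances: the test fails to reject at level alpha exactly when the (1 - alpha) interval contains 1. The following is a minimal sketch under the same normality assumption. The variable names are illustrative.

```python
import scipy.stats as stats

s1_sq, n1 = 25, 10
s2_sq, n2 = 20, 15
alpha = 0.10

ratio = s1_sq / s2_sq
df1, df2 = n1 - 1, n2 - 1

# Interval for sigma1^2 / sigma2^2: divide the observed ratio by the F quantiles
ci_low = ratio / stats.f.ppf(1 - alpha / 2, df1, df2)
ci_high = ratio / stats.f.ppf(alpha / 2, df1, df2)

print("Observed variance ratio:", ratio)
print("90% confidence interval:", (ci_low, ci_high))
print("Interval contains 1:", ci_low < 1 < ci_high)
```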


Q8. The following data represent the waiting times in minutes at two different restaurants on a Saturday
night: Restaurant A: 24, 25, 28, 23, 22, 20, 27; Restaurant B: 31, 33, 35, 30, 32, 36. Conduct an F-test at the 5%
significance level to determine if the variances are significantly different.

In [8]:
import numpy as np
import scipy.stats as stats

# Waiting times in minutes for each restaurant
waiting_times_restaurant_a = np.array([24, 25, 28, 23, 22, 20, 27])
waiting_times_restaurant_b = np.array([31, 33, 35, 30, 32, 36])

# Calculate sample variances for each restaurant
sample_variance_a = np.var(waiting_times_restaurant_a, ddof=1)
sample_variance_b = np.var(waiting_times_restaurant_b, ddof=1)

# Determine larger and smaller sample variances
larger_variance = max(sample_variance_a, sample_variance_b)
smaller_variance = min(sample_variance_a, sample_variance_b)

# Calculate the F-value
F_value = larger_variance / smaller_variance

# Degrees of freedom
df1 = len(waiting_times_restaurant_a) - 1
df2 = len(waiting_times_restaurant_b) - 1

# Significance level (alpha)
alpha = 0.05

# Calculate critical F-value
critical_F_value = stats.f.ppf(1 - alpha / 2, df1, df2)

# Compare with critical F-value
if F_value > critical_F_value:
    print("Reject null hypothesis. Variances are significantly different.")
else:
    print("Fail to reject null hypothesis. Variances are not significantly different.")

print("Calculated F-value:", F_value)
print("Critical F-value:", critical_F_value)
print("Degrees of freedom (numerator, denominator):", df1, df2)


Fail to reject null hypothesis. Variances are not significantly different.
Calculated F-value: 1.4551907719609583
Critical F-value: 6.977701858535566
Degrees of freedom (numerator, denominator): 6 5


Q9. The following data represent the test scores of two groups of students: Group A: 80, 85, 90, 92, 87, 83;
Group B: 75, 78, 82, 79, 81, 84. Conduct an F-test at the 1% significance level to determine if the variances
are significantly different.

In [9]:
import numpy as np
import scipy.stats as stats

# Test scores for each group
test_scores_group_a = np.array([80, 85, 90, 92, 87, 83])
test_scores_group_b = np.array([75, 78, 82, 79, 81, 84])

# Calculate sample variances for each group
sample_variance_a = np.var(test_scores_group_a, ddof=1)
sample_variance_b = np.var(test_scores_group_b, ddof=1)

# Determine larger and smaller sample variances
larger_variance = max(sample_variance_a, sample_variance_b)
smaller_variance = min(sample_variance_a, sample_variance_b)

# Calculate the F-value
F_value = larger_variance / smaller_variance

# Degrees of freedom
df1 = len(test_scores_group_a) - 1
df2 = len(test_scores_group_b) - 1

# Significance level (alpha)
alpha = 0.01

# Calculate critical F-value
critical_F_value = stats.f.ppf(1 - alpha / 2, df1, df2)

# Compare with critical F-value
if F_value > critical_F_value:
    print("Reject null hypothesis. Variances are significantly different.")
else:
    print("Fail to reject null hypothesis. Variances are not significantly different.")

print("Calculated F-value:", F_value)
print("Critical F-value:", critical_F_value)
print("Degrees of freedom (numerator, denominator):", df1, df2)


Fail to reject null hypothesis. Variances are not significantly different.
Calculated F-value: 1.9442622950819677
Critical F-value: 14.939605459912224
Degrees of freedom (numerator, denominator): 5 5
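
The last two exercises repeat the same steps, so they can be folded into one reusable function. The sketch below is one possible arrangement; `variance_ratio_test` is a hypothetical name, and it keeps the degrees of freedom paired with whichever sample supplies the numerator.

```python
import numpy as np
import scipy.stats as stats

def variance_ratio_test(a, b, alpha):
    """Two-tailed F-test on raw data (hypothetical helper).

    Returns the F-value, the upper critical value, the (numerator,
    denominator) degrees of freedom, and whether to reject.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    var_a = np.var(a, ddof=1)
    var_b = np.var(b, ddof=1)

    # Larger variance in the numerator, with its degrees of freedom
    if var_a >= var_b:
        f_value, dfn, dfd = var_a / var_b, len(a) - 1, len(b) - 1
    else:
        f_value, dfn, dfd = var_b / var_a, len(b) - 1, len(a) - 1

    critical = stats.f.ppf(1 - alpha / 2, dfn, dfd)
    return f_value, critical, (dfn, dfd), bool(f_value > critical)

restaurants = variance_ratio_test([24, 25, 28, 23, 22, 20, 27], [31, 33, 35, 30, 32, 36], alpha=0.05)
scores = variance_ratio_test([80, 85, 90, 92, 87, 83], [75, 78, 82, 79, 81, 84], alpha=0.01)
print("Restaurants:", restaurants)
print("Test scores:", scores)
```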
