Q1. Write a Python function that takes in two arrays of data and calculates the F-value for a variance ratio
test. The function should return the F-value and the corresponding p-value for the test.

In [1]:
import numpy as np
from scipy import stats

def calculate_f_value_and_p_value(data1, data2):
    # Calculate the variances of the two datasets
    var1 = np.var(data1, ddof=1)
    var2 = np.var(data2, ddof=1)
    
    # Calculate the F-value
    f_value = var1 / var2
    
    # Degrees of freedom
    df1 = len(data1) - 1
    df2 = len(data2) - 1
    
    # One-sided (upper-tail) p-value: P(F >= f_value) under equal variances
    p_value = stats.f.sf(f_value, df1, df2)
    
    return f_value, p_value

# Example usage
data1 = [23, 21, 18, 24, 20]
data2 = [30, 32, 28, 29, 31]
f_value, p_value = calculate_f_value_and_p_value(data1, data2)
print(f"F-value: {f_value:.3f}, p-value: {p_value:.3f}")


F-value: 2.280, p-value: 0.222
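
The question does not say which tail the p-value should cover, and the function above reports the upper tail only. As a sketch, a two-sided version could double the smaller tail probability. The name `two_sided_f_test` is hypothetical and not part of the original answer.

```python
import numpy as np
from scipy import stats

def two_sided_f_test(data1, data2):
    """Hypothetical helper: F-value plus a two-sided p-value."""
    f_value = np.var(data1, ddof=1) / np.var(data2, ddof=1)
    df1, df2 = len(data1) - 1, len(data2) - 1
    # Double the smaller of the two tail areas, capped at 1
    lower_tail = stats.f.cdf(f_value, df1, df2)
    upper_tail = stats.f.sf(f_value, df1, df2)
    p_value = min(1.0, 2 * min(lower_tail, upper_tail))
    return f_value, p_value

f_value, p_value = two_sided_f_test([23, 21, 18, 24, 20], [30, 32, 28, 29, 31])
print(f"F-value: {f_value:.3f}, two-sided p-value: {p_value:.3f}")
```

The F-value is unchanged; only the p-value differs, since it now counts extreme ratios in either direction.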


Q2. Given a significance level of 0.05 and the degrees of freedom for the numerator and denominator of an
F-distribution, write a Python function that returns the critical F-value for a two-tailed test.

In [2]:
from scipy import stats

def critical_f_value(alpha, df1, df2):
    """
    Compute the critical F-value for a two-tailed test given a significance level
    and degrees of freedom for the numerator and denominator.

    Parameters:
    alpha (float): Significance level for the test (e.g., 0.05).
    df1 (int): Degrees of freedom for the numerator.
    df2 (int): Degrees of freedom for the denominator.

    Returns:
    float: Critical F-value for the given significance level.
    """
    # For a two-tailed test, the critical value corresponds to the upper tail
    # with alpha/2 probability. Hence, the critical value for the two-tailed test is
    # the (1 - alpha/2) quantile of the F-distribution.
    critical_value = stats.f.ppf(1 - alpha / 2, df1, df2)
    return critical_value

# Example usage
alpha = 0.05
df1 = 5  # Degrees of freedom for the numerator
df2 = 10 # Degrees of freedom for the denominator
critical_value = critical_f_value(alpha, df1, df2)
print(f"Critical F-value: {critical_value:.3f}")


Critical F-value: 4.236
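
A two-tailed rejection region also has a lower bound, which the function above does not return. The sketch below computes both bounds and checks the reciprocal relationship between them. The helper name `two_tailed_critical_values` is an assumption made for illustration.

```python
from scipy import stats

def two_tailed_critical_values(alpha, df1, df2):
    """Hypothetical helper: (lower, upper) critical F-values."""
    lower = stats.f.ppf(alpha / 2, df1, df2)
    upper = stats.f.ppf(1 - alpha / 2, df1, df2)
    return lower, upper

lower, upper = two_tailed_critical_values(0.05, 5, 10)
# If X ~ F(df1, df2) then 1/X ~ F(df2, df1), so the lower bound equals
# the reciprocal of the upper bound computed with the df swapped
reciprocal = 1 / stats.f.ppf(1 - 0.05 / 2, 10, 5)
print(f"Lower: {lower:.3f}, Upper: {upper:.3f}, reciprocal check: {reciprocal:.3f}")
```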


Q3. Write a Python program that generates random samples from two normal distributions with known
variances and uses an F-test to determine if the variances are equal. The program should output the
F-value, degrees of freedom, and p-value for the test.

In [3]:
import numpy as np
from scipy import stats

def generate_samples(mean1, var1, mean2, var2, size1, size2):
    """
    Generate random samples from two normal distributions with given parameters.

    Parameters:
    mean1 (float): Mean of the first normal distribution.
    var1 (float): Variance of the first normal distribution.
    mean2 (float): Mean of the second normal distribution.
    var2 (float): Variance of the second normal distribution.
    size1 (int): Number of samples to generate for the first distribution.
    size2 (int): Number of samples to generate for the second distribution.

    Returns:
    tuple: Two arrays of generated samples.
    """
    samples1 = np.random.normal(loc=mean1, scale=np.sqrt(var1), size=size1)
    samples2 = np.random.normal(loc=mean2, scale=np.sqrt(var2), size=size2)
    return samples1, samples2

def perform_f_test(samples1, samples2):
    """
    Perform an F-test to compare the variances of two samples.

    Parameters:
    samples1 (array-like): The first sample of data.
    samples2 (array-like): The second sample of data.

    Returns:
    tuple: F-value, degrees of freedom (df1, df2), and p-value of the F-test.
    """
    # Calculate variances of the two samples
    var1 = np.var(samples1, ddof=1)
    var2 = np.var(samples2, ddof=1)
    
    # Calculate the F-value
    f_value = var1 / var2
    
    # Degrees of freedom
    df1 = len(samples1) - 1
    df2 = len(samples2) - 1
    
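    # Upper-tail p-value: P(F >= f_value). A value near 1 means the lower tail
    # is small, i.e. the first sample variance is much smaller than the second.
    p_value = stats.f.sf(f_value, df1, df2)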
    
    return f_value, df1, df2, p_value

# Parameters for generating random samples
mean1 = 0
var1 = 1
mean2 = 0
var2 = 2
size1 = 30
size2 = 30

# Generate samples
samples1, samples2 = generate_samples(mean1, var1, mean2, var2, size1, size2)

# Perform F-test
f_value, df1, df2, p_value = perform_f_test(samples1, samples2)

# Output results
print(f"F-value: {f_value:.3f}")
print(f"Degrees of freedom (df1, df2): ({df1}, {df2})")
print(f"P-value: {p_value:.3f}")


F-value: 0.401
Degrees of freedom (df1, df2): (29, 29)
P-value: 0.992
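
Because `np.random.normal` is called without a seed, the numbers above change on every run. A minimal sketch of a reproducible variant follows. It assumes NumPy's `default_rng` generator and an arbitrary seed of 42, neither of which appears in the original.

```python
import numpy as np

def generate_seeded_samples(mean1, var1, mean2, var2, size1, size2, seed=42):
    """Hypothetical variant of generate_samples that accepts a seed."""
    rng = np.random.default_rng(seed)
    samples1 = rng.normal(loc=mean1, scale=np.sqrt(var1), size=size1)
    samples2 = rng.normal(loc=mean2, scale=np.sqrt(var2), size=size2)
    return samples1, samples2

# Two calls with the same seed return identical samples
a1, a2 = generate_seeded_samples(0, 1, 0, 2, 30, 30)
b1, b2 = generate_seeded_samples(0, 1, 0, 2, 30, 30)
print(np.array_equal(a1, b1) and np.array_equal(a2, b2))  # True
```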


Q4. The variances of two populations are known to be 10 and 15. A sample of 12 observations is taken from
each population. Conduct an F-test at the 5% significance level to determine if the variances are
significantly different.

In [4]:
from scipy import stats

def perform_f_test_known_variances(var1, var2, size1, size2, alpha):
    """
    Perform an F-test to compare the variances of two populations with known variances.

    Parameters:
    var1 (float): Variance of the first population.
    var2 (float): Variance of the second population.
    size1 (int): Sample size from the first population.
    size2 (int): Sample size from the second population.
    alpha (float): Significance level for the test.

    Returns:
    tuple: F-value, critical F-value, and p-value of the F-test.
    """
    # Calculate the F-value
    f_value = var1 / var2
    
    # Degrees of freedom
    df1 = size1 - 1
    df2 = size2 - 1
    
    # Upper-tail critical F-value at level alpha (one-sided rejection region)
    critical_value = stats.f.ppf(1 - alpha, df1, df2)
    
    # Upper-tail p-value: P(F >= f_value)
    p_value = stats.f.sf(f_value, df1, df2)
    
    return f_value, critical_value, p_value

# Known variances and sample sizes
var1 = 10
var2 = 15
size1 = 12
size2 = 12
alpha = 0.05

# Perform the F-test
f_value, critical_value, p_value = perform_f_test_known_variances(var1, var2, size1, size2, alpha)

# Output results
print(f"F-value: {f_value:.3f}")
print(f"Critical F-value: {critical_value:.3f}")
print(f"P-value: {p_value:.3f}")

# Decision
if f_value > critical_value:
    print("Reject the null hypothesis: The variances are significantly different.")
else:
    print("Fail to reject the null hypothesis: The variances are not significantly different.")


F-value: 0.667
Critical F-value: 2.818
P-value: 0.744
Fail to reject the null hypothesis: The variances are not significantly different.
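
The question asks whether the variances are different, which is a two-tailed hypothesis, while the code above compares against a single upper-tail critical value. The sketch below shows one common two-tailed convention: place the larger variance in the numerator and use the alpha/2 upper critical value. Like the code above, it treats 10 and 15 as the values entering the F-ratio.

```python
from scipy import stats

var1, var2 = 10, 15
n1, n2 = 12, 12
alpha = 0.05

# Put the larger variance on top so that F >= 1, and keep its df as the numerator df
if var1 >= var2:
    f_value, df_num, df_den = var1 / var2, n1 - 1, n2 - 1
else:
    f_value, df_num, df_den = var2 / var1, n2 - 1, n1 - 1

# Two-tailed test: only alpha/2 sits in the upper tail
critical_value = stats.f.ppf(1 - alpha / 2, df_num, df_den)
reject = bool(f_value > critical_value)
print(f"F-value: {f_value:.3f}, critical value: {critical_value:.3f}, reject: {reject}")
```

With F = 1.5 the statistic stays below the critical value, so the conclusion matches the one printed above.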


Q5. A manufacturer claims that the variance of the diameter of a certain product is 0.005. A sample of 25
products is taken, and the sample variance is found to be 0.006. Conduct an F-test at the 1% significance
level to determine if the claim is justified.

In [5]:
from scipy import stats

def perform_f_test_claimed_variance(sample_var, claimed_var, sample_size, alpha):
    """
    Perform an F-test to compare a sample variance with a claimed variance.

    Parameters:
    sample_var (float): Sample variance.
    claimed_var (float): Claimed population variance.
    sample_size (int): Sample size.
    alpha (float): Significance level for the test.

    Returns:
    tuple: F-value, critical F-values (lower and upper), and p-value of the F-test.
    """
    # Calculate the F-value
    f_value = sample_var / claimed_var
    
    # Degrees of freedom
    df1 = sample_size - 1
    # Assumption in this answer: the denominator reuses df1. A claimed (fixed) variance
    # has no sampling degrees of freedom of its own, so this is a simplification.
    df2 = df1
    
    # Critical F-values for the given alpha level (two-tailed test)
    critical_value_upper = stats.f.ppf(1 - alpha / 2, df1, df2)
    critical_value_lower = stats.f.ppf(alpha / 2, df1, df2)
    
    # Calculate the p-value for the F-value
    p_value = 2 * min(stats.f.cdf(f_value, df1, df2), 1 - stats.f.cdf(f_value, df1, df2))
    
    return f_value, critical_value_lower, critical_value_upper, p_value

# Parameters
claimed_var = 0.005
sample_var = 0.006
sample_size = 25
alpha = 0.01

# Perform the F-test
f_value, critical_value_lower, critical_value_upper, p_value = perform_f_test_claimed_variance(sample_var, claimed_var, sample_size, alpha)

# Output results
print(f"F-value: {f_value:.3f}")
print(f"Critical F-values: Lower = {critical_value_lower:.3f}, Upper = {critical_value_upper:.3f}")
print(f"P-value: {p_value:.3f}")

# Decision
if f_value < critical_value_lower or f_value > critical_value_upper:
    print("Reject the null hypothesis: The variance is significantly different from the claimed variance.")
else:
    print("Fail to reject the null hypothesis: The variance is not significantly different from the claimed variance.")


F-value: 1.200
Critical F-values: Lower = 0.337, Upper = 2.967
P-value: 0.659
Fail to reject the null hypothesis: The variance is not significantly different from the claimed variance.
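
A single sample variance is usually compared against a claimed population variance with a chi-square test, because the claimed value is a constant rather than an estimate with its own degrees of freedom. A minimal sketch of that alternative follows, with the hypothetical name `chi_square_variance_test`.

```python
from scipy import stats

def chi_square_variance_test(sample_var, claimed_var, sample_size, alpha):
    """Hypothetical helper: two-tailed chi-square test for a single variance."""
    df = sample_size - 1
    # Test statistic: (n - 1) * s^2 / sigma0^2
    chi2_value = df * sample_var / claimed_var
    lower = stats.chi2.ppf(alpha / 2, df)
    upper = stats.chi2.ppf(1 - alpha / 2, df)
    # Two-sided p-value: double the smaller tail area, capped at 1
    tail = min(stats.chi2.cdf(chi2_value, df), stats.chi2.sf(chi2_value, df))
    p_value = min(1.0, 2 * tail)
    reject = bool(chi2_value < lower or chi2_value > upper)
    return chi2_value, lower, upper, p_value, reject

chi2_value, lower, upper, p_value, reject = chi_square_variance_test(0.006, 0.005, 25, 0.01)
print(f"Chi-square: {chi2_value:.3f}, critical values: ({lower:.3f}, {upper:.3f})")
print(f"P-value: {p_value:.3f}, reject: {reject}")
```

The statistic is 24 × 0.006 / 0.005 = 28.8, which lies between the two critical values, so this test also fails to reject the manufacturer's claim at the 1% level.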


Q6. Write a Python function that takes in the degrees of freedom for the numerator and denominator of an
F-distribution and calculates the mean and variance of the distribution. The function should return the
mean and variance as a tuple.

In [6]:
def f_distribution_mean_variance(d1, d2):
    """
    Calculate the mean and variance of an F-distribution.

    Parameters:
    d1 (int): Degrees of freedom for the numerator.
    d2 (int): Degrees of freedom for the denominator.

    Returns:
    tuple: A tuple containing the mean and variance of the F-distribution.
    """
    if d2 <= 2:
        raise ValueError("Degrees of freedom for the denominator must be greater than 2 to calculate mean.")
    
    mean = d2 / (d2 - 2)
    
    if d2 <= 4:
        raise ValueError("Degrees of freedom for the denominator must be greater than 4 to calculate variance.")
    
    variance = (2 * d2**2 * (d2 + d1 - 2)) / (d1 * (d2 - 2)**2 * (d2 - 4))
    
    return (mean, variance)

# Example usage
d1 = 5
d2 = 10
mean, variance = f_distribution_mean_variance(d1, d2)
print(f"Mean: {mean}, Variance: {variance}")


Mean: 1.25, Variance: 1.3541666666666667
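
The closed-form result can be cross-checked against SciPy, which exposes the distribution's moments through `scipy.stats.f.stats`. This is a sketch for verification only. The variable names are illustrative.

```python
from scipy import stats

d1, d2 = 5, 10

# Closed-form mean (requires d2 > 2) and variance (requires d2 > 4)
mean_formula = d2 / (d2 - 2)
var_formula = (2 * d2**2 * (d1 + d2 - 2)) / (d1 * (d2 - 2)**2 * (d2 - 4))

# SciPy's own mean and variance for the same degrees of freedom
mean_scipy, var_scipy = stats.f.stats(d1, d2, moments="mv")

print(f"Formula: mean={mean_formula}, variance={var_formula}")
print(f"SciPy:   mean={float(mean_scipy)}, variance={float(var_scipy)}")
```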


Q7. A random sample of 10 measurements is taken from a normal population with unknown variance. The
sample variance is found to be 25. Another random sample of 15 measurements is taken from another
normal population with unknown variance, and the sample variance is found to be 20. Conduct an F-test
at the 10% significance level to determine if the variances are significantly different.

In [7]:
from scipy.stats import f

# Given values
s1_squared = 25
s2_squared = 20
n1 = 10
n2 = 15
alpha = 0.10

# Degrees of freedom
df1 = n1 - 1
df2 = n2 - 1

# F-statistic
F = s1_squared / s2_squared

# Critical value for two-tailed test
F_critical = f.ppf(1 - alpha / 2, df1, df2)

# Decision
print(f"F-statistic: {F}")
print(f"Critical value (two-tailed, alpha = {alpha}): {F_critical}")

if F > F_critical:
    print("Reject the null hypothesis: variances are significantly different.")
else:
    print("Do not reject the null hypothesis: variances are not significantly different.")


F-statistic: 1.25
Critical value (two-tailed, alpha = 0.1): 2.6457907352338195
Do not reject the null hypothesis: variances are not significantly different.


Q8. The following data represent the waiting times in minutes at two different restaurants on a Saturday
night: Restaurant A: 24, 25, 28, 23, 22, 20, 27; Restaurant B: 31, 33, 35, 30, 32, 36. Conduct an F-test at the 5%
significance level to determine if the variances are significantly different.

In [8]:
import numpy as np
from scipy.stats import f

# Data
data_A = np.array([24, 25, 28, 23, 22, 20, 27])
data_B = np.array([31, 33, 35, 30, 32, 36])

# Sample variances
var_A = np.var(data_A, ddof=1)  # ddof=1 for sample variance
var_B = np.var(data_B, ddof=1)

# Sample sizes
n_A = len(data_A)
n_B = len(data_B)

# Degrees of freedom
df_A = n_A - 1
df_B = n_B - 1

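# F-statistic: larger sample variance in the numerator, with matching degrees of freedom
if var_A >= var_B:
    F, df_num, df_den = var_A / var_B, df_A, df_B
else:
    F, df_num, df_den = var_B / var_A, df_B, df_A

# Upper critical value for a two-tailed test at the 5% significance level
alpha = 0.05
F_critical = f.ppf(1 - alpha / 2, df_num, df_den)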

# Decision
print(f"Sample Variance A: {var_A}")
print(f"Sample Variance B: {var_B}")
print(f"F-statistic: {F}")
print(f"Critical value (alpha = {alpha}): {F_critical}")

if F > F_critical:
    print("Reject the null hypothesis: variances are significantly different.")
else:
    print("Do not reject the null hypothesis: variances are not significantly different.")


Sample Variance A: 7.80952380952381
Sample Variance B: 5.366666666666667
F-statistic: 1.4551907719609583
Critical value (alpha = 0.05): 6.977701858535566
Do not reject the null hypothesis: variances are not significantly different.


Q9. The following data represent the test scores of two groups of students: Group A: 80, 85, 90, 92, 87, 83;
Group B: 75, 78, 82, 79, 81, 84. Conduct an F-test at the 1% significance level to determine if the variances
are significantly different.

In [9]:
import numpy as np
from scipy.stats import f

# Data
data_A = np.array([80, 85, 90, 92, 87, 83])
data_B = np.array([75, 78, 82, 79, 81, 84])

# Sample variances
var_A = np.var(data_A, ddof=1)  # ddof=1 for sample variance
var_B = np.var(data_B, ddof=1)

# Sample sizes
n_A = len(data_A)
n_B = len(data_B)

# Degrees of freedom
df_A = n_A - 1
df_B = n_B - 1

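# F-statistic: larger sample variance in the numerator, with matching degrees of freedom
if var_A >= var_B:
    F, df_num, df_den = var_A / var_B, df_A, df_B
else:
    F, df_num, df_den = var_B / var_A, df_B, df_A

# Upper critical value for a two-tailed test at the 1% significance level
alpha = 0.01
F_critical = f.ppf(1 - alpha / 2, df_num, df_den)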

# Decision
print(f"Sample Variance A: {var_A}")
print(f"Sample Variance B: {var_B}")
print(f"F-statistic: {F}")
print(f"Critical value (alpha = {alpha}): {F_critical}")

if F > F_critical:
    print("Reject the null hypothesis: variances are significantly different.")
else:
    print("Do not reject the null hypothesis: variances are not significantly different.")


Sample Variance A: 19.76666666666667
Sample Variance B: 10.166666666666666
F-statistic: 1.9442622950819677
Critical value (alpha = 0.01): 14.939605459912219
Do not reject the null hypothesis: variances are not significantly different.
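
The cells for Q8 and Q9 differ only in their data and significance level, so the shared steps could be collected into one function. The sketch below does this under the hypothetical name `two_tailed_f_test`, and keeps the degrees of freedom paired with whichever variance ends up in the numerator.

```python
import numpy as np
from scipy.stats import f

def two_tailed_f_test(data_a, data_b, alpha):
    """Hypothetical helper: larger-over-smaller F-test on two samples."""
    var_a, var_b = np.var(data_a, ddof=1), np.var(data_b, ddof=1)
    df_a, df_b = len(data_a) - 1, len(data_b) - 1
    # The numerator df must follow the variance that is placed in the numerator
    if var_a >= var_b:
        f_stat, df_num, df_den = var_a / var_b, df_a, df_b
    else:
        f_stat, df_num, df_den = var_b / var_a, df_b, df_a
    f_critical = f.ppf(1 - alpha / 2, df_num, df_den)
    return f_stat, f_critical, bool(f_stat > f_critical)

waiting = two_tailed_f_test([24, 25, 28, 23, 22, 20, 27], [31, 33, 35, 30, 32, 36], 0.05)
scores = two_tailed_f_test([80, 85, 90, 92, 87, 83], [75, 78, 82, 79, 81, 84], 0.01)
print(f"Restaurants: F={waiting[0]:.3f}, critical={waiting[1]:.3f}, reject={waiting[2]}")
print(f"Test scores: F={scores[0]:.3f}, critical={scores[1]:.3f}, reject={scores[2]}")
```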
