Q1. Write a Python function that takes in two arrays of data and calculates the F-value for a variance ratio
test. The function should return the F-value and the corresponding p-value for the test.

In [1]:
import numpy as np
import scipy.stats as stats

def variance_ratio_test(data1, data2):
    # Calculate the variances of the two samples
    variance1 = np.var(data1, ddof=1)  # ddof=1 for unbiased variance estimate
    variance2 = np.var(data2, ddof=1)

    # Calculate the F-value
    F = variance1 / variance2

    # Calculate the degrees of freedom
    df1 = len(data1) - 1
    df2 = len(data2) - 1

    # Calculate the right-tailed p-value, P(F(df1, df2) >= F)
    p_value = 1 - stats.f.cdf(F, df1, df2)

    return F, p_value

# Example usage:
data1 = [12, 15, 18, 22, 17]
data2 = [8, 11, 14, 9, 12]
F, p_value = variance_ratio_test(data1, data2)
print("F-value:", F)
print("p-value:", p_value)


F-value: 2.403508771929825
p-value: 0.2082525723890155
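The function above reports a right-tailed p-value, which suits the alternative "variance of data1 is larger". If the question is simply whether the two variances differ, a two-tailed p-value is the usual choice. The sketch below is one way to do that; the name variance_ratio_test_two_tailed is hypothetical and not part of the original answer, and the data are assumed to be roughly normal.

```python
import numpy as np
import scipy.stats as stats

def variance_ratio_test_two_tailed(data1, data2):
    # Hypothetical two-tailed variant of variance_ratio_test
    f_value = np.var(data1, ddof=1) / np.var(data2, ddof=1)
    df1 = len(data1) - 1
    df2 = len(data2) - 1

    # Probability mass in each tail of the F(df1, df2) distribution
    lower_tail = stats.f.cdf(f_value, df1, df2)
    upper_tail = stats.f.sf(f_value, df1, df2)

    # Double the smaller tail and cap the result at 1
    p_value = min(1.0, 2 * min(lower_tail, upper_tail))
    return f_value, p_value

f_two, p_two = variance_ratio_test_two_tailed([12, 15, 18, 22, 17], [8, 11, 14, 9, 12])
print("F-value:", f_two)
print("Two-tailed p-value:", p_two)
```

For the example data the F-value is unchanged and the two-tailed p-value is twice the right-tailed one, because the observed F lies in the upper tail.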


Q2. Given a significance level of 0.05 and the degrees of freedom for the numerator and denominator of an
F-distribution, write a Python function that returns the critical F-value for a two-tailed test.

In [2]:
import scipy.stats as stats

def get_critical_f_value(alpha, dfn, dfd):
    # Calculate the critical F-value for a two-tailed test
    f_critical = stats.f.ppf(1 - alpha / 2, dfn, dfd)
    
    return f_critical

# Example usage:
alpha = 0.05
dfn = 3  # Degrees of freedom for the numerator
dfd = 10  # Degrees of freedom for the denominator
critical_f = get_critical_f_value(alpha, dfn, dfd)
print("Critical F-value:", critical_f)


Critical F-value: 4.825621493405406


In this example, get_critical_f_value calls stats.f.ppf at the 1 - alpha / 2 quantile, which gives the upper critical F-value for a two-tailed test. The function takes the significance level (alpha), the degrees of freedom for the numerator (dfn), and the degrees of freedom for the denominator (dfd).

Adjust the values of alpha, dfn, and dfd in the example usage to calculate the critical F-value for your specific test.
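A two-tailed F-test has two rejection boundaries, and get_critical_f_value returns only the upper one. As a minimal sketch, the hypothetical helper get_two_tailed_f_bounds (not part of the original answer) returns both quantiles; the null hypothesis would be rejected when the observed F falls below the lower bound or above the upper bound.

```python
import scipy.stats as stats

def get_two_tailed_f_bounds(alpha, dfn, dfd):
    # Hypothetical helper: lower and upper critical values, alpha / 2 in each tail
    lower = stats.f.ppf(alpha / 2, dfn, dfd)
    upper = stats.f.ppf(1 - alpha / 2, dfn, dfd)
    return lower, upper

lower, upper = get_two_tailed_f_bounds(0.05, 3, 10)
print("Lower critical F-value:", lower)
print("Upper critical F-value:", upper)
```

The lower bound equals the reciprocal of the upper critical value with the degrees of freedom swapped, which is how printed F-tables are normally used for the lower tail.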

Q3. Write a Python program that generates random samples from two normal distributions with known
variances and uses an F-test to determine if the variances are equal. The program should output the
F-value, degrees of freedom, and p-value for the test.

In [3]:
import numpy as np
import scipy.stats as stats

# Set the random seed for reproducibility
np.random.seed(42)

# Generate random samples from two normal distributions
sample_size = 30  # Sample size for each group
variance1 = 4.0  # Variance of the first distribution
variance2 = 6.0  # Variance of the second distribution

sample1 = np.random.normal(0, np.sqrt(variance1), sample_size)
sample2 = np.random.normal(0, np.sqrt(variance2), sample_size)

# Perform an F-test to compare variances
f_statistic = np.var(sample1, ddof=1) / np.var(sample2, ddof=1)
df1 = sample_size - 1
df2 = sample_size - 1
p_value = 2 * min(stats.f.cdf(f_statistic, df1, df2), 1 - stats.f.cdf(f_statistic, df1, df2))

# Output the results
print("F-value:", f_statistic)
print("Degrees of freedom (df1, df2):", df1, df2)
print("p-value:", p_value)

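# Decide whether the samples show a significant difference in variances.
# A non-significant result does not prove the variances are equal
# (here the true variances are 4 and 6 by construction).
alpha = 0.05
if p_value < alpha:
    print("The variances are significantly different.")
else:
    print("There is no significant difference in variances.")


F-value: 0.6228812519994188
Degrees of freedom (df1, df2): 29 29
p-value: 0.2084092012966122
There is no significant difference in variances.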


Q4. The variances of two populations are known to be 10 and 15. A sample of 12 observations is taken from
each population. Conduct an F-test at the 5% significance level to determine if the variances are
significantly different.

The F-statistic is the ratio of the two variances, with n - 1 = 11 degrees of freedom for each sample of 12 observations. The code below treats the stated values 10 and 15 as the variance estimates for the two samples and computes a two-tailed p-value, which is compared with the 5% significance level.

In [4]:
import scipy.stats as stats

# Known variances of the two populations
variance1 = 10
variance2 = 15

# Sample sizes
n1 = 12
n2 = 12

# Calculate the F-statistic
f_statistic = variance1 / variance2

# Degrees of freedom for the F-distribution
df1 = n1 - 1
df2 = n2 - 1

# Calculate the p-value for a two-tailed test
p_value = 2 * min(stats.f.cdf(f_statistic, df1, df2), 1 - stats.f.cdf(f_statistic, df1, df2))

# Significance level
alpha = 0.05

# Print the results
print("F-statistic:", f_statistic)
print("Degrees of freedom (df1, df2):", df1, df2)
print("p-value:", p_value)

# Determine if the variances are significantly different
if p_value < alpha:
    print("The variances are significantly different.")
else:
    print("There is no significant difference in variances.")


F-statistic: 0.6666666666666666
Degrees of freedom (df1, df2): 11 11
p-value: 0.5123897987357996
There is no significant difference in variances.


Q5. A manufacturer claims that the variance of the diameter of a certain product is 0.005. A sample of 25
products is taken, and the sample variance is found to be 0.006. Conduct an F-test at the 1% significance
level to determine if the claim is justified.

In [5]:
import scipy.stats as stats

# Claimed variance by the manufacturer
claimed_variance = 0.005

# Sample variance
sample_variance = 0.006

# Sample size
n = 25

# Calculate the F-statistic
f_statistic = sample_variance / claimed_variance

# Degrees of freedom for the F-distribution
df1 = n - 1
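# Note: the claimed variance is a fixed constant, not a sample estimate, so it has
# no sampling degrees of freedom of its own; df2 = 1 is a choice made in this notebook
df2 = 1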

# Calculate the p-value for a one-tailed test (right-tailed)
p_value = 1 - stats.f.cdf(f_statistic, df1, df2)

# Significance level
alpha = 0.01  # 1% significance level

# Print the results
print("F-statistic:", f_statistic)
print("Degrees of freedom (df1, df2):", df1, df2)
print("p-value:", p_value)

# Determine if the claim is justified
if p_value < alpha:
    print("The claim is not justified; the sample variance is significantly different from the claimed variance.")
else:
    print("The claim is justified; there is no significant difference in variances.")


F-statistic: 1.2
Degrees of freedom (df1, df2): 24 1
p-value: 0.6296099619959358
The claim is justified; there is no significant difference in variances.
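Because the claimed variance is a fixed constant rather than an estimate from a second sample, the usual textbook test for this situation is the chi-square test for a single variance, with statistic (n - 1) * s^2 / sigma0^2 on n - 1 degrees of freedom. This is equivalent to an F-test whose denominator degrees of freedom tend to infinity. The sketch below is an alternative to the cell above, not the original answer; it assumes normally distributed diameters and a right-tailed alternative (true variance greater than claimed), and the variable names are illustrative.

```python
import scipy.stats as stats

claimed_variance = 0.005  # sigma0^2, the manufacturer's claim
sample_variance = 0.006   # s^2, observed in the sample
n = 25
alpha = 0.01

# Chi-square statistic for a single variance
df = n - 1
chi2_statistic = df * sample_variance / claimed_variance

# Right-tailed p-value: P(chi2(df) >= observed statistic)
chi2_p_value = stats.chi2.sf(chi2_statistic, df)

print("Chi-square statistic:", chi2_statistic)
print("Degrees of freedom:", df)
print("p-value:", chi2_p_value)

if chi2_p_value < alpha:
    print("Reject the claimed variance at the 1% level.")
else:
    print("No significant evidence against the claimed variance.")
```

With these inputs the statistic is about 28.8 on 24 degrees of freedom and the p-value stays well above 0.01, so the conclusion of the cell above does not change.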


Q6. Write a Python function that takes in the degrees of freedom for the numerator and denominator of an
F-distribution and calculates the mean and variance of the distribution. The function should return the
mean and variance as a tuple.

In [6]:
def calculate_f_distribution_mean_and_variance(dfn, dfd):
    if dfn <= 0 or dfd <= 0:
        raise ValueError("Degrees of freedom must be greater than 0.")
    
    # Calculate the mean of the F-distribution
    if dfd > 2:
        mean = dfd / (dfd - 2)
    else:
        mean = float('inf')  # For dfd <= 2 the mean is not finite; inf is used as a sentinel
    
    # Calculate the variance of the F-distribution
    if dfd > 4:
        variance = (2 * (dfd ** 2) * (dfn + dfd - 2)) / (dfn * (dfd - 2) ** 2 * (dfd - 4))
    else:
        variance = float('inf')  # For dfd <= 4 the variance is not finite; inf is used as a sentinel
    
    return mean, variance

# Example usage:
dfn = 3  # Degrees of freedom for the numerator
dfd = 10  # Degrees of freedom for the denominator
mean, variance = calculate_f_distribution_mean_and_variance(dfn, dfd)
print("Mean:", mean)
print("Variance:", variance)


Mean: 1.25
Variance: 1.9097222222222223
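The closed-form results can be cross-checked against SciPy, whose F-distribution object exposes a stats method that returns the requested moments. This is only a sanity check on the formulas above, using the same example degrees of freedom; the names scipy_mean and scipy_variance are illustrative.

```python
import scipy.stats as stats

dfn, dfd = 3, 10

# moments='mv' requests the mean and the variance
scipy_mean, scipy_variance = stats.f.stats(dfn, dfd, moments='mv')

print("Mean (SciPy):", float(scipy_mean))
print("Variance (SciPy):", float(scipy_variance))
```

Both values should agree with the hand-written formulas up to floating-point rounding.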


Q7. A random sample of 10 measurements is taken from a normal population with unknown variance. The
sample variance is found to be 25. Another random sample of 15 measurements is taken from another
normal population with unknown variance, and the sample variance is found to be 20. Conduct an F-test
at the 10% significance level to determine if the variances are significantly different.

In [7]:
import scipy.stats as stats

# Sample variances
sample_variance1 = 25
sample_variance2 = 20

# Sample sizes
n1 = 10
n2 = 15

# Calculate the F-statistic
f_statistic = sample_variance1 / sample_variance2

# Degrees of freedom for the F-distribution
df1 = n1 - 1
df2 = n2 - 1

# Calculate the p-value for a two-tailed test
p_value = 2 * min(stats.f.cdf(f_statistic, df1, df2), 1 - stats.f.cdf(f_statistic, df1, df2))

# Significance level
alpha = 0.10  # 10% significance level

# Print the results
print("F-statistic:", f_statistic)
print("Degrees of freedom (df1, df2):", df1, df2)
print("p-value:", p_value)

# Determine if the variances are significantly different
# (a non-significant result does not show that the variances are equal)
if p_value < alpha:
    print("The variances are significantly different.")
else:
    print("There is no significant difference in variances.")


F-statistic: 1.25
Degrees of freedom (df1, df2): 9 14
p-value: 0.6832194382585954
There is no significant difference in variances.


Q8. The following data represent the waiting times in minutes at two different restaurants on a Saturday
night: Restaurant A: 24, 25, 28, 23, 22, 20, 27; Restaurant B: 31, 33, 35, 30, 32, 36. Conduct an F-test at the 5% significance level to determine if the variances are significantly different.

In [8]:
import scipy.stats as stats

# Waiting times for Restaurant A and Restaurant B
waiting_times_a = [24, 25, 28, 23, 22, 20, 27]
waiting_times_b = [31, 33, 35, 30, 32, 36]

# Calculate the sample variances
variance_a = sum((x - sum(waiting_times_a)/len(waiting_times_a))**2 for x in waiting_times_a) / (len(waiting_times_a)-1)
variance_b = sum((x - sum(waiting_times_b)/len(waiting_times_b))**2 for x in waiting_times_b) / (len(waiting_times_b)-1)

# Calculate the F-statistic
f_statistic = variance_a / variance_b

# Degrees of freedom for the F-distribution
df1 = len(waiting_times_a) - 1
df2 = len(waiting_times_b) - 1

# Calculate the p-value for a two-tailed test
p_value = 2 * min(stats.f.cdf(f_statistic, df1, df2), 1 - stats.f.cdf(f_statistic, df1, df2))

# Significance level
alpha = 0.05  # 5% significance level

# Print the results
print("F-statistic:", f_statistic)
print("Degrees of freedom (df1, df2):", df1, df2)
print("p-value:", p_value)

# Determine if the variances are significantly different
# (a non-significant result does not show that the variances are equal)
if p_value < alpha:
    print("The variances are significantly different.")
else:
    print("There is no significant difference in variances.")


F-statistic: 1.4551907719609583
Degrees of freedom (df1, df2): 6 5
p-value: 0.6974815747937484
There is no significant difference in variances.


Q9. The following data represent the test scores of two groups of students: Group A: 80, 85, 90, 92, 87, 83;
Group B: 75, 78, 82, 79, 81, 84. Conduct an F-test at the 1% significance level to determine if the variances
are significantly different.

In [9]:
import scipy.stats as stats

# Test scores for Group A and Group B
scores_group_a = [80, 85, 90, 92, 87, 83]
scores_group_b = [75, 78, 82, 79, 81, 84]

# Calculate the sample variances
variance_group_a = sum((x - sum(scores_group_a)/len(scores_group_a))**2 for x in scores_group_a) / (len(scores_group_a)-1)
variance_group_b = sum((x - sum(scores_group_b)/len(scores_group_b))**2 for x in scores_group_b) / (len(scores_group_b)-1)

# Calculate the F-statistic
f_statistic = variance_group_a / variance_group_b

# Degrees of freedom for the F-distribution
df1 = len(scores_group_a) - 1
df2 = len(scores_group_b) - 1

# Calculate the p-value for a two-tailed test
p_value = 2 * min(stats.f.cdf(f_statistic, df1, df2), 1 - stats.f.cdf(f_statistic, df1, df2))

# Significance level
alpha = 0.01  # 1% significance level

# Print the results
print("F-statistic:", f_statistic)
print("Degrees of freedom (df1, df2):", df1, df2)
print("p-value:", p_value)

# Determine if the variances are significantly different
# (a non-significant result does not show that the variances are equal)
if p_value < alpha:
    print("The variances are significantly different.")
else:
    print("There is no significant difference in variances.")


F-statistic: 1.9442622950819677
Degrees of freedom (df1, df2): 5 5
p-value: 0.4831043549070688
There is no significant difference in variances.
