# Q1. Write a Python function that takes in two arrays of data and calculates the F-value for a variance ratio test. The function should return the F-value and the corresponding p-value for the test.

In [3]:
import numpy as np
from scipy.stats import f

def variance_ratio_test(data1, data2):
    """
    Compute the F-value and two-tailed p-value for a variance ratio test.
    The arrays may have different lengths; each needs at least two observations.
    """
    data1 = np.asarray(data1, dtype=float)
    data2 = np.asarray(data2, dtype=float)

    # Compute the unbiased sample variances (ddof=1)
    var1 = np.var(data1, ddof=1)
    var2 = np.var(data2, ddof=1)

    # Compute the test statistic and degrees of freedom
    F = var1 / var2
    df1 = len(data1) - 1
    df2 = len(data2) - 1

    # Compute the two-tailed p-value: double the smaller tail probability
    p_value = 2 * min(f.cdf(F, df1, df2), f.sf(F, df1, df2))

    return F, p_value

This function takes in two arrays of data (data1 and data2), which do not need to have the same length. It computes the unbiased sample variance of each array (ddof=1) and takes their ratio as the F-value. Under the null hypothesis of equal variances, and assuming both samples come from normal populations, this ratio follows an F-distribution with len(data1) - 1 and len(data2) - 1 degrees of freedom. The p-value is two-tailed: it doubles the smaller of the two tail probabilities, so it does not matter which array has the larger variance. The function returns both the F-value and the p-value.
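One property worth checking for any two-tailed variance ratio test is that the order of the samples should not change the conclusion. The sketch below illustrates this under stated assumptions: the helper name two_tailed_f_test and the data values are made up for illustration, and the test is the conventional ratio of unbiased sample variances.

```python
import numpy as np
from scipy.stats import f

def two_tailed_f_test(x, y):
    """Hypothetical helper: F = s_x^2 / s_y^2 with a two-tailed p-value."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    F = np.var(x, ddof=1) / np.var(y, ddof=1)
    dfn, dfd = len(x) - 1, len(y) - 1
    # Double the smaller tail so the result does not depend on sample order
    p = 2 * min(f.cdf(F, dfn, dfd), f.sf(F, dfn, dfd))
    return F, p

# Made-up illustrative data
x = [2.1, 3.4, 1.9, 5.0, 4.2, 3.3]
y = [2.8, 3.1, 3.0, 2.6, 3.3]

F_xy, p_xy = two_tailed_f_test(x, y)
F_yx, p_yx = two_tailed_f_test(y, x)

print("F(x, y) = {:.4f}, p = {:.4f}".format(F_xy, p_xy))
print("F(y, x) = {:.4f}, p = {:.4f}".format(F_yx, p_yx))
print("F values are reciprocals:", np.isclose(F_xy * F_yx, 1.0))
print("p-values agree:", np.isclose(p_xy, p_yx))
```

Swapping the samples inverts the F-value and swaps the degrees of freedom, which leaves the two-tailed p-value unchanged.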

# Q2. Given a significance level of 0.05 and the degrees of freedom for the numerator and denominator of an F-distribution, write a Python function that returns the critical F-value for a two-tailed test.

In [4]:
from scipy.stats import f

def critical_f_value(df1, df2):
    """
    Compute the critical F-value for a two-tailed test given a significance level of 0.05
    and the degrees of freedom for the numerator and denominator of an F-distribution.
    """
    alpha = 0.05
    f_critical = f.ppf(1 - alpha/2, df1, df2)
    return f_critical

This function uses the f.ppf() function from the scipy.stats module to compute the upper critical F-value for a two-tailed test at a significance level of 0.05. f.ppf() takes a cumulative probability as its first argument, followed by the degrees of freedom for the numerator and denominator of the F-distribution (df1 and df2, respectively). In a two-tailed test the significance level is split equally between the two tails, so the upper critical value is the 1 - alpha/2 quantile. The matching lower critical value, if needed, is f.ppf(alpha/2, df1, df2). The null hypothesis is rejected when the F-value is above the upper critical value or below the lower one.
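A two-tailed test has two rejection boundaries, one in each tail. As a minimal sketch (the function name two_tailed_critical_values and the example degrees of freedom are assumptions chosen for illustration), both boundaries can be returned together, and the reciprocal relationship between the tails can be checked numerically:

```python
from scipy.stats import f

def two_tailed_critical_values(df1, df2, alpha=0.05):
    """Hypothetical helper: lower and upper critical F-values."""
    lower = f.ppf(alpha / 2, df1, df2)      # alpha/2 quantile
    upper = f.ppf(1 - alpha / 2, df1, df2)  # 1 - alpha/2 quantile
    return lower, upper

lower, upper = two_tailed_critical_values(9, 14)

# If X ~ F(df1, df2) then 1/X ~ F(df2, df1), so the lower critical value
# equals the reciprocal of the upper value with the degrees of freedom swapped.
_, upper_swapped = two_tailed_critical_values(14, 9)

print("lower = {:.4f}, upper = {:.4f}".format(lower, upper))
print("1 / upper(14, 9) = {:.4f}".format(1 / upper_swapped))
```

This reciprocal relationship is why printed F-tables usually list only upper-tail values.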

# Q3. Write a Python program that generates random samples from two normal distributions with known variances and uses an F-test to determine if the variances are equal. The program should output the F-value, degrees of freedom, and p-value for the test.

In [5]:
import numpy as np
from scipy.stats import f

# Set the random seed for reproducibility
np.random.seed(123)

# Set the sample sizes and variances for the two normal distributions
n1, n2 = 50, 40
var1, var2 = 1.5**2, 1.8**2

# Generate random samples from the two normal distributions
data1 = np.random.normal(loc=0, scale=np.sqrt(var1), size=n1)
data2 = np.random.normal(loc=0, scale=np.sqrt(var2), size=n2)

# Compute the F-value and p-value for the variance ratio test
F = np.var(data1) / np.var(data2)
df1 = n1 - 1
df2 = n2 - 1
p_value = 1 - f.cdf(F, df1, df2)

# Print the results of the variance ratio test
print("F-value: {:.4f}".format(F))
print("Degrees of freedom: {}, {}".format(df1, df2))
print("p-value: {:.4f}".format(p_value))

F-value: 0.8261
Degrees of freedom: 49, 39
p-value: 0.7387


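This program generates random samples from two normal distributions with known variances (var1 and var2) and sample sizes (n1 and n2). It then computes the F-value as the ratio of the two variances returned by np.var() and uses the f.cdf() function from the scipy.stats module to compute the p-value. The degrees of freedom for the test are n1 - 1 and n2 - 1. Finally, the program prints out the F-value, degrees of freedom, and p-value for the test. Because the random seed is fixed at 123, the output is the same on every run; changing or removing the seed produces different samples and therefore different results. Two caveats apply to the numbers shown. First, np.var() defaults to ddof=0, which divides by n rather than n - 1, whereas the F-test is defined on the unbiased sample variances (ddof=1), so the F-value here differs slightly from the conventional one. Second, 1 - f.cdf(F, df1, df2) is an upper-tail (one-sided) p-value; a test of whether the variances are equal is two-tailed and would double the smaller of the two tail probabilities.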
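For comparison, here is a minimal sketch of the same experiment using unbiased sample variances and a two-tailed p-value. It is an illustration under stated assumptions, not a rerun of the cell above: it uses NumPy's Generator API (np.random.default_rng) instead of np.random.seed, so the samples and printed values will differ from the output shown.

```python
import numpy as np
from scipy.stats import f

# Assumption: a Generator seeded with 123; it does not reproduce the
# samples drawn by np.random.seed(123) in the legacy API.
rng = np.random.default_rng(123)

n1, n2 = 50, 40
sd1, sd2 = 1.5, 1.8

x = rng.normal(loc=0, scale=sd1, size=n1)
y = rng.normal(loc=0, scale=sd2, size=n2)

# Unbiased sample variances (divide by n - 1)
F = np.var(x, ddof=1) / np.var(y, ddof=1)
df1, df2 = n1 - 1, n2 - 1

# Two-tailed p-value: double the smaller tail probability
p_two_sided = 2 * min(f.cdf(F, df1, df2), f.sf(F, df1, df2))

print("F-value: {:.4f}".format(F))
print("Degrees of freedom: {}, {}".format(df1, df2))
print("Two-tailed p-value: {:.4f}".format(p_two_sided))
```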

# Q4. The variances of two populations are known to be 10 and 15. A sample of 12 observations is taken from each population. Conduct an F-test at the 5% significance level to determine if the variances are significantly different.

In [6]:
from scipy.stats import f

# Define the sample sizes and the variances given in the problem
n1, n2 = 12, 12
var1, var2 = 10, 15

# Compute the F-value and degrees of freedom for the variance ratio test
F = var1 / var2
df1 = n1 - 1
df2 = n2 - 1

# Compute the two-tailed p-value: double the smaller tail probability
p_value = 2 * min(f.cdf(F, df1, df2), f.sf(F, df1, df2))

# Set the significance level
alpha = 0.05

# Conduct the hypothesis test
if p_value < alpha:
    print("Reject the null hypothesis: The variances are significantly different.")
else:
    print("Fail to reject the null hypothesis: The variances are not significantly different.")

Fail to reject the null hypothesis: The variances are not significantly different.


This program uses the two variances given in the problem (var1 and var2) directly as the variance estimates for samples of 12 observations each (n1 and n2), rather than simulating data, so the conclusion does not depend on random draws. The F-value is 10 / 15, approximately 0.667, with 11 and 11 degrees of freedom. Because the question asks whether the variances are different, the p-value is two-tailed: it doubles the smaller of the two tail probabilities obtained from f.cdf() and f.sf() in the scipy.stats module. Finally, the program conducts the hypothesis test at the 5% significance level and prints out the conclusion of the test.

# Q5. A manufacturer claims that the variance of the diameter of a certain product is 0.005. A sample of 25 products is taken, and the sample variance is found to be 0.006. Conduct an F-test at the 1% significance level to determine if the claim is justified.

In [7]:
from scipy.stats import chi2

# Define the null hypothesis variance, the sample variance, and the sample size
null_var = 0.005
sample_var = 0.006
n = 25

# Compute the variance ratio and the equivalent chi-square statistic
F = sample_var / null_var
df1 = n - 1
# With a known (claimed) variance the denominator has infinite degrees of
# freedom, and df1 * F follows a chi-square distribution with df1 degrees of freedom
chi2_stat = df1 * F
p_value = chi2.sf(chi2_stat, df1)  # upper-tail p-value

# Set the significance level
alpha = 0.01

# Conduct the hypothesis test
if p_value < alpha:
    print("Reject the null hypothesis: The claim of variance 0.005 is not justified.")
else:
    print("Fail to reject the null hypothesis: The claim of variance 0.005 is justified.")

Fail to reject the null hypothesis: The claim of variance 0.005 is justified.


This program tests, at the 1% significance level, the claim that the variance of the diameter of a certain product is 0.005. Only one sample is involved, so the claimed variance plays the role of a denominator variance that is known exactly, which corresponds to an F-distribution with infinite denominator degrees of freedom. In that limit, (n - 1) times the variance ratio follows a chi-square distribution with n - 1 degrees of freedom, so the p-value is computed with chi2.sf() from the scipy.stats module. Here F = 0.006 / 0.005 = 1.2 and the chi-square statistic is 24 * 1.2 = 28.8 with 24 degrees of freedom. The p-value is upper-tailed, which tests whether the true variance is larger than claimed. Failing to reject the null hypothesis means the sample is consistent with the claim; it does not prove the claim.
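A test of a single sample variance against a claimed value is conventionally carried out as a chi-square test, and the decision can also be made by comparing the statistic with a critical value instead of using a p-value. The following is a minimal sketch under that assumption, using the numbers from the question and an upper-tailed alternative (variance larger than claimed); the variable names are chosen for illustration.

```python
from scipy.stats import chi2

claimed_var = 0.005
sample_var = 0.006
n = 25
alpha = 0.01

# Chi-square statistic for a single variance: (n - 1) * s^2 / sigma0^2
df = n - 1
chi2_stat = df * sample_var / claimed_var

# Upper-tail critical value: reject if the statistic exceeds it
critical_value = chi2.ppf(1 - alpha, df)
reject = chi2_stat > critical_value

print("Chi-square statistic: {:.2f}".format(chi2_stat))
print("Critical value: {:.2f}".format(critical_value))
print("Reject H0:", reject)
```

The critical-value decision and the p-value decision always agree, since both compare the same statistic with the same tail of the same distribution.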

# Q6. Write a Python function that takes in the degrees of freedom for the numerator and denominator of an F-distribution and calculates the mean and variance of the distribution. The function should return the mean and variance as a tuple.

In [8]:
def f_distribution_mean_var(df1, df2):
    mean = df2 / (df2 - 2)
    variance = (2 * df2 ** 2 * (df1 + df2 - 2)) / (df1 * (df2 - 2) ** 2 * (df2 - 4))
    return mean, variance

This function uses the standard formulas for the mean and variance of an F-distribution:

Mean = df2 / (df2 - 2)

Variance = (2 * df2^2 * (df1 + df2 - 2)) / (df1 * (df2 - 2)^2 * (df2 - 4))

where df1 and df2 are the degrees of freedom for the numerator and denominator, respectively. The mean is finite only when df2 > 2 and the variance only when df2 > 4; the function does not check these conditions, so it returns meaningless values or raises ZeroDivisionError outside that range.

The function returns the mean and variance as a tuple. You can call this function and pass in the degrees of freedom to calculate the mean and variance of the F-distribution.
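The closed-form expressions can be cross-checked against scipy.stats. The sketch below is illustrative: the helper name f_mean_var and the choice to return NaN when a moment is not finite are assumptions of this example, not part of the function above.

```python
import math
from scipy.stats import f

def f_mean_var(df1, df2):
    """Hypothetical guarded variant: NaN where the moment is not finite."""
    mean = df2 / (df2 - 2) if df2 > 2 else math.nan
    if df2 > 4:
        variance = (2 * df2 ** 2 * (df1 + df2 - 2)) / (
            df1 * (df2 - 2) ** 2 * (df2 - 4)
        )
    else:
        variance = math.nan
    return mean, variance

df1, df2 = 5, 10
mean, variance = f_mean_var(df1, df2)

# Reference values from SciPy's F-distribution
ref_mean, ref_var = f.stats(df1, df2, moments="mv")

print("formula: mean = {:.6f}, variance = {:.6f}".format(mean, variance))
print("scipy:   mean = {:.6f}, variance = {:.6f}".format(float(ref_mean), float(ref_var)))
```

For df1 = 5 and df2 = 10 the mean is 10 / 8 = 1.25 and the variance is 2600 / 1920, about 1.354.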

# Q7. A random sample of 10 measurements is taken from a normal population with unknown variance. The sample variance is found to be 25. Another random sample of 15 measurements is taken from another normal population with unknown variance, and the sample variance is found to be 20. Conduct an F-test at the 10% significance level to determine if the variances are significantly different.

In [9]:
import numpy as np
from scipy.stats import f

# Define the sample variances and sample sizes
s1_squared = 25
s2_squared = 20
n1 = 10
n2 = 15

# Compute the F-value and p-value for the variance ratio test
F = s1_squared / s2_squared
df1 = n1 - 1
df2 = n2 - 1
p_value = 2 * (1 - f.cdf(F, df1, df2))

# Set the significance level
alpha = 0.1

# Conduct the hypothesis test
if p_value < alpha:
    print("Reject the null hypothesis: The variances are significantly different.")
else:
    print("Fail to reject the null hypothesis: The variances are not significantly different.")

Fail to reject the null hypothesis: The variances are not significantly different.


This program conducts an F-test at the 10% significance level to determine if the variances of two normal populations are significantly different. It uses the sample variances of 25 and 20, and sample sizes of 10 and 15 to compute the F-value and p-value for the variance ratio test using the f.cdf() function from the scipy.stats module. The degrees of freedom for the test are computed based on the sample sizes (n1 and n2). Finally, the program conducts the hypothesis test and prints out the conclusion of the test.
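Doubling the upper-tail probability, 2 * (1 - f.cdf(F, df1, df2)), gives a sensible two-tailed p-value only when F is at least 1. A common convention that guarantees this is to put the larger sample variance in the numerator. The sketch below applies that convention to summary statistics; the function name f_test_from_summary is hypothetical, and the cap at 1 guards the edge case where F is very close to 1.

```python
from scipy.stats import f

def f_test_from_summary(var_a, n_a, var_b, n_b):
    """Hypothetical helper: two-tailed F-test from variances and sample sizes."""
    # Put the larger variance in the numerator so that F >= 1
    if var_a >= var_b:
        F, dfn, dfd = var_a / var_b, n_a - 1, n_b - 1
    else:
        F, dfn, dfd = var_b / var_a, n_b - 1, n_a - 1
    # Double the upper tail; cap at 1 because sf(F) can exceed 0.5 near F = 1
    p_value = min(1.0, 2 * f.sf(F, dfn, dfd))
    return F, (dfn, dfd), p_value

alpha = 0.1
F, dfs, p_value = f_test_from_summary(25, 10, 20, 15)
# Same test with the samples given in the opposite order
F_swapped, dfs_swapped, p_swapped = f_test_from_summary(20, 15, 25, 10)

print("F = {:.2f}, df = {}, p = {:.4f}".format(F, dfs, p_value))
print("Reject H0:", p_value < alpha)
```

With this convention the result is the same regardless of which sample is passed first.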

# Q8. The following data represent the waiting times in minutes at two different restaurants on a Saturday night: Restaurant A: 24, 25, 28, 23, 22, 20, 27; Restaurant B: 31, 33, 35, 30, 32, 36. Conduct an F-test at the 5% significance level to determine if the variances are significantly different.

In [10]:
import numpy as np
from scipy.stats import f

# Define the waiting times at each restaurant
a = np.array([24, 25, 28, 23, 22, 20, 27])
b = np.array([31, 33, 35, 30, 32, 36])

# Compute the sample variances and degrees of freedom
squared_var_a = np.var(a, ddof=1)
squared_var_b = np.var(b, ddof=1)
n_a = len(a)
n_b = len(b)
df_a = n_a - 1
df_b = n_b - 1

# Compute the F-value and p-value for the variance ratio test
F = squared_var_a / squared_var_b
p_value = 2 * (1 - f.cdf(F, df_a, df_b))

# Set the significance level
alpha = 0.05

# Conduct the hypothesis test
if p_value < alpha:
    print("Reject the null hypothesis: The variances are significantly different.")
else:
    print("Fail to reject the null hypothesis: The variances are not significantly different.")

Fail to reject the null hypothesis: The variances are not significantly different.


This program conducts an F-test at the 5% significance level to determine if the variances of waiting times at two different restaurants are significantly different. It uses the waiting times of Restaurant A and B to compute the sample variances (squared_var_a and squared_var_b) and degrees of freedom (df_a and df_b). The F-value and p-value for the variance ratio test are then computed using the f.cdf() function from the scipy.stats module. Finally, the program conducts the hypothesis test and prints out the conclusion of the test.

# Q9. The following data represent the test scores of two groups of students: Group A: 80, 85, 90, 92, 87, 83; Group B: 75, 78, 82, 79, 81, 84. Conduct an F-test at the 1% significance level to determine if the variances are significantly different.

In [11]:
import numpy as np
from scipy.stats import f

# Define the test scores for each group
a = np.array([80, 85, 90, 92, 87, 83])
b = np.array([75, 78, 82, 79, 81, 84])

# Compute the sample variances and degrees of freedom
squared_var_a = np.var(a, ddof=1)
squared_var_b = np.var(b, ddof=1)
n_a = len(a)
n_b = len(b)
df_a = n_a - 1
df_b = n_b - 1

# Compute the F-value and p-value for the variance ratio test
F = squared_var_a / squared_var_b
p_value = 2 * (1 - f.cdf(F, df_a, df_b))

# Set the significance level
alpha = 0.01

# Conduct the hypothesis test
if p_value < alpha:
    print("Reject the null hypothesis: The variances are significantly different.")
else:
    print("Fail to reject the null hypothesis: The variances are not significantly different.")

Fail to reject the null hypothesis: The variances are not significantly different.
