Q1. Write a Python function that takes in two arrays of data and calculates the F-value for a variance ratio
test. The function should return the F-value and the corresponding p-value for the test.

In [8]:
import numpy as np
from scipy import stats

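def f_test(arr1, arr2):
    """Return the F-value and right-tail p-value for a variance ratio test."""
    n1 = len(arr1)
    n2 = len(arr2)
    # Sample variances (ddof=1 gives the unbiased estimator)
    s1 = np.var(arr1, ddof=1)
    s2 = np.var(arr2, ddof=1)
    f_value = s1 / s2
    df1 = n1 - 1
    df2 = n2 - 1
    # Survival function: P(F >= f_value) when the population variances are equal
    p_value = stats.f.sf(f_value, df1, df2)
    return round(f_value,3), round(p_value,3)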

In [9]:
arr1 = [10, 21, 35, 49, 51]
arr2 = [61, 27, 78, 19, 100]
f_test(arr1,arr2)

(0.27, 0.883)
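
Note that `stats.f.sf` gives a right-tail p-value, so the result above answers "is the first variance larger?". If the question is whether the variances differ in either direction, a two-sided p-value is needed. A minimal sketch of one common convention (doubling the smaller tail) follows; the name `f_test_two_sided` is hypothetical and not part of the original answer.

```python
import numpy as np
from scipy import stats

def f_test_two_sided(arr1, arr2):
    """Variance ratio test with a two-sided p-value (hypothetical helper)."""
    f_value = np.var(arr1, ddof=1) / np.var(arr2, ddof=1)
    df1, df2 = len(arr1) - 1, len(arr2) - 1
    # Double the smaller of the two tail probabilities, capped at 1
    tail = min(stats.f.cdf(f_value, df1, df2), stats.f.sf(f_value, df1, df2))
    return f_value, min(2 * tail, 1.0)

f_value, p_value = f_test_two_sided([10, 21, 35, 49, 51], [61, 27, 78, 19, 100])
print(f_value, p_value)
```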

Q2. Given a significance level of 0.05 and the degrees of freedom for the numerator and denominator of an
F-distribution, write a Python function that returns the critical F-value for a two-tailed test.

In [10]:
from scipy.stats import f
def critical_f_value(numerator_df, denominator_df, alpha=0.05):
    return f.ppf(1 - alpha/2, numerator_df, denominator_df)

In [12]:
numerator_df = 8
denominator_df = 12
alpha = 0.05

critical_f_value(numerator_df, denominator_df, alpha)

3.511776736314822
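
The value above is the upper critical value. A two-tailed rejection region also has a lower bound, which matters when the smaller variance may end up in the numerator. A small sketch that returns both bounds, assuming the same `alpha/2` split per tail (`critical_f_bounds` is a hypothetical name):

```python
from scipy.stats import f

def critical_f_bounds(numerator_df, denominator_df, alpha=0.05):
    """Return (lower, upper) critical F-values for a two-tailed test."""
    lower = f.ppf(alpha / 2, numerator_df, denominator_df)
    upper = f.ppf(1 - alpha / 2, numerator_df, denominator_df)
    return lower, upper

lower, upper = critical_f_bounds(8, 12)
print(lower, upper)
```

The null hypothesis is rejected when the F-value falls below `lower` or above `upper`.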

Q3. Write a Python program that generates random samples from two normal distributions with known
variances and uses an F-test to determine if the variances are equal. The program should output the F-
value, degrees of freedom, and p-value for the test.

In [26]:
import numpy as np
from scipy.stats import f

# Set the seed for reproducibility
np.random.seed(123)

# Set the parameters for the normal distributions
mu1 = 0
mu2 = 0
sigma1 = 1
sigma2 = 2

# Generate random samples from the two normal distributions
n1 = 10
n2 = 12
x1 = np.random.normal(mu1, sigma1, n1)
x2 = np.random.normal(mu2, sigma2, n2)

# Calculate the F-value and p-value using an F-test
df1 = n1 - 1
df2 = n2 - 1
F = np.var(x1, ddof=1) / np.var(x2, ddof=1)
p_value = 2 * min(f.cdf(F, df1, df2), 1 - f.cdf(F, df1, df2))

# Output the results of the F-test
print("F-value: {:.2f}".format(F))
print("Degrees of freedom: {:d}, {:d}".format(df1, df2))
print("p-value: {:.4f}".format(p_value))

F-value: 0.37
Degrees of freedom: 9, 11
p-value: 0.1432


Q4. The variances of two populations are known to be 10 and 15. A sample of 12 observations is taken from
each population. Conduct an F-test at the 5% significance level to determine if the variances are
significantly different.

Null hypothesis: The variances of the two populations are equal.

Alternative hypothesis: The variances of the two populations are not equal.

In [32]:
import numpy as np
from scipy.stats import f

# Set the seed for reproducibility
np.random.seed(123)

# Set the known variances of the two populations
var1 = 10
var2 = 15

# Generate random samples from the two populations
n1 = 12
n2 = 12
x1 = np.random.normal(0, np.sqrt(var1), n1)
x2 = np.random.normal(0, np.sqrt(var2), n2)

# Calculate the F-value and p-value using an F-test
df1 = n1 - 1
df2 = n2 - 1
F = np.var(x1, ddof=1) / np.var(x2, ddof=1)
p_value = 2 * min(f.cdf(F, df1, df2), 1 - f.cdf(F, df1, df2))

# Output the results of the F-test
alpha = 0.05
print("F-value: {:.2f}".format(F))
print("Degrees of freedom: {:d}, {:d}".format(df1, df2))
print("p-value: {:.4f}".format(p_value))
if p_value < alpha:
    print("The variances are significantly different.")
else:
    print("The variances are not significantly different.")

F-value: 0.78
Degrees of freedom: 11, 11
p-value: 0.6892
The variances are not significantly different.


Q5. A manufacturer claims that the variance of the diameter of a certain product is 0.005. A sample of 25
products is taken, and the sample variance is found to be 0.006. Conduct an F-test at the 1% significance
level to determine if the claim is justified.

Null hypothesis: The population variance of the diameter of the product is equal to 0.005.
Alternative hypothesis: The population variance of the diameter of the product is greater than 0.005.

The F-statistic is calculated as:

F = (sample variance / population variance) = 0.006 / 0.005 = 1.2

The numerator has 24 degrees of freedom (n - 1 = 25 - 1). The denominator is a hypothesized constant rather than an estimate from a second sample, so it effectively has infinite degrees of freedom. This makes the test equivalent to a chi-square test with statistic (n - 1) × F = 24 × 1.2 = 28.8 on 24 degrees of freedom.

For a right-tailed test at the 1% significance level, the critical value is approximately 1.79 for F (equivalently, about 42.98 for the chi-square statistic).

Since our calculated F-statistic (1.2) is less than the critical value (1.79), we fail to reject the null hypothesis.

Therefore, we do not have enough evidence to conclude that the population variance of the diameter of the product is greater than 0.005. The sample is consistent with the manufacturer's claim at the 1% significance level.
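
A test of one sample variance against a claimed value is commonly carried out as a chi-square test, which is the same as an F-test whose denominator has infinite degrees of freedom. The sketch below checks the conclusion that way, assuming the diameters are normally distributed; the variable names are illustrative only.

```python
from scipy.stats import chi2

n = 25
sample_var = 0.006
claimed_var = 0.005
alpha = 0.01

df = n - 1
# Chi-square statistic: (n - 1) * s^2 / sigma0^2
chi2_stat = df * sample_var / claimed_var
# Right-tail critical value and p-value
chi2_crit = chi2.ppf(1 - alpha, df)
p_value = chi2.sf(chi2_stat, df)
# Equivalent F-scale quantities (divide by the degrees of freedom)
f_stat = sample_var / claimed_var
f_crit = chi2_crit / df
reject = chi2_stat > chi2_crit

print("chi-square statistic:", chi2_stat, "critical value:", chi2_crit)
print("F-statistic:", f_stat, "critical value:", f_crit)
print("p-value:", p_value, "reject H0:", reject)
```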

Q6. Write a Python function that takes in the degrees of freedom for the numerator and denominator of an
F-distribution and calculates the mean and variance of the distribution. The function should return the
mean and variance as a tuple.

In [35]:
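def mean_var(df_n, df_d):
    # The mean exists only for df_d > 2 and the variance only for df_d > 4
    if df_d <= 4:
        raise ValueError("df_d must be greater than 4 for the variance to exist")

    mean = df_d / (df_d - 2)
    variance = (2 * df_d ** 2 * (df_n + df_d - 2)) / (df_n * (df_d - 2) ** 2 * (df_d - 4))

    return (round(mean,4), round(variance,4))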


In [36]:
mean_var(10,5)

(1.6667, 7.2222)
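
As a cross-check, the closed-form result can be compared with the moments that SciPy reports for the same distribution. This is only a sanity check, using `scipy.stats.f.stats` with `moments="mv"` (mean and variance):

```python
from scipy.stats import f

# Mean and variance of the F-distribution with 10 and 5 degrees of freedom
mean, variance = f.stats(10, 5, moments="mv")
print(round(float(mean), 4), round(float(variance), 4))
```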

Q7. A random sample of 10 measurements is taken from a normal population with unknown variance. The
sample variance is found to be 25. Another random sample of 15 measurements is taken from another
normal population with unknown variance, and the sample variance is found to be 20. Conduct an F-test
at the 10% significance level to determine if the variances are significantly different.

Null hypothesis: The population variances of the two populations are equal.

Alternative hypothesis: The population variances of the two populations are different.

The F-statistic is calculated as:

F = (larger sample variance / smaller sample variance) = 25 / 20 = 1.25

Since we are testing whether the variances are different, we use a two-tailed test. Using an F-table or calculator with 9 and 14 degrees of freedom (for sample sizes of 10 and 15, respectively) and a 10% significance level (5% in each tail), we find the critical values to be approximately 0.33 and 2.65.

Since our calculated F-statistic (1.25) is greater than the lower critical value (0.33) but less than the upper critical value (2.65), we fail to reject the null hypothesis. Therefore, we do not have enough evidence to conclude that the population variances of the two populations are significantly different at the 10% significance level.
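
The table lookup can be reproduced with `scipy.stats.f`. The sketch below works from the summary statistics given in the question and assumes the 10% significance level is split equally between the two tails.

```python
from scipy.stats import f

s1_sq, n1 = 25, 10   # sample variance and size, first sample
s2_sq, n2 = 20, 15   # sample variance and size, second sample
alpha = 0.10

f_stat = s1_sq / s2_sq
df1, df2 = n1 - 1, n2 - 1
# Lower and upper critical values, alpha/2 in each tail
lower = f.ppf(alpha / 2, df1, df2)
upper = f.ppf(1 - alpha / 2, df1, df2)
# Two-sided p-value: double the smaller tail probability
p_value = 2 * min(f.cdf(f_stat, df1, df2), f.sf(f_stat, df1, df2))
reject = f_stat < lower or f_stat > upper

print("F-value:", f_stat)
print("Critical values:", lower, upper)
print("p-value:", p_value, "reject H0:", reject)
```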

Q8. The following data represent the waiting times in minutes at two different restaurants on a Saturday
night: Restaurant A: 24, 25, 28, 23, 22, 20, 27; Restaurant B: 31, 33, 35, 30, 32, 36. Conduct an F-test at the 5%
significance level to determine if the variances are significantly different.

In [52]:
import numpy as np
from scipy import stats

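def f_test(arr1, arr2):
    """Return the F-value and right-tail p-value for a variance ratio test."""
    n1 = len(arr1)
    n2 = len(arr2)
    # Sample variances (ddof=1 gives the unbiased estimator)
    s1 = np.var(arr1, ddof=1)
    s2 = np.var(arr2, ddof=1)
    f_value = s1 / s2
    df1 = n1 - 1
    df2 = n2 - 1
    # Survival function: P(F >= f_value) when the population variances are equal
    p_value = stats.f.sf(f_value, df1, df2)
    return round(f_value,3), round(p_value,3)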


In [50]:
arr1, arr2 = [24, 25, 28, 23, 22, 20, 27], [31, 33, 35, 30, 32, 36]

In [51]:
f_test(arr1,arr2)

(1.455, 0.349)

In [53]:
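alpha = 0.05
f_value, p_value = f_test(arr1, arr2)
# f_test returns a right-tail p-value; convert it to a two-sided one
p_two_sided = 2 * min(p_value, 1 - p_value)
if p_two_sided < alpha:
    print("The variances are significantly different.")
else:
    print("The variances are not significantly different.")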

The variances are not significantly different.


Q9. The following data represent the test scores of two groups of students: Group A: 80, 85, 90, 92, 87, 83;
Group B: 75, 78, 82, 79, 81, 84. Conduct an F-test at the 1% significance level to determine if the variances
are significantly different.

In [56]:
# Reuse the f_test function defined above
alpha = 0.01
arr1, arr2 = [80, 85, 90, 92, 87, 83], [75, 78, 82, 79, 81, 84]
f_test(arr1,arr2)

(1.944, 0.242)

In [57]:
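f_value, p_value = f_test(arr1, arr2)
# f_test returns a right-tail p-value; convert it to a two-sided one
p_two_sided = 2 * min(p_value, 1 - p_value)
if p_two_sided < alpha:
    print("The variances are significantly different.")
else:
    print("The variances are not significantly different.")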

The variances are not significantly different.
