Q1. Write a Python function that takes in two arrays of data and calculates the F-value for a variance ratio 
test. The function should return the F-value and the corresponding p-value for the test.

In [1]:
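import numpy as np
from scipy.stats import f

data1 = [1, 2, 3, 4, 5]
data2 = [2, 4, 6, 8, 10]

def variance_ratio_test(data1, data2):
    # F-value: ratio of the two sample variances (ddof=1)
    f_value = np.var(data1, ddof=1) / np.var(data2, ddof=1)
    dfn = len(data1) - 1
    dfd = len(data2) - 1
    # Two-tailed p-value from the F-distribution
    p_value = 2 * min(f.cdf(f_value, dfn, dfd), 1 - f.cdf(f_value, dfn, dfd))
    return f_value, p_value

f_value, p_value = variance_ratio_test(data1, data2)
print("F-value: ", round(f_value, 4))
print("p-value: ", round(p_value, 4))

F-value:  0.25
p-value:  0.208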


Q2. Given a significance level of 0.05 and the degrees of freedom for the numerator and denominator of an 
F-distribution, write a Python function that returns the critical F-value for a two-tailed test.

In [2]:
from scipy.stats import f

num_df = 2
denom_df = 20
alpha = 0.05

def critical_f_value(num_df, denom_df, alpha=0.05):
    # Upper critical F-value, leaving alpha/2 in the upper tail
    f_crit = f.ppf(1 - alpha/2, num_df, denom_df)
    return f_crit

f_crit = critical_f_value(num_df, denom_df, alpha)
print("Critical F-value: ", f_crit)

Critical F-value:  4.461255495919247
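
A two-tailed variance-ratio test has a rejection region in each tail, so the upper critical value has a lower counterpart. A minimal sketch using the same degrees of freedom and significance level; the helper name `two_tailed_critical_values` is hypothetical and not part of the original answer:

```python
from scipy.stats import f

def two_tailed_critical_values(num_df, denom_df, alpha=0.05):
    """Return (lower, upper) critical F-values with alpha/2 in each tail."""
    lower = f.ppf(alpha / 2, num_df, denom_df)
    upper = f.ppf(1 - alpha / 2, num_df, denom_df)
    return lower, upper

lower, upper = two_tailed_critical_values(2, 20, 0.05)
print("Lower critical F-value:", lower)
print("Upper critical F-value:", upper)
```

The null hypothesis of equal variances is rejected when the observed F-value falls below the lower value or above the upper value.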


Q3. Write a Python program that generates random samples from two normal distributions with known 
variances and uses an F-test to determine if the variances are equal. The program should output the F-value, degrees of freedom, and p-value for the test.

In [3]:
import numpy as np
from scipy.stats import f

# Set random seed for reproducibility
np.random.seed(1234)

# Generate two normal distributions with known variances
mu1 = 5
mu2 = 5
sigma1 = 2
sigma2 = 3
n1 = 20
n2 = 25
data1 = np.random.normal(mu1, sigma1, n1)
data2 = np.random.normal(mu2, sigma2, n2)

# Calculate the F-value, degrees of freedom, and p-value for the F-test
f_value = np.var(data1, ddof=1) / np.var(data2, ddof=1)
dfn = n1 - 1
dfd = n2 - 1
p_value = 2 * min(f.cdf(f_value, dfn, dfd), 1 - f.cdf(f_value, dfn, dfd))

# Output the results
print("F-value: ", f_value)
print("Degrees of freedom (numerator, denominator): ", dfn, ", ", dfd)
print("p-value: ", p_value)


F-value:  0.6020513333352552
Degrees of freedom (numerator, denominator):  19 ,  24
p-value:  0.26230564743629287


Q4. The variances of two populations are known to be 10 and 15. A sample of 12 observations is taken from 
each population. Conduct an F-test at the 5% significance level to determine if the variances are 
significantly different.

In [4]:
import numpy as np
from scipy.stats import f

# Set the significance level and sample sizes
alpha = 0.05
n1 = 12
n2 = 12

# Set the known variances of the populations
var1 = 10
var2 = 15

# Generate random samples from the two populations
np.random.seed(1234)
sample1 = np.random.normal(0, np.sqrt(var1), n1)
sample2 = np.random.normal(0, np.sqrt(var2), n2)

# Calculate the sample variances
sample_var1 = np.var(sample1, ddof=1)
sample_var2 = np.var(sample2, ddof=1)

# Calculate the F-statistic
F = sample_var1 / sample_var2

# Calculate the critical value for the F-distribution
dfn = n1 - 1
dfd = n2 - 1
critical_value = f.ppf(1 - alpha/2, dfn, dfd)

# Calculate the p-value
p_value = 2 * min(f.cdf(F, dfn, dfd), 1 - f.cdf(F, dfn, dfd))

# Print the results
print("Sample variance of population 1:", sample_var1)
print("Sample variance of population 2:", sample_var2)
print("F-statistic:", F)
print("Critical value:", critical_value)
print("p-value:", p_value)

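# Check if the null hypothesis is rejected or not.
# The test is two-tailed, so the decision uses the two-tailed p-value
# rather than a comparison with the upper critical value only.
if p_value < alpha:
    print("Reject the null hypothesis that the variances are equal.")
else:
    print("Fail to reject the null hypothesis that the variances are equal.")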


Sample variance of population 1: 12.305836871110602
Sample variance of population 2: 13.93185376486066
F-statistic: 0.8832878293733425
Critical value: 3.473699051085809
p-value: 0.8406030712310796
Fail to reject the null hypothesis that the variances are equal.
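
The question states the two variances directly, so the test can also be run without simulating data. A sketch under the assumption that 10 and 15 are treated as sample variances observed from 12 observations each (if they were truly known population values, no test would be needed). The larger variance is placed in the numerator by convention:

```python
from scipy.stats import f

alpha = 0.05
n1, n2 = 12, 12
var1, var2 = 10, 15  # assumed here to be sample variances

# Put the larger variance in the numerator so that F >= 1
if var1 >= var2:
    F, dfn, dfd = var1 / var2, n1 - 1, n2 - 1
else:
    F, dfn, dfd = var2 / var1, n2 - 1, n1 - 1

# Upper critical value and two-tailed p-value
critical_value = f.ppf(1 - alpha / 2, dfn, dfd)
p_value = 2 * min(f.cdf(F, dfn, dfd), f.sf(F, dfn, dfd))

print("F-statistic:", F)
print("Upper critical value:", critical_value)
print("p-value:", p_value)
print("Reject H0" if p_value < alpha else "Fail to reject H0")
```

With F = 1.5 well below the upper critical value, this version also fails to reject the hypothesis of equal variances.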


Q5. A manufacturer claims that the variance of the diameter of a certain product is 0.005. A sample of 25 
products is taken, and the sample variance is found to be 0.006. Conduct an F-test at the 1% significance 
level to determine if the claim is justified.

In [5]:
from scipy.stats import chi2

# Set the significance level and sample size
alpha = 0.01
n = 25

# Set the claimed variance and sample variance
claimed_var = 0.005
sample_var = 0.006

# Ratio of the sample variance to the claimed variance
F = sample_var / claimed_var

# The claimed variance is a fixed constant, not an estimate, so the
# denominator has no finite degrees of freedom. The statistic
# (n - 1) * s^2 / sigma0^2 follows a chi-square distribution with n - 1
# degrees of freedom (the limit of the F-test as dfd goes to infinity).
df = n - 1
chi2_stat = df * F

# Calculate the critical value (upper tail)
critical_value = chi2.ppf(1 - alpha, df)

# Calculate the p-value
p_value = chi2.sf(chi2_stat, df)

# Print the results
print("Claimed variance:", claimed_var)
print("Sample variance:", sample_var)
print("Variance ratio:", round(F, 4))
print("Chi-square statistic:", round(chi2_stat, 2))
print("Critical value:", round(critical_value, 2))
print("p-value:", round(p_value, 2))

# Check if the null hypothesis is rejected or not
if chi2_stat > critical_value:
    print("Reject the null hypothesis that the variance is equal to the claimed variance.")
else:
    print("Fail to reject the null hypothesis that the variance is equal to the claimed variance.")


Claimed variance: 0.005
Sample variance: 0.006
Variance ratio: 1.2
Chi-square statistic: 28.8
Critical value: 42.98
p-value: 0.23
Fail to reject the null hypothesis that the variance is equal to the claimed variance.


Q6. Write a Python function that takes in the degrees of freedom for the numerator and denominator of an 
F-distribution and calculates the mean and variance of the distribution. The function should return the 
mean and variance as a tuple.

The mean and variance of an F-distribution with numerator degrees of freedom dfn and denominator degrees of freedom dfd have closed-form expressions:

    1.) Mean: dfd / (dfd - 2), defined for dfd > 2
    2.) Variance: 2 * dfd^2 * (dfn + dfd - 2) / (dfn * (dfd - 2)^2 * (dfd - 4)), defined for dfd > 4

The function below applies these formulas directly and returns the mean and variance as a tuple. Call it with the desired values of dfn and dfd to get the mean and variance of the corresponding F-distribution.

In [6]:
from scipy.stats import f

def f_dist_mean_var(dfn, dfd):
    """
    Calculates the mean and variance of an F-distribution given the degrees of freedom
    for the numerator and denominator.
    
    Parameters:
    dfn (int): degrees of freedom for the numerator
    dfd (int): degrees of freedom for the denominator
    
    Returns:
    tuple: mean and variance of the F-distribution
    """
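    # The variance is finite only for dfd > 4 (and the mean only for dfd > 2)
    if dfd <= 4:
        raise ValueError("dfd must be greater than 4 for the variance to exist")

    mean = dfd / (dfd - 2)
    var = (2 * dfd**2 * (dfn + dfd - 2)) / (dfn * (dfd - 2)**2 * (dfd - 4))

    return mean, var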
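
The closed-form values can be cross-checked against the moments SciPy reports for the same distribution. A minimal sketch, assuming `dfd > 4` so that both moments exist; the degrees of freedom used here (5 and 10) are illustrative and not taken from the question:

```python
from scipy.stats import f

dfn, dfd = 5, 10  # illustrative values; the variance requires dfd > 4

# Closed-form mean and variance of the F-distribution
mean = dfd / (dfd - 2)
var = (2 * dfd**2 * (dfn + dfd - 2)) / (dfn * (dfd - 2)**2 * (dfd - 4))

# The same two moments as computed by SciPy
scipy_mean, scipy_var = f.stats(dfn, dfd, moments="mv")

print("Formula:", mean, var)
print("SciPy:  ", float(scipy_mean), float(scipy_var))
```

The two lines should agree up to floating-point rounding.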


Q7. A random sample of 10 measurements is taken from a normal population with unknown variance. The 
sample variance is found to be 25. Another random sample of 15 measurements is taken from another 
normal population with unknown variance, and the sample variance is found to be 20. Conduct an F-test 
at the 10% significance level to determine if the variances are significantly different.

In [12]:
from scipy.stats import f

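n1 = 10
var1 = 25  # sample variance of the first sample
n2 = 15
var2 = 20  # sample variance of the second sample
alpha = 0.1
df1 = n1 - 1
df2 = n2 - 1

# Two-tailed F-test on the ratio of the sample variances
F = var1 / var2
p = 2 * min(f.cdf(F, df1, df2), 1 - f.cdf(F, df1, df2))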

if p < alpha:
    print("The variances are significantly different.")
else:
    print("The variances are not significantly different.")


The variances are not significantly different.
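
The same decision can be reached by comparing the F-statistic with the two critical values instead of using the p-value. A sketch of that approach with the numbers from the question; the variable names `lower` and `upper` are introduced here for illustration:

```python
from scipy.stats import f

var1, n1 = 25, 10  # sample variance and size of the first sample
var2, n2 = 20, 15  # sample variance and size of the second sample
alpha = 0.10

F = var1 / var2
df1, df2 = n1 - 1, n2 - 1

# Critical values with alpha/2 in each tail
lower = f.ppf(alpha / 2, df1, df2)
upper = f.ppf(1 - alpha / 2, df1, df2)

print("F-statistic:", F)
print("Lower critical value:", lower)
print("Upper critical value:", upper)

if F < lower or F > upper:
    print("The variances are significantly different.")
else:
    print("The variances are not significantly different.")
```

Because F = 1.25 lies between the two critical values, the conclusion matches the p-value approach.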


Q8. The following data represent the waiting times in minutes at two different restaurants on a Saturday 
night: Restaurant A: 24, 25, 28, 23, 22, 20, 27; Restaurant B: 31, 33, 35, 30, 32, 36. Conduct an F-test at the 5% significance level to determine if the variances are significantly different.

    We can conduct an F-test to test the equality of variances between the two groups using the following steps:

    1.) Calculate the sample variances for both groups.
    2.) Calculate the ratio of the sample variances, which follows an F-distribution with degrees of freedom (n1-1) and (n2-1), where n1 and n2 are the sample sizes for each group.
    3.) Calculate the p-value associated with the F-statistic using the F-distribution.
    4.) Compare the p-value to the significance level (0.05) and make a decision.

In [11]:
import numpy as np
from scipy.stats import f

# Data for Restaurant A and Restaurant B
a = np.array([24, 25, 28, 23, 22, 20, 27])
b = np.array([31, 33, 35, 30, 32, 36])

# Calculate sample variances
var_a = np.var(a, ddof=1)
var_b = np.var(b, ddof=1)

# Calculate the F-statistic and the two-tailed p-value
f_stat = var_a / var_b
dfn, dfd = len(a) - 1, len(b) - 1
p_val = 2 * min(f.cdf(f_stat, dfn, dfd), 1 - f.cdf(f_stat, dfn, dfd))

# Print the results
print("F-statistic:", f_stat)
print("p-value:", round(p_val, 4))

# Check if p-value is less than the significance level
if p_val < 0.05:
    print("Reject null hypothesis, variances are significantly different.")
else:
    print("Fail to reject null hypothesis, variances are not significantly different.")


F-statistic: 1.4551907719609583
p-value: 0.6975
Fail to reject null hypothesis, variances are not significantly different.
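
The four steps listed for this question can be collected in one reusable function. A sketch, where the name `f_test_variances` and its return layout are hypothetical; it assumes both samples come from normal populations, since the F-test is sensitive to departures from normality:

```python
import numpy as np
from scipy.stats import f

def f_test_variances(x, y, alpha=0.05):
    """Two-tailed F-test for equality of the variances of two samples."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Step 1: sample variances (ddof=1)
    var_x = np.var(x, ddof=1)
    var_y = np.var(y, ddof=1)
    # Step 2: variance ratio and its degrees of freedom
    f_stat = var_x / var_y
    dfn, dfd = len(x) - 1, len(y) - 1
    # Step 3: two-tailed p-value
    p_val = 2 * min(f.cdf(f_stat, dfn, dfd), f.sf(f_stat, dfn, dfd))
    # Step 4: decision at the given significance level
    return f_stat, p_val, bool(p_val < alpha)

f_stat, p_val, reject = f_test_variances(
    [24, 25, 28, 23, 22, 20, 27],
    [31, 33, 35, 30, 32, 36],
)
print("F-statistic:", f_stat)
print("p-value:", p_val)
print("Reject H0:", reject)
```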


Q9. The following data represent the test scores of two groups of students: Group A: 80, 85, 90, 92, 87, 83; 
Group B: 75, 78, 82, 79, 81, 84. Conduct an F-test at the 1% significance level to determine if the variances 
are significantly different.

In [13]:
from scipy.stats import f

groupA = [80, 85, 90, 92, 87, 83]
groupB = [75, 78, 82, 79, 81, 84]

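# Sample variances (divide by n - 1), computing each mean only once
n1 = len(groupA)
mean1 = sum(groupA) / n1
var1 = sum((x - mean1)**2 for x in groupA) / (n1 - 1)
n2 = len(groupB)
mean2 = sum(groupB) / n2
var2 = sum((x - mean2)**2 for x in groupB) / (n2 - 1)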

alpha = 0.01
df1 = n1 - 1
df2 = n2 - 1

F = var1 / var2
p = 2 * min(f.cdf(F, df1, df2), 1 - f.cdf(F, df1, df2))

if p < alpha:
    print("The variances are significantly different.")
else:
    print("The variances are not significantly different.")


The variances are not significantly different.
