Q1. Write a Python function that takes in two arrays of data and calculates the F-value for a variance ratio
test. The function should return the F-value and the corresponding p-value for the test.



In [1]:
import numpy as np
from scipy.stats import f

def variance_ratio_test(data1, data2):
    # Calculate the sample variances
    var1 = np.var(data1, ddof=1)
    var2 = np.var(data2, ddof=1)
    
    # Sample sizes
    n1 = len(data1)
    n2 = len(data2)
    
    # Calculate the F-value
    if var1 > var2:
        F = var1 / var2
        dfn = n1 - 1
        dfd = n2 - 1
    else:
        F = var2 / var1
        dfn = n2 - 1
        dfd = n1 - 1
    
    # Calculate the p-value
    p_value = 2 * min(f.cdf(F, dfn, dfd), 1 - f.cdf(F, dfn, dfd))
    
    return F, p_value

# Example usage
data1 = [10, 20, 30, 40, 50]
data2 = [15, 25, 35, 45, 55]

F_value, p_value = variance_ratio_test(data1, data2)
print(f"F-value: {F_value}, p-value: {p_value}")


F-value: 1.0, p-value: 1.0


Q2. Given a significance level of 0.05 and the degrees of freedom for the numerator and denominator of an
F-distribution, write a Python function that returns the critical F-value for a two-tailed test.

- Adjust the significance level to account for the two tails (i.e., use α/2 for each tail).
- Use the inverse cumulative distribution function (also known as the percent point function, ppf) to find the critical value for the upper tail.


In [2]:
from scipy.stats import f

def critical_f_value(alpha, dfn, dfd):
    # Adjust alpha for the two-tailed test
    alpha = alpha / 2
    
    # Calculate the critical F-value for the upper tail
    upper_critical_value = f.ppf(1 - alpha, dfn, dfd)
    
    # The F-distribution is not symmetric, but when the larger sample variance is
    # placed in the numerator, only the upper critical value is needed
    return upper_critical_value

# Example usage
alpha = 0.05
dfn = 5  # degrees of freedom for the numerator
dfd = 10  # degrees of freedom for the denominator

critical_value = critical_f_value(alpha, dfn, dfd)
print(f"Critical F-value for a two-tailed test: {critical_value}")


Critical F-value for a two-tailed test: 4.236085668188633
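
The function above returns only the upper critical value. When the F-statistic is not arranged so that the larger variance is in the numerator, the lower critical value is needed as well. A minimal sketch follows, assuming a hypothetical helper named `two_tailed_critical_values` (not part of the original solution). It also checks the reciprocal identity between the lower tail of F(dfn, dfd) and the upper tail of F(dfd, dfn).

```python
from scipy.stats import f

def two_tailed_critical_values(alpha, dfn, dfd):
    """Return (lower, upper) critical values of an equal-tailed F-test.

    Hypothetical helper for illustration; alpha is split evenly between tails.
    """
    lower = f.ppf(alpha / 2, dfn, dfd)
    upper = f.ppf(1 - alpha / 2, dfn, dfd)
    return lower, upper

lower, upper = two_tailed_critical_values(0.05, 5, 10)

# Reciprocal identity: the lower alpha/2 point of F(dfn, dfd) equals
# 1 / (upper alpha/2 point of F(dfd, dfn)), with the degrees of freedom swapped.
lower_via_reciprocal = 1 / f.ppf(1 - 0.05 / 2, 10, 5)

print(f"Lower critical value: {lower}")
print(f"Upper critical value: {upper}")
print(f"Lower via reciprocal identity: {lower_via_reciprocal}")
```

The two lower values should agree up to floating-point error, which is why a test that always puts the larger variance on top can work with the upper critical value alone.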


Q3. Write a Python program that generates random samples from two normal distributions with known
variances and uses an F-test to determine if the variances are equal. The program should output the
F-value, degrees of freedom, and p-value for the test.

In [3]:
import numpy as np
from scipy.stats import f

# Function to perform the F-test
def variance_ratio_test(data1, data2):
    # Calculate the sample variances
    var1 = np.var(data1, ddof=1)
    var2 = np.var(data2, ddof=1)
    
    # Sample sizes
    n1 = len(data1)
    n2 = len(data2)
    
    # Calculate the F-value
    if var1 > var2:
        F = var1 / var2
        dfn = n1 - 1
        dfd = n2 - 1
    else:
        F = var2 / var1
        dfn = n2 - 1
        dfd = n1 - 1
    
    # Calculate the p-value
    p_value = 2 * min(f.cdf(F, dfn, dfd), 1 - f.cdf(F, dfn, dfd))
    
    return F, dfn, dfd, p_value

# Parameters for the normal distributions
mean1, mean2 = 0, 0
variance1, variance2 = 1, 2
size1, size2 = 30, 30

# Generate random samples
data1 = np.random.normal(mean1, np.sqrt(variance1), size1)
data2 = np.random.normal(mean2, np.sqrt(variance2), size2)

# Perform the F-test
F_value, dfn, dfd, p_value = variance_ratio_test(data1, data2)

# Output the results
print(f"F-value: {F_value}")
print(f"Degrees of freedom (numerator): {dfn}")
print(f"Degrees of freedom (denominator): {dfd}")
print(f"P-value: {p_value}")


F-value: 2.251183215526366
Degrees of freedom (numerator): 29
Degrees of freedom (denominator): 29
P-value: 0.032537120640975914


Generate Random Samples:

- np.random.normal(mean, stddev, size) generates random samples from a normal distribution.
- We use the square root of the variances (np.sqrt(variance)) to get the standard deviations for the normal distributions.

F-test Function:

- variance_ratio_test calculates the F-value, the degrees of freedom, and the p-value.
- The sample variances are computed using np.var(data, ddof=1).
- The F-value is the ratio of the larger variance to the smaller variance.
- The p-value is calculated using the cumulative distribution function (CDF) of the F-distribution, considering a two-tailed test.
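
Because the samples are random, each run of the program prints different numbers. A minimal sketch of a reproducible variant follows, assuming NumPy's `default_rng` generator with a fixed seed. The seed value 42 and the helper name `equal_tailed_f_test` are illustrative choices, not taken from the original solution.

```python
import numpy as np
from scipy.stats import f

def equal_tailed_f_test(data1, data2):
    """Two-tailed F-test with the larger sample variance in the numerator."""
    var1 = np.var(data1, ddof=1)
    var2 = np.var(data2, ddof=1)
    if var1 > var2:
        F, dfn, dfd = var1 / var2, len(data1) - 1, len(data2) - 1
    else:
        F, dfn, dfd = var2 / var1, len(data2) - 1, len(data1) - 1
    cdf = f.cdf(F, dfn, dfd)
    return F, dfn, dfd, 2 * min(cdf, 1 - cdf)

# Fixed (arbitrary) seed so that reruns draw the same samples
rng = np.random.default_rng(42)
data1 = rng.normal(0, np.sqrt(1), 30)  # population variance 1
data2 = rng.normal(0, np.sqrt(2), 30)  # population variance 2

F_value, dfn, dfd, p_value = equal_tailed_f_test(data1, data2)
print(f"F-value: {F_value}, dfn: {dfn}, dfd: {dfd}, p-value: {p_value}")
```

With a fixed seed the printed values stay the same across runs, which makes the result easier to discuss and check.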


Q4. The variances of two populations are known to be 10 and 15. A sample of 12 observations is taken from
each population. Conduct an F-test at the 5% significance level to determine if the variances are
significantly different.

In [5]:
from scipy.stats import f

# Given variances and sample sizes
variance1 = 10
variance2 = 15
n1 = 12
n2 = 12

# Calculate the F-value
if variance1 > variance2:
    F = variance1 / variance2
    dfn = n1 - 1
    dfd = n2 - 1
else:
    F = variance2 / variance1
    dfn = n2 - 1
    dfd = n1 - 1

# Significance level
alpha = 0.05

# Calculate critical F-values for a two-tailed test
critical_value_lower = f.ppf(alpha / 2, dfn, dfd)
critical_value_upper = f.ppf(1 - alpha / 2, dfn, dfd)

# Calculate the p-value
p_value = 2 * min(f.cdf(F, dfn, dfd), 1 - f.cdf(F, dfn, dfd))

# Output the results
print(f"F-value: {F}")
print(f"Degrees of freedom (numerator): {dfn}")
print(f"Degrees of freedom (denominator): {dfd}")
print(f"Critical F-values: {critical_value_lower}, {critical_value_upper}")
print(f"P-value: {p_value}")

# Decision based on critical values
if F < critical_value_lower or F > critical_value_upper:
    print("Reject the null hypothesis: The variances are significantly different.")
else:
    print("Fail to reject the null hypothesis: The variances are not significantly different.")


F-value: 1.5
Degrees of freedom (numerator): 11
Degrees of freedom (denominator): 11
Critical F-values: 0.28787755798459863, 3.473699051085809
P-value: 0.5123897987357999
Fail to reject the null hypothesis: The variances are not significantly different.


Q5. A manufacturer claims that the variance of the diameter of a certain product is 0.005. A sample of 25
products is taken, and the sample variance is found to be 0.006. Conduct an F-test at the 1% significance
level to determine if the claim is justified.

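To test the manufacturer's claim with an F-test, we compare the sample variance to the claimed population variance. An F-test compares two variances, so this solution treats the claimed variance as if it were a second sample variance with the same degrees of freedom as the actual sample. This is a simplification: the claimed variance is a fixed value rather than an estimate, and a single variance is more commonly tested against a hypothesized value with a chi-square test.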

In [6]:
from scipy.stats import f

# Given values
claimed_variance = 0.005
sample_variance = 0.006
n = 25

# Calculate the F-value
F = sample_variance / claimed_variance

# Degrees of freedom
dfn = n - 1
dfd = n - 1

# Significance level
alpha = 0.01

# Critical values for a two-tailed test
critical_value_lower = f.ppf(alpha / 2, dfn, dfd)
critical_value_upper = f.ppf(1 - alpha / 2, dfn, dfd)

# Calculate the p-value
p_value = 2 * min(f.cdf(F, dfn, dfd), 1 - f.cdf(F, dfn, dfd))

# Output the results
print(f"F-value: {F}")
print(f"Degrees of freedom (numerator): {dfn}")
print(f"Degrees of freedom (denominator): {dfd}")
print(f"Critical values: {critical_value_lower}, {critical_value_upper}")
print(f"P-value: {p_value}")

# Decision based on critical values
if F < critical_value_lower or F > critical_value_upper:
    print("Reject the null hypothesis: The claimed variance is not justified.")
else:
    print("Fail to reject the null hypothesis: The claimed variance is justified.")


F-value: 1.2
Degrees of freedom (numerator): 24
Degrees of freedom (denominator): 24
Critical values: 0.3370701342685674, 2.966741631292762
P-value: 0.6587309365634488
Fail to reject the null hypothesis: The claimed variance is justified.


- Calculate the F-value: We calculate the F-value as the ratio of the sample variance to the claimed population variance.

- Degrees of Freedom: For both the sample and the claimed population, the degrees of freedom are n − 1 = 24.

- Critical Values: We use the ppf function from scipy.stats.f to get the critical values for the F-distribution at the 0.005 and 0.995 quantiles, corresponding to a two-tailed test with a 1% significance level.

- P-value: The p-value is calculated using the cumulative distribution function (CDF) of the F-distribution.

- Decision: We compare the calculated F-value to the critical values to determine whether to reject the null hypothesis.
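
A single variance is more commonly tested against a hypothesized value with a chi-square test: when the population is normal, the statistic (n − 1)s²/σ₀² follows a chi-square distribution with n − 1 degrees of freedom. A minimal sketch for comparison, using the same numbers as the question; it is not part of the original solution, and normality of the diameters is an assumption.

```python
from scipy.stats import chi2

# Values from the question
claimed_variance = 0.005
sample_variance = 0.006
n = 25
alpha = 0.01

# Test statistic: (n - 1) * s^2 / sigma0^2 ~ chi-square(n - 1) under H0
df = n - 1
statistic = df * sample_variance / claimed_variance

# Equal-tailed critical values and two-tailed p-value
critical_value_lower = chi2.ppf(alpha / 2, df)
critical_value_upper = chi2.ppf(1 - alpha / 2, df)
cdf = chi2.cdf(statistic, df)
p_value = 2 * min(cdf, 1 - cdf)

reject = bool(statistic < critical_value_lower or statistic > critical_value_upper)

print(f"Chi-square statistic: {statistic}")
print(f"Critical values: {critical_value_lower}, {critical_value_upper}")
print(f"P-value: {p_value}")
print("Reject the null hypothesis" if reject else "Fail to reject the null hypothesis")
```

In this sketch the statistic also falls between the two critical values, so the chi-square test agrees with the conclusion reached above.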

Q6. Write a Python function that takes in the degrees of freedom for the numerator and denominator of an
F-distribution and calculates the mean and variance of the distribution. The function should return the
mean and variance as a tuple.

In [10]:
def f_distribution_mean_variance(dfn, dfd):
    # Mean: dfd / (dfd - 2), finite only when dfd > 2
    if dfd > 2:
        mean = dfd / (dfd - 2)
    else:
        mean = float('inf')  # No finite mean for dfd <= 2; inf is returned as a placeholder
    
    # Variance: finite only when dfd > 4
    if dfd > 4:
        variance = (2 * dfd**2 * (dfn + dfd - 2)) / (dfn * (dfd - 2)**2 * (dfd - 4))
    else:
        variance = float('inf')  # No finite variance for dfd <= 4; inf is returned as a placeholder
    
    return (mean, variance)

# Example usage
dfn = 5  # degrees of freedom for the numerator
dfd = 10  # degrees of freedom for the denominator

result = f_distribution_mean_variance(dfn, dfd)
print(result)  # This will print the tuple containing mean and variance


(1.25, 1.3541666666666667)
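
As a sanity check, the closed-form expressions can be compared with the moments SciPy reports for the same distribution. A minimal sketch, assuming the same example degrees of freedom (5 and 10) and using `scipy.stats.f.stats` with `moments='mv'`:

```python
from scipy.stats import f

dfn, dfd = 5, 10

# Closed-form expressions (valid for dfd > 2 and dfd > 4 respectively)
mean_formula = dfd / (dfd - 2)
variance_formula = (2 * dfd**2 * (dfn + dfd - 2)) / (dfn * (dfd - 2)**2 * (dfd - 4))

# SciPy's own mean and variance for F(dfn, dfd)
mean_scipy, variance_scipy = f.stats(dfn, dfd, moments='mv')

print(f"Mean: formula={mean_formula}, scipy={float(mean_scipy)}")
print(f"Variance: formula={variance_formula}, scipy={float(variance_scipy)}")
```

Both pairs should match up to floating-point error.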


Q7. A random sample of 10 measurements is taken from a normal population with unknown variance. The
sample variance is found to be 25. Another random sample of 15 measurements is taken from another
normal population with unknown variance, and the sample variance is found to be 20. Conduct an F-test
at the 10% significance level to determine if the variances are significantly different.



In [11]:
from scipy.stats import f

# Given sample variances
s1_squared = 25
s2_squared = 20

# Sample sizes
n1 = 10
n2 = 15

# Calculate the F-value
F = s1_squared / s2_squared

# Degrees of freedom
dfn = n1 - 1
dfd = n2 - 1

# Significance level
alpha = 0.10

# Critical values for a two-tailed test
critical_value_lower = f.ppf(alpha / 2, dfn, dfd)
critical_value_upper = f.ppf(1 - alpha / 2, dfn, dfd)

# Output the results
print(f"F-value: {F}")
print(f"Degrees of freedom (numerator): {dfn}")
print(f"Degrees of freedom (denominator): {dfd}")
print(f"Critical values: {critical_value_lower}, {critical_value_upper}")

# Decision based on critical values
if F < critical_value_lower or F > critical_value_upper:
    print("Reject the null hypothesis: The variances are significantly different.")
else:
    print("Fail to reject the null hypothesis: The variances are not significantly different.")


F-value: 1.25
Degrees of freedom (numerator): 9
Degrees of freedom (denominator): 14
Critical values: 0.3305268601412525, 2.6457907352338195
Fail to reject the null hypothesis: The variances are not significantly different.
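
Unlike the earlier solutions, this one reports only the critical values. The same decision can be expressed through a p-value; a minimal sketch, reusing the two-tailed formula from the earlier answers with the numbers from this question:

```python
from scipy.stats import f

# Values from the question
F = 25 / 20
dfn, dfd = 10 - 1, 15 - 1
alpha = 0.10

# Two-tailed p-value: twice the smaller tail probability
cdf = f.cdf(F, dfn, dfd)
p_value = 2 * min(cdf, 1 - cdf)

print(f"P-value: {p_value}")
if p_value < alpha:
    print("Reject the null hypothesis: The variances are significantly different.")
else:
    print("Fail to reject the null hypothesis: The variances are not significantly different.")
```

Comparing the p-value with alpha is equivalent to checking whether F lies outside the two critical values, so both approaches give the same decision.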


Q8. The following data represent the waiting times in minutes at two different restaurants on a Saturday
night: Restaurant A: 24, 25, 28, 23, 22, 20, 27; Restaurant B: 31, 33, 35, 30, 32, 36. Conduct an F-test at the 5%
significance level to determine if the variances are significantly different.


In [12]:
import numpy as np
from scipy.stats import f

# Waiting times data
waiting_times_A = [24, 25, 28, 23, 22, 20, 27]
waiting_times_B = [31, 33, 35, 30, 32, 36]

# Sample variances (ddof=1 gives the unbiased sample variance)
variance_A = np.var(waiting_times_A, ddof=1)
variance_B = np.var(waiting_times_B, ddof=1)

# Sample sizes
n_A = len(waiting_times_A)
n_B = len(waiting_times_B)

# Calculate the F-value
F = variance_A / variance_B

# Degrees of freedom
dfn = n_A - 1
dfd = n_B - 1

# Significance level
alpha = 0.05

# Critical values for a two-tailed test
critical_value_lower = f.ppf(alpha / 2, dfn, dfd)
critical_value_upper = f.ppf(1 - alpha / 2, dfn, dfd)

# Output the results
print(f"F-value: {F}")
print(f"Degrees of freedom (numerator): {dfn}")
print(f"Degrees of freedom (denominator): {dfd}")
print(f"Critical values: {critical_value_lower}, {critical_value_upper}")

# Decision based on critical values
if F < critical_value_lower or F > critical_value_upper:
    print("Reject the null hypothesis: The variances are significantly different.")
else:
    print("Fail to reject the null hypothesis: The variances are not significantly different.")


F-value: 1.4551907719609583
Degrees of freedom (numerator): 6
Degrees of freedom (denominator): 5
Critical values: 0.16701279718024772, 6.977701858535566
Fail to reject the null hypothesis: The variances are not significantly different.


Q9. The following data represent the test scores of two groups of students: Group A: 80, 85, 90, 92, 87, 83;
Group B: 75, 78, 82, 79, 81, 84. Conduct an F-test at the 1% significance level to determine if the variances
are significantly different.

In [13]:
import numpy as np
from scipy.stats import f

# Test scores data
test_scores_A = [80, 85, 90, 92, 87, 83]
test_scores_B = [75, 78, 82, 79, 81, 84]

# Sample variances (ddof=1 gives the unbiased sample variance)
variance_A = np.var(test_scores_A, ddof=1)
variance_B = np.var(test_scores_B, ddof=1)

# Sample sizes
n_A = len(test_scores_A)
n_B = len(test_scores_B)

# Calculate the F-value
F = variance_A / variance_B

# Degrees of freedom
dfn = n_A - 1
dfd = n_B - 1

# Significance level
alpha = 0.01

# Critical values for a two-tailed test
critical_value_lower = f.ppf(alpha / 2, dfn, dfd)
critical_value_upper = f.ppf(1 - alpha / 2, dfn, dfd)

# Output the results
print(f"F-value: {F}")
print(f"Degrees of freedom (numerator): {dfn}")
print(f"Degrees of freedom (denominator): {dfd}")
print(f"Critical values: {critical_value_lower}, {critical_value_upper}")

# Decision based on critical values
if F < critical_value_lower or F > critical_value_upper:
    print("Reject the null hypothesis: The variances are significantly different.")
else:
    print("Fail to reject the null hypothesis: The variances are not significantly different.")


F-value: 1.9442622950819677
Degrees of freedom (numerator): 5
Degrees of freedom (denominator): 5
Critical values: 0.06693617195469603, 14.939605459912219
Fail to reject the null hypothesis: The variances are not significantly different.
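
The last two solutions repeat the same steps on different data. A minimal sketch of how they could be folded into one reusable function, assuming a hypothetical helper named `f_test_from_samples` (not part of the original solutions):

```python
import numpy as np
from scipy.stats import f

def f_test_from_samples(sample_a, sample_b, alpha):
    """Equal-tailed F-test of var(sample_a) / var(sample_b).

    Hypothetical helper for illustration. Returns the F-value, the degrees
    of freedom, the critical values, and whether H0 is rejected.
    """
    F = np.var(sample_a, ddof=1) / np.var(sample_b, ddof=1)
    dfn, dfd = len(sample_a) - 1, len(sample_b) - 1
    lower = f.ppf(alpha / 2, dfn, dfd)
    upper = f.ppf(1 - alpha / 2, dfn, dfd)
    reject = bool(F < lower or F > upper)
    return F, (dfn, dfd), (lower, upper), reject

# Restaurant waiting times at the 5% level
restaurants = f_test_from_samples(
    [24, 25, 28, 23, 22, 20, 27], [31, 33, 35, 30, 32, 36], alpha=0.05
)
# Test scores at the 1% level
scores = f_test_from_samples(
    [80, 85, 90, 92, 87, 83], [75, 78, 82, 79, 81, 84], alpha=0.01
)

for name, (F, dfs, crit, reject) in [("Restaurants", restaurants), ("Scores", scores)]:
    decision = "Reject" if reject else "Fail to reject"
    print(f"{name}: F={F}, df={dfs}, critical values={crit}, decision: {decision} H0")
```

Keeping the test in one function avoids copy-paste differences between cells, such as a missing import or a mistyped significance level.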
