Q1. Write a Python function that takes in two arrays of data and calculates the F-value for a variance ratio test. The function should return the F-value and the corresponding p-value for the test.

In [2]:
import numpy as np
from scipy import stats

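def variance_ratio_test(data1, data2):
    """Return the F-value and the one-sided (upper-tail) p-value of a variance ratio test."""
    var1 = np.var(data1, ddof=1)  # ddof=1 gives the unbiased sample variance
    var2 = np.var(data2, ddof=1)

    # Put the larger sample variance in the numerator so that F >= 1
    if var1 < var2:
        var1, var2 = var2, var1
        data1, data2 = data2, data1

    F = var1 / var2
    df1 = len(data1) - 1  # numerator degrees of freedom
    df2 = len(data2) - 1  # denominator degrees of freedom

    # Upper-tail probability P(F(df1, df2) >= F); double it for a two-sided test
    p_value = 1 - stats.f.cdf(F, df1, df2)

    return F, p_value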

data1 = [10, 12, 15, 14, 11, 13]
data2 = [8, 9, 10, 11, 9, 10]
F_value, p_value = variance_ratio_test(data1, data2)
print(f"F-value: {F_value}")
print(f"P-value: {p_value}")

F-value: 3.1818181818181817
P-value: 0.11486261035506617
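
The function above reports an upper-tail p-value. A minimal sketch of a two-sided variant follows, assuming the alternative hypothesis is simply that the two variances differ. The name `two_sided_variance_ratio_test` is hypothetical and not part of the original answer.

```python
import numpy as np
from scipy import stats


def two_sided_variance_ratio_test(data1, data2):
    """Return (F, two-sided p-value) for H0: equal variances (hypothetical helper)."""
    F = np.var(data1, ddof=1) / np.var(data2, ddof=1)
    df1 = len(data1) - 1
    df2 = len(data2) - 1
    # Double the smaller of the two tail areas, capped at 1
    lower_tail = stats.f.cdf(F, df1, df2)
    upper_tail = stats.f.sf(F, df1, df2)
    p_value = min(1.0, 2 * min(lower_tail, upper_tail))
    return F, p_value


data1 = [10, 12, 15, 14, 11, 13]
data2 = [8, 9, 10, 11, 9, 10]
F_value, p_two_sided = two_sided_variance_ratio_test(data1, data2)
print(f"F-value: {F_value:.4f}")
print(f"Two-sided p-value: {p_two_sided:.4f}")
```

With this formulation the order of the two samples does not affect the p-value, so no swap of numerator and denominator is needed.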


Q2. Given a significance level of 0.05 and the degrees of freedom for the numerator and denominator of an F-distribution, write a Python function that returns the critical F-value for a two-tailed test.

In [3]:
from scipy import stats

def get_critical_f_value(alpha, dfn, dfd):
    """Return the upper critical F-value for a two-tailed test (alpha/2 in each tail)."""
    f_critical = stats.f.ppf(1 - alpha / 2, dfn, dfd)

    return f_critical

alpha = 0.05
dfn = 3  # Degrees of freedom for the numerator
dfd = 20  # Degrees of freedom for the denominator
critical_f_value = get_critical_f_value(alpha, dfn, dfd)
print(f"Critical F-value: {critical_f_value}")

Critical F-value: 3.8586986662732143
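
A two-tailed test also has a lower critical value. The sketch below returns both bounds; the helper name `two_tailed_critical_values` is an assumption introduced for illustration. It also checks the reciprocal identity between the two tails, which holds only when the degrees of freedom are swapped.

```python
from scipy import stats


def two_tailed_critical_values(alpha, dfn, dfd):
    """Return (lower, upper) critical F-values with alpha/2 in each tail."""
    lower = stats.f.ppf(alpha / 2, dfn, dfd)
    upper = stats.f.ppf(1 - alpha / 2, dfn, dfd)
    return lower, upper


lower, upper = two_tailed_critical_values(0.05, 3, 20)
print(f"Lower critical F-value: {lower:.4f}")
print(f"Upper critical F-value: {upper:.4f}")

# The lower bound equals the reciprocal of the upper quantile with the
# degrees of freedom swapped, not the reciprocal of `upper` itself
swapped_upper = stats.f.ppf(1 - 0.05 / 2, 20, 3)
print(f"1 / F(0.975; 20, 3): {1 / swapped_upper:.4f}")
```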


Q3. Write a Python program that generates random samples from two normal distributions with known variances and uses an F-test to determine if the variances are equal. The program should output the F-value, degrees of freedom, and p-value for the test.

In [4]:
import numpy as np
from scipy import stats

np.random.seed(0)

variance1 = 1.5  
variance2 = 1.0  
sample_size1 = 30  
sample_size2 = 40  

data1 = np.random.normal(loc=0, scale=np.sqrt(variance1), size=sample_size1)
data2 = np.random.normal(loc=0, scale=np.sqrt(variance2), size=sample_size2)

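# Ratio of the unbiased sample variances
f_statistic = np.var(data1, ddof=1) / np.var(data2, ddof=1)
df1 = sample_size1 - 1
df2 = sample_size2 - 1
# One-sided (upper-tail) p-value for H1: variance1 > variance2
p_value = 1 - stats.f.cdf(f_statistic, df1, df2)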

print("F-value:", f_statistic)
print("Degrees of freedom (numerator, denominator):", df1, df2)
print("P-value:", p_value)

F-value: 2.3921864858565463
Degrees of freedom (numerator, denominator): 29 39
P-value: 0.005733556290811737


Q4. The variances of two populations are known to be 10 and 15. A sample of 12 observations is taken from each population. Conduct an F-test at the 5% significance level to determine if the variances are significantly different.

Ans. To conduct an F-test to determine if the variances of two populations are significantly different, you can use the following steps and the provided information:

Define your null and alternative hypotheses:

Null Hypothesis (H0): The variances of the two populations are equal.
Alternative Hypothesis (H1): The variances of the two populations are not equal.
Set the significance level (alpha) to 0.05.

Calculate the F-statistic and degrees of freedom for the F-test using the given variances and sample sizes:

Population 1 variance (var1) = 10
Population 2 variance (var2) = 15
Sample size for both populations (n1 and n2) = 12
You can use the formula for the F-statistic:
F = var1 / var2 = 10/15 ≈ 0.667

Calculate the degrees of freedom for the F-test:

Degrees of freedom for Population 1 (df1) = n1 - 1 = 12 - 1 = 11
Degrees of freedom for Population 2 (df2) = n2 - 1 = 12 - 1 = 11
Use the F-distribution with df1 and df2 degrees of freedom to find the critical F-value for a two-tailed test at the 5% significance level.

Compare the calculated F-statistic to the critical F-value:

If the calculated F-statistic is greater than the upper critical value F(1 - alpha/2; df1, df2) or less than the lower critical value F(alpha/2; df1, df2), you can reject the null hypothesis, indicating that the variances are significantly different. Because F = 10/15 is below 1 here, the lower critical value is the relevant bound.
If the calculated F-statistic lies between the two critical values, you fail to reject the null hypothesis, suggesting that there is no significant difference in variances.
Output the results:

F-statistic
Critical F-value
Decision: Reject or fail to reject the null hypothesis

In [5]:
from scipy import stats

var1 = 10
var2 = 15
n1 = 12
n2 = 12
alpha = 0.05

F_statistic = var1 / var2

df1 = n1 - 1
df2 = n2 - 1

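# Two-tailed test: alpha/2 in each tail
critical_f_value = stats.f.ppf(1 - alpha / 2, df1, df2)    # upper critical value
critical_f_value_lower = stats.f.ppf(alpha / 2, df1, df2)  # lower critical value

# F < 1 here, so the lower bound has to be checked as well
if F_statistic > critical_f_value or F_statistic < critical_f_value_lower:
    decision = "Reject the null hypothesis"
else:
    decision = "Fail to reject the null hypothesis"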

print("F-statistic:", F_statistic)
print("Critical F-value:", critical_f_value)
print("Decision:", decision)

F-statistic: 0.6666666666666666
Critical F-value: 3.473699051085809
Decision: Fail to reject the null hypothesis
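
The same decision can be reached from a p-value instead of critical values. A minimal sketch, assuming a two-sided alternative and treating the given variances as sample variances; `f_test_from_summary` is a hypothetical helper name.

```python
from scipy import stats


def f_test_from_summary(var1, n1, var2, n2):
    """Two-sided F-test from sample variances and sample sizes (hypothetical helper)."""
    F = var1 / var2
    df1, df2 = n1 - 1, n2 - 1
    # Double the smaller tail area, capped at 1
    p_value = min(1.0, 2 * min(stats.f.cdf(F, df1, df2), stats.f.sf(F, df1, df2)))
    return F, df1, df2, p_value


alpha = 0.05
F, df1, df2, p_value = f_test_from_summary(10, 12, 15, 12)
print(f"F = {F:.4f} with ({df1}, {df2}) degrees of freedom")
print(f"Two-sided p-value: {p_value:.4f}")
print("Reject H0" if p_value < alpha else "Fail to reject H0")
```

Rejecting when the p-value is below alpha is equivalent to rejecting when F falls outside the interval between the lower and upper critical values.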


Q5. A manufacturer claims that the variance of the diameter of a certain product is 0.005. A sample of 25 products is taken, and the sample variance is found to be 0.006. Conduct an F-test at the 1% significance level to determine if the claim is justified.

Ans. To conduct an F-test to determine if the manufacturer's claim about the variance of the product's diameter is justified, you can follow these steps:

Define your null and alternative hypotheses:

Null Hypothesis (H0): The population variance is equal to the claimed value of 0.005.
Alternative Hypothesis (H1): The population variance is not equal to 0.005.
Set the significance level (alpha) to 0.01 (1%).

Calculate the F-statistic and degrees of freedom for the F-test using the given sample variance and sample size:

Claimed population variance (σ^2) = 0.005
Sample variance (s^2) = 0.006
Sample size (n) = 25
You can use the formula for the F-statistic:
F = (s^2)/(σ^2) = 0.006/0.005 = 1.2

Calculate the degrees of freedom for the F-test:

Degrees of freedom for the numerator (df1) = n - 1 = 25 - 1 = 24
Degrees of freedom for the denominator (df2) = infinity, because the claimed variance is a fixed value rather than an estimate from a second sample. In that limit, F(df1, infinity) is a chi-square distribution with df1 degrees of freedom divided by df1, so this test is equivalent to the chi-square test for a single variance.
Use this distribution to find the lower and upper critical F-values for a two-tailed test at the 1% significance level.

Compare the calculated F-statistic to the critical F-values:

If the calculated F-statistic is greater than the upper critical value or less than the lower critical value, you can reject the null hypothesis, indicating that the manufacturer's claim is not justified.
If the calculated F-statistic lies between the two critical values, you fail to reject the null hypothesis: the data are consistent with the manufacturer's claim.


In [7]:
from scipy import stats

claimed_variance = 0.005
sample_variance = 0.006
sample_size = 25
alpha = 0.01

F_statistic = sample_variance / claimed_variance

df1 = sample_size - 1  # df2 is infinite: the claimed variance is not estimated

# F(df1, infinity) is chi-square(df1) / df1, so scale the chi-square quantiles
critical_f_value_lower = stats.chi2.ppf(alpha / 2, df1) / df1
critical_f_value_upper = stats.chi2.ppf(1 - alpha / 2, df1) / df1

if F_statistic < critical_f_value_lower or F_statistic > critical_f_value_upper:
    decision = "Reject the null hypothesis"
else:
    decision = "Fail to reject the null hypothesis"

print(f"F-statistic: {F_statistic:.2f}")
print(f"Critical F-value (Lower): {critical_f_value_lower:.2f}")
print(f"Critical F-value (Upper): {critical_f_value_upper:.2f}")
print("Decision:", decision)

F-statistic: 1.20
Critical F-value (Lower): 0.41
Critical F-value (Upper): 1.90
Decision: Fail to reject the null hypothesis
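
A test of one sample variance against a claimed value is more commonly written as a chi-square test with statistic (n - 1) * s^2 / sigma0^2. A minimal sketch under that formulation, using the numbers from the question; the variable names are illustrative.

```python
from scipy import stats

claimed_variance = 0.005
sample_variance = 0.006
sample_size = 25
alpha = 0.01

df = sample_size - 1
# Test statistic: (n - 1) * s^2 / sigma0^2, chi-square with n - 1 df under H0
chi2_statistic = df * sample_variance / claimed_variance

# Two-sided p-value: double the smaller tail area, capped at 1
p_value = min(1.0, 2 * min(stats.chi2.cdf(chi2_statistic, df),
                           stats.chi2.sf(chi2_statistic, df)))

print(f"Chi-square statistic: {chi2_statistic:.2f}")
print(f"Two-sided p-value: {p_value:.4f}")
print("Reject H0" if p_value < alpha else "Fail to reject H0")
```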


Q6. Write a Python function that takes in the degrees of freedom for the numerator and denominator of an F-distribution and calculates the mean and variance of the distribution. The function should return the mean and variance as a tuple.

In [8]:
def f_distribution_mean_variance(dfn, dfd):
    """Return (mean, variance) of the F-distribution with dfn and dfd degrees of freedom."""
    if dfn <= 0 or dfd <= 0:
        raise ValueError("Degrees of freedom must be greater than 0.")

    # The mean depends only on dfd and is finite only for dfd > 2
    if dfd > 2:
        mean = float(dfd) / (dfd - 2)
    else:
        mean = float("inf")

    # The variance is finite only for dfd > 4 (infinite or undefined otherwise)
    if dfd > 4:
        variance = (2 * (dfd**2 * (dfn + dfd - 2))) / (dfn * (dfd - 2)**2 * (dfd - 4))
    else:
        variance = float("inf")

    return mean, variance

dfn = 5 
dfd = 10  
mean, variance = f_distribution_mean_variance(dfn, dfd)
print(f"Mean: {mean}")
print(f"Variance: {variance}")

Mean: 1.25
Variance: 1.3541666666666667
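
As a sanity check, the closed-form values can be compared with the moments SciPy reports for the same distribution. This is a small sketch that uses `scipy.stats.f.stats` for comparison only; the variable names are illustrative.

```python
from scipy import stats

dfn, dfd = 5, 10

# Closed-form moments (the mean needs dfd > 2, the variance needs dfd > 4)
mean_formula = dfd / (dfd - 2)
var_formula = 2 * dfd**2 * (dfn + dfd - 2) / (dfn * (dfd - 2) ** 2 * (dfd - 4))

# Moments reported by SciPy for F(dfn, dfd)
mean_scipy, var_scipy = stats.f.stats(dfn, dfd, moments="mv")

print(f"Mean: formula {mean_formula:.4f}, SciPy {float(mean_scipy):.4f}")
print(f"Variance: formula {var_formula:.4f}, SciPy {float(var_scipy):.4f}")
```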


Q7. A random sample of 10 measurements is taken from a normal population with unknown variance. The sample variance is found to be 25. Another random sample of 15 measurements is taken from another normal population with unknown variance, and the sample variance is found to be 20. Conduct an F-test at the 10% significance level to determine if the variances are significantly different.

Ans. To conduct an F-test to determine if the variances of two populations are significantly different, you can follow these steps:

Define your null and alternative hypotheses:

Null Hypothesis (H0): The variances of the two populations are equal.
Alternative Hypothesis (H1): The variances of the two populations are not equal.
Set the significance level (alpha) to 0.10 (10%).

Calculate the F-statistic and degrees of freedom for the F-test using the sample variances and sample sizes:

Sample variance for the first population (s1^2) = 25

Sample size for the first population (n1) = 10

Sample variance for the second population (s2^2) = 20

Sample size for the second population (n2) = 15

You can use the formula for the F-statistic:

F = (s1^2)/(s2^2) = 25/20 = 1.25

Calculate the degrees of freedom for the F-test:

Degrees of freedom for the numerator (df1) = n1 - 1 = 10 - 1 = 9

Degrees of freedom for the denominator (df2) = n2 - 1 = 15 - 1 = 14

Use the F-distribution with df1 and df2 degrees of freedom to find the critical F-value for a two-tailed test at the 10% significance level.

Compare the calculated F-statistic to the critical F-value:
If the calculated F-statistic is greater than the upper critical value F(1 - alpha/2; df1, df2) or less than the lower critical value F(alpha/2; df1, df2), you can reject the null hypothesis, indicating that the variances are significantly different. The two critical values are reciprocals of each other only when df1 = df2, which is not the case here.
If the calculated F-statistic lies between the two critical values, you fail to reject the null hypothesis, suggesting that there is no significant difference in variances.

In [9]:
from scipy import stats

sample_variance1 = 25
sample_size1 = 10
sample_variance2 = 20
sample_size2 = 15
alpha = 0.10

F_statistic = sample_variance1 / sample_variance2

df1 = sample_size1 - 1
df2 = sample_size2 - 1

critical_f_value_lower = stats.f.ppf(alpha / 2, df1, df2)
# The upper bound must come from the same F(df1, df2) distribution;
# 1 / lower would give a quantile of F(df2, df1) instead
critical_f_value_upper = stats.f.ppf(1 - alpha / 2, df1, df2)

if F_statistic < critical_f_value_lower or F_statistic > critical_f_value_upper:
    decision = "Reject the null hypothesis"
else:
    decision = "Fail to reject the null hypothesis"

print(f"F-statistic: {F_statistic:.2f}")
print(f"Critical F-value (Lower): {critical_f_value_lower:.2f}")
print(f"Critical F-value (Upper): {critical_f_value_upper:.2f}")
print("Decision:", decision)

F-statistic: 1.25
Critical F-value (Lower): 0.33
Critical F-value (Upper): 2.65
Decision: Fail to reject the null hypothesis


Q8. The following data represent the waiting times in minutes at two different restaurants on a Saturday night: Restaurant A: 24, 25, 28, 23, 22, 20, 27; Restaurant B: 31, 33, 35, 30, 32, 36. Conduct an F-test at the 5% significance level to determine if the variances are significantly different.

Ans. Define your null and alternative hypotheses:

Null Hypothesis (H0): The variances of the waiting times at both restaurants are equal.
Alternative Hypothesis (H1): The variances of the waiting times at both restaurants are not equal.
Set the significance level (alpha) to 0.05 (5%).

Calculate the F-statistic and degrees of freedom for the F-test using the sample variances and sample sizes:

Waiting times at Restaurant A: 24, 25, 28, 23, 22, 20, 27
Waiting times at Restaurant B: 31, 33, 35, 30, 32, 36
You need to calculate the sample variances and degrees of freedom for both samples.

Use the formula for the F-statistic:
F = (s1^2)/(s2^2)

Calculate the degrees of freedom for the F-test:

Degrees of freedom for the numerator (df1) = number of observations for Restaurant A minus 1 = 7 - 1 = 6
Degrees of freedom for the denominator (df2) = number of observations for Restaurant B minus 1 = 6 - 1 = 5
Use the F-distribution with df1 and df2 degrees of freedom to find the critical F-value for a two-tailed test at the 5% significance level.

Compare the calculated F-statistic to the critical F-value:

If the calculated F-statistic is greater than the upper critical value F(1 - alpha/2; df1, df2) or less than the lower critical value F(alpha/2; df1, df2), you can reject the null hypothesis, indicating that the variances are significantly different. The two critical values are reciprocals of each other only when df1 = df2, which is not the case here.
If the calculated F-statistic lies between the two critical values, you fail to reject the null hypothesis, suggesting that there is no significant difference in variances.

In [10]:
import numpy as np
from scipy import stats

waiting_times_A = [24, 25, 28, 23, 22, 20, 27]
waiting_times_B = [31, 33, 35, 30, 32, 36]
alpha = 0.05

variance_A = np.var(waiting_times_A, ddof=1)
variance_B = np.var(waiting_times_B, ddof=1)
F_statistic = variance_A / variance_B

df1 = len(waiting_times_A) - 1
df2 = len(waiting_times_B) - 1

critical_f_value_lower = stats.f.ppf(alpha / 2, df1, df2)
# The upper bound must come from the same F(df1, df2) distribution;
# 1 / lower would give a quantile of F(df2, df1) instead
critical_f_value_upper = stats.f.ppf(1 - alpha / 2, df1, df2)

if F_statistic < critical_f_value_lower or F_statistic > critical_f_value_upper:
    decision = "Reject the null hypothesis"
else:
    decision = "Fail to reject the null hypothesis"

print(f"F-statistic: {F_statistic:.2f}")
print(f"Critical F-value (Lower): {critical_f_value_lower:.2f}")
print(f"Critical F-value (Upper): {critical_f_value_upper:.2f}")
print("Decision:", decision)

F-statistic: 1.46
Critical F-value (Lower): 0.17
Critical F-value (Upper): 6.98
Decision: Fail to reject the null hypothesis
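
The sample variances behind the F-statistic can be checked independently of NumPy. A short sketch using only the standard library, where `statistics.variance` applies the same n - 1 denominator as `np.var(..., ddof=1)`:

```python
from statistics import variance

waiting_times_A = [24, 25, 28, 23, 22, 20, 27]
waiting_times_B = [31, 33, 35, 30, 32, 36]

# statistics.variance divides by n - 1, i.e. it is the sample variance
variance_A = variance(waiting_times_A)
variance_B = variance(waiting_times_B)
F_statistic = variance_A / variance_B

print(f"Sample variance A: {variance_A:.4f}")
print(f"Sample variance B: {variance_B:.4f}")
print(f"F-statistic: {F_statistic:.4f}")
```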


Q9. The following data represent the test scores of two groups of students: Group A: 80, 85, 90, 92, 87, 83; Group B: 75, 78, 82, 79, 81, 84. Conduct an F-test at the 1% significance level to determine if the variances are significantly different.

Ans. Define your null and alternative hypotheses:

Null Hypothesis (H0): The variances of the test scores in both groups are equal.
Alternative Hypothesis (H1): The variances of the test scores in both groups are not equal.
Set the significance level (alpha) to 0.01 (1%).

Calculate the F-statistic and degrees of freedom for the F-test using the sample variances and sample sizes:

Test scores for Group A: 80, 85, 90, 92, 87, 83
Test scores for Group B: 75, 78, 82, 79, 81, 84
You need to calculate the sample variances and degrees of freedom for both groups.

Use the formula for the F-statistic:
F = (s1^2)/(s2^2)

Calculate the degrees of freedom for the F-test:

Degrees of freedom for the numerator (df1) = number of scores in Group A minus 1 = 6 - 1 = 5
Degrees of freedom for the denominator (df2) = number of scores in Group B minus 1 = 6 - 1 = 5
Use the F-distribution with df1 and df2 degrees of freedom to find the critical F-value for a two-tailed test at the 1% significance level.

Compare the calculated F-statistic to the critical F-value:

If the calculated F-statistic is greater than the critical F-value or less than its reciprocal, you can reject the null hypothesis, indicating that the variances are significantly different.
If the calculated F-statistic is between the critical F-value and its reciprocal, you fail to reject the null hypothesis, suggesting that there is no significant difference in variances.

In [None]:
import numpy as np
from scipy import stats

test_scores_A = [80, 85, 90, 92, 87, 83]
test_scores_B = [75, 78, 82, 79, 81, 84]
alpha = 0.01

variance_A = np.var(test_scores_A, ddof=1)
variance_B = np.var(test_scores_B, ddof=1)
F_statistic = variance_A / variance_B

df1 = len(test_scores_A) - 1
df2 = len(test_scores_B) - 1

critical_f_value_lower = stats.f.ppf(alpha / 2, df1, df2)
# Taken directly from F(df1, df2); equals 1 / lower here only because df1 == df2
critical_f_value_upper = stats.f.ppf(1 - alpha / 2, df1, df2)

if F_statistic < critical_f_value_lower or F_statistic > critical_f_value_upper:
    decision = "Reject the null hypothesis"
else:
    decision = "Fail to reject the null hypothesis"

print("F-statistic:", F_statistic)
print("Critical F-value (Lower):", critical_f_value_lower)
print("Critical F-value (Upper):", critical_f_value_upper)
print("Decision:", decision)