#### Q1. Write a Python function that takes in two arrays of data and calculates the F-value for a variance ratio test. The function should return the F-value and the corresponding p-value for the test.

In [1]:
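import numpy as np
from scipy import stats

def variance_ratio_test(data1, data2):
    # F is the ratio of the two unbiased sample variances
    # (stats.f_oneway is one-way ANOVA, which compares means, not variances)
    f_stat = np.var(data1, ddof=1) / np.var(data2, ddof=1)
    df1, df2 = len(data1) - 1, len(data2) - 1
    # Two-tailed p-value: twice the smaller tail area of the F-distribution
    cdf = stats.f.cdf(f_stat, df1, df2)
    p_value = 2 * min(cdf, 1 - cdf)
    return f_stat, p_value

data1 = [1, 2, 3, 4, 5]
data2 = [6, 7, 8, 9, 10]

f_stat, p_value = variance_ratio_test(data1, data2)
print(f"F-value: {f_stat:.4f}")
print(f"p-value: {p_value:.4f}")


F-value: 1.0000
p-value: 1.0000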


#### Q2. Given a significance level of 0.05 and the degrees of freedom for the numerator and denominator of an F-distribution, write a Python function that returns the critical F-value for a two-tailed test.

In [2]:
from scipy.stats import f

def critical_f_value(alpha, dfn, dfd):
    """
    Calculates the lower and upper critical F-values for a two-tailed test given a significance level
    and degrees of freedom for the numerator and denominator.
    
    Args:
    alpha (float): Significance level
    dfn (int): Degrees of freedom for the numerator
    dfd (int): Degrees of freedom for the denominator
    
    Returns:
    tuple: (lower critical F-value, upper critical F-value)
    """
    return f.ppf(alpha/2, dfn, dfd), f.ppf(1 - alpha/2, dfn, dfd)

# Define the degrees of freedom
dfn = 3
dfd = 12

# Set the significance level
alpha = 0.05

# Call the function to get both critical F-values
lower_crit, upper_crit = critical_f_value(alpha, dfn, dfd)

# Print the result
print(f"The critical F-values are {lower_crit:.4f} and {upper_crit:.4f}")


The critical F-values are 0.0698 and 4.4742


#### Q3. Write a Python program that generates random samples from two normal distributions with known variances and uses an F-test to determine if the variances are equal. The program should output the F-value, degrees of freedom, and p-value for the test.

In [3]:
import numpy as np
from scipy.stats import f

# Set random seed for reproducibility
np.random.seed(123)

# Generate random samples
n1, n2 = 50, 75  # sample sizes
mu1, mu2 = 10, 10  # means
var1, var2 = 4, 6  # known variances
data1 = np.random.normal(mu1, np.sqrt(var1), n1)
data2 = np.random.normal(mu2, np.sqrt(var2), n2)

# Calculate F-statistic and p-value
f_stat = np.var(data1, ddof=1) / np.var(data2, ddof=1)
df1, df2 = n1 - 1, n2 - 1
p_val = 2 * min(f.cdf(f_stat, df1, df2), 1 - f.cdf(f_stat, df1, df2))

# Output results
print("F-value:", f_stat)
print("Degrees of freedom:", df1, ",", df2)
print("p-value:", p_val)


F-value: 0.7792593732703956
Degrees of freedom: 49 , 74
p-value: 0.35403562472938904


#### Q4. The variances of two populations are known to be 10 and 15. A sample of 12 observations is taken from each population. Conduct an F-test at the 5% significance level to determine if the variances are significantly different.

Ans - Given,

Sample size from population 1 (n1) = 12
Sample size from population 2 (n2) = 12
Variance for population 1 (s1^2) = 10
Variance for population 2 (s2^2) = 15
Significance level (α) = 0.05

Treating the stated variances as the sample estimates s1^2 and s2^2, we can use an F-test to determine if the variances are significantly different.

The null and alternative hypotheses for the F-test are:

H0: σ1^2 = σ2^2 (The variances are equal)
Ha: σ1^2 ≠ σ2^2 (The variances are not equal)

The F-statistic can be calculated using the formula:

F = s1^2/s2^2

Under the null hypothesis, the F-statistic follows an F-distribution with degrees of freedom (df1, df2), where df1 = n1 - 1 and df2 = n2 - 1.

To find the critical F-value for a two-tailed test with α = 0.05 and df1 = 11 and df2 = 11, we can use the scipy.stats module in Python:

In [4]:
import scipy.stats as stats

alpha = 0.05
df1 = 11
df2 = 11
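# Lower and upper critical values for the two-tailed test
# (computed separately so the check also holds when df1 != df2)
crit_lower = stats.f.ppf(alpha/2, df1, df2)
crit_upper = stats.f.ppf(1 - alpha/2, df1, df2)
s1_sq = 10
s2_sq = 15
F = s1_sq/s2_sq
if F < crit_lower or F > crit_upper: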
    print("Reject the null hypothesis")
else:
    print("Fail to reject the null hypothesis")


Fail to reject the null hypothesis
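
The same decision can also be reached from a p-value instead of critical values. The following is a minimal sketch, assuming the two-tailed p-value is taken as twice the smaller tail area; the variable names `cdf`, `p_value` and `decision` are illustrative and not part of the original answer.

```python
from scipy import stats

alpha = 0.05
df1, df2 = 11, 11
F = 10 / 15  # ratio of the two variances given in the question

# Two-tailed p-value: twice the smaller of the two tail areas
cdf = stats.f.cdf(F, df1, df2)
p_value = 2 * min(cdf, 1 - cdf)

decision = "Reject" if p_value < alpha else "Fail to reject"
print(f"F = {F:.4f}, p-value = {p_value:.4f}")
print(f"{decision} the null hypothesis")
```

A p-value above α leads to the same "fail to reject" outcome as the critical-value comparison.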


#### Q5. A manufacturer claims that the variance of the diameter of a certain product is 0.005. A sample of 25 products is taken, and the sample variance is found to be 0.006. Conduct an F-test at the 1% significance level to determine if the claim is justified.

Ans - To conduct the F-test, we can use the following hypotheses:

Null hypothesis: The population variance of the diameter of the product is equal to 0.005.
Alternative hypothesis: The population variance of the diameter of the product is not equal to 0.005.

We can use the F-test formula to calculate the test statistic:

F = s^2 / σ^2

where s^2 is the sample variance and σ^2 is the claimed population variance.

For this problem, we have:

s^2 = 0.006
σ^2 = 0.005
n = 25

So, the F-value can be calculated as:

F = s^2 / σ^2 = 0.006 / 0.005 = 1.2

To determine if this F-value is significant at the 1% significance level, we need to find the critical F-values. Since the alternative hypothesis is "not equal", this is a two-tailed test, so we need critical values for both tails, each with area 0.005. Because the claimed variance is a fixed constant rather than an estimate, the denominator has infinite degrees of freedom: under the null hypothesis the ratio follows a chi-square distribution with n - 1 = 24 degrees of freedom divided by 24, that is, F(24, ∞).

Using a calculator or a table, the chi-square critical values for 24 degrees of freedom are about 9.886 and 45.559, which gives critical F-values of about 0.412 and 1.898.

Since our calculated F-value of 1.2 falls within this range, we fail to reject the null hypothesis. This means that we do not have sufficient evidence to conclude that the population variance is different from 0.005, so the sample gives no reason to doubt the manufacturer's claim.
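
A single variance compared against a claimed value is commonly tested with the equivalent chi-square statistic (n - 1) * s^2 / σ^2, which has n - 1 degrees of freedom under the null hypothesis. The sketch below shows one way to check the decision with scipy; the variable names are assumptions made for illustration, and the two-tailed p-value is taken as twice the smaller tail area.

```python
from scipy.stats import chi2

n = 25             # sample size
s_sq = 0.006       # sample variance
sigma0_sq = 0.005  # variance claimed by the manufacturer
alpha = 0.01

df = n - 1
# Chi-square statistic; it equals df times the ratio s^2 / sigma^2
chi2_stat = df * s_sq / sigma0_sq

# Two-tailed critical values, each tail holding alpha / 2
chi2_lower = chi2.ppf(alpha / 2, df)
chi2_upper = chi2.ppf(1 - alpha / 2, df)

# Two-tailed p-value: twice the smaller tail area
cdf = chi2.cdf(chi2_stat, df)
p_value = 2 * min(cdf, 1 - cdf)

reject = chi2_stat < chi2_lower or chi2_stat > chi2_upper
print(f"Chi-square statistic: {chi2_stat:.3f}")
print(f"Critical values: {chi2_lower:.3f}, {chi2_upper:.3f}")
print(f"p-value: {p_value:.4f}")
print("Reject the null hypothesis" if reject else "Fail to reject the null hypothesis")
```

Dividing the chi-square critical values by the degrees of freedom gives the corresponding bounds on the ratio s^2 / σ^2.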

#### Q6. Write a Python function that takes in the degrees of freedom for the numerator and denominator of an F-distribution and calculates the mean and variance of the distribution. The function should return the mean and variance as a tuple.

In [5]:
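def f_distribution_mean_var(df_n, df_d):
    # The mean exists only for df_d > 2 and the variance only for df_d > 4
    if df_d <= 4:
        raise ValueError("df_d must be greater than 4 for the variance to exist")
    mean = df_d / (df_d - 2)
    variance = (2 * df_d ** 2 * (df_n + df_d - 2)) / (df_n * (df_d - 2) ** 2 * (df_d - 4))
    return mean, variance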

mean, variance = f_distribution_mean_var(3, 16)
print("Mean:", mean)
print("Variance:", variance)


Mean: 1.1428571428571428
Variance: 1.2335600907029478
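
As a sanity check, the closed-form results can be compared with the moments that scipy reports for the same distribution. This is a small sketch using `scipy.stats.f.stats` with `moments="mv"`; the variable names are illustrative.

```python
from scipy.stats import f

dfn, dfd = 3, 16

# Mean and variance as computed by scipy for F(dfn, dfd)
mean_scipy, var_scipy = f.stats(dfn, dfd, moments="mv")

# Closed-form values (valid for dfd > 4)
mean_formula = dfd / (dfd - 2)
var_formula = (2 * dfd ** 2 * (dfn + dfd - 2)) / (dfn * (dfd - 2) ** 2 * (dfd - 4))

print("scipy  :", float(mean_scipy), float(var_scipy))
print("formula:", mean_formula, var_formula)
```

The two rows should agree up to floating-point rounding.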


#### Q7. A random sample of 10 measurements is taken from a normal population with unknown variance. The sample variance is found to be 25. Another random sample of 15 measurements is taken from another normal population with unknown variance, and the sample variance is found to be 20. Conduct an F-test at the 10% significance level to determine if the variances are significantly different.

Ans - Null Hypothesis: Variances of two populations are equal.

Alternative Hypothesis: Variances of two populations are not equal.

Significance level, α = 0.10

Degrees of freedom for sample 1 (n1 = 10) is n1 - 1 = 9.

Degrees of freedom for sample 2 (n2 = 15) is n2 - 1 = 14.

Test statistic, F = s1^2/s2^2 where s1^2 is the sample variance of the first sample and s2^2 is the sample variance of the second sample.

F = 25/20 = 1.25

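Because the test is two-tailed, each tail has area α/2 = 0.05. Using an F-table or statistical software, the upper critical F-value for df1 = 9 and df2 = 14 is approximately 2.65, and the lower critical F-value is approximately 0.33.

Since the calculated F-value (1.25) lies between the lower and upper critical values, we fail to reject the null hypothesis.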

Therefore, we do not have sufficient evidence to conclude that the variances of the two populations are significantly different at the 10% significance level.
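
The critical values can be computed with scipy rather than read from a table. The sketch below is one way to do it, assuming a two-tailed test with α/2 in each tail; the variable names are illustrative.

```python
from scipy.stats import f

alpha = 0.10
n1, n2 = 10, 15
s1_sq, s2_sq = 25, 20  # sample variances from the question

df1, df2 = n1 - 1, n2 - 1
f_stat = s1_sq / s2_sq

# Two-tailed critical values, each tail holding alpha / 2
lower_crit = f.ppf(alpha / 2, df1, df2)
upper_crit = f.ppf(1 - alpha / 2, df1, df2)

reject = f_stat < lower_crit or f_stat > upper_crit
print(f"F-statistic: {f_stat:.2f}")
print(f"Critical values: {lower_crit:.2f}, {upper_crit:.2f}")
print("Reject the null hypothesis" if reject else "Fail to reject the null hypothesis")
```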

#### Q8. The following data represent the waiting times in minutes at two different restaurants on a Saturday night: Restaurant A: 24, 25, 28, 23, 22, 20, 27; Restaurant B: 31, 33, 35, 30, 32, 36. Conduct an F-test at the 5% significance level to determine if the variances are significantly different.

In [7]:
import numpy as np
from scipy.stats import f

def perform_f_test(sample1:list, sample2:list, alpha=0.05):

    """
    This function will take two samples as list and perform F-test
    and print the results
    """
    
    # Calculating Variances of both samples
    var1 = np.var(sample1, ddof=1)
    var2 = np.var(sample2, ddof=1)

    # Null and Alternate hypothesis
    null_hypothesis = "Variances are similar"
    alternate_hypothesis =  "Variances are significantly different"

    # Printing sample mean and variance
    print(f'Sample 1 Mean : {np.mean(sample1):.4f}, Sample 1 Variance : {var1:.4f}')
    print(f'Sample 2 Mean : {np.mean(sample2):.4f}, Sample 2 Variance : {var2:.4f}')

    print('\n================================================================================\n')

    # Calculate F-statistic
    if var1 >= var2:
        f_statistic = var1/var2
        dfn = len(sample1)-1
        dfd = len(sample2)-1        
    else:
        f_statistic = var2/var1
        dfn = len(sample2)-1
        dfd = len(sample1)-1

    # Calculating the two-tailed p-value: the larger variance is in the numerator,
    # so the upper-tail area is doubled (and capped at 1)
    p_value = min(1.0, 2 * (1 - f.cdf(f_statistic, dfn, dfd)))

    # Calculate the upper F-critical value for a two-tailed test
    F_crit = f.ppf(1 - alpha/2, dfn, dfd)

    # print the results
    print(f"F-statistic: {f_statistic:.4f}")
    print(f"F Critical value: {F_crit:.3f}")
    print(f"P-value: {p_value:.3f}")
    print(f"Significance Level: {alpha}")
    
    # Determine if null hypothesis should be rejected
    if p_value < alpha:
        print("Reject null hypothesis.")
        print(f"Conclusion : {alternate_hypothesis}")
    else:
        print("FAIL to reject null hypothesis. ")
        print(f"Conclusion : {null_hypothesis}")

In [8]:
# Given Sample Data
restaurant_A = [24, 25, 28, 23, 22, 20, 27]
restaurant_B = [31, 33, 35, 30, 32, 36]

# Perform F-test by calling the function
perform_f_test(restaurant_A, restaurant_B, alpha=0.05)

Sample 1 Mean : 24.1429, Sample 1 Variance : 7.8095
Sample 2 Mean : 32.8333, Sample 2 Variance : 5.3667

================================================================================

F-statistic: 1.4552
F Critical value: 6.978
P-value: 0.697
Significance Level: 0.05
FAIL to reject null hypothesis. 
Conclusion : Variances are similar


#### Q9. The following data represent the test scores of two groups of students: Group A: 80, 85, 90, 92, 87, 83; Group B: 75, 78, 82, 79, 81, 84. Conduct an F-test at the 1% significance level to determine if the variances are significantly different.

Ans - To conduct an F-test to determine if the variances of two populations are significantly different, we need to follow these steps:

1. Calculate the sample variances of the two groups.
2. Calculate the F-value using the sample variances.
3. Calculate the p-value associated with the F-value.
4. Compare the p-value to the chosen significance level. If the p-value is less than the significance level, we reject the null hypothesis and conclude that the variances are significantly different.

Using Python, we can write a function to perform these steps:

In [9]:
import numpy as np
from scipy.stats import f

def f_test(data1, data2, alpha=0.01):
    # Step 1: Calculate the sample variances of the two groups
    var1 = np.var(data1, ddof=1)
    var2 = np.var(data2, ddof=1)
    
    # Step 2: Calculate the F-value using the sample variances
    f_value = var1 / var2
    
    # Step 3: Calculate the two-tailed p-value associated with the F-value
    df1 = len(data1) - 1
    df2 = len(data2) - 1
    cdf = f.cdf(f_value, df1, df2)
    p_value = 2 * min(cdf, 1 - cdf)
    
    # Step 4: Compare the p-value to the chosen significance level
    if p_value < alpha:
        print(f"The p-value ({p_value:.3f}) is less than the significance level ({alpha:.2f}).")
        print("We reject the null hypothesis and conclude that the variances are significantly different.")
    else:
        print(f"The p-value ({p_value:.3f}) is greater than the significance level ({alpha:.2f}).")
        print("We fail to reject the null hypothesis and cannot conclude that the variances are significantly different.")
    
    return f_value, p_value


In [10]:
group_a = [80, 85, 90, 92, 87, 83]
group_b = [75, 78, 82, 79, 81, 84]

f_value, p_value = f_test(group_a, group_b, alpha=0.01)
print(f"F-value: {f_value:.4f}")
print(f"P-value: {p_value:.3f}")


The p-value (0.483) is greater than the significance level (0.01).
We fail to reject the null hypothesis and cannot conclude that the variances are significantly different.
F-value: 1.9443
P-value: 0.483
