# Q1. Write a Python function that takes in two arrays of data and calculates the F-value for a variance ratio test. The function should return the F-value and the corresponding p-value for the test.

In [1]:
import numpy as np
from scipy import stats

def calculate_f_value(data1, data2):
    # Calculate the variances
    var1 = np.var(data1, ddof=1)  # ddof=1 for sample variance
    var2 = np.var(data2, ddof=1)

    # Calculate the F-value and p-value
    f_value = var1 / var2
    df1 = len(data1) - 1
    df2 = len(data2) - 1
    # sf gives the upper-tail (one-sided) p-value, P(F >= f_value)
    p_value = stats.f.sf(f_value, df1, df2)

    return f_value, p_value


In [2]:
data1 = [1, 2, 3, 4, 5]
data2 = [6, 7, 8, 9, 10]

f_value, p_value = calculate_f_value(data1, data2)

print("F-value:", f_value)
print("p-value:", p_value)


F-value: 1.0
p-value: 0.5


We first calculate the variances of the two input arrays using np.var(). The ddof=1 argument is set to calculate the sample variances instead of the population variances.

Next, we calculate the F-value by dividing the first variance by the second variance. Then, we calculate the degrees of freedom (df1 and df2) using the lengths of the data arrays minus 1.

Finally, we use stats.f.sf() from the scipy.stats module to calculate the p-value. The sf function returns the survival function, which is 1 minus the cumulative distribution function (CDF) for the F-distribution. It takes the F-value, along with the degrees of freedom (df1 and df2), as arguments.

The function returns the calculated F-value and the corresponding p-value.
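Because stats.f.sf() returns only the upper-tail probability, the p-value depends on which array is placed in the numerator. The sketch below illustrates this with made-up data; the helper name upper_tail_p and both data arrays are assumptions for illustration, not part of the original answer.

```python
import numpy as np
from scipy import stats

def upper_tail_p(numerator_data, denominator_data):
    """Variance ratio and its upper-tail p-value (hypothetical helper)."""
    f_value = np.var(numerator_data, ddof=1) / np.var(denominator_data, ddof=1)
    df1 = len(numerator_data) - 1
    df2 = len(denominator_data) - 1
    return f_value, stats.f.sf(f_value, df1, df2)

# Illustrative data (assumed): sample variances are 2.5 and 10
data_a = [1, 2, 3, 4, 5]
data_b = [2, 4, 6, 8, 10]

f_ab, p_ab = upper_tail_p(data_a, data_b)  # ratio 2.5 / 10
f_ba, p_ba = upper_tail_p(data_b, data_a)  # ratio 10 / 2.5

# Doubling the smaller tail gives a two-sided p-value that is the
# same whichever array is used as the numerator
p_two_sided = 2 * min(p_ab, 1 - p_ab)

print("A over B:", f_ab, p_ab)
print("B over A:", f_ba, p_ba)
print("Two-sided p-value:", p_two_sided)
```

Swapping the arrays inverts the F-value and turns the upper-tail probability into its complement, which is why a two-sided p-value is the safer choice when no direction is specified in advance.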

# Q2. Given a significance level of 0.05 and the degrees of freedom for the numerator and denominator of an F-distribution, write a Python function that returns the critical F-value for a two-tailed test.

To calculate the critical F-value for a two-tailed test in Python, you can use the scipy.stats module. Specifically, the f.ppf() function from the scipy.stats module can be used to calculate the critical F-value based on the significance level and degrees of freedom.

In [7]:
from scipy.stats import f

def get_critical_f_value(alpha, dfn, dfd):
    # Calculate the critical F-value
    critical_f = f.ppf(1 - alpha / 2, dfn, dfd)
    
    return critical_f


f.ppf() calculates the inverse of the cumulative distribution function (CDF) of the F-distribution. By passing 1 - alpha/2 as the first argument, we obtain the upper critical F-value for a two-tailed test.

You can use this function by calling it with the desired significance level, the degrees of freedom for the numerator, and the degrees of freedom for the denominator.

In [8]:
alpha = 0.05
dfn = 5
dfd = 10

critical_f_value = get_critical_f_value(alpha, dfn, dfd)
print("Critical F-value:", critical_f_value)


Critical F-value: 4.236085668188633


This calculates and prints the upper critical F-value for a two-tailed test with a significance level of 0.05, with degrees of freedom for the numerator (dfn) equal to 5 and degrees of freedom for the denominator (dfd) equal to 10.
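A two-tailed test has a rejection region in each tail, so it can help to compute both boundaries. The sketch below is one way to do that; the function name get_two_tailed_critical_values is an assumption for illustration.

```python
from scipy.stats import f

def get_two_tailed_critical_values(alpha, dfn, dfd):
    """Return (lower, upper) critical F-values with alpha/2 in each tail."""
    lower = f.ppf(alpha / 2, dfn, dfd)      # alpha/2 quantile
    upper = f.ppf(1 - alpha / 2, dfn, dfd)  # 1 - alpha/2 quantile
    return lower, upper

lower, upper = get_two_tailed_critical_values(0.05, 5, 10)
print("Lower critical F-value:", lower)
print("Upper critical F-value:", upper)
```

The null hypothesis is rejected when the F-statistic falls below the lower value or above the upper value. The lower value equals the reciprocal of the upper critical value with the degrees of freedom swapped.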

# Q3. Write a Python program that generates random samples from two normal distributions with known variances and uses an F-test to determine if the variances are equal. The program should output the F-value, degrees of freedom, and p-value for the test.

To generate random samples from two normal distributions with known variances and perform an F-test in Python, you can utilize the numpy and scipy.stats modules. 

In [6]:
import numpy as np
from scipy.stats import f

def perform_f_test(sample1, sample2):
    # Calculate the variances of the samples
    var1 = np.var(sample1, ddof=1)  # Variance of sample 1
    var2 = np.var(sample2, ddof=1)  # Variance of sample 2
    
    # Calculate the F-value
    f_value = var1 / var2
    
    # Calculate the degrees of freedom
    dfn = len(sample1) - 1  # Degrees of freedom numerator
    dfd = len(sample2) - 1  # Degrees of freedom denominator
    
    # Calculate the p-value
    p_value = 2 * min(f.cdf(f_value, dfn, dfd), 1 - f.cdf(f_value, dfn, dfd))
    
    return f_value, dfn, dfd, p_value


# Generate random samples from two normal distributions
np.random.seed(42)  # Set the random seed for reproducibility
sample1 = np.random.normal(loc=0, scale=1, size=100)
sample2 = np.random.normal(loc=0, scale=1.5, size=100)

# Perform the F-test
f_value, dfn, dfd, p_value = perform_f_test(sample1, sample2)

# Print the results
print("F-value:", f_value)
print("Degrees of Freedom (numerator):", dfn)
print("Degrees of Freedom (denominator):", dfd)
print("p-value:", p_value)


F-value: 0.4030463392763856
Degrees of Freedom (numerator): 99
Degrees of Freedom (denominator): 99
p-value: 9.191624675736852e-06


We first import the necessary modules numpy and scipy.stats. The perform_f_test() function takes two samples as input (sample1 and sample2). It calculates the sample variances using np.var() with ddof=1 and then computes the F-value as the ratio of the variances. The degrees of freedom for the numerator (dfn) and denominator (dfd) are determined by subtracting 1 from the sample sizes. Finally, the two-tailed p-value is calculated as twice the smaller tail probability, using the cumulative distribution function (f.cdf()) of the F-distribution.

To generate random samples, we use np.random.normal(), where loc is the mean and scale is the standard deviation (not the variance). In this example, we draw 100 observations from each of two normal distributions: mean 0 and standard deviation 1 for sample1 (variance 1), and mean 0 and standard deviation 1.5 for sample2 (variance 2.25).

The program then performs the F-test by calling the perform_f_test() function with the generated samples. The results, including the F-value, degrees of freedom, and p-value, are printed to the console.

# Q4. The variances of two populations are known to be 10 and 15. A sample of 12 observations is taken from each population. Conduct an F-test at the 5% significance level to determine if the variances are significantly different.

To conduct an F-test to determine if the variances of two populations are significantly different, we can follow these steps:

Step 1: State the null hypothesis (H0) and the alternative hypothesis (Ha).
In this case:
H0: The variances of the two populations are equal.
Ha: The variances of the two populations are significantly different.

Step 2: Determine the significance level.
The significance level is given as 5% or 0.05.

Step 3: Compute the F-statistic.
The F-statistic is calculated as the ratio of the sample variances:
F = s1^2 / s2^2

where s1^2 is the sample variance of the first population and s2^2 is the sample variance of the second population.

Step 4: Determine the critical value.
The critical value is obtained from the F-distribution with degrees of freedom (df1, df2) where:

    df1 = n1 - 1 (degrees of freedom for the first population)
    df2 = n2 - 1 (degrees of freedom for the second population)
    In this case, n1 = n2 = 12 (sample size from each population), so df1 = df2 = 11.

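Because the alternative hypothesis is two-sided, the significance level of 0.05 is split between the two tails: the lower critical value is the 0.025 quantile and the upper critical value is the 0.975 quantile of the F-distribution, found from the F-distribution table or a calculator.

Step 5: Compare the F-statistic with the critical values.
If the calculated F-statistic is greater than the upper critical value or less than the lower critical value, we reject the null hypothesis. Otherwise, we fail to reject the null hypothesis.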

Now let's calculate the F-statistic and perform the test:

Given:
Variance of population 1 (σ1^2) = 10
Variance of population 2 (σ2^2) = 15
Sample size (n1 = n2) = 12
Significance level (α) = 0.05

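Step 3: Compute the F-statistic
Treating the given variances as the variance estimates for the two samples:
F = s1^2 / s2^2
= 10 / 15
= 0.6667

Step 4: Determine the critical value
Since df1 = df2 = 11 and the test is two-tailed with α = 0.05, the upper critical value (0.975 quantile) is approximately 3.47 (from the F-distribution table). The lower critical value (0.025 quantile) is its reciprocal, approximately 1 / 3.47 ≈ 0.29.

Step 5: Compare the F-statistic with the critical values
0.29 < 0.6667 < 3.47

Since the calculated F-statistic (0.6667) lies between the lower critical value (0.29) and the upper critical value (3.47), we fail to reject the null hypothesis. There is not enough evidence at the 5% significance level to conclude that the variances differ.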
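The table lookup can be checked numerically. The following is a minimal sketch that treats the two given variances as sample estimates and uses scipy.stats.f.ppf() for the two-tailed critical values; the variable names are illustrative.

```python
from scipy.stats import f

var1, var2 = 10.0, 15.0   # variances given in the question
n1 = n2 = 12              # sample sizes
alpha = 0.05              # two-tailed significance level

f_stat = var1 / var2
df1, df2 = n1 - 1, n2 - 1

# alpha is split equally between the lower and upper tails
lower_crit = f.ppf(alpha / 2, df1, df2)
upper_crit = f.ppf(1 - alpha / 2, df1, df2)
reject = bool(f_stat < lower_crit or f_stat > upper_crit)

print("F-statistic:", f_stat)
print("Critical values:", lower_crit, upper_crit)
print("Reject H0:", reject)
```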

# Q5. A manufacturer claims that the variance of the diameter of a certain product is 0.005. A sample of 25 products is taken, and the sample variance is found to be 0.006. Conduct an F-test at the 1% significance level to determine if the claim is justified.

To conduct an F-test to determine if the manufacturer's claim about the variance is justified, we can follow these steps:

Step 1: State the null hypothesis (H0) and the alternative hypothesis (Ha).
In this case:
H0: The variance of the diameter is equal to 0.005.
Ha: The variance of the diameter is not equal to 0.005.

Step 2: Determine the significance level.
The significance level is given as 1% or 0.01.

Step 3: Compute the F-statistic.
The F-statistic is calculated as the ratio of the sample variance to the claimed variance:
F = s^2 / σ^2

where s^2 is the sample variance and σ^2 is the claimed variance.

Step 4: Determine the critical value.
The critical value is obtained from the F-distribution with degrees of freedom (df1, df2) where:

    df1 = n - 1 (degrees of freedom for the sample)
    df2 = ∞ (the claimed variance is a fixed value, not an estimate from a second sample)
    In this case, n = 25 (sample size), so df1 = 24.

An F-distribution with infinite denominator degrees of freedom is a chi-square distribution divided by its degrees of freedom, so this test is equivalent to the chi-square test for a single variance with statistic (n - 1)s^2 / σ^2.

Using a significance level of 0.01 for a two-tailed test, the critical values are the 0.005 and 0.995 quantiles of the chi-square distribution with 24 degrees of freedom, each divided by 24.

Step 5: Compare the F-statistic with the critical values.
If the calculated F-statistic is greater than the upper critical value or less than the lower critical value, we reject the null hypothesis. Otherwise, we fail to reject the null hypothesis.

Now let's calculate the F-statistic and perform the test:

Given:
Claimed variance (σ^2) = 0.005
Sample variance (s^2) = 0.006
Sample size (n) = 25
Significance level (α) = 0.01

Step 3: Compute the F-statistic
F = s^2 / σ^2
= 0.006 / 0.005
= 1.2

Step 4: Determine the critical value
The sample has df1 = 25 - 1 = 24 degrees of freedom, and the claimed variance is a fixed value, so df2 is infinite and F follows a chi-square distribution with 24 degrees of freedom divided by 24. For a two-tailed test at α = 0.01, the chi-square quantiles are approximately 9.89 (0.005 quantile) and 45.56 (0.995 quantile), which give critical F-values of approximately 9.89 / 24 ≈ 0.41 and 45.56 / 24 ≈ 1.90.

Step 5: Compare the F-statistic with the critical values
0.41 < 1.2 < 1.90

Since the calculated F-statistic (1.2) lies between the critical values (0.41 and 1.90), we fail to reject the null hypothesis. The sample does not contradict the manufacturer's claim at the 1% significance level.
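A claim about a single variance is commonly checked with the chi-square statistic (n - 1)s^2 / σ^2, which is the variance ratio multiplied by n - 1. The sketch below applies that form to the numbers in the question using scipy.stats.chi2; it is one possible check, with illustrative variable names.

```python
from scipy.stats import chi2

claimed_var = 0.005   # manufacturer's claimed variance
sample_var = 0.006    # observed sample variance
n = 25                # sample size
alpha = 0.01          # two-tailed significance level

df = n - 1
chi2_stat = df * sample_var / claimed_var  # (n - 1) * s^2 / sigma^2

# Two-tailed critical values from the chi-square distribution with df = 24
lower_crit = chi2.ppf(alpha / 2, df)
upper_crit = chi2.ppf(1 - alpha / 2, df)
reject = bool(chi2_stat < lower_crit or chi2_stat > upper_crit)

print("Chi-square statistic:", chi2_stat)
print("Critical values:", lower_crit, upper_crit)
print("Reject H0:", reject)
```

Dividing the statistic and both critical values by 24 gives the same comparison expressed on the scale of the ratio s^2 / σ^2.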

# Q6. Write a Python function that takes in the degrees of freedom for the numerator and denominator of an F-distribution and calculates the mean and variance of the distribution. The function should return the mean and variance as a tuple.

In [9]:
def f_distribution_mean_variance(df1, df2):
    # Calculate the mean (finite only when df2 > 2)
    mean = df2 / (df2 - 2)

    # Calculate the variance (finite only when df2 > 4)
    var_numerator = 2 * (df2 ** 2) * (df1 + df2 - 2)
    var_denominator = df1 * ((df2 - 2) ** 2) * (df2 - 4)
    variance = var_numerator / var_denominator

    return mean, variance


You can use this function by passing the degrees of freedom for the numerator and denominator as arguments. It will return a tuple containing the mean and variance of the F-distribution.

In [10]:
df1 = 5
df2 = 10
mean, variance = f_distribution_mean_variance(df1, df2)
print("Mean:", mean)
print("Variance:", variance)


Mean: 1.25
Variance: 1.3541666666666667
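The closed-form results can be cross-checked against scipy.stats.f.stats(). The sketch below also guards against denominator degrees of freedom for which the variance is not finite; the function name f_mean_variance_guarded and the choice to raise ValueError are assumptions for illustration.

```python
from scipy.stats import f

def f_mean_variance_guarded(df1, df2):
    """Mean and variance of the F-distribution (hypothetical guarded variant)."""
    if df1 <= 0 or df2 <= 4:
        # The mean needs df2 > 2 and the variance needs df2 > 4
        raise ValueError("requires df1 > 0 and df2 > 4")
    mean = df2 / (df2 - 2)
    variance = (2 * df2 ** 2 * (df1 + df2 - 2)) / (df1 * (df2 - 2) ** 2 * (df2 - 4))
    return mean, variance

mean, variance = f_mean_variance_guarded(5, 10)

# Reference values computed by SciPy for the same distribution
scipy_mean, scipy_var = f.stats(5, 10, moments="mv")

print("Formula:", mean, variance)
print("SciPy:  ", float(scipy_mean), float(scipy_var))
```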


# Q7. A random sample of 10 measurements is taken from a normal population with unknown variance. The sample variance is found to be 25. Another random sample of 15 measurements is taken from another normal population with unknown variance, and the sample variance is found to be 20. Conduct an F-test at the 10% significance level to determine if the variances are significantly different.

To conduct an F-test to determine if the variances of two populations are significantly different, we can follow these steps:

Step 1: State the null hypothesis (H0) and the alternative hypothesis (Ha).
In this case:
H0: The variances of the two populations are equal.
Ha: The variances of the two populations are significantly different.

Step 2: Determine the significance level.
The significance level is given as 10% or 0.1.

Step 3: Compute the F-statistic.
The F-statistic is calculated as the ratio of the sample variances:
F = s1^2 / s2^2

where s1^2 is the sample variance of the first population and s2^2 is the sample variance of the second population.

Step 4: Determine the critical value.
The critical value is obtained from the F-distribution with degrees of freedom (df1, df2) where:

    df1 = n1 - 1 (degrees of freedom for the first sample)
    df2 = n2 - 1 (degrees of freedom for the second sample)
    In this case, n1 = 10 (sample size of the first sample) and n2 = 15 (sample size of the second sample).

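Because the alternative hypothesis is two-sided, the significance level of 0.1 is split between the two tails: the lower critical value is the 0.05 quantile and the upper critical value is the 0.95 quantile of the F-distribution, found from the F-distribution table or a calculator.

Step 5: Compare the F-statistic with the critical values.
If the calculated F-statistic is greater than the upper critical value or less than the lower critical value, we reject the null hypothesis. Otherwise, we fail to reject the null hypothesis.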

Now let's calculate the F-statistic and perform the test:

Given:
Sample variance of the first sample (s1^2) = 25
Sample variance of the second sample (s2^2) = 20
Sample size of the first sample (n1) = 10
Sample size of the second sample (n2) = 15
Significance level (α) = 0.1

Step 3: Compute the F-statistic
F = s1^2 / s2^2
= 25 / 20
= 1.25

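Step 4: Determine the critical value
Since df1 = n1 - 1 = 10 - 1 = 9 and df2 = n2 - 1 = 15 - 1 = 14, and α = 0.1 is split between the two tails, the upper critical value (0.95 quantile) is approximately 2.65 (from the F-distribution table). Because the F-statistic is greater than 1, only the upper critical value needs to be checked.

Step 5: Compare the F-statistic with the critical value
1.25 < 2.65

Since the calculated F-statistic (1.25) is less than the upper critical value (2.65), we fail to reject the null hypothesis. There is not enough evidence at the 10% significance level to conclude that the variances differ.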
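The same decision can be reached through a p-value instead of a table lookup. The following is a minimal sketch from the summary statistics in the question; the two-sided p-value is taken as twice the smaller tail probability, which is one common convention.

```python
from scipy.stats import f

s1_sq, s2_sq = 25.0, 20.0  # sample variances from the question
n1, n2 = 10, 15            # sample sizes
alpha = 0.10               # two-tailed significance level

f_stat = s1_sq / s2_sq
df1, df2 = n1 - 1, n2 - 1

# Two-sided p-value: double the smaller of the two tail probabilities
upper_tail = f.sf(f_stat, df1, df2)
p_value = 2 * min(upper_tail, 1 - upper_tail)
reject = bool(p_value < alpha)

print("F-statistic:", f_stat)
print("Degrees of freedom:", df1, df2)
print("p-value:", p_value)
print("Reject H0:", reject)
```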

# Q8. The following data represent the waiting times in minutes at two different restaurants on a Saturday night: Restaurant A: 24, 25, 28, 23, 22, 20, 27; Restaurant B: 31, 33, 35, 30, 32, 36. Conduct an F-test at the 5% significance level to determine if the variances are significantly different.

To conduct an F-test to determine if the variances of two populations are significantly different, we can follow these steps:

Step 1: State the null hypothesis (H0) and the alternative hypothesis (Ha).
In this case:
H0: The variances of the waiting times at the two restaurants are equal.
Ha: The variances of the waiting times at the two restaurants are significantly different.

Step 2: Determine the significance level.
The significance level is given as 5% or 0.05.

Step 3: Calculate the sample variances.
Compute the sample variances for each restaurant using the given data.

For Restaurant A:
n1 = 7 (number of observations)
x1 = [24, 25, 28, 23, 22, 20, 27] (waiting times)
Calculate the sample variance s1^2 using the formula:
s1^2 = Σ(x1 - x̄1)^2 / (n1 - 1)

For Restaurant B:
n2 = 6 (number of observations)
x2 = [31, 33, 35, 30, 32, 36] (waiting times)
Calculate the sample variance s2^2 using the formula:
s2^2 = Σ(x2 - x̄2)^2 / (n2 - 1)

Step 4: Compute the F-statistic.
The F-statistic is calculated as the ratio of the sample variances:
F = s1^2 / s2^2

Step 5: Determine the critical value.
The critical value is obtained from the F-distribution with degrees of freedom (df1, df2) where:

    df1 = n1 - 1 (degrees of freedom for Restaurant A)
    df2 = n2 - 1 (degrees of freedom for Restaurant B)
    In this case, n1 = 7 and n2 = 6.

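Because the alternative hypothesis is two-sided, the significance level of 0.05 is split between the two tails: the lower critical value is the 0.025 quantile and the upper critical value is the 0.975 quantile of the F-distribution, found from the F-distribution table or a calculator.

Step 6: Compare the F-statistic with the critical values.
If the calculated F-statistic is greater than the upper critical value or less than the lower critical value, we reject the null hypothesis. Otherwise, we fail to reject the null hypothesis.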

Now let's calculate the F-statistic and perform the test:

Given:
Waiting times at Restaurant A: [24, 25, 28, 23, 22, 20, 27]
Waiting times at Restaurant B: [31, 33, 35, 30, 32, 36]
Significance level (α) = 0.05

Step 3: Calculate the sample variances
For Restaurant A:
x̄1 = sum(x1) / n1 = (24 + 25 + 28 + 23 + 22 + 20 + 27) / 7 = 169 / 7 ≈ 24.14
s1^2 = Σ(x1 - x̄1)^2 / (n1 - 1) = [(24 - 24.14)^2 + (25 - 24.14)^2 + ... + (27 - 24.14)^2] / (7 - 1) ≈ 46.86 / 6 ≈ 7.810

For Restaurant B:
x̄2 = sum(x2) / n2 = (31 + 33 + 35 + 30 + 32 + 36) / 6 = 197 / 6 ≈ 32.83
s2^2 = Σ(x2 - x̄2)^2 / (n2 - 1) = [(31 - 32.83)^2 + (33 - 32.83)^2 + ... + (36 - 32.83)^2] / (6 - 1) ≈ 26.83 / 5 ≈ 5.367

Step 4: Compute the F-statistic
F = s1^2 / s2^2 = 7.810 / 5.367 ≈ 1.455

Step 5: Determine the critical value
df1 = n1 - 1 = 7 - 1 = 6
df2 = n2 - 1 = 6 - 1 = 5
Because the test is two-tailed, α = 0.05 is split between the tails, and the upper critical value (0.975 quantile) from the F-distribution table is approximately 6.98. Because the F-statistic is greater than 1, only the upper critical value needs to be checked.

Step 6: Compare the F-statistic with the critical value
1.455 < 6.98

Since the calculated F-statistic (1.455) is less than the upper critical value (6.98), we fail to reject the null hypothesis. There is not enough evidence at the 5% significance level to conclude that the variances of the waiting times differ.
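Hand arithmetic on means and squared deviations is easy to get wrong, so it is worth recomputing the sample variances directly from the data. The sketch below uses statistics.variance() from the standard library (which divides by n - 1) and scipy.stats.f.ppf() for the two-tailed critical values; the variable names are illustrative.

```python
from statistics import variance
from scipy.stats import f

restaurant_a = [24, 25, 28, 23, 22, 20, 27]
restaurant_b = [31, 33, 35, 30, 32, 36]
alpha = 0.05  # two-tailed significance level

# statistics.variance() returns the sample variance (divides by n - 1)
var_a = variance(restaurant_a)
var_b = variance(restaurant_b)

f_stat = var_a / var_b
df1 = len(restaurant_a) - 1
df2 = len(restaurant_b) - 1

# alpha is split equally between the lower and upper tails
lower_crit = f.ppf(alpha / 2, df1, df2)
upper_crit = f.ppf(1 - alpha / 2, df1, df2)
reject = bool(f_stat < lower_crit or f_stat > upper_crit)

print("Sample variances:", var_a, var_b)
print("F-statistic:", f_stat)
print("Critical values:", lower_crit, upper_crit)
print("Reject H0:", reject)
```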

# Q9. The following data represent the test scores of two groups of students: Group A: 80, 85, 90, 92, 87, 83; Group B: 75, 78, 82, 79, 81, 84. Conduct an F-test at the 1% significance level to determine if the variances are significantly different.

To conduct an F-test to determine if the variances of two populations are significantly different, we can follow these steps:

Step 1: State the null hypothesis (H0) and the alternative hypothesis (Ha).
In this case:
H0: The variances of the test scores in the two groups are equal.
Ha: The variances of the test scores in the two groups are significantly different.

Step 2: Determine the significance level.
The significance level is given as 1% or 0.01.

Step 3: Calculate the sample variances.
Compute the sample variances for each group using the given data.

For Group A:
n1 = 6 (number of observations)
x1 = [80, 85, 90, 92, 87, 83] (test scores)
Calculate the sample variance s1^2 using the formula:
s1^2 = Σ(x1 - x̄1)^2 / (n1 - 1)

For Group B:
n2 = 6 (number of observations)
x2 = [75, 78, 82, 79, 81, 84] (test scores)
Calculate the sample variance s2^2 using the formula:
s2^2 = Σ(x2 - x̄2)^2 / (n2 - 1)

Step 4: Compute the F-statistic.
The F-statistic is calculated as the ratio of the sample variances:
F = s1^2 / s2^2

Step 5: Determine the critical value.
The critical value is obtained from the F-distribution with degrees of freedom (df1, df2) where:

    df1 = n1 - 1 (degrees of freedom for Group A)
    df2 = n2 - 1 (degrees of freedom for Group B)
    In this case, n1 = 6 and n2 = 6.

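Because the alternative hypothesis is two-sided, the significance level of 0.01 is split between the two tails: the lower critical value is the 0.005 quantile and the upper critical value is the 0.995 quantile of the F-distribution, found from the F-distribution table or a calculator.

Step 6: Compare the F-statistic with the critical values.
If the calculated F-statistic is greater than the upper critical value or less than the lower critical value, we reject the null hypothesis. Otherwise, we fail to reject the null hypothesis.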

Now let's calculate the F-statistic and perform the test:

Given:
Test scores for Group A: [80, 85, 90, 92, 87, 83]
Test scores for Group B: [75, 78, 82, 79, 81, 84]
Significance level (α) = 0.01

Step 3: Calculate the sample variances
For Group A:
x̄1 = sum(x1) / n1 = (80 + 85 + 90 + 92 + 87 + 83) / 6 = 517 / 6 ≈ 86.17
s1^2 = Σ(x1 - x̄1)^2 / (n1 - 1) = [(80 - 86.17)^2 + (85 - 86.17)^2 + ... + (83 - 86.17)^2] / (6 - 1) ≈ 98.83 / 5 ≈ 19.767

For Group B:
x̄2 = sum(x2) / n2 = (75 + 78 + 82 + 79 + 81 + 84) / 6 = 479 / 6 ≈ 79.83
s2^2 = Σ(x2 - x̄2)^2 / (n2 - 1) = [(75 - 79.83)^2 + (78 - 79.83)^2 + ... + (84 - 79.83)^2] / (6 - 1) ≈ 50.83 / 5 ≈ 10.167

Step 4: Compute the F-statistic
F = s1^2 / s2^2 = 19.767 / 10.167 ≈ 1.944

Step 5: Determine the critical value
df1 = n1 - 1 = 6 - 1 = 5
df2 = n2 - 1 = 6 - 1 = 5
Because the test is two-tailed, α = 0.01 is split between the tails, and the upper critical value (0.995 quantile) from the F-distribution table is approximately 14.94. Because the F-statistic is greater than 1, only the upper critical value needs to be checked.

Step 6: Compare the F-statistic with the critical value
1.944 < 14.94

Since the calculated F-statistic (1.944) is less than the upper critical value (14.94), we fail to reject the null hypothesis. There is not enough evidence at the 1% significance level to conclude that the variances of the test scores differ.
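The whole procedure can be wrapped in a small function that works from raw data. The following is a sketch under the assumption of a two-tailed test with α split equally between the tails; the function name variance_ratio_test and its return layout are hypothetical.

```python
import numpy as np
from scipy.stats import f

def variance_ratio_test(x, y, alpha):
    """Two-tailed F-test for equal variances (hypothetical helper)."""
    var_x = np.var(x, ddof=1)  # sample variance of the first group
    var_y = np.var(y, ddof=1)  # sample variance of the second group
    f_stat = var_x / var_y
    df1, df2 = len(x) - 1, len(y) - 1
    lower_crit = f.ppf(alpha / 2, df1, df2)
    upper_crit = f.ppf(1 - alpha / 2, df1, df2)
    reject = bool(f_stat < lower_crit or f_stat > upper_crit)
    return var_x, var_y, f_stat, (lower_crit, upper_crit), reject

group_a = [80, 85, 90, 92, 87, 83]
group_b = [75, 78, 82, 79, 81, 84]

var_a, var_b, f_stat, crit, reject = variance_ratio_test(group_a, group_b, 0.01)

print("Sample variances:", var_a, var_b)
print("F-statistic:", f_stat)
print("Critical values:", crit)
print("Reject H0:", reject)
```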