Q1. Write a Python function that takes in two arrays of data and calculates the F-value for a variance ratio test. The function should return the F-value and the corresponding p-value for the test.


Answer(Q1):



In [1]:
import numpy as np
from scipy.stats import f

def variance_ratio_test(array1, array2):
    """
    Calculates the F-value and p-value for a variance ratio test (F-test).

    Parameters:
        array1 (array-like): First array of data.
        array2 (array-like): Second array of data.

    Returns:
        f_value (float): The F-value calculated for the variance ratio test.
        p_value (float): The two-tailed p-value associated with the F-value.
    """
    # Convert inputs to NumPy arrays so that np.var works on any array-like
    array1 = np.asarray(array1, dtype=float)
    array2 = np.asarray(array2, dtype=float)

    # F-value: ratio of the unbiased sample variances (ddof=1)
    f_value = np.var(array1, ddof=1) / np.var(array2, ddof=1)
    df_num = array1.size - 1
    df_den = array2.size - 1

    # Two-tailed p-value: twice the smaller tail area, capped at 1
    tail = min(f.cdf(f_value, df_num, df_den), f.sf(f_value, df_num, df_den))
    p_value = min(1.0, 2 * tail)

    return f_value, p_value

# Example usage:
array1 = [12, 15, 18, 21, 24]
array2 = [10, 13, 16, 19, 22]
f_value, p_value = variance_ratio_test(array1, array2)
print("F-value:", round(f_value, 4))
print("p-value:", round(p_value, 4))


F-value: 1.0
p-value: 1.0


Q2. Given a significance level of 0.05 and the degrees of freedom for the numerator and denominator of an F-distribution, write a Python function that returns the critical F-value for a two-tailed test.


Answer(Q2):

To calculate the critical F-value for a two-tailed test given a significance level (alpha) and the degrees of freedom for the numerator (df_num) and denominator (df_den) of an F-distribution, you can use the `scipy.stats.f.ppf` function from the SciPy library. This function calculates the percent point function (inverse of the cumulative distribution function) for the F-distribution. Here's the Python function to achieve this:



In [None]:
from scipy.stats import f

def critical_f_value(alpha, df_num, df_den):
    """
    Calculates the critical F-value for a two-tailed test.

    Parameters:
        alpha (float): The significance level (e.g., 0.05).
        df_num (int): Degrees of freedom for the numerator.
        df_den (int): Degrees of freedom for the denominator.

    Returns:
        critical_f (float): The critical F-value for the two-tailed test.
    """
    # Calculate the critical F-value using the percent point function (ppf)
    critical_f = f.ppf(1 - alpha / 2, df_num, df_den)

    return critical_f

# Example usage:
alpha = 0.05
df_num = 3
df_den = 10
critical_f = critical_f_value(alpha, df_num, df_den)
print("Critical F-value:", critical_f)




In this function, we use the `f.ppf` function from SciPy to calculate the critical F-value. The `ppf` function takes a cumulative probability (here `1 - alpha / 2`) together with the degrees of freedom for the numerator and denominator, and returns the F-value below which that proportion of the distribution lies. The result is the upper critical F-value for a two-tailed test at the specified significance level.

Remember that in a two-tailed test, we are interested in extreme values in both tails of the F-distribution, so we divide the significance level (alpha) by 2 and calculate the critical value corresponding to the upper (1 - alpha/2) percentile of the F-distribution.
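A two-tailed test also has a lower critical value. The sketch below is one way to obtain both bounds; the helper name `two_tailed_critical_values` is introduced here for illustration and is not part of the answer above. It also checks the reciprocal relationship between the two tails, which holds only when the degrees of freedom are swapped.

```python
from scipy.stats import f

def two_tailed_critical_values(alpha, df_num, df_den):
    """Return (lower, upper) critical F-values that leave alpha/2 in each tail."""
    lower = f.ppf(alpha / 2, df_num, df_den)
    upper = f.ppf(1 - alpha / 2, df_num, df_den)
    return lower, upper

lower, upper = two_tailed_critical_values(0.05, 3, 10)

# Reciprocal identity: the lower bound equals 1 / (upper bound with the
# numerator and denominator degrees of freedom swapped).
swapped_upper = f.ppf(1 - 0.05 / 2, 10, 3)

print("Lower critical F-value:", lower)
print("Upper critical F-value:", upper)
print("1 / upper with swapped df:", 1 / swapped_upper)
```

Note that `1 / upper` with the original degrees of freedom matches the lower bound only when `df_num == df_den`.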


Q3. Write a Python program that generates random samples from two normal distributions with known variances and uses an F-test to determine if the variances are equal. The program should output the F-value, degrees of freedom, and p-value for the test.

Answer(Q3):



In [2]:
import numpy as np
from scipy.stats import f

def generate_samples(mean1, variance1, size1, mean2, variance2, size2):
    """
    Generates random samples from two normal distributions with known variances.

    Parameters:
        mean1 (float): Mean of the first normal distribution.
        variance1 (float): Variance of the first normal distribution.
        size1 (int): Number of samples to generate from the first distribution.
        mean2 (float): Mean of the second normal distribution.
        variance2 (float): Variance of the second normal distribution.
        size2 (int): Number of samples to generate from the second distribution.

    Returns:
        samples1 (numpy array): Random samples from the first normal distribution.
        samples2 (numpy array): Random samples from the second normal distribution.
    """
    samples1 = np.random.normal(mean1, np.sqrt(variance1), size1)
    samples2 = np.random.normal(mean2, np.sqrt(variance2), size2)
    return samples1, samples2

def f_test(samples1, samples2):
    """
    Performs the F-test to compare the variances of two normal distributions.

    Parameters:
        samples1 (numpy array): Random samples from the first normal distribution.
        samples2 (numpy array): Random samples from the second normal distribution.

    Returns:
        f_value (float): The F-value calculated for the variance ratio test.
        df_num (int): Degrees of freedom for the numerator.
        df_den (int): Degrees of freedom for the denominator.
        p_value (float): The p-value associated with the F-value.
    """
    df_num = len(samples1) - 1
    df_den = len(samples2) - 1

    f_value = np.var(samples1, ddof=1) / np.var(samples2, ddof=1)
    # Two-tailed p-value: twice the smaller tail area, capped at 1
    p_value = min(1.0, 2 * min(f.cdf(f_value, df_num, df_den), f.sf(f_value, df_num, df_den)))

    return f_value, df_num, df_den, p_value

# Parameters for generating random samples
mean1 = 10.0
variance1 = 4.0
size1 = 30
mean2 = 10.5
variance2 = 4.0
size2 = 25

# Generate random samples from the two normal distributions
samples1, samples2 = generate_samples(mean1, variance1, size1, mean2, variance2, size2)

# Perform the F-test
f_value, df_num, df_den, p_value = f_test(samples1, samples2)

# Output the results
print("Sample 1:", samples1)
print("Sample 2:", samples2)
print("F-value:", f_value)
print("Degrees of freedom (numerator):", df_num)
print("Degrees of freedom (denominator):", df_den)
print("p-value:", round(p_value, 4))


Sample 1: [10.04535144  8.82799009 11.8341046  11.05411394 11.684784    8.85679755
 11.5216926  12.26906331  8.12165365 13.19808172 11.37876793  7.9639177
 12.59228183 11.67897429 10.22714093  8.35277509 10.13907411 11.53243316
  7.22420748  8.63303     9.38318831  7.92129659 10.18776994 13.38909
  6.35046546  7.35993439  8.13807856  9.75062425  7.82157408 12.63815648]
Sample 2: [10.02897966 13.43769978 10.03481566 13.07823927  5.75483299 11.28431964
 13.5440952  10.29987165  7.71937922 13.48440848 10.459518    9.70325276
 10.59805544 11.3159792  12.11389254 11.6083215   8.27470508 13.16349339
 10.03056273 10.65353641  9.11588706  9.40039739  8.89730523 13.34986051
  9.88936119]
F-value: 0.9812675612790575
Degrees of freedom (numerator): 29
Degrees of freedom (denominator): 24
p-value: 0.9519
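
Because the samples above are drawn without a seed, the printed numbers change on every run. A minimal sketch of a repeatable variant follows, assuming NumPy's `default_rng` generator and an arbitrary seed of 42 (both are choices made here, not part of the original program).

```python
import numpy as np
from scipy.stats import f

# Assumption: a fixed seed (42 is arbitrary) so the same samples are drawn each run
rng = np.random.default_rng(42)
samples1 = rng.normal(loc=10.0, scale=np.sqrt(4.0), size=30)
samples2 = rng.normal(loc=10.5, scale=np.sqrt(4.0), size=25)

df_num = samples1.size - 1
df_den = samples2.size - 1
f_value = np.var(samples1, ddof=1) / np.var(samples2, ddof=1)

# Two-tailed p-value: twice the smaller tail area, capped at 1
tail = min(f.cdf(f_value, df_num, df_den), f.sf(f_value, df_num, df_den))
p_value = min(1.0, 2 * tail)

print("F-value:", f_value)
print("Degrees of freedom:", df_num, df_den)
print("p-value:", p_value)
```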


Q4. The variances of two populations are known to be 10 and 15. A sample of 12 observations is taken from each population. Conduct an F-test at the 5% significance level to determine if the variances are significantly different.


Answer(Q4):

To conduct an F-test at the 5% significance level to determine if the variances of two populations are significantly different, we'll follow these steps:

1. Define the null and alternative hypotheses.
2. Calculate the F-statistic.
3. Determine the critical F-value.
4. Compare the F-statistic with the critical F-value and make a decision.
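
Written out, the four steps take roughly the following form. The symbols $s_1^2$, $s_2^2$ (sample variances), $n_1$, $n_2$ (sample sizes) and $\alpha$ (significance level) are notation introduced here for illustration.

```latex
\begin{align*}
H_0 &: \sigma_1^2 = \sigma_2^2
  \qquad\text{versus}\qquad
  H_1 : \sigma_1^2 \neq \sigma_2^2 \\
F &= \frac{s_1^2}{s_2^2} \sim F_{n_1 - 1,\; n_2 - 1}
  \quad\text{under } H_0 \\
\text{Reject } H_0 &\text{ if }
  F > F_{1 - \alpha/2;\; n_1 - 1,\; n_2 - 1}
  \ \text{ or }\
  F < F_{\alpha/2;\; n_1 - 1,\; n_2 - 1}
\end{align*}
```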

Let's perform these steps in Python:


In [3]:
from scipy.stats import f

def f_test(variance1, variance2, sample_size1, sample_size2, significance_level):
    """
    Performs the F-test to compare the variances of two populations.

    Parameters:
        variance1 (float): Variance of the first population.
        variance2 (float): Variance of the second population.
        sample_size1 (int): Sample size of the first population.
        sample_size2 (int): Sample size of the second population.
        significance_level (float): Desired significance level for the test.

    Returns:
        result (str): The result of the F-test.
    """
    df_num = sample_size1 - 1
    df_den = sample_size2 - 1

    f_value = variance1 / variance2
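    # Two-tailed test: alpha/2 in each tail
    upper_critical_f = f.ppf(1 - significance_level / 2, df_num, df_den)
    lower_critical_f = f.ppf(significance_level / 2, df_num, df_den)

    if f_value > upper_critical_f or f_value < lower_critical_f: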
        result = "Reject the null hypothesis: Variances are significantly different."
    else:
        result = "Fail to reject the null hypothesis: Variances are not significantly different."

    return result

# Given data
variance1 = 10
variance2 = 15
sample_size1 = 12
sample_size2 = 12
significance_level = 0.05

# Perform the F-test
result = f_test(variance1, variance2, sample_size1, sample_size2, significance_level)

# Output the result
print(result)

Fail to reject the null hypothesis: Variances are not significantly different.


In this code, we define the `f_test` function to conduct the F-test. Although the question calls the variances known, the test treats 10 and 15 as the variances estimated from the two samples of 12 observations. We calculate the F-value by dividing the first variance by the second, 10 / 15 ≈ 0.667, with 11 degrees of freedom in both the numerator and the denominator. Then, we obtain the critical F-values from `f.ppf` for a two-tailed test at the desired significance level.

Finally, we compare the F-value with the critical values. Because both samples have the same degrees of freedom here, the lower critical value equals the reciprocal of the upper one. If the F-value is greater than the upper critical value or smaller than the lower critical value, we reject the null hypothesis, indicating that the variances are significantly different. Otherwise, we fail to reject the null hypothesis, suggesting that the variances are not significantly different.

Q5. A manufacturer claims that the variance of the diameter of a certain product is 0.005. A sample of 25 products is taken, and the sample variance is found to be 0.006. Conduct an F-test at the 1% significance level to determine if the claim is justified.

Answer(Q5):

To conduct an F-test at the 1% significance level and determine if the manufacturer's claim about the variance of the product diameter is justified, we'll follow the steps outlined below:

1. Define the null and alternative hypotheses.
2. Calculate the F-statistic.
3. Determine the critical F-value.
4. Compare the F-statistic with the critical F-value and make a decision.

Let's perform these steps in Python:



In [4]:
from scipy.stats import chi2

def f_test(sample_variance, claimed_variance, sample_size, significance_level):
    """
    Performs a one-tailed variance ratio test of the sample variance against the claimed variance.

    Parameters:
        sample_variance (float): Sample variance of the product diameter.
        claimed_variance (float): The manufacturer's claimed variance.
        sample_size (int): Sample size.
        significance_level (float): Desired significance level for the test.

    Returns:
        result (str): The result of the test.
    """
    df = sample_size - 1

    f_value = sample_variance / claimed_variance
    # The claimed variance is a fixed number, not an estimate, so the denominator
    # has infinite degrees of freedom and F(df, inf) equals chi-square(df) / df.
    critical_f = chi2.ppf(1 - significance_level, df) / df

    if f_value > critical_f:
        result = "Reject the null hypothesis: The claim is not justified."
    else:
        result = "Fail to reject the null hypothesis: The claim is justified."

    return result

# Given data
claimed_variance = 0.005
sample_variance = 0.006
sample_size = 25
significance_level = 0.01

# Perform the F-test
result = f_test(sample_variance, claimed_variance, sample_size, significance_level)

# Output the result
print(result)


Fail to reject the null hypothesis: The claim is justified.


In this code, we define the `f_test` function to conduct the test. We calculate the F-value by dividing the sample variance by the claimed variance, 0.006 / 0.005 = 1.2. Then, we calculate the critical value for a one-tailed (upper-tail) test at the desired significance level.

Finally, we compare the F-value with the critical value. If the F-value is larger, we reject the null hypothesis, indicating that the data contradict the manufacturer's claim. Otherwise, we fail to reject the null hypothesis: the sample provides no evidence against the claim at the 1% significance level, which is a weaker statement than proving the claim true.
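
A claimed variance is a fixed value, so this kind of comparison is commonly phrased as a chi-square test with statistic (n - 1) · s² / σ₀². The sketch below shows that form for the same numbers; the upper-tail alternative is an assumption carried over from the answer above, and the variable names are chosen here for illustration.

```python
from scipy.stats import chi2

claimed_variance = 0.005
sample_variance = 0.006
sample_size = 25
significance_level = 0.01

df = sample_size - 1

# Test statistic: (n - 1) * s^2 / sigma_0^2, chi-square with n - 1 df under H0
chi2_stat = df * sample_variance / claimed_variance

# Upper-tail critical value and p-value
critical_chi2 = chi2.ppf(1 - significance_level, df)
p_value = chi2.sf(chi2_stat, df)

print("Chi-square statistic:", chi2_stat)
print("Critical value:", critical_chi2)
print("p-value:", p_value)
print("Reject H0:", chi2_stat > critical_chi2)
```

Dividing both the statistic and the critical value by `df` gives the variance ratio 1.2 and its critical value, so the decision is the same either way.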

Q6. Write a Python function that takes in the degrees of freedom for the numerator and denominator of an F-distribution and calculates the mean and variance of the distribution. The function should return the mean and variance as a tuple.

Answer(Q6):

To calculate the mean and variance of an F-distribution given the degrees of freedom for the numerator and denominator, you can use the properties of the F-distribution. The mean and variance of the F-distribution depend on the degrees of freedom.

Here's a Python function that accomplishes this:

In [None]:
def f_distribution_mean_variance(df_num, df_den):
    """
    Calculates the mean and variance of an F-distribution.

    Parameters:
        df_num (int): Degrees of freedom for the numerator.
        df_den (int): Degrees of freedom for the denominator.

    Returns:
        mean (float): The mean of the F-distribution.
        variance (float): The variance of the F-distribution.
    """
    # Check if degrees of freedom are valid
    if df_num <= 0 or df_den <= 0:
        raise ValueError("Degrees of freedom must be greater than 0.")

    # Calculate the mean and variance of the F-distribution
    # The mean is finite only when df_den > 2
    if df_den > 2:
        mean = df_den / (df_den - 2)
    else:
        mean = float('inf')  # No finite mean when df_den <= 2

    # The variance is finite only when df_den > 4
    if df_den > 4:
        variance = (2 * df_den**2 * (df_num + df_den - 2)) / (df_num * (df_den - 2)**2 * (df_den - 4))
    else:
        variance = float('inf')  # No finite variance when df_den <= 4

    return mean, variance

# Example usage:
df_num = 3
df_den = 20
mean, variance = f_distribution_mean_variance(df_num, df_den)
print("Mean:", mean)
print("Variance:", variance)




In this function, we first check if the provided degrees of freedom are valid (greater than 0) since negative or zero degrees of freedom are not meaningful for the F-distribution.

Then, we calculate the mean and variance of the F-distribution using the formulas:

- Mean (μ) = df_den / (df_den - 2) when df_den > 2
- Variance (σ^2) = (2 * (df_den^2) * (df_num + df_den - 2)) / (df_num * (df_den - 2)^2 * (df_den - 4)) when df_den > 4

When df_den is less than or equal to 2, neither the mean nor the variance is finite, so the function returns `float('inf')` for both. Keep in mind that the variance formula above additionally requires df_den > 4.

Note: The function assumes that the degrees of freedom provided are appropriate for calculating the mean and variance of the F-distribution.
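
One way to sanity-check the closed-form expressions is to compare them with the moments SciPy reports through `scipy.stats.f.stats`. The helper `closed_form_moments` below is a name introduced here for illustration, and it assumes df_den > 4 so that both moments are finite.

```python
from scipy.stats import f

def closed_form_moments(df_num, df_den):
    """Mean and variance of F(df_num, df_den); assumes df_den > 4."""
    mean = df_den / (df_den - 2)
    variance = (2 * df_den**2 * (df_num + df_den - 2)) / (
        df_num * (df_den - 2) ** 2 * (df_den - 4)
    )
    return mean, variance

mean, variance = closed_form_moments(3, 20)

# SciPy's own mean and variance for the same distribution
scipy_mean, scipy_variance = f.stats(3, 20, moments="mv")

print("Closed form:", mean, variance)
print("SciPy:      ", float(scipy_mean), float(scipy_variance))
```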

Q7. A random sample of 10 measurements is taken from a normal population with unknown variance. The sample variance is found to be 25. Another random sample of 15 measurements is taken from another normal population with unknown variance, and the sample variance is found to be 20. Conduct an F-test at the 10% significance level to determine if the variances are significantly different.


Answer(Q7):

To conduct an F-test at the 10% significance level and determine if the variances of the two populations are significantly different, we'll follow these steps:

1. Define the null and alternative hypotheses.
2. Calculate the F-statistic.
3. Determine the critical F-value.
4. Compare the F-statistic with the critical F-value and make a decision.

Let's perform these steps in Python:



In [5]:
from scipy.stats import f

def f_test(sample_variance1, sample_variance2, sample_size1, sample_size2, significance_level):
    """
    Performs the F-test to compare the sample variances of two populations.

    Parameters:
        sample_variance1 (float): Sample variance of the first population.
        sample_variance2 (float): Sample variance of the second population.
        sample_size1 (int): Sample size of the first population.
        sample_size2 (int): Sample size of the second population.
        significance_level (float): Desired significance level for the test.

    Returns:
        result (str): The result of the F-test.
    """
    df_num = sample_size1 - 1
    df_den = sample_size2 - 1

    f_value = sample_variance1 / sample_variance2
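    # Two-tailed test: alpha/2 in each tail
    upper_critical_f = f.ppf(1 - significance_level / 2, df_num, df_den)
    lower_critical_f = f.ppf(significance_level / 2, df_num, df_den)

    if f_value > upper_critical_f or f_value < lower_critical_f: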
        result = "Reject the null hypothesis: Variances are significantly different."
    else:
        result = "Fail to reject the null hypothesis: Variances are not significantly different."

    return result

# Given data
sample_variance1 = 25
sample_variance2 = 20
sample_size1 = 10
sample_size2 = 15
significance_level = 0.10

# Perform the F-test
result = f_test(sample_variance1, sample_variance2, sample_size1, sample_size2, significance_level)

# Output the result
print(result)



Fail to reject the null hypothesis: Variances are not significantly different.



In this code, we define the `f_test` function to conduct the F-test. We calculate the F-value by dividing the first sample variance by the second, 25 / 20 = 1.25, with 9 and 14 degrees of freedom. Then, we calculate the critical F-values for a two-tailed test at the desired significance level, since the question asks whether the variances differ in either direction.

Finally, we compare the F-value with the critical values. If the F-value is greater than the upper critical value or smaller than the lower critical value, we reject the null hypothesis, indicating that the variances are significantly different. Otherwise, we fail to reject the null hypothesis, suggesting that the variances are not significantly different at the 10% significance level.
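
The same decision can be reached through a p-value instead of critical values. The sketch below works from the summary statistics given in the question; the helper name `two_tailed_p_value` is introduced here for illustration.

```python
from scipy.stats import f

def two_tailed_p_value(variance1, variance2, size1, size2):
    """Return the F-value and two-tailed p-value from summary statistics."""
    df_num = size1 - 1
    df_den = size2 - 1
    f_value = variance1 / variance2
    # Twice the smaller tail area, capped at 1
    tail = min(f.cdf(f_value, df_num, df_den), f.sf(f_value, df_num, df_den))
    return f_value, min(1.0, 2 * tail)

f_value, p_value = two_tailed_p_value(25, 20, 10, 15)
print("F-value:", f_value)
print("p-value:", p_value)
print("Reject at the 10% level:", p_value < 0.10)
```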

Q8. The following data represent the waiting times in minutes at two different restaurants on a Saturday night: Restaurant A: 24, 25, 28, 23, 22, 20, 27; Restaurant B: 31, 33, 35, 30, 32, 36. Conduct an F-test at the 5% significance level to determine if the variances are significantly different.

Answer(Q8):

To conduct an F-test at the 5% significance level and determine if the variances of the waiting times at Restaurant A and Restaurant B are significantly different, we'll follow these steps:

1. Define the null and alternative hypotheses.
2. Calculate the F-statistic.
3. Determine the critical F-value.
4. Compare the F-statistic with the critical F-value and make a decision.

Let's perform these steps in Python:



In [7]:
import numpy as np
from scipy.stats import f

def f_test(sample1, sample2, significance_level):
    """
    Performs the F-test to compare the sample variances of two samples.

    Parameters:
        sample1 (list): List of waiting times at Restaurant A.
        sample2 (list): List of waiting times at Restaurant B.
        significance_level (float): Desired significance level for the test.

    Returns:
        result (str): The result of the F-test.
    """
    sample_size1 = len(sample1)
    sample_size2 = len(sample2)
    df_num = sample_size1 - 1
    df_den = sample_size2 - 1

    sample_variance1 = np.var(sample1, ddof=1)
    sample_variance2 = np.var(sample2, ddof=1)

    f_value = sample_variance1 / sample_variance2
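    # Two-tailed test: alpha/2 in each tail
    upper_critical_f = f.ppf(1 - significance_level / 2, df_num, df_den)
    lower_critical_f = f.ppf(significance_level / 2, df_num, df_den)

    if f_value > upper_critical_f or f_value < lower_critical_f: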
        result = "Reject the null hypothesis: Variances are significantly different."
    else:
        result = "Fail to reject the null hypothesis: Variances are not significantly different."

    return result

# Given data
sample1 = [24, 25, 28, 23, 22, 20, 27]
sample2 = [31, 33, 35, 30, 32, 36]
significance_level = 0.05

# Perform the F-test
result = f_test(sample1, sample2, significance_level)

# Output the result
print(result)


Fail to reject the null hypothesis: Variances are not significantly different.




In this code, we define the `f_test` function to conduct the F-test. We calculate the F-value by dividing the sample variance of Restaurant A by the sample variance of Restaurant B, roughly 7.81 / 5.37 ≈ 1.46, with 6 and 5 degrees of freedom. Then, we calculate the critical F-values for a two-tailed test at the desired significance level.

Finally, we compare the F-value with the critical values. If the F-value is greater than the upper critical value or smaller than the lower critical value, we reject the null hypothesis, indicating that the variances are significantly different. Otherwise, we fail to reject the null hypothesis, suggesting that the variances are not significantly different at the 5% significance level.
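
A common textbook alternative places the larger sample variance in the numerator, so that only the upper critical value is needed. The sketch below applies that convention to the restaurant data; the function name `f_test_larger_over_smaller` is introduced here for illustration.

```python
import numpy as np
from scipy.stats import f

def f_test_larger_over_smaller(sample1, sample2, significance_level):
    """Two-tailed F-test with the larger sample variance in the numerator."""
    var1 = np.var(sample1, ddof=1)
    var2 = np.var(sample2, ddof=1)
    # Order the ratio so that f_value >= 1 and the degrees of freedom follow it
    if var1 >= var2:
        f_value, df_num, df_den = var1 / var2, len(sample1) - 1, len(sample2) - 1
    else:
        f_value, df_num, df_den = var2 / var1, len(sample2) - 1, len(sample1) - 1
    # With f_value >= 1, only the upper critical value (alpha/2 in the tail) is checked
    critical_f = f.ppf(1 - significance_level / 2, df_num, df_den)
    return f_value, critical_f, bool(f_value > critical_f)

restaurant_a = [24, 25, 28, 23, 22, 20, 27]
restaurant_b = [31, 33, 35, 30, 32, 36]
f_value, critical_f, reject = f_test_larger_over_smaller(restaurant_a, restaurant_b, 0.05)
print("F-value:", f_value)
print("Upper critical F-value:", critical_f)
print("Reject H0:", reject)
```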

Q9. The following data represent the test scores of two groups of students: Group A: 80, 85, 90, 92, 87, 83; Group B: 75, 78, 82, 79, 81, 84. Conduct an F-test at the 1% significance level to determine if the variances are significantly different.

Answer(Q9):

To conduct an F-test at the 1% significance level and determine if the variances of the test scores for Group A and Group B are significantly different, we'll follow these steps:

1. Define the null and alternative hypotheses.
2. Calculate the F-statistic.
3. Determine the critical F-value.
4. Compare the F-statistic with the critical F-value and make a decision.

Let's perform these steps in Python:


In [6]:
import numpy as np
from scipy.stats import f

def f_test(sample1, sample2, significance_level):
    """
    Performs the F-test to compare the sample variances of two samples.

    Parameters:
        sample1 (list): List of test scores for Group A.
        sample2 (list): List of test scores for Group B.
        significance_level (float): Desired significance level for the test.

    Returns:
        result (str): The result of the F-test.
    """
    sample_size1 = len(sample1)
    sample_size2 = len(sample2)
    df_num = sample_size1 - 1
    df_den = sample_size2 - 1

    sample_variance1 = np.var(sample1, ddof=1)
    sample_variance2 = np.var(sample2, ddof=1)

    f_value = sample_variance1 / sample_variance2
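    # Two-tailed test: alpha/2 in each tail
    upper_critical_f = f.ppf(1 - significance_level / 2, df_num, df_den)
    lower_critical_f = f.ppf(significance_level / 2, df_num, df_den)

    if f_value > upper_critical_f or f_value < lower_critical_f: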
        result = "Reject the null hypothesis: Variances are significantly different."
    else:
        result = "Fail to reject the null hypothesis: Variances are not significantly different."

    return result

# Given data
groupA_scores = [80, 85, 90, 92, 87, 83]
groupB_scores = [75, 78, 82, 79, 81, 84]
significance_level = 0.01

# Perform the F-test
result = f_test(groupA_scores, groupB_scores, significance_level)

# Output the result
print(result)


Fail to reject the null hypothesis: Variances are not significantly different.



In this code, we define the `f_test` function to conduct the F-test. We calculate the F-value by dividing the sample variance of Group A by the sample variance of Group B, roughly 19.77 / 10.17 ≈ 1.94, with 5 degrees of freedom in both the numerator and the denominator. Then, we calculate the critical F-values for a two-tailed test at the desired significance level.

Finally, we compare the F-value with the critical values. If the F-value is greater than the upper critical value or smaller than the lower critical value, we reject the null hypothesis, indicating that the variances are significantly different. Otherwise, we fail to reject the null hypothesis, suggesting that the variances are not significantly different at the 1% significance level.