## Q1. Write a Python function that takes in two arrays of data and calculates the F-value for a variance ratio test. The function should return the F-value and the corresponding p-value for the test.

In [3]:
import numpy as np
from scipy.stats import f

def variance_ratio_test(x, y):
    """
    Performs a variance ratio test on two arrays of data.
    
    Parameters:
        x (array-like): First array of data
        y (array-like): Second array of data
        
    Returns:
        F-value (float): The F-value for the variance ratio test
        p-value (float): The p-value for the variance ratio test
    """
    # Calculate the variances of x and y
    var_x = np.var(x, ddof=1)
    var_y = np.var(y, ddof=1)
    
    # Check that the variances are not zero
    if var_x == 0 or var_y == 0:
        raise ValueError("One or both variances are zero.")
    
    # Calculate the F-value and the two-tailed p-value
    # (f.cdf alone would only give the lower-tail probability)
    F = var_x / var_y
    dfn, dfd = len(x) - 1, len(y) - 1
    p = 2 * min(f.cdf(F, dfn, dfd), f.sf(F, dfn, dfd))
    
    # Return the F-value and p-value
    return F, p

#### The variance_ratio_test function first calculates the sample variances of the two arrays x and y. It then checks that neither variance is zero: a zero variance for y would cause a division by zero when calculating the F-value, and a zero variance for x would give a degenerate F-value of 0. If either variance is zero, the function raises a ValueError.

#### If both variances are non-zero, the function calculates the F-value as the ratio var_x / var_y (the variance of the first sample over that of the second, which is not necessarily the larger over the smaller). It then calculates a two-tailed p-value as twice the smaller of the two tail probabilities of the F-distribution, with the degrees of freedom for the numerator and denominator set to len(x) - 1 and len(y) - 1, respectively.

#### The function returns both the F-value and p-value for the variance ratio test.
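
#### One property worth checking for a two-tailed variance ratio test is that swapping the two samples inverts the F-value but leaves the p-value unchanged. The sketch below illustrates this under the assumption that the p-value is taken as twice the smaller tail probability; the name two_tailed_f_test and the sample data are illustrative and not part of the original answer.

```python
import numpy as np
from scipy.stats import f

def two_tailed_f_test(x, y):
    """Return (F, p) for H0: var(x) == var(y) against a two-sided alternative."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    F = np.var(x, ddof=1) / np.var(y, ddof=1)
    dfn, dfd = len(x) - 1, len(y) - 1
    # Twice the smaller tail probability, capped at 1
    p = 2 * min(f.cdf(F, dfn, dfd), f.sf(F, dfn, dfd))
    return F, min(p, 1.0)

# Illustrative data: the second sample is the first one scaled by 2
small = [1, 2, 3, 4, 5]
large = [2, 4, 6, 8, 10]

F_xy, p_xy = two_tailed_f_test(small, large)
F_yx, p_yx = two_tailed_f_test(large, small)

print("F (small/large):", F_xy, "p:", p_xy)
print("F (large/small):", F_yx, "p:", p_yx)
```

#### The two F-values are reciprocals of each other, while the two p-values agree, so the conclusion of the test does not depend on which sample is labelled x.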

## Q2. Given a significance level of 0.05 and the degrees of freedom for the numerator and denominator of an F-distribution, write a Python function that returns the critical F-value for a two-tailed test.

In [4]:
from scipy.stats import f

def get_critical_f(alpha, dfn, dfd):
    """
    Calculates the critical F-value for a two-tailed test.

    Args:
    alpha (float): The significance level.
    dfn (int): The degrees of freedom for the numerator.
    dfd (int): The degrees of freedom for the denominator.

    Returns:
    float: The critical F-value.
    """
    return f.ppf(q=1-alpha/2, dfn=dfn, dfd=dfd)

# You can use this function to obtain the critical F-value for a given significance level and degrees of freedom.
# For example, to get the upper critical F-value at a significance level of 0.05
# with 3 degrees of freedom in the numerator and 24 degrees of freedom in the denominator,
# call the function like this:
get_critical_f(0.05, 3, 24)

3.7210801909151088
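
#### A two-tailed test has a lower critical value as well as the upper one returned above. The following is a minimal sketch of how both could be obtained with f.ppf; the variable names are illustrative. It also checks the reciprocal identity that relates the lower critical value to an upper critical value with the degrees of freedom swapped.

```python
from scipy.stats import f

alpha, dfn, dfd = 0.05, 3, 24

# Upper and lower critical values of the two-tailed test
upper = f.ppf(1 - alpha / 2, dfn, dfd)
lower = f.ppf(alpha / 2, dfn, dfd)

# Reciprocal identity: the lower critical value equals 1 / (upper critical
# value of the F-distribution with the degrees of freedom swapped)
lower_via_reciprocal = 1 / f.ppf(1 - alpha / 2, dfd, dfn)

print("Lower critical value:", lower)
print("Upper critical value:", upper)
print("Lower via reciprocal:", lower_via_reciprocal)
```

#### The null hypothesis is rejected when the observed F-value falls below the lower critical value or above the upper one.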

## Q3. Write a Python program that generates random samples from two normal distributions with known variances and uses an F-test to determine if the variances are equal. The program should output the F-value, degrees of freedom, and p-value for the test.

In [5]:
import numpy as np
from scipy.stats import f

# Set the parameters
mu1 = 0.0
mu2 = 0.0
sigma1 = 1.0
sigma2 = 2.0
n1 = 50
n2 = 75
alpha = 0.05

# Generate the random samples
x1 = np.random.normal(mu1, sigma1, n1)
x2 = np.random.normal(mu2, sigma2, n2)

# Calculate the F-statistic
s1_squared = np.var(x1, ddof=1)
s2_squared = np.var(x2, ddof=1)
F = s1_squared / s2_squared

# Calculate the degrees of freedom
df1 = n1 - 1
df2 = n2 - 1

# Calculate the p-value
p_value = 2 * min(f.cdf(F, df1, df2), 1 - f.cdf(F, df1, df2))

# Output the results
print("F-value: ", F)
print("Degrees of freedom: ", df1, ",", df2)
print("p-value: ", p_value)
if p_value < alpha:
    print("The variances are not equal.")
else:
    print("No significant evidence that the variances differ.")

F-value:  0.24015555364837862
Degrees of freedom:  49 , 74
p-value:  5.239769475025295e-07
The variances are not equal.
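
#### Because the samples are random, the numbers above change on every run. A minimal sketch of a reproducible variant, assuming NumPy's Generator API with an arbitrarily chosen seed of 42 (the seed and variable names are illustrative):

```python
import numpy as np
from scipy.stats import f

# A seeded generator makes the simulated samples repeatable
rng = np.random.default_rng(42)
x1 = rng.normal(0.0, 1.0, 50)   # sigma1 = 1.0, n1 = 50
x2 = rng.normal(0.0, 2.0, 75)   # sigma2 = 2.0, n2 = 75

F = np.var(x1, ddof=1) / np.var(x2, ddof=1)
df1, df2 = len(x1) - 1, len(x2) - 1

# Two-tailed p-value: twice the smaller tail probability
p_value = 2 * min(f.cdf(F, df1, df2), f.sf(F, df1, df2))

print("F-value:", F)
print("Degrees of freedom:", df1, ",", df2)
print("p-value:", p_value)
```

#### Re-running this cell with the same seed reproduces the same F-value and p-value.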


## Q4. The variances of two populations are known to be 10 and 15. A sample of 12 observations is taken from each population. Conduct an F-test at the 5% significance level to determine if the variances are significantly different.

#### To conduct an F-test to determine if the variances of two populations are significantly different, we can follow these steps:

#### Step 1: State the null and alternative hypotheses:

#### Null hypothesis: The variances of the two populations are equal.

#### Alternative hypothesis: The variances of the two populations are not equal.

#### Step 2: Set the significance level, alpha, to 0.05.

#### Step 3: Calculate the F-statistic using the formula:

#### F = s1^2 / s2^2

#### where s1^2 is the sample variance of the first population and s2^2 is the sample variance of the second population.

#### Step 4: Calculate the degrees of freedom for the F-distribution using the formula:

#### df1 = n1 - 1, df2 = n2 - 1

#### where n1 and n2 are the sample sizes of the two populations.

#### Step 5: Calculate the p-value using the F-distribution with degrees of freedom df1 and df2.

#### Step 6: Compare the p-value with the significance level alpha. If the p-value is less than alpha, reject the null hypothesis and conclude that the variances of the two populations are significantly different. Otherwise, fail to reject the null hypothesis and conclude that there is not enough evidence to suggest that the variances are significantly different.

#### Using the given information, we can now apply these steps to conduct an F-test:

#### Step 1: The null and alternative hypotheses are:

#### Null hypothesis: The variances of the two populations are equal.

#### Alternative hypothesis: The variances of the two populations are not equal.

#### Step 2: The significance level, alpha, is 0.05.

#### Step 3: The problem states the variances as 10 and 15 without giving separate sample variances. If these were truly the known population variances, they would simply be unequal and no test would be needed, so we treat them as the sample variances of the two samples of 12 observations:

#### s1^2 = 10, s2^2 = 15

#### Then, we can calculate the F-statistic:

#### F = s1^2 / s2^2 = 10 / 15 = 0.6667

#### Step 4: The degrees of freedom are:

#### df1 = n1 - 1 = 12 - 1 = 11, df2 = n2 - 1 = 12 - 1 = 11

#### Step 5: Using the F-distribution with degrees of freedom df1 = 11 and df2 = 11, we can calculate the p-value associated with the F-statistic of 0.6667:

#### p_value = 2 * min(f.cdf(F, df1, df2), 1 - f.cdf(F, df1, df2)) ≈ 2 * min(0.256, 0.744) ≈ 0.51

#### Step 6: Since the p-value (approximately 0.51) is greater than the significance level alpha (0.05), we fail to reject the null hypothesis and conclude that there is not enough evidence to suggest that the variances of the two populations are significantly different.

#### Therefore, based on this F-test, we cannot conclude that the variances of the two populations are significantly different at the 5% significance level.
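
#### The steps above can be sketched in code as follows. This assumes, as in Step 3, that 10 and 15 are treated as sample variances; the variable names are illustrative.

```python
from scipy.stats import f

# Values taken from the problem statement
s1_sq, s2_sq = 10.0, 15.0
n1 = n2 = 12
alpha = 0.05

F = s1_sq / s2_sq
df1, df2 = n1 - 1, n2 - 1

# Two-tailed p-value: twice the smaller tail probability
lower_tail = f.cdf(F, df1, df2)
upper_tail = f.sf(F, df1, df2)
p_value = 2 * min(lower_tail, upper_tail)

print("F-statistic:", F)
print("Degrees of freedom:", df1, ",", df2)
print("p-value:", p_value)
print("Reject H0:", p_value < alpha)
```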

## Q5. A manufacturer claims that the variance of the diameter of a certain product is 0.005. A sample of 25 products is taken, and the sample variance is found to be 0.006. Conduct an F-test at the 1% significance level to determine if the claim is justified.

#### To conduct an F-test to determine if the claim made by the manufacturer regarding the variance of the diameter of a certain product is justified, we can follow these steps:

#### Step 1: State the null and alternative hypotheses:

#### Null hypothesis: The variance of the diameter of the product is equal to the claimed value of 0.005.

#### Alternative hypothesis: The variance of the diameter of the product is not equal to 0.005.

#### Step 2: Set the significance level, alpha, to 0.01.

#### Step 3: Calculate the F-statistic using the formula:

#### F = s^2 / sigma^2

#### where s^2 is the sample variance and sigma^2 is the claimed variance.

#### Step 4: Calculate the degrees of freedom for the F-distribution using the formula:

#### df1 = n - 1, df2 = infinity

#### where n is the sample size.

#### Step 5: Calculate the p-value using the F-distribution with degrees of freedom df1 and df2.

#### Step 6: Compare the p-value with the significance level alpha. If the p-value is less than alpha, reject the null hypothesis and conclude that the claimed variance is not justified. Otherwise, fail to reject the null hypothesis and conclude that there is not enough evidence to suggest that the claimed variance is not justified.

#### Using the given information, we can now apply these steps to conduct an F-test:

#### Step 1: The null and alternative hypotheses are:

#### Null hypothesis: The variance of the diameter of the product is equal to the claimed value of 0.005.

#### Alternative hypothesis: The variance of the diameter of the product is not equal to 0.005.

#### Step 2: The significance level, alpha, is 0.01.

#### Step 3: The sample variance and claimed variance are:

#### s^2 = 0.006, sigma^2 = 0.005

#### Then, we can calculate the F-statistic:

#### F = s^2 / sigma^2 = 0.006 / 0.005 = 1.2

#### Step 4: The degrees of freedom are:

#### df1 = n - 1 = 25 - 1 = 24, df2 = infinity

#### Step 5: With df2 = infinity, the F-distribution reduces to a scaled chi-square distribution: df1 * F = 24 * 1.2 = 28.8 follows a chi-square distribution with 24 degrees of freedom. Using its CDF (scipy.stats.chi2), we can calculate the p-value associated with the F-statistic of 1.2:

#### p_value = 2 * min(chi2.cdf(28.8, 24), 1 - chi2.cdf(28.8, 24)) ≈ 2 * min(0.77, 0.23) ≈ 0.46

#### Step 6: Since the p-value (approximately 0.46) is greater than the significance level alpha (0.01), we fail to reject the null hypothesis and conclude that there is not enough evidence to suggest that the claimed variance of 0.005 is not justified.

#### Therefore, based on this F-test, we cannot conclude that the claimed variance of the diameter of the product is not justified at the 1% significance level.
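
#### An F-statistic with an infinite denominator degrees of freedom is equivalent to a chi-square statistic, so the test can be sketched with scipy.stats.chi2 instead. The variable names below are illustrative, and the two-tailed p-value is taken as twice the smaller tail probability.

```python
from scipy.stats import chi2

# Values taken from the problem statement
n = 25
s_sq = 0.006        # sample variance
sigma0_sq = 0.005   # claimed variance
alpha = 0.01

df = n - 1
# (n - 1) * s^2 / sigma0^2 follows a chi-square distribution with n - 1
# degrees of freedom under the null hypothesis
chi2_stat = df * s_sq / sigma0_sq

lower_tail = chi2.cdf(chi2_stat, df)
upper_tail = chi2.sf(chi2_stat, df)
p_value = 2 * min(lower_tail, upper_tail)

print("Chi-square statistic:", chi2_stat)
print("p-value:", p_value)
print("Reject H0:", p_value < alpha)
```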

## Q6. Write a Python function that takes in the degrees of freedom for the numerator and denominator of an F-distribution and calculates the mean and variance of the distribution. The function should return the mean and variance as a tuple.

In [7]:
from scipy.stats import f

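def mean_and_variance(num_df, den_df):
    """
    Calculates the mean and variance of an F-distribution given the degrees of freedom
    for the numerator and denominator.

    The mean is defined only for den_df > 2 and the variance only for den_df > 4.
    """
    if den_df <= 4:
        raise ValueError("den_df must be greater than 4 for the variance to be finite.")
    mean = den_df / (den_df - 2)
    variance = (2 * (den_df ** 2) * (num_df + den_df - 2)) / ((num_df * (den_df - 2) ** 2) * (den_df - 4))
    return (mean, variance)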

#### This function uses the closed-form expressions for the mean and variance of the F-distribution rather than calling scipy.stats.f, so the import is not actually needed for the calculation. The num_df argument represents the degrees of freedom for the numerator of the F-distribution, while the den_df argument represents the degrees of freedom for the denominator. The mean exists only when den_df > 2, and the variance only when den_df > 4.

#### The function first calculates the mean of the F-distribution using the formula den_df / (den_df - 2). It then calculates the variance of the F-distribution using the formula (2 * (den_df ** 2) * (num_df + den_df - 2)) / ((num_df * (den_df - 2) ** 2) * (den_df - 4)).

#### Finally, the function returns the mean and variance as a tuple.
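
#### As a sanity check, the closed-form values can be compared against the moments reported by scipy.stats.f.stats. This is a minimal sketch; the helper name f_mean_and_variance and the choice of 5 and 10 degrees of freedom are illustrative.

```python
from scipy.stats import f

def f_mean_and_variance(dfn, dfd):
    """Closed-form mean and variance of the F-distribution (requires dfd > 4)."""
    mean = dfd / (dfd - 2)
    variance = (2 * dfd**2 * (dfn + dfd - 2)) / (dfn * (dfd - 2)**2 * (dfd - 4))
    return mean, variance

dfn, dfd = 5, 10
closed_mean, closed_var = f_mean_and_variance(dfn, dfd)

# scipy returns the mean ('m') and variance ('v') when moments="mv"
scipy_mean, scipy_var = f.stats(dfn, dfd, moments="mv")

print("Closed form:", closed_mean, closed_var)
print("scipy.stats:", float(scipy_mean), float(scipy_var))
```

#### For these degrees of freedom the mean is 10 / 8 = 1.25, and both approaches should agree up to floating-point error.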



## Q7. A random sample of 10 measurements is taken from a normal population with unknown variance. The sample variance is found to be 25. Another random sample of 15 measurements is taken from another normal population with unknown variance, and the sample variance is found to be 20. Conduct an F-test at the 10% significance level to determine if the variances are significantly different.

#### To conduct an F-test to determine if the variances of two populations are significantly different, we can use the following hypothesis test:

#### Null hypothesis: The variances of the two populations are equal.

#### Alternative hypothesis: The variances of the two populations are not equal.

#### We can use the F-statistic to test this hypothesis. The F-statistic is calculated as the ratio of the sample variances of the two populations:

#### F = s1^2 / s2^2

#### where s1^2 is the sample variance of the first population, and s2^2 is the sample variance of the second population.

#### Under the null hypothesis, the F-statistic follows an F-distribution with degrees of freedom (n1-1) and (n2-1), where n1 and n2 are the sample sizes of the two populations.

#### To conduct an F-test at the 10% significance level, we can follow these steps:

#### Step 1: Set the significance level to alpha = 0.10.

#### Step 2: Calculate the F-statistic using the formula above.

#### Step 3: Calculate the p-value associated with the F-statistic using the cumulative distribution function (CDF) of the F-distribution with degrees of freedom (n1-1) and (n2-1).

#### Step 4: Compare the p-value to the significance level. If the p-value is less than or equal to the significance level, reject the null hypothesis. Otherwise, fail to reject the null hypothesis.

#### Let's apply these steps to the given problem:

#### Sample 1: n1 = 10, s1^2 = 25

#### Sample 2: n2 = 15, s2^2 = 20

#### Significance level: alpha = 0.10

#### Set the significance level to alpha = 0.10.

#### Calculate the F-statistic:

#### F = s1^2 / s2^2 = 25 / 20 = 1.25

#### Calculate the p-value associated with the F-statistic: We can use scipy.stats.f to calculate the tail probability of the F-statistic. We pass in the numerator degrees of freedom as the first shape argument (9, since n1-1=10-1=9), and the denominator degrees of freedom as the second (14, since n2-1=15-1=14). One minus the CDF of the F-distribution at the F-statistic gives the upper-tail probability. Because the alternative hypothesis is two-sided, the two-tailed p-value is twice the smaller tail probability, which here is twice this upper-tail value:

In [8]:
from scipy.stats import f

F = 1.25
p_value = 1 - f.cdf(F, 9, 14)

print("F-statistic:", F)
print("p-value:", p_value)

F-statistic: 1.25
p-value: 0.3416097191292977


#### Compare the p-value to the significance level: The value printed above, 0.3416, is the upper-tail probability. Doubling it for the two-sided alternative gives a p-value of about 0.683, which is greater than the significance level of alpha = 0.10. Therefore, we fail to reject the null hypothesis. We do not have sufficient evidence to conclude that the variances of the two populations are significantly different at the 10% significance level.
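
#### A minimal sketch of the same test with the two-tailed p-value computed directly, assuming the two-sided alternative stated above (variable names are illustrative):

```python
from scipy.stats import f

# Values taken from the problem statement
n1, s1_sq = 10, 25.0
n2, s2_sq = 15, 20.0
alpha = 0.10

F = s1_sq / s2_sq
df1, df2 = n1 - 1, n2 - 1

# sf is the survival function, 1 - cdf (the upper-tail probability)
upper_tail = f.sf(F, df1, df2)
p_two_tailed = 2 * min(f.cdf(F, df1, df2), upper_tail)
reject = p_two_tailed < alpha

print("F-statistic:", F)
print("Upper-tail probability:", upper_tail)
print("Two-tailed p-value:", p_two_tailed)
print("Reject H0:", reject)
```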

## Q8. The following data represent the waiting times in minutes at two different restaurants on a Saturday night: Restaurant A: 24, 25, 28, 23, 22, 20, 27; Restaurant B: 31, 33, 35, 30, 32, 36. Conduct an F-test at the 5% significance level to determine if the variances are significantly different.

#### To conduct an F-test to determine if the variances of two populations are significantly different, we can use the following hypothesis test:

#### Null hypothesis: The variances of the two populations are equal.

#### Alternative hypothesis: The variances of the two populations are not equal.

#### We can use the F-statistic to test this hypothesis. The F-statistic is calculated as the ratio of the sample variances of the two populations:

#### F = s1^2 / s2^2

#### where s1^2 is the sample variance of the first population, and s2^2 is the sample variance of the second population.

#### Under the null hypothesis, the F-statistic follows an F-distribution with degrees of freedom (n1-1) and (n2-1), where n1 and n2 are the sample sizes of the two populations.

#### To conduct an F-test at the 5% significance level, we can follow these steps:

#### Step 1: Set the significance level to alpha = 0.05.

#### Step 2: Calculate the sample variances of the two populations.

#### Step 3: Calculate the F-statistic using the formula above.

#### Step 4: Calculate the p-value associated with the F-statistic using the cumulative distribution function (CDF) of the F-distribution with degrees of freedom (n1-1) and (n2-1).

#### Step 5: Compare the p-value to the significance level. If the p-value is less than or equal to the significance level, reject the null hypothesis. Otherwise, fail to reject the null hypothesis.

#### Let's apply these steps to the given data:

#### Restaurant A: 24, 25, 28, 23, 22, 20, 27. Sample 1 size: n1 = 7. Sample 1 mean: 169 / 7 ≈ 24.1429. Sample 1 variance: s1^2 = (sum of squared deviations from the mean) / (7-1) ≈ 46.8571 / 6 ≈ 7.8095

#### Restaurant B: 31, 33, 35, 30, 32, 36. Sample 2 size: n2 = 6. Sample 2 mean: 197 / 6 ≈ 32.8333. Sample 2 variance: s2^2 = (sum of squared deviations from the mean) / (6-1) ≈ 26.8333 / 5 ≈ 5.3667

#### Significance level: alpha = 0.05

#### Set the significance level to alpha = 0.05.

#### Calculate the sample variances of the two populations:

In [9]:
import numpy as np

a = np.array([24, 25, 28, 23, 22, 20, 27])
b = np.array([31, 33, 35, 30, 32, 36])

s1_squared = np.var(a, ddof=1)
s2_squared = np.var(b, ddof=1)

print("Sample variance of A:", s1_squared)
print("Sample variance of B:", s2_squared)

Sample variance of A: 7.80952380952381
Sample variance of B: 5.366666666666667


In [10]:
# Calculate the F-statistic
F = s1_squared / s2_squared

print("F-statistic:", F)

F-statistic: 1.4551907719609583


#### Calculate the p-value associated with the F-statistic: We can use scipy.stats.f to calculate the tail probability of the F-statistic. We pass in the numerator degrees of freedom as the first shape argument (6, since n1-1=7-1=6), and the denominator degrees of freedom as the second (5, since n2-1=6-1=5). One minus the CDF of the F-distribution at the F-statistic gives the upper-tail probability. Because the alternative hypothesis is two-sided, the two-tailed p-value is twice the smaller tail probability, which here is twice this upper-tail value:

In [11]:
from scipy.stats import f

p_value = 1 - f.cdf(F, 6, 5)
print("p-value:", p_value)

p-value: 0.3487407873968742


#### Compare the p-value to the significance level: The value printed above, 0.3487, is the upper-tail probability. Doubling it for the two-sided alternative gives a p-value of about 0.697, which is greater than the significance level of alpha = 0.05. Therefore, we fail to reject the null hypothesis. We do not have sufficient evidence to conclude that the variances of the waiting times at the two restaurants are significantly different at the 5% significance level.


## Q9. The following data represent the test scores of two groups of students: Group A: 80, 85, 90, 92, 87, 83; Group B: 75, 78, 82, 79, 81, 84. Conduct an F-test at the 1% significance level to determine if the variances are significantly different.

#### To conduct an F-test to determine if the variances of two groups are significantly different, we need to perform the following steps:

#### Step 1: Define the null and alternative hypotheses. Let's assume that Group A and Group B have variances σ₁² and σ₂², respectively. We can state our hypotheses as:

#### Null Hypothesis (H0): σ₁² = σ₂²

#### Alternative Hypothesis (Ha): σ₁² ≠ σ₂²

#### Step 2: Calculate the variances of the two groups. Using the given data, we can calculate the variances of Group A and Group B as follows:

In [12]:
import numpy as np

group_a = [80, 85, 90, 92, 87, 83]
group_b = [75, 78, 82, 79, 81, 84]

var_a = np.var(group_a, ddof=1)
var_b = np.var(group_b, ddof=1)

print("Variance of Group A:", var_a)
print("Variance of Group B:", var_b)

Variance of Group A: 19.76666666666667
Variance of Group B: 10.166666666666666


#### Step 3: Calculate the F-statistic. Next, we need to calculate the F-statistic by dividing the larger variance by the smaller variance:

In [13]:
f_stat = var_a / var_b if var_a > var_b else var_b / var_a
print("F-statistic:", f_stat)

F-statistic: 1.9442622950819677


#### Step 4: Determine the critical value. Using an F-distribution table or statistical software, we can find the critical value for the F-statistic with degrees of freedom (df1 = n1 - 1, df2 = n2 - 1). Because the larger variance is placed in the numerator and the alternative hypothesis is two-sided, the upper critical value is taken at alpha / 2 = 0.005 rather than at 0.01. For this problem, df1 = 5 and df2 = 5, so the critical value is approximately 14.94.

#### Step 5: Compare the F-statistic to the critical value and make a decision. Since our F-statistic (1.94) is less than the critical value (14.94), we fail to reject the null hypothesis. Therefore, we do not have enough evidence to conclude that the variances of Group A and Group B are significantly different at the 1% significance level.
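
#### The critical value in Step 4 can also be computed rather than read from a table. A minimal sketch, assuming the larger variance goes in the numerator and the two-sided test therefore uses the upper alpha / 2 quantile (variable names are illustrative):

```python
import numpy as np
from scipy.stats import f

group_a = [80, 85, 90, 92, 87, 83]
group_b = [75, 78, 82, 79, 81, 84]
alpha = 0.01

var_a = np.var(group_a, ddof=1)
var_b = np.var(group_b, ddof=1)

# Put the larger variance in the numerator and keep the matching degrees of freedom
if var_a >= var_b:
    f_stat, dfn, dfd = var_a / var_b, len(group_a) - 1, len(group_b) - 1
else:
    f_stat, dfn, dfd = var_b / var_a, len(group_b) - 1, len(group_a) - 1

# Upper critical value at alpha / 2 for the two-sided alternative
critical = f.ppf(1 - alpha / 2, dfn, dfd)
reject = f_stat > critical

print("F-statistic:", f_stat)
print("Critical value:", critical)
print("Reject H0:", reject)
```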