#### Q1. Write a Python function that takes in two arrays of data and calculates the F-value for a variance ratio test. The function should return the F-value and the corresponding p-value for the test.


In [1]:
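import numpy as np
import scipy.stats as stats

def variance_ratio_test(x, y):
    # Convert the inputs to arrays so that lists and tuples also work
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)

    # Calculate the sample variances (ddof=1 gives the unbiased estimate)
    var_x = x.var(ddof=1)
    var_y = y.var(ddof=1)

    # Calculate the F-value
    f_value = var_x / var_y

    # Calculate the degrees of freedom
    df_x = len(x) - 1
    df_y = len(y) - 1

    # Calculate the one-tailed (upper-tail) p-value
    p_value = stats.f.sf(f_value, df_x, df_y)

    # Return the F-value and p-value
    return f_value, p_value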


The function first calculates the sample variances of the two arrays x and y using the .var() method with the ddof=1 argument to get the unbiased estimate of the variance. It then calculates the F-value using the formula var_x / var_y.

Next, it calculates the degrees of freedom for the two samples by subtracting 1 from each sample size. It then uses scipy.stats.f.sf() (the survival function, 1 - CDF) to calculate the p-value, with df_x as the numerator degrees of freedom and df_y as the denominator degrees of freedom. This is a one-tailed (upper-tail) p-value: it measures the evidence that the variance of x is larger than the variance of y.

Finally, the function returns the F-value and p-value as a tuple. You can call the function with two arrays of data and it will return the calculated F-value and p-value for the variance ratio test.
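When the alternative hypothesis is simply that the variances differ, a two-sided p-value is usually wanted. The sketch below is one common way to get it, by doubling the smaller tail probability. The function name variance_ratio_test_two_sided and the sample data are assumptions made for illustration only.

```python
import numpy as np
import scipy.stats as stats

def variance_ratio_test_two_sided(x, y):
    # Hypothetical two-sided variant of the variance ratio test
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    f_value = x.var(ddof=1) / y.var(ddof=1)
    df_x, df_y = len(x) - 1, len(y) - 1
    # Double the smaller of the two tail probabilities, capped at 1
    tail = min(stats.f.cdf(f_value, df_x, df_y),
               stats.f.sf(f_value, df_x, df_y))
    return f_value, min(1.0, 2.0 * tail)

# Made-up data, chosen only to illustrate the call
a = [12.1, 11.8, 12.6, 12.3, 11.5, 12.9]
b = [12.0, 12.1, 11.9, 12.2, 12.0, 12.1]
f_value, p_value = variance_ratio_test_two_sided(a, b)
print(f"F = {f_value:.3f}, two-sided p = {p_value:.4f}")
```

With this definition the p-value does not depend on which sample is placed in the numerator, although the F-value itself is inverted when the arguments are swapped.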

#### Q2. Given a significance level of 0.05 and the degrees of freedom for the numerator and denominator of an F-distribution, write a Python function that returns the critical F-value for a two-tailed test.


In [2]:
import scipy.stats as stats

def f_critical(df_num, df_denom, alpha):
    # Calculate the critical F-value
    f_crit = stats.f.ppf(1 - alpha/2, df_num, df_denom)
    
    # Return the critical F-value
    return f_crit


The function uses scipy.stats.f.ppf(), the inverse of the cumulative distribution function, to calculate the upper critical F-value for a two-tailed test. A two-tailed test splits the significance level alpha equally between the two tails, so the upper critical value is the quantile at cumulative probability 1 - alpha/2. The matching lower critical value is the quantile at alpha/2. The df_num and df_denom arguments specify the degrees of freedom for the numerator and denominator of the F-distribution, respectively.

You can call the function with the desired degrees of freedom and significance level to obtain the critical F-value for the two-tailed test. For example, if you want the critical F-value for a significance level of 0.05 and degrees of freedom of 4 and 10, you can call the function as follows:

In [3]:
f_crit = f_critical(4, 10, 0.05)
print(f_crit)


4.46834157822528


This will output the critical F-value for the two-tailed test with 4 and 10 degrees of freedom at a significance level of 0.05.
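A two-tailed test has a rejection region in both tails, so it can help to look at the lower critical value alongside the upper one. The following is a minimal sketch; the helper name f_critical_bounds and its default alpha are assumptions added for illustration.

```python
import scipy.stats as stats

def f_critical_bounds(df_num, df_denom, alpha=0.05):
    # Hypothetical helper: lower and upper critical values of a two-tailed F-test
    lower = stats.f.ppf(alpha / 2, df_num, df_denom)
    upper = stats.f.ppf(1 - alpha / 2, df_num, df_denom)
    return lower, upper

lower, upper = f_critical_bounds(4, 10, 0.05)
print(f"Reject H0 if F < {lower:.4f} or F > {upper:.4f}")
```

The lower bound equals the reciprocal of the upper critical value obtained with the degrees of freedom swapped, which is why printed F-tables often list only upper-tail values.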

#### Q3. Write a Python program that generates random samples from two normal distributions with known variances and uses an F-test to determine if the variances are equal. The program should output the F- value, degrees of freedom, and p-value for the test.



The program uses the numpy and scipy.stats modules for generating random samples and performing the F-test, respectively:

In [5]:
import numpy as np
import scipy.stats as stats

# Set the seed for reproducibility
np.random.seed(123)

# Generate random samples from two normal distributions with known variances
n1 = 20
n2 = 30
mu1 = 0
mu2 = 0
var1 = 4
var2 = 4
x = np.random.normal(mu1, np.sqrt(var1), n1)
y = np.random.normal(mu2, np.sqrt(var2), n2)

# Perform an F-test to determine if the variances are equal:
# the F-value is the ratio of the two sample variances
s1_sq = np.var(x, ddof=1)
s2_sq = np.var(y, ddof=1)
f_value = s1_sq / s2_sq

# Calculate the degrees of freedom for the numerator and denominator
df_num = n1 - 1
df_denom = n2 - 1

# Two-tailed p-value: double the smaller tail probability
p_value = 2 * min(stats.f.cdf(f_value, df_num, df_denom),
                  stats.f.sf(f_value, df_num, df_denom))

# Output the F-value, degrees of freedom, and p-value for the test
print(f"F-value: {f_value:.2f}")
print(f"Degrees of freedom (numerator): {df_num}")
print(f"Degrees of freedom (denominator): {df_denom}")
print(f"p-value: {p_value:.4f}")


The program first sets the seed for reproducibility using np.random.seed(). It then generates random samples from two normal distributions with known variances using the numpy.random.normal() function. The n1 and n2 variables specify the sample sizes, mu1 and mu2 specify the means of the two distributions (set to 0 in this example), and var1 and var2 specify the variances of the two distributions. Because numpy.random.normal() expects a standard deviation, the program passes np.sqrt(var1) and np.sqrt(var2).

Next, the program performs the F-test by computing the two sample variances with np.var(..., ddof=1) and taking their ratio as the F-value. The degrees of freedom for the numerator and denominator are the sample sizes minus 1. The two-tailed p-value is obtained from the F-distribution with scipy.stats.f.cdf() and scipy.stats.f.sf() by doubling the smaller tail probability. Note that scipy.stats.f_oneway() is a one-way ANOVA, which compares group means rather than variances, so it does not answer this question.

Finally, the program outputs the F-value, degrees of freedom, and p-value for the test using print() statements. Since var1 and var2 are both 4, the null hypothesis of equal variances is true by construction, and a large p-value is expected in most runs.









#### Q4. The variances of two populations are known to be 10 and 15. A sample of 12 observations is taken from each population. Conduct an F-test at the 5% significance level to determine if the variances are significantly different.


To conduct an F-test to determine if the variances of two populations are significantly different, we need to calculate the F-statistic and compare it to the critical F-value from an F-distribution with (n1-1) and (n2-1) degrees of freedom.

The null hypothesis for the F-test is that the variances of the two populations are equal, and the alternative hypothesis is that they are significantly different.

The formula for the F-statistic is:

F = s1^2 / s2^2

where s1^2 and s2^2 are the sample variances of the two samples.

The problem gives the two variances directly, so we use them as s1^2 and s2^2:

s1^2 = 10 (given)
s2^2 = 15 (given)

We then calculate the F-statistic:

F = s1^2 / s2^2 = 10 / 15 = 0.6667

Next, we need to find the critical F-value from an F-distribution with (n1-1) and (n2-1) degrees of freedom. In this case, we have 12 observations from each population, so the degrees of freedom are:

df1 = 12 - 1 = 11
df2 = 12 - 1 = 11

Because the alternative hypothesis is two-sided, the 5% significance level is split into 2.5% in each tail. Using statistical software or an F-table, the critical F-values for 11 and 11 degrees of freedom are approximately:

Fcritical (upper) = 3.47
Fcritical (lower) = 1 / 3.47 = 0.29

Finally, we compare the calculated F-statistic to the critical F-values. If the calculated F-statistic is greater than the upper critical value or less than the lower critical value, we reject the null hypothesis and conclude that the variances of the two populations are significantly different. Otherwise, we fail to reject the null hypothesis.

In this case, the calculated F-statistic (0.6667) lies between the lower critical value (0.29) and the upper critical value (3.47), so we fail to reject the null hypothesis. Therefore, we do not have enough evidence to conclude that the variances of the two populations are significantly different at the 5% significance level.
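The same calculation can be checked numerically. This is a minimal sketch that assumes the two stated variances are used as sample variances and that the test is two-tailed; the variable names are chosen here for illustration.

```python
from scipy import stats

s1_sq, s2_sq = 10.0, 15.0   # variances given in the problem
n1 = n2 = 12                # sample sizes
alpha = 0.05

f_stat = s1_sq / s2_sq
df1, df2 = n1 - 1, n2 - 1

# Two-tailed test: alpha/2 in each tail
f_lower = stats.f.ppf(alpha / 2, df1, df2)
f_upper = stats.f.ppf(1 - alpha / 2, df1, df2)
reject = bool(f_stat < f_lower or f_stat > f_upper)

print(f"F = {f_stat:.4f}, critical values = ({f_lower:.3f}, {f_upper:.3f})")
print("Reject H0" if reject else "Fail to reject H0")
```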

#### Q5. A manufacturer claims that the variance of the diameter of a certain product is 0.005. A sample of 25 products is taken, and the sample variance is found to be 0.006. Conduct an F-test at the 1% significance level to determine if the claim is justified.


To test a claim about a single population variance, we compare the sample variance with the claimed value. The ratio s^2 / σ^2 follows an F-distribution with (n-1) numerator degrees of freedom and infinitely many denominator degrees of freedom, because the claimed variance is a fixed number rather than an estimate. This is equivalent to the chi-square test for a single variance, since (n-1) s^2 / σ^2 follows a chi-square distribution with (n-1) degrees of freedom.

The null hypothesis for the test is that the variance of the diameter of the product is equal to the claimed value of 0.005, and the alternative hypothesis is that it is different.

The formula for the F-statistic is:

F = s^2 / σ^2

where s^2 is the sample variance, and σ^2 is the claimed population variance.

The sample variance is given:

s^2 = 0.006 (given)

We then use the claimed population variance to calculate the F-statistic:

F = s^2 / σ^2 = 0.006 / 0.005 = 1.2

Next, we need the degrees of freedom. In this case, we have 25 observations, so:

df = 25 - 1 = 24

The equivalent chi-square statistic is:

χ^2 = (n-1) s^2 / σ^2 = 24 × 1.2 = 28.8

Using statistical software or a chi-square table, the two-tailed critical values at the 1% significance level (0.5% in each tail) with 24 degrees of freedom are 9.886 and 45.559. Dividing by 24 gives the corresponding critical values for the F-statistic:

Fcritical (lower) = 9.886 / 24 = 0.412
Fcritical (upper) = 45.559 / 24 = 1.898

Finally, we compare the calculated F-statistic to the critical values. If the calculated F-statistic is greater than the upper critical value or less than the lower critical value, we reject the null hypothesis and conclude that the claim about the variance of the diameter of the product is not justified. Otherwise, we fail to reject the null hypothesis.

In this case, the calculated F-statistic (1.2) lies between 0.412 and 1.898 (equivalently, 28.8 lies between 9.886 and 45.559), so we fail to reject the null hypothesis. Therefore, we do not have enough evidence to conclude that the manufacturer's claim is unjustified at the 1% significance level.
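For a single sample compared with a claimed variance, the ratio s^2 / σ^2 can be checked through the equivalent chi-square statistic (n-1) s^2 / σ^2. The sketch below assumes a two-tailed test; the variable names are illustrative.

```python
from scipy import stats

n = 25
s_sq = 0.006       # sample variance
sigma_sq = 0.005   # claimed population variance
alpha = 0.01

df = n - 1
ratio = s_sq / sigma_sq     # variance ratio
chi2_stat = df * ratio      # chi-square statistic with df degrees of freedom

# Two-tailed critical values of the chi-square distribution
chi2_lower = stats.chi2.ppf(alpha / 2, df)
chi2_upper = stats.chi2.ppf(1 - alpha / 2, df)

# Two-sided p-value: double the smaller tail probability
p_value = 2 * min(stats.chi2.cdf(chi2_stat, df), stats.chi2.sf(chi2_stat, df))
reject = bool(chi2_stat < chi2_lower or chi2_stat > chi2_upper)

print(f"ratio = {ratio:.2f}, chi-square = {chi2_stat:.2f}")
print(f"critical values = ({chi2_lower:.3f}, {chi2_upper:.3f}), p = {p_value:.4f}")
print("Reject H0" if reject else "Fail to reject H0")
```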

#### Q6. Write a Python function that takes in the degrees of freedom for the numerator and denominator of an F-distribution and calculates the mean and variance of the distribution. The function should return the mean and variance as a tuple.


In [6]:
import numpy as np
from scipy.stats import f

def f_distribution_mean_var(df_num, df_denom):
    """
    Calculate the mean and variance of an F-distribution given the degrees of freedom for the numerator and denominator.
    
    Parameters:
    -----------
    df_num : int
        Degrees of freedom for the numerator.
    df_denom : int
        Degrees of freedom for the denominator.
        
    Returns:
    --------
    Tuple(float, float)
        Mean and variance of the F-distribution.
    """
    mean = df_denom / (df_denom - 2.0) if df_denom > 2 else np.nan
    var = (2.0 * df_denom**2 * (df_num + df_denom - 2)) / (df_num * (df_denom - 2)**2 * (df_denom - 4)) if df_denom > 4 else np.nan
    return mean, var


The function takes two integer arguments: df_num for the degrees of freedom of the numerator and df_denom for the degrees of freedom of the denominator. It then calculates the mean and variance of the F-distribution using the formulas:

In [7]:
mean = df_denom / (df_denom - 2)
var = (2 * df_denom**2 * (df_num + df_denom - 2)) / (df_num * (df_denom - 2)**2 * (df_denom - 4))


Note that the mean is defined only when the degrees of freedom for the denominator are greater than 2, and the variance only when they are greater than 4. The function checks these conditions with conditional expressions and returns nan (not a number) for any quantity that is undefined.

The function then returns the mean and variance as a tuple.
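As a sanity check, the closed-form results can be compared with the moments SciPy reports through scipy.stats.f.stats(). The degrees of freedom 5 and 10 below are arbitrary values chosen for illustration.

```python
import numpy as np
from scipy.stats import f

def f_distribution_mean_var(df_num, df_denom):
    # Same formulas as above, repeated so this example runs on its own
    mean = df_denom / (df_denom - 2.0) if df_denom > 2 else np.nan
    var = ((2.0 * df_denom**2 * (df_num + df_denom - 2))
           / (df_num * (df_denom - 2)**2 * (df_denom - 4))) if df_denom > 4 else np.nan
    return mean, var

mean, var = f_distribution_mean_var(5, 10)

# Reference values from SciPy: mean ('m') and variance ('v')
ref_mean, ref_var = f.stats(5, 10, moments="mv")

print(f"formula: mean = {mean:.4f}, variance = {var:.4f}")
print(f"scipy:   mean = {float(ref_mean):.4f}, variance = {float(ref_var):.4f}")
```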

#### Q7. A random sample of 10 measurements is taken from a normal population with unknown variance. The sample variance is found to be 25. Another random sample of 15 measurements is taken from another normal population with unknown variance, and the sample variance is found to be 20. Conduct an F-test at the 10% significance level to determine if the variances are significantly different.


To conduct an F-test to determine if two population variances are significantly different, we need to compare the sample variances using the F-statistic and then compare it to the critical F-value at the desired significance level. Here's the step-by-step process:

Step 1: Define the null and alternative hypotheses. The null hypothesis is that the variances of the two populations are equal, while the alternative hypothesis is that they are not equal.


    H0: σ1^2 = σ2^2
    Ha: σ1^2 ≠ σ2^2
Step 2: Set the significance level α. In this case, α = 0.10.

Step 3: Calculate the F-statistic:


F = s1^2 / s2^2
where s1^2 and s2^2 are the sample variances for the first and second populations, respectively.


    s1^2 = 25
    s2^2 = 20
    F = 25 / 20
    F = 1.25

Step 4: Find the critical F-values for the two-tailed test. With α/2 = 0.05 in each tail, the upper critical value is the quantile at 1 - α/2 and the lower critical value is the quantile at α/2. The degrees of freedom for the numerator and denominator are given by (n1 - 1) and (n2 - 1), respectively.


    df1 = n1 - 1 = 10 - 1 = 9
    df2 = n2 - 1 = 15 - 1 = 14
    F_upper = F.ppf(1 - α/2, df1, df2) = F.ppf(0.95, 9, 14) ≈ 2.65
    F_lower = F.ppf(α/2, df1, df2) = F.ppf(0.05, 9, 14) ≈ 0.33
where F.ppf is the percent point function (inverse of the cumulative distribution function) of the F-distribution in the scipy.stats module.

Step 5: Compare the F-statistic to the critical F-values. If the F-statistic is greater than F_upper or less than F_lower, reject the null hypothesis; otherwise, fail to reject the null hypothesis.


    F_lower < F < F_upper
    0.33 < 1.25 < 2.65
Therefore, we fail to reject the null hypothesis at the 10% significance level. We do not have sufficient evidence to conclude that the variances of the two populations are significantly different.
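The steps above can be wrapped in a small function that works from summary statistics alone. This is a sketch under the assumption of a two-tailed test; the helper name f_test_from_summary is hypothetical.

```python
from scipy import stats

def f_test_from_summary(s1_sq, n1, s2_sq, n2, alpha):
    # Hypothetical helper: two-tailed F-test from sample variances and sizes
    f_stat = s1_sq / s2_sq
    df1, df2 = n1 - 1, n2 - 1
    f_lower = stats.f.ppf(alpha / 2, df1, df2)
    f_upper = stats.f.ppf(1 - alpha / 2, df1, df2)
    reject = bool(f_stat < f_lower or f_stat > f_upper)
    return f_stat, f_lower, f_upper, reject

f_stat, f_lower, f_upper, reject = f_test_from_summary(25, 10, 20, 15, alpha=0.10)
print(f"F = {f_stat:.2f}, critical values = ({f_lower:.3f}, {f_upper:.3f})")
print("Reject H0" if reject else "Fail to reject H0")
```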

#### Q8. The following data represent the waiting times in minutes at two different restaurants on a Saturday night: Restaurant A: 24, 25, 28, 23, 22, 20, 27; Restaurant B: 31, 33, 35, 30, 32, 36. Conduct an F-test at the 5% significance level to determine if the variances are significantly different.


To conduct an F-test to determine if two population variances are significantly different, we need to compare the sample variances using the F-statistic and then compare it to the critical F-value at the desired significance level. Here's the step-by-step process:

Step 1: Define the null and alternative hypotheses. The null hypothesis is that the variances of the two populations are equal, while the alternative hypothesis is that they are not equal.


    H0: σ1^2 = σ2^2
    Ha: σ1^2 ≠ σ2^2
Step 2: Set the significance level α. In this case, α = 0.05.

Step 3: Calculate the sample variances for each restaurant, dividing the sum of squared deviations from the mean by n - 1.


    n1 = 7, x1 = [24, 25, 28, 23, 22, 20, 27]
    s1^2 = 46.8571 / 6 = 7.8095

    n2 = 6, x2 = [31, 33, 35, 30, 32, 36]
    s2^2 = 26.8333 / 5 = 5.3667
where n is the sample size, x is the sample data, and s^2 is the sample variance.

Step 4: Calculate the F-statistic:


    F = s1^2 / s2^2
    F = 7.8095 / 5.3667
    F = 1.4552
Step 5: Find the critical F-values for the two-tailed test. With α/2 = 0.025 in each tail, the upper critical value is the quantile at 1 - α/2 and the lower critical value is the quantile at α/2. The degrees of freedom for the numerator and denominator are given by (n1 - 1) and (n2 - 1), respectively.


    df1 = n1 - 1 = 6
    df2 = n2 - 1 = 5
    F_upper = F.ppf(1 - α/2, df1, df2) = F.ppf(0.975, 6, 5) ≈ 6.978
    F_lower = F.ppf(α/2, df1, df2) = F.ppf(0.025, 6, 5) ≈ 0.167
where F.ppf is the percent point function (inverse of the cumulative distribution function) of the F-distribution in the scipy.stats module.

Step 6: Compare the F-statistic to the critical F-values. If the F-statistic is greater than F_upper or less than F_lower, reject the null hypothesis; otherwise, fail to reject the null hypothesis.


    F_lower < F < F_upper
    0.167 < 1.4552 < 6.978
Therefore, we fail to reject the null hypothesis at the 5% significance level. We do not have sufficient evidence to conclude that the variances of the two populations are significantly different.
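The hand calculation can be reproduced directly from the raw waiting times. This is a minimal sketch using NumPy and SciPy; the variable names are chosen here for illustration.

```python
import numpy as np
from scipy import stats

restaurant_a = np.array([24, 25, 28, 23, 22, 20, 27], dtype=float)
restaurant_b = np.array([31, 33, 35, 30, 32, 36], dtype=float)
alpha = 0.05

# Sample variances (ddof=1 divides by n - 1)
s1_sq = restaurant_a.var(ddof=1)
s2_sq = restaurant_b.var(ddof=1)
f_stat = s1_sq / s2_sq

df1 = len(restaurant_a) - 1
df2 = len(restaurant_b) - 1

# Two-tailed critical values
f_lower = stats.f.ppf(alpha / 2, df1, df2)
f_upper = stats.f.ppf(1 - alpha / 2, df1, df2)
reject = bool(f_stat < f_lower or f_stat > f_upper)

print(f"s1^2 = {s1_sq:.4f}, s2^2 = {s2_sq:.4f}, F = {f_stat:.4f}")
print(f"critical values = ({f_lower:.3f}, {f_upper:.3f})")
print("Reject H0" if reject else "Fail to reject H0")
```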

#### Q9. The following data represent the test scores of two groups of students: Group A: 80, 85, 90, 92, 87, 83; Group B: 75, 78, 82, 79, 81, 84. Conduct an F-test at the 1% significance level to determine if the variances are significantly different.

To conduct an F-test to determine if two population variances are significantly different, we need to compare the sample variances using the F-statistic and then compare it to the critical F-value at the desired significance level. Here's the step-by-step process:

Step 1: Define the null and alternative hypotheses. The null hypothesis is that the variances of the two populations are equal, while the alternative hypothesis is that they are not equal.


    H0: σ1^2 = σ2^2
    Ha: σ1^2 ≠ σ2^2
Step 2: Set the significance level α. In this case, α = 0.01.

Step 3: Calculate the sample variances for each group, dividing the sum of squared deviations from the mean by n - 1.


    n1 = 6, x1 = [80, 85, 90, 92, 87, 83]
    s1^2 = 98.8333 / 5 = 19.7667

    n2 = 6, x2 = [75, 78, 82, 79, 81, 84]
    s2^2 = 50.8333 / 5 = 10.1667
where n is the sample size, x is the sample data, and s^2 is the sample variance.

Step 4: Calculate the F-statistic:


    F = s1^2 / s2^2
    F = 19.7667 / 10.1667
    F = 1.9443
Step 5: Find the critical F-values for the two-tailed test. With α/2 = 0.005 in each tail, the upper critical value is the quantile at 1 - α/2 and the lower critical value is the quantile at α/2. The degrees of freedom for the numerator and denominator are given by (n1 - 1) and (n2 - 1), respectively.


    df1 = n1 - 1 = 5
    df2 = n2 - 1 = 5
    F_upper = F.ppf(1 - α/2, df1, df2) = F.ppf(0.995, 5, 5) ≈ 14.94
    F_lower = F.ppf(α/2, df1, df2) = F.ppf(0.005, 5, 5) ≈ 0.067
where F.ppf is the percent point function (inverse of the cumulative distribution function) of the F-distribution in the scipy.stats module.

Step 6: Compare the F-statistic to the critical F-values. If the F-statistic is greater than F_upper or less than F_lower, reject the null hypothesis; otherwise, fail to reject the null hypothesis.


    F_lower < F < F_upper
    0.067 < 1.9443 < 14.94
Therefore, we fail to reject the null hypothesis at the 1% significance level. We do not have sufficient evidence to conclude that the variances of the two populations are significantly different.
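An alternative to comparing against critical values is to compute a two-sided p-value and compare it with α. The sketch below takes that route for the test-score data; doubling the smaller tail probability is one common convention and is an assumption here, as are the variable names.

```python
import numpy as np
from scipy import stats

group_a = np.array([80, 85, 90, 92, 87, 83], dtype=float)
group_b = np.array([75, 78, 82, 79, 81, 84], dtype=float)
alpha = 0.01

# Sample variances (ddof=1 divides by n - 1)
s1_sq = group_a.var(ddof=1)
s2_sq = group_b.var(ddof=1)
f_stat = s1_sq / s2_sq

df1 = len(group_a) - 1
df2 = len(group_b) - 1

# Two-sided p-value: double the smaller tail probability, capped at 1
tail = min(stats.f.cdf(f_stat, df1, df2), stats.f.sf(f_stat, df1, df2))
p_value = min(1.0, 2.0 * tail)

print(f"s1^2 = {s1_sq:.4f}, s2^2 = {s2_sq:.4f}, F = {f_stat:.4f}")
print(f"two-sided p-value = {p_value:.4f}")
print("Reject H0" if p_value < alpha else "Fail to reject H0")
```

The decision from the p-value agrees with the critical-value approach: H0 is rejected exactly when the p-value falls below α.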