### Q1. Write a Python function that takes in two arrays of data and calculates the F-value for a variance ratio test. The function should return the F-value and the corresponding p-value for the test.

In [1]:
import numpy as np
from scipy.stats import f

def f_test(x, y):
    n_x = len(x)
    n_y = len(y)
    var_x = np.var(x, ddof=1)
    var_y = np.var(y, ddof=1)
    
    if var_x > var_y:
        f_value = var_x / var_y
        dfn = n_x - 1
        dfd = n_y - 1
    else:
        f_value = var_y / var_x
        dfn = n_y - 1
        dfd = n_x - 1
        
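    # Two-tailed p-value: double the upper-tail area, capped at 1
    # (doubling can exceed 1 when the F-value is close to 1 and dfn != dfd)
    p_value = min(1.0, 2 * f.sf(f_value, dfn, dfd))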
    
    return f_value, p_value

### Q2. Given a significance level of 0.05 and the degrees of freedom for the numerator and denominator of an F-distribution, write a Python function that returns the critical F-value for a two-tailed test.

In [2]:
from scipy.stats import f

def critical_f(num_df, denom_df):
    alpha = 0.05
    return f.ppf(alpha/2, num_df, denom_df), f.ppf(1-alpha/2, num_df, denom_df)


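The ppf() function in scipy.stats.f is the percent-point function, the inverse of the cumulative distribution function (CDF) of the F-distribution: for a probability q, it returns the value below which a fraction q of the distribution lies. With alpha = 0.05, the two calls return the 2.5th and 97.5th percentiles, which together enclose the central 95% of the distribution. The null hypothesis of equal variances is rejected when the F-statistic falls outside this interval.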
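As a quick sanity check on the description above, the following sketch confirms that the two critical values enclose 95% of the distribution and that the lower value is the reciprocal of the upper value with the degrees of freedom swapped. The degrees of freedom (10 and 20) are illustrative assumptions, not values from the question.

```python
from scipy.stats import f

alpha = 0.05
num_df, denom_df = 10, 20  # illustrative degrees of freedom (assumed, not from the question)

# Lower and upper critical values for a two-tailed test
lower = f.ppf(alpha / 2, num_df, denom_df)
upper = f.ppf(1 - alpha / 2, num_df, denom_df)

# cdf is the inverse of ppf, so the area between the two values is 1 - alpha
enclosed = f.cdf(upper, num_df, denom_df) - f.cdf(lower, num_df, denom_df)

# Reciprocal property: the lower critical value of F(d1, d2) equals
# 1 / (upper critical value of F(d2, d1))
reciprocal = 1 / f.ppf(1 - alpha / 2, denom_df, num_df)

print("Lower critical value:", lower)
print("Upper critical value:", upper)
print("Area enclosed:", enclosed)
print("Reciprocal check:", reciprocal)
```

The reciprocal property is why printed F-tables usually list only upper-tail values.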

### Q3. Write a Python program that generates random samples from two normal distributions with known variances and uses an F-test to determine if the variances are equal. The program should output the F-value, degrees of freedom, and p-value for the test.

In [3]:
import numpy as np
from scipy.stats import f

# Parameters for the normal distributions
mu1, mu2 = 0, 0
var1, var2 = 10, 10
n1, n2 = 30, 30  # Sample sizes

# Generate random samples from the normal distributions
sample1 = np.random.normal(mu1, np.sqrt(var1), n1)
sample2 = np.random.normal(mu2, np.sqrt(var2), n2)

# Calculate the F-value and p-value for the F-test
f_value, p_value = f_test(sample1, sample2)

print("F-value:", f_value)
print("Degrees of freedom:", len(sample1)-1, len(sample2)-1)
print("p-value:", p_value)

F-value: 1.128193255216913
Degrees of freedom: 29 29
p-value: 0.747537072174733


### Q4. The variances of two populations are known to be 10 and 15. A sample of 12 observations is taken from each population. Conduct an F-test at the 5% significance level to determine if the variances are significantly different.

To conduct an F-test to determine if the variances of two populations are significantly different, we need to calculate the F-statistic and compare it to the critical value from the F-distribution. Here are the steps to perform the F-test:

    State the null and alternative hypotheses:
        Null hypothesis: The variances of the two populations are equal.
        Alternative hypothesis: The variances of the two populations are significantly different.

    Determine the significance level and degrees of freedom:
        Significance level: 5%
        Degrees of freedom for the numerator (dfn): sample size of population 1 minus 1 = 11
        Degrees of freedom for the denominator (dfd): sample size of population 2 minus 1 = 11

    Calculate the F-statistic:
        F = variance of population 1 / variance of population 2 = 10 / 15 = 0.67

    Find the critical values from the F-distribution table:
        The test is two-tailed, so each tail gets 2.5%. With dfn=11 and dfd=11, the upper critical value is approximately 3.47, and the lower critical value is its reciprocal, approximately 0.29.

    Compare the F-statistic with the critical values:
        Since 0.29 < 0.67 < 3.47, the F-statistic lies between the two critical values.

    Make a decision and interpret the results:
        Since the F-statistic does not fall in either rejection region, we fail to reject the null hypothesis. There is no significant difference between the variances of the two populations at the 5% significance level.
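The critical values can also be computed instead of read from a table. The sketch below follows the steps above and assumes, as the worked answer does, that the two given variances are treated like sample variances with 11 degrees of freedom each; the variable names are illustrative.

```python
from scipy.stats import f

var1, var2 = 10, 15   # variances given in the question
n1 = n2 = 12          # sample sizes
alpha = 0.05

dfn, dfd = n1 - 1, n2 - 1
f_stat = var1 / var2

# Two-tailed test: alpha is split equally between the two tails
lower_crit = f.ppf(alpha / 2, dfn, dfd)
upper_crit = f.ppf(1 - alpha / 2, dfn, dfd)

# Reject only if the statistic falls outside [lower_crit, upper_crit]
reject = f_stat < lower_crit or f_stat > upper_crit

print("F-statistic:", f_stat)
print("Critical values:", lower_crit, upper_crit)
print("Reject null hypothesis:", reject)
```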

### Q5. A manufacturer claims that the variance of the diameter of a certain product is 0.005. A sample of 25 products is taken, and the sample variance is found to be 0.006. Conduct an F-test at the 1% significance level to determine if the claim is justified.


State the null and alternative hypotheses:

    Null hypothesis: The population variance of the diameter of the product is equal to 0.005.
    Alternative hypothesis: The population variance of the diameter of the product is greater than 0.005.

Determine the significance level and degrees of freedom:

    Significance level: 1%
    Degrees of freedom for the numerator (dfn): sample size minus 1 = 24
    Degrees of freedom for the denominator (dfd): infinite, because the claimed variance of 0.005 is a fixed value rather than an estimate from a second sample

Calculate the F-statistic:

    F = sample variance / claimed population variance = 0.006 / 0.005 = 1.2

Find the critical value from the F-distribution table:

    From the F-distribution table with dfn=24 and infinite dfd at the 1% significance level (upper tail), the critical value is approximately 1.79. Equivalently, 24 x 1.2 = 28.8 can be compared with the chi-square critical value of approximately 42.98 for 24 degrees of freedom.

Compare the F-statistic with the critical value:

    Since 1.2 < 1.79, the F-statistic does not exceed the critical value.

Make a decision and interpret the results:

    Since the F-statistic is not greater than the critical value, we fail to reject the null hypothesis. The sample does not show that the population variance of the diameter is significantly greater than 0.005 at the 1% significance level, so there is no evidence that the manufacturer's claim is unjustified.
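Because the claimed variance is a fixed number rather than an estimate from a second sample, this kind of comparison is commonly run as a chi-square test on (n - 1) * s^2 / sigma0^2. The sketch below shows that formulation as a cross-check on the decision; the variable names are illustrative.

```python
from scipy.stats import chi2

sigma0_sq = 0.005   # variance claimed by the manufacturer
s_sq = 0.006        # observed sample variance
n = 25
alpha = 0.01

df = n - 1
# Under the null hypothesis, (n - 1) * s^2 / sigma0^2 follows a chi-square
# distribution with n - 1 degrees of freedom (assuming normal data)
chi2_stat = df * s_sq / sigma0_sq

# One-sided (upper-tail) test: is the true variance greater than claimed?
chi2_crit = chi2.ppf(1 - alpha, df)
p_value = chi2.sf(chi2_stat, df)
reject = chi2_stat > chi2_crit

print("Chi-square statistic:", chi2_stat)
print("Critical value:", chi2_crit)
print("p-value:", p_value)
print("Reject null hypothesis:", reject)
```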

### Q6. Write a Python function that takes in the degrees of freedom for the numerator and denominator of an F-distribution and calculates the mean and variance of the distribution. The function should return the mean and variance as a tuple.

In [4]:
import numpy as np

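def f_dist_mean_var(df1, df2):
    # The mean is defined only for df2 > 2 and the variance only for df2 > 4
    if df2 <= 4:
        raise ValueError("df2 must be greater than 4 for the variance to exist")

    mean = df2 / (df2 - 2)
    var = (2 * (df2**2) * (df1 + df2 - 2)) / (df1 * (df2 - 2)**2 * (df2 - 4))

    return mean, var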

mean, var = f_dist_mean_var(10, 20)
print("Mean:", mean)
print("Variance:", var)

Mean: 1.1111111111111112
Variance: 0.43209876543209874
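One way to check the closed-form expressions is to compare them with the moments reported by scipy.stats.f.stats. This is a minimal sketch that recomputes the formulas inline for the same degrees of freedom used above.

```python
from scipy.stats import f

df1, df2 = 10, 20

# Closed-form mean (valid for df2 > 2) and variance (valid for df2 > 4)
mean_closed = df2 / (df2 - 2)
var_closed = (2 * df2**2 * (df1 + df2 - 2)) / (df1 * (df2 - 2)**2 * (df2 - 4))

# Mean and variance as computed by SciPy
mean_scipy, var_scipy = f.stats(df1, df2, moments="mv")

print("Closed form:", mean_closed, var_closed)
print("SciPy:      ", float(mean_scipy), float(var_scipy))
```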


### Q7. A random sample of 10 measurements is taken from a normal population with unknown variance. The sample variance is found to be 25. Another random sample of 15 measurements is taken from another normal population with unknown variance, and the sample variance is found to be 20. Conduct an F-test at the 10% significance level to determine if the variances are significantly different.

In [5]:
import numpy as np
from scipy.stats import f

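s1 = 25  # sample variance of the first sample
s2 = 20  # sample variance of the second sample
n1 = 10
n2 = 15
df1 = n1 - 1
df2 = n2 - 1
F = s1/s2  # larger sample variance in the numerator
p_value = 2 * (1 - f.cdf(F, df1, df2))  # two-tailed p-value
alpha = 0.10
if p_value < alpha:
    print("Reject null hypothesis. Variances are significantly different.")
else:
    print("Fail to reject null hypothesis. No significant difference between the variances.")
    
print("F-value:", F)
print("Degrees of freedom:", df1, df2)
print("p-value:", p_value)


Fail to reject null hypothesis. No significant difference between the variances.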
F-value: 1.25
Degrees of freedom: 9 14
p-value: 0.6832194382585954


### Q8. The following data represent the waiting times in minutes at two different restaurants on a Saturday night: Restaurant A: 24, 25, 28, 23, 22, 20, 27; Restaurant B: 31, 33, 35, 30, 32, 36. Conduct an F-test at the 5% significance level to determine if the variances are significantly different.

* Null hypothesis: The variances of waiting times at the two restaurants are equal.

* Alternative hypothesis: The variances of waiting times at the two restaurants are not equal.


    Restaurant A: n1 = 7, s1^2 = 7.8095
    
    Restaurant B: n2 = 6, s2^2 = 5.3667

    Calculate the F-value using the formula: F = s1^2 / s2^2
    
    F = 7.8095 / 5.3667 = 1.4552

    Calculate the critical F-value using the significance level and degrees of freedom:

    Significance level = 0.05 (two-tailed, so 0.025 in each tail)
    
    Degrees of freedom for numerator = n1 - 1 = 6
    
    Degrees of freedom for denominator = n2 - 1 = 5
    
    Critical F-value = finv(0.975, 6, 5) ≈ 6.98, where finv() is the inverse CDF of the F-distribution.

    Compare the F-value and critical F-value:
    
    F-value = 1.4552
    
    Critical F-value ≈ 6.98
    
    Since the larger variance is in the numerator and the F-value is less than the upper critical F-value, we fail to reject the null hypothesis.

    Interpret the results:
    
    There is not enough evidence to conclude that the variance of waiting times at Restaurant A is significantly different from the variance of waiting times at Restaurant B at the 5% significance level.
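The hand calculation can be reproduced directly from the raw waiting times. The following is a minimal sketch under the same setup (sample variances with ddof=1, two-tailed test at the 5% level); the variable names are illustrative.

```python
import numpy as np
from scipy.stats import f

restaurant_a = np.array([24, 25, 28, 23, 22, 20, 27])
restaurant_b = np.array([31, 33, 35, 30, 32, 36])
alpha = 0.05

# Unbiased sample variances (ddof=1 divides by n - 1)
var_a = np.var(restaurant_a, ddof=1)
var_b = np.var(restaurant_b, ddof=1)

f_stat = var_a / var_b
dfn, dfd = len(restaurant_a) - 1, len(restaurant_b) - 1

# Upper critical value for a two-tailed test (alpha / 2 in each tail)
upper_crit = f.ppf(1 - alpha / 2, dfn, dfd)

# Two-tailed p-value: double the smaller tail area, capped at 1
p_value = min(1.0, 2 * min(f.cdf(f_stat, dfn, dfd), f.sf(f_stat, dfn, dfd)))

print("Sample variances:", var_a, var_b)
print("F-value:", f_stat)
print("Upper critical F-value:", upper_crit)
print("p-value:", p_value)
```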

### Q9. The following data represent the test scores of two groups of students: Group A: 80, 85, 90, 92, 87, 83; Group B: 75, 78, 82, 79, 81, 84. Conduct an F-test at the 1% significance level to determine if the variances are significantly different.

In [6]:
import numpy as np
from scipy.stats import f

group_a = np.array([80, 85, 90, 92, 87, 83])
group_b = np.array([75, 78, 82, 79, 81, 84])

var_a = np.var(group_a, ddof=1)
var_b = np.var(group_b, ddof=1)

f_value = var_a / var_b

# degrees of freedom for the F-test
dfn = len(group_a) - 1
dfd = len(group_b) - 1

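# calculate the upper critical F-value for a two-tailed test at the 1% significance level
# (alpha is split across both tails; var_a > var_b here, so only the upper tail matters)
alpha = 0.01
f_crit = f.ppf(q=1-alpha/2, dfn=dfn, dfd=dfd)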

# compare the calculated F-value with the critical F-value
if f_value > f_crit:
    print("Reject null hypothesis: variances are significantly different.")
else:
    print("Fail to reject null hypothesis: variances are not significantly different.")


Fail to reject null hypothesis: variances are not significantly different.
