### Q1. Write a Python function that takes in two arrays of data and calculates the F-value for a variance ratio test. The function should return the F-value and the corresponding p-value for the test.

In [11]:
import numpy as np
import scipy.stats as stat

# Create data
group1 = [23, 20, 26, 28, 5]
group2 = [25, 23, 26, 21, 23]

# Converting the list to an array
x = np.array(group1)
y = np.array(group2)


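def f_test(group1, group2):
    """Return the F-value and the upper-tail p-value of a variance ratio test."""
    group1 = np.asarray(group1)
    group2 = np.asarray(group2)

    # Ratio of the unbiased sample variances (ddof=1)
    f = np.var(group1, ddof=1) / np.var(group2, ddof=1)

    # Degrees of freedom come from the arguments, not from the globals x and y
    dfn = group1.size - 1
    dfd = group2.size - 1

    # Upper-tail p-value: P(F >= f) under the null hypothesis of equal variances
    p_value = 1 - stat.f.cdf(f, dfn, dfd)
    return f, p_value

# perform F-test
f_value, p_value = f_test(x, y)
print(f"F-value: {f_value:.4f}, p-value: {p_value:.4f}")

F-value: 21.9211, p-value: 0.0055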
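
The p-value above covers only the upper tail. When the question is whether the two variances differ in either direction, a two-sided p-value is the usual choice. The following is a minimal sketch under that assumption; the helper name `two_sided_f_test` is hypothetical and not part of the original notebook.

```python
import numpy as np
from scipy import stats


def two_sided_f_test(sample1, sample2):
    """Hypothetical helper: F statistic and two-sided p-value for equal variances."""
    a = np.asarray(sample1, dtype=float)
    b = np.asarray(sample2, dtype=float)

    # Ratio of unbiased sample variances
    f_value = np.var(a, ddof=1) / np.var(b, ddof=1)
    dfn, dfd = a.size - 1, b.size - 1

    # Double the smaller tail probability, capped at 1
    lower_tail = stats.f.cdf(f_value, dfn, dfd)
    upper_tail = stats.f.sf(f_value, dfn, dfd)
    p_value = min(1.0, 2 * min(lower_tail, upper_tail))
    return f_value, p_value


f_value, p_value = two_sided_f_test([23, 20, 26, 28, 5], [25, 23, 26, 21, 23])
print(f"F = {f_value:.4f}, two-sided p = {p_value:.4f}")
```

For this data the two-sided p-value is simply twice the upper-tail value, because the observed ratio lies far in the upper tail.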

### Q2. Given a significance level of 0.05 and the degrees of freedom for the numerator and denominator of an F-distribution, write a Python function that returns the critical F-value for a two-tailed test.

In [25]:
from scipy.stats import f

def critical_f_value(alpha, df1, df2):
    """
    Calculates the upper critical F-value for a two-tailed test.

    Parameters:
        alpha : Significance level
        df1 : Degrees of freedom for the numerator.
        df2 : Degrees of freedom for the denominator.
    """
    # A two-tailed test puts alpha / 2 in each tail
    tail_probability = alpha / 2
    critical_value = f.ppf(1 - tail_probability, df1, df2)
    return critical_value


In [10]:
alpha = 0.05
df1 = 3
df2 = 8

critical_f = critical_f_value(alpha, df1, df2)
print(f"Critical F-value: {critical_f:.2f}")

Critical F-value: 5.42
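
A two-tailed test has a rejection region in each tail, so it can help to look at both critical values. The sketch below is illustrative only: `two_tailed_critical_values` is a hypothetical name, and it assumes `alpha` is split equally between the tails. It also checks the reciprocal identity between the lower critical value and the upper critical value with the degrees of freedom swapped.

```python
from scipy.stats import f


def two_tailed_critical_values(alpha, df1, df2):
    """Hypothetical helper: (lower, upper) critical values with alpha / 2 per tail."""
    lower = f.ppf(alpha / 2, df1, df2)
    upper = f.ppf(1 - alpha / 2, df1, df2)
    return lower, upper


lower, upper = two_tailed_critical_values(0.05, 3, 8)

# Identity: lower critical value for (df1, df2) equals
# 1 / upper critical value for (df2, df1)
swapped_upper = f.ppf(1 - 0.05 / 2, 8, 3)

print(f"lower: {lower:.4f}, upper: {upper:.4f}, 1/swapped upper: {1 / swapped_upper:.4f}")
```

The null hypothesis is rejected when the F statistic falls below `lower` or above `upper`.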


### Q3. Write a Python program that generates random samples from two normal distributions with known variances and uses an F-test to determine if the variances are equal. The program should output the F-value, degrees of freedom, and p-value for the test.

In [29]:
np.random.seed(42)

# Generate two normally distributed samples with the same known variance (5**2 = 25).
sample1 = np.random.normal(loc=10, scale=5, size=10)
sample2 = np.random.normal(loc=10, scale=5, size=10)

# Unbiased sample variances (ddof=1)
var1 = np.var(sample1, ddof=1)
var2 = np.var(sample2, ddof=1)
alpha_value = 0.05

# Step 1: Formulate the hypotheses.

print("""Null: The two groups have equal variance, i.e. σ1² = σ2².
Alternative: The two groups have unequal variance, i.e. σ1² != σ2².\n""")

# Step 2: Calculate the test statistic.

dof1 = len(sample1) - 1
dof2 = len(sample2) - 1
f_value = var1 / var2

# Two-tailed p-value, because the alternative hypothesis is two-sided
lower_tail = stat.f.cdf(f_value, dof1, dof2)
p_value = 2 * min(lower_tail, 1 - lower_tail)
print(f"F-value: {f_value:.4f}, degrees of freedom: {dof1} and {dof2}, p-value: {p_value:.4f}")

# Step 3: Evaluate the result.

if p_value > alpha_value:
    print('\nWe fail to reject the null hypothesis, i.e. there is no evidence that the variances differ.')
else:
    print('\nWe reject the null hypothesis, i.e. the variances of the two groups differ.')

Null: The two groups have equal variance, i.e. σ1² = σ2².
Alternative: The two groups have unequal variance, i.e. σ1² != σ2².

F-value: 0.9163, degrees of freedom: 9 and 9, p-value: 0.8985

We fail to reject the null hypothesis, i.e. there is no evidence that the variances differ.
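
Because both samples are drawn with the same variance, the null hypothesis is true by construction, so a test at the 5% level should reject in roughly 5% of repeated experiments. The simulation below is a rough sanity check of that behaviour, not part of the original exercise; the number of trials and the seed are arbitrary assumptions.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)  # arbitrary seed, chosen only for reproducibility
n_trials, n = 5000, 10
alpha = 0.05

# Each row is one simulated experiment with two samples of size n
s1 = rng.normal(loc=10, scale=5, size=(n_trials, n))
s2 = rng.normal(loc=10, scale=5, size=(n_trials, n))

# F statistic and two-sided p-value for every experiment
f_values = s1.var(axis=1, ddof=1) / s2.var(axis=1, ddof=1)
lower_tail = stats.f.cdf(f_values, n - 1, n - 1)
p_values = 2 * np.minimum(lower_tail, 1 - lower_tail)

rejection_rate = np.mean(p_values < alpha)
print(f"Rejection rate under the null hypothesis: {rejection_rate:.3f}")
```

A rejection rate close to `alpha` suggests the test is calibrated for normally distributed data.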


### Q4. The variances of two populations are known to be 10 and 15. A sample of 12 observations is taken from each population. Conduct an F-test at the 5% significance level to determine if the variances are significantly different.

In [15]:
var1 = 10
var2 = 15

# Hypotheses:
# Null hypothesis - the two variances are equal
# Alternative hypothesis - the two variances are different

# Put the larger variance in the numerator so only the upper tail needs checking.
# Both samples have the same size, so the degrees of freedom are unaffected.
f_stat = max(var1, var2) / min(var1, var2)

# Degrees of freedom (12 observations from each population)
df1 = 12 - 1
df2 = 12 - 1
significance_value = 0.05

# Two-tailed test: alpha / 2 in the upper tail
critical_value = stat.f.ppf(q=1 - significance_value / 2, dfn=df1, dfd=df2)

print(f"critical_value: {critical_value:.2f}, f_stat: {f_stat:.2f}")

critical_value: 3.47, f_stat: 1.50


In [17]:
if f_stat > critical_value:
    print("We reject the null hypothesis")
else:
    print("We fail to reject the null hypothesis")

We fail to reject the null hypothesis


**Result - Therefore, we cannot conclude that the variances of the two populations are significantly different based on the given data.**

### Q5. A manufacturer claims that the variance of the diameter of a certain product is 0.005. A sample of 25 products is taken, and the sample variance is found to be 0.006. Conduct an F-test at the 1% significance level to determine if the claim is justified.

In [18]:


# Hypotheses:
# Null hypothesis (H₀): the population variance equals the claimed value, σ² = 0.005
# Alternative hypothesis (H₁): the population variance exceeds the claim, σ² > 0.005

sample_var = 0.006
claimed_var = 0.005
n = 25

f_stat = sample_var / claimed_var

# The claimed variance is a fixed number, not an estimate, so only the numerator
# has finite degrees of freedom (n - 1). The ratio then follows chi2(df) / df,
# which is the limiting F(df, infinity) distribution.
df = n - 1
significance_value = 0.01

critical_value = stat.chi2.ppf(q=1 - significance_value, df=df) / df

print(f"critical_value: {critical_value:.2f}, f_stat: {f_stat:.2f}")

critical_value: 1.79, f_stat: 1.20


In [19]:
if f_stat > critical_value:
    print("We reject the null hypothesis")
else:
    print("We fail to reject the null hypothesis")

We fail to reject the null hypothesis


**Result - At the 1% significance level, the sample gives no evidence against the claimed variance of 0.005.**
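
A claimed variance is a fixed number rather than a second sample estimate, so the comparison can also be written as a chi-square test for a single variance, with statistic (n - 1)·s² / σ₀². The sketch below assumes an upper-tailed alternative (the true variance exceeds the claim); the variable names are illustrative.

```python
from scipy import stats

claimed_var = 0.005  # variance stated by the manufacturer
sample_var = 0.006   # variance observed in the sample
n = 25               # sample size
alpha = 0.01

# Chi-square statistic with n - 1 degrees of freedom
chi2_stat = (n - 1) * sample_var / claimed_var

# Upper-tail p-value: P(chi2 >= chi2_stat) under the null hypothesis
p_value = stats.chi2.sf(chi2_stat, df=n - 1)

print(f"chi-square statistic: {chi2_stat:.2f}, p-value: {p_value:.3f}")
print("Reject H0" if p_value < alpha else "Fail to reject H0")
```

Dividing the chi-square statistic by its degrees of freedom gives back the variance ratio 0.006 / 0.005, so both views lead to the same decision.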

### Q6. Write a Python function that takes in the degrees of freedom for the numerator and denominator of an F-distribution and calculates the mean and variance of the distribution. The function should return the mean and variance as a tuple.

In [26]:
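def m_v_cal(dfn, dfd):
    """Return the (mean, variance) of an F-distribution with dfn and dfd degrees of freedom."""
    # Closed-form moments: the mean requires dfd > 2 and the variance dfd > 4;
    # NaN is returned when the requirement is not met.
    mean = dfd / (dfd - 2) if dfd > 2 else float("nan")
    variance = (
        2 * dfd**2 * (dfn + dfd - 2) / (dfn * (dfd - 2)**2 * (dfd - 4))
        if dfd > 4 else float("nan")
    )
    return mean, variance

In [27]:
mean, variance = m_v_cal(dfn = 10, dfd = 8)
print(f"mean is: {mean:.4f} and variance is: {variance:.4f}")

mean is: 1.3333 and variance is: 1.4222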
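
The mean and variance of an F-distribution have closed forms: the mean is dfd / (dfd - 2) for dfd > 2, and the variance is 2·dfd²·(dfn + dfd - 2) / (dfn·(dfd - 2)²·(dfd - 4)) for dfd > 4. As a cross-check, the sketch below compares these formulas with the moments reported by `scipy.stats.f.stats`; the variable names are illustrative.

```python
from scipy import stats

dfn, dfd = 10, 8

# Closed-form moments (valid here because dfd > 4)
mean_formula = dfd / (dfd - 2)
var_formula = 2 * dfd**2 * (dfn + dfd - 2) / (dfn * (dfd - 2)**2 * (dfd - 4))

# Moments computed by SciPy for the same distribution
mean_scipy, var_scipy = stats.f.stats(dfn, dfd, moments="mv")

print(f"formula: mean={mean_formula:.4f}, variance={var_formula:.4f}")
print(f"scipy:   mean={float(mean_scipy):.4f}, variance={float(var_scipy):.4f}")
```

Averaging a handful of random draws only approximates these values, which is why a small simulated sample can land far from the theoretical moments.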


### Q7. A random sample of 10 measurements is taken from a normal population with unknown variance. The sample variance is found to be 25. Another random sample of 15 measurements is taken from another normal population with unknown variance, and the sample variance is found to be 20. Conduct an F-test at the 10% significance level to determine if the variances are significantly different.

In [28]:
from scipy.stats import f

def conduct_f_test(sample_var1, sample_var2, sample_size1, sample_size2, alpha):
    # Calculate the degrees of freedom
    df1 = sample_size1 - 1
    df2 = sample_size2 - 1

    # Calculate the F-statistic
    f_statistic = sample_var1 / sample_var2

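    # Calculate the lower and upper critical values (alpha / 2 in each tail).
    # 1 / upper is only the lower critical value when df1 == df2, so compute it directly.
    lower_critical = f.ppf(alpha / 2, df1, df2)
    upper_critical = f.ppf(1 - alpha / 2, df1, df2)

    # Compare the F-statistic with the critical values
    if f_statistic > upper_critical or f_statistic < lower_critical: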
        print("Reject the null hypothesis.")
        print("The variances are significantly different.")
    else:
        print("Fail to reject the null hypothesis.")
        print("The variances are not significantly different.")

# Sample 1 information
sample_var1 = 25
sample_size1 = 10

# Sample 2 information
sample_var2 = 20
sample_size2 = 15

# Significance level
alpha = 0.10

# Perform the F-test
conduct_f_test(sample_var1, sample_var2, sample_size1, sample_size2, alpha)


Fail to reject the null hypothesis.
The variances are not significantly different.


### Q8. The following data represent the waiting times in minutes at two different restaurants on a Saturday night. Restaurant A: 24, 25, 28, 23, 22, 20, 27; Restaurant B: 31, 33, 35, 30, 32, 36. Conduct an F-test at the 5% significance level to determine if the variances are significantly different.

In [23]:
restA = [24, 25, 28, 23, 22, 20, 27]
restB = [31, 33, 35, 30, 32, 36]

# Hypotheses:
# Null hypothesis - the variances of the two populations are equal
# Alternative hypothesis - the variances are different

# Unbiased sample variances (ddof=1). Restaurant A has the larger one,
# so it goes in the numerator.
f_stat = np.var(restA, ddof=1) / np.var(restB, ddof=1)

# Degrees of freedom (7 and 6 observations)
df1 = len(restA) - 1
df2 = len(restB) - 1
significance_value = 0.05

# Two-tailed test: alpha / 2 in the upper tail
critical_value = stat.f.ppf(q=1 - significance_value / 2, dfn=df1, dfd=df2)

print(f"critical_value: {critical_value:.2f}, f_stat: {f_stat:.2f}")

critical_value: 6.98, f_stat: 1.46


In [24]:
if f_stat > critical_value:
    print("We reject the null hypothesis, i.e. the variances are significantly different")
else:
    print("We fail to reject the null hypothesis, i.e. the variances are not significantly different")

We fail to reject the null hypothesis, i.e. the variances are not significantly different
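
`np.var` computes the population variance (`ddof=0`) by default, while the F-test is defined on unbiased sample variances (`ddof=1`). With equal sample sizes the correction cancels in the ratio, but with 7 and 6 observations it does not. The short check below illustrates the difference on the restaurant data; the variable names are illustrative.

```python
import numpy as np

rest_a = [24, 25, 28, 23, 22, 20, 27]
rest_b = [31, 33, 35, 30, 32, 36]

# Ratio of population variances (NumPy default, ddof=0)
ratio_population = np.var(rest_a) / np.var(rest_b)

# Ratio of unbiased sample variances (ddof=1), as the F-test requires
ratio_sample = np.var(rest_a, ddof=1) / np.var(rest_b, ddof=1)

print(f"ddof=0 ratio: {ratio_population:.4f}")  # 1.4968
print(f"ddof=1 ratio: {ratio_sample:.4f}")      # 1.4552
```

The two ratios differ by the factor (7/6) / (6/5), which is why the choice of `ddof` matters whenever the sample sizes are unequal.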


### Q9. The following data represent the test scores of two groups of students. Group A: 80, 85, 90, 92, 87, 83; Group B: 75, 78, 82, 79, 81, 84. Conduct an F-test at the 1% significance level to determine if the variances are significantly different.

In [20]:
groupA = [80, 85, 90, 92, 87, 83]
groupB = [75, 78, 82, 79, 81, 84]

# Hypotheses:
# Null hypothesis - the variances of the two populations are equal
# Alternative hypothesis - the variances are different

# Unbiased sample variances (ddof=1). Group A has the larger one,
# so it goes in the numerator.
f_stat = np.var(groupA, ddof=1) / np.var(groupB, ddof=1)

# Degrees of freedom (6 observations in each group)
df1 = len(groupA) - 1
df2 = len(groupB) - 1
significance_value = 0.01

# Two-tailed test: alpha / 2 in the upper tail
critical_value = stat.f.ppf(q=1 - significance_value / 2, dfn=df1, dfd=df2)

print(f"critical_value: {critical_value:.2f}, f_stat: {f_stat:.2f}")

critical_value: 14.94, f_stat: 1.94


In [22]:
if f_stat > critical_value:
    print("We reject the null hypothesis, i.e. the variances are significantly different")
else:
    print("We fail to reject the null hypothesis, i.e. the variances are not significantly different")

We fail to reject the null hypothesis, i.e. the variances are not significantly different
