# Write a Python function that takes in two arrays of data and calculates the F-value for a variance ratio test. The function should return the F-value and the corresponding p-value for the test.


In [1]:
# The F-value for a variance ratio test is calculated by comparing the variances of two data sets. The formula for the 
# F-value is:
# F = s_1^2/s_2^2

# where s_1^2 is the variance of the first data set and s_2^2  is the variance of the second data set. The degrees of freedom
# for the two variances are n_1 - 1  and n_2 - 1 respectively, where n_1 is the number of observations in the
# first data set and n_2  is the number of observations in the second data set.

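# To calculate the p-value, we can use the cumulative distribution function (CDF) of the F-distribution, which is available
# in the scipy.stats module. The function below returns the one-sided (upper-tail) p-value, i.e. the probability of
# observing a ratio at least as large as F when the two population variances are equal.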


import numpy as np
from scipy.stats import f

def variance_ratio_test(data1, data2):
    n1 = len(data1)
    n2 = len(data2)
    
    var1 = np.var(data1, ddof=1)  # ddof=1 for unbiased variance estimator
    var2 = np.var(data2, ddof=1)
    
    F = var1 / var2
    df1 = n1 - 1
    df2 = n2 - 1
    
    p_value = 1 - f.cdf(F, df1, df2)  # one-sided (upper-tail) p-value
    
    return F, p_value

# Example usage
data1 = [10, 12, 15, 18, 20]
data2 = [8, 9, 11, 14, 16]

F_value, p_value = variance_ratio_test(data1, data2)
print("F-value:", F_value)
print("p-value:", p_value)


# In this example, replace data1 and data2 with your actual data sets. The function calculates the F-value and the
# corresponding p-value for the variance ratio test. The scipy.stats.f.cdf function is used to compute the cumulative 
# distribution function of the F-distribution.

F-value: 1.5044247787610618
p-value: 0.3509826325280263
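
As a hand check of the formula F = s_1^2/s_2^2, the sketch below recomputes the two sample variances for the example data using only the standard library. The helper name `sample_variance` is an illustrative assumption and is not part of the original function.

```python
def sample_variance(values):
    """Unbiased sample variance: sum of squared deviations divided by n - 1."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)

data1 = [10, 12, 15, 18, 20]
data2 = [8, 9, 11, 14, 16]

var1 = sample_variance(data1)  # sum of squared deviations is 68, divided by 4
var2 = sample_variance(data2)  # sum of squared deviations is 45.2, divided by 4
F = var1 / var2

print("s1^2:", var1)
print("s2^2:", var2)
print("F:", F)
```

The ratio agrees with the F-value printed above, which confirms that `np.var(..., ddof=1)` is computing the n - 1 version of the variance.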


# Given a significance level of 0.05 and the degrees of freedom for the numerator and denominator of an F-distribution, write a Python function that returns the critical F-value for a two-tailed test.


In [4]:
# In a two-tailed test at significance level alpha, each tail of the F-distribution holds alpha/2 of the probability. The
# upper critical F-value for the given numerator and denominator degrees of freedom is the value beyond which you would
# reject the null hypothesis when the larger variance is placed in the numerator.

# You can use the scipy.stats module to find the critical F-value with f.ppf (the percent point function, i.e. the
# inverse of the CDF). Here's the Python function:

from scipy.stats import f

def critical_f_value(significance_level, df1, df2):
    alpha = significance_level
    critical_value = f.ppf(1 - alpha/2, df1, df2)
    return critical_value

# Example usage
significance_level = 0.05
degrees_of_freedom_numerator = 3
degrees_of_freedom_denominator = 12

critical_value = critical_f_value(significance_level, degrees_of_freedom_numerator, degrees_of_freedom_denominator)
print("Critical F-value:", critical_value)

# In this example, replace degrees_of_freedom_numerator and degrees_of_freedom_denominator with your actual degrees of freedom
# values. The function calculates the critical F-value corresponding to a given significance level and degrees of freedom for
# the two-tailed test using the scipy.stats.f.ppf function.

Critical F-value: 4.474184809637748
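
A two-tailed rejection region also has a lower boundary, in addition to the upper one returned above. The following sketch assumes both boundaries are wanted and returns them as a pair; the function name `two_tailed_critical_values` is hypothetical.

```python
from scipy.stats import f

def two_tailed_critical_values(significance_level, df1, df2):
    """Return (lower, upper) critical F-values, with alpha/2 probability in each tail."""
    alpha = significance_level
    lower = f.ppf(alpha / 2, df1, df2)
    upper = f.ppf(1 - alpha / 2, df1, df2)
    return lower, upper

lower, upper = two_tailed_critical_values(0.05, 3, 12)
print("Lower critical F-value:", lower)
print("Upper critical F-value:", upper)

# Each tail should hold alpha/2 = 0.025 of the probability mass.
print("P(F < lower):", f.cdf(lower, 3, 12))
print("P(F > upper):", f.sf(upper, 3, 12))
```

The lower boundary only matters when the statistic can fall below 1, that is, when the smaller variance may end up in the numerator.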


# Write a Python program that generates random samples from two normal distributions with known variances and uses an F-test to determine if the variances are equal. The program should output the F-value, degrees of freedom, and p-value for the test.



In [5]:
# The program below generates random samples from two normal distributions with known variances and performs a
# two-sided F-test to determine if the variances are equal. It outputs the F-value, the degrees of freedom,
# and the p-value for the test.

import numpy as np
from scipy.stats import f

def f_test(data1, data2):
    var1 = np.var(data1, ddof=1)  # ddof=1 for unbiased variance estimator
    var2 = np.var(data2, ddof=1)
    
    n1 = len(data1)
    n2 = len(data2)
    
    F = var1 / var2
    df1 = n1 - 1
    df2 = n2 - 1
    
    p_value = 2 * min(f.cdf(F, df1, df2), 1 - f.cdf(F, df1, df2))
    
    return F, df1, df2, p_value

# Generate random samples from two normal distributions
np.random.seed(42)
data1 = np.random.normal(loc=0, scale=2, size=20)  # Mean=0, Variance=4
data2 = np.random.normal(loc=0, scale=2, size=25)  # Mean=0, Variance=4

F_value, degrees_of_freedom1, degrees_of_freedom2, p_value = f_test(data1, data2)

print("F-value:", F_value)
print("Degrees of Freedom (numerator):", degrees_of_freedom1)
print("Degrees of Freedom (denominator):", degrees_of_freedom2)
print("p-value:", p_value)


F-value: 1.0793045934279637
Degrees of Freedom (numerator): 19
Degrees of Freedom (denominator): 24
p-value: 0.8486276740085787
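
Because both samples above are drawn with the same variance, the test should reject in only about 5% of repeated experiments at the 5% level. The sketch below repeats the experiment many times to check this. The repetition count and the seed are arbitrary choices, and `np.random.default_rng` is used here in place of `np.random.seed`.

```python
import numpy as np
from scipy.stats import f

rng = np.random.default_rng(0)
n_reps, n1, n2, alpha = 2000, 20, 25, 0.05

# Each row is one simulated experiment with equal population variances (scale=2).
samples1 = rng.normal(loc=0, scale=2, size=(n_reps, n1))
samples2 = rng.normal(loc=0, scale=2, size=(n_reps, n2))

# Variance ratio and two-sided p-value for every experiment at once.
F = samples1.var(axis=1, ddof=1) / samples2.var(axis=1, ddof=1)
cdf = f.cdf(F, n1 - 1, n2 - 1)
p_values = 2 * np.minimum(cdf, 1 - cdf)  # same two-sided rule as in f_test

rejection_rate = np.mean(p_values < alpha)
print("Fraction of experiments rejecting H0:", rejection_rate)
```

A rejection fraction close to 0.05 indicates that the two-sided p-value is calibrated when the normality assumption holds.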


# The variances of two populations are known to be 10 and 15. A sample of 12 observations is taken from each population. Conduct an F-test at the 5% significance level to determine if the variances are significantly different.



In [6]:
# To conduct the F-test, we'll follow these steps, treating the two given values as sample variances:

# 1. Define the null hypothesis (H0) and the alternative hypothesis (H1):
# H0: The variances of the two populations are equal, σ1**2 = σ2**2
# H1: The variances of the two populations are different, σ1**2 != σ2**2

# 2. Calculate the F-statistic:

# The F-statistic is calculated as the ratio of the larger sample variance to the smaller sample variance. In this case, we have
# S1**2 = 10 and S2**2 = 15, so F = S2**2 / S1**2 = 15 / 10 = 1.5.

# 3. Determine the critical F-value:
# We'll use the significance level of 0.05 and the degrees of freedom for the two samples (n1 = n2 = 12, so df1 = df2 = 11).
# Since this is a two-tailed test with the larger variance in the numerator, we compare against the upper critical F-value
# at the 2.5% level.

# 4. Compare the calculated F-statistic with the critical F-value:

# If the calculated F-statistic is greater than the critical F-value, we reject the null hypothesis and conclude that the
# variances are significantly different.
# If the calculated F-statistic is not greater than the critical F-value, we fail to reject the null hypothesis and conclude
# that there is not enough evidence to suggest a difference in variances.

from scipy.stats import f

# Given data
variance1 = 10
variance2 = 15
sample_size = 12
significance_level = 0.05

# Calculate the F-statistic
F_statistic = variance2 / variance1

# Calculate degrees of freedom
df1 = sample_size - 1
df2 = sample_size - 1

# Calculate the critical F-value
critical_value = f.ppf(1 - significance_level/2, df1, df2)

# Conduct the F-test
if F_statistic > critical_value:
    result = "Reject null hypothesis"
else:
    result = "Fail to reject null hypothesis"

print("F-statistic:", F_statistic)
print("Degrees of Freedom (numerator):", df1)
print("Degrees of Freedom (denominator):", df2)
print("Critical F-value:", critical_value)
print("Result:", result)


F-statistic: 1.5
Degrees of Freedom (numerator): 11
Degrees of Freedom (denominator): 11
Critical F-value: 3.473699051085809
Result: Fail to reject null hypothesis
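
The same decision can be reached from a p-value instead of a critical value. This sketch assumes, as the solution above does, that 10 and 15 are treated as sample variances with 11 degrees of freedom each.

```python
from scipy.stats import f

F_statistic = 15 / 10
df1 = df2 = 12 - 1
significance_level = 0.05

# With equal degrees of freedom the median of the F-distribution is 1, so for F > 1
# the upper tail is the smaller one; doubling it gives the two-sided p-value.
p_value = 2 * f.sf(F_statistic, df1, df2)
critical_value = f.ppf(1 - significance_level / 2, df1, df2)

print("p-value:", p_value)
print("Reject by p-value:", p_value < significance_level)
print("Reject by critical value:", F_statistic > critical_value)
```

Both rules agree by construction: the p-value falls below the significance level exactly when F exceeds the upper critical value.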


# A manufacturer claims that the variance of the diameter of a certain product is 0.005. A sample of 25 products is taken, and the sample variance is found to be 0.006. Conduct an F-test at the 1% significance level to determine if the claim is justified.



In [7]:
from scipy.stats import chi2

# Given data
claimed_variance = 0.005
sample_variance = 0.006
sample_size = 25
significance_level = 0.01

# Calculate the F-statistic
F_statistic = sample_variance / claimed_variance

# Calculate degrees of freedom
df1 = sample_size - 1
df2 = float("inf")  # The claimed variance is a fixed constant, not an estimate from a sample

# Calculate the critical F-values. With infinite denominator degrees of freedom, the F-distribution
# reduces to a chi-square distribution divided by its degrees of freedom.
critical_lower = chi2.ppf(significance_level/2, df1) / df1
critical_upper = chi2.ppf(1 - significance_level/2, df1) / df1

# Conduct the F-test
if F_statistic < critical_lower or F_statistic > critical_upper:
    result = "Reject null hypothesis"
else:
    result = "Fail to reject null hypothesis"

print("F-statistic:", F_statistic)
print("Degrees of Freedom (numerator):", df1)
print("Degrees of Freedom (denominator):", df2)
print("Critical Lower F-value:", round(critical_lower, 4))
print("Critical Upper F-value:", round(critical_upper, 4))
print("Result:", result)


F-statistic: 1.2
Degrees of Freedom (numerator): 24
Degrees of Freedom (denominator): inf
Critical Lower F-value: 0.4119
Critical Upper F-value: 1.8983
Result: Fail to reject null hypothesis


# Write a Python function that takes in the degrees of freedom for the numerator and denominator of an F-distribution and calculates the mean and variance of the distribution. The function should return the mean and variance as a tuple.



In [8]:
def f_distribution_mean_variance(df1, df2):
    if df1 <= 0 or df2 <= 0:
        raise ValueError("Degrees of freedom must be greater than 0.")
    
    # The mean exists only for df2 > 2 and the variance only for df2 > 4.
    if df2 <= 4:
        raise ValueError("Degrees of freedom for the denominator must be greater than 4 for finite variance.")
    
    mean = df2 / (df2 - 2)
    variance = (2 * (df2 ** 2) * (df1 + df2 - 2)) / (df1 * (df2 - 2) ** 2 * (df2 - 4))
    
    return mean, variance

# Example usage
df1 = 3
df2 = 15

mean, variance = f_distribution_mean_variance(df1, df2)
print("Mean:", mean)
print("Variance:", variance)


Mean: 1.1538461538461537
Variance: 1.2910166756320602
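
The closed-form results can be cross-checked against SciPy, whose `f.stats` method reports the same two moments. A small sketch, with the formulas repeated inline so that it runs on its own:

```python
from scipy.stats import f

df1, df2 = 3, 15

# Closed-form mean (needs df2 > 2) and variance (needs df2 > 4) of the F-distribution.
mean = df2 / (df2 - 2)
variance = (2 * df2 ** 2 * (df1 + df2 - 2)) / (df1 * (df2 - 2) ** 2 * (df2 - 4))

# SciPy's own values for the mean ("m") and variance ("v").
scipy_mean, scipy_variance = f.stats(df1, df2, moments="mv")

print("Formula:", mean, variance)
print("SciPy:  ", float(scipy_mean), float(scipy_variance))
```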


# A random sample of 10 measurements is taken from a normal population with unknown variance. The sample variance is found to be 25. Another random sample of 15 measurements is taken from another normal population with unknown variance, and the sample variance is found to be 20. Conduct an F-test at the 10% significance level to determine if the variances are significantly different.




In [9]:
from scipy.stats import f

# Given data
sample_variance1 = 25
sample_variance2 = 20
sample_size1 = 10
sample_size2 = 15
significance_level = 0.10

# Calculate the F-statistic
F_statistic = sample_variance1 / sample_variance2

# Calculate degrees of freedom
df1 = sample_size1 - 1
df2 = sample_size2 - 1

# Calculate the critical F-values
critical_lower = f.ppf(significance_level/2, df1, df2)
critical_upper = f.ppf(1 - significance_level/2, df1, df2)

# Conduct the F-test
if F_statistic < critical_lower or F_statistic > critical_upper:
    result = "Reject null hypothesis"
else:
    result = "Fail to reject null hypothesis"

print("F-statistic:", F_statistic)
print("Degrees of Freedom (numerator):", df1)
print("Degrees of Freedom (denominator):", df2)
print("Critical Lower F-value:", critical_lower)
print("Critical Upper F-value:", critical_upper)
print("Result:", result)


F-statistic: 1.25
Degrees of Freedom (numerator): 9
Degrees of Freedom (denominator): 14
Critical Lower F-value: 0.3305268601412525
Critical Upper F-value: 2.6457907352338195
Result: Fail to reject null hypothesis
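
Several of these exercises start from summary statistics rather than raw data. The steps above can be collected into one helper, sketched here. The name `f_test_from_summary` and its return layout are assumptions made for illustration.

```python
from scipy.stats import f

def f_test_from_summary(var1, n1, var2, n2, significance_level):
    """Two-sided F-test for equal variances from sample variances and sample sizes."""
    F = var1 / var2
    df1, df2 = n1 - 1, n2 - 1
    cdf = f.cdf(F, df1, df2)
    p_value = 2 * min(cdf, 1 - cdf)  # double the smaller tail
    reject = bool(p_value < significance_level)
    return F, df1, df2, p_value, reject

F, df1, df2, p_value, reject = f_test_from_summary(25, 10, 20, 15, 0.10)
print("F-statistic:", F)
print("p-value:", p_value)
print("Result:", "Reject null hypothesis" if reject else "Fail to reject null hypothesis")
```

Rejecting when the two-sided p-value is below the significance level is equivalent to checking whether F falls outside the lower and upper critical values.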


# The following data represent the waiting times in minutes at two different restaurants on a Saturday night: Restaurant A: 24, 25, 28, 23, 22, 20, 27; Restaurant B: 31, 33, 35, 30, 32, 36. Conduct an F-test at the 5% significance level to determine if the variances are significantly different.



In [10]:
import numpy as np
from scipy.stats import f

# Data
data_a = np.array([24, 25, 28, 23, 22, 20, 27])
data_b = np.array([31, 33, 35, 30, 32, 36])

# Given data
significance_level = 0.05

# Calculate the sample variances
sample_variance_a = np.var(data_a, ddof=1)
sample_variance_b = np.var(data_b, ddof=1)

# Calculate the F-statistic
F_statistic = sample_variance_b / sample_variance_a

# Calculate degrees of freedom
df1 = len(data_b) - 1
df2 = len(data_a) - 1

# Calculate the critical F-values
critical_lower = f.ppf(significance_level/2, df1, df2)
critical_upper = f.ppf(1 - significance_level/2, df1, df2)

# Conduct the F-test
if F_statistic < critical_lower or F_statistic > critical_upper:
    result = "Reject null hypothesis"
else:
    result = "Fail to reject null hypothesis"

print("F-statistic:", F_statistic)
print("Degrees of Freedom (numerator):", df1)
print("Degrees of Freedom (denominator):", df2)
print("Critical Lower F-value:", critical_lower)
print("Critical Upper F-value:", critical_upper)
print("Result:", result)


F-statistic: 0.6871951219512196
Degrees of Freedom (numerator): 5
Degrees of Freedom (denominator): 6
Critical Lower F-value: 0.14331366118441086
Critical Upper F-value: 5.987565126046928
Result: Fail to reject null hypothesis
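
Which sample goes in the numerator does not change the outcome of a two-tailed test, because swapping the samples inverts both the statistic and the critical values. A quick sketch on the restaurant data (the helper `two_tailed_decision` is a hypothetical name):

```python
import numpy as np
from scipy.stats import f

def two_tailed_decision(numerator, denominator, significance_level):
    """Return (F, reject) for a two-tailed F-test with the given numerator sample."""
    F = np.var(numerator, ddof=1) / np.var(denominator, ddof=1)
    df1, df2 = len(numerator) - 1, len(denominator) - 1
    lower = f.ppf(significance_level / 2, df1, df2)
    upper = f.ppf(1 - significance_level / 2, df1, df2)
    return F, bool(F < lower or F > upper)

data_a = np.array([24, 25, 28, 23, 22, 20, 27])
data_b = np.array([31, 33, 35, 30, 32, 36])

F_ba, reject_ba = two_tailed_decision(data_b, data_a, 0.05)
F_ab, reject_ab = two_tailed_decision(data_a, data_b, 0.05)

print("F (B over A):", F_ba, "reject:", reject_ba)
print("F (A over B):", F_ab, "reject:", reject_ab)
```

The two statistics are reciprocals of each other and lead to the same decision.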


# The following data represent the test scores of two groups of students: Group A: 80, 85, 90, 92, 87, 83; Group B: 75, 78, 82, 79, 81, 84. Conduct an F-test at the 1% significance level to determine if the variances are significantly different.



In [11]:
import numpy as np
from scipy.stats import f

# Data
group_a_scores = np.array([80, 85, 90, 92, 87, 83])
group_b_scores = np.array([75, 78, 82, 79, 81, 84])

# Given data
significance_level = 0.01

# Calculate the sample variances
sample_variance_a = np.var(group_a_scores, ddof=1)
sample_variance_b = np.var(group_b_scores, ddof=1)

# Calculate the F-statistic
F_statistic = sample_variance_b / sample_variance_a

# Calculate degrees of freedom
df1 = len(group_b_scores) - 1
df2 = len(group_a_scores) - 1

# Calculate the critical F-values
critical_lower = f.ppf(significance_level/2, df1, df2)
critical_upper = f.ppf(1 - significance_level/2, df1, df2)

# Conduct the F-test
if F_statistic < critical_lower or F_statistic > critical_upper:
    result = "Reject null hypothesis"
else:
    result = "Fail to reject null hypothesis"

print("F-statistic:", F_statistic)
print("Degrees of Freedom (numerator):", df1)
print("Degrees of Freedom (denominator):", df2)
print("Critical Lower F-value:", critical_lower)
print("Critical Upper F-value:", critical_upper)
print("Result:", result)


F-statistic: 0.5143338954468801
Degrees of Freedom (numerator): 5
Degrees of Freedom (denominator): 5
Critical Lower F-value: 0.06693617195469603
Critical Upper F-value: 14.939605459912219
Result: Fail to reject null hypothesis
