### Q1. Write a Python function that takes in two arrays of data and calculates the F-value for a variance ratio test. The function should return the F-value and the corresponding p-value for the test.

In [1]:
import scipy.stats as stats

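def calculate_f_value(data1, data2):
    # Variance ratio test: F is the ratio of the two sample variances (ddof=1)
    f_statistic = stats.tvar(data1) / stats.tvar(data2)

    # Degrees of freedom for the numerator and denominator
    df1 = len(data1) - 1
    df2 = len(data2) - 1

    # Two-tailed p-value: twice the smaller of the two tail probabilities
    lower_tail = stats.f.cdf(f_statistic, df1, df2)
    upper_tail = stats.f.sf(f_statistic, df1, df2)
    p_value = 2 * min(lower_tail, upper_tail)

    return f_statistic, p_value

# Example usage:
data_group1 = [10, 12, 15, 18, 20]
data_group2 = [8, 11, 14, 16, 19]

f_value, p_value = calculate_f_value(data_group1, data_group2)
print(f"F-value: {f_value:.4f}")
print(f"P-value: {p_value:.4f}")


F-value: 0.9290
P-value: 0.9448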


### Q2. Given a significance level of 0.05 and the degrees of freedom for the numerator and denominator of an F-distribution, write a Python function that returns the critical F-value for a two-tailed test.

In [2]:
import scipy.stats as stats

def calculate_critical_f_value(alpha, df_num, df_denom):
    # Upper critical F-value for a two-tailed test (alpha / 2 in the upper tail)
    critical_f_value = stats.f.ppf(1 - alpha / 2, df_num, df_denom)
    
    return critical_f_value

# Example usage:
alpha = 0.05
df_num = 3  # degrees of freedom for the numerator
df_denom = 20  # degrees of freedom for the denominator

critical_f_value = calculate_critical_f_value(alpha, df_num, df_denom)
print(f"Critical F-value: {critical_f_value}")


Critical F-value: 3.8586986662732143
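
A two-tailed test has a rejection region in each tail, so it can help to report the lower critical value alongside the upper one. The sketch below is one way to do that; the helper name `two_tailed_critical_f_values` is an assumption made for illustration and is not part of the original answer. It also checks the reciprocal identity between the lower quantile of F(df_num, df_denom) and the upper quantile of F(df_denom, df_num).

```python
import scipy.stats as stats

def two_tailed_critical_f_values(alpha, df_num, df_denom):
    """Return (lower, upper) critical values leaving alpha / 2 in each tail."""
    lower = stats.f.ppf(alpha / 2, df_num, df_denom)
    upper = stats.f.ppf(1 - alpha / 2, df_num, df_denom)
    return lower, upper

alpha = 0.05
df_num = 3
df_denom = 20

lower, upper = two_tailed_critical_f_values(alpha, df_num, df_denom)

# Reciprocal identity: the lower alpha / 2 quantile of F(df_num, df_denom)
# equals 1 / (upper alpha / 2 quantile of F(df_denom, df_num))
reciprocal_check = 1 / stats.f.ppf(1 - alpha / 2, df_denom, df_num)

print(f"Lower critical F-value: {lower:.4f}")
print(f"Upper critical F-value: {upper:.4f}")
print(f"Lower value via reciprocal identity: {reciprocal_check:.4f}")
```

The null hypothesis is rejected when the observed F-statistic falls below the lower value or above the upper value.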


### Q3. Write a Python program that generates random samples from two normal distributions with known variances and uses an F-test to determine if the variances are equal. The program should output the F-value, degrees of freedom, and p-value for the test.

In [3]:
import numpy as np
import scipy.stats as stats

def perform_f_test(sample1, sample2):
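    # Levene's test for equality of variances, a robust alternative to the classical
    # variance-ratio F-test; its statistic is compared against an F(k - 1, N - k) distribution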
    f_value, p_value = stats.levene(sample1, sample2)
    
    # Degrees of freedom for the test
    df_between = 1  # Number of groups - 1
    df_within = len(sample1) + len(sample2) - 2  # Total sample size minus the number of groups
    
    return f_value, df_between, df_within, p_value

# Example usage:
np.random.seed(42)  # Set seed for reproducibility

# Generate random samples from two normal distributions
sample1 = np.random.normal(loc=0, scale=1, size=30)
sample2 = np.random.normal(loc=0, scale=1.5, size=30)

# Perform the test for equal variances
f_value, df_between, df_within, p_value = perform_f_test(sample1, sample2)

# Output results
print(f"F-value: {f_value}")
print(f"Degrees of Freedom (Between): {df_between}")
print(f"Degrees of Freedom (Within): {df_within}")
print(f"P-value: {p_value}")


F-value: 6.499877950285229
Degrees of Freedom (Between): 1
Degrees of Freedom (Within): 58
P-value: 0.013454320895372985
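
The cell above relies on Levene's test, whose degrees of freedom are (k - 1, N - k). For comparison, the classical variance-ratio F-test uses the ratio of the two sample variances with degrees of freedom (n1 - 1, n2 - 1). The following is a minimal sketch under that assumption; the function name `variance_ratio_f_test` is hypothetical, and the samples are regenerated with the same seed and parameters as above.

```python
import numpy as np
import scipy.stats as stats

def variance_ratio_f_test(sample1, sample2):
    """Classical two-tailed F-test for equality of two variances."""
    # Unbiased sample variances (ddof=1)
    var1 = np.var(sample1, ddof=1)
    var2 = np.var(sample2, ddof=1)
    f_value = var1 / var2

    # Numerator and denominator degrees of freedom
    df1 = len(sample1) - 1
    df2 = len(sample2) - 1

    # Two-tailed p-value: twice the smaller tail probability
    lower_tail = stats.f.cdf(f_value, df1, df2)
    upper_tail = stats.f.sf(f_value, df1, df2)
    p_value = 2 * min(lower_tail, upper_tail)

    return f_value, df1, df2, p_value

np.random.seed(42)  # Same seed as the cell above
sample1 = np.random.normal(loc=0, scale=1, size=30)
sample2 = np.random.normal(loc=0, scale=1.5, size=30)

f_value, df1, df2, p_value = variance_ratio_f_test(sample1, sample2)
levene_stat, levene_p = stats.levene(sample1, sample2)

print(f"F-value (variance ratio): {f_value}")
print(f"Degrees of Freedom: ({df1}, {df2})")
print(f"P-value (variance ratio): {p_value}")
print(f"P-value (Levene, for comparison): {levene_p}")
```

The variance-ratio test is sensitive to departures from normality, which is the usual reason for preferring Levene's test on real data.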


### Q4. The variances of two populations are known to be 10 and 15. A sample of 12 observations is taken from each population. Conduct an F-test at the 5% significance level to determine if the variances are significantly different.

In [4]:
import scipy.stats as stats

# Given data
sample_var1 = 10  # Sample variance for population 1
sample_var2 = 15  # Sample variance for population 2
n1 = 12  # Sample size for population 1
n2 = 12  # Sample size for population 2
alpha = 0.05  # Significance level

# Calculate the F-statistic
F_statistic = sample_var1 / sample_var2

# Degrees of freedom for the numerator and denominator
df_num = n1 - 1
df_denom = n2 - 1

# Critical F-values for a two-tailed test (alpha / 2 in each tail)
critical_F_lower = stats.f.ppf(alpha / 2, df_num, df_denom)
critical_F_upper = stats.f.ppf(1 - alpha / 2, df_num, df_denom)

# Compare F-statistic with the critical F-values
if F_statistic < critical_F_lower or F_statistic > critical_F_upper:
    print("Reject the null hypothesis: Variances are significantly different.")
else:
    print("Fail to reject the null hypothesis: Variances are not significantly different.")


Fail to reject the null hypothesis: Variances are not significantly different.


### Q5. A manufacturer claims that the variance of the diameter of a certain product is 0.005. A sample of 25 products is taken, and the sample variance is found to be 0.006. Conduct an F-test at the 1% significance level to determine if the claim is justified.

In [6]:
import scipy.stats as stats

# Given values
claimed_variance = 0.005
sample_variance = 0.006
sample_size = 25
alpha = 0.01

# Degrees of freedom
df1 = sample_size - 1
# The claimed variance is a fixed constant, not a sample estimate, so the
# denominator has effectively infinite degrees of freedom:
# F(df1, infinity) is the same as chi-square(df1) / df1

# F-statistic
f_statistic = (sample_variance / claimed_variance)

# Critical values on the F scale, taken from the chi-square distribution
critical_value_lower = stats.chi2.ppf(alpha / 2, df1) / df1
critical_value_upper = stats.chi2.ppf(1 - alpha / 2, df1) / df1

# Check if the F-statistic is in the critical region
in_critical_region = f_statistic < critical_value_lower or f_statistic > critical_value_upper

# Print results
print("F-Statistic:", f_statistic)
print(f"Critical Values: {critical_value_lower:.3f} {critical_value_upper:.3f}")
print("Is in Critical Region:", in_critical_region)


F-Statistic: 1.2
Critical Values: 0.412 1.898
Is in Critical Region: False


To conduct an F-test for the variance, we use the following hypotheses:

- **Null Hypothesis $(H_0)$:** The population variance is equal to the claimed variance.
- **Alternative Hypothesis $(H_1)$:** The population variance is different from the claimed variance.

Mathematically, this can be expressed as:

$$ H_0: \sigma^2 = 0.005 $$
$$ H_1: \sigma^2 \neq 0.005 $$

Next, we'll use the F-statistic formula:

$$ F = \frac{s^2}{\sigma_0^2} $$

where $s^2$ is the sample variance and $\sigma_0^2$ is the claimed population variance.

Because $\sigma_0^2$ is a fixed constant rather than an estimate from a second sample, the denominator has effectively infinite degrees of freedom. Under $H_0$, $(n - 1)F$ follows a chi-square distribution with $n - 1$ degrees of freedom, where $n$ is the sample size, so the critical values for $F$ are chi-square quantiles divided by $n - 1$.

Since it's a two-tailed test and the significance level is 1%, the critical values sit at the $\frac{\alpha}{2}$ and $1 - \frac{\alpha}{2}$ quantiles of that distribution.

The code above calculates the F-statistic and the critical values, and checks whether the F-statistic falls in the critical region. Here $F = 1.2$ lies between 0.412 and 1.898, so we fail to reject the null hypothesis: the sample does not contradict the manufacturer's claim.

Critical values read from printed tables may differ slightly from these because of rounding.

### Q6. Write a Python function that takes in the degrees of freedom for the numerator and denominator of an F-distribution and calculates the mean and variance of the distribution. The function should return the mean and variance as a tuple.

In [7]:
import scipy.stats as stats

def calculate_f_distribution_mean_and_variance(df1, df2):
    # Calculate the mean and variance of the F-distribution
    mean = df2 / (df2 - 2) if df2 > 2 else None
    variance = (2 * (df2 ** 2) * (df1 + df2 - 2)) / (df1 * (df2 - 2) ** 2 * (df2 - 4)) if df2 > 4 else None
    
    return mean, variance

# Example usage
df1 = 5
df2 = 10
mean, variance = calculate_f_distribution_mean_and_variance(df1, df2)

# Print the results
print("Mean of F-distribution:", mean)
print("Variance of F-distribution:", variance)


Mean of F-distribution: 1.25
Variance of F-distribution: 1.3541666666666667
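
As a sanity check, the closed-form results can be compared against SciPy's own moments for the F-distribution. This is a small sketch that uses `scipy.stats.f.stats` with `moments="mv"` and the same example degrees of freedom as above.

```python
import scipy.stats as stats

df1 = 5
df2 = 10

# Mean and variance of F(df1, df2) as computed by SciPy
scipy_mean, scipy_variance = stats.f.stats(df1, df2, moments="mv")

# Closed-form values, valid here because df2 > 4
formula_mean = df2 / (df2 - 2)
formula_variance = (2 * df2 ** 2 * (df1 + df2 - 2)) / (df1 * (df2 - 2) ** 2 * (df2 - 4))

print("Mean (SciPy, formula):", float(scipy_mean), formula_mean)
print("Variance (SciPy, formula):", float(scipy_variance), formula_variance)
```

Both routes should agree whenever the denominator degrees of freedom exceed 4, which is the range where the variance is defined.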


### Q7. A random sample of 10 measurements is taken from a normal population with unknown variance. The sample variance is found to be 25. Another random sample of 15 measurements is taken from another normal population with unknown variance, and the sample variance is found to be 20. Conduct an F-test at the 10% significance level to determine if the variances are significantly different.

In [8]:
import scipy.stats as stats

# Given values
sample_variance1 = 25
sample_size1 = 10
sample_variance2 = 20
sample_size2 = 15
alpha = 0.10

# Degrees of freedom
df1 = sample_size1 - 1
df2 = sample_size2 - 1

# F-statistic
f_statistic = sample_variance1 / sample_variance2

# Critical values
critical_value_lower = stats.f.ppf(alpha / 2, df1, df2)
critical_value_upper = stats.f.ppf(1 - alpha / 2, df1, df2)

# Check if the F-statistic is in the critical region
in_critical_region = f_statistic < critical_value_lower or f_statistic > critical_value_upper

# Print results
print("F-Statistic:", f_statistic)
print("Critical Values:", critical_value_lower, critical_value_upper)
print("Is in Critical Region:", in_critical_region)


F-Statistic: 1.25
Critical Values: 0.3305268601412525 2.6457907352338195
Is in Critical Region: False
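
The same decision can be reached through a p-value instead of critical values. The sketch below works directly from the summary statistics given in the question; the helper name `f_test_p_value` is an assumption made for illustration.

```python
import scipy.stats as stats

def f_test_p_value(var1, n1, var2, n2):
    """Two-tailed F-test from sample variances and sample sizes."""
    f_statistic = var1 / var2
    df1 = n1 - 1
    df2 = n2 - 1

    # Probability in each tail at the observed statistic
    lower_tail = stats.f.cdf(f_statistic, df1, df2)
    upper_tail = stats.f.sf(f_statistic, df1, df2)

    # Two-tailed p-value: twice the smaller tail probability
    p_value = 2 * min(lower_tail, upper_tail)
    return f_statistic, p_value

alpha = 0.10
f_statistic, p_value = f_test_p_value(25, 10, 20, 15)

print("F-Statistic:", f_statistic)
print("P-value:", p_value)
print("Reject H0:", p_value < alpha)
```

Comparing the p-value with alpha is equivalent to checking the equal-tailed critical region, so it leads to the same conclusion as the cell above.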


### Q8. The following data represent the waiting times in minutes at two different restaurants on a Saturday night: Restaurant A: 24, 25, 28, 23, 22, 20, 27; Restaurant B: 31, 33, 35, 30, 32, 36. Conduct an F-test at the 5% significance level to determine if the variances are significantly different.

In [10]:
import scipy.stats as stats

# Given values
data_restaurant_A = [24, 25, 28, 23, 22, 20, 27]
data_restaurant_B = [31, 33, 35, 30, 32, 36]
alpha = 0.05

# Sample variances
sample_variance_A = stats.tvar(data_restaurant_A)
sample_variance_B = stats.tvar(data_restaurant_B)

# Sample sizes
sample_size_A = len(data_restaurant_A)
sample_size_B = len(data_restaurant_B)

# Degrees of freedom
df1 = sample_size_A - 1
df2 = sample_size_B - 1

# F-statistic
f_statistic = sample_variance_A / sample_variance_B

# Critical values
critical_value_lower = stats.f.ppf(alpha / 2, df1, df2)
critical_value_upper = stats.f.ppf(1 - alpha / 2, df1, df2)

# Check if the F-statistic is in the critical region
in_critical_region = f_statistic < critical_value_lower or f_statistic > critical_value_upper

# Print results
print("F-Statistic:", f_statistic)
print("Critical Values:", critical_value_lower, critical_value_upper)
print("Is in Critical Region:", in_critical_region)


F-Statistic: 1.4551907719609583
Critical Values: 0.16701279718024772 6.977701858535566
Is in Critical Region: False


### Q9. The following data represent the test scores of two groups of students: Group A: 80, 85, 90, 92, 87, 83; Group B: 75, 78, 82, 79, 81, 84. Conduct an F-test at the 1% significance level to determine if the variances are significantly different.

In [11]:
import scipy.stats as stats

# Given values
group_A_scores = [80, 85, 90, 92, 87, 83]
group_B_scores = [75, 78, 82, 79, 81, 84]
alpha = 0.01

# Sample variances
sample_variance_A = stats.tvar(group_A_scores)
sample_variance_B = stats.tvar(group_B_scores)

# Sample sizes
sample_size_A = len(group_A_scores)
sample_size_B = len(group_B_scores)

# Degrees of freedom
df1 = sample_size_A - 1
df2 = sample_size_B - 1

# F-statistic
f_statistic = sample_variance_A / sample_variance_B

# Critical values
critical_value_lower = stats.f.ppf(alpha / 2, df1, df2)
critical_value_upper = stats.f.ppf(1 - alpha / 2, df1, df2)

# Check if the F-statistic is in the critical region
in_critical_region = f_statistic < critical_value_lower or f_statistic > critical_value_upper

# Print results
print("F-Statistic:", f_statistic)
print("Critical Values:", critical_value_lower, critical_value_upper)
print("Is in Critical Region:", in_critical_region)


F-Statistic: 1.9442622950819677
Critical Values: 0.066936171954696 14.939605459912224
Is in Critical Region: False
