## Q1. Write a Python function that takes in two arrays of data and calculates the F-value for a variance ratio test. The function should return the F-value and the corresponding p-value for the test.

In [1]:
#Ans ==

import numpy as np
from scipy.stats import f

def calculate_f_value(arr1, arr2):
    """
    Calculate the F-value and two-tailed p-value for a variance ratio test.

    Parameters:
    - arr1: First array of data
    - arr2: Second array of data

    Returns:
    - f_value: Ratio of the sample variances (arr1 over arr2)
    - p_value: Corresponding two-tailed p-value
    """
    # Check if the arrays are valid and have enough data points
    if len(arr1) < 2 or len(arr2) < 2:
        raise ValueError("Both arrays must have at least 2 data points.")

    # Sample variances (ddof=1) and their degrees of freedom
    var1 = np.var(arr1, ddof=1)
    var2 = np.var(arr2, ddof=1)
    df1 = len(arr1) - 1
    df2 = len(arr2) - 1

    # F-value is the ratio of the sample variances.
    # Note: scipy.stats.f_oneway is a one-way ANOVA that compares group means,
    # so it is not a variance ratio test.
    f_value = var1 / var2

    # Two-tailed p-value from the F-distribution
    p_value = 2 * min(f.cdf(f_value, df1, df2), f.sf(f_value, df1, df2))

    return f_value, p_value

# Example usage (no random seed is set, so the printed values vary from run to run):
array1 = np.random.normal(0, 1, 100)  # Example array 1 with 100 data points
array2 = np.random.normal(1, 1, 100)  # Example array 2 with 100 data points

f_value, p_value = calculate_f_value(array1, array2)

print("F-value:", f_value)
print("P-value:", p_value)


## Q2. Given a significance level of 0.05 and the degrees of freedom for the numerator and denominator of an F-distribution, write a Python function that returns the critical F-value for a two-tailed test.

In [1]:
#Ans ==
from scipy.stats import f

def critical_f_value(alpha, df_num, df_denom):
    # Upper critical F-value for a two-tailed test (alpha/2 in the upper tail)
    critical_value = f.ppf(1 - alpha/2, df_num, df_denom)
    return critical_value

# Example usage:
significance_level = 0.05
degrees_of_freedom_num = 3  # replace with your actual value
degrees_of_freedom_denom = 20  # replace with your actual value

result = critical_f_value(significance_level, degrees_of_freedom_num, degrees_of_freedom_denom)
print(f"Critical F-value: {result}")


Critical F-value: 3.8586986662732143
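A two-tailed test also has a lower critical value. The sketch below is one way to compute both bounds; the helper name `two_tailed_critical_values` is hypothetical and not part of the original answer. It also checks the reciprocal identity that relates the lower bound to an upper quantile with the degrees of freedom swapped.

```python
from scipy.stats import f

def two_tailed_critical_values(alpha, df_num, df_denom):
    """Return (lower, upper) critical F-values, with alpha/2 in each tail."""
    lower = f.ppf(alpha / 2, df_num, df_denom)
    upper = f.ppf(1 - alpha / 2, df_num, df_denom)
    return lower, upper

# Same inputs as the example above: alpha = 0.05, df = (3, 20)
lower, upper = two_tailed_critical_values(0.05, 3, 20)

# Reciprocal identity: F_{alpha/2}(d1, d2) = 1 / F_{1 - alpha/2}(d2, d1)
lower_via_reciprocal = 1 / f.ppf(1 - 0.05 / 2, 20, 3)

print(f"Lower critical F-value: {lower}")
print(f"Upper critical F-value: {upper}")
print(f"Lower via reciprocal identity: {lower_via_reciprocal}")
```

The null hypothesis is rejected when the test statistic falls below the lower bound or above the upper bound.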


## Q3. Write a Python program that generates random samples from two normal distributions with known variances and uses an F-test to determine if the variances are equal. The program should output the F-value, degrees of freedom, and p-value for the test.

In [2]:
import numpy as np
from scipy.stats import f

def f_test(sample1, sample2):
    # Calculate variances
    var1 = np.var(sample1, ddof=1)
    var2 = np.var(sample2, ddof=1)

    # Calculate degrees of freedom
    df1 = len(sample1) - 1
    df2 = len(sample2) - 1

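    # Calculate F-value with the larger variance in the numerator,
    # swapping the degrees of freedom so they match numerator and denominator
    if var1 >= var2:
        f_value = var1 / var2
    else:
        f_value = var2 / var1
        df1, df2 = df2, df1

    # Calculate two-tailed p-value
    p_value = 2 * min(f.cdf(f_value, df1, df2), 1 - f.cdf(f_value, df1, df2))

    return f_value, df1, df2, p_value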

# Generate random samples from two normal distributions
np.random.seed(42)  # Set seed for reproducibility
sample1 = np.random.normal(loc=0, scale=1, size=100)
sample2 = np.random.normal(loc=0, scale=1.5, size=100)

# Perform F-test
f_value, df1, df2, p_value = f_test(sample1, sample2)

# Print results
print(f"F-value: {f_value:.4f}")
print(f"Degrees of freedom: ({df1}, {df2})")
print(f"P-value: {p_value:.4f}")

# Interpret the results
alpha = 0.05
if p_value < alpha:
    print("Reject the null hypothesis. Variances are not equal.")
else:
    print("Fail to reject the null hypothesis. No significant difference between the variances.")


F-value: 2.4811
Degrees of freedom: (99, 99)
P-value: 0.0000
Reject the null hypothesis. Variances are not equal.


## Q4. The variances of two populations are known to be 10 and 15. A sample of 12 observations is taken from each population. Conduct an F-test at the 5% significance level to determine if the variances are significantly different.

# Ans ==

To conduct an F-test to determine if the variances of two populations are significantly different, you can use the following steps:

1. State the Hypotheses:

• Null Hypothesis (H0): The variances are equal (σ₁² = σ₂²)
• Alternative Hypothesis (H1): The variances are not equal (σ₁² ≠ σ₂²)

2. Select the Significance Level:

• α = 0.05 (5%)

3. Calculate the F-statistic: The formula for the F-statistic is given by:

$F = \dfrac{S_1^2}{S_2^2}$

where $S_1^2$ and $S_2^2$ are the sample variances of the two samples.

The problem states the variances as 10 and 15. Treating these as the variances observed in the two samples of 12 observations each, $S_1^2 = 10$ and $S_2^2 = 15$:

$F = \dfrac{S_1^2}{S_2^2} = \dfrac{10}{15} \approx 0.667$

4. Determine the Critical Region:

• Degrees of freedom for the numerator ($df_1$): 12 − 1 = 11
• Degrees of freedom for the denominator ($df_2$): 12 − 1 = 11

For a two-tailed test at the 5% significance level, the critical values are the 2.5% and 97.5% quantiles of the F-distribution with (11, 11) degrees of freedom: approximately 0.29 and 3.47.

5. Make a Decision:

• If the calculated F-statistic is in the critical region, reject the null hypothesis.
• If the calculated F-statistic is not in the critical region, fail to reject the null hypothesis.

Note: When using tables or software to find critical values, make sure to match the degrees of freedom for the numerator and denominator.

Since F ≈ 0.667 lies between 0.29 and 3.47, it is not in the critical region. We fail to reject the null hypothesis: at the 5% significance level there is no significant evidence that the variances differ.
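The steps above can be sketched in Python. This is a minimal sketch, assuming the stated variances 10 and 15 are treated as the sample variances of the two samples of 12 observations; the variable names are illustrative.

```python
from scipy.stats import f

# Assumption: the stated variances (10 and 15) are used as sample variances
var1, var2 = 10.0, 15.0
n1, n2 = 12, 12
alpha = 0.05

# Degrees of freedom and test statistic
df1, df2 = n1 - 1, n2 - 1
F = var1 / var2

# Two-tailed critical values and p-value from the F-distribution
lower = f.ppf(alpha / 2, df1, df2)
upper = f.ppf(1 - alpha / 2, df1, df2)
p_value = 2 * min(f.cdf(F, df1, df2), f.sf(F, df1, df2))

# Reject H0 only if F falls outside [lower, upper]
reject = F < lower or F > upper

print(f"F-statistic: {F:.4f}")
print(f"Critical values (lower, upper): {lower:.4f}, {upper:.4f}")
print(f"P-value: {p_value:.4f}")
print("Reject H0" if reject else "Fail to reject H0")
```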



## Q5. A manufacturer claims that the variance of the diameter of a certain product is 0.005. A sample of 25 products is taken, and the sample variance is found to be 0.006. Conduct an F-test at the 1% significance level to determine if the claim is justified.

# Ans ==
To conduct an F-test to determine if the claim about the variance is justified, we can use the following hypotheses:

• Null Hypothesis (H0): The population variance is equal to the claimed variance.
• Alternative Hypothesis (H1): The population variance is greater than the claimed variance.

Mathematically:

• H0: σ² = 0.005
• H1: σ² > 0.005

Here, σ² is the population variance.

The test statistic (F) is given by:

$F = \dfrac{s^2}{\sigma_0^2}$

where:

• $s^2$ is the sample variance
• $\sigma_0^2$ is the claimed population variance

Because the claimed variance is a fixed value rather than an estimate from a second sample, the denominator has infinite degrees of freedom. Under H0, $(n-1)s^2/\sigma_0^2$ follows a chi-square distribution with $n-1$ degrees of freedom, so $F$ follows an F-distribution with $(n-1, \infty)$ degrees of freedom. The critical region for the right-tailed test at the 1% significance level is therefore:

$F > F_{0.01,\,24,\,\infty} = \dfrac{\chi^2_{0.01,\,24}}{24}$

Now, let's perform the calculations:

$F = \dfrac{0.006}{0.005} = 1.2$

Degrees of freedom for the numerator ($df_1$): 25 − 1 = 24. Degrees of freedom for the denominator ($df_2$): ∞ (the claimed variance is treated as known).

$F_{0.01,\,24,\,\infty} = \dfrac{42.98}{24} \approx 1.79$

The upper 1% point of the chi-square distribution with 24 degrees of freedom (42.98) can be looked up in a chi-square table or computed with statistical software.

Since 1.2 < 1.79, the calculated F-value is not in the critical region and we fail to reject the null hypothesis. At the 1% significance level, the sample does not provide enough evidence that the variance exceeds 0.005, so the manufacturer's claim is not contradicted. Had the calculated F-value exceeded the critical value, we would have rejected H0 and concluded that the variance is greater than claimed.
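One way to carry out this test numerically is sketched below. It assumes the claimed variance is treated as a known constant, so that (n − 1)s²/σ² is compared against a chi-square distribution with n − 1 degrees of freedom (equivalent to an F-distribution with 24 and infinite degrees of freedom). The variable names are illustrative.

```python
from scipy.stats import chi2

# Given data
claimed_var = 0.005
sample_var = 0.006
n = 25
alpha = 0.01

# Variance ratio and the equivalent chi-square statistic
df = n - 1
F = sample_var / claimed_var
chi2_stat = df * F

# Right-tailed critical values (chi-square scale and F scale)
chi2_crit = chi2.ppf(1 - alpha, df)
F_crit = chi2_crit / df

# Right-tailed p-value
p_value = chi2.sf(chi2_stat, df)

# Reject H0 only if the statistic exceeds the critical value
reject = chi2_stat > chi2_crit

print(f"F = {F:.4f}, critical F = {F_crit:.4f}")
print(f"Chi-square = {chi2_stat:.4f}, critical chi-square = {chi2_crit:.4f}")
print(f"P-value: {p_value:.4f}")
print("Reject H0" if reject else "Fail to reject H0")
```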



## Q6. Write a Python function that takes in the degrees of freedom for the numerator and denominator of an F-distribution and calculates the mean and variance of the distribution. The function should return the mean and variance as a tuple.

In [3]:
def f_distribution_mean_variance(df1, df2):
    if df2 <= 2:
        raise ValueError("Degrees of freedom for denominator (df2) must be greater than 2.")
    
    if df2 <= 4:
        raise ValueError("Degrees of freedom for denominator (df2) must be greater than 4 for variance calculation.")

    mean = df2 / (df2 - 2)
    variance = (2 * df2**2 * (df1 + df2 - 2)) / (df1 * (df2 - 2)**2 * (df2 - 4))

    return mean, variance

# Example usage:
df1 = 3  # Degrees of freedom for numerator
df2 = 10  # Degrees of freedom for denominator

mean, variance = f_distribution_mean_variance(df1, df2)
print(f"Mean: {mean}, Variance: {variance}")


Mean: 1.25, Variance: 1.9097222222222223
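As a cross-check, the closed-form expressions can be compared with the moments reported by `scipy.stats.f.stats`. This is a small sketch using the same example degrees of freedom; the variable names are illustrative.

```python
from scipy.stats import f

df1, df2 = 3, 10  # same example values as above

# Closed-form mean (requires df2 > 2) and variance (requires df2 > 4)
mean_formula = df2 / (df2 - 2)
var_formula = (2 * df2**2 * (df1 + df2 - 2)) / (df1 * (df2 - 2)**2 * (df2 - 4))

# Moments computed by SciPy ("mv" requests mean and variance)
mean_scipy, var_scipy = f.stats(df1, df2, moments="mv")

print(f"Formula: mean={mean_formula}, variance={var_formula}")
print(f"SciPy:   mean={float(mean_scipy)}, variance={float(var_scipy)}")
```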


## Q7. A random sample of 10 measurements is taken from a normal population with unknown variance. The sample variance is found to be 25. Another random sample of 15 measurements is taken from another normal population with unknown variance, and the sample variance is found to be 20. Conduct an F-test at the 10% significance level to determine if the variances are significantly different.

In [4]:
#Ans ==
from scipy.stats import f

# Given data
sample_var1 = 25
sample_var2 = 20
alpha = 0.10
df1 = 10 - 1  # degrees of freedom for the first sample
df2 = 15 - 1  # degrees of freedom for the second sample

# Test statistic
F = sample_var1 / sample_var2

# Critical values from F-distribution
critical_value_lower = f.ppf(alpha/2, df1, df2)
critical_value_upper = f.ppf(1 - alpha/2, df1, df2)

# Decision
if F < critical_value_lower or F > critical_value_upper:
    print("Reject the null hypothesis. Variances are significantly different.")
else:
    print("Fail to reject the null hypothesis. Variances are not significantly different.")

print("Test Statistic:", F)
print("Critical Values (Lower, Upper):", critical_value_lower, critical_value_upper)


Fail to reject the null hypothesis. Variances are not significantly different.
Test Statistic: 1.25
Critical Values (Lower, Upper): 0.3305268601412525 2.6457907352338195
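The same decision can be reached with a p-value instead of critical values. The sketch below assumes the two-tailed p-value is taken as twice the smaller tail probability; the variable names are illustrative.

```python
from scipy.stats import f

# Same inputs as the cell above
F = 25 / 20
df1, df2 = 10 - 1, 15 - 1
alpha = 0.10

# Two-tailed p-value: twice the smaller of the two tail probabilities
p_value = 2 * min(f.cdf(F, df1, df2), f.sf(F, df1, df2))

# Comparing p_value with alpha is equivalent to the critical-value check
reject = p_value < alpha

print(f"P-value: {p_value:.4f}")
print("Reject H0" if reject else "Fail to reject H0")
```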


## Q8. The following data represent the waiting times in minutes at two different restaurants on a Saturday night: Restaurant A: 24, 25, 28, 23, 22, 20, 27; Restaurant B: 31, 33, 35, 30, 32, 36. Conduct an F-test at the 5% significance level to determine if the variances are significantly different.

In [5]:
import numpy as np
from scipy.stats import f

# Given data
waiting_times_A = np.array([24, 25, 28, 23, 22, 20, 27])
waiting_times_B = np.array([31, 33, 35, 30, 32, 36])
alpha = 0.05

# Sample variances
variance_A = np.var(waiting_times_A, ddof=1)
variance_B = np.var(waiting_times_B, ddof=1)

# Test statistic
F = variance_A / variance_B

# Degrees of freedom
df1 = len(waiting_times_A) - 1
df2 = len(waiting_times_B) - 1

# Critical values from F-distribution
critical_value_lower = f.ppf(alpha/2, df1, df2)
critical_value_upper = f.ppf(1 - alpha/2, df1, df2)

# Decision
if F < critical_value_lower or F > critical_value_upper:
    print("Reject the null hypothesis. Variances are significantly different.")
else:
    print("Fail to reject the null hypothesis. Variances are not significantly different.")

print("Test Statistic:", F)
print("Critical Values (Lower, Upper):", critical_value_lower, critical_value_upper)


Fail to reject the null hypothesis. Variances are not significantly different.
Test Statistic: 1.4551907719609583
Critical Values (Lower, Upper): 0.16701279718024772 6.977701858535566


## Q9. The following data represent the test scores of two groups of students: Group A: 80, 85, 90, 92, 87, 83; Group B: 75, 78, 82, 79, 81, 84. Conduct an F-test at the 1% significance level to determine if the variances are significantly different.

In [6]:
#Ans ==
import numpy as np
from scipy.stats import f

# Given data
test_scores_A = np.array([80, 85, 90, 92, 87, 83])
test_scores_B = np.array([75, 78, 82, 79, 81, 84])
alpha = 0.01

# Sample variances
variance_A = np.var(test_scores_A, ddof=1)
variance_B = np.var(test_scores_B, ddof=1)

# Test statistic
F = variance_A / variance_B

# Degrees of freedom
df1 = len(test_scores_A) - 1
df2 = len(test_scores_B) - 1

# Critical values from F-distribution
critical_value_lower = f.ppf(alpha/2, df1, df2)
critical_value_upper = f.ppf(1 - alpha/2, df1, df2)

# Decision
if F < critical_value_lower or F > critical_value_upper:
    print("Reject the null hypothesis. Variances are significantly different.")
else:
    print("Fail to reject the null hypothesis. Variances are not significantly different.")

print("Test Statistic:", F)
print("Critical Values (Lower, Upper):", critical_value_lower, critical_value_upper)


Fail to reject the null hypothesis. Variances are not significantly different.
Test Statistic: 1.9442622950819677
Critical Values (Lower, Upper): 0.066936171954696 14.939605459912224
