### Q1. Write a Python function that takes in two arrays of data and calculates the F-value for a variance ratio test. The function should return the F-value and the corresponding p-value for the test.

In [11]:
import numpy as np
from scipy.stats import f

def variance_ratio_test(data1, data2):
    """
    Perform a variance ratio test (F-test) on two datasets.
    
    Parameters:
    data1 (array-like): First dataset.
    data2 (array-like): Second dataset.
    
    Returns:
    F-value (float): The calculated F-value.
    p-value (float): The corresponding p-value for the test.
    """
    # Calculate variances of the two datasets
    var1 = np.var(data1, ddof=1)
    var2 = np.var(data2, ddof=1)
    
    # Calculate the F-value
    if var1 > var2:
        F = var1 / var2
        dfn = len(data1) - 1  # degrees of freedom for numerator
        dfd = len(data2) - 1  # degrees of freedom for denominator
    else:
        F = var2 / var1
        dfn = len(data2) - 1  # degrees of freedom for numerator
        dfd = len(data1) - 1  # degrees of freedom for denominator
    
    # Calculate the one-tailed (upper-tail) p-value
    p_value = 1 - f.cdf(F, dfn, dfd)
    
    return F, p_value

# Example usage:
data1 = [12, 14, 16, 18, 20]
data2 = [22, 24, 26, 28, 30]
F, p_value = variance_ratio_test(data1, data2)
F, p_value


(1.0, 0.5)

### Explanation
- **Variance Calculation:** The variances of the two datasets (var1 and var2) are calculated using np.var with ddof=1 to get the sample variance.
- **F-value Calculation:** The F-value is calculated as the ratio of the larger variance to the smaller variance. This ensures that the F-value is always greater than or equal to 1.
- **Degrees of Freedom:** The degrees of freedom for the numerator (dfn) and the denominator (dfd) are determined based on which variance is larger.
- **p-value Calculation:** The p-value is calculated as 1 - f.cdf(F, dfn, dfd), the upper-tail probability of the F-distribution. It is the probability of observing an F-value at least as large as the one calculated, under the null hypothesis that the variances are equal. Because the larger variance is always placed in the numerator, this is a one-tailed p-value; for a two-tailed test of equality it should be doubled (and capped at 1).

### Example Usage

The example provided at the end shows how to use the variance_ratio_test function with two sample datasets. The function returns the F-value and the p-value, which can be used to assess the equality of variances.
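
Since `variance_ratio_test` always puts the larger variance in the numerator, its p-value is an upper-tail probability. A minimal sketch of a two-sided variant follows, assuming the common convention of doubling the smaller tail probability. The name `two_sided_variance_ratio_test` is hypothetical and not part of the original answer.

```python
import numpy as np
from scipy.stats import f

def two_sided_variance_ratio_test(data1, data2):
    """Two-sided F-test for equal variances (sketch; assumes normal data)."""
    x = np.asarray(data1, dtype=float)
    y = np.asarray(data2, dtype=float)

    # Sample variances (ddof=1) and their ratio, without reordering
    F = np.var(x, ddof=1) / np.var(y, ddof=1)
    dfn, dfd = x.size - 1, y.size - 1

    # Double the smaller tail probability and cap the result at 1
    lower_tail = f.cdf(F, dfn, dfd)
    upper_tail = f.sf(F, dfn, dfd)
    p_value = min(1.0, 2 * min(lower_tail, upper_tail))
    return F, p_value

F, p_value = two_sided_variance_ratio_test([12, 14, 16, 18, 20],
                                           [22, 24, 26, 28, 30])
print(F, p_value)
```

For the example datasets both sample variances are equal, so F is 1 and the two-sided p-value is at (or very close to) its maximum of 1.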

### Q2. Given a significance level of 0.05 and the degrees of freedom for the numerator and denominator of an F-distribution, write a Python function that returns the critical F-value for a two-tailed test.

In [12]:
from scipy.stats import f

def critical_f_value(alpha, dfn, dfd):
    """
    Calculate the critical F-value for a two-tailed test.

    Parameters:
    alpha (float): Significance level (e.g., 0.05 for a 5% significance level).
    dfn (int): Degrees of freedom for the numerator.
    dfd (int): Degrees of freedom for the denominator.

    Returns:
    tuple: (f_critical_low, f_critical_high), the lower and upper critical F-values for the two-tailed test.
    """
    # For a two-tailed test, we need to split the significance level in half for each tail
    alpha_half = alpha / 2

    # Calculate the critical F-values for each tail
    f_critical_low = f.ppf(alpha_half, dfn, dfd)
    f_critical_high = f.ppf(1 - alpha_half, dfn, dfd)

    return f_critical_low, f_critical_high

# Example usage:
alpha = 0.05
dfn = 5  # degrees of freedom for numerator
dfd = 10  # degrees of freedom for denominator
critical_values = critical_f_value(alpha, dfn, dfd)
critical_values


(0.15107670102998208, 4.236085668188633)
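
The lower critical value can also be obtained from an upper critical value with the degrees of freedom swapped, because if X follows F(dfn, dfd) then 1/X follows F(dfd, dfn). This is handy when only upper-tail tables are available. The short check below is a sketch of that identity using `scipy.stats.f.ppf`; the variable names are illustrative.

```python
from scipy.stats import f

alpha = 0.05
dfn, dfd = 5, 10

# Lower critical value computed directly from the lower tail
f_low_direct = f.ppf(alpha / 2, dfn, dfd)

# Same value as the reciprocal of the upper critical value with swapped df
f_low_reciprocal = 1.0 / f.ppf(1 - alpha / 2, dfd, dfn)

print(f_low_direct, f_low_reciprocal)
```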

### Q3. Write a Python program that generates random samples from two normal distributions with known variances and uses an F-test to determine if the variances are equal. The program should output the F-value, degrees of freedom, and p-value for the test.

# F-test for Equality of Variances

This Python program generates random samples from two normal distributions with known variances and uses an F-test to determine if the variances are equal. The program outputs the F-value, degrees of freedom, and p-value for the test.



In [13]:

import numpy as np
from scipy.stats import f

def generate_samples(mean1, var1, n1, mean2, var2, n2):
    """
    Generate random samples from two normal distributions.
    
    Parameters:
    mean1 (float): Mean of the first distribution.
    var1 (float): Variance of the first distribution.
    n1 (int): Number of samples from the first distribution.
    mean2 (float): Mean of the second distribution.
    var2 (float): Variance of the second distribution.
    n2 (int): Number of samples from the second distribution.
    
    Returns:
    data1 (array): Random samples from the first distribution.
    data2 (array): Random samples from the second distribution.
    """
    data1 = np.random.normal(mean1, np.sqrt(var1), n1)
    data2 = np.random.normal(mean2, np.sqrt(var2), n2)
    
    return data1, data2

def variance_ratio_test(data1, data2):
    """
    Perform a variance ratio test (F-test) on two datasets.
    
    Parameters:
    data1 (array-like): First dataset.
    data2 (array-like): Second dataset.
    
    Returns:
    F-value (float): The calculated F-value.
    p-value (float): The corresponding p-value for the test.
    dfn (int): Degrees of freedom for the numerator.
    dfd (int): Degrees of freedom for the denominator.
    """
    # Calculate variances of the two datasets
    var1 = np.var(data1, ddof=1)
    var2 = np.var(data2, ddof=1)
    
    # Calculate the F-value
    if var1 > var2:
        F = var1 / var2
        dfn = len(data1) - 1  # degrees of freedom for numerator
        dfd = len(data2) - 1  # degrees of freedom for denominator
    else:
        F = var2 / var1
        dfn = len(data2) - 1  # degrees of freedom for numerator
        dfd = len(data1) - 1  # degrees of freedom for denominator
    
    # Calculate the one-tailed (upper-tail) p-value
    p_value = 1 - f.cdf(F, dfn, dfd)
    
    return F, p_value, dfn, dfd

# Parameters for the normal distributions
mean1 = 0
var1 = 1
n1 = 30

mean2 = 0
var2 = 2
n2 = 30

# Generate random samples
data1, data2 = generate_samples(mean1, var1, n1, mean2, var2, n2)

# Perform the variance ratio test
F, p_value, dfn, dfd = variance_ratio_test(data1, data2)

# Output the results
print(f"F-value: {F}")
print(f"Degrees of freedom (numerator): {dfn}")
print(f"Degrees of freedom (denominator): {dfd}")
print(f"p-value: {p_value}")


F-value: 1.431410609701287
Degrees of freedom (numerator): 29
Degrees of freedom (denominator): 29
p-value: 0.16981167196351632
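
Because `np.random.normal` is called without a seed, each run of the program above produces a different F-value and p-value. The following is a sketch of a reproducible variant. It assumes NumPy's `default_rng` generator and an arbitrary seed of 42 (both are assumptions, not taken from the original program), and it reports a two-sided p-value.

```python
import numpy as np
from scipy.stats import f

def simulate_f_test(var1, n1, var2, n2, seed=42):
    """Draw two zero-mean normal samples and run a two-sided F-test (sketch)."""
    rng = np.random.default_rng(seed)  # seeded generator for reproducibility
    data1 = rng.normal(0.0, np.sqrt(var1), n1)
    data2 = rng.normal(0.0, np.sqrt(var2), n2)

    # Ratio of sample variances and its degrees of freedom
    F = np.var(data1, ddof=1) / np.var(data2, ddof=1)
    dfn, dfd = n1 - 1, n2 - 1

    # Two-sided p-value: double the smaller tail, capped at 1
    p_value = min(1.0, 2 * min(f.cdf(F, dfn, dfd), f.sf(F, dfn, dfd)))
    return F, dfn, dfd, p_value

result = simulate_f_test(1, 30, 2, 30)
print(result)
```

Calling the function again with the same seed returns the same tuple, which makes the reported numbers checkable.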


### Q4. The variances of two populations are known to be 10 and 15. A sample of 12 observations is taken from each population. Conduct an F-test at the 5% significance level to determine if the variances are significantly different.

# F-test for Equality of Known Population Variances

Given the variances of two populations are known to be 10 and 15, and a sample of 12 observations is taken from each population, we will conduct an F-test at the 5% significance level to determine if the variances are significantly different.

## Steps to Conduct the F-test

1. **Define the hypotheses**:
    - Null hypothesis (\(H_0\)): The variances of the two populations are equal (\(\sigma_1^2 = \sigma_2^2\)).
    - Alternative hypothesis (\(H_1\)): The variances of the two populations are not equal (\(\sigma_1^2 \neq \sigma_2^2\)).

2. **Calculate the F-value**:
    - \(F = \frac{\text{larger variance}}{\text{smaller variance}}\)

3. **Determine the critical F-values** for a two-tailed test at the 5% significance level.

4. **Compare the calculated F-value** to the critical F-values to decide whether to reject the null hypothesis.

In [14]:

from scipy.stats import f

# Given data
var1 = 10
var2 = 15
n1 = 12
n2 = 12
alpha = 0.05

# Calculate the F-value
if var1 > var2:
    F = var1 / var2
    dfn = n1 - 1  # degrees of freedom for numerator
    dfd = n2 - 1  # degrees of freedom for denominator
else:
    F = var2 / var1
    dfn = n2 - 1  # degrees of freedom for numerator
    dfd = n1 - 1  # degrees of freedom for denominator

# Determine the critical F-values for a two-tailed test
alpha_half = alpha / 2
f_critical_low = f.ppf(alpha_half, dfn, dfd)
f_critical_high = f.ppf(1 - alpha_half, dfn, dfd)

# Calculate the one-tailed (upper-tail) p-value; double it for the two-tailed test
p_value = 1 - f.cdf(F, dfn, dfd)

# Output the results
print(f"Calculated F-value: {F}")
print(f"Degrees of freedom (numerator): {dfn}")
print(f"Degrees of freedom (denominator): {dfd}")
print(f"Critical F-value (lower): {f_critical_low}")
print(f"Critical F-value (upper): {f_critical_high}")
print(f"p-value: {p_value}")

# Conclusion
if F < f_critical_low or F > f_critical_high:
    print("Reject the null hypothesis: The variances are significantly different.")
else:
    print("Fail to reject the null hypothesis: The variances are not significantly different.")


Calculated F-value: 1.5
Degrees of freedom (numerator): 11
Degrees of freedom (denominator): 11
Critical F-value (lower): 0.28787755798459863
Critical F-value (upper): 3.473699051085809
p-value: 0.25619489936789996
Fail to reject the null hypothesis: The variances are not significantly different.
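
Written out with the numbers printed above (rounded to three decimals), the decision rule in step 4 reads as follows. Here \(F_{q;\,11,11}\) denotes the \(q\)-quantile of the F-distribution with (11, 11) degrees of freedom, a notation introduced only for this summary.

\[
F = \frac{15}{10} = 1.5, \qquad F_{0.025;\,11,11} \approx 0.288 \;<\; 1.5 \;<\; F_{0.975;\,11,11} \approx 3.474
\]

Since the calculated F-value lies between the two critical values, the null hypothesis is not rejected at the 5% level.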


### Q5. A manufacturer claims that the variance of the diameter of a certain product is 0.005. A sample of 25 products is taken, and the sample variance is found to be 0.006. Conduct an F-test at the 1% significance level to determine if the claim is justified.

# F-test for Variance of Product Diameter

A manufacturer claims that the variance of the diameter of a certain product is 0.005. A sample of 25 products is taken, and the sample variance is found to be 0.006. We will conduct an F-test at the 1% significance level to determine if the claim is justified.

## Steps to Conduct the F-test

1. **Define the hypotheses**:
    - Null hypothesis (\(H_0\)): The variance of the product diameter is 0.005 (\(\sigma^2 = 0.005\)).
    - Alternative hypothesis (\(H_1\)): The variance of the product diameter is not 0.005 (\(\sigma^2 \neq 0.005\)).

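2. **Calculate the F-value**:
    - \(F = \frac{\text{sample variance}}{\text{claimed variance}}\). The claimed variance is a fixed value rather than a sample estimate, so the denominator is treated as having infinite degrees of freedom; equivalently, \((n-1)F\) follows a chi-square distribution with \(n-1\) degrees of freedom under \(H_0\).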

3. **Determine the critical F-values** for a two-tailed test at the 1% significance level.

4. **Compare the calculated F-value** to the critical F-values to decide whether to reject the null hypothesis.



In [15]:

from scipy.stats import f

# Given data
claimed_variance = 0.005
sample_variance = 0.006
n = 25
alpha = 0.01

# Calculate the F-value
F = sample_variance / claimed_variance
dfn = n - 1  # degrees of freedom for the sample

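# Determine the critical F-values for a two-tailed test.
# The claimed variance is a constant, so the denominator df is taken as infinite.
# Note: in this run scipy does not handle the infinite dfd (ppf returns nan),
# so the critical values and the p-value computed here are not usable.
alpha_half = alpha / 2
f_critical_low = f.ppf(alpha_half, dfn, float('inf'))
f_critical_high = f.ppf(1 - alpha_half, dfn, float('inf'))

# Calculate the two-tailed p-value
p_value = 2 * min(f.cdf(F, dfn, float('inf')), 1 - f.cdf(F, dfn, float('inf')))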

# Output the results
print(f"Calculated F-value: {F}")
print(f"Degrees of freedom: {dfn}")
print(f"Critical F-value (lower): {f_critical_low}")
print(f"Critical F-value (upper): {f_critical_high}")
print(f"p-value: {p_value}")

# Conclusion
if F < f_critical_low or F > f_critical_high:
    print("Reject the null hypothesis: The variance is significantly different from 0.005.")
else:
    print("Fail to reject the null hypothesis: The variance is not significantly different from 0.005.")


Calculated F-value: 1.2
Degrees of freedom: 24
Critical F-value (lower): nan
Critical F-value (upper): nan
p-value: 0.0
Fail to reject the null hypothesis: The variance is not significantly different from 0.005.
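
The `nan` critical values in the output above come from passing an infinite denominator degrees of freedom to `f.ppf`, and the final message is printed only because comparisons with `nan` evaluate to False. A claim about a single population variance is more commonly tested with the chi-square statistic \((n-1)s^2/\sigma_0^2\), which has \(n-1\) degrees of freedom under \(H_0\). The following is a minimal sketch of that approach using `scipy.stats.chi2`; the function name `one_sample_variance_test` is hypothetical.

```python
from scipy.stats import chi2

def one_sample_variance_test(sample_variance, claimed_variance, n, alpha):
    """Two-tailed chi-square test of H0: sigma^2 == claimed_variance (sketch)."""
    df = n - 1
    # Test statistic: (n - 1) * s^2 / sigma_0^2
    stat = df * sample_variance / claimed_variance

    # Critical values for a two-tailed test
    crit_low = chi2.ppf(alpha / 2, df)
    crit_high = chi2.ppf(1 - alpha / 2, df)

    # Two-sided p-value: double the smaller tail, capped at 1
    p_value = min(1.0, 2 * min(chi2.cdf(stat, df), chi2.sf(stat, df)))
    reject = bool(stat < crit_low or stat > crit_high)
    return stat, crit_low, crit_high, p_value, reject

stat, crit_low, crit_high, p_value, reject = one_sample_variance_test(
    sample_variance=0.006, claimed_variance=0.005, n=25, alpha=0.01
)
print(f"Chi-square statistic: {stat}")
print(f"Critical values: {crit_low}, {crit_high}")
print(f"p-value: {p_value}")
print("Reject H0" if reject else "Fail to reject H0")
```

For the values in this question the statistic is 24 × 1.2 = 28.8. Dividing the chi-square critical values by \(n-1\) gives the corresponding bounds on the ratio \(F = s^2/\sigma_0^2\).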


### Q6. Write a Python function that takes in the degrees of freedom for the numerator and denominator of an F-distribution and calculates the mean and variance of the distribution. The function should return the mean and variance as a tuple.

# Mean and Variance of an F-Distribution

This Python function calculates the mean and variance of an F-distribution given the degrees of freedom for the numerator and denominator.

## Formulas

For an F-distribution with degrees of freedom \(d_1\) (numerator) and \(d_2\) (denominator):
- The mean is defined as \( \mu = \frac{d_2}{d_2 - 2} \) for \(d_2 > 2\).
- The variance is defined as \( \sigma^2 = \frac{2d_2^2(d_1 + d_2 - 2)}{d_1(d_2 - 2)^2(d_2 - 4)} \) for \(d_2 > 4\).



In [16]:

def f_distribution_mean_variance(dfn, dfd):
    """
    Calculate the mean and variance of an F-distribution.

    Parameters:
    dfn (int): Degrees of freedom for the numerator.
    dfd (int): Degrees of freedom for the denominator.

    Returns:
    tuple: (mean, variance) of the F-distribution.
    """
    if dfd <= 2:
        mean = float('inf')  # Mean does not exist for dfd <= 2; inf is returned as a placeholder
    else:
        mean = dfd / (dfd - 2)

    if dfd <= 4:
        variance = float('inf')  # Variance is not finite for dfd <= 4; inf is returned as a placeholder
    else:
        variance = (2 * dfd**2 * (dfn + dfd - 2)) / (dfn * (dfd - 2)**2 * (dfd - 4))

    return mean, variance

# Example usage:
dfn = 5  # degrees of freedom for numerator
dfd = 10  # degrees of freedom for denominator
mean, variance = f_distribution_mean_variance(dfn, dfd)
mean, variance


(1.25, 1.3541666666666667)
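
As a sanity check, the closed-form results can be compared with the moments SciPy reports for the same distribution. This is only a sketch of such a cross-check using `scipy.stats.f.stats`; the variable names are illustrative.

```python
from scipy.stats import f

dfn, dfd = 5, 10

# Mean and variance as reported by SciPy
mean_scipy, var_scipy = f.stats(dfn, dfd, moments='mv')

# Mean and variance from the closed-form formulas above
mean_formula = dfd / (dfd - 2)
var_formula = (2 * dfd**2 * (dfn + dfd - 2)) / (dfn * (dfd - 2)**2 * (dfd - 4))

print(float(mean_scipy), float(var_scipy))
print(mean_formula, var_formula)
```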

### Q7. A random sample of 10 measurements is taken from a normal population with unknown variance. The sample variance is found to be 25. Another random sample of 15 measurements is taken from another normal population with unknown variance, and the sample variance is found to be 20. Conduct an F-test at the 10% significance level to determine if the variances are significantly different.

# F-test for Equality of Variances from Two Normal Populations

A random sample of 10 measurements is taken from a normal population with unknown variance. The sample variance is found to be 25. Another random sample of 15 measurements is taken from another normal population with unknown variance, and the sample variance is found to be 20. We will conduct an F-test at the 10% significance level to determine if the variances are significantly different.

## Steps to Conduct the F-test

1. **Define the hypotheses**:
    - Null hypothesis (\(H_0\)): The variances of the two populations are equal (\(\sigma_1^2 = \sigma_2^2\)).
    - Alternative hypothesis (\(H_1\)): The variances of the two populations are not equal (\(\sigma_1^2 \neq \sigma_2^2\)).

2. **Calculate the F-value**:
    - \(F = \frac{\text{larger variance}}{\text{smaller variance}}\)

3. **Determine the critical F-values** for a two-tailed test at the 10% significance level.

4. **Compare the calculated F-value** to the critical F-values to decide whether to reject the null hypothesis.



In [17]:

from scipy.stats import f

# Given data
sample_var1 = 25
sample_var2 = 20
n1 = 10
n2 = 15
alpha = 0.10

# Calculate the F-value
if sample_var1 > sample_var2:
    F = sample_var1 / sample_var2
    dfn = n1 - 1  # degrees of freedom for numerator
    dfd = n2 - 1  # degrees of freedom for denominator
else:
    F = sample_var2 / sample_var1
    dfn = n2 - 1  # degrees of freedom for numerator
    dfd = n1 - 1  # degrees of freedom for denominator

# Determine the critical F-values for a two-tailed test
alpha_half = alpha / 2
f_critical_low = f.ppf(alpha_half, dfn, dfd)
f_critical_high = f.ppf(1 - alpha_half, dfn, dfd)

# Calculate the p-value
p_value = 2 * min(f.cdf(F, dfn, dfd), 1 - f.cdf(F, dfn, dfd))

# Output the results
print(f"Calculated F-value: {F}")
print(f"Degrees of freedom (numerator): {dfn}")
print(f"Degrees of freedom (denominator): {dfd}")
print(f"Critical F-value (lower): {f_critical_low}")
print(f"Critical F-value (upper): {f_critical_high}")
print(f"p-value: {p_value}")

# Conclusion
if F < f_critical_low or F > f_critical_high:
    print("Reject the null hypothesis: The variances are significantly different.")
else:
    print("Fail to reject the null hypothesis: The variances are not significantly different.")


Calculated F-value: 1.25
Degrees of freedom (numerator): 9
Degrees of freedom (denominator): 14
Critical F-value (lower): 0.3305268601412525
Critical F-value (upper): 2.6457907352338195
p-value: 0.6832194382585952
Fail to reject the null hypothesis: The variances are not significantly different.
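
The cells for Q4 and Q7 repeat the same steps on summary statistics, so they could be wrapped in one helper. The sketch below assumes that only the two variances, the sample sizes, and the significance level are available; the function name `f_test_from_summary` and the dictionary keys are hypothetical.

```python
from scipy.stats import f

def f_test_from_summary(var1, n1, var2, n2, alpha):
    """Two-tailed F-test from sample variances and sample sizes (sketch)."""
    # Put the larger variance in the numerator, as in the cells above
    if var1 > var2:
        F, dfn, dfd = var1 / var2, n1 - 1, n2 - 1
    else:
        F, dfn, dfd = var2 / var1, n2 - 1, n1 - 1

    # Two-tailed critical values
    crit_low = f.ppf(alpha / 2, dfn, dfd)
    crit_high = f.ppf(1 - alpha / 2, dfn, dfd)

    # Two-sided p-value: double the smaller tail, capped at 1
    p_value = min(1.0, 2 * min(f.cdf(F, dfn, dfd), f.sf(F, dfn, dfd)))

    return {
        "F": F, "dfn": dfn, "dfd": dfd,
        "crit_low": crit_low, "crit_high": crit_high,
        "p_value": p_value,
        "reject": bool(F < crit_low or F > crit_high),
    }

res = f_test_from_summary(25, 10, 20, 15, alpha=0.10)
print(res)
```

With the Q7 inputs this should reproduce the F-value, degrees of freedom, and critical values printed above.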


### Q8. The following data represent the waiting times in minutes at two different restaurants on a Saturday night: Restaurant A: 24, 25, 28, 23, 22, 20, 27; Restaurant B: 31, 33, 35, 30, 32, 36. Conduct an F-test at the 5% significance level to determine if the variances are significantly different.

# F-test for Equality of Variances of Waiting Times at Two Restaurants

The following data represent the waiting times in minutes at two different restaurants on a Saturday night:
- Restaurant A: 24, 25, 28, 23, 22, 20, 27
- Restaurant B: 31, 33, 35, 30, 32, 36

We will conduct an F-test at the 5% significance level to determine if the variances are significantly different.

## Steps to Conduct the F-test

1. **Define the hypotheses**:
    - Null hypothesis (\(H_0\)): The variances of the two populations are equal (\(\sigma_1^2 = \sigma_2^2\)).
    - Alternative hypothesis (\(H_1\)): The variances of the two populations are not equal (\(\sigma_1^2 \neq \sigma_2^2\)).

2. **Calculate the sample variances** for each restaurant.

3. **Calculate the F-value**:
    - \(F = \frac{\text{larger variance}}{\text{smaller variance}}\)

4. **Determine the critical F-values** for a two-tailed test at the 5% significance level.

5. **Compare the calculated F-value** to the critical F-values to decide whether to reject the null hypothesis.



In [18]:

from scipy.stats import f
import numpy as np

# Given data
restaurant_A = [24, 25, 28, 23, 22, 20, 27]
restaurant_B = [31, 33, 35, 30, 32, 36]
alpha = 0.05

# Calculate the sample variances
sample_var_A = np.var(restaurant_A, ddof=1)
sample_var_B = np.var(restaurant_B, ddof=1)

# Calculate the F-value
if sample_var_A > sample_var_B:
    F = sample_var_A / sample_var_B
    dfn = len(restaurant_A) - 1  # degrees of freedom for numerator
    dfd = len(restaurant_B) - 1  # degrees of freedom for denominator
else:
    F = sample_var_B / sample_var_A
    dfn = len(restaurant_B) - 1  # degrees of freedom for numerator
    dfd = len(restaurant_A) - 1  # degrees of freedom for denominator

# Determine the critical F-values for a two-tailed test
alpha_half = alpha / 2
f_critical_low = f.ppf(alpha_half, dfn, dfd)
f_critical_high = f.ppf(1 - alpha_half, dfn, dfd)

# Calculate the p-value
p_value = 2 * min(f.cdf(F, dfn, dfd), 1 - f.cdf(F, dfn, dfd))

# Output the results
print(f"Sample variance of Restaurant A: {sample_var_A}")
print(f"Sample variance of Restaurant B: {sample_var_B}")
print(f"Calculated F-value: {F}")
print(f"Degrees of freedom (numerator): {dfn}")
print(f"Degrees of freedom (denominator): {dfd}")
print(f"Critical F-value (lower): {f_critical_low}")
print(f"Critical F-value (upper): {f_critical_high}")
print(f"p-value: {p_value}")

# Conclusion
if F < f_critical_low or F > f_critical_high:
    print("Reject the null hypothesis: The variances are significantly different.")
else:
    print("Fail to reject the null hypothesis: The variances are not significantly different.")


Sample variance of Restaurant A: 7.80952380952381
Sample variance of Restaurant B: 5.366666666666667
Calculated F-value: 1.4551907719609583
Degrees of freedom (numerator): 6
Degrees of freedom (denominator): 5
Critical F-value (lower): 0.16701279718024772
Critical F-value (upper): 6.977701858535566
p-value: 0.6974815747937484
Fail to reject the null hypothesis: The variances are not significantly different.


### Q9. The following data represent the test scores of two groups of students: Group A: 80, 85, 90, 92, 87, 83; Group B: 75, 78, 82, 79, 81, 84. Conduct an F-test at the 1% significance level to determine if the variances are significantly different.

# F-test for Equality of Variances of Test Scores between Two Groups

The following data represent the test scores of two groups of students:
- Group A: 80, 85, 90, 92, 87, 83
- Group B: 75, 78, 82, 79, 81, 84

We will conduct an F-test at the 1% significance level to determine if the variances are significantly different.

## Steps to Conduct the F-test

1. **Define the hypotheses**:
    - Null hypothesis (\(H_0\)): The variances of the two groups are equal (\(\sigma_1^2 = \sigma_2^2\)).
    - Alternative hypothesis (\(H_1\)): The variances of the two groups are not equal (\(\sigma_1^2 \neq \sigma_2^2\)).

2. **Calculate the sample variances** for each group.

3. **Calculate the F-value**:
    - \(F = \frac{\text{larger variance}}{\text{smaller variance}}\)

4. **Determine the critical F-values** for a two-tailed test at the 1% significance level.

5. **Compare the calculated F-value** to the critical F-values to decide whether to reject the null hypothesis.



In [19]:

from scipy.stats import f
import numpy as np

# Given data
group_A = [80, 85, 90, 92, 87, 83]
group_B = [75, 78, 82, 79, 81, 84]
alpha = 0.01

# Calculate the sample variances
sample_var_A = np.var(group_A, ddof=1)
sample_var_B = np.var(group_B, ddof=1)

# Calculate the F-value
if sample_var_A > sample_var_B:
    F = sample_var_A / sample_var_B
    dfn = len(group_A) - 1  # degrees of freedom for numerator
    dfd = len(group_B) - 1  # degrees of freedom for denominator
else:
    F = sample_var_B / sample_var_A
    dfn = len(group_B) - 1  # degrees of freedom for numerator
    dfd = len(group_A) - 1  # degrees of freedom for denominator

# Determine the critical F-values for a two-tailed test
alpha_half = alpha / 2
f_critical_low = f.ppf(alpha_half, dfn, dfd)
f_critical_high = f.ppf(1 - alpha_half, dfn, dfd)

# Calculate the p-value
p_value = 2 * min(f.cdf(F, dfn, dfd), 1 - f.cdf(F, dfn, dfd))

# Output the results
print(f"Sample variance of Group A: {sample_var_A}")
print(f"Sample variance of Group B: {sample_var_B}")
print(f"Calculated F-value: {F}")
print(f"Degrees of freedom (numerator): {dfn}")
print(f"Degrees of freedom (denominator): {dfd}")
print(f"Critical F-value (lower): {f_critical_low}")
print(f"Critical F-value (upper): {f_critical_high}")
print(f"p-value: {p_value}")

# Conclusion
if F < f_critical_low or F > f_critical_high:
    print("Reject the null hypothesis: The variances are significantly different.")
else:
    print("Fail to reject the null hypothesis: The variances are not significantly different.")


Sample variance of Group A: 19.76666666666667
Sample variance of Group B: 10.166666666666666
Calculated F-value: 1.9442622950819677
Degrees of freedom (numerator): 5
Degrees of freedom (denominator): 5
Critical F-value (lower): 0.06693617195469603
Critical F-value (upper): 14.939605459912219
p-value: 0.4831043549070688
Fail to reject the null hypothesis: The variances are not significantly different.
