                                            Statistics Advance-7

Q1. Write a Python function that takes in two arrays of data and calculates the F-value for a variance ratio
    test. The function should return the F-value and the corresponding p-value for the test.

In [1]:
import numpy as np
import scipy.stats as stats

def variance_ratio_test(data1, data2):
    """
    Perform a variance ratio test (F-test) on two arrays of data.

    Parameters:
        data1 (array-like): First data array.
        data2 (array-like): Second data array.

    Returns:
        f_value (float): The calculated F-value.
        p_value (float): The corresponding p-value.
    """
    
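    # Sample variances (ddof=1 gives the unbiased estimator).
    variance1 = np.var(data1, ddof=1)
    variance2 = np.var(data2, ddof=1)

    # Put the larger variance in the numerator so that F >= 1.
    if variance2 > variance1:
        variance1, variance2 = variance2, variance1
        data1, data2 = data2, data1

    f_value = variance1 / variance2

    # Degrees of freedom follow the (possibly swapped) samples.
    df1 = len(data1) - 1
    df2 = len(data2) - 1

    # Upper-tail (one-sided) p-value. For a two-sided test of equal
    # variances, double this value and cap the result at 1.
    p_value = 1 - stats.f.cdf(f_value, df1, df2)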

    return f_value, p_value


data_group1 = [13, 14, 15, 14, 16]
data_group2 = [10, 11, 12, 11, 13]

f_value, p_value = variance_ratio_test(data_group1, data_group2)
print("F-value:", f_value)
print("p-value:", p_value)


F-value: 1.0
p-value: 0.5


Q2. Given a significance level of 0.05 and the degrees of freedom for the numerator and denominator of an
F-distribution, write a Python function that returns the critical F-value for a two-tailed test.

In [3]:
import scipy.stats as stats

def critical_f_value(alpha, dfn, dfd):
    """
    Calculate the critical F-value for a two-tailed test at a given significance level.

    Parameters:
        alpha (float): Significance level (e.g., 0.05 for a 95% confidence level).
        dfn (int): Degrees of freedom for the numerator.
        dfd (int): Degrees of freedom for the denominator.

    Returns:
        critical_f (float): The critical F-value for the specified significance level.
    """
    
    alpha_over_2 = alpha / 2  
    critical_f = stats.f.ppf(1 - alpha_over_2, dfn, dfd)
    
    return critical_f


alpha = 0.05
dfn = 3  
dfd = 20  

critical_f = critical_f_value(alpha, dfn, dfd)
print("Critical F-value:", critical_f)


Critical F-value: 3.8586986662732143


Q3. Write a Python program that generates random samples from two normal distributions with known
variances and uses an F-test to determine if the variances are equal. The program should output the
F-value, degrees of freedom, and p-value for the test.

In [4]:
import numpy as np
import scipy.stats as stats

np.random.seed(0)

variance1 = 4.0  
variance2 = 9.0  

sample_size1 = 30  
sample_size2 = 30  

data1 = np.random.normal(0, np.sqrt(variance1), size=sample_size1)
data2 = np.random.normal(0, np.sqrt(variance2), size=sample_size2)


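# F-statistic: ratio of the unbiased sample variances.
f_statistic = np.var(data1, ddof=1) / np.var(data2, ddof=1)
dfn = sample_size1 - 1
dfd = sample_size2 - 1

# Upper-tail p-value, P(F >= f_statistic). A two-sided test of equal
# variances would use 2 * min(cdf, 1 - cdf) instead.
p_value = 1 - stats.f.cdf(f_statistic, dfn, dfd)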


print("F-value:", f_statistic)
print("Degrees of freedom (numerator):", dfn)
print("Degrees of freedom (denominator):", dfd)
print("p-value:", p_value)


F-value: 0.643788359078332
Degrees of freedom (numerator): 29
Degrees of freedom (denominator): 29
p-value: 0.8791884903269248
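
The p-value above is the upper-tail probability, which is large here only because the F-statistic is below 1. Since the question asks whether the variances are equal, a two-sided p-value is arguably the more relevant quantity. A minimal sketch follows; the helper name `two_sided_f_p_value` is hypothetical and not part of the original program.

```python
import scipy.stats as stats

def two_sided_f_p_value(f_statistic, dfn, dfd):
    """Two-sided p-value for an F-test of equal variances (illustrative helper)."""
    lower = stats.f.cdf(f_statistic, dfn, dfd)  # P(F <= f)
    upper = stats.f.sf(f_statistic, dfn, dfd)   # P(F >= f)
    # Double the smaller tail and cap at 1.
    return min(1.0, 2 * min(lower, upper))

# F-statistic and degrees of freedom taken from the output above.
p_two_sided = two_sided_f_p_value(0.643788359078332, 29, 29)
print("Two-sided p-value:", p_two_sided)
```

Doubling the smaller tail makes the result independent of which sample is placed in the numerator.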


Q4. The variances of two populations are known to be 10 and 15. A sample of 12 observations is taken from
each population. Conduct an F-test at the 5% significance level to determine if the variances are
significantly different.

To test at the 5% significance level whether the two variances differ, we use a two-tailed F-test of
H0: sigma_1^2 = sigma_2^2 against H1: sigma_1^2 != sigma_2^2, treating the given values 10 and 15 as the
variances observed in the two samples.

 1. Calculate the F-statistic, placing the larger variance in the numerator so that F >= 1:

        F = s_1^2 / s_2^2

    where s_1^2 = 15 is the larger variance and s_2^2 = 10 is the smaller one. The F-statistic is:

        F = 15 / 10 = 1.50

 2. Find the critical value of F from the F-distribution table,
    with degrees of freedom df_1 = n_1 - 1 = 12 - 1 = 11 and df_2 = n_2 - 1 = 12 - 1 = 11.

    For a two-tailed test at the 5% significance level, the upper-tail area is 0.025 and the critical
    value of F is approximately 3.47.

 3. Compare the F-statistic to the critical value of F. If the F-statistic is greater than the critical value,
    we reject the null hypothesis and conclude that the variances of the two populations are significantly
    different. Otherwise, we fail to reject the null hypothesis.

    The F-statistic (1.50) is less than the critical value of F (3.47). Therefore, we fail to reject the
    null hypothesis.

    Conclusion:
        There is not enough evidence to conclude that the variances of the two populations are significantly different
        at the 5% significance level.
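
The table lookup can be reproduced with SciPy. The sketch below assumes a two-tailed test and treats 10 and 15 as sample variances from samples of size 12; the variable names are illustrative. It also computes the lower critical value, which shows that the conclusion does not depend on which variance goes in the numerator.

```python
import scipy.stats as stats

var_a, var_b = 10.0, 15.0   # variances given in the question
n_a = n_b = 12
alpha = 0.05

dfn, dfd = n_a - 1, n_b - 1

# Two-tailed critical values: alpha/2 in each tail.
lower_crit = stats.f.ppf(alpha / 2, dfn, dfd)
upper_crit = stats.f.ppf(1 - alpha / 2, dfn, dfd)

f_small_over_large = var_a / var_b   # 10 / 15
f_large_over_small = var_b / var_a   # 15 / 10

# Reject H0 only if the statistic falls outside [lower_crit, upper_crit].
reject = not (lower_crit <= f_small_over_large <= upper_crit)

print("Lower critical F:", lower_crit)
print("Upper critical F:", upper_crit)
print("Reject H0:", reject)
```

With equal degrees of freedom the lower critical value is the reciprocal of the upper one, so testing 10/15 against the lower bound is equivalent to testing 15/10 against the upper bound (about 3.47).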

Q5. A manufacturer claims that the variance of the diameter of a certain product is 0.005. A sample of 25
products is taken, and the sample variance is found to be 0.006. Conduct an F-test at the 1% significance
level to determine if the claim is justified.

To test the manufacturer's claim at the 1% significance level, we take H0: sigma^2 = 0.005 against
H1: sigma^2 > 0.005.

   1. Calculate the F-statistic:
          F = s^2 / 0.005

        where s^2 is the sample variance.

        In this case, the F-statistic is:

            F = 0.006 / 0.005 = 1.20

    2. Find the critical value of F from the F-distribution table,
       with degrees of freedom df_1 = n - 1 = 25 - 1 = 24 and df_2 = infinity. The denominator is a claimed
       constant rather than a sample estimate, so it has infinite degrees of freedom. Equivalently,
       (n - 1) F = 24 x 1.20 = 28.8 follows a chi-square distribution with 24 degrees of freedom under H0.

        At the 1% significance level, the critical value of F is 1.79.

    3. Compare the F-statistic to the critical value of F.
           If the F-statistic is greater than the critical value of F,
            then we reject the null hypothesis and conclude that the manufacturer's claim is not justified.
            Otherwise, we fail to reject the null hypothesis.

           The F-statistic (1.20) is less than the critical value of F (1.79). Therefore, we fail to reject
                the null hypothesis.

        Conclusion:
            There is not enough evidence to reject the manufacturer's claim that the variance of the diameter
            of the product is 0.005 at the 1% significance level.


        Interpretation:

         The sample does not provide evidence at the 1% level that the variance exceeds the claimed value.
          The sample variance (0.006) is slightly higher than the claimed variance (0.005), but a difference
            of this size is well within what sampling variation would produce with 25 observations.
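
A test of a single sample variance against a claimed value is usually carried out with the chi-square statistic (n - 1) s^2 / sigma_0^2, which is the F-ratio above multiplied by n - 1. The sketch below assumes an upper-tailed alternative (variance larger than claimed); the variable names are illustrative.

```python
import scipy.stats as stats

claimed_var = 0.005
sample_var = 0.006
n = 25
alpha = 0.01

df = n - 1
chi2_stat = df * sample_var / claimed_var    # (n - 1) * s^2 / sigma_0^2

# Upper-tail critical value and p-value from the chi-square distribution.
chi2_crit = stats.chi2.ppf(1 - alpha, df)
p_value = stats.chi2.sf(chi2_stat, df)

print("Chi-square statistic:", chi2_stat)
print("Critical value:", chi2_crit)
print("p-value:", p_value)
print("Reject H0:", chi2_stat > chi2_crit)
```

Dividing the chi-square critical value (about 42.98) by the 24 degrees of freedom gives the corresponding critical value for the ratio s^2 / sigma_0^2, about 1.79.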
            
            

Q6. Write a Python function that takes in the degrees of freedom for the numerator and denominator of an
F-distribution and calculates the mean and variance of the distribution. The function should return the
mean and variance as a tuple.

In [5]:
import numpy as np

def calculate_mean_and_variance_of_f_distribution(df1, df2):
  """Calculates the mean and variance of the F-distribution.

  Args:
    df1: The degrees of freedom for the numerator.
    df2: The degrees of freedom for the denominator.

  Returns:
    A tuple containing the mean and variance of the F-distribution.
  """

  mean = df2 / (df2 - 2)
  # Valid for df2 > 4; the denominator is df1 * (df2 - 2)^2 * (df2 - 4).
  variance = (2 * df2**2 * (df1 + df2 - 2)) / (df1 * (df2 - 2)**2 * (df2 - 4))
  return mean, variance



df1 = 25 - 1
df2 = 25 - 1

mean, variance = calculate_mean_and_variance_of_f_distribution(df1, df2)

print("Mean:", mean)
print("Variance:", variance)


Mean: 1.0909090909090908
Variance: 0.228099173553719
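
A closed-form result like this can be cross-checked against SciPy, whose `scipy.stats.f.stats` method returns the moments of the F-distribution. This is only a sanity check under the same degrees of freedom (24 and 24), not a replacement for the function above.

```python
import scipy.stats as stats

dfn, dfd = 24, 24

# Moments reported by SciPy ('m' = mean, 'v' = variance).
scipy_mean, scipy_var = stats.f.stats(dfn, dfd, moments="mv")

# Closed-form expressions (mean needs dfd > 2, variance needs dfd > 4).
closed_mean = dfd / (dfd - 2)
closed_var = (2 * dfd**2 * (dfn + dfd - 2)) / (dfn * (dfd - 2)**2 * (dfd - 4))

print("Mean:    ", float(scipy_mean), closed_mean)
print("Variance:", float(scipy_var), closed_var)
```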


Q7. A random sample of 10 measurements is taken from a normal population with unknown variance. The
sample variance is found to be 25. Another random sample of 15 measurements is taken from another
normal population with unknown variance, and the sample variance is found to be 20. Conduct an F-test
at the 10% significance level to determine if the variances are significantly different.

In [6]:
import scipy.stats as stats

def f_distribution_mean_and_variance(dfn, dfd):
    """
    Calculate the mean and variance of an F-distribution.

    Parameters:
        dfn (int): Degrees of freedom for the numerator.
        dfd (int): Degrees of freedom for the denominator.

    Returns:
        (float, float): A tuple containing the mean and variance of the F-distribution.
    """

    mean = dfd / (dfd - 2) if dfd > 2 else None

    
    variance = (2 * dfd**2 * (dfn + dfd - 2)) / (dfn * (dfd - 2)**2 * (dfd - 4)) if dfd > 4 else None

    return mean, variance


dfn = 5  
dfd = 10  

mean, variance = f_distribution_mean_and_variance(dfn, dfd)
print("Mean:", mean)
print("Variance:", variance)


Mean: 1.25
Variance: 1.3541666666666667


This function can be used to calculate the mean and variance of the F-distribution for
any given degrees of freedom. It can be used for a variety of statistical applications,
such as hypothesis testing and power analysis.
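
Q7 itself asks for an F-test on two sample variances, which the cell above does not perform. A minimal sketch of that test follows, assuming a two-tailed alternative and the convention of placing the larger sample variance in the numerator; the variable names are illustrative.

```python
import scipy.stats as stats

var_1, n_1 = 25.0, 10   # first sample: variance and size
var_2, n_2 = 20.0, 15   # second sample: variance and size
alpha = 0.10

# The larger variance (sample 1) goes in the numerator.
f_stat = var_1 / var_2
dfn, dfd = n_1 - 1, n_2 - 1

# Two-tailed test: alpha/2 in the upper tail.
critical_f = stats.f.ppf(1 - alpha / 2, dfn, dfd)
p_value = min(1.0, 2 * stats.f.sf(f_stat, dfn, dfd))

print("F-value:", f_stat)
print("Degrees of freedom:", dfn, dfd)
print("Critical F-value:", critical_f)
print("p-value:", p_value)
print("Reject H0:", f_stat > critical_f)
```

With these inputs F = 1.25 is below the upper critical value (about 2.65 for 9 and 14 degrees of freedom), so the sketch does not reject equal variances at the 10% level.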

Q8. The following data represent the waiting times in minutes at two different restaurants on a Saturday
night: Restaurant A: 24, 25, 28, 23, 22, 20, 27; Restaurant B: 31, 33, 35, 30, 32, 36. Conduct an F-test at the 5%
significance level to determine if the variances are significantly different.


To conduct a two-tailed F-test at the 5% significance level to determine whether the variances of the two
restaurant waiting-time populations are significantly different:

    1. Calculate the sample variances for each restaurant:
        Restaurant A: mean = 24.14, s_A^2 = 46.86 / 6 = 7.810
        Restaurant B: mean = 32.83, s_B^2 = 26.83 / 5 = 5.367
    2. Calculate the F-statistic, with the larger variance in the numerator:
         F = s_A^2 / s_B^2 = 7.810 / 5.367 = 1.46

    3. Find the critical value of F from the F-distribution table,
        with degrees of freedom df_1 = n_A - 1 = 7 - 1 = 6 and df_2 = n_B - 1 = 6 - 1 = 5.

      For a two-tailed test at the 5% significance level (upper-tail area 0.025), the critical value of F is 6.98.

    4. Compare the F-statistic to the critical value of F.
        If the F-statistic is greater than the critical value of F, then we reject the null hypothesis
        and conclude that the variances of the two restaurant waiting-time populations are significantly different.
        Otherwise, we fail to reject the null hypothesis.

     The F-statistic (1.46) is less than the critical value of F (6.98). Therefore,
         we fail to reject the null hypothesis.

        Conclusion:
            There is not enough evidence to conclude that the variances of the two restaurant waiting-time
            populations are significantly different at the 5% significance level.
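
The sample variances and the critical value can be computed directly from the raw waiting times. The sketch below assumes a two-tailed test with the larger sample variance in the numerator; the variable names are illustrative.

```python
import numpy as np
import scipy.stats as stats

restaurant_a = [24, 25, 28, 23, 22, 20, 27]
restaurant_b = [31, 33, 35, 30, 32, 36]
alpha = 0.05

# Unbiased sample variances.
var_a = np.var(restaurant_a, ddof=1)
var_b = np.var(restaurant_b, ddof=1)

# Order the samples so the larger variance is the numerator.
if var_a >= var_b:
    f_stat = var_a / var_b
    dfn, dfd = len(restaurant_a) - 1, len(restaurant_b) - 1
else:
    f_stat = var_b / var_a
    dfn, dfd = len(restaurant_b) - 1, len(restaurant_a) - 1

critical_f = stats.f.ppf(1 - alpha / 2, dfn, dfd)

print("Variance A:", var_a)
print("Variance B:", var_b)
print("F-value:", f_stat)
print("Critical F-value:", critical_f)
print("Reject H0:", f_stat > critical_f)
```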

Q9. The following data represent the test scores of two groups of students: Group A: 80, 85, 90, 92, 87, 83;
Group B: 75, 78, 82, 79, 81, 84. Conduct an F-test at the 1% significance level to determine if the variances
are significantly different.

To conduct a two-tailed F-test at the 1% significance level to determine whether the variances of the two
groups of student test scores are significantly different:

    1. Calculate the sample variances for each group:
        Group A: mean = 86.17, s_A^2 = 98.83 / 5 = 19.77
        Group B: mean = 79.83, s_B^2 = 50.83 / 5 = 10.17
    2. Calculate the F-statistic, with the larger variance in the numerator:
        F = s_A^2 / s_B^2 = 19.77 / 10.17 = 1.94

    3. Find the critical value of F from the F-distribution table,
    with degrees of freedom df_1 = n_A - 1 = 6 - 1 = 5 and df_2 = n_B - 1 = 6 - 1 = 5.

    For a two-tailed test at the 1% significance level (upper-tail area 0.005), the critical value of F is 14.94.

    4. Compare the F-statistic to the critical value of F. If the F-statistic is greater than the critical value of F,
       then we reject the null hypothesis and conclude that the variances of the two groups of student test scores
        are significantly different. Otherwise, we fail to reject the null hypothesis.

       The F-statistic (1.94) is less than the critical value of F (14.94). Therefore, we fail to reject the null
       hypothesis.

       Conclusion:
        There is not enough evidence to conclude that the variances of the two groups of student test scores
        are significantly different at the 1% significance level.

      Interpretation:
        The data do not show, at the 1% level, that the variances of the two groups of student test scores
        are different.
           The sample variance for Group A (19.77) is roughly twice the sample variance for Group B (10.17),
            but with only six observations per group a ratio of this size is not statistically significant.
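
The same decision can be reached through a p-value instead of a table lookup. The sketch below computes the sample variances from the raw scores and a two-sided p-value, assuming the larger sample variance is placed in the numerator; the variable names are illustrative.

```python
import numpy as np
import scipy.stats as stats

group_a = [80, 85, 90, 92, 87, 83]
group_b = [75, 78, 82, 79, 81, 84]
alpha = 0.01

# Unbiased sample variances.
var_a = np.var(group_a, ddof=1)
var_b = np.var(group_b, ddof=1)

# Both groups have 6 scores, so the degrees of freedom are (5, 5)
# regardless of which variance is the numerator.
df = len(group_a) - 1
f_stat = max(var_a, var_b) / min(var_a, var_b)

# Two-sided p-value: double the upper tail, capped at 1.
p_value = min(1.0, 2 * stats.f.sf(f_stat, df, df))

print("Variance A:", var_a)
print("Variance B:", var_b)
print("F-value:", f_stat)
print("p-value:", p_value)
print("Reject H0:", p_value < alpha)
```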
            
            
            