# PW SKILLS

## Assignment Questions 

### Q1. Write a Python function that takes in two arrays of data and calculates the F-value for a variance ratio test. The function should return the F-value and the corresponding p-value for the test.
### Answer :

A variance ratio test compares two sample variances directly. The statistic is $F = s_1^2 / s_2^2$, which follows an F-distribution with $(n_1 - 1, n_2 - 1)$ degrees of freedom when both samples come from normal populations with equal variances. The function below computes this ratio and a two-sided p-value using scipy.stats.f:

In [1]:
import numpy as np
from scipy.stats import f

def variance_ratio_test(data1, data2):
    """
    Perform a two-sided variance ratio (F) test on two arrays of data.

    Parameters:
    - data1: First array of data
    - data2: Second array of data

    Returns:
    - f_value: Ratio of the sample variances, var(data1) / var(data2)
    - p_value: Corresponding two-sided p-value
    """

    # Ensure that the input data are arrays
    data1 = np.asarray(data1, dtype=float)
    data2 = np.asarray(data2, dtype=float)

    # Unbiased sample variances (ddof=1) and their degrees of freedom
    var1 = np.var(data1, ddof=1)
    var2 = np.var(data2, ddof=1)
    df1 = data1.size - 1
    df2 = data2.size - 1

    # F statistic and two-sided p-value (twice the smaller tail area)
    f_value = var1 / var2
    p_value = 2 * min(f.cdf(f_value, df1, df2), f.sf(f_value, df1, df2))

    return f_value, p_value

# Example usage:
data1 = [1, 2, 3, 4, 5]
data2 = [2, 4, 6, 8, 10]

f_value, p_value = variance_ratio_test(data1, data2)
print("F-value:", f_value)
print(f"P-value: {p_value:.3f}")


F-value: 0.25
P-value: 0.208


The sample variances are 2.5 and 10, so F = 0.25 on (4, 4) degrees of freedom. Note that scipy.stats.f_oneway is not a substitute here: it performs a one-way ANOVA, which tests whether the group means are equal, not whether the variances are equal.
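As a side check on what scipy.stats.f_oneway measures, the sketch below (reusing the two example arrays, with variable names chosen only for illustration) shows that for two groups its F statistic equals the square of the pooled-variance t statistic from scipy.stats.ttest_ind. In other words, it responds to a difference in means rather than a difference in spread.

```python
import numpy as np
from scipy.stats import f_oneway, ttest_ind

# Same example arrays as above (names are illustrative)
group_a = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
group_b = np.array([2.0, 4.0, 6.0, 8.0, 10.0])

# One-way ANOVA on two groups
anova_f, anova_p = f_oneway(group_a, group_b)

# Two-sample t-test with pooled variance
t_stat, t_p = ttest_ind(group_a, group_b, equal_var=True)

print("ANOVA F statistic:", anova_f)
print("Squared t statistic:", t_stat ** 2)
```

The two printed numbers should agree up to floating-point error, and the two p-values are likewise the same.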

### Q2. Given a significance level of 0.05 and the degrees of freedom for the numerator and denominator of an F-distribution, write a Python function that returns the critical F-value for a two-tailed test.
### Answer :

For a two-tailed test at significance level $\alpha$, each tail receives $\alpha / 2$, so the upper critical value is the $1 - \alpha/2$ quantile of the F-distribution. scipy.stats.f.ppf (the inverse CDF) returns it directly:

In [2]:
from scipy.stats import f

def critical_f_value(significance_level, df_num, df_denom):
    """
    Calculate the critical F-value for a two-tailed test.

    Parameters:
    - significance_level: Significance level (e.g., 0.05)
    - df_num: Degrees of freedom for the numerator
    - df_denom: Degrees of freedom for the denominator

    Returns:
    - critical_f: Critical F-value
    """

    alpha = significance_level / 2  # Two-tailed test, so divide significance level by 2

    # Calculate the critical F-value using scipy.stats.f.ppf
    critical_f = f.ppf(1 - alpha, df_num, df_denom)

    return critical_f

# Example usage:
significance_level = 0.05
df_num = 3
df_denom = 20

critical_f = critical_f_value(significance_level, df_num, df_denom)
print("Critical F-value:", critical_f)


Critical F-value: 3.8586986662732143


This function uses scipy.stats.f.ppf to compute the upper critical F-value, that is, the $1 - \alpha/2$ quantile for the given numerator (df_num) and denominator (df_denom) degrees of freedom. A two-tailed test also has a lower critical value, f.ppf(alpha, df_num, df_denom) with alpha already halved; when the larger sample variance is always placed in the numerator, only the upper value is needed.
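For completeness, here is a minimal sketch that returns both critical values of the two-tailed rejection region. The function name `two_tailed_critical_values` is an assumption introduced for this example; it relies only on scipy.stats.f.ppf.

```python
from scipy.stats import f

def two_tailed_critical_values(significance_level, df_num, df_denom):
    """Return (lower, upper) critical F-values for a two-tailed test."""
    tail = significance_level / 2  # area placed in each tail
    lower = f.ppf(tail, df_num, df_denom)
    upper = f.ppf(1 - tail, df_num, df_denom)
    return lower, upper

lower, upper = two_tailed_critical_values(0.05, 3, 20)
print("Lower critical F-value:", lower)
print("Upper critical F-value:", upper)
```

The null hypothesis is rejected when the observed F falls below the lower value or above the upper value. The lower value for (df_num, df_denom) is the reciprocal of the upper value for (df_denom, df_num).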

### Q3. Write a Python program that generates random samples from two normal distributions with known variances and uses an F-test to determine if the variances are equal. The program should output the F-value, degrees of freedom, and p-value for the test.
### Answer :

The program below draws two samples from normal distributions with standard deviations 1 and 1.5, then tests equality of variances. The F statistic is the ratio of the sample variances, with the larger one in the numerator, and its degrees of freedom are the corresponding sample sizes minus one:

In [3]:
import numpy as np
from scipy.stats import f

def f_test_for_equal_variances(sample1, sample2):
    """
    Perform a two-sided F-test for equal variances on two samples.

    Parameters:
    - sample1: First sample
    - sample2: Second sample

    Returns:
    - f_value: Ratio of the larger to the smaller sample variance
    - df_num: Degrees of freedom of the numerator variance
    - df_denom: Degrees of freedom of the denominator variance
    - p_value: Corresponding two-sided p-value
    """

    # Calculate unbiased sample variances
    var1 = np.var(sample1, ddof=1)
    var2 = np.var(sample2, ddof=1)

    # Place the larger variance in the numerator so that F >= 1;
    # each variance carries (sample size - 1) degrees of freedom
    if var1 >= var2:
        f_value = var1 / var2
        df_num, df_denom = len(sample1) - 1, len(sample2) - 1
    else:
        f_value = var2 / var1
        df_num, df_denom = len(sample2) - 1, len(sample1) - 1

    # Two-sided p-value: twice the smaller tail area
    p_value = 2 * min(f.cdf(f_value, df_num, df_denom), f.sf(f_value, df_num, df_denom))

    return f_value, df_num, df_denom, p_value

# Example usage:
np.random.seed(42)  # Set seed for reproducibility

# Generate random samples from two normal distributions
sample1 = np.random.normal(loc=0, scale=1, size=30)
sample2 = np.random.normal(loc=0, scale=1.5, size=30)

# Perform F-test for equal variances
f_value, df_num, df_denom, p_value = f_test_for_equal_variances(sample1, sample2)

# Output results
print("F-value:", f_value)
print("Numerator degrees of freedom:", df_num)
print("Denominator degrees of freedom:", df_denom)
print(f"P-value: {p_value:.3f}")


F-value: 2.4081636671276785
Numerator degrees of freedom: 29
Denominator degrees of freedom: 29
P-value: 0.021


With 30 observations per sample, both variances have 29 degrees of freedom. The p-value is below 0.05, which is consistent with the samples having been generated with different standard deviations (1 and 1.5).
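One way to sanity-check an F-test for equal variances is to simulate many pairs of samples whose true variances really are equal and count how often the test rejects. The sketch below is an illustration under assumed settings (2000 trials, samples of size 30, a helper named `two_sided_f_test_p_value`); the rejection rate should land near the chosen significance level.

```python
import numpy as np
from scipy.stats import f

def two_sided_f_test_p_value(sample1, sample2):
    """Two-sided p-value for H0: equal variances, using var1 / var2."""
    var1 = np.var(sample1, ddof=1)
    var2 = np.var(sample2, ddof=1)
    df1, df2 = len(sample1) - 1, len(sample2) - 1
    ratio = var1 / var2
    return 2 * min(f.cdf(ratio, df1, df2), f.sf(ratio, df1, df2))

rng = np.random.default_rng(0)  # seeded generator for reproducibility
alpha = 0.05
n_trials = 2000
rejections = 0

for _ in range(n_trials):
    # Both samples share the same true variance, so H0 holds
    a = rng.normal(loc=0.0, scale=1.0, size=30)
    b = rng.normal(loc=0.0, scale=1.0, size=30)
    if two_sided_f_test_p_value(a, b) < alpha:
        rejections += 1

rejection_rate = rejections / n_trials
print("Empirical rejection rate under equal variances:", rejection_rate)
```

A rate far from 0.05 would suggest that the degrees of freedom or the tail calculation are wrong.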

### Q4. The variances of two populations are known to be 10 and 15. A sample of 12 observations is taken from each population. Conduct an F-test at the 5% significance level to determine if the variances are significantly different.
### Answer :

The values 10 and 15 are treated here as variances estimated from the two samples of 12 observations, so each carries 12 - 1 = 11 degrees of freedom. The script below forms the ratio of the larger to the smaller variance and computes a two-sided p-value:

In [4]:
from scipy.stats import f

def f_test_for_equal_variances(variance1, variance2, sample_size1, sample_size2):
    """
    Perform a two-sided F-test for equal variances from summary statistics.

    Parameters:
    - variance1: Variance of the first sample
    - variance2: Variance of the second sample
    - sample_size1: Size of the first sample
    - sample_size2: Size of the second sample

    Returns:
    - f_value: Ratio of the larger to the smaller variance
    - df_num: Degrees of freedom of the numerator variance
    - df_denom: Degrees of freedom of the denominator variance
    - p_value: Corresponding two-sided p-value
    """

    # Place the larger variance in the numerator so that F >= 1;
    # each variance carries (sample size - 1) degrees of freedom
    if variance1 >= variance2:
        f_value = variance1 / variance2
        df_num, df_denom = sample_size1 - 1, sample_size2 - 1
    else:
        f_value = variance2 / variance1
        df_num, df_denom = sample_size2 - 1, sample_size1 - 1

    # Two-sided p-value: twice the smaller tail area
    p_value = 2 * min(f.cdf(f_value, df_num, df_denom), f.sf(f_value, df_num, df_denom))

    return f_value, df_num, df_denom, p_value

# Given data
variance1 = 10
variance2 = 15
sample_size1 = 12
sample_size2 = 12
significance_level = 0.05

# Perform F-test for equal variances
f_value, df_num, df_denom, p_value = f_test_for_equal_variances(variance1, variance2, sample_size1, sample_size2)

# Output results
print("F-value:", f_value)
print("Numerator degrees of freedom:", df_num)
print("Denominator degrees of freedom:", df_denom)
print(f"P-value: {p_value:.3f}")

# Check if the null hypothesis is rejected based on the significance level
if p_value < significance_level:
    print("Reject the null hypothesis. Variances are significantly different.")
else:
    print("Fail to reject the null hypothesis. Variances are not significantly different.")


F-value: 1.5
Numerator degrees of freedom: 11
Denominator degrees of freedom: 11
P-value: 0.512
Fail to reject the null hypothesis. Variances are not significantly different.


With F = 1.5 on (11, 11) degrees of freedom, the p-value is far above 0.05, so the data give no evidence that the two variances differ.
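The same decision can be reached by comparing the observed ratio with the two-tailed critical values instead of computing a p-value. The sketch below assumes that each of the two variances (10 and 15) comes from a sample of 12 and therefore has 12 - 1 = 11 degrees of freedom; the variable names are illustrative.

```python
from scipy.stats import f

# Observed ratio: larger variance over smaller variance
f_value = 15 / 10

# Assumption: each variance is based on 12 observations
df_num = 12 - 1
df_denom = 12 - 1
alpha = 0.05

# Two-tailed rejection region: alpha / 2 in each tail
lower_critical = f.ppf(alpha / 2, df_num, df_denom)
upper_critical = f.ppf(1 - alpha / 2, df_num, df_denom)

reject = f_value < lower_critical or f_value > upper_critical
print("Lower critical value:", lower_critical)
print("Upper critical value:", upper_critical)
print("Reject H0:", reject)
```

Because 1.5 lies between the two critical values, the null hypothesis of equal variances is not rejected at the 5% level.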

### Q5. A manufacturer claims that the variance of the diameter of a certain product is 0.005. A sample of 25 products is taken, and the sample variance is found to be 0.006. Conduct an F-test at the 1% significance level to determine if the claim is justified.
### Answer :

Here a single sample variance is compared with a claimed population variance $\sigma_0^2$. Under the null hypothesis, the ratio $s^2 / \sigma_0^2$ follows an F-distribution with $(n - 1, \infty)$ degrees of freedom, which is the same as saying that $(n - 1) s^2 / \sigma_0^2$ follows a chi-square distribution with $n - 1$ degrees of freedom. The script below uses the chi-square form to obtain a two-sided p-value at the 1% significance level:

In [5]:
from scipy.stats import chi2

def variance_test_against_claim(claimed_variance, sample_variance, sample_size):
    """
    Test a sample variance against a claimed population variance.

    Parameters:
    - claimed_variance: Claimed population variance
    - sample_variance: Sample variance
    - sample_size: Sample size

    Returns:
    - variance_ratio: sample_variance / claimed_variance (F ratio)
    - chi2_stat: (sample_size - 1) * variance_ratio
    - df: Degrees of freedom, sample_size - 1
    - p_value: Corresponding two-sided p-value
    """

    # Degrees of freedom of the sample variance
    df = sample_size - 1

    # Variance ratio and the equivalent chi-square statistic
    variance_ratio = sample_variance / claimed_variance
    chi2_stat = df * variance_ratio

    # Two-sided p-value: twice the smaller tail area
    p_value = 2 * min(chi2.cdf(chi2_stat, df), chi2.sf(chi2_stat, df))

    return variance_ratio, chi2_stat, df, p_value

# Given data
claimed_variance = 0.005
sample_variance = 0.006
sample_size = 25
significance_level = 0.01  # 1% significance level

# Perform the test
variance_ratio, chi2_stat, df, p_value = variance_test_against_claim(claimed_variance, sample_variance, sample_size)

# Output results
print("Variance ratio (F-value):", variance_ratio)
print(f"Chi-square statistic: {chi2_stat:.1f}")
print("Degrees of freedom:", df)
print(f"P-value: {p_value:.3f}")

# Check if the null hypothesis is rejected based on the significance level
if p_value < significance_level:
    print("Reject the null hypothesis. The claimed variance is not justified.")
else:
    print("Fail to reject the null hypothesis. The data are consistent with the claimed variance.")


Variance ratio (F-value): 1.2
Chi-square statistic: 28.8
Degrees of freedom: 24
P-value: 0.455
Fail to reject the null hypothesis. The data are consistent with the claimed variance.


The sample variance is 1.2 times the claimed value, which is well within what sampling variation produces with 24 degrees of freedom, so the manufacturer's claim is not contradicted at the 1% level.
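A confidence interval gives another view of the same question. The sketch below builds a 99% interval for the population variance from the chi-square distribution with n - 1 degrees of freedom and checks whether the claimed value 0.005 falls inside it. It is a minimal illustration that assumes normally distributed diameters; the variable names are chosen for this example.

```python
from scipy.stats import chi2

claimed_variance = 0.005
sample_variance = 0.006
sample_size = 25
confidence = 0.99  # matches the 1% significance level

df = sample_size - 1
tail = (1 - confidence) / 2

# (n - 1) * s^2 / sigma^2 ~ chi-square(n - 1), inverted to bound sigma^2
ci_lower = df * sample_variance / chi2.ppf(1 - tail, df)
ci_upper = df * sample_variance / chi2.ppf(tail, df)

print("99% confidence interval for the variance:", (ci_lower, ci_upper))
print("Claimed variance inside interval:", ci_lower <= claimed_variance <= ci_upper)
```

If the claimed variance lies inside the interval, a two-sided test at the matching significance level does not reject the claim.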

### Q6. Write a Python function that takes in the degrees of freedom for the numerator and denominator of an F-distribution and calculates the mean and variance of the distribution. The function should return the mean and variance as a tuple.
### Answer : 

The mean $\mu$ and variance $\sigma^2$ of an F-distribution with $df_1$ (numerator) and $df_2$ (denominator) degrees of freedom are given by:

$$\mu = \frac{df_2}{df_2 - 2}$$

$$\sigma^2 = \frac{2 \cdot df_2^2 \cdot (df_1 + df_2 - 2)}{df_1 \cdot (df_2 - 2)^2 \cdot (df_2 - 4)}$$

Here's the Python function:

In [6]:
def calculate_f_distribution_mean_and_variance(df_num, df_denom):
    """
    Calculate the mean and variance of an F-distribution.

    Parameters:
    - df_num: Degrees of freedom for the numerator
    - df_denom: Degrees of freedom for the denominator

    Returns:
    - mean: Mean of the F-distribution
    - variance: Variance of the F-distribution
    """

    # Calculate mean
    mean = df_denom / (df_denom - 2)

    # Calculate variance
    variance = (2 * df_denom ** 2 * (df_num + df_denom - 2)) / (df_num * (df_denom - 2) ** 2 * (df_denom - 4))

    return mean, variance

# Example usage:
df_num = 3
df_denom = 20

mean, variance = calculate_f_distribution_mean_and_variance(df_num, df_denom)
print("Mean of the F-distribution:", mean)
print("Variance of the F-distribution:", variance)


Mean of the F-distribution: 1.1111111111111112
Variance of the F-distribution: 1.0802469135802468


This function, calculate_f_distribution_mean_and_variance, takes the degrees of freedom for the numerator (df_num) and denominator (df_denom) as input and returns the mean and variance of the corresponding F-distribution.
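The closed-form expressions can be cross-checked against scipy.stats.f.stats, which returns the moments of the distribution. The sketch below also adds a guard, since the mean is finite only for a denominator degrees of freedom above 2 and the variance only above 4; the function name `f_mean_variance_closed_form` is an assumption made for this example.

```python
import numpy as np
from scipy.stats import f

def f_mean_variance_closed_form(df_num, df_denom):
    """Mean and variance of an F-distribution; requires df_denom > 4."""
    if df_denom <= 4:
        raise ValueError("The variance is finite only for df_denom > 4.")
    mean = df_denom / (df_denom - 2)
    variance = (2 * df_denom ** 2 * (df_num + df_denom - 2)) / (
        df_num * (df_denom - 2) ** 2 * (df_denom - 4)
    )
    return mean, variance

df_num, df_denom = 3, 20
mean_formula, var_formula = f_mean_variance_closed_form(df_num, df_denom)

# moments="mv" asks SciPy for the mean and the variance
mean_scipy, var_scipy = f.stats(df_num, df_denom, moments="mv")

print("Closed form:", (mean_formula, var_formula))
print("SciPy:", (float(mean_scipy), float(var_scipy)))
```

Both lines should print the same pair of values up to floating-point error.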

### Q7. A random sample of 10 measurements is taken from a normal population with unknown variance. The sample variance is found to be 25. Another random sample of 15 measurements is taken from another normal population with unknown variance, and the sample variance is found to be 20. Conduct an F-test at the 10% significance level to determine if the variances are significantly different.
### Answer : 

To test whether the two population variances differ, form the ratio of the larger sample variance to the smaller one. Each variance carries its own sample size minus one degrees of freedom, so here the statistic has (10 - 1, 15 - 1) = (9, 14) degrees of freedom. The script below performs the two-sided test at the 10% significance level:

In [7]:
from scipy.stats import f

def f_test_for_equal_variances(sample_variance1, sample_size1, sample_variance2, sample_size2):
    """
    Perform a two-sided F-test for equal variances from summary statistics.

    Parameters:
    - sample_variance1: Sample variance of the first sample
    - sample_size1: Size of the first sample
    - sample_variance2: Sample variance of the second sample
    - sample_size2: Size of the second sample

    Returns:
    - f_value: Ratio of the larger to the smaller sample variance
    - df_num: Degrees of freedom of the numerator variance
    - df_denom: Degrees of freedom of the denominator variance
    - p_value: Corresponding two-sided p-value
    """

    # Place the larger variance in the numerator so that F >= 1;
    # each variance carries (sample size - 1) degrees of freedom
    if sample_variance1 >= sample_variance2:
        f_value = sample_variance1 / sample_variance2
        df_num, df_denom = sample_size1 - 1, sample_size2 - 1
    else:
        f_value = sample_variance2 / sample_variance1
        df_num, df_denom = sample_size2 - 1, sample_size1 - 1

    # Two-sided p-value: twice the smaller tail area
    p_value = 2 * min(f.cdf(f_value, df_num, df_denom), f.sf(f_value, df_num, df_denom))

    return f_value, df_num, df_denom, p_value

# Given data
sample_variance1 = 25
sample_size1 = 10
sample_variance2 = 20
sample_size2 = 15
significance_level = 0.10  # 10% significance level

# Perform F-test for equal variances
f_value, df_num, df_denom, p_value = f_test_for_equal_variances(sample_variance1, sample_size1, sample_variance2, sample_size2)

# Output results
print("F-value:", f_value)
print("Numerator degrees of freedom:", df_num)
print("Denominator degrees of freedom:", df_denom)
print(f"P-value: {p_value:.3f}")

# Check if the null hypothesis is rejected based on the significance level
if p_value < significance_level:
    print("Reject the null hypothesis. Variances are significantly different.")
else:
    print("Fail to reject the null hypothesis. Variances are not significantly different.")


F-value: 1.25
Numerator degrees of freedom: 9
Denominator degrees of freedom: 14
P-value: 0.683
Fail to reject the null hypothesis. Variances are not significantly different.


With F = 1.25 on (9, 14) degrees of freedom, the p-value is well above 0.10, so there is no evidence at the 10% level that the two variances differ.
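A confidence interval for the ratio of the two population variances conveys the same conclusion with a sense of magnitude. The sketch below assumes normal populations and uses (9, 14) degrees of freedom, that is, each sample size minus one; the variable names are illustrative.

```python
from scipy.stats import f

s1_sq, n1 = 25, 10  # first sample: variance and size
s2_sq, n2 = 20, 15  # second sample: variance and size
alpha = 0.10        # gives a 90% confidence interval

ratio = s1_sq / s2_sq
df1, df2 = n1 - 1, n2 - 1

# (s1^2 / s2^2) / (sigma1^2 / sigma2^2) ~ F(df1, df2), inverted for the ratio
ci_lower = ratio / f.ppf(1 - alpha / 2, df1, df2)
ci_upper = ratio / f.ppf(alpha / 2, df1, df2)

print("90% confidence interval for the variance ratio:", (ci_lower, ci_upper))
print("Interval contains 1:", ci_lower <= 1 <= ci_upper)
```

An interval that contains 1 corresponds to failing to reject equal variances in the two-sided test at the 10% level.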

### Q8. The following data represent the waiting times in minutes at two different restaurants on a Saturday night: Restaurant A: 24, 25, 28, 23, 22, 20, 27; Restaurant B: 31, 33, 35, 30, 32, 36. Conduct an F-test at the 5% significance level to determine if the variances are significantly different.
### Answer : 

The script below computes the two sample variances from the raw waiting times, forms the ratio of the larger to the smaller, and tests it against an F-distribution whose degrees of freedom are the corresponding sample sizes minus one:

In [8]:
import numpy as np
from scipy.stats import f

def f_test_for_equal_variances(sample1, sample2):
    """
    Perform a two-sided F-test for equal variances on two samples.

    Parameters:
    - sample1: First sample
    - sample2: Second sample

    Returns:
    - f_value: Ratio of the larger to the smaller sample variance
    - df_num: Degrees of freedom of the numerator variance
    - df_denom: Degrees of freedom of the denominator variance
    - p_value: Corresponding two-sided p-value
    """

    # Calculate unbiased sample variances
    var1 = sum((x - np.mean(sample1))**2 for x in sample1) / (len(sample1) - 1)
    var2 = sum((x - np.mean(sample2))**2 for x in sample2) / (len(sample2) - 1)

    # Place the larger variance in the numerator so that F >= 1;
    # each variance carries (sample size - 1) degrees of freedom
    if var1 >= var2:
        f_value = var1 / var2
        df_num, df_denom = len(sample1) - 1, len(sample2) - 1
    else:
        f_value = var2 / var1
        df_num, df_denom = len(sample2) - 1, len(sample1) - 1

    # Two-sided p-value: twice the smaller tail area
    p_value = 2 * min(f.cdf(f_value, df_num, df_denom), f.sf(f_value, df_num, df_denom))

    return f_value, df_num, df_denom, p_value

# Given data
sample_A = [24, 25, 28, 23, 22, 20, 27]
sample_B = [31, 33, 35, 30, 32, 36]
significance_level = 0.05  # 5% significance level

# Perform F-test for equal variances
f_value, df_num, df_denom, p_value = f_test_for_equal_variances(sample_A, sample_B)

# Output results
print("F-value:", f_value)
print("Numerator degrees of freedom:", df_num)
print("Denominator degrees of freedom:", df_denom)
print(f"P-value: {p_value:.3f}")

# Check if the null hypothesis is rejected based on the significance level
if p_value < significance_level:
    print("Reject the null hypothesis. Variances are significantly different.")
else:
    print("Fail to reject the null hypothesis. Variances are not significantly different.")


F-value: 1.4551907719609583
Numerator degrees of freedom: 6
Denominator degrees of freedom: 5
P-value: 0.697
Fail to reject the null hypothesis. Variances are not significantly different.


Restaurant A has the larger sample variance (about 7.81 versus 5.37), so it goes in the numerator with 6 degrees of freedom. The p-value is far above 0.05, so the waiting-time variances are not significantly different.
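The F-test for variances is sensitive to departures from normality, and with samples this small normality is hard to verify. As a hedged cross-check, the sketch below runs Levene's test from scipy.stats on the same waiting times, using the median-centered variant, which is a different technique from the F-test above.

```python
from scipy.stats import levene

sample_A = [24, 25, 28, 23, 22, 20, 27]
sample_B = [31, 33, 35, 30, 32, 36]
significance_level = 0.05

# center="median" selects the median-based (Brown-Forsythe) variant
levene_stat, levene_p = levene(sample_A, sample_B, center="median")

print("Levene statistic:", levene_stat)
print("Levene p-value:", levene_p)
print("Reject H0:", levene_p < significance_level)
```

If both tests agree, the conclusion does not hinge on the normality assumption behind the F-test.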

### Q9. The following data represent the test scores of two groups of students: Group A: 80, 85, 90, 92, 87, 83; Group B: 75, 78, 82, 79, 81, 84. Conduct an F-test at the 1% significance level to determine if the variances are significantly different.
### Answer :

The approach is the same as for the waiting-time data: compute both sample variances, divide the larger by the smaller, and compare the ratio with an F-distribution whose degrees of freedom are the corresponding sample sizes minus one. Here both groups have 6 scores, so the statistic has (5, 5) degrees of freedom:

In [9]:
import numpy as np
from scipy.stats import f

def f_test_for_equal_variances(sample1, sample2):
    """
    Perform a two-sided F-test for equal variances on two samples.

    Parameters:
    - sample1: First sample
    - sample2: Second sample

    Returns:
    - f_value: Ratio of the larger to the smaller sample variance
    - df_num: Degrees of freedom of the numerator variance
    - df_denom: Degrees of freedom of the denominator variance
    - p_value: Corresponding two-sided p-value
    """

    # Calculate unbiased sample variances
    var1 = sum((x - np.mean(sample1))**2 for x in sample1) / (len(sample1) - 1)
    var2 = sum((x - np.mean(sample2))**2 for x in sample2) / (len(sample2) - 1)

    # Place the larger variance in the numerator so that F >= 1;
    # each variance carries (sample size - 1) degrees of freedom
    if var1 >= var2:
        f_value = var1 / var2
        df_num, df_denom = len(sample1) - 1, len(sample2) - 1
    else:
        f_value = var2 / var1
        df_num, df_denom = len(sample2) - 1, len(sample1) - 1

    # Two-sided p-value: twice the smaller tail area
    p_value = 2 * min(f.cdf(f_value, df_num, df_denom), f.sf(f_value, df_num, df_denom))

    return f_value, df_num, df_denom, p_value

# Given data
sample_A = [80, 85, 90, 92, 87, 83]
sample_B = [75, 78, 82, 79, 81, 84]
significance_level = 0.01  # 1% significance level

# Perform F-test for equal variances
f_value, df_num, df_denom, p_value = f_test_for_equal_variances(sample_A, sample_B)

# Output results
print("F-value:", f_value)
print("Numerator degrees of freedom:", df_num)
print("Denominator degrees of freedom:", df_denom)
print(f"P-value: {p_value:.3f}")

# Check if the null hypothesis is rejected based on the significance level
if p_value < significance_level:
    print("Reject the null hypothesis. Variances are significantly different.")
else:
    print("Fail to reject the null hypothesis. Variances are not significantly different.")


F-value: 1.9442622950819677
Numerator degrees of freedom: 5
Denominator degrees of freedom: 5
P-value: 0.483
Fail to reject the null hypothesis. Variances are not significantly different.


Group A's sample variance (about 19.77) is roughly twice Group B's (about 10.17), but with only 5 degrees of freedom each this difference is not significant at the 1% level.
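Bartlett's test is another normal-theory test of equal variances, available as scipy.stats.bartlett, and for two groups it should point to the same conclusion as the F-test. The sketch below applies it to the same scores as a cross-check; it is a different statistic, so its p-value need not match the F-test's exactly.

```python
from scipy.stats import bartlett

sample_A = [80, 85, 90, 92, 87, 83]
sample_B = [75, 78, 82, 79, 81, 84]
significance_level = 0.01

# Bartlett's statistic is compared with a chi-square distribution (k - 1 = 1 df)
bartlett_stat, bartlett_p = bartlett(sample_A, sample_B)

print("Bartlett statistic:", bartlett_stat)
print("Bartlett p-value:", bartlett_p)
print("Reject H0:", bartlett_p < significance_level)
```

Like the F-test, Bartlett's test assumes normal populations, so it confirms the arithmetic rather than the modeling assumption.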