### Problem_1: Write a Python function that takes in two arrays of data and calculates the F-value for a variance ratio test. The function should return the F-value and the corresponding p-value for the test.

In [2]:
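from scipy.stats import f
import numpy as np

def f_test(data1, data2):
  # Check if data is valid
  if not isinstance(data1, (list, np.ndarray)) or not isinstance(data2, (list, np.ndarray)):
    raise TypeError("Input data must be lists or NumPy arrays.")

  # Unbiased sample variances (ddof=1) and their degrees of freedom
  var1 = np.var(data1, ddof=1)
  var2 = np.var(data2, ddof=1)
  df1 = len(data1) - 1
  df2 = len(data2) - 1

  # Variance ratio F-statistic (f_oneway would compare means, not variances)
  F = var1 / var2

  # Two-tailed p-value from the F-distribution
  p = min(1.0, 2 * min(f.cdf(F, df1, df2), f.sf(F, df1, df2)))
  return F, p

# Example usage
data1 = [1, 2, 3, 4, 5]
data2 = [6, 7, 8, 9, 10]
F, p = f_test(data1, data2)

print(f"F-value: {F:.4f}")
print(f"p-value: {p:.4f}")

F-value: 1.0000
p-value: 1.0000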


### Problem_2: Given a significance level of 0.05 and the degrees of freedom for the numerator and denominator of an F-distribution, write a Python function that returns the critical F-value for a two-tailed test.

In [3]:
from scipy.stats import f

def critical_f_value(alpha, df1, df2):
  # Check if alpha is between 0 and 1
  if not 0 < alpha < 1:
    raise ValueError("alpha must be between 0 and 1.")
  
  # Calculate critical F-value with scipy.stats.f.ppf
  return f.ppf(1 - alpha / 2, df1, df2)

# Example usage
alpha = 0.05
df1 = 10
df2 = 15
critical_f = critical_f_value(alpha, df1, df2)

print(f"Critical F-value (alpha={alpha:.4f}, df1={df1}, df2={df2}): {critical_f:.4f}")


Critical F-value (alpha=0.0500, df1=10, df2=15): 3.0602
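
A two-tailed test also has a lower critical value, which the function above does not return. The following is a minimal sketch that computes both bounds; the helper name `critical_f_bounds` is hypothetical and not part of the original solution.

```python
from scipy.stats import f

def critical_f_bounds(alpha, df1, df2):
    """Return (lower, upper) critical F-values for a two-tailed test."""
    if not 0 < alpha < 1:
        raise ValueError("alpha must be between 0 and 1.")
    # Put alpha / 2 of the probability mass in each tail
    lower = f.ppf(alpha / 2, df1, df2)
    upper = f.ppf(1 - alpha / 2, df1, df2)
    return lower, upper

lower, upper = critical_f_bounds(0.05, 10, 15)
print(f"Lower: {lower:.4f}, Upper: {upper:.4f}")
```

The lower bound is generally not the reciprocal of the upper bound; it equals the reciprocal of the upper bound with the two degrees of freedom swapped.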


### Problem_3: Write a Python program that generates random samples from two normal distributions with known variances and uses an F-test to determine if the variances are equal. The program should output the F-value, degrees of freedom, and p-value for the test.

In [19]:
from scipy.stats import norm, f
import numpy as np

def f_test_simulation(mu1, mu2, sigma1, sigma2, sample_size, alpha):
  # Generate random samples
  data1 = norm.rvs(loc=mu1, scale=sigma1, size=sample_size)
  data2 = norm.rvs(loc=mu2, scale=sigma2, size=sample_size)

  # Variance ratio F-statistic from the unbiased sample variances
  # (f_oneway would test equality of means, not variances)
  F = np.var(data1, ddof=1) / np.var(data2, ddof=1)

  # Calculate degrees of freedom
  df1 = sample_size - 1
  df2 = sample_size - 1

  # Two-tailed p-value from the F-distribution
  p = min(1.0, 2 * min(f.cdf(F, df1, df2), f.sf(F, df1, df2)))

  # Print results
  print(f"F-test results (alpha={alpha:.4f}):")
  print(f"\tF-value: {F:.4f}")
  print(f"\tDegrees of freedom (numerator, denominator): ({df1}, {df2})")
  print(f"\tp-value: {p:.4f}")

  # Decision rule 
  if p < alpha:
    print("\tReject null hypothesis (variances are not equal).")
  else:
    print("\tFail to reject null hypothesis (variances might be equal).")

# Example usage
mu1 = 5
mu2 = 7
sigma1 = 2
sigma2 = 1
sample_size = 20
alpha = 0.05

# The printed values change from run to run because the samples are random
f_test_simulation(mu1, mu2, sigma1, sigma2, sample_size, alpha)


### Problem_4: The variances of two populations are known to be 10 and 15. A sample of 12 observations is taken from each population. Conduct an F-test at the 5% significance level to determine if the variances are significantly different.

  - The population variances are stated as known (10 and 15), so there is nothing to infer: they differ by definition. The F-test compares sample variances in order to draw conclusions about unknown population variances, and no sample data or sample variances are given here.
  
Here's the breakdown:
  - Known Population Variances: 10 and 15.
  - Sample Size: 12 observations from each population.
  - Significance Level: 0.05 (5%).
  
Even though the population variances differ, sample variances computed from only 12 observations each are noisy, so an F-test on such samples would not necessarily detect the difference.

What the F-test would involve:
  - The F-statistic is the ratio of the larger sample variance to the smaller one. If the sample variances happened to equal the population values, F = 15 / 10 = 1.5.
  - The degrees of freedom are df1 = 11 and df2 = 11 (sample size - 1 for each sample), and the upper critical F-value for a two-tailed test with alpha = 0.05 is approximately 3.47.
  - We reject the null hypothesis (variances are equal) at the 5% significance level only if the F-statistic exceeds the critical F-value.
  
Conclusion:
  - A ratio of 1.5 is well below the critical value of about 3.47, so with samples of this size the F-test would most likely fail to reject the null hypothesis of equal variances, even though the true variances are different.
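
As a rough numerical illustration, the values 10 and 15 can be plugged in as if they were sample variances. This is an assumption made only for the sketch, since the problem states population variances and provides no data.

```python
from scipy.stats import f

# Assumption: treat the stated variances as if they were sample variances
var_small, var_large = 10.0, 15.0
n = 12
alpha = 0.05

df1 = df2 = n - 1
F = var_large / var_small                      # larger variance in the numerator
critical_f = f.ppf(1 - alpha / 2, df1, df2)    # upper critical value, two-tailed
p_value = min(1.0, 2 * f.sf(F, df1, df2))      # two-tailed p-value (F >= 1 here)

print(f"F = {F:.4f}, critical F = {critical_f:.4f}, p-value = {p_value:.4f}")
print("Reject H0" if F > critical_f else "Fail to reject H0")
```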

### Problem_5: A manufacturer claims that the variance of the diameter of a certain product is 0.005. A sample of 25 products is taken, and the sample variance is found to be 0.006. Conduct an F-test at the 1% significance level to determine if the claim is justified.

  - Null Hypothesis (H0): The population variance (σ^2) is equal to 0.005 (as claimed by the manufacturer).
  - Alternative Hypothesis (H1): The population variance (σ^2) is not equal to 0.005.

Steps:
1. Set Significance Level (α): α = 0.01 (1%).
2. Sample Data: We have a sample size (n) of 25 and sample variance (s^2) of 0.006.
3. Degrees of Freedom: Degrees of freedom for the numerator (df1) is n - 1 = 25 - 1 = 24.
4. Critical F-value: The claimed variance is a fixed constant rather than an estimate from a second sample, so the denominator has infinite degrees of freedom (df2 = ∞). In that case F = s^2 / σ^2 follows a chi-square distribution with df1 degrees of freedom divided by df1, and the upper critical value for a two-tailed test with α = 0.01 is the corresponding chi-square quantile divided by df1.

In [20]:
from scipy.stats import chi2

# Significance level
alpha = 0.01

# Degrees of freedom for the sample variance
df1 = 24

# Upper critical F-value for F(df1, inf), which equals chi2(df1) / df1
critical_f = chi2.ppf(1 - alpha / 2, df1) / df1

print(f"Critical F-value (alpha={alpha:.4f}, df1={df1}, df2=inf): {critical_f:.4f}")

Critical F-value (alpha=0.0100, df1=24, df2=inf): 1.8983


5. F-Statistic Calculation: The F-statistic is calculated as the ratio of the sample variance (s^2) to the claimed population variance (σ^2):
F = s^2 / σ^2 = 0.006 / 0.005 = 1.2000

6. Decision Rule:
   - If the F-statistic (F) is greater than the critical F-value, we reject the null hypothesis (H0).
   - If F is less than or equal to the critical F-value, we fail to reject H0.    
   
Interpretation:
  - The F-statistic (1.2000) is smaller than the critical F-value from step 4, so we fail to reject the null hypothesis (H0). There is not enough evidence at the 1% significance level to conclude that the manufacturer's claim of a population variance of 0.005 is incorrect.
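
Because the claimed variance is a fixed number, the same comparison is commonly written as a chi-square test on (n - 1) s^2 / σ^2. The sketch below is one way to run it with `scipy.stats.chi2`; the variable names are illustrative.

```python
from scipy.stats import chi2

n = 25              # sample size
s2 = 0.006          # sample variance
sigma0_sq = 0.005   # claimed population variance
alpha = 0.01

df = n - 1
chi2_stat = df * s2 / sigma0_sq   # test statistic, chi-square with df under H0

# Two-tailed critical values and p-value
lower = chi2.ppf(alpha / 2, df)
upper = chi2.ppf(1 - alpha / 2, df)
p_value = min(1.0, 2 * min(chi2.cdf(chi2_stat, df), chi2.sf(chi2_stat, df)))

reject = chi2_stat < lower or chi2_stat > upper
print(f"chi2 = {chi2_stat:.4f}, critical region: < {lower:.4f} or > {upper:.4f}")
print(f"p-value = {p_value:.4f}")
print("Reject H0" if reject else "Fail to reject H0")
```

Dividing the chi-square statistic and its critical values by `df` gives the F-ratio form of the same test, so both versions lead to the same decision.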

### Problem_6: Write a Python function that takes in the degrees of freedom for the numerator and denominator of an F-distribution and calculates the mean and variance of the distribution. The function should return the mean and variance as a tuple.

In [22]:
from scipy.stats import f 
import numpy as np
def f_mean_variance(df1, df2):
  # Check if degrees of freedom are positive integers
  if not isinstance(df1, int) or not isinstance(df2, int) or df1 <= 0 or df2 <= 0:
    raise ValueError("Degrees of freedom must be positive integers.")
  
  # Mean of F-distribution (only defined if df2 > 2)
  if df2 > 2:
    mu = df2 / (df2 - 2)
  else:
    mu = np.nan  # Not defined for df2 <= 2

  # Variance of F-distribution (only defined if df2 > 4)
  if df2 > 4:
    variance = 2 * df2**2 * (df1 + df2 - 2) / (df1 * (df2 - 2)**2 * (df2 - 4))
  else:
    variance = np.nan  # Not defined for df2 <= 4

  return mu, variance

# Example usage
df1 = 10
df2 = 15
mu, var = f_mean_variance(df1, df2)

print(f"F-distribution (df1={df1}, df2={df2}):")
print(f"\tMean: {mu:.4f}" if not np.isnan(mu) else "\tMean: Not defined")
print(f"\tVariance: {var:.4f}" if not np.isnan(var) else "\tVariance: Not defined")


F-distribution (df1=10, df2=15):
	Mean: 1.1538
	Variance: 0.5568
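
The closed-form moments of the F-distribution, mean df2 / (df2 - 2) and variance 2 df2² (df1 + df2 - 2) / (df1 (df2 - 2)² (df2 - 4)), can be cross-checked against `scipy.stats.f.stats`. This is a small sanity check rather than part of the required function.

```python
from scipy.stats import f

df1, df2 = 10, 15

# Closed-form moments (valid for df2 > 4)
mean_formula = df2 / (df2 - 2)
var_formula = 2 * df2**2 * (df1 + df2 - 2) / (df1 * (df2 - 2)**2 * (df2 - 4))

# Moments as reported by SciPy
mean_scipy, var_scipy = f.stats(df1, df2, moments="mv")

print(f"Formula: mean={mean_formula:.4f}, variance={var_formula:.4f}")
print(f"SciPy:   mean={float(mean_scipy):.4f}, variance={float(var_scipy):.4f}")
```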


### Problem_7: A random sample of 10 measurements is taken from a normal population with unknown variance. The sample variance is found to be 25. Another random sample of 15 measurements is taken from another normal population with unknown variance, and the sample variance is found to be 20. Conduct an F-test at the 10% significance level to determine if the variances are significantly different.

  - Null Hypothesis (H0): The population variances (σ₁² and σ₂²) are equal.
  - Alternative Hypothesis (H1): The population variances (σ₁² and σ₂²) are not equal.

Steps:
1. Significance Level (α): α = 0.10 (10%).
2. Sample Data:
   - Sample size for population 1 (n₁) = 10.
   - Sample variance for population 1 (s₁²) = 25.
   - Sample size for population 2 (n₂) = 15.
   - Sample variance for population 2 (s₂²) = 20.
3. Degrees of Freedom:
   - Degrees of freedom for numerator (df1) = n₁ - 1 = 10 - 1 = 9.
   - Degrees of freedom for denominator (df2) = n₂ - 1 = 15 - 1 = 14.

F-Statistic Calculation:
  - The F-statistic is calculated as the ratio of the larger sample variance (s₁² in this case) to the smaller sample variance (s₂²):     

F = s₁² / s₂² = 25 / 20 = 1.25

#### Critical F-value Calculation:

In [23]:
from scipy.stats import f
# Significance level
alpha = 0.10

# Degrees of freedom
df1 = 9
df2 = 14

# Calculate critical F-values for a two-tailed test (alpha / 2 in each tail)
critical_f_upper = f.ppf(1 - alpha / 2, df1, df2)
critical_f_lower = f.ppf(alpha / 2, df1, df2)  # Equals 1 / upper only when df1 == df2

print(f"Critical F-values (alpha={alpha:.4f}, df1={df1}, df2={df2}):")
print(f"\tUpper: {critical_f_upper:.4f}")
print(f"\tLower: {critical_f_lower:.4f}")


Critical F-values (alpha=0.1000, df1=9, df2=14):
	Upper: 2.6458
	Lower: 0.3305


4. Decision Rule:
   - If the F-statistic (F) is greater than the upper critical F-value, we reject H0 (variances are different).
   - If F is less than the lower critical F-value, we reject H0 (variances are different).
   - If F falls between the critical F-values, we fail to reject H0.

5. Interpretation:
   - The F-statistic (1.25) is below the upper critical F-value (2.6458) and above the lower one, so we fail to reject the null hypothesis (H0). There is not enough evidence at the 10% significance level to conclude that the variances of the two populations are significantly different.
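
The same decision can be reached through a p-value computed directly from the summary statistics. The following is a minimal sketch using the numbers given in the problem; the variable names are illustrative.

```python
from scipy.stats import f

# Summary statistics from the problem statement
s1_sq, n1 = 25.0, 10
s2_sq, n2 = 20.0, 15
alpha = 0.10

df1, df2 = n1 - 1, n2 - 1
F = s1_sq / s2_sq

# Two-tailed p-value: double the smaller tail probability
p_value = min(1.0, 2 * min(f.cdf(F, df1, df2), f.sf(F, df1, df2)))

print(f"F = {F:.4f}, p-value = {p_value:.4f}")
print("Reject H0" if p_value < alpha else "Fail to reject H0")
```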

### Problem_8: The following data represent the waiting times in minutes at two different restaurants on a Saturday night: Restaurant A: 24, 25, 28, 23, 22, 20, 27; Restaurant B: 31, 33, 35, 30, 32, 36. Conduct an F-test at the 5% significance level to determine if the variances are significantly different.

  - Null Hypothesis (H0): The population variances (σ₁² and σ₂²) for waiting times at Restaurant A and Restaurant B are equal.
  - Alternative Hypothesis (H1): The population variances (σ₁² and σ₂²) are not equal.

Steps:
1. Significance Level (α): α = 0.05 (5%).
2. Data: We have the waiting times for each restaurant.
3. Sample Statistics:
   - Calculate the sample variance (s₁²) for Restaurant A.
   - Calculate the sample variance (s₂²) for Restaurant B.

#### Sample Variance Calculation:

In [25]:
# Sample data (Example Dataset)
restaurant_a = [24, 25, 28, 23, 22, 20, 27]
restaurant_b = [31, 33, 35, 30, 32, 36]

# Calculate sample variance (unbiased estimator)
def unbiased_variance(data):
  n = len(data)
  mean = sum(data) / n
  return sum((x - mean)**2 for x in data) / (n - 1)  # Formula Sample Variance

variance_a = unbiased_variance(restaurant_a)
variance_b = unbiased_variance(restaurant_b)

print(f"Sample variance (unbiased):")
print(f"\tRestaurant A: {variance_a:.4f}")
print(f"\tRestaurant B: {variance_b:.4f}")


Sample variance (unbiased):
	Restaurant A: 7.8095
	Restaurant B: 5.3667


4. Degrees of Freedom:
   - Degrees of freedom for numerator (df1) = sample size for Restaurant A - 1 = 7 - 1 = 6.
   - Degrees of freedom for denominator (df2) = sample size for Restaurant B - 1 = 6 - 1 = 5.
   
F-Statistic Calculation:
   - The F-statistic is calculated as the ratio of the larger sample variance to the smaller sample variance:        
F = max(variance_a, variance_b) / min(variance_a, variance_b)        
F = 7.8095/5.3667 = 1.455

Critical F-value Calculation:
  - We need the upper critical F-value for a two-tailed test with α = 0.05, df1 = 6, and df2 = 5, i.e. the 0.975 quantile of the F-distribution. It can be read from statistical tables.
  - The critical F-value is approximately 6.98.
5. Decision Rule:
   - If the F-statistic (F) is greater than the upper critical F-value, we reject H0 (variances are different).
   - If F is less than the lower critical F-value, we reject H0 (variances are different). Because the larger variance is placed in the numerator, F is at least 1 and this case cannot occur here.
   - If F falls between the critical F-values, we fail to reject H0.
6. Interpretation:
   - The F-statistic (1.455) is smaller than the critical F-value (6.98), so we fail to reject the null hypothesis: at the 5% significance level there is not enough evidence to conclude that the variances of the waiting times differ.
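
The table lookup and the decision can also be reproduced in code. The sketch below uses `scipy.stats.f` and keeps the degrees of freedom attached to whichever sample ends up in the numerator.

```python
import numpy as np
from scipy.stats import f

restaurant_a = [24, 25, 28, 23, 22, 20, 27]
restaurant_b = [31, 33, 35, 30, 32, 36]
alpha = 0.05

var_a = np.var(restaurant_a, ddof=1)
var_b = np.var(restaurant_b, ddof=1)

# Put the larger variance in the numerator and carry its degrees of freedom along
if var_a >= var_b:
    F, df1, df2 = var_a / var_b, len(restaurant_a) - 1, len(restaurant_b) - 1
else:
    F, df1, df2 = var_b / var_a, len(restaurant_b) - 1, len(restaurant_a) - 1

critical_f = f.ppf(1 - alpha / 2, df1, df2)   # upper critical value, two-tailed

print(f"F = {F:.4f}, df = ({df1}, {df2}), critical F = {critical_f:.4f}")
print("Reject H0" if F > critical_f else "Fail to reject H0")
```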

### Problem_9: The following data represent the test scores of two groups of students: Group A: 80, 85, 90, 92, 87, 83; Group B: 75, 78, 82, 79, 81, 84. Conduct an F-test at the 1% significance level to determine if the variances are significantly different.

  - Null Hypothesis (H0): The population variances (σ₁² and σ₂²) for the test scores in Group A and Group B are equal.
  - Alternative Hypothesis (H1): The population variances (σ₁² and σ₂²) are not equal.

  - Significance Level (α): α = 0.01 (1%).

Steps:

1. Sample Data:
   - Group A: 80, 85, 90, 92, 87, 83
   - Group B: 75, 78, 82, 79, 81, 84

2. Sample Variance Calculation:

In [31]:
# Sample data
group_a = [80, 85, 90, 92, 87, 83]
group_b = [75, 78, 82, 79, 81, 84]

# Sample variance (unbiased estimator)
def unbiased_variance(data):
  n = len(data)
  mean = sum(data) / n
  return sum((x - mean)**2 for x in data) / (n - 1) ## Formula for sample Variance

variance_a = unbiased_variance(group_a)
variance_b = unbiased_variance(group_b)

print(f"Sample variance (unbiased):")
print(f"\tGroup A: {variance_a:.4f}")
print(f"\tGroup B: {variance_b:.4f}")

Sample variance (unbiased):
	Group A: 19.7667
	Group B: 10.1667


3. Degrees of Freedom:
   - Degrees of freedom for numerator (df1) = sample size for Group A - 1 = 6 - 1 = 5
   - Degrees of freedom for denominator (df2) = sample size for Group B - 1 = 6 - 1 = 5
   
4. F-Statistic Calculation:
   - The F-statistic is the ratio of the larger sample variance to the smaller sample variance:        
   - F = max(variance_a, variance_b) / min(variance_a, variance_b)        
   - F = 19.7667/10.1667 = 1.944
5. Critical F-Value Calculation:
   - We need the upper critical F-value for a two-tailed test with α = 0.01, df1 = 5, and df2 = 5, i.e. the 0.995 quantile of the F-distribution. It can be read from statistical tables.
   - The critical F-value is approximately 14.94.
6. Decision Rule:
   - If F > critical F-value, reject H0 (variances are different).
   - If F <= critical F-value, fail to reject H0 (variances might be equal).

7. Conclusion:
  - The F-statistic (1.944) is smaller than the critical F-value (14.94), so we fail to reject the null hypothesis (H0) of equal variances at the 1% significance level. This means that there is not enough evidence to conclude that the variances of the two groups' test scores are significantly different.

In [35]:
import numpy as np
from scipy.stats import f

# Data for Group A and Group B
group_a_scores = np.array([80, 85, 90, 92, 87, 83])
group_b_scores = np.array([75, 78, 82, 79, 81, 84])

# Calculate variances
var_a = np.var(group_a_scores, ddof=1)  # Use ddof=1 for unbiased estimate
var_b = np.var(group_b_scores, ddof=1)

# Calculate degrees of freedom
df1 = len(group_a_scores) - 1
df2 = len(group_b_scores) - 1

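# Calculate F-value with the larger variance in the numerator,
# swapping the degrees of freedom to match when needed
if var_a >= var_b:
    f_value = var_a / var_b
else:
    f_value = var_b / var_a
    df1, df2 = df2, df1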

# Calculate critical F-value for a significance level of 0.01
critical_f_value = f.ppf(0.995, df1, df2)  # Two-tailed test, so using 0.995

# Determine if the null hypothesis should be rejected
if f_value > critical_f_value:
    print("Reject the null hypothesis: Variances are significantly different.")
else:
    print("Fail to reject the null hypothesis: Variances are not significantly different.")
critical_f_value

Fail to reject the null hypothesis: Variances are not significantly different.


14.939605459912224