In [1]:
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import plotly.express as px
import scipy.stats as stats
from scipy.stats import f  # F distribution, called directly as f.cdf / f.ppf in Examples 4-6
%matplotlib inline

**Example 1:** A factory uses two machines to manufacture parts. The quality control team wants to check if the output variability (variance in dimensions) differs between the machines.

In [2]:
# Sample data for variances and sample sizes
s1_squared = 2.8  # variance for Machine 1
s2_squared = 1.9  # variance for Machine 2
n1 = 15  # sample size for Machine 1
n2 = 15  # sample size for Machine 2
alpha = 0.05
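The variances above are given as summary statistics. When raw measurements are available, they would typically be computed as sample variances (dividing by n - 1). A minimal sketch, where the measurement arrays and the `_raw` variable names are made up for illustration and are not the factory's data:

```python
import numpy as np

# Hypothetical part dimensions (illustrative values, not from the example)
machine_1 = np.array([10.0, 12.0, 14.0, 16.0, 18.0])
machine_2 = np.array([11.0, 12.0, 13.0, 14.0, 15.0])

# ddof=1 gives the unbiased sample variance (divides by n - 1)
s1_squared_raw = np.var(machine_1, ddof=1)
s2_squared_raw = np.var(machine_2, ddof=1)
n1_raw, n2_raw = len(machine_1), len(machine_2)

print(s1_squared_raw, s2_squared_raw, n1_raw, n2_raw)
```

Leaving `ddof` at NumPy's default of 0 would give the population variance instead, which is not the quantity the F-test expects.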

Testing the null hypothesis

>$H_0:\sigma_1^2=\sigma_2^2$

against the alternative hypothesis

>$H_1:\sigma_1^2\neq\sigma_2^2$
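
For reference, the test statistic used in the cells that follow can be written as below. This assumes both samples are drawn independently from normal populations, which the example does not state explicitly:

$$F = \frac{s_1^2}{s_2^2} \sim F_{n_1-1,\;n_2-1} \quad \text{under } H_0$$

With $s_1^2 = 2.8$, $s_2^2 = 1.9$ and $n_1 = n_2 = 15$, this gives $F = 2.8/1.9 \approx 1.47$ on $(14, 14)$ degrees of freedom.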

In [4]:
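# p-value approach
# F statistic: ratio of the two sample variances
f_stat = s1_squared / s2_squared
# degrees of freedom (numerator, denominator)
df1 = n1 - 1
df2 = n2 - 1
# two-sided p-value: double the upper-tail area (f_stat falls in the upper tail here)
p_value = 2 * (1 - stats.f.cdf(f_stat, df1, df2))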
p_value

0.47743040147928806

Since the p-value (0.477) is greater than alpha (0.05), we fail to reject the null hypothesis.

In [9]:
# critical-value approach: two-sided rejection region at level alpha
lower=stats.f.ppf(alpha/2,df1,df2)
upper=stats.f.ppf(1-alpha/2,df1,df2)
print(lower,upper)

0.33572960066081176 2.97858752410188


In [11]:
f_stat

1.4736842105263157

Since f_stat (1.47) lies between the critical values (0.34, 2.98), we fail to reject the null hypothesis.
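
The two-sided p-value above doubles the upper-tail area, which is appropriate when the F statistic falls in the upper tail. A slightly more general sketch doubles whichever tail is smaller, so the result does not depend on which variance goes in the numerator. The helper name `two_sided_f_pvalue` is hypothetical, introduced only for this illustration:

```python
from scipy import stats

def two_sided_f_pvalue(f_stat, df1, df2):
    """Two-sided p-value: twice the smaller of the two tail areas."""
    lower_tail = stats.f.cdf(f_stat, df1, df2)
    upper_tail = stats.f.sf(f_stat, df1, df2)  # sf is the survival function, 1 - cdf
    return 2 * min(lower_tail, upper_tail)

# Example 1 values, with either machine's variance in the numerator
p_a = two_sided_f_pvalue(2.8 / 1.9, 14, 14)
p_b = two_sided_f_pvalue(1.9 / 2.8, 14, 14)
print(p_a, p_b)
```

Both calls should agree with the p-value computed in the cell above, since swapping the two samples only mirrors the statistic into the other tail.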

**Example 2:** A school has two classes, and it wants to test if the variance in test scores differs between the two. If one class shows more variability in performance, it might indicate different teaching methods or external factors.

In [12]:
# Sample data for variances and sample sizes
s1_squared = 9.2  # variance for Class 1
s2_squared = 7.5  # variance for Class 2
n1 = 20  # sample size for Class 1
n2 = 22  # sample size for Class 2
alpha = 0.05  # significance level

Testing the null hypothesis

>$H_0:\sigma_1^2=\sigma_2^2$

against the alternative hypothesis

>$H_1:\sigma_1^2\neq\sigma_2^2$

In [14]:
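# F statistic: ratio of the two sample variances
f_stat = s1_squared / s2_squared
# degrees of freedom (numerator, denominator)
df1 = n1 - 1
df2 = n2 - 1
# two-sided p-value: double the upper-tail area (f_stat falls in the upper tail here)
p_value = 2 * (1 - stats.f.cdf(f_stat, df1, df2))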
p_value

0.6466082905271744

Since the p-value (0.647) is greater than alpha (0.05), we fail to reject the null hypothesis.

In [15]:
# critical-value approach: two-sided rejection region at level alpha
lower=stats.f.ppf(alpha/2,df1,df2)
upper=stats.f.ppf(1-alpha/2,df1,df2)
print(lower,upper)

0.4011308780615479 2.442404049690661


In [16]:
f_stat

1.2266666666666666

Since f_stat (1.23) lies between the critical values (0.40, 2.44), we fail to reject the null hypothesis.

**Example 3:** A school wants to check if the variance in test scores for Class 1 is greater than the variance in test scores for Class 2. The greater variability in Class 1 could indicate inconsistent performance among students.

In [18]:
# Sample data for variances and sample sizes
s1_squared = 20  # variance for Class 1
s2_squared = 15  # variance for Class 2
n1 = 30          # sample size for Class 1
n2 = 25          # sample size for Class 2
alpha = 0.05     # significance level

Testing the null hypothesis

>$H_0:\sigma_1^2\leq\sigma_2^2$

against the alternative hypothesis

>$H_1:\sigma_1^2>\sigma_2^2$

In [19]:
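# F statistic and degrees of freedom
df1 = n1 - 1
df2 = n2 - 1
f_stat = s1_squared / s2_squared
# right-tailed p-value: upper-tail area beyond the observed F statistic
p_value = 1 - stats.f.cdf(f_stat, df1, df2)
p_value

Since the p-value is greater than alpha (0.05), we fail to reject the null hypothesis.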

In [20]:
# critical-value approach: upper-tail critical value for the right-tailed test
critical_value = stats.f.ppf(1 - alpha, df1, df2)
critical_value

In [21]:
f_stat

1.3333333333333333

Since f_stat (1.33) is less than the upper critical value, it does not fall in the rejection region, and we fail to reject the null hypothesis.
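
As a cross-check of the right-tailed test in Example 3, the sketch below computes the p-value from the observed F statistic and the critical value from the upper tail, then confirms that both decision rules agree. It uses `stats.f.sf` and `stats.f.isf` (survival function and its inverse) in place of `1 - cdf` and `ppf(1 - alpha)`. The flag names `reject_by_p` and `reject_by_cv` are introduced here for illustration:

```python
from scipy import stats

# Example 3 inputs
s1_squared, s2_squared = 20, 15
n1, n2 = 30, 25
alpha = 0.05

df1, df2 = n1 - 1, n2 - 1
f_stat = s1_squared / s2_squared

p_value = stats.f.sf(f_stat, df1, df2)         # upper-tail area, same as 1 - cdf
critical_value = stats.f.isf(alpha, df1, df2)  # same as ppf(1 - alpha)

# Right-tailed test: reject H0 only when F is large
reject_by_p = p_value < alpha
reject_by_cv = f_stat > critical_value
print(reject_by_p, reject_by_cv)
```

The two flags always match, because the p-value drops below alpha exactly when the statistic exceeds the upper critical value.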

**Example 4:** A factory has two machines, and it wants to check if the variance in the temperature of Machine 1 is less than that of Machine 2. A smaller variance would mean Machine 1 has more consistent temperature control.

In [22]:
# Sample data for variances and sample sizes
s1_squared = 10  # variance for Machine 1
s2_squared = 12  # variance for Machine 2
n1 = 50          # sample size for Machine 1
n2 = 50          # sample size for Machine 2
alpha = 0.05     # significance level


Testing the null hypothesis

>$H_0:\sigma_1^2\geq\sigma_2^2$

against the alternative hypothesis

>$H_1:\sigma_1^2<\sigma_2^2$
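
For this left-tailed alternative, the rejection region sits in the lower tail of the F distribution. Written out, with $F_{\text{obs}}$ for the observed ratio and $F_{\alpha;\,n_1-1,\,n_2-1}$ for the lower $\alpha$ quantile (notation introduced here for illustration):

$$\text{reject } H_0 \iff F_{\text{obs}} = \frac{s_1^2}{s_2^2} < F_{\alpha;\,n_1-1,\,n_2-1}, \qquad p\text{-value} = P\left(F_{n_1-1,\,n_2-1} \le F_{\text{obs}}\right)$$

An observed ratio above the lower critical value therefore means the null hypothesis is not rejected.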

In [24]:
# Degrees of freedom
df1 = n1 - 1
df2 = n2 - 1

# Calculate F-statistic
F_statistic = s1_squared / s2_squared

# p-value for one-tailed test 
p_value = f.cdf(F_statistic, df1, df2)
p_value

0.2628963224048937

Since the p-value (0.263) is greater than alpha (0.05), we fail to reject the null hypothesis.

In [25]:
# Critical value 
critical_value = f.ppf(alpha, df1, df2)
critical_value

0.6221654675017775

In [29]:
F_statistic

0.8333333333333334

Since F_statistic (0.83) is greater than the lower critical value (0.62), it does not fall in the rejection region, and we fail to reject the null hypothesis.

**Example 5:** Two different marketing campaigns are run by a company, and it wants to check if the variance in the response rates of Campaign 1 is greater than that of Campaign 2. A greater variance would suggest that the results of Campaign 1 are less consistent.

In [30]:
# Sample data for variances and sample sizes
s1_squared = 25  # variance for Campaign 1
s2_squared = 20  # variance for Campaign 2
n1 = 40          # sample size for Campaign 1
n2 = 35          # sample size for Campaign 2
alpha = 0.05     # significance level

Testing the null hypothesis

>$H_0:\sigma_1^2\leq\sigma_2^2$

against the alternative hypothesis

>$H_1:\sigma_1^2>\sigma_2^2$

In [32]:
# Calculate F-statistic
F_statistic = s1_squared / s2_squared

# Degrees of freedom
df1 = n1 - 1
df2 = n2 - 1

# p-value for one-tailed test 
p_value = 1 - f.cdf(F_statistic, df1, df2)
p_value

0.25529895339775455

Since the p-value (0.255) is greater than alpha (0.05), we fail to reject the null hypothesis.


In [33]:
# Critical value 
critical_value = f.ppf(1 - alpha, df1, df2)
critical_value

1.7490730378791757

In [34]:
F_statistic 

1.25

Since F_statistic (1.25) is less than the upper critical value (1.75), we fail to reject the null hypothesis.

**Example 6:** A company wants to check if the variance in salaries for Department A is less than the variance in salaries for Department B. This can indicate that Department A has more uniform salaries compared to Department B.

In [35]:
# Sample data for variances and sample sizes
s1_squared = 18  # variance for Department A
s2_squared = 24  # variance for Department B
n1 = 60          # sample size for Department A
n2 = 55          # sample size for Department B
alpha = 0.05     # significance level

Testing the null hypothesis

>$H_0:\sigma_1^2\geq\sigma_2^2$

against the alternative hypothesis

>$H_1:\sigma_1^2<\sigma_2^2$

In [36]:
# Calculate F-statistic
F_statistic = s1_squared / s2_squared

# Degrees of freedom
df1 = n1 - 1
df2 = n2 - 1

# p-value for one-tailed test 
p_value = f.cdf(F_statistic, df1, df2)
p_value


0.140128347751143

Since the p-value (0.140) is greater than alpha (0.05), we fail to reject the null hypothesis.

In [39]:
# Critical value 
critical_value = f.ppf(alpha, df1, df2)
critical_value

0.6444820810346026

In [38]:
F_statistic

0.75

Since F_statistic (0.75) is greater than the lower critical value (0.64), it does not fall in the rejection region, and we fail to reject the null hypothesis.
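
The six examples repeat the same steps with a different tail each time. One way to collect them is a small helper that takes the summary statistics and the direction of the alternative. This is a sketch, not a SciPy function: `f_test_from_summary` and its `alternative` parameter are hypothetical names chosen for this illustration.

```python
from scipy import stats

def f_test_from_summary(s1_sq, n1, s2_sq, n2, alternative="two-sided"):
    """F-test for two variances from summary statistics.

    alternative: "two-sided", "greater" (sigma1^2 > sigma2^2),
    or "less" (sigma1^2 < sigma2^2).
    Returns (f_stat, p_value).
    """
    df1, df2 = n1 - 1, n2 - 1
    f_stat = s1_sq / s2_sq
    lower = stats.f.cdf(f_stat, df1, df2)  # lower-tail area
    upper = stats.f.sf(f_stat, df1, df2)   # upper-tail area, 1 - cdf
    if alternative == "two-sided":
        p_value = 2 * min(lower, upper)
    elif alternative == "greater":
        p_value = upper
    elif alternative == "less":
        p_value = lower
    else:
        raise ValueError(f"unknown alternative: {alternative!r}")
    return f_stat, p_value

print(f_test_from_summary(2.8, 15, 1.9, 15))           # Example 1
print(f_test_from_summary(25, 40, 20, 35, "greater"))  # Example 5
print(f_test_from_summary(18, 60, 24, 55, "less"))     # Example 6
```

The printed p-values should match those computed cell by cell in Examples 1, 5 and 6.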