# Test of Single Variance
1. Example 1:-

With individual lines at its various windows, a post office finds that the standard deviation for waiting times for customers on Friday afternoon is 7.2 minutes. The post office experiments with a single, main waiting line and finds that for a random sample of 25 customers, the waiting times for customers have a standard deviation of 3.5 minutes on a Friday afternoon.

With a significance level of 5%, test the claim that a single line causes lower variation among waiting times for customers.

Since the claim is that a single line causes less variation, this is a test of a single variance. The parameter is the population variance, $\sigma^2$.

Random Variable: The sample standard deviation, s, is the random variable. Let s = standard deviation for the waiting times.

H0: $\sigma^2 \ge 7.2^2$
Ha: $\sigma^2 < 7.2^2$
The word “less” tells you this is a left-tailed test.

Distribution for the test: $\chi^2_{24}$, where:

n = the number of customers sampled
df = n – 1 = 25 – 1 = 24
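
For reference, the test statistic for a single variance can be worked out by hand with the numbers from this example: sample size $n = 25$, sample standard deviation $s = 3.5$, and hypothesized standard deviation $\sigma_0 = 7.2$. The symbol $\sigma_0$ is notation introduced here for illustration; it does not appear in the original problem statement.

$$
\chi^2 = \frac{(n-1)\,s^2}{\sigma_0^2} = \frac{24 \times 3.5^2}{7.2^2} = \frac{294}{51.84} \approx 5.67
$$

A value this far into the lower tail of the $\chi^2_{24}$ distribution is what one would expect if the single line really does reduce variation.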

In [34]:
import statsmodels.api as sm
import scipy.stats.distributions as dist
import numpy as np
import scipy.stats as stats

In [12]:
df=25-1  # degrees of freedom: n-1, with n=25 customers sampled
test_stat=(25-1)*(3.5**2)/(7.2**2)  # (n-1)*s^2/sigma0^2, with s=3.5 and sigma0=7.2
print("Test Statistic =",np.round(test_stat,2))

critical_value=dist.chi2.ppf(q=0.05,df=df)  # q=lower-tail probability
print("Critical Value Corresponding to 5% Significance =",np.round(critical_value,2))
print("="*50)
# Left-tailed test: reject when the statistic falls below the critical value
if critical_value>test_stat:
    print("Rejecting Null Hypothesis")
else:
    print("Fail to Reject Null Hypothesis")
print("="*50)
print("Using P-Value Approach")
p_value=dist.chi2.cdf(x=test_stat,df=df)  # lower-tail area for a left-tailed test

if p_value>0.05:
    print("Fail to Reject Null Hypothesis")
else:
    print("Rejecting Null Hypothesis")

Test Statistic = 5.67
Critical Value Corresponding to 5% Significance = 13.85
Rejecting Null Hypothesis
Using P-Value Approach
Rejecting Null Hypothesis


2. Example 2:-

Professor Hadley has a weakness for cream filled donuts, but he believes that some bakeries are not properly filling the donuts. A sample of 24 donuts reveals a mean amount of filling equal to 0.04 cups, and the sample standard deviation is 0.11 cups. Professor Hadley has an interest in the average quantity of filling, of course, but he is particularly distressed if one donut is radically different from another. Professor Hadley does not like surprises.

Test, at the 95% confidence level (5% significance), whether the population variance of donut filling is significantly different from the average amount of filling.

This is clearly a problem dealing with variances. In this case we are testing a single sample rather than comparing two samples from different populations. The null and alternative hypotheses are thus:

![image.png](attachment:image.png)
![image-2.png](attachment:image-2.png)

In [33]:
n=24
df=n-1
test_stat=df*(0.11**2)/(0.04**2)
print("Test Statistic =",test_stat)
critical_value_0=dist.chi2.ppf(q=0.025,df=df)
critical_value_1=dist.chi2.ppf(q=1-0.025,df=df)
critical_values=(critical_value_0,critical_value_1)
print("Critical Values =",critical_values)

if critical_values[0]<test_stat<critical_values[1]:
    print("Fail to Reject Null Hypothesis")
else:
    print("Rejecting Null Hypothesis")

print("="*50)
print("Using P-Value Approach")

# Two-tailed p-value computed from the test statistic: double the smaller tail area
p_value=2*min(dist.chi2.cdf(x=test_stat,df=df),dist.chi2.sf(x=test_stat,df=df))

if p_value>0.05:
    print("Fail to Reject Null Hypothesis")
else:
    print("Rejecting Null Hypothesis")


Test Statistic = 173.9375
Critical Values = (11.688551922452438, 38.0756272503558)
Rejecting Null Hypothesis
Using P-Value Approach
Rejecting Null Hypothesis
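
Both examples follow the same recipe, so it can be convenient to wrap it in one function. The sketch below is only an illustration: `variance_chi2_test` and its parameter names are hypothetical, not part of SciPy or of the original notebook. For the two-sided case it doubles the smaller tail area, which is one common convention rather than the only one.

```python
from scipy import stats


def variance_chi2_test(s, sigma0, n, alternative="less"):
    """Chi-square test for a single variance (hypothetical helper).

    s: sample standard deviation
    sigma0: hypothesized population standard deviation
    n: sample size
    alternative: "less", "greater", or "two-sided"
    Returns (test statistic, p-value).
    """
    df = n - 1
    stat = df * s**2 / sigma0**2
    lower = stats.chi2.cdf(stat, df)  # P(X <= stat)
    upper = stats.chi2.sf(stat, df)   # P(X > stat)
    if alternative == "less":
        p_value = lower
    elif alternative == "greater":
        p_value = upper
    elif alternative == "two-sided":
        # Assumed convention: double the smaller tail, capped at 1.
        p_value = min(1.0, 2 * min(lower, upper))
    else:
        raise ValueError("alternative must be 'less', 'greater', or 'two-sided'")
    return stat, p_value


# Example 1: left-tailed test with s = 3.5, sigma0 = 7.2, n = 25.
stat1, p1 = variance_chi2_test(3.5, 7.2, 25, alternative="less")
print("Example 1:", round(stat1, 2), "reject H0:", p1 < 0.05)

# Example 2: two-tailed test with the same inputs as the cell above.
stat2, p2 = variance_chi2_test(0.11, 0.04, 24, alternative="two-sided")
print("Example 2:", round(stat2, 2), "reject H0:", p2 < 0.05)
```

Returning the statistic and the p-value, rather than printing a decision, leaves the choice of significance level to the caller.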


# Goodness-of-Fit Test
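The goodness-of-fit statistic follows a chi-square distribution, and the test is almost always right-tailed: a large value of the statistic means the observed counts are far from the expected ones.
The null hypothesis assumes that the data fit the given distribution.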

Employers want to know which days of the week employees are absent in a five-day work week. Most employers would like to believe that employees are absent equally during the week. Suppose a random sample of 60 managers were asked on which day of the week they had the highest number of employee absences. The results were distributed as in the table below. For the population of employees, do the days for the highest number of absences occur with equal frequencies during a five-day work week? Test at a 5% significance level.

Day of the Week Employees were Most Absent

| | Monday | Tuesday | Wednesday | Thursday | Friday |
|---|---|---|---|---|---|
| Number of absences | 15 | 12 | 9 | 9 | 15 |

The null and alternative hypotheses are:

H0: The absent days occur with equal frequencies, that is, they fit a uniform distribution.
Ha: The absent days occur with unequal frequencies, that is, they do not fit a uniform distribution.

In [43]:
observed=np.array([15,12,9,9,15])
total_absent=15+12+9+9+15
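expected=np.repeat(total_absent/5,5)  # uniform null: each of the 5 days gets the same expected count
expected=expected.astype(int)  # safe only because 60/5 is a whole number; keep floats otherwise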
print("Expected Values =",expected)
test_stat=0
for i in range(5):
    test_stat+=(observed[i]-expected[i])**2/expected[i]
print("Test Statistic =",test_stat)

critical_stat=dist.chi2.ppf(q=1-0.05,df=4)
print("Critical Stat =",critical_stat)

if test_stat<critical_stat:
    print("Fail to Reject Null Hypothesis")
else:
    print("Rejecting Null Hypothesis")
    
print("="*50)
print("Using P-Value Approach")
p_value=1-dist.chi2.cdf(test_stat,df=4)
if p_value>0.05:
    print("Fail to Reject Null Hypothesis")
else:
    print("Rejecting Null Hypothesis")

Expected Values = [12 12 12 12 12]
Test Statistic = 3.0
Critical Stat = 9.487729036781154
Fail to Reject Null Hypothesis
Using P-Value Approach
Fail to Reject Null Hypothesis
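
The manual calculation above can be cross-checked with `scipy.stats.chisquare`. This is a minimal sketch, not the notebook author's method: when `f_exp` is omitted, the function assumes equal expected frequencies, which matches the uniform null hypothesis here, and it uses k - 1 = 4 degrees of freedom by default.

```python
import numpy as np
from scipy import stats

# Observed counts for Monday through Friday, from the table above.
observed = np.array([15, 12, 9, 9, 15])

# With f_exp omitted, chisquare assumes equal expected frequencies.
result = stats.chisquare(observed)
stat, p_value = result.statistic, result.pvalue

print("Test Statistic =", stat)
print("P-Value =", p_value)
if p_value > 0.05:
    print("Fail to Reject Null Hypothesis")
else:
    print("Rejecting Null Hypothesis")
```

For a non-uniform null hypothesis, the expected counts would be passed through the `f_exp` argument instead.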


#  Test of Independence

Null Hypothesis: The groups are independent, that is, the number of hours volunteered does not depend on the type of volunteer.

A volunteer group provides from one to nine hours each week with disabled senior citizens. The program recruits among community college students, four-year college students, and nonstudents. The table below shows a sample of the adult volunteers and the number of hours they volunteer per week.

Number of Hours Worked Per Week by Volunteer Type (Observed). The table contains observed (O) values (data).

| Type of volunteer | 1–3 Hours | 4–6 Hours | 7–9 Hours | Row total |
|---|---|---|---|---|
| Community college students | 111 | 96 | 48 | 255 |
| Four-year college students | 96 | 133 | 61 | 290 |
| Nonstudents | 91 | 150 | 53 | 294 |
| Column total | 298 | 379 | 162 | 839 |

In [47]:
observed=np.array([[111,96,48],[96,133,61],[91,150,53]])
print("Observed =",observed)

expected=stats.chi2_contingency(observed)[3]
print("Expected =",expected)



Observed = [[111  96  48]
 [ 96 133  61]
 [ 91 150  53]]
Expected = [[ 90.57210965 115.19070322  49.23718713]
 [103.00357569 131.0011919   55.99523242]
 [104.42431466 132.80810489  56.76758045]]
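
The expected counts reported above can also be reproduced by hand. Under independence, each expected cell count is (row total × column total) / grand total. The sketch below applies that rule with `np.outer`; the name `expected_manual` is introduced here only for illustration.

```python
import numpy as np

observed = np.array([[111, 96, 48], [96, 133, 61], [91, 150, 53]])

row_totals = observed.sum(axis=1)   # one total per volunteer type
col_totals = observed.sum(axis=0)   # one total per hours bracket
grand_total = observed.sum()

# Expected count for each cell under independence:
# (row total * column total) / grand total
expected_manual = np.outer(row_totals, col_totals) / grand_total
print(np.round(expected_manual, 2))
```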


In [49]:
for i,j in zip(observed,expected):
    print(i)
    print(j)
    print("="*50)

[111  96  48]
[ 90.57210965 115.19070322  49.23718713]
[ 96 133  61]
[103.00357569 131.0011919   55.99523242]
[ 91 150  53]
[104.42431466 132.80810489  56.76758045]


In [58]:
test_stat=sum([(o-e)**2/e for o,e in zip(observed,expected)]).sum()
print("Test Statistic =",test_stat)
df=(3-1)*(3-1)
critical_stat=dist.chi2.ppf(q=1-0.05,df=df)
print("Critical Statistic =",critical_stat)

if test_stat>critical_stat:
    print(" Reject Null Hypothesis")
else:
    print("Fail to Reject Null Hypothesis")
    
print("="*50)
print("Using P-Value Approach")

p_value=1-dist.chi2.cdf(x=test_stat,df=df)
if p_value>0.05:
    print("Fail to Reject Null Hypothesis")
else:
    print("Reject Null Hypothesis")
    

Test Statistic = 12.990918513170868
Critical Statistic = 9.487729036781154
 Reject Null Hypothesis
Using P-Value Approach
Reject Null Hypothesis
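
As a cross-check, `scipy.stats.chi2_contingency` returns the statistic, p-value, and degrees of freedom along with the expected counts, so the whole test can be run in one call. This is a sketch of an alternative route, not a replacement for the step-by-step cells above. The default continuity correction only applies when there is one degree of freedom, so it has no effect on this 3×3 table.

```python
import numpy as np
from scipy import stats

observed = np.array([[111, 96, 48], [96, 133, 61], [91, 150, 53]])

# Unpack the statistic, p-value, degrees of freedom, and expected counts.
chi2_stat, p_value, dof, expected = stats.chi2_contingency(observed)

print("Test Statistic =", chi2_stat)
print("Degrees of Freedom =", dof)
print("P-Value =", p_value)
if p_value > 0.05:
    print("Fail to Reject Null Hypothesis")
else:
    print("Reject Null Hypothesis")
```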
