
# Hypothesis Testing

In [2]:
import numpy as np
import scipy.stats as st

## Section 1 : z-Test

## Example 1.1 : Ages of Medical Doctors
A researcher believes that the mean age of medical doctors in a large hospital system is older than the average age of doctors in the United States, which is 46. Assume the population standard deviation is 4.2 years. A random sample of 30 doctors from the system is selected, and the mean age of the sample is 48.6. Test the claim at α = 0.05.

In [3]:
#H0 : μ =46, Ha :  μ >46
n = 30
xbar = 48.6 #sample mean
mu = 46   #population mean under H0
sigma = 4.2
alpha = 0.05

In [4]:
z_critical = abs(st.norm.ppf(alpha)) #Right-tailed test: ppf(alpha) returns the negative lower-tail value, so take its absolute value
z_critical

1.6448536269514729

In [5]:
z = (xbar-mu)/(sigma/np.sqrt(n))
z

3.3906634512224585

In [6]:
if (z<z_critical): #Right-tailed test
    print("Null hypothesis cannot be rejected")
else:
    print("Reject null hypothesis")

Reject null hypothesis
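
The decision rule above depends on the direction of the alternative hypothesis. As a minimal sketch, the helper below wraps the critical-value method for right-, left-, and two-tailed z-tests computed from summary statistics. The function name `z_critical_decision` and its `tail` argument are hypothetical and not part of the original notebook.

```python
import numpy as np
import scipy.stats as st

def z_critical_decision(xbar, mu, sigma, n, alpha=0.05, tail="right"):
    """Hypothetical helper: critical-value z-test from summary statistics.

    tail is "right", "left", or "two". Returns (z, z_critical, reject).
    """
    z = (xbar - mu) / (sigma / np.sqrt(n))
    if tail == "right":
        z_crit = st.norm.ppf(1 - alpha)      # upper-tail cutoff
        reject = z > z_crit
    elif tail == "left":
        z_crit = st.norm.ppf(alpha)          # lower-tail cutoff (negative)
        reject = z < z_crit
    elif tail == "two":
        z_crit = st.norm.ppf(1 - alpha / 2)  # alpha split across both tails
        reject = abs(z) > z_crit
    else:
        raise ValueError("tail must be 'right', 'left', or 'two'")
    return float(z), float(z_crit), bool(reject)

# Example 1.1: right-tailed test on the doctors' ages
z, z_crit, reject = z_critical_decision(48.6, 46, 4.2, 30, alpha=0.05, tail="right")
print(round(z, 3), round(z_crit, 3), reject)
```

With the values from Example 1.1 this should reproduce z ≈ 3.39 against a critical value of about 1.645, leading to the same rejection of the null hypothesis.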


## Section 2 : z-Test using P-value

## Example 2.1 : Wind Speed


A researcher claims that the average wind speed in a certain city is 8 miles per hour. A sample of 32 days has an average wind speed of 8.2 miles per hour. The standard deviation of the population is 0.6 mile per hour. At α = 0.05, is there enough evidence to reject the claim? Use the P-value method.

In [7]:
#H0 : μ =8  and Ha : μ != 8
n = 32
xbar = 8.2
mu = 8
sigma = 0.6
alpha = 0.05

In [8]:
z = (xbar-mu)/(sigma/np.sqrt(n))
z

1.8856180831641203

In [10]:
p_val = 2*(1-st.norm.cdf(abs(z)))
p_val

0.0593464387919207

In [11]:
if (p_val>alpha):
    print("Null hypothesis cannot be rejected")
else:
    print("Reject null hypothesis")

Null hypothesis cannot be rejected
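
The P-value method and the critical-value method always lead to the same decision. The sketch below checks this for the wind-speed example by computing both; the variable names `reject_by_critical` and `reject_by_p_value` are introduced here for illustration only. It uses `st.norm.sf` (the survival function, equal to `1 - cdf`), which tends to be more accurate for small tail areas.

```python
import numpy as np
import scipy.stats as st

# Summary statistics from Example 2.1 (two-tailed test)
n, xbar, mu, sigma, alpha = 32, 8.2, 8, 0.6, 0.05

z = (xbar - mu) / (sigma / np.sqrt(n))

# Critical-value method: reject when |z| exceeds the upper alpha/2 cutoff
z_critical = st.norm.ppf(1 - alpha / 2)
reject_by_critical = bool(abs(z) > z_critical)

# P-value method: reject when the two-tailed area is below alpha
p_val = 2 * st.norm.sf(abs(z))
reject_by_p_value = bool(p_val < alpha)

print(reject_by_critical, reject_by_p_value)
```

Both flags should agree: here neither method rejects the null hypothesis.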


## Section 3 : t-Test

## Example 3.1 : Hospital Infections
A medical investigation claims that the average number of infections per week at a hospital in southwestern Pennsylvania is 16.3. A random sample of 10 weeks had a mean number of 17.7 infections. The sample standard deviation is 1.8. Is there enough evidence to reject the investigator’s claim at α = 0.05? Assume the variable is normally distributed.

In [12]:
#H0 : μ =16.3, Ha : μ !=16.3
n = 10
degrees_of_freedom = n-1
xbar = 17.7
mu = 16.3
s = 1.8
alpha = 0.05

In [13]:
t = (xbar-mu)/(s/np.sqrt(n))
t

2.4595492912420704

In [14]:
t_critical = st.t.ppf(alpha/2,degrees_of_freedom)
t_critical

-2.262157162740992

In [15]:
if (abs(t)>abs(t_critical)): #Two-tailed test: compare magnitudes, since ppf(alpha/2) returns the negative critical value
    print("Reject null hypothesis")
else:
    print("Null hypothesis cannot be rejected")

Reject null hypothesis


## Section 4 : t-Test using P-value

## Example 4.1 : Jogger’s Oxygen Uptake
A physician claims that joggers’ maximal volume oxygen uptake is greater than the average of all adults. A random sample of 15 joggers has a mean of 40.6 milliliters per kilogram (ml/kg) and a standard deviation of 6 ml/kg. If the average of all adults is 36.7 ml/kg, is there enough evidence to support the physician’s claim at α = 0.05? Assume the variable is normally distributed. 

In [16]:
#H0 : μ =36.7, Ha : μ >36.7
n = 15
degrees_of_freedom = n-1
xbar = 40.6
mu = 36.7
s = 6
alpha = 0.05

In [17]:
t = (xbar-mu)/(s/np.sqrt(n))
t

2.51743917503482

In [18]:
p_val = (1 - st.t.cdf(abs(t),degrees_of_freedom)) #"1 - cdf" because it's a right-tailed test
p_val

0.012311189053656801

In [19]:
if (p_val>alpha):
    print("Null hypothesis cannot be rejected")
else:
    print("Reject null hypothesis")

Reject null hypothesis
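
The t-test P-value changes with the direction of the alternative hypothesis. As a rough sketch, the hypothetical helper `t_test_p_value` below (a name introduced here, not from the original notebook) computes the one-sample t statistic and its P-value from summary statistics for any of the three tail choices.

```python
import numpy as np
import scipy.stats as st

def t_test_p_value(xbar, mu, s, n, tail="two"):
    """Hypothetical helper: one-sample t-test P-value from summary statistics.

    tail is "right", "left", or "two". Returns (t, p_value).
    """
    t = (xbar - mu) / (s / np.sqrt(n))
    df = n - 1
    if tail == "right":
        p = st.t.sf(t, df)            # area to the right of t
    elif tail == "left":
        p = st.t.cdf(t, df)           # area to the left of t
    elif tail == "two":
        p = 2 * st.t.sf(abs(t), df)   # both tails beyond |t|
    else:
        raise ValueError("tail must be 'right', 'left', or 'two'")
    return float(t), float(p)

# Example 4.1: right-tailed test on joggers' oxygen uptake
t, p_val = t_test_p_value(40.6, 36.7, 6, 15, tail="right")
print(round(t, 3), round(p_val, 4))
```

Called with the Example 4.1 values, this should match the t statistic of about 2.517 and the P-value of about 0.0123 computed above.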


## Section 5 : Chi-Square Test

## Example 5.1 : IQ Test
A psychologist wishes to see if the variance in IQ of 10 of her counseling patients is less than the variance of the population, which is 225. The variance of the IQs of her 10 patients was 206. Test her claim at α = 0.05.

In [23]:
#H0 : σ² =225 , Ha : σ² <225
n = 10
degrees_of_freedom = n-1
s_square = 206
sigma_square = 225
alpha = 0.05

In [24]:
chi_square = ((n-1)*s_square) / (sigma_square)
chi_square

8.24

In [26]:
chi_square_critical =  st.chi2.ppf(alpha,degrees_of_freedom) #Left-tailed test: lower-tail area alpha, which corresponds to the "1-alpha" column in Bluman's right-tail-area table
chi_square_critical

3.325112843066815

In [27]:
if (chi_square > chi_square_critical):
    print("Null hypothesis cannot be rejected")
else:
    print("Reject null hypothesis")

Null hypothesis cannot be rejected


## Section 6 : Chi-Square Test using P-Value

## Example 6.1 : Car Inspection Times
A researcher knows from past studies that the standard deviation of the time it takes to inspect a car is 16.8 minutes. A random sample of 24 cars is selected and inspected. The standard deviation is 12.5 minutes. At α = 0.05, can it be concluded that the standard deviation has changed? Use the P-value method. Assume the variable is normally distributed. 

In [32]:
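#H0 : σ =16.8, Ha :  σ !=16.8
n = 24
degrees_of_freedom = n-1
s = 12.5
sigma = 16.8 #population standard deviation from the problem statement
alpha = 0.05

In [33]:
chi_square = ((n-1)*(s**2))/(sigma**2)
round(chi_square, 3)

12.733

In [34]:
p_val = 2*min(st.chi2.cdf(chi_square,degrees_of_freedom), st.chi2.sf(chi_square,degrees_of_freedom)) #Two-tailed test: double the smaller tail area
round(float(p_val), 3)

0.085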

In [35]:
if (p_val > alpha):
    print("Null hypothesis cannot be rejected")
else:
    print("Reject null hypothesis")

Null hypothesis cannot be rejected
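
For a two-tailed chi-square test, the P-value is twice the smaller of the two tail areas, because the test statistic may fall on either side of the distribution. The sketch below generalizes the calculation to all three tail choices; the function name `chi2_variance_p_value` and its `tail` argument are hypothetical and introduced only for illustration.

```python
import scipy.stats as st

def chi2_variance_p_value(s_square, sigma_square, n, tail="two"):
    """Hypothetical helper: chi-square test for a single variance.

    s_square is the sample variance, sigma_square the hypothesized
    population variance. Returns (chi_square, p_value).
    """
    df = n - 1
    chi_square = df * s_square / sigma_square
    lower = st.chi2.cdf(chi_square, df)  # area to the left of the statistic
    upper = st.chi2.sf(chi_square, df)   # area to the right of the statistic
    if tail == "left":
        p = lower
    elif tail == "right":
        p = upper
    elif tail == "two":
        p = min(1.0, 2 * min(lower, upper))  # double the smaller tail, capped at 1
    else:
        raise ValueError("tail must be 'left', 'right', or 'two'")
    return float(chi_square), float(p)

# Example 6.1: two-tailed test, s = 12.5 and sigma = 16.8 from the problem statement
chi_square, p_val = chi2_variance_p_value(12.5**2, 16.8**2, 24, tail="two")
print(round(chi_square, 3), p_val > 0.05)
```

Using the standard deviations from the Example 6.1 problem statement, the P-value should fall between 0.05 and 0.10, so the null hypothesis cannot be rejected at α = 0.05.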
