# Hypothesis Testing - examples

<img src="image.jpeg" width="500">

### Objective: To reject or fail to reject the Null Hypothesis and state the final conclusion of each example


# Importing libraries

In [4]:
import statsmodels.api as sm
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

# Examples of Hypothesis Testing

## _Population Proportion_

In previous years, 53% of parents believed that electronics and social media were the cause of their teenager’s lack of sleep. 

<img src="image1.jpeg" width="700">


#### Research Question: _Do more parents today believe that their teenager’s lack of sleep is caused by electronics and social media?_

#### Hypothesis Testing Details
* Population: Parents with a teenager (age 13-18)  
* Parameter of Interest: p, the proportion of parents who believe that electronics and social media cause their teenager’s lack of sleep  

    * H0: p = 0.53 [The proportion of parents who hold this belief is unchanged]
    * H1 or Ha: p > 0.53 [The proportion of parents who hold this belief has increased]
    (note that this is a one-sided test)

#### Data Information
   * Sample size: 2000 parents were surveyed.
   * 57% of those surveyed believe that their teenager’s lack of sleep is caused by electronics and social media.

In [40]:
# coding the hypothesis testing details
n = 2000
pnull = .53
phat = .57

In [42]:
# applying proportions_ztest from statsmodels that returns Z-statistic and p-value
# larger means one-sided test
a = sm.stats.proportions_ztest(phat * n, n, pnull, alternative='larger')

In [79]:
print(f' Z-statistic is around {a[0]:.4f}')
print(f' p-value is around {a[1]:.6f}')
print()
if a[1] <= 0.05:
    print("Conclusion: Reject Null Hypothesis")
else:
    print("Conclusion: Fail to reject Null Hypothesis")

 Z-statistic is around 3.6133
 p-value is around 0.000151

Conclusion: Reject Null Hypothesis


### Conclusion of the hypothesis test
Since the p-value is far below the 0.05 significance level, we reject the Null hypothesis. The sample provides evidence that the proportion of parents who attribute their teenager’s lack of sleep to electronics and social media is now greater than 53%.
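
As a cross-check, the one-sample proportion z-statistic can be computed by hand with only the standard library. This is a minimal sketch, not part of the original analysis: the helper name `one_proportion_ztest` and its `use_null_variance` flag are made up for illustration, and it assumes the normal approximation is adequate at n = 2000. It shows both common choices for the standard error: the sample proportion (what `proportions_ztest` uses by default, `prop_var=False`) and the null proportion.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_ztest(phat, n, pnull, use_null_variance=False):
    """One-sided (greater) z-test for a single proportion.

    Hypothetical helper for illustration only.
    """
    # proportion plugged into the standard error: sample value or null value
    p_var = pnull if use_null_variance else phat
    se = sqrt(p_var * (1 - p_var) / n)
    z = (phat - pnull) / se
    # upper-tail probability under the standard normal distribution
    p_value = 1 - NormalDist().cdf(z)
    return z, p_value


z_sample, p_sample = one_proportion_ztest(0.57, 2000, 0.53)
z_null, p_null = one_proportion_ztest(0.57, 2000, 0.53, use_null_variance=True)

print(f"sample-variance version: z = {z_sample:.4f}, p-value = {p_sample:.6f}")
print(f"null-variance version:   z = {z_null:.4f}, p-value = {p_null:.6f}")
```

Both versions give a z-statistic well above the one-sided 5% critical value of about 1.645, so the choice of standard error does not change the decision here.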

## _Difference in Population Proportions_
In a random survey, parents were asked whether their child has had swimming lessons at school.

<img src="image2.jpeg" width="700">

#### Research Question: _Is there a significant difference between the population proportions of parents of black children and parents of Hispanic children who report that their child has had some swimming lessons?_

#### Hypothesis Testing Details
- Populations: Parents of black children (age 6-18) and parents of Hispanic children (age 6-18)  
- Parameter of Interest: p1 - p2, where p1 is the proportion for parents of black children and p2 is the proportion for parents of Hispanic children  
    * H0: p1 - p2 = 0  
    * H1 or Ha: p1 - p2 $\neq$ 0  

#### Data Information
- 250 parents of black children, of whom 36.8% report that their child has had some swimming lessons. 
- 350 parents of Hispanic children, of whom 38.9% report that their child has had some swimming lessons.

In [70]:
# coding the hypothesis testing details
n1 = 250
p1 = .368

n2 = 350
p2 = .389

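# simulate 0/1 responses for each sample using the reported proportions
# the draws are random and unseeded, so the statistics below change from run to run
population1 = np.random.binomial(1, p1, n1)
population2 = np.random.binomial(1, p2, n2)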

In [72]:
# applying ttest_ind from statsmodels that returns the test statistic, p-value and degrees of freedom
# the default alternative is two-sided, matching Ha: p1 - p2 != 0
a = sm.stats.ttest_ind(population1, population2)

In [78]:
print(f' test-statistic = {a[0]:.4f}')
print(f' p-value = {a[1]:.4f}')
print(f' degrees of freedom = {a[2]:.1f}')

print()
if a[1] <= 0.05:
    print("Conclusion: Reject Null Hypothesis")
else:
    print("Conclusion: Fail to reject Null Hypothesis")

 test-statistic = 0.2695
 p-value = 0.7877
 degrees of freedom = 598.0

Conclusion: Fail to reject Null Hypothesis


### Conclusion of the hypothesis test
Since the p-value is quite high (~0.7877), we fail to reject the Null hypothesis in this case, i.e. the difference between the population proportions is not statistically significant.
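
Because the cell above tests simulated 0/1 draws, its statistics depend on the random sample. A deterministic alternative is the pooled two-proportion z-test applied directly to the reported proportions. The sketch below is a swapped-in technique, not the method used in this notebook; it uses only the standard library and works with the proportions as reported, on the assumption that they are accurate even though 38.9% of 350 is not a whole number of respondents.

```python
from math import sqrt
from statistics import NormalDist

# reported sample sizes and proportions
n1, p1 = 250, 0.368  # parents of black children
n2, p2 = 350, 0.389  # parents of Hispanic children

# pooled proportion under H0: p1 - p2 = 0
p_pool = (p1 * n1 + p2 * n2) / (n1 + n2)

# standard error of the difference under H0
se = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))

z = (p1 - p2) / se
# two-sided p-value, matching Ha: p1 - p2 != 0
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

print(f"z = {z:.4f}, p-value = {p_value:.4f}")
```

With these inputs the p-value is well above 0.05, so this version also fails to reject the Null hypothesis.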

## _One Population Mean_

Results from a physical competition among adults, used here to test a single population mean.
<img src="image3.jpeg" width="700">

### Research Question: _Is the average cartwheel distance (in inches) for adults more than 80 inches?_

### Hypothesis Testing:
- Population: All adults  
- Parameter of Interest: $\mu$, population mean cartwheel distance.
    * H0: $\mu$ = 80 
    * H1 or Ha: $\mu$ > 80

### Data information:
<br>25 adult participants. 
<br>Sample mean: $\bar{x} = 83.84$
<br>Standard deviation of the sample: $s = 10.72$

In [91]:
# coding the hypothesis testing details
data = np.array([80.57, 98.96, 85.28, 83.83, 69.94, 89.59, 91.09, 66.25, 91.21, 82.7 , 73.54, 81.99, 54.01, 
                 82.89, 75.88, 98.32, 107.2 , 85.53, 79.08, 84.3 , 89.32, 86.35, 78.98, 92.26, 87.01])
n = len(data)
mean = data.mean()
sd = data.std()  # NumPy's default ddof=0, i.e. divides by n

print(f' adult participants = {n}')
print(f' mean = {mean:.2f}')
print(f' standard-deviation = {sd:.2f}')


 adult participants = 25
 mean = 83.84
 standard-deviation = 10.72


In [93]:
# applying ztest from statsmodels that returns Z-statistic and p-value
# larger means one-sided test (Ha: mean > 80)
a = sm.stats.ztest(data, value = 80, alternative = "larger")

In [94]:
print(f' Z-statistic is around {a[0]:.4f}')
print(f' p-value is around {a[1]:.6f}')
print()
if a[1] <= 0.05:
    print("Conclusion: Reject Null Hypothesis")
else:
    print("Conclusion: Fail to reject Null Hypothesis")

 Z-statistic is around 1.7570
 p-value is around 0.039461

Conclusion: Reject Null Hypothesis


### Conclusion of the hypothesis test
Since the p-value (about 0.039) is lower than the significance level (0.05), we reject the Null hypothesis that the mean cartwheel distance for adults is equal to 80 inches. There is therefore evidence in support of the alternative hypothesis that the mean cartwheel distance is higher than 80 inches.
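
The z-statistic can be reproduced from its definition, $z = (\bar{x} - \mu_0) / (s / \sqrt{n})$. The sketch below uses only the standard library and is an illustration rather than the notebook's own code. It assumes the standard error is built from the n-1 standard deviation (`statistics.stdev`), which is understood to match the default `ddof=1` of `sm.stats.ztest`; the names `mu0`, `xbar`, `s` and `se` are chosen here for readability.

```python
from math import sqrt
from statistics import NormalDist, mean, pstdev, stdev

data = [80.57, 98.96, 85.28, 83.83, 69.94, 89.59, 91.09, 66.25, 91.21,
        82.7, 73.54, 81.99, 54.01, 82.89, 75.88, 98.32, 107.2, 85.53,
        79.08, 84.3, 89.32, 86.35, 78.98, 92.26, 87.01]

mu0 = 80            # value of the mean under H0
n = len(data)
xbar = mean(data)
s = stdev(data)     # divides by n - 1
se = s / sqrt(n)    # standard error of the sample mean

z = (xbar - mu0) / se
# upper-tail p-value for Ha: mu > 80
p_value = 1 - NormalDist().cdf(z)

print(f"n = {n}, mean = {xbar:.2f}")
print(f"stdev (n-1) = {s:.2f}, pstdev (n) = {pstdev(data):.2f}")
print(f"z = {z:.4f}, p-value = {p_value:.6f}")
```

The 10.72 printed earlier is the divide-by-n value, which `pstdev` reproduces; the test statistic itself relies on the slightly larger n-1 value.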