<h1 align="center">Z Test Tutorial: One Tailed Test</h1>

In [31]:
import pandas as pd
import numpy as np

### Null Hypothesis: Housing inflation is 10%
### Alternate Hypothesis: Housing inflation is > 10%


For this test, we collected a sample of 100 home price inflation figures. We load them into a pandas DataFrame.

In [2]:
df = pd.read_csv("house_price_increase.csv")
df.head()

,house_id,price_increase_pct
0,NJ001,12.7
1,NJ002,11.3
2,NJ003,11.9
3,NJ004,13.2
4,NJ005,12.8


In [3]:
df.shape

(100, 2)

In [4]:
population_mean = 10
population_std_dev = 4

In [5]:
sample_mean = df.price_increase_pct.mean()
sample_mean

11.0

In [6]:
sample_size = df.shape[0]
sample_size

100

In [8]:
standard_error = population_std_dev/np.sqrt(sample_size)
standard_error

0.4

In [13]:
z_score = (sample_mean-population_mean)/standard_error
z_score

2.5
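
The two steps above (standard error, then z score) can be bundled into one small helper. This is only a sketch: the function name `z_statistic` is an assumption made for illustration and is not part of the original notebook, and the inputs are the values already shown in the cells above.

```python
import math

def z_statistic(sample_mean, population_mean, population_std_dev, sample_size):
    """Return (standard_error, z_score) for a z test with a known population std dev.

    Hypothetical helper, introduced here only for illustration.
    """
    # Standard error of the mean: sigma / sqrt(n)
    standard_error = population_std_dev / math.sqrt(sample_size)
    # How many standard errors the sample mean lies from the hypothesized mean
    z_score = (sample_mean - population_mean) / standard_error
    return standard_error, z_score

# Values taken from the cells above
standard_error, z_score = z_statistic(
    sample_mean=11.0, population_mean=10, population_std_dev=4, sample_size=100
)
print(standard_error, z_score)
```

A positive z score means the sample mean lies above the hypothesized mean, which is the direction this one tailed test cares about.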

We set the significance level (alpha) to 5%.

In [30]:
alpha = 0.05 # alpha means significance level

### Z Test Using Rejection Region

In [28]:
from scipy import stats

z_critical = stats.norm.ppf(1-alpha) # get z score from the area probability
z_critical

1.6448536269514722

In [21]:
z_score, z_critical

(2.5, 1.6448536269514722)

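Since z_score (2.5) > z_critical (1.645), the test statistic falls in the rejection region, so we reject the null hypothesis at the 5% significance level. This does not prove that the alternate hypothesis is true, but the sample gives statistically significant evidence that housing inflation is higher than 10%.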
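
The rejection region rule can be written as a reusable check. The sketch below assumes a right tailed test, as in this tutorial; the function name `reject_null_right_tailed` is hypothetical and introduced only for illustration. It uses `scipy.stats.norm.ppf`, the same call as the cell above.

```python
from scipy import stats

def reject_null_right_tailed(z_score, alpha=0.05):
    """Return (reject, z_critical) for a right tailed z test.

    Hypothetical helper, introduced here only for illustration.
    """
    # Critical value: the z with area (1 - alpha) to its left under the standard normal
    z_critical = stats.norm.ppf(1 - alpha)
    # Reject the null only when the statistic lands in the upper tail
    return bool(z_score > z_critical), float(z_critical)

# z score taken from the cells above
reject, z_crit = reject_null_right_tailed(2.5, alpha=0.05)
print(reject, z_crit)
```

For a left tailed test the comparison would flip to `z_score < stats.norm.ppf(alpha)`; that variant is not needed here because the alternate hypothesis is "greater than".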

### Z Test Using p-Value

In [22]:
from scipy import stats

stats.norm.cdf(z_score)

0.9937903346742238

In [26]:
p_value = 1 - stats.norm.cdf(z_score) # get p value from z score
p_value

0.006209665325776159

In [29]:
p_value, alpha

(0.006209665325776159, 0.05)

Since the p-value (0.006) is less than alpha (the significance level, 0.05), we reject the null hypothesis. This means there is statistically significant evidence to support the claim that the inflation rate in house prices is higher than the reported 10%.
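
As an optional variation, the right tail area can be read directly from the survival function `scipy.stats.norm.sf`, which equals `1 - cdf` but avoids the subtraction and is therefore more accurate for very large z scores. This is a sketch, not the notebook's original method; the function name `right_tailed_p_value` is an assumption made for illustration.

```python
from scipy import stats

def right_tailed_p_value(z_score):
    """Return P(Z >= z_score) for a standard normal Z.

    Hypothetical helper, introduced here only for illustration.
    """
    # sf is the survival function, i.e. 1 - cdf, computed without the subtraction
    return float(stats.norm.sf(z_score))

alpha = 0.05
p_value = right_tailed_p_value(2.5)  # z score taken from the cells above
print(p_value, p_value < alpha)
```

Both decision rules always agree for the same alpha: the p-value is below alpha exactly when the z score exceeds the critical value.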