## Z-test

##### What is Z-Test?
A Z-test is a statistical test used to determine whether the mean of a sample is significantly different from a known population mean when the population standard deviation is known. It is particularly useful when the sample size is large (n > 30).

More generally, a Z-test is any hypothesis test whose test statistic approximately follows a normal distribution under the null hypothesis. It can also be used to decide whether two sample means differ significantly when the population variances are known and the sample sizes are large (n > 30).

The Z-test compares the difference between the sample mean and the population mean with the standard error of the mean, that is, the standard deviation of the sampling distribution of the mean. The resulting Z-score is the number of standard errors by which the sample mean deviates from the population mean. This Z-score is also known as the Z-statistic and is given by:

$$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$$

where,

$\bar{x}$ : mean of the sample.<br>
$\mu$ : mean of the population.<br>
$\sigma$ : standard deviation of the population.<br>
$n$ : sample size.<br>
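The Z-statistic described above can be wrapped in a small function. This is a minimal sketch: the name `z_statistic` is a hypothetical helper, not part of any library, and the numbers passed in are the ones used in the right-tailed example later in this notebook.

```python
import math

def z_statistic(sample_mean, population_mean, population_std, sample_size):
    """One-sample Z-statistic (hypothetical helper name)."""
    # Standard error of the mean: sigma / sqrt(n)
    standard_error = population_std / math.sqrt(sample_size)
    # Distance between sample mean and population mean, in standard errors
    return (sample_mean - population_mean) / standard_error

z = z_statistic(sample_mean=110, population_mean=100, population_std=15, sample_size=50)
print(z)
```

A larger sample size shrinks the standard error, so the same difference in means produces a larger Z-statistic.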

##### When to Use Z-test:
- The sample size should be greater than 30; otherwise, use the t-test.
- Samples should be drawn at random from the population.
- The standard deviation of the population should be known.
- The observations drawn from the population should be independent of each other.
- The data should be normally distributed. For a large sample, however, the sampling distribution of the mean is approximately normal because of the central limit theorem.
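The central limit theorem condition can be illustrated with a short simulation. This is only a sketch under assumed settings: the exponential population, the seed, and the number of repetitions are arbitrary choices for illustration and do not come from the examples in this notebook.

```python
import numpy as np

rng = np.random.default_rng(0)   # arbitrary fixed seed for reproducibility

# Assumed population: exponential with scale=1 (skewed, mean 1, std 1)
population_mean = 1.0
population_std = 1.0
sample_size = 50
n_repeats = 10_000

# Draw many independent samples and compute the mean of each one
samples = rng.exponential(scale=1.0, size=(n_repeats, sample_size))
sample_means = samples.mean(axis=1)

# The spread of the sample means should be close to sigma / sqrt(n)
observed_se = sample_means.std(ddof=1)
expected_se = population_std / np.sqrt(sample_size)
print(observed_se, expected_se)

# Under a normal approximation, about 95% of sample means fall
# within 1.96 standard errors of the population mean
within = np.mean(np.abs(sample_means - population_mean) < 1.96 * expected_se)
print(within)
```

Even though the individual observations are far from normal, the sample means behave approximately like draws from a normal distribution, which is what justifies the Z-test for large samples.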

### One-tailed test

In [1]:
import matplotlib.pyplot as plt
import numpy as np
import scipy.stats as stats

#### 1. Right-tailed test

In [2]:
sample_mean = 110
population_mean = 100
population_std = 15
sample_size = 50
alpha = 0.05

# Right-tailed test: H0: mu = population_mean vs. H1: mu > population_mean
# (chosen here because the sample mean is larger than the population mean)

In [3]:
z_score = (sample_mean-population_mean)/(population_std/np.sqrt(sample_size))

In [4]:
z_score

4.714045207910317

In [5]:
# Approach 1: Using Critical Z-Score
z_critical = stats.norm.ppf(1-alpha)

In [6]:
z_critical

1.6448536269514722

In [7]:
if z_score > z_critical:
    print("Reject the Null hypothesis")
else:
    print("Fail to reject Null hypothesis")

Reject the Null hypothesis


In [8]:
# Approach 2: Using P-Value
p_value = 1 - stats.norm.cdf(z_score) # right-tailed p-value: P(Z > z) = 1 - CDF(z)

In [9]:
p_value

1.2142337364462463e-06

In [10]:
if p_value < alpha:
    print("Reject the Null hypothesis")
else:
    print('Fail to reject Null hypothesis')

Reject the Null hypothesis
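As a side note, the upper-tail probability can also be obtained from the survival function `stats.norm.sf`, which computes `1 - cdf` directly and tends to be more accurate far out in the tail. A minimal sketch, reusing the Z-score from the right-tailed example:

```python
import scipy.stats as stats

z_score = 4.714045207910317  # Z-score from the right-tailed example

# Two ways to get the right-tailed p-value P(Z > z)
p_from_cdf = 1 - stats.norm.cdf(z_score)
p_from_sf = stats.norm.sf(z_score)

print(p_from_cdf, p_from_sf)
```

For a Z-score of this size the two values agree closely; the difference matters mainly for very large Z-scores, where `1 - cdf` can lose precision.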


#### 2. Left-tailed test

In [11]:
sample_mean = 100
population_mean = 120
population_std = 15
sample_size = 50
alpha = 0.05

# Left-tailed test: H0: mu = population_mean vs. H1: mu < population_mean
# (chosen here because the sample mean is smaller than the population mean)

In [12]:
z_score = (sample_mean-population_mean)/(population_std/np.sqrt(sample_size))

In [13]:
z_score

-9.428090415820634

In [14]:
z_critical = -stats.norm.ppf(1-alpha)

In [15]:
z_critical

-1.6448536269514722

In [16]:
# Approach 1: Using Critical Z-Score
if z_score < z_critical:
    print("Reject the null hypothesis")
else:
    print("Fail to reject null hypothesis")

Reject the null hypothesis


In [17]:
# Approach 2: Using P-Value
p_value = stats.norm.cdf(z_score) # left-tailed p-value: P(Z < z) = CDF(z), no subtraction from one

In [18]:
p_value

2.088112459630109e-21

In [19]:
if p_value < alpha:
    print("Reject the Null hypothesis")
else:
    print('Fail to reject Null hypothesis')

Reject the Null hypothesis
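The right-tailed and left-tailed steps above differ only in which tail the p-value is taken from, so they can be combined into one function. This is a sketch, not a library API: `one_tailed_z_test` and its `tail` parameter are hypothetical names introduced for illustration.

```python
import numpy as np
import scipy.stats as stats

def one_tailed_z_test(sample_mean, population_mean, population_std,
                      sample_size, tail, alpha=0.05):
    """Hypothetical helper: returns (z_score, p_value, reject_null)."""
    if tail not in ("right", "left"):
        raise ValueError("tail must be 'right' or 'left'")
    z_score = (sample_mean - population_mean) / (population_std / np.sqrt(sample_size))
    if tail == "right":
        p_value = stats.norm.sf(z_score)   # P(Z > z), same as 1 - cdf
    else:
        p_value = stats.norm.cdf(z_score)  # P(Z < z)
    return z_score, p_value, bool(p_value < alpha)

# Same inputs as the two one-tailed examples above
right = one_tailed_z_test(110, 100, 15, 50, tail="right")
left = one_tailed_z_test(100, 120, 15, 50, tail="left")
print(right)
print(left)
```

Both calls should lead to rejecting the null hypothesis, matching the results of the step-by-step cells.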


### Two-tailed test

##### Example:

In [20]:
michigan_mean = 6873
usa_mean = 6800
usa_std = 400
sample_size = 100
alpha = 0.05

In [21]:
z_score = (michigan_mean-usa_mean)/(usa_std/np.sqrt(sample_size))

In [22]:
z_score

1.825

In [23]:
z_critical = stats.norm.ppf(1-alpha/2)

In [24]:
z_critical

1.959963984540054

In [25]:
if abs(z_score) > z_critical:
    print("Reject the null hypothesis")
else:
    print("Fail to reject the null hypothesis")

Fail to reject the null hypothesis
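The two-tailed example above uses only the critical-value approach. The p-value approach works as well, with the tail probability doubled because both tails count as evidence against the null hypothesis. A minimal sketch using the same numbers:

```python
import numpy as np
import scipy.stats as stats

michigan_mean = 6873
usa_mean = 6800
usa_std = 400
sample_size = 100
alpha = 0.05

z_score = (michigan_mean - usa_mean) / (usa_std / np.sqrt(sample_size))

# Two-tailed p-value: P(|Z| > |z|) = 2 * P(Z > |z|)
p_value = 2 * stats.norm.sf(abs(z_score))
print(p_value)

if p_value < alpha:
    print("Reject the null hypothesis")
else:
    print("Fail to reject the null hypothesis")
```

The p-value comes out above `alpha`, so this approach agrees with the critical-value comparison: the null hypothesis is not rejected.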
