## __Z-Test and P-Value Using Python__

A Z-test checks whether a sample mean differs from a known population mean, or whether the means of two samples differ from each other. It is typically used when the sample size is greater than 30.

For samples smaller than 30, we use a t-test instead.

There are two types of Z-tests, and they are:


1.   One-sample Z-Test
2.   Two-sample Z-Test


## Step 1: Import the Z-Test Library and Set the Seed

- Import the necessary library and set the random seed


In [None]:
import random
from statsmodels.stats.weightstats import ztest

random.seed(20)  # fix the seed so the random samples are reproducible



Assume that a doctor knows that the mean IQ is 100 with a standard deviation of 15:
- $\mu = 100$
- $\sigma = 15$


He develops a new drug and wants to test whether it has an impact on the IQ of the people who take it.


Let's create a random sample representing the values that he recorded.


## Step 2: One-Sample Z-Test

- Here, the null hypothesis is that the sample mean equals the value that we provide (which means that the drug had no effect).

   i.e. - null hypothesis: $\mu_{s} = value$

- The alternate hypothesis is that the sample mean is not equal to the value that we provide (which means that the drug had an effect).

   i.e. - alternate hypothesis: $\mu_{s} \ne value$


Let's go ahead and do the Z-test.

### Step 2.1: Create a Sample with the Same Mean


In [None]:
# Draw 40 IQ scores from a normal distribution with mean 100 and standard deviation 15
a = [random.gauss(100, 15) for _ in range(40)]

In [None]:
# Two-sided one-sample z-test against a hypothesised mean of 100
ztest(a, value=100)

(1.3430675401429018, 0.17925010504664385)

__Observation__

We can see that the p-value (about 0.18) is greater than 0.05, so we fail to reject the null hypothesis. The data give no evidence that the drug changed the mean IQ, which is expected because we generated the sample with a mean of 100.
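The tuple returned by `ztest` holds the z-statistic followed by the p-value. The following is a minimal sketch of how a one-sample z-statistic could be computed by hand using only the standard library. The function name `one_sample_z` and its optional `sigma` argument are assumptions made for illustration. `ztest` itself takes no $\sigma$ argument and works from the spread of the sample, which corresponds to leaving `sigma` unset here.

```python
import math
import statistics


def one_sample_z(sample, value, sigma=None):
    """Return the one-sample z-statistic for H0: mean == value.

    `sigma` is a hypothetical optional argument for this sketch: pass the
    known population standard deviation, or leave it as None to estimate
    the standard deviation from the sample.
    """
    n = len(sample)
    spread = sigma if sigma is not None else statistics.stdev(sample)
    standard_error = spread / math.sqrt(n)
    return (statistics.mean(sample) - value) / standard_error


# 36 people averaging 103, tested against mu = 100 with known sigma = 15
z_known = one_sample_z([103] * 36, value=100, sigma=15)

# A tiny sample with the standard deviation estimated from the data
z_estimated = one_sample_z([90, 100, 110, 120], value=100)
```

The further the sample mean lies from `value`, measured in standard errors, the larger the magnitude of the z-statistic and the smaller the p-value.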

### Step 2.2: Create a Sample with a Mean Greater than That of the Population

Let's increase the sample mean to 120 to check whether the test detects that the drug raises IQ.

- Pass a value equal to 100
- Set the alternative hypothesis to **larger**


In [None]:
# Draw 40 IQ scores centred on 120 to simulate a drug that raises IQ
a = [random.gauss(120, 15) for _ in range(40)]

In [None]:
# One-sided z-test: is the sample mean larger than 100?
ztest(a, value=100, alternative='larger')

(-0.6365535358622426, 0.7377921506234024)

__Observation__

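- A sample generated with a mean of 120 should produce a large positive z-statistic and a p-value far below 0.05, in which case we reject the null hypothesis. The output recorded above (a p-value of about 0.74) does not match this expectation, so re-run the two cells above in order before interpreting it.

- Rejecting the null hypothesis means that the drug actually worked. This is expected because the mean that we gave in random data generation is 120.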

### Step 2.3: Create a Sample with a Mean Smaller than That of the Population

Here, the alternate hypothesis is that the mean of the sample is less than the value.

- Set the alternative hypothesis to **smaller**



In [None]:
# Draw 40 IQ scores centred on 80 to simulate a drug that lowers IQ
a = [random.gauss(80, 15) for _ in range(40)]

In [None]:
# Perform a one-sided z-test against a population mean of 100 (alternative='smaller')
ztest(a, value=100, alternative='smaller')

(-8.827369053119002, 5.358336411467834e-19)



__Observation__


- We can see that the p-value is less than 0.05. This means we need to reject our null hypothesis.


- Hence, we accept the alternative hypothesis. With a z-statistic of about -8.83 and a p-value of about 5.4e-19, the sample mean is significantly less than 100, which suggests that the drug actually lowered the IQ.
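The three `alternative` settings used in Step 2 differ only in which tail of the standard normal distribution is counted. The sketch below, using only the standard library, converts a z-statistic into a p-value for each setting. The helper name `z_to_p` is an assumption made for illustration and is not part of statsmodels.

```python
import math


def z_to_p(z, alternative="two-sided"):
    """Convert a z-statistic to a p-value under the standard normal distribution."""
    upper_tail = 0.5 * math.erfc(z / math.sqrt(2))   # P(Z > z)
    lower_tail = 0.5 * math.erfc(-z / math.sqrt(2))  # P(Z < z)
    if alternative == "larger":
        return upper_tail
    if alternative == "smaller":
        return lower_tail
    if alternative == "two-sided":
        return 2 * min(upper_tail, lower_tail)
    raise ValueError(f"unknown alternative: {alternative!r}")


# z-statistics copied from the outputs recorded in Steps 2.1, 2.2, and 2.3
p_two_sided = z_to_p(1.3430675401429018)
p_larger = z_to_p(-0.6365535358622426, "larger")
p_smaller = z_to_p(-8.827369053119002, "smaller")
```

A negative z-statistic combined with `alternative='larger'` always gives a p-value above 0.5, because most of the normal curve lies to the right of a negative z.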

## Step 3: Two-Sample Z-Test

Let's take an example where we need to check if the IQs in the two cities are different.

- The first city has an average IQ of 100.
- The second city has an average IQ of 120.

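The `value` argument is the difference between the two means under the null hypothesis. The test statistic is based on the mean of sample **a** minus the mean of sample **b** minus the value that we provide.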
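To make the role of `value` concrete, here is a minimal sketch of a two-sample z-statistic written with the standard library. It assumes a pooled variance estimate, which is our understanding of the default `usevar='pooled'` behaviour of `ztest`. The function name `two_sample_z` is hypothetical.

```python
import math
import statistics


def two_sample_z(a, b, value=0):
    """Return the z-statistic for H0: mean(a) - mean(b) == value (pooled variance)."""
    n_a, n_b = len(a), len(b)
    # Pool the two sample variances, weighting each by its degrees of freedom
    pooled_var = (
        (n_a - 1) * statistics.variance(a) + (n_b - 1) * statistics.variance(b)
    ) / (n_a + n_b - 2)
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (statistics.mean(a) - statistics.mean(b) - value) / standard_error


# The means differ by -1, so testing against value=0 gives a non-zero statistic
z_default = two_sample_z([1, 2, 3, 4], [2, 3, 4, 5])

# Testing against the true difference of -1 gives a statistic of zero
z_shifted = two_sample_z([1, 2, 3, 4], [2, 3, 4, 5], value=-1)
```

Setting `value=0` therefore tests whether the two means are equal, which is the case used in the steps that follow.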


### Step 3.1: Sample with Different Means

In [None]:
# Draw 40 IQ scores for each city: city a centred on 100, city b centred on 120
a = [random.gauss(100, 15) for _ in range(40)]
b = [random.gauss(120, 15) for _ in range(40)]

State the null and alternate hypotheses for the two-sample Z-test, where $\mu_{d}$ is the difference between the mean IQs of the two cities:


- Null hypothesis: $\mu_{d} = 0$
- Alternate hypothesis: $\mu_{d} \ne 0$

In [None]:
# Two-sided two-sample z-test: is the difference between the means zero?
ztest(a, b, value=0)

(-5.723730941795308, 1.0420974415988871e-08)

__Observation__

- We can see that the p-value is very low; hence, we need to reject the null hypothesis.

- This means that the IQs of the two cities are not the same.

### Step 3.2: Sample with the Same Mean

Let's give the same IQ for both cities and then do a Z-test.

In [None]:
# Draw 40 IQ scores for each city, both centred on 100
a = [random.gauss(100, 15) for _ in range(40)]
b = [random.gauss(100, 15) for _ in range(40)]

In [None]:
# Two-sided two-sample z-test: is the difference between the means zero?
ztest(a, b, value=0)

(-0.32089978531778995, 0.7482863366289499)

__Observation__

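- We can see that the p-value (about 0.75) is greater than 0.05. This means we fail to reject the null hypothesis.

- This means that there is no evidence that the IQs of the two cities differ, which is expected because both samples were generated with a mean of 100.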
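Every observation in this notebook applies the same rule: compare the p-value with a significance level of 0.05. A minimal sketch of that rule is shown below. The helper name `decide` and its `alpha` parameter are assumptions made for illustration.

```python
def decide(p_value, alpha=0.05):
    """Return the test decision for a p-value at significance level alpha."""
    return "reject H0" if p_value < alpha else "fail to reject H0"


# p-values copied from the outputs recorded in Steps 3.1 and 3.2
different_means = decide(1.0420974415988871e-08)
same_means = decide(0.7482863366289499)
```

A p-value above `alpha` does not prove that the null hypothesis is true. It only means that the sample does not provide enough evidence against it.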