In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.express as px
import scipy.stats as stats
%matplotlib inline

### One proportion Z-test

**Assumptions:**
* The sample data must be randomly selected from the population.
* For the Z-test to be valid, both $np_0$ and $n(1-p_0)$ should be at least 5, ensuring the sampling distribution of the sample proportion approximates normality.
* Observations in the sample should be independent.
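The sample-size condition above can be checked mechanically before running a test. The sketch below is only an illustration: `normal_approx_ok` and its `threshold` parameter are hypothetical names, not part of this notebook or of any library.

```python
def normal_approx_ok(n, p0, threshold=5):
    """Return True when both n*p0 and n*(1-p0) are at least `threshold`."""
    expected_successes = n * p0        # expected count of successes under H0
    expected_failures = n * (1 - p0)   # expected count of failures under H0
    return expected_successes >= threshold and expected_failures >= threshold


print(normal_approx_ok(500, 0.55))  # n and p0 from Example 1
print(normal_approx_ok(20, 0.10))   # small made-up sample: only 2 expected successes
```

Randomness and independence of the observations depend on how the data were collected and cannot be verified this way.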

**Example 1:** A candidate’s campaign team wants to test whether more than 55% of voters support their candidate. In a sample of 500 voters, 280 support the candidate.

* Sample size $(n)$ = 500
* Observed successes $(x)$ = 280 
* Hypothesized proportion $(p_0)$ = 0.55

**Assumptions:**
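* $np_0 = 500 \times 0.55 = 275$
* $n(1-p_0) = 500 \times 0.45 = 225$

Since both values are at least 5, the normal-approximation condition is satisfied.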

Testing the null hypothesis 

>$H_0:p=0.55$

against the alternative hypothesis

>$H_1:p>0.55$

In [2]:
# Data
n = 500
x = 280
p0 = 0.55
# Z-test calculation
from statsmodels.stats.proportion import proportions_ztest
z_stat, p_value = proportions_ztest(count=x, nobs=n, value=p0, alternative='larger')
p_value

0.32618624782631384

Since the p-value (about 0.326) is greater than the significance level $\alpha = 0.05$, we fail to reject the null hypothesis.

In [3]:
#testing the hypothesis using critical value
alpha = 0.05
critical_value = stats.norm.ppf(1 - alpha)
critical_value

1.6448536269514722

In [4]:
z_stat

0.4504687313477799

Since the z-statistic (about 0.450) is less than the critical value (about 1.645), it does not fall in the right-tail rejection region, so we fail to reject the null hypothesis.
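It can help to see where the z-statistic in Example 1 comes from. The sketch below recomputes it by hand using only the standard library. The reported value 0.4505 corresponds to a standard error built from the sample proportion $\hat{p} = x/n$ rather than from $p_0$. Many textbooks use $p_0$ in the standard error instead, which gives a slightly different statistic. The variable names are illustrative. As far as I know, `proportions_ztest` also documents a `prop_var` argument for choosing the proportion used in the variance, but that argument is not used anywhere in this notebook.

```python
from math import sqrt
from statistics import NormalDist

n, x, p0 = 500, 280, 0.55   # Example 1 data
p_hat = x / n               # sample proportion

# Standard error based on the sample proportion (matches the z_stat reported above)
z_sample = (p_hat - p0) / sqrt(p_hat * (1 - p_hat) / n)

# Standard error based on the hypothesized proportion (common textbook form)
z_null = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)

# Right-tailed p-value, P(Z > z), for the first version
p_value_sample = 1 - NormalDist().cdf(z_sample)

print(z_sample, z_null, p_value_sample)
```

With samples this large the two statistics differ only in the third decimal place, so the decision is the same either way.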

**Example 2:** A company claims its defect rate is 2%. A quality control manager wants to test if the defect rate is less than 2% based on a sample of 1,000 units, where 15 defects were found.

* Sample size $(n)$ = 1,000
* Observed successes $(x)$ = 15
* Hypothesized proportion $(p_0)$ = 0.02

**Assumptions**
* $np_0 = 1000 \times 0.02 = 20$
* $n(1-p_0) = 1000 \times 0.98 = 980$

Since both values are at least 5, the normal-approximation condition is satisfied.

Testing the null hypothesis

>$H_0:p=0.02$

against the alternate hypothesis

>$H_1:p<0.02$

In [5]:
# Data
n = 1000
x = 15
p0 = 0.02
#z-test
z_stat,p_value=proportions_ztest(count=x,nobs=n,value=p0,alternative='smaller')
p_value

0.09666564976962666

Since the p-value (about 0.097) is greater than the significance level $\alpha = 0.05$, we fail to reject the null hypothesis.

In [6]:
#testing the hypothesis using critical value
alpha=0.05
critical_value=stats.norm.ppf(alpha)
critical_value

-1.6448536269514729

In [7]:
z_stat

-1.3007872144692096

Since the z-statistic (about -1.301) is greater than the critical value (about -1.645), it does not fall in the left-tail rejection region, so we fail to reject the null hypothesis.
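The direction of the comparison depends on the tail being tested. A left-tailed test rejects when the z-statistic lies below the critical value, and a right-tailed test rejects when it lies above. A minimal sketch of that rule follows. `reject_h0` is a hypothetical helper, and the z-values passed to it are copied from outputs shown in this notebook.

```python
from statistics import NormalDist


def reject_h0(z_stat, alpha=0.05, alternative="smaller"):
    """Critical-value decision for a one-sided z-test (illustrative helper)."""
    if alternative == "smaller":
        # Left-tailed: rejection region is z < z_alpha (a negative number)
        return z_stat < NormalDist().inv_cdf(alpha)
    if alternative == "larger":
        # Right-tailed: rejection region is z > z_(1 - alpha)
        return z_stat > NormalDist().inv_cdf(1 - alpha)
    raise ValueError("alternative must be 'smaller' or 'larger'")


print(reject_h0(-1.3007872144692096, alternative="smaller"))  # Example 2
print(reject_h0(-1.8633899812498265, alternative="smaller"))  # Example 6
print(reject_h0(1.8537599944001615, alternative="larger"))    # Example 7
```

The critical-value decision always agrees with comparing the p-value to `alpha`, because both are derived from the same normal distribution.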

**Example 3:**  A service center claims that at least 80% of their customers are satisfied. Out of 250 surveyed customers, 190 indicated satisfaction.

* Sample size $(n)$ = 250
* Observed successes $(x)$ = 190
* Hypothesized proportion $(p_0)$ = 0.80

**Assumptions**
* $np_0 = 250 \times 0.80 = 200$
* $n(1-p_0) = 250 \times 0.20 = 50$

Since both values are at least 5, the normal-approximation condition is satisfied.

Testing the null hypothesis

>$H_0:p=0.80$

against the alternate hypothesis

>$H_1:p<0.80$

In [8]:
# Data
n = 250
x = 190
p0 = 0.80
#z-test
z_stat,p_value=proportions_ztest(count=x,nobs=n,value=p0,alternative='smaller')
p_value

0.0693203169066091

Since the p-value (about 0.069) is greater than the significance level $\alpha = 0.05$, we fail to reject the null hypothesis.

In [9]:
#hypothesis using critical value
alpha=0.05
critical_value=stats.norm.ppf(alpha)
critical_value

-1.6448536269514729

In [10]:
z_stat

-1.480872194397732

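Since the z-statistic (about -1.481) is greater than the critical value (about -1.645), it does not fall in the left-tail rejection region, so we fail to reject the null hypothesis.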

**Example 4:** A marketing team believes that their online ad campaign has a conversion rate of at least 5%. Out of 600 users who viewed the ad, 25 made a purchase.

* Sample size $(n)$ = 600
* Observed successes $(x)$ = 25
* Hypothesized proportion $(p_0)$ = 0.05

**Assumptions**
* $np_0 = 600 \times 0.05 = 30$
* $n(1-p_0) = 600 \times 0.95 = 570$

Since both values are at least 5, the normal-approximation condition is satisfied.

Testing the null hypothesis

>$H_0:p=0.05$

against the alternate hypothesis

>$H_1:p<0.05$

In [11]:
# Data
n = 600
x = 25
p0 = 0.05

# Z-test calculation
z_stat,p_value=proportions_ztest(count=x,nobs=n,value=p0,alternative='smaller')
p_value

0.15350694901691242

Since the p-value (about 0.154) is greater than the significance level $\alpha = 0.05$, we fail to reject the null hypothesis.

In [12]:
#testing the hypothesis using critical value
alpha=0.05
critical_value=stats.norm.ppf(alpha)
critical_value

-1.6448536269514729

In [13]:
z_stat

-1.0215078369104988

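Since the z-statistic (about -1.022) is greater than the critical value (about -1.645), it does not fall in the left-tail rejection region, so we fail to reject the null hypothesis.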

**Example 5:** A health organization claims that at least 70% of a region’s population is vaccinated. Out of a sample of 1,200 individuals, 820 were vaccinated.

* Sample size $(n)$ = 1200
* Observed successes $(x)$ = 820
* Hypothesized proportion $(p_0)$ = 0.70

**Assumptions**
* $np_0 = 1200 \times 0.70 = 840$
* $n(1-p_0) = 1200 \times 0.30 = 360$

Since both values are at least 5, the normal-approximation condition is satisfied.

Testing the null hypothesis 

>$H_0:p=0.70$

against the alternate hypothesis

>$H_1:p<0.70$

In [14]:
# Data
n = 1200
x = 820
p0 = 0.70

# Z-test calculation
z_stat,p_value=proportions_ztest(count=x,nobs=n,value=p0,alternative='smaller')
p_value

0.10727642555328526

Since the p-value (about 0.107) is greater than the significance level $\alpha = 0.05$, we fail to reject the null hypothesis.

In [15]:
#testing hypothesis using critical value
alpha=0.05
critical_value=stats.norm.ppf(alpha)
critical_value

-1.6448536269514729

In [16]:
z_stat

-1.2411432056761775

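Since the z-statistic (about -1.241) is greater than the critical value (about -1.645), it does not fall in the left-tail rejection region, so we fail to reject the null hypothesis.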

**Example 6:** A bank manager wants to confirm that the loan approval rate is at least 40%. Out of 500 loan applications, 180 were approved.

* Sample size $(n)$ = 500
* Observed successes $(x)$ = 180
* Hypothesized proportion $(p_0)$ = 0.40

**Assumptions**
* $np_0 = 500 \times 0.40 = 200$
* $n(1-p_0) = 500 \times 0.60 = 300$

Since both values are at least 5, the normal-approximation condition is satisfied.

Testing the null hypothesis 

>$H_0:p=0.40$

against the alternate hypothesis

>$H_1:p<0.40$

In [17]:
# Data
n = 500
x = 180
p0 = 0.40

# Z-test calculation
z_stat, p_value = proportions_ztest(count=x, nobs=n, value=p0, alternative='smaller')
p_value

0.03120370928435278

Since the p-value (about 0.031) is less than the significance level $\alpha = 0.05$, we reject the null hypothesis.

In [18]:
# testing the hypothesis using the critical value
alpha=0.05
critical_value=stats.norm.ppf(alpha)
critical_value

-1.6448536269514729

In [19]:
z_stat

-1.8633899812498265

Since the z-statistic (about -1.863) is less than the critical value (about -1.645), it falls in the left-tail rejection region, so we reject the null hypothesis.

**Example 7:** A car manufacturer claims that the defect rate in a specific model should not exceed 2%. Out of 1,000 cars inspected, 30 were found to be defective.

* Sample size $(n)$ = 1000
* Observed successes $(x)$ = 30
* Hypothesized proportion $(p_0)$ = 0.02

**Assumptions**
* $np_0 = 1000 \times 0.02 = 20$
* $n(1-p_0) = 1000 \times 0.98 = 980$

Since both values are at least 5, the normal-approximation condition is satisfied.

Testing the null hypothesis 

>$H_0:p=0.02$

against the alternate hypothesis

>$H_1:p>0.02$

In [20]:
# Data
n = 1000
x = 30
p0 = 0.02

# Z-test calculation
z_stat, p_value = proportions_ztest(count=x, nobs=n, value=p0, alternative='larger')
p_value

0.0318867521351953

Since the p-value (about 0.032) is less than the significance level $\alpha = 0.05$, we reject the null hypothesis.

In [21]:
# testing the hypothesis using the critical value
alpha=0.05
critical_value=stats.norm.ppf(1-alpha)
critical_value

1.6448536269514722

In [22]:
z_stat

1.8537599944001615

Since the z-statistic (about 1.854) is greater than the critical value (about 1.645), it falls in the right-tail rejection region, so we reject the null hypothesis.

**Example 8:** A company claims that at least 85% of its employees are satisfied with their job. In a survey of 400 employees, 320 indicated they were satisfied.

* Sample size $(n)$ = 400
* Observed successes $(x)$ = 320
* Hypothesized proportion $(p_0)$ = 0.85

**Assumptions**
* $np_0 = 400 \times 0.85 = 340$
* $n(1-p_0) = 400 \times 0.15 = 60$

Since both values are at least 5, the normal-approximation condition is satisfied.

Testing the null hypothesis 

>$H_0:p=0.85$

against the alternate hypothesis

>$H_1:p<0.85$

In [23]:
# Data
n = 400
x = 320
p0 = 0.85

# Z-test calculation
z_stat, p_value = proportions_ztest(count=x, nobs=n, value=p0, alternative='smaller')
p_value

0.006209665325776195

Since the p-value (about 0.006) is less than the significance level $\alpha = 0.05$, we reject the null hypothesis.

In [24]:
#hypothesis using critical value
alpha=0.05
critical_value=stats.norm.ppf(alpha)
critical_value

-1.6448536269514729

In [25]:
z_stat

-2.4999999999999964

Since the z-statistic (about -2.500) is less than the critical value (about -1.645), it falls in the left-tail rejection region, so we reject the null hypothesis.

**Example 9:** A candidate claims that at least 65% of voters support their platform. In a poll of 1,200 voters, 780 expressed support.


* Sample size $(n)$ = 1200
* Observed successes $(x)$ = 780
* Hypothesized proportion $(p_0)$ = 0.65

**Assumptions**
* $np_0 = 1200 \times 0.65 = 780$
* $n(1-p_0) = 1200 \times 0.35 = 420$

Since both values are at least 5, the normal-approximation condition is satisfied.

Testing the null hypothesis 

>$H_0:p=0.65$

against the alternate hypothesis

>$H_1:p<0.65$

In [26]:
# Data
n = 1200
x = 780
p0 = 0.65

# Z-test calculation
z_stat, p_value = proportions_ztest(count=x, nobs=n, value=p0, alternative='smaller')
p_value

0.5

Since the p-value (0.5) is greater than the significance level $\alpha = 0.05$, we fail to reject the null hypothesis.

In [27]:
# Critical value approach
alpha = 0.05
critical_value = stats.norm.ppf(alpha)
critical_value

-1.6448536269514729

In [28]:
z_stat

0.0

Since the z-statistic (0.0) is greater than the critical value (about -1.645), it does not fall in the left-tail rejection region, so we fail to reject the null hypothesis.
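Example 9 is a boundary case: the sample proportion equals the hypothesized proportion exactly, so the statistic is zero and the one-sided p-value is one half. A short worked version, where $\hat{p}$ (the sample proportion) and $\Phi$ (the standard normal CDF) are notation introduced here for illustration:

```latex
\hat{p} = \frac{x}{n} = \frac{780}{1200} = 0.65 = p_0
\quad\Longrightarrow\quad
z = \frac{\hat{p} - p_0}{\sqrt{\hat{p}(1-\hat{p})/n}} = 0,
\qquad
p\text{-value} = \Phi(0) = 0.5
```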

**Example 10:** A university claims that 55% of high school graduates in its region enroll in college. Out of a sample of 800 high school graduates, 420 were found to have enrolled in college.

* Sample size $(n)$ = 800
* Observed successes $(x)$ = 420
* Hypothesized proportion $(p_0)$ = 0.55

**Assumptions**
* $np_0 = 800 \times 0.55 = 440$
* $n(1-p_0) = 800 \times 0.45 = 360$

Since both values are at least 5, the normal-approximation condition is satisfied.

Testing the null hypothesis 

>$H_0:p=0.55$

against the alternate hypothesis

>$H_1:p<0.55$

In [29]:
# Data
n = 800
x = 420
p0 = 0.55

# Z-test calculation
z_stat, p_value = proportions_ztest(count=x, nobs=n, value=p0, alternative='smaller')
p_value

0.07838999925606543

Since the p-value (about 0.078) is greater than the significance level $\alpha = 0.05$, we fail to reject the null hypothesis.

In [30]:
#critical value approach
alpha = 0.05
critical_value = stats.norm.ppf(alpha)
critical_value

-1.6448536269514729

In [31]:
z_stat

-1.4159846508095786

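Since the z-statistic (about -1.416) is greater than the critical value (about -1.645), it does not fall in the left-tail rejection region, so we fail to reject the null hypothesis.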
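As a cross-check, all ten examples can be rerun in one loop. The sketch below uses only the standard library, with the standard error based on the sample proportion (the form that reproduces the z-statistics reported in this notebook). `one_sided_ztest` is a hypothetical helper, not a statsmodels function, and the data tuples are copied from the examples above.

```python
from math import sqrt
from statistics import NormalDist

# (example number, n, x, p0, alternative) as given in Examples 1-10
examples = [
    (1, 500, 280, 0.55, "larger"),
    (2, 1000, 15, 0.02, "smaller"),
    (3, 250, 190, 0.80, "smaller"),
    (4, 600, 25, 0.05, "smaller"),
    (5, 1200, 820, 0.70, "smaller"),
    (6, 500, 180, 0.40, "smaller"),
    (7, 1000, 30, 0.02, "larger"),
    (8, 400, 320, 0.85, "smaller"),
    (9, 1200, 780, 0.65, "smaller"),
    (10, 800, 420, 0.55, "smaller"),
]


def one_sided_ztest(n, x, p0, alternative):
    """One-proportion z-test with the standard error taken from the sample proportion."""
    p_hat = x / n
    z = (p_hat - p0) / sqrt(p_hat * (1 - p_hat) / n)
    cdf = NormalDist().cdf(z)
    # Left tail for 'smaller', right tail for 'larger'
    p_value = cdf if alternative == "smaller" else 1 - cdf
    return z, p_value


alpha = 0.05
rejected = []  # example numbers where H0 is rejected at level alpha
for number, n, x, p0, alternative in examples:
    z, p_value = one_sided_ztest(n, x, p0, alternative)
    if p_value < alpha:
        rejected.append(number)
        decision = "reject H0"
    else:
        decision = "fail to reject H0"
    print(f"Example {number:>2}: z = {z:7.3f}, p = {p_value:.4f} -> {decision}")
```

At $\alpha = 0.05$, only Examples 6, 7, and 8 lead to rejecting the null hypothesis, which matches the p-value decisions recorded above.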