In [6]:
from scipy import stats
from statistics import NormalDist

To find the z-score of a value $X$, subtract the mean and divide by the standard deviation:

$ z = \frac{X - \mu}{\sigma} $

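Assuming weekly income is normally distributed ($\mu = 1000, \sigma = 100$), what is the area under the normal curve between 840 and 1200? (This is a normal distribution, not the standard normal, which has $\mu = 0, \sigma = 1$; converting to z-scores maps it onto the standard normal.)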

In [38]:
z_manual_1 = (840 - 1000) / 100
z_manual_2 = (1200 - 1000) / 100
z_manual_1, z_manual_2

(-1.6, 2.0)

Using the statistics module of the standard library (NormalDist.zscore, available from Python 3.9) to find the z-scores:

In [23]:
nml_dst_1 = NormalDist(mu=1000, sigma=100).zscore(840)
nml_dst_2 = NormalDist(mu=1000, sigma=100).zscore(1200)
nml_dst_1, nml_dst_2

(-1.6, 2.0)

Thus the interval from 840 to 1200 runs from 1.6 standard deviations below the mean to 2 standard deviations above it.

Using SciPy

The location (loc) keyword specifies the mean. The scale (scale) keyword specifies the standard deviation.

In [41]:
p1 = stats.norm.cdf(x=840, loc=1000, scale=100)
p2 = stats.norm.cdf(x=1200, loc=1000, scale=100)
p1, p2

(np.float64(0.054799291699557974), np.float64(0.9772498680518208))
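
As a cross-check, the same two cumulative probabilities can be sketched without SciPy. The helper name `normal_cdf` is an illustrative assumption, not part of the notebook; it uses the identity $\Phi(z) = \frac{1}{2}\left(1 + \operatorname{erf}(z / \sqrt{2})\right)$ for the standard normal CDF, and is compared with `NormalDist.cdf` from the standard library.

```python
import math
from statistics import NormalDist


def normal_cdf(x, mu, sigma):
    """P(X <= x) for X ~ Normal(mu, sigma), via the error function.

    Hypothetical helper written for illustration only.
    """
    z = (x - mu) / sigma  # standardise first
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


income = NormalDist(mu=1000, sigma=100)

# Closed-form values via erf
p1_erf = normal_cdf(840, 1000, 100)
p2_erf = normal_cdf(1200, 1000, 100)

# Standard-library values for comparison
p1_std = income.cdf(840)
p2_std = income.cdf(1200)

print(p1_erf, p2_erf)
print(p1_std, p2_std)
```

Both approaches should agree with the SciPy values above to within floating-point rounding.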

So here we start from the cumulative probabilities (the values a z-table lists) and work backwards to recover the z-scores...

The PPF (percent point function) is the inverse of the CDF: given a probability p, it returns the exact point where the probability of everything to the left is equal to p. It can be thought of as the percentile function, since the PPF tells us the value at a given percentile of the distribution.

Some of the math is here: https://math.stackexchange.com/questions/3170171/normal-distribution-formula-for-percentile-point-function
I think I'll let SciPy do the work...

Here, we first calculate the cumulative probability p of each value x (840 and 1200), given the mean and standard deviation, using norm.cdf(). norm.cdf() returns the fraction of the area under the normal curve from negative infinity up to x. Then we pass this probability to norm.ppf() to obtain the z-score corresponding to that x. norm.ppf() is the percent point function: called without loc and scale it uses the standard normal distribution (loc=0, scale=1), so it returns the z-value whose lower-tail probability is p.

In [43]:
z_score1 = stats.norm.ppf(p1)
z_score2 = stats.norm.ppf(p2)
z_score1, z_score2

(np.float64(-1.6000000000000003), np.float64(2.0000000000000004))
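
The same CDF-then-inverse round trip can be sketched with the standard library alone, assuming `NormalDist.inv_cdf` as the counterpart of SciPy's `ppf`. The variable names are illustrative. The sketch also shows the difference between inverting on the standard normal (which gives a z-score) and inverting on the income distribution itself (which gives back the original value).

```python
from statistics import NormalDist

income = NormalDist(mu=1000, sigma=100)  # distribution from the text
standard = NormalDist()                  # standard normal: mu=0, sigma=1

# Lower-tail probability of earning at most 840
p_low = income.cdf(840)

# Inverting on the standard normal recovers the z-score (approximately -1.6)
z_back = standard.inv_cdf(p_low)

# Inverting on the income distribution recovers the value (approximately 840)
x_back = income.inv_cdf(p_low)

print(z_back, x_back)
```

The tiny deviations from -1.6 and 840 are floating-point rounding, as in the SciPy output above.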

The area under the curve between 840 and 1200 is the difference of the two cumulative probabilities, p2 - p1:
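
Written out with the z-scores found earlier, and using $\Phi$ for the standard normal CDF (notation introduced here for illustration), the subtraction is:

$ P(840 \le X \le 1200) = \Phi(2.0) - \Phi(-1.6) \approx 0.97725 - 0.05480 = 0.92245 $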

In [46]:
p2 - p1

np.float64(0.9224505763522628)

Therefore the area under the curve between z = -1.6 and z = 2.0 is about 0.9225, i.e. roughly 92% of the distribution lies between 840 and 1200.
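
As a sanity check on the result, the area can also be approximated by integrating the density directly. This is a minimal sketch using a plain trapezoid rule; the helper `trapezoid_area` and the step count are assumptions chosen for illustration, not something the notebook uses.

```python
from statistics import NormalDist

income = NormalDist(mu=1000, sigma=100)


def trapezoid_area(dist, a, b, steps=10_000):
    """Approximate the integral of dist.pdf over [a, b] with the trapezoid rule.

    Hypothetical helper written for illustration only.
    """
    width = (b - a) / steps
    total = 0.5 * (dist.pdf(a) + dist.pdf(b))  # endpoints get half weight
    for i in range(1, steps):
        total += dist.pdf(a + i * width)       # interior points get full weight
    return total * width


area_numeric = trapezoid_area(income, 840, 1200)
area_cdf = income.cdf(1200) - income.cdf(840)

print(area_numeric, area_cdf)
```

The two numbers should agree to several decimal places, which confirms that the CDF difference really is the area under the density between the two incomes.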