In [17]:
from __future__ import print_function, division
import scipy.stats

**Exercise 5.1** In the BRFSS (see Section 5.4), the distribution of heights is
roughly normal with parameters µ = 178 cm and σ = 7.7 cm for men, and
µ = 163 cm and σ = 7.3 cm for women.
In order to join Blue Man Group, you have to be male between 5’10” and
6’1” (see http://bluemancasting.com). What percentage of the U.S. male
population is in this range?

In [2]:
# scipy.stats.norm represents a normal distribution.
mu = 178       # mean male height, cm
sigma = 7.7    # standard deviation, cm
dist = scipy.stats.norm(loc=mu, scale=sigma)
type(dist)

scipy.stats._distn_infrastructure.rv_frozen

In [3]:
#A "frozen random variable" can compute its mean and standard deviation.
dist.mean(), dist.std()

(178.0, 7.7)

In [4]:
# It can also evaluate its CDF. What fraction of men are more than one standard deviation below the mean? About 16%.
dist.cdf(mu-sigma)

0.1586552539314574

In [5]:
# What fraction of men are between 5'10" and 6'1"?
# Solution: about 34%.

low = dist.cdf(177.8)    # 5'10" = 177.8 cm
high = dist.cdf(185.4)   # 6'1" = 185.42 cm, rounded to 185.4
low, high, high-low

(0.48963902786483265, 0.8317337108107857, 0.3420946829459531)

In [8]:
high-low

0.3420946829459531
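
As a cross-check on the result above, the normal CDF can be written in terms of the error function: Φ(z) = ½(1 + erf(z/√2)), where z = (x − µ)/σ. The sketch below uses only the standard library. `normal_cdf` is a hypothetical helper name, not part of scipy, and the cutoffs are the same rounded values (177.8 cm and 185.4 cm) used in the solution.

```python
import math

def normal_cdf(x, mu, sigma):
    """CDF of a normal distribution, computed from the error function."""
    z = (x - mu) / sigma
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

mu, sigma = 178.0, 7.7                 # male height parameters from the exercise, cm
low = normal_cdf(177.8, mu, sigma)     # 5'10"
high = normal_cdf(185.4, mu, sigma)    # 6'1"
fraction = high - low
print("{:.1%} of men fall in the range".format(fraction))
```

This should agree with the scipy result, about 34% of the male population.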

**Exercise 5.2** To get a feel for the Pareto distribution, let’s see how different
the world would be if the distribution of human height were Pareto. With the
parameters xm = 1 m and α = 1.7, we get a distribution with a reasonable
minimum, 1 m, and median, 1.5 m.
Plot this distribution. What is the mean human height in Pareto world?
What fraction of the population is shorter than the mean? If there are 7
billion people in Pareto world, how many do we expect to be taller than 1
km? How tall do we expect the tallest person to be?


In [9]:
# scipy.stats.pareto represents a Pareto distribution.
# In Pareto world, the distribution of human heights has parameters alpha = 1.7 and xmin = 1 meter.
# So the shortest person is 100 cm tall and the median height is about 150 cm.
alpha = 1.7
xmin = 1       # meter
dist = scipy.stats.pareto(b=alpha, scale=xmin)
dist.median()

1.5034066538560549
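
The median can also be checked by hand. The Pareto CDF is 1 − (xm/x)^α for x ≥ xm, and setting it equal to 1/2 gives a median of xm · 2^(1/α). A minimal sketch with the standard library only (the variable names are illustrative):

```python
alpha = 1.7
xmin = 1.0   # meters

# Solve 1 - (xmin / x) ** alpha == 0.5 for x
median = xmin * 2.0 ** (1.0 / alpha)
print(median)
```

The value should match `dist.median()` above, about 1.5 m.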

In [10]:
#What is the mean height in Pareto world?
#Solution

dist.mean()

2.428571428571429

In [11]:
#What fraction of people are shorter than the mean?
#Solution

dist.cdf(dist.mean())

0.778739697565288
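
Both answers follow from closed forms, which makes them easy to verify without scipy. For α > 1 the Pareto mean is α · xm / (α − 1), and the fraction below any height x ≥ xm is 1 − (xm/x)^α. In this sketch `pareto_cdf` is a hypothetical helper, not a scipy function:

```python
alpha = 1.7
xmin = 1.0   # meters

def pareto_cdf(x, alpha, xmin):
    """Fraction of the population with height <= x."""
    if x < xmin:
        return 0.0
    return 1.0 - (xmin / x) ** alpha

mean = alpha * xmin / (alpha - 1.0)          # finite only when alpha > 1
below_mean = pareto_cdf(mean, alpha, xmin)
print(mean, below_mean)
```

Because the distribution is strongly right-skewed, the mean (about 2.43 m) sits well above the median, and roughly 78% of people are shorter than the mean.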

In [13]:
# Out of 7 billion people, how many do we expect to be taller than 1 km?
# Solution

# Option 1: the complement of the CDF
print((1 - dist.cdf(1000)) * 7e9)

# Option 2: the survival function, which avoids the rounding error of 1 - cdf
dist.sf(1000) * 7e9

55602.976430479954


55602.97643069972
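
The two outputs above agree only to about ten significant digits. One likely reason is that `1 - cdf` subtracts two nearly equal numbers, whereas the survival function of a Pareto distribution, (xm/x)^α, can be evaluated directly. The sketch below reproduces both routes with the standard library. `pareto_sf` is a hypothetical helper name.

```python
alpha = 1.7
xmin = 1.0          # meters
population = 7e9

def pareto_sf(x, alpha, xmin):
    """Fraction of the population taller than x (for x >= xmin)."""
    return (xmin / x) ** alpha

# Direct evaluation of the tail probability
taller_direct = pareto_sf(1000.0, alpha, xmin) * population

# Round trip through the CDF, which discards low-order digits
cdf_at_1km = 1.0 - pareto_sf(1000.0, alpha, xmin)
taller_via_cdf = (1.0 - cdf_at_1km) * population

print(taller_direct, taller_via_cdf)
```

Either way, about 55,600 people would be taller than 1 km. For tail probabilities this small, the survival function is generally the safer choice.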

In [14]:
#How tall do we expect the tallest person to be?

In [15]:
# Solution

# One way to solve this is to search for a height that we
# expect one person out of 7 billion to exceed.

# It comes in at roughly 600 kilometers (600,000 m, since heights are in meters).

dist.sf(600000) * 7e9

1.0525455861201714

In [16]:
# Solution

# Another way is to use `ppf`, the "percent point function", which is the
# inverse CDF. It gives the height in meters that corresponds to the
# probability (1 - 1/7e9).

dist.ppf(1 - 1/7e9)

618349.6106759505
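
The same height can be obtained by inverting the survival function by hand: setting (xm/x)^α = 1/N gives x = xm · N^(1/α). Note that this is the height one person in N is expected to exceed, which serves as a rough proxy for the tallest person rather than the exact expected maximum. A minimal sketch, with illustrative variable names:

```python
alpha = 1.7
xmin = 1.0          # meters
population = 7e9

# Height exceeded, in expectation, by exactly one person out of the population
tallest = xmin * population ** (1.0 / alpha)
print(tallest / 1000.0, "km")

# Sanity check: the expected number of people above this height should be 1
expected_count = (xmin / tallest) ** alpha * population
print(expected_count)
```

The result, a bit over 600 km, should agree with the `ppf` value above up to floating-point rounding in `1 - 1/7e9`.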