# Exercises from Think Stats, 2nd Edition

http://thinkstats2.com

Copyright 2016 Allen B. Downey

MIT License: https://opensource.org/licenses/MIT


## Exercises

**Exercise:** In the BRFSS (see Section 5.4), the distribution of heights is roughly normal with parameters µ = 178 cm and σ = 7.7 cm for men, and µ = 163 cm and σ = 7.3 cm for women.

In order to join Blue Man Group, you have to be male between 5’10” and 6’1” (see http://bluemancasting.com). What percentage of the U.S. male population is in this range? Hint: use `scipy.stats.norm.cdf`.

`scipy.stats` contains objects that represent analytic distributions.

In [10]:
import scipy.stats

For example, <tt>scipy.stats.norm</tt> represents a normal distribution.

In [11]:
mu = 178
sigma = 7.7
dist = scipy.stats.norm(loc=mu, scale=sigma)
type(dist)

scipy.stats._distn_infrastructure.rv_frozen

A "frozen random variable" can compute its mean and standard deviation.

In [12]:
dist.mean(), dist.std()

(178.0, 7.7)

It can also evaluate its CDF. What fraction of men are more than one standard deviation below the mean? About 16%.

In [13]:
dist.cdf(mu-sigma)

0.1586552539314574

What fraction of men are between 5'10" and 6'1"?
### Ans: about 34%

In [14]:
# Solution goes here
# Convert both height limits from feet and inches to centimeters
# (1 in = 2.54 cm), rounded to one decimal place.
h_cm1 = round((5 * 12 + 10) * 2.54, 1)   # 5'10" -> 177.8 cm
h_cm2 = round((6 * 12 + 1) * 2.54, 1)    # 6'1"  -> 185.4 cm

# The fraction of men in the range is the difference between the CDF
# evaluated at the upper limit and at the lower limit.
dist.cdf(h_cm2) - dist.cdf(h_cm1)

0.3420946829459531
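As a cross-check that does not depend on SciPy, the same calculation can be sketched with the standard library's `statistics.NormalDist`. This version skips the rounding step, so its result differs slightly from the one above. The names `men`, `low_cm`, `high_cm`, and `fraction` are illustrative and do not come from the book.

```python
from statistics import NormalDist

# Height distribution for men in the BRFSS, from the exercise text.
mu, sigma = 178, 7.7
men = NormalDist(mu, sigma)

# Convert the limits to centimeters without rounding (1 in = 2.54 cm).
low_cm = (5 * 12 + 10) * 2.54    # 5'10"
high_cm = (6 * 12 + 1) * 2.54    # 6'1"

# Fraction of men whose height falls between the two limits.
fraction = men.cdf(high_cm) - men.cdf(low_cm)
print(f"{fraction:.1%}")
```

The result should still be close to 34%, which suggests the answer is not sensitive to how the limits are rounded.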

**Exercise:** To get a feel for the Pareto distribution, let’s see how different the world would be if the distribution of human height were Pareto. With the parameters xm = 1 m and α = 1.7, we get a distribution with a reasonable minimum, 1 m, and median, 1.5 m.

Plot this distribution. What is the mean human height in Pareto world? What fraction of the population is shorter than the mean? If there are 7 billion people in Pareto world, how many do we expect to be taller than 1 km? How tall do we expect the tallest person to be?

`scipy.stats.pareto` represents a Pareto distribution.  In Pareto world, the distribution of human heights has parameters alpha=1.7 and xmin=1 meter.  So the shortest person is 100 cm tall and the median height is about 150 cm.

In [15]:
alpha = 1.7
xmin = 1       # meter
dist = scipy.stats.pareto(b=alpha, scale=xmin)
dist.median()

1.5034066538560549
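The median can also be checked by hand. Setting the Pareto CDF, `1 - (xmin / x)**alpha`, equal to 0.5 and solving for `x` gives `xmin * 2**(1 / alpha)`. A minimal sketch using only the standard library, where `pareto_median` is a hypothetical helper name:

```python
def pareto_median(alpha, xmin):
    """Median of a Pareto distribution: the x where the CDF equals 0.5."""
    return xmin * 2 ** (1 / alpha)

alpha = 1.7
xmin = 1  # meter

median_hgt = pareto_median(alpha, xmin)
print(median_hgt)
```

The value should agree with `dist.median()` to within floating-point error.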

What is the mean height in Pareto world?
### Ans: about 2.43 m

In [17]:
# Solution goes here
mean_hgt = dist.mean()
mean_hgt

2.428571428571429
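For `alpha > 1`, the Pareto mean has the closed form `alpha * xmin / (alpha - 1)`, which gives 1.7 / 0.7 for these parameters. The sketch below is a standard-library cross-check; `pareto_mean` is a hypothetical helper, and returning infinity for `alpha <= 1` reflects the fact that the mean does not exist as a finite number in that case.

```python
import math

def pareto_mean(alpha, xmin):
    """Mean of a Pareto distribution; infinite when alpha <= 1."""
    if alpha <= 1:
        return math.inf
    return alpha * xmin / (alpha - 1)

alpha = 1.7
xmin = 1  # meter

mean_check = pareto_mean(alpha, xmin)
print(mean_check)
```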

What fraction of people are shorter than the mean?
### Ans: about 78%

In [18]:
# Solution goes here
dist.cdf(mean_hgt)

0.778739697565288

In [25]:
# Survival function (1 - CDF)
dist.sf(mean_hgt)

0.22126030243471195

Out of 7 billion people, how many do we expect to be taller than 1 km?  You could use <tt>dist.cdf</tt> or <tt>dist.sf</tt>.
### Ans: approximately 55,603

In [27]:
# Solution goes here
tot_pop = 7000000000   # total population: 7 billion
n = dist.sf(1000)      # fraction of people taller than 1 km (1000 m)

n1 = tot_pop * n       # expected number of people taller than 1 km
n1

55602.97643069972
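The survival function of a Pareto distribution has the simple form `(xmin / x)**alpha` for `x >= xmin`, so the count above can be reproduced without SciPy. This is a sketch under that assumption; `pareto_sf` and `expected_taller` are illustrative names.

```python
def pareto_sf(x, alpha, xmin):
    """Fraction of the population above x under a Pareto distribution."""
    if x < xmin:
        return 1.0  # everyone is at least xmin tall
    return (xmin / x) ** alpha

alpha = 1.7
xmin = 1                  # meter
tot_pop = 7_000_000_000   # 7 billion people

# Expected number of people taller than 1 km (1000 m).
expected_taller = tot_pop * pareto_sf(1000, alpha, xmin)
print(round(expected_taller))
```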

How tall do we expect the tallest person to be?
### Ans: approximately 618,350 m (about 618 km)

In [33]:
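# Solution goes here
# Scan heights in whole meters and stop at the first height where the
# expected number of taller people drops to 1 or fewer.
# The loop stops at 618350.
tot_pop = 7000000000  # total population: 7 billion

for i in range(1000000):
    if dist.sf(i) * tot_pop <= 1:
        print(i)
        break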

618350


In [37]:
n1 = dist.sf(618349) * tot_pop
n1

1.000001872668398
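The loop above searches for the height at which the expected number of taller people is 1. The same height can be found directly by solving `tot_pop * (xmin / x)**alpha = 1` for `x`, which gives `x = xmin * tot_pop**(1 / alpha)`. A minimal standard-library sketch, with `tallest` as an illustrative name:

```python
alpha = 1.7
xmin = 1                  # meter
tot_pop = 7_000_000_000   # 7 billion people

# Height above which we expect exactly one person.
tallest = xmin * tot_pop ** (1 / alpha)
print(f"{tallest:.0f} m, or about {tallest / 1000:.0f} km")
```

With SciPy, `dist.isf(1 / tot_pop)` should give the same value, since the inverse survival function answers the same question without scanning every whole-meter height.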