# Exercise 3 (The missing women)

For the sake of simplicity, let us assume that the probability of being born a boy, P(B), is the same as the probability of being born a girl, P(G), namely 1/2. Let us also assume that the sexes of different children are stochastically independent, and that there are no multiple births or adoptions.

In this case we would expect the population to be about half male and half female. But is it? According to the Nobel Prize-winning economist Amartya Sen, due to differential mortality, in Europe and North America there are about 105 females for every 100 males. But in other countries the ratio is considerably lower. The number of females per 100 males is 94 in China, 93 in India, 92 in Pakistan, and 84 in Saudi Arabia (which has a large migrant male workforce). These latter countries are sometimes described as having “missing women.” Sen conjectures that this discrepancy is due to the neglect of female children, causing them to have higher mortality rates.

An alternative explanation might be that in some countries, parents prefer to have boys, so they may continue to have children until they have a boy or maybe two boys. Let’s see if this resolves Sen’s problem.
Let $B$ denote the number of boys in a family, and let $G$ denote the number of girls. The total number of children is then $N = B + G$. (These are random variables.) For each of the following parental decision rules, compute these expectations:
\begin{gather}
E[G], \quad E[B], \quad E[N],\\
E[G/N], \quad E[B/N],\quad E[(G-B)/N],\\
E[G]/E[B], \quad E[G/B].
\end{gather}

### Exercise 3.1

1. Parents have exactly one child.

\begin{gather}
E[G]=0.5, \quad E[B]=0.5, \quad E[N]=1,\\
E[G/N]=0.5, \quad E[B/N]=0.5, \quad E[(G-B)/N]=0,\\
E[G]/E[B]=1, \quad E[G/B]=\infty. \square
\end{gather}
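The values for rule 1 can be checked with exact arithmetic. The following is a minimal sketch, not part of the original notebook: the helper names `expect` and `girls_per_boy` are hypothetical, and it assumes the convention that $g/0$ counts as $+\infty$ when $g>0$, which is what makes $E[G/B]$ infinite.

```python
from fractions import Fraction

# Rule 1 (exactly one child): each outcome is (probability, girls, boys).
outcomes = [
    (Fraction(1, 2), 1, 0),  # the only child is a girl
    (Fraction(1, 2), 0, 1),  # the only child is a boy
]

def expect(f):
    """Exact expectation of f(girls, boys) over the listed outcomes."""
    return sum(p * f(g, b) for p, g, b in outcomes)

def girls_per_boy(g, b):
    # Assumed convention: g/0 is treated as +infinity when there are no boys.
    return float("inf") if b == 0 else Fraction(g, b)

e_g = expect(lambda g, b: g)
e_b = expect(lambda g, b: b)
e_n = expect(lambda g, b: g + b)
e_g_over_n = expect(lambda g, b: Fraction(g, g + b))
e_diff_over_n = expect(lambda g, b: Fraction(g - b, g + b))
e_g_over_b = expect(girls_per_boy)

print(e_g, e_b, e_n, e_g_over_n, e_diff_over_n, e_g / e_b, e_g_over_b)
```

The single all-girl outcome has probability 1/2, so its infinite ratio dominates the expectation even though $E[G]/E[B]$ is exactly 1.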

### Exercise 3.2

2. Parents stop having children once they have a boy or two girls, whichever comes first.

Under this rule, parents stop after the first child if it is a boy, and otherwise always stop after the second child (if the second child is a boy, they stop because they have a boy; if it is a girl, they stop because they have two girls). The possible families are therefore B (probability 1/2), GB (probability 1/4), and GG (probability 1/4). With these observations, we calculate the following expectations:
\begin{align}
&E[G] = 0.5\times 0 + 0.25 \times 1 + 0.25 \times 2 = 0.75,\\
&E[B] = 0.5\times 1 + 0.25 \times 1 + 0.25 \times 0 = 0.75,\\
&E[N] = 0.5\times 1 + 0.5\times 2 = 1.5,\\
&E[G/N] = 0.5\times 0/1 + 0.25\times 1/2 + 0.25\times 2/2=3/8,\\
&E[B/N] = 0.5\times 1/1 + 0.25\times 1/2 + 0.25\times 0/2=5/8,\\
&E[(G-B)/N] = 0.5\times (0-1)/1 + 0.25\times (1-1)/2 + 0.25\times (2-0)/2=-2/8,\\
&E[G]/E[B]=1,\\
&E[G/B]=0.5\times 0/1+ 0.25\times 1/1 + 0.25\times 2/0 = \infty. \square
\end{align}

In [21]:
import random

num_experiments = int(1e05)
G = []
B = []
N = []
GN = []
BN = []
GB = []
for experiment in range(num_experiments):
    family = {'Boy': 0, 'Girl': 0}
    # keep having children until the family has a boy or two girls
    while family['Boy']==0 and family['Girl']<2:
        if random.choice(['Boy','Girl'])=='Boy':
            family['Boy']+=1
        else:
            family['Girl']+=1
    G.append(family['Girl'])
    B.append(family['Boy'])
    N.append(family['Boy']+family['Girl'])
    GN.append(family['Girl']/(family['Boy']+family['Girl']))
    BN.append(family['Boy']/(family['Boy']+family['Girl']))
    if family['Boy']==0:
        # G/B is infinite when there are no boys; 1e09 is a large finite stand-in
        GB.append(1e09)
    else:
        GB.append(family['Girl']/family['Boy'])
# G
pe = sum(G)/num_experiments
pt = 0.75
print("Empirical E[G]={0}, Theoretical E[G]={1}".format(round(pe,3),round(pt,3)))
# B
pe = sum(B)/num_experiments
pt = 0.75
print("Empirical E[B]={0}, Theoretical E[B]={1}".format(round(pe,3),round(pt,3)))
# N
pe = sum(N)/num_experiments
pt = 1.5
print("Empirical E[N]={0}, Theoretical E[N]={1}".format(round(pe,3),round(pt,3)))
# G/N
pe = sum(GN)/num_experiments
pt = 3/8
print("Empirical E[G/N]={0}, Theoretical E[G/N]={1}".format(round(pe,3),round(pt,3)))
# B/N
pe = sum(BN)/num_experiments
pt = 5/8
print("Empirical E[B/N]={0}, Theoretical E[B/N]={1}".format(round(pe,3),round(pt,3)))
# G/B
pe = sum(GB)/num_experiments
pt = float('inf')
print("Empirical E[G/B]={0}, Theoretical E[G/B]={1}".format(round(pe,3),round(pt,3)))

Empirical E[G]=0.749, Theoretical E[G]=0.75
Empirical E[B]=0.75, Theoretical E[B]=0.75
Empirical E[N]=1.499, Theoretical E[N]=1.5
Empirical E[G/N]=0.374, Theoretical E[G/N]=0.375
Empirical E[B/N]=0.626, Theoretical E[B/N]=0.625
Empirical E[G/B]=249520000.25, Theoretical E[G/B]=inf
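The simulation above does not report $E[(G-B)/N]$. As a cross-check, the sketch below enumerates every completed family exactly instead of sampling. The function `enumerate_families` and the predicate `rule_2` are hypothetical names introduced here, and the recursion only terminates for rules that bound the family size, as rule 2 does.

```python
from fractions import Fraction

def enumerate_families(stop, girls=0, boys=0, prob=Fraction(1)):
    """Yield (probability, girls, boys) for every completed family.

    `stop(girls, boys)` encodes the parents' stopping rule. Each birth is
    a girl or a boy with probability 1/2, independently of earlier births.
    """
    if stop(girls, boys):
        yield prob, girls, boys
        return
    yield from enumerate_families(stop, girls + 1, boys, prob / 2)
    yield from enumerate_families(stop, girls, boys + 1, prob / 2)

def rule_2(girls, boys):
    # Stop after the first boy or after two girls, whichever comes first.
    return boys >= 1 or girls >= 2

families = list(enumerate_families(rule_2))
e_n = sum(p * (g + b) for p, g, b in families)
e_g_over_n = sum(p * Fraction(g, g + b) for p, g, b in families)
e_diff_over_n = sum(p * Fraction(g - b, g + b) for p, g, b in families)

print(sorted(families))
print(e_n, e_g_over_n, e_diff_over_n)
```

Because the probabilities are kept as `Fraction` values, the results can be compared with the hand calculation without rounding error.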


### Exercise 3.3

3. Parents always have two children.

\begin{align}
&E[G] = 0.25\times 2 + 0.25 \times 1 + 0.25 \times 1 + 0.25 \times 0  = 1,\\
&E[B] = 0.25\times 2 + 0.25 \times 1 + 0.25 \times 1 + 0.25 \times 0  = 1,\\
&E[N] = 2,\\
&E[G/N] = 1/2,\\
&E[B/N] = 1/2,\\
&E[(G-B)/N] = 0.5\times (1-1)/2 + 0.25\times (2-0)/2 + 0.25\times (0-2)/2=0,\\
&E[G]/E[B]=1,\\
&E[G/B]=0.5\times 1/1+ 0.25\times 0/2 + 0.25\times 2/0 = \infty. \square
\end{align}

In [14]:
num_experiments = int(1e05)
G = []
B = []
N = []
GN = []
BN = []
GB = []
for experiment in range(num_experiments):
    family = {'Boy': 0, 'Girl': 0}
    # keep having children until the family has exactly two
    while family['Boy']+family['Girl']<2:
        if random.choice(['Boy','Girl'])=='Boy':
            family['Boy']+=1
        else:
            family['Girl']+=1
    G.append(family['Girl'])
    B.append(family['Boy'])
    N.append(family['Boy']+family['Girl'])
    GN.append(family['Girl']/(family['Boy']+family['Girl']))
    BN.append(family['Boy']/(family['Boy']+family['Girl']))
    if family['Boy']==0:
        # G/B is infinite when there are no boys; 1e09 is a large finite stand-in
        GB.append(1e09)
    else:
        GB.append(family['Girl']/family['Boy'])
# G
pe = sum(G)/num_experiments
pt = 1
print("Empirical E[G]={0}, Theoretical E[G]={1}".format(round(pe,3),round(pt,3)))
# B
pe = sum(B)/num_experiments
pt = 1
print("Empirical E[B]={0}, Theoretical E[B]={1}".format(round(pe,3),round(pt,3)))
# N
pe = sum(N)/num_experiments
pt = 2
print("Empirical E[N]={0}, Theoretical E[N]={1}".format(round(pe,3),round(pt,3)))
# G/N
pe = sum(GN)/num_experiments
pt = 1/2
print("Empirical E[G/N]={0}, Theoretical E[G/N]={1}".format(round(pe,3),round(pt,3)))
# B/N
pe = sum(BN)/num_experiments
pt = 1/2
print("Empirical E[B/N]={0}, Theoretical E[B/N]={1}".format(round(pe,3),round(pt,3)))
# G/B
pe = sum(GB)/num_experiments
pt = float('inf')
print("Empirical E[G/B]={0}, Theoretical E[G/B]={1}".format(round(pe,3),round(pt,3)))

Empirical E[G]=0.997, Theoretical E[G]=1
Empirical E[B]=1.003, Theoretical E[B]=1
Empirical E[N]=2.0, Theoretical E[N]=2
Empirical E[G/N]=0.498, Theoretical E[G/N]=0.5
Empirical E[B/N]=0.502, Theoretical E[B/N]=0.5
Empirical E[G/B]=248590000.499, Theoretical E[G/B]=inf
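Rule 3 is the one case where the family size is fixed, so dividing by $N$ is just dividing by a constant and $E[G/N]=E[G]/E[N]$. The sketch below illustrates this by listing the four equally likely birth orders with `itertools.product`; the variable names are illustrative only.

```python
from fractions import Fraction
from itertools import product

# Rule 3: always two children, so the four birth orders are equally likely.
sequences = list(product("GB", repeat=2))
p = Fraction(1, len(sequences))

e_g = sum(p * seq.count("G") for seq in sequences)
e_n = sum(p * len(seq) for seq in sequences)
e_g_over_n = sum(p * Fraction(seq.count("G"), len(seq)) for seq in sequences)
e_diff_over_n = sum(
    p * Fraction(seq.count("G") - seq.count("B"), len(seq)) for seq in sequences
)

# N is the constant 2, so the expectation of the ratio equals the ratio
# of the expectations here (this fails for rules with a random N).
print(e_g_over_n, e_g / e_n, e_diff_over_n)
```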


### Exercise 3.4

4. Parents have children until they have a boy. Take the idealization that the family has no limit on the number of children. (My own great-great-grandfather had 22 children that survived infancy. Not all of his three wives survived childbirth.)

With probability $1/2$, the family ends with 1 boy. With probability $1/4$, the family ends with 1 girl and 1 boy. More generally, with probability $2^{-i-1}$, the family ends with $i$ girls and $1$ boy. Using this, we perform the calculations below, where $\log$ denotes the natural logarithm:
\begin{align}
&E[G] = \sum_{i=0}^{\infty}2^{-i-1}\times i = 1,\\
&E[B] = 1,\\
&E[N] = E[G+B] = E[G]+E[B]=2,\\
&E[G/N] = \sum_{i=0}^{\infty}2^{-i-1}\times \frac{i}{i+1}=1-\log(2)\approx0.30685,\\
&E[B/N] = \sum_{i=0}^{\infty}2^{-i-1}\times \frac{1}{i+1}=\log(2)\approx0.69315,\\
&E[(G-B)/N] = E[G/N]-E[B/N]=1-\log(2)-\log(2)=1-2\log(2)\approx-0.386,\\
&E[G]/E[B]=1,\\
&E[G/B]=\sum_{i=0}^{\infty}2^{-i-1}\times \frac{i}{1}=1. \square
\end{align}
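The closed forms above can be checked without any randomness by summing the series numerically. This is a small sketch with an arbitrarily chosen truncation point (`cutoff` is an assumption, not a value from the exercise). The omitted tail is negligible because the terms shrink like $2^{-i}$.

```python
from math import log

# Truncate the infinite sums at an arbitrary cutoff; the neglected tail is
# far below double precision because the weights decay geometrically.
cutoff = 200
weights = [2.0 ** -(i + 1) for i in range(cutoff)]  # P(i girls, then a boy)

e_g = sum(w * i for i, w in enumerate(weights))
e_g_over_n = sum(w * i / (i + 1) for i, w in enumerate(weights))
e_b_over_n = sum(w / (i + 1) for i, w in enumerate(weights))

print(e_g, e_g_over_n, 1 - log(2), e_b_over_n, log(2))
```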

In [19]:
from math import log

In [20]:
num_experiments = int(1e05)
G = []
B = []
N = []
GN = []
BN = []
GB = []
for experiment in range(num_experiments):
    family = {'Boy': 0, 'Girl': 0}
    # keep having children until the first boy
    while family['Boy']<1:
        if random.choice(['Boy','Girl'])=='Boy':
            family['Boy']+=1
        else:
            family['Girl']+=1
    G.append(family['Girl'])
    B.append(family['Boy'])
    N.append(family['Boy']+family['Girl'])
    GN.append(family['Girl']/(family['Boy']+family['Girl']))
    BN.append(family['Boy']/(family['Boy']+family['Girl']))
    GB.append(family['Girl']/(family['Boy']))

# G
pe = sum(G)/num_experiments
pt = 1
print("Empirical E[G]={0}, Theoretical E[G]={1}".format(round(pe,3),round(pt,3)))
# B
pe = sum(B)/num_experiments
pt = 1
print("Empirical E[B]={0}, Theoretical E[B]={1}".format(round(pe,3),round(pt,3)))
# N
pe = sum(N)/num_experiments
pt = 2
print("Empirical E[N]={0}, Theoretical E[N]={1}".format(round(pe,3),round(pt,3)))
# G/N
pe = sum(GN)/num_experiments
pt = 1-log(2)
print("Empirical E[G/N]={0}, Theoretical E[G/N]={1}".format(round(pe,3),round(pt,3)))
# B/N
pe = sum(BN)/num_experiments
pt = log(2)
print("Empirical E[B/N]={0}, Theoretical E[B/N]={1}".format(round(pe,3),round(pt,3)))
# G/B
pe = sum(GB)/num_experiments
pt = 1
print("Empirical E[G/B]={0}, Theoretical E[G/B]={1}".format(round(pe,3),round(pt,3)))

Empirical E[G]=0.997, Theoretical E[G]=1
Empirical E[B]=1.0, Theoretical E[B]=1
Empirical E[N]=1.997, Theoretical E[N]=2
Empirical E[G/N]=0.306, Theoretical E[G/N]=0.307
Empirical E[B/N]=0.694, Theoretical E[B/N]=0.693
Empirical E[G/B]=0.997, Theoretical E[G/B]=1


### Exercise 3.5

There is a problem with the way Sen chose to characterize the problem in terms of ratios. What is it? What is a better way to capture the intuition that the population should be about half men and half women?

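One problem is that, in general, $E[G]/E[B]\neq E[G/B]$: the expectation of a ratio is not the ratio of the expectations. For example, $E[G/B]=\infty$ for rules 1 to 3 above, because a family with no boys has positive probability, but that does not mean we should expect infinitely many more girls than boys. A better way to capture the intuition that the population should be about half men and half women is $E[G]/E[B]=1$, which holds under every rule above. $\square$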
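The difference between a per-family average and a population-level ratio can be seen numerically under rule 4. The sketch below is illustrative only: the seed, the sample size, and the names `pooled_boy_share` and `mean_family_boy_share` are choices made here, not part of the exercise. It pools all children across families and compares the pooled share of boys with the average of the per-family shares.

```python
import random

rng = random.Random(0)  # fixed seed (arbitrary) so the run is repeatable
num_families = 100_000

total_girls = 0
total_boys = 0
family_boy_shares = []
for _ in range(num_families):
    # Rule 4: keep having children until the first boy.
    girls = 0
    while rng.random() < 0.5:  # each birth is a girl with probability 1/2
        girls += 1
    boys = 1
    total_girls += girls
    total_boys += boys
    family_boy_shares.append(boys / (girls + boys))

# Share of boys among all children, which estimates E[B]/E[N].
pooled_boy_share = total_boys / (total_girls + total_boys)
# Average of the per-family shares, which estimates E[B/N].
mean_family_boy_share = sum(family_boy_shares) / num_families

print(round(pooled_boy_share, 3), round(mean_family_boy_share, 3))
```

The pooled share should land near 1/2 while the per-family average lands near $\log(2)\approx 0.693$, which is why the pooled quantity matches the "half and half" intuition.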

### Exercise 3.6

The analysis above ignored parents. Does this matter for Sen’s point? We also assumed that all the families observed were complete, but at any given point in time some of the families may not have stopped having children. Under the decision rules above, does this ameliorate the missing women problem?

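Parents do matter. If each family is assumed to include one mother and one father, the family contains $G+1$ females and $B+1$ males, so the ratio $(G+1)/(B+1)$ is always finite and the per-family fractions are pulled toward $1/2$.

Under rules 2 and 4, a family that has not yet stopped having children (e.g. because it still does not have its first boy) has no boys so far, so its current value of $B/N$ is $0$. Including such incomplete families in a snapshot therefore lowers the average $B/N$ relative to the complete-family values above, and each incomplete family's $B/N$ can only increase as it completes. $\square$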
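As a rough check of the incomplete-family question under rule 4, the sketch below uses a hypothetical snapshot model that is not part of the exercise: each family has had between 1 and 4 birth opportunities so far (chosen uniformly), and stops early once it has a boy. The seed and variable names are likewise illustrative.

```python
import random
from math import log

rng = random.Random(1)  # fixed seed (arbitrary) so the run is repeatable
num_families = 100_000
births_so_far_options = (1, 2, 3, 4)  # hypothetical snapshot model

total_girls = 0
total_boys = 0
family_boy_shares = []
for _ in range(num_families):
    opportunities = rng.choice(births_so_far_options)
    girls = 0
    boys = 0
    # Rule 4 cut off at the snapshot: stop at the first boy, or when the
    # family has used up its birth opportunities with girls only.
    while boys == 0 and girls < opportunities:
        if rng.random() < 0.5:
            boys += 1
        else:
            girls += 1
    total_girls += girls
    total_boys += boys
    family_boy_shares.append(boys / (girls + boys))

pooled_boy_share = total_boys / (total_girls + total_boys)
mean_family_boy_share = sum(family_boy_shares) / num_families

print(round(pooled_boy_share, 3), round(mean_family_boy_share, 3), round(log(2), 3))
```

Under this assumed model the pooled share of boys stays near 1/2, since every birth is still a fair coin flip, while the per-family average of $B/N$ falls below the complete-family value $\log(2)$ because families cut off at the snapshot contain only girls.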