In [1]:
from IPython.core.display import HTML
HTML("""
<style>
div.text_cell_render { /* Customize text cells */
font-family: 'Times New Roman';
font-size:1.3em;
line-height:1.4em;
padding-left:1.5em;
padding-right:1.5em;
}
</style>
""")

### 1.1 Do first babies arrive late?

<b>Anecdotal Evidence</b> is based on data that is unpublished and usually personal. For example,
<i><center>"My two friends that have given birth recently to their first babies,
BOTH went almost 2 weeks overdue before going into
labour or being induced."</center></i>
Anecdotal evidence usually fails because of a <b>Small number of observations</b>, <b>Selection bias</b> (people who join a discussion of this question might be interested because their first babies were late), <b>Confirmation bias</b> (people who believe the claim might be more likely to contribute examples that confirm it) and <b>Inaccuracy</b> (anecdotes are personal stories that are often misremembered, misrepresented or repeated inaccurately).



### 1.2 A Statistical Approach

Limitations of Anecdotal Evidence can be addressed by using the tools of statistics, which include <b>Data Collection</b>, <b>Descriptive Statistics</b>, <b>Exploratory Data Analysis</b>, <b>Hypothesis Testing</b> and <b>Estimation</b>.

### 1.3 The National Survey of Family Growth (NSFG)

NSFG is a <b>cross-sectional</b> study: it captures a snapshot of a group at a point in time. The alternative is a <b>longitudinal</b> study, which observes a group repeatedly over a period of time. The people who participate in a survey are called <b>respondents</b>. Cross-sectional studies are generally meant to be <b>representative</b>, which means that every member of the target population has an equal chance of participating. The NSFG is not representative in this sense; it is deliberately <b>oversampled</b>, meaning that certain groups are sampled at higher rates than their share of the US population. The drawback of oversampling is that it is harder to draw conclusions about the general population from statistics computed directly on the survey.
<br><br>
<b>Exercise 1.2</b> Download data from the NSFG:

In [2]:
import pandas as pd
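# Column positions are taken from http://greenteapress.com/thinkstats/survey.py
# read_fwf expects each span as a 0-based, half-open (start, end) pair.
# Caveat: the birthwgt_oz span (57, 59) overlaps birthwgt_lb (57, 58), so values
# above 15 in birthwgt_oz are not valid ounce counts. Neither weight column is
# used in the analysis below.
pregnancies = pd.read_fwf("2002FemPreg.dat",
                         names=["caseid", "nbrnaliv", "babysex", "birthwgt_lb",
                                "birthwgt_oz", "prglength", "outcome", "birthord",
                                "agepreg", "finalwgt"],
                         colspecs=[(0, 12), (21, 22), (55, 56), (57, 58), (57, 59),
                                   (274, 276), (276, 277), (278, 279), (283, 285), (422, 439)])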
pregnancies.head()

Unnamed: 0,caseid,nbrnaliv,babysex,birthwgt_lb,birthwgt_oz,prglength,outcome,birthord,agepreg,finalwgt
0,1,1.0,1.0,8.0,81.0,39,1,1.0,33.0,6448.271112
1,1,1.0,2.0,7.0,71.0,39,1,2.0,39.0,6448.271112
2,2,3.0,1.0,9.0,9.0,39,1,1.0,14.0,12999.542264
3,2,1.0,2.0,7.0,7.0,39,1,2.0,17.0,12999.542264
4,2,1.0,2.0,6.0,6.0,39,1,3.0,18.0,12999.542264
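Field layouts written in the style of survey.py give each field as a start and end column. Assuming those positions are 1-based and inclusive (as survey.py appears to use them), they have to be shifted before they can serve as 0-based, half-open spans such as the `colspecs` passed to `read_fwf`. The sketch below shows that conversion on a made-up record; the field names, positions and the helpers `to_colspecs` and `parse_line` are hypothetical and are not part of survey.py or pandas.

```python
# Hypothetical layout: (name, start, end), with start and end 1-based and inclusive.
FIELDS = [
    ("caseid", 1, 4),
    ("weight_lb", 5, 6),
    ("weight_oz", 7, 8),
]


def to_colspecs(fields):
    """Convert 1-based inclusive (start, end) pairs to 0-based half-open spans."""
    return [(start - 1, end) for _name, start, end in fields]


def parse_line(line, fields):
    """Slice one fixed-width record into a dict of integers (None if blank)."""
    record = {}
    for (name, _start, _end), (lo, hi) in zip(fields, to_colspecs(fields)):
        text = line[lo:hi].strip()
        record[name] = int(text) if text else None
    return record


colspecs = to_colspecs(FIELDS)
row = parse_line("  17 813", FIELDS)  # made-up record, not NSFG data
print(colspecs)
print(row)
```

With this convention, adjacent fields produce spans that touch but never overlap, which makes an off-by-one slip in a hand-written span easier to spot.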


The fields used here are described below:

| caseid | prglength | outcome | birthord | finalwgt |
| --- | --- | --- | --- | --- |
| Integer ID of the respondent | Integer duration of the pregnancy in weeks | Integer code for the outcome of the pregnancy; 1 indicates a live birth | Integer birth order of each live birth; the code for a first child is 1 | Number of people in the US population this respondent represents |
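Because the NSFG oversamples some groups, an unweighted average over respondents need not match the average for the population. One common way to account for this is to weight each record by `finalwgt`. The following is a minimal sketch with made-up numbers (not NSFG values); `weighted_mean` is a hypothetical helper written for illustration.

```python
def weighted_mean(values, weights):
    """Mean of values where each value counts in proportion to its weight."""
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


# Made-up pregnancy lengths (weeks) and sampling weights, for illustration only.
# The last two records stand in for an oversampled group with small weights.
lengths = [39, 40, 36, 37]
weights = [6000.0, 6000.0, 1000.0, 1000.0]

unweighted = sum(lengths) / len(lengths)
weighted = weighted_mean(lengths, weights)
print(unweighted, weighted)
```

In this toy example the oversampled records pull the unweighted mean down, while the weighted mean gives them less influence. With NumPy, `numpy.average(values, weights=weights)` computes the same quantity.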

<b>Exercise 1.3</b> Explore the data in the Pregnancies table. Count the number of live births, then compute the average pregnancy length (in weeks) of live births for first babies and for others.

In [3]:
pregnancies.describe()

Unnamed: 0,caseid,nbrnaliv,babysex,birthwgt_lb,birthwgt_oz,prglength,outcome,birthord,agepreg,finalwgt
count,13593.0,9148.0,9144.0,9144.0,9144.0,13593.0,13593.0,9148.0,13241.0,13593.0
mean,6216.526595,1.025907,1.494532,6.653762,26.249453,29.531229,1.763996,1.824552,24.230949,8196.42228
std,3645.417341,0.252864,0.515295,1.588809,29.371073,13.802523,1.31593,1.037053,5.824302,9325.918114
min,1.0,1.0,1.0,0.0,0.0,0.0,1.0,0.0,10.0,118.65679
25%,3022.0,1.0,1.0,6.0,7.0,13.0,1.0,1.0,20.0,3841.375308
50%,6161.0,1.0,1.0,7.0,8.0,39.0,1.0,2.0,23.0,6256.592133
75%,9423.0,1.0,2.0,8.0,61.0,39.0,2.0,2.0,28.0,9432.360931
max,12571.0,9.0,9.0,9.0,99.0,50.0,6.0,9.0,44.0,261879.953864


In [4]:
# Keep only pregnancies that ended in a live birth (outcome code 1).
live_births = pregnancies[pregnancies['outcome'] == 1]
print("Number of live births is: " + str(live_births.shape[0]))
# Split live births into first babies (birthord == 1) and all others,
# then compare the mean pregnancy length in weeks.
mean_first = live_births[live_births['birthord'] == 1]['prglength'].mean()
mean_other = live_births[live_births['birthord'] != 1]['prglength'].mean()
print("Mean Pregnancy length for live births of first babies is: " + str(mean_first))
print("Mean Pregnancy length for live births of other babies is: " + str(mean_other))
print("Difference in Mean Pregnancy length for first and other babies is : " + str(mean_first - mean_other))

Number of live births is: 9148
Mean Pregnancy length for live births of first babies is: 38.6009517335
Mean Pregnancy length for live births of other babies is: 38.5229144667
Difference in Mean Pregnancy length for first and other babies is : 0.0780372667775
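The difference printed above is expressed in weeks. To get a sense of scale, it can be converted to hours; the sketch below simply multiplies the printed value by the number of hours in a week.

```python
HOURS_PER_WEEK = 7 * 24  # 168 hours in a week

# Difference in mean pregnancy length (weeks), copied from the output above.
diff_weeks = 0.0780372667775

diff_hours = diff_weeks * HOURS_PER_WEEK
print(round(diff_hours, 2))  # about 13.11
```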


### 1.5 Significance

From the above analysis, the mean pregnancy length of first babies exceeds that of other babies by about 0.078 weeks, or <b>13.11 hours</b>. A difference like this is called an <b>apparent effect</b>: there might be something going on, but we are not sure yet. An effect is <b>statistically significant</b> only if it is unlikely to have occurred by chance, so if a difference of this size could plausibly arise by chance, we cannot conclude that it is significant. An apparent effect that is caused by bias, measurement error, or some other kind of error is called an <b>artifact</b>.
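One way to ask whether a difference in means could have occurred by chance is a permutation test: pool the two groups, shuffle the labels many times, and see how often the shuffled difference is at least as large as the observed one. The sketch below is only an illustration of that idea on synthetic data, not the NSFG records and not necessarily the test this analysis would use; `permutation_p_value` and its parameters are hypothetical names.

```python
import random


def mean(xs):
    return sum(xs) / len(xs)


def permutation_p_value(group_a, group_b, iterations=1000, seed=0):
    """Fraction of label shuffles whose absolute difference in means is at
    least as large as the observed absolute difference."""
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    count = 0
    for _ in range(iterations):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            count += 1
    return count / iterations


# Synthetic pregnancy lengths in weeks (assumed values, NOT the NSFG data).
data_rng = random.Random(1)
first = [round(data_rng.gauss(38.6, 2.7)) for _ in range(200)]
others = [round(data_rng.gauss(38.5, 2.7)) for _ in range(200)]

p_value = permutation_p_value(first, others)
print(p_value)
```

A small result suggests the observed difference would rarely arise from shuffling alone; a large one suggests it is compatible with chance. The threshold used to call a result significant is a convention and is not fixed by the code.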