#### Data Acquisition Notebook

Data for this project were retrieved from: https://www.cdc.gov/nchs/data_access/vitalstatsonline.htm

These are the CDC's birth data for 2019, provided as a 5 GB text file containing only the raw data.

The following notebook imports the data into a pandas dataframe using the read_fwf function, specifying the columns with the names and positional ranges found in the data documentation, which can be found here:
ftp://ftp.cdc.gov/pub/Health_Statistics/NCHS/Dataset_Documentation/DVS/natality/UserGuide2019-508.pdf

It then saves the data as a readable CSV, from which I will extract the variables I am interested in.
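
As a minimal sketch of the fixed-width import described above, the example below parses a made-up three-column sample with read_fwf. The sample text, the `demo_` names, and the positions are hypothetical and chosen only for illustration; the real positions come from the CDC user guide.

```python
import io
import pandas as pd

# Hypothetical fixed-width sample: year (chars 0-3), month (4-5), mother's age (6-7).
sample = (
    "20190129\n"
    "20190140\n"
    "20191225\n"
)

# colspecs are zero-indexed, half-open [start, end) character intervals.
demo_colspecs = [(0, 4), (4, 6), (6, 8)]
demo_names = ["DOB_YY", "DOB_MM", "MAGER"]

# Passing names means the first line is treated as data, not as a header.
demo = pd.read_fwf(io.StringIO(sample), colspecs=demo_colspecs, names=demo_names)
print(demo)
```

Each tuple selects one slice of every line, so the order of the tuples must match the order of the names.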

Because of the size of the data file, this portion of the processing took a significant amount of time. I recommend using the processed CSV saved in the GitHub repo, here, to run the subsequent notebooks, rather than downloading and processing the raw data yourself.
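
For anyone who does want to process the raw file, one possible way to limit memory use is to read it in chunks and append each chunk to the output. This is a sketch under assumptions, not the method used in this notebook: the in-memory sample, the `keep` list, and the chunk size are hypothetical stand-ins.

```python
import io
import pandas as pd

# Hypothetical stand-in for the large raw text file.
raw = io.StringIO("20190129\n20190140\n20191225\n20190633\n20190731\n")
specs = [(0, 4), (4, 6), (6, 8)]
names = ["DOB_YY", "DOB_MM", "MAGER"]
keep = ["MAGER"]  # hypothetical subset of columns to retain

out = io.StringIO()  # for real data this would be a file opened for writing

# With chunksize set, read_fwf returns an iterator of DataFrames.
reader = pd.read_fwf(raw, colspecs=specs, names=names, chunksize=2)
for i, chunk in enumerate(reader):
    # Write the header only with the first chunk, then append rows.
    chunk[keep].to_csv(out, header=(i == 0), index=False)

result = out.getvalue()
print(result)
```

Only one chunk is held in memory at a time, at the cost of a slightly more involved write step.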

The variables selected for the final dataframe to be used for this project exclude flags, such as those for imputed data and data collected and include maternal and paternal demographics, pregnancy characteristics, pregnancy outcomes, and the infant outcomes. These are the variables I hope to analyze for clusters.

I will also exclude duplicate information, such as date of birth, while keeping age. 

In [1]:
import pandas as pd
import time

In [2]:
Column_names = ['DOB_YY',
                'DOB_MM',
               'DOB_TT', 
                'DOB_WK',
               'BFACIL', 
                'F_FACILITY',
               'BFACIL3',
               'MAGE_IMPFLG',
                'MAGE_REPFLG',
                'MAGER',
                'MAGER14',
                'MAGER9',
                'MBSTATE_REC',
               'RESTATUS', 
                'MRACE31',
                'MRACE6',
                'MRACE15',
                'MBRACE',
               'MRACEIMP',
                'MHISPX',
                'MHISP_R',
                'F_MHISP',
                'MRACEHISP',
               'MAR_P',
                'DMAR',
                'MAR_IMP',
               'F_MAR_P',
                'MEDUC',
               'F_MEDUC',
               'FAGERPT_FLG',
                'FAGECOMB',
                'FAGEREC11',
                'FRACE31',
                'FRACE6', 
                'FRACE15',
               'FHISPX',
                'FHISP_R',
                'F_FHISP',
                'FRACEHISP',
                'FEDUC',
               'f_FEDUC', 
               'PRIORLIVE',
                'PRIORDEAD',
                'PRIORTERM',
               'LBO_REC',
               'TBO_REC',
               'ILLB_R', 
                'ILLB_R11',
               'ILOP_R',
                'ILOP_R11',
               'ILP_R',
                'ILP_R11',
               'PRECARE',
                'F_MPCB',
                'PRECARE5',
               'PREVIS',
                'PREVIS_REC',
                'F_TPCV',
                'WIC',
                'F_WIC',
                'CIG_0',
                'CIG_1',
                'CIG_2',
                'CIG_3',
                'CIG0_R',
                'CIG1_R',
                'CIG2_R',
                'CIG3_R',
                'F_CIGS_0',
                'F_CIGS_1',
                'F_CIGS_2',
                'F_CIGS_3',
                'CIG_REC',
                'F_TOBACO',
                'M_Ht_In',
                'F_M_HT',
                'BMI',
                'BMI_R',
                'PWgt_R',
                'F_PWGT',
                'DWgt_R',
                'F_DWGT',
                'WTGAIN',
                'WTGAIN_REC',
                'F_WTGAIN',
                'RF_PDIAB',
                'RF_GDIAB',
                'RF_PHYPE',
                'RF_GHYPE',
                'RF_EHYPE',
                'RF_PPTERM',
                'F_RF_PDIAB',
                'F_RF_GDIAB',
                'F_RF_PHYPER',
                'F_RF_GHYPER',
                'F_RF_ECLAMP',
                'F_RF_PPB',
                'RF_INFTR',
                'RF_FEDRG',
                'RF_ARTEC',
                'f_RF_INFT',
                'F_RF_INF_DRG',
                'F_RF_INF_ART',
                'RF_CESAR',
                'RF_CESARN',
                'F_RF_CESAR',
                'F_RF_NCESAR',
                'NO_RISKS',
                'IP_GON',
                'IP_SYPH',
                'IP_CHLAM',
                'IP_HEPB',
                'IP_HEPC',
                'F_IP_GONOR',
                'F_IP_SYPH',
                'F_IP_CHLAM',
                'F_IP_HEPATB',
                'F_IP_HEPATC',
                'NO_INFEC',
                'OB_ECVS',
                'OB_ECVF',
                'F_OB_SUCC',
                'F_OB_FAIL',
                'LD_INDL',
                'LD_AUGM',
                'LD_STER',
                'LD_ANTB',
                'LD_CHOR',
                'LD_ANES',
                'F_LD_INDL',
                'F_LD_AUGM',
                'F_LD_STER',
                'F_LD_ANTB',
                'F_LD_CHOR',
                'F_LD_ANES',
                'NO_LBRDLV',
                'ME_PRES',
                'ME_ROUT',
                'ME_TRIAL',
                'F_ME_PRES',
                'F_ME_ROUT',
                'F_ME_TRIAL',
                'RDMETH_REC',
                'DMETH_REC',
                'F_DMETH_REC',
                'MM_MTR',
                'MM_PLAC',
                'MM_RUPT',
                'MM_UHYST',
                'MM_AICU',
                'F_MM_MTR',
                'F_MM_PLAC',
                'F_MM_RUPT',
                'F_MM_UHYST',
                'F_MM_AICU',
                'NO_MMORB',
                'ATTEND',
                'MTRAN',
                'PAY',
                'PAY_REC',
                'F_PAY',
                'F_PAY_REC',
                'APGAR5',
                'APGAR5R',
                'F_APGAR5',
                'APGAR10',
                'APGAR10R',
                'DPLURAL',
                'IMP_PLUR',
                'SETORDER_R',
                'SEX',
                'IMP_SEX',
                'DLMP_MM',
                'DLMP_YY',
                'COMPGST_IMP',
                'OBGEST_FLG',
                'COMBGEST',
                'GESTREC10',
                'GESTREC3',
                'LMPUSED',
                'OEGest_Comb',
                'OEGest_R10',
                'OEGest_R3',
                'DBWT',
                'BWTR12',
                'BWTR4',
                'AB_AVEN1',
                'AB_AVEN6',
                'AB_NICU',
                'AB_SURF',
                'AB_ANTI',
                'AB_SEIZ',
                'F_AB_VENT',
                'F_AB_VENT6',
                'F_AB_NIUC',
                'F_AB_SURFAC',
                'F_AB_ANTIBIO',
                'F_AB_SEIZ',
                'NO_ABNORM',
                'CA_ANEN',
                'CA_MNSB',
                'CA_CCHD',
                'CA_CDH',
                'CA_OMPH',
                'CA_GAST',
                'F_CA_ANEN',
                'F_CA_MENIN',
                'F_CA_HEART',
                'F_CA_HERNIA',
                'F_CA_OMPHA',
                'F_CA_GASTRO',
                'CA_LIMB',
                'CA_CLEFT',
                'CA_CLPAL',
                'CA_DOWN',
                'CA_DISOR',
                'CA_HYPO',
                'F_CA_LIMB',
                'F_CA_CLEFTLP',
                'F_CA_CLEFT',
                'F_CA_DOWNS',
                'F_CA_CHROM',
                'F_CA_HYPOS',
                'NO_CONGEN',
                'ITRAN',
                'ILIVE',
                'BFED',
                'F_BFED'
               ]


In [3]:
Widths = [10,
          2,
          8,
          1,
          9,
          1,
          17,
          13,
          1,
          2,
          2,
          1,
          5,
          20,
          2,
          2,
          2,
          1,
          1,
          1,
          3,
          1,
          1,
          2,
          1,
          1,
          2,
          1,
          2,
          16,
          6,
          2,
          2,
          1,
          2,
          4,
          1,
          1,
          1,
          1,
          2,
          7,
          2,
          2,
          3,
          3,
          18,
          2,
          6,
          2,
          6,
          2,
          7,
          1,
          1,
          12,
          4,
          1,
          7,
          1,
          2,
          2,
          2,
          2,
          1,
          1,
          1,
          1,
          1,
          1,
          1,
          1,
          1,
          1,
          11,
          1,
          4,
          1,
          7,
          1,
          6,
          2,
          2,
          1,
          1,
          6,
          1,
          1,
          1,
          1,
          1,
          1,
          1,
          1,
          1,
          1,
          1,
          1,
          1,
          1,
          1,
          1,
          1,
          1,
          2,
          2,
          1,
          1,
          6,
          1,
          1,
          1,
          1,
          1,
          1,
          1,
          1,
          1,
          1,
          7,
          1,
          2,
          1,
          18,
          1,
          1,
          1,
          1,
          1,
          1,
          1,
          1,
          1,
          1,
          1,
          1,
          6,
          1,
          1,
          1,
          1,
          1,
          1,
          1,
          1,
          6,
          1,
          1,
          1,
          1,
          2,
          1,
          1,
          1,
          1,
          2,
          6,
          1,
          1,
          1,
          1,
          1,
          7,
          1,
          2,
          1,
          4,
          2,
          3,
          16,
          1,
          2,
          6,
          4,
          1,
          2,
          2,
          1,
          4,
          2,
          2,
          1,
          4,
          3,
          1,
          6,
          1,
          1,
          1,
          1,
          1,
          2,
          1,
          1,
          1,
          1,
          1,
          2,
          6,
          1,
          1,
          1,
          1,
          1,
          1,
          1,
          1,
          1,
          1,
          1,
          1,
          1,
          1,
          1,
          1,
          1,
          1,
          1,
          1,
          1,
          1,
          1,
          1,
          6,
          1,
          1,
          1,
         ]
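
When the fields are contiguous, a list of widths and a list of (start, end) intervals describe the same layout. The sketch below shows the conversion on a made-up list of widths; the helper name `widths_to_colspecs` is hypothetical. read_fwf also accepts a `widths` argument directly in place of `colspecs` for contiguous fields.

```python
from itertools import accumulate

def widths_to_colspecs(widths):
    """Turn contiguous field widths into half-open (start, end) intervals."""
    ends = list(accumulate(widths))   # running total gives each field's end
    starts = [0] + ends[:-1]          # each field starts where the previous ended
    return list(zip(starts, ends))

demo_specs = widths_to_colspecs([4, 2, 2, 1])
print(demo_specs)  # [(0, 4), (4, 6), (6, 8), (8, 9)]
```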

In [4]:
colspecs = [(0,12),
           (12,14),
           (14,22),
           (22,23),
           (23,32),
            (32,48),
            (48,50),
            (50,73),
            (73,74),
            (74,76),
            (76,78),
            (78,80),
            (80,84),
            (84,104),
            (104,106),
            (106,107),
            (107,109),
            (109,110),
            (110,111),
            (111,112),
            (112,115),
            (115,116),
            (116,117),
            (117,119),
            (119,120),
            (120,121),
            (121,123),
            (123,124),
            (124,126),
            (126,142),
            (142,148),
            (148,150),
            (150,152),
            (152,153),
            (153,155),
            (155,159),
            (159,160),
            (160,161),
            (161,162),
            (162,163),
            (163,165),
            (165,172),
            (172,174),
            (174,176),
            (176,179),
            (179,182),
            (182,200),
            (200,202),
            (202,208),
            (208,210),
            (210,216),
            (216,218),
            (218,225),
            (225,226),
            (226,227),
            (227,239),
            (239,243),
            (243,245),
            (245,251),
            (251,252),
            (252,254),
            (254,256),
            (256,258),
            (258,260),
            (260,261),
            (261,262),
            (262,263),
            (263,264),
            (264,265),
            (265,266),
            (266,267),
            (267,268),
            (268,269),
            (269,270),
            (270,281),
            (281,282),
            (282,286),
            (286,287),
            (287,294),
            (294,295),
            (295,301),
            (301,303),
            (303,305),
            (305,306),
            (306,307),
            (307,313),
            (313,314),
            (314,315),
            (315,316),
            (316,317),
            (317,318),
            (318,319),
            (319,320),
            (320,321),
            (321,322),
            (322,323),
            (323,324),
            (324,325),
            (325,326),
            (326,327),
            (327,328),
            (328,329),
            (329,330),
            (330,331),
            (331,333),
            (333,335),
            (335,336),
            (336,337),
            (337,343),
            (343,344),
            (344,345),
            (345,346),
            (346,347),
            (347,348),
            (348,349),
            (349,350),
            (350,351),
            (351,352),
            (352,353),
            (353,360),
            (360,361),
            (361,363),
            (363,364),
            (364,383),
            (383,384),
            (384,385),
            (385,386),
            (386,387),
            (387,388),
            (388,389),
            (389,390),
            (390,391),
            (391,392),
            (392,393),
            (393,394),
            (394,395),
            (395,401),
            (401,402),
            (402,403),
            (403,404),
            (404,405),
            (405,406),
            (406,407),
            (407,408),
            (408,409),
            (409,415),
            (415,416),
            (416,417),
            (417,418),
            (418,419),
            (419,421),
            (421,422),
            (422,423),
            (423,424),
            (424,425),
            (425,427),
            (427,433),
            (433,434),
            (434,435),
            (435,436),
            (436,437),
            (437,438),
            (438,445),
            (445,446),
            (446,447),
            (447,449),
            (449,450),
            (450,454),
            (454,456),
            (456,459),
            (459,475),
            (475,476),
            (476,478),
            (478,484),
            (484,488),
            (488,489),
            (489,491),
            (491,493),
            (493,494),
            (494,498),
            (498,500),
            (500,502),
            (502,503),
            (503,507),
            (507,510),
            (510,511),
            (511,517),
            (517,518),
            (518,519),
            (519,520),
            (520,521),
            (521,522),
            (522,524),
            (524,525),
            (525,526),
            (526,527),
            (527,528),
            (528,529),
            (529,531),
            (531,537),
            (537,538),
            (538,539),
            (539,540),
            (540,541),
            (541,542),
            (542,543),
            (543,544),
            (544,545),
            (545,546),
            (546,547),
            (547,548),
            (548,549),
            (549,550),
            (550,551),
            (551,552),
            (552,553),
            (553,554),
            (554,555),
            (555,556),
            (556,557),
            (557,558),
            (558,559),
            (559,560),
            (560,561),
            (561,567),
            (567,568),
            (568,569),
            (569,570)
           ]


In [5]:
len(colspecs)
#len(Column_names)

228
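
A length check confirms that there is one interval per column name, but it does not catch a reversed or overlapping interval. The sketch below adds those checks; `check_colspecs` is a hypothetical helper, and the sample layout is made up.

```python
def check_colspecs(colspecs, names):
    """Return a list of human-readable problems found in a fixed-width layout."""
    problems = []
    if len(colspecs) != len(names):
        problems.append(f"{len(colspecs)} colspecs but {len(names)} names")
    for i, (start, end) in enumerate(colspecs):
        if end <= start:
            problems.append(f"interval {i} is empty or reversed: ({start}, {end})")
        if i > 0 and start < colspecs[i - 1][1]:
            problems.append(f"interval {i} overlaps the previous one")
    return problems

# The third interval ends before it starts, so one problem is reported.
print(check_colspecs([(0, 4), (4, 6), (6, 5)], ["a", "b", "c"]))
```

An empty list means the layout passed all three checks.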

In [6]:
df = pd.read_fwf("C:/Users/15856/Data 602/HW2 data/Nat2019us/Nat2019PublicUS.c20200506.r20200915.txt",colspecs = colspecs, names = Column_names)

In [7]:
pd.set_option('display.max_rows', df.shape[1]+1)

In [8]:
df.head().T

Unnamed: 0,0,1,2,3,4
DOB_YY,2019,2019,2019,2019,2019
DOB_MM,1,1,1,1,1
DOB_TT,1135,1305,800,130,1426
DOB_WK,3,3,3,4,4
BFACIL,1,1,1,1,1
F_FACILITY,1,1,1,1,1
BFACIL3,1,1,1,1,1
MAGE_IMPFLG,,,,,
MAGE_REPFLG,,,,,
MAGER,29,40,30,25,38


In [9]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3757582 entries, 0 to 3757581
Columns: 228 entries, DOB_YY to F_BFED
dtypes: float64(15), int64(158), object(55)
memory usage: 6.4+ GB
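
A frame of this size may not fit comfortably in memory. One option, sketched here on a small made-up frame, is to downcast integer columns and store repeated string codes as categoricals. The column choices below are assumptions for illustration, not a step this notebook performs.

```python
import pandas as pd

# Hypothetical small frame standing in for the full natality data.
demo = pd.DataFrame({
    "MAGER": [29, 40, 30, 25, 38],
    "WIC": ["N", "Y", "N", "N", "U"],
})

int_cols = ["MAGER"]   # columns holding small whole numbers
code_cols = ["WIC"]    # columns holding a few repeated string codes

slim = demo.copy()
for col in int_cols:
    # Pick the smallest integer dtype that can hold the observed values.
    slim[col] = pd.to_numeric(slim[col], downcast="integer")
for col in code_cols:
    # Store each distinct code once and keep small integer references to it.
    slim[col] = slim[col].astype("category")

print(slim.dtypes)
print(slim.memory_usage(deep=True))
```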


In [13]:
#Notes on the dropped data: the data flags are often represented directly in the data
#being reported and are therefore duplicates. There are also multiple unnecessary recodes, which are dropped

Drop_Rows = ['DOB_YY', #duplicated in age
                'DOB_MM',#duplicated in age
               'DOB_TT', #duplicated in age
                'DOB_WK',#duplicated in age
               #'BFACIL', 
                'F_FACILITY',# dropping data flags
               'BFACIL3',#facility recode, keeping the more detailed version
               'MAGE_IMPFLG',# dropping data flags
                'MAGE_REPFLG',# dropping data flags
                #'MAGER',
                'MAGER14',#Age recode, keeping the continuous data
                'MAGER9',#Age recode, keeping the continuous data
                #'MBSTATE_REC',
               #'RESTATUS', 
                'MRACE31',#Dropping the extremely detailed and least detailed recodes in favor of the 15 category recode
                'MRACE6',#Dropping the extremely detailed and least detailed recodes in favor of the 15 category recode
                #'MRACE15',
                'MBRACE',#dropping Bridged race mother since this is used for estimates by the cdc https://www.cdc.gov/nchs/nvss/bridged_race.htm
               'MRACEIMP',# dropping data flags
                #'MHISPX',
                'MHISP_R',#Hispanic origin recode, keeping more detailed info
                'F_MHISP',# dropping data flags
                'MRACEHISP',#duplicate data
               #'MAR_P',
                #'DMAR',
                'MAR_IMP',# dropping data flags
               'F_MAR_P',# dropping data flags
                #'MEDUC',
               'F_MEDUC',# dropping data flags
               'FAGERPT_FLG',# dropping data flags
                #'FAGECOMB',
                'FAGEREC11',#Age recode, keeping the continuous data
                'FRACE31',#dropping race recodes
                'FRACE6', #dropping race recodes
                #'FRACE15',
               #'FHISPX',
                'FHISP_R',#dropping Hispanic origin recode
                'F_FHISP',# dropping data flags
                'FRACEHISP',#duplicate data
                #'FEDUC',
               'f_FEDUC', # dropping data flags
               #'PRIORLIVE',
                #'PRIORDEAD',
                #'PRIORTERM',
               'LBO_REC',#duplicate data
               'TBO_REC',#duplicate data
               #'ILLB_R', 
                'ILLB_R11',#duplicate data
               'ILOP_R',#duplicate data
                'ILOP_R11',#duplicate data
               'ILP_R',#duplicate data
                'ILP_R11',#duplicate data
               #'PRECARE',
                'F_MPCB',# dropping data flags
                'PRECARE5',#duplicate data
               #'PREVIS',
                'PREVIS_REC',#duplicate data recode
                'F_TPCV',# dropping data flags
                #'WIC',
                'F_WIC',# dropping data flags
                #'CIG_0',
                #'CIG_1',
                #'CIG_2',
                #'CIG_3',
                'CIG0_R',#duplicate
                'CIG1_R',#duplicate
                'CIG2_R',#duplicate
                'CIG3_R',#duplicate
                'F_CIGS_0',# dropping data flags
                'F_CIGS_1',# dropping data flags
                'F_CIGS_2',# dropping data flags
                'F_CIGS_3',# dropping data flags
                'CIG_REC',#duplicate
                'F_TOBACO',# dropping data flags
                'M_Ht_In',#using BMI rather than height
                'F_M_HT',#using BMI rather than height
                #'BMI',
                'BMI_R',#duplicate data recode
                'PWgt_R',#using BMI rather than weight
                'F_PWGT',# dropping data flags
                'DWgt_R',#using weight gain rather than delivery weight
                'F_DWGT',# dropping data flags
                #'WTGAIN',
                'WTGAIN_REC',#duplicate recode
                'F_WTGAIN',# dropping data flags
                #'RF_PDIAB',
                #'RF_GDIAB',
                #'RF_PHYPE',
                #'RF_GHYPE',
                #'RF_EHYPE',
                #'RF_PPTERM',
                'F_RF_PDIAB',# dropping data flags
                'F_RF_GDIAB',# dropping data flags
                'F_RF_PHYPER',# dropping data flags
                'F_RF_GHYPER',# dropping data flags
                'F_RF_ECLAMP',# dropping data flags
                'F_RF_PPB',# dropping data flags
                #'RF_INFTR',
                #'RF_FEDRG',
                #'RF_ARTEC',
                'f_RF_INFT',# dropping data flags
                'F_RF_INF_DRG',# dropping data flags
                'F_RF_INF_ART',# dropping data flags
                'RF_CESAR',#duplicate data
                #'RF_CESARN',
                'F_RF_CESAR',# dropping data flags
                'F_RF_NCESAR',# dropping data flags
                'NO_RISKS',#duplicate data
                #'IP_GON',
                #'IP_SYPH',
                #'IP_CHLAM',
                #'IP_HEPB',
                #'IP_HEPC',
                'F_IP_GONOR',# dropping data flags
                'F_IP_SYPH',# dropping data flags
                'F_IP_CHLAM',# dropping data flags
                'F_IP_HEPATB',# dropping data flags
                'F_IP_HEPATC',# dropping data flags
                'NO_INFEC',#duplicate data
                #'OB_ECVS',
                #'OB_ECVF',
                'F_OB_SUCC',# dropping data flags
                'F_OB_FAIL',# dropping data flags
                #'LD_INDL',
                #'LD_AUGM',
                #'LD_STER',
                #'LD_ANTB',
                #'LD_CHOR',
                #'LD_ANES',
                'F_LD_INDL',# dropping data flags
                'F_LD_AUGM',# dropping data flags
                'F_LD_STER',# dropping data flags
                'F_LD_ANTB',# dropping data flags
                'F_LD_CHOR',# dropping data flags
                'F_LD_ANES',# dropping data flags
                'NO_LBRDLV',#duplicate
                #'ME_PRES',
                'ME_ROUT',#duplicate of delivery method
                #'ME_TRIAL',
                'F_ME_PRES',# dropping data flags
                'F_ME_ROUT',# dropping data flags
                'F_ME_TRIAL',# dropping data flags
                #'RDMETH_REC',
                'DMETH_REC',#duplicate
                'F_DMETH_REC',# dropping data flags
                #'MM_MTR',
                #'MM_PLAC',
                #'MM_RUPT',
                #'MM_UHYST',
                #'MM_AICU',
                'F_MM_MTR',# dropping data flags
                'F_MM_PLAC',# dropping data flags
                'F_MM_RUPT',# dropping data flags
                'F_MM_UHYST',# dropping data flags
                'F_MM_AICU',# dropping data flags
                'NO_MMORB',#duplicate
                #'ATTEND',
                #'MTRAN',
                #'PAY', 
                'PAY_REC', #duplicate
                'F_PAY',# dropping data flags
                'F_PAY_REC',# dropping data flags
                #'APGAR5',#determines health of baby after birth
                'APGAR5R',#duplicate
                'F_APGAR5',# dropping data flags
                #'APGAR10',
                'APGAR10R',#duplicate
                #'DPLURAL',
                'IMP_PLUR',# dropping data flags
                'SETORDER_R',#duplicate
                #'SEX',
                'IMP_SEX',# dropping data flags
                'DLMP_MM',#duplicate data
                'DLMP_YY',#duplicate data
                'COMPGST_IMP',# dropping data flags
                'OBGEST_FLG',# dropping data flags
                #'COMBGEST',
                'GESTREC10',#duplicate
                'GESTREC3',#duplicate
                'LMPUSED',# dropping data flags
                'OEGest_Comb',# dropping data flags
                'OEGest_R10',#duplicate
                'OEGest_R3',#duplicate
                #'DBWT',
                'BWTR12',#duplicate
                'BWTR4',#duplicate
                #'AB_AVEN1',
                #'AB_AVEN6',
                #'AB_NICU',
                #'AB_SURF',
                #'AB_ANTI',
                #'AB_SEIZ',
                'F_AB_VENT',# dropping data flags
                'F_AB_VENT6',# dropping data flags
                'F_AB_NIUC',# dropping data flags
                'F_AB_SURFAC',# dropping data flags
                'F_AB_ANTIBIO',# dropping data flags
                'F_AB_SEIZ',# dropping data flags
                'NO_ABNORM',#duplicate data
                #'CA_ANEN',
                #'CA_MNSB',
                #'CA_CCHD',
                #'CA_CDH',
                #'CA_OMPH',
                #'CA_GAST',
                'F_CA_ANEN',# dropping data flags
                'F_CA_MENIN',# dropping data flags
                'F_CA_HEART',# dropping data flags
                'F_CA_HERNIA',# dropping data flags
                'F_CA_OMPHA',# dropping data flags
                'F_CA_GASTRO',# dropping data flags
                #'CA_LIMB',
                #'CA_CLEFT',
                #'CA_CLPAL',
                #'CA_DOWN',
                #'CA_DISOR',
                #'CA_HYPO',
                'F_CA_LIMB',# dropping data flags
                'F_CA_CLEFTLP',# dropping data flags
                'F_CA_CLEFT',# dropping data flags
                'F_CA_DOWNS',# dropping data flags
                'F_CA_CHROM',# dropping data flags
                'F_CA_HYPOS',# dropping data flags
                'NO_CONGEN',# dropping data flags
                #'ITRAN',
                #'ILIVE',
                #'BFED',
                'F_BFED'# dropping data flags
               ]

In [14]:
df.drop(Drop_Rows, axis = 1, inplace=True)
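
A drop call raises a KeyError if any listed name is not in the frame, which is easy to trigger with a long hand-typed list. The sketch below, on a made-up frame, checks for such names first and uses a non-mutating drop; the frame and the `to_drop` list are hypothetical.

```python
import pandas as pd

# Hypothetical small frame with one flag column and two data columns.
demo = pd.DataFrame({
    "MAGER": [29, 40],
    "F_MEDUC": [1, 1],
    "MEDUC": [6, 4],
})
to_drop = ["F_MEDUC", "F_WIC"]  # "F_WIC" is deliberately absent from demo

# Report names that would raise a KeyError in a plain drop call.
missing = [c for c in to_drop if c not in demo.columns]
print("not in frame:", missing)

# Returns a new frame; errors="ignore" skips the names reported above.
trimmed = demo.drop(columns=to_drop, errors="ignore")
print(list(trimmed.columns))  # ['MAGER', 'MEDUC']
```

Returning a new frame leaves the original available for comparison, which can help when re-running cells out of order.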

In [17]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3757582 entries, 0 to 3757581
Data columns (total 87 columns):
 #   Column       Dtype  
---  ------       -----  
 0   BFACIL       int64  
 1   MAGER        int64  
 2   MBSTATE_REC  int64  
 3   RESTATUS     int64  
 4   MRACE15      int64  
 5   MHISPX       int64  
 6   MAR_P        object 
 7   DMAR         float64
 8   MEDUC        int64  
 9   FAGECOMB     int64  
 10  FRACE15      int64  
 11  FHISPX       int64  
 12  FEDUC        int64  
 13  PRIORLIVE    int64  
 14  PRIORDEAD    int64  
 15  PRIORTERM    int64  
 16  ILLB_R       int64  
 17  PRECARE      int64  
 18  PREVIS       int64  
 19  WIC          object 
 20  CIG_0        int64  
 21  CIG_1        int64  
 22  CIG_2        int64  
 23  CIG_3        int64  
 24  BMI          float64
 25  WTGAIN       int64  
 26  RF_PDIAB     object 
 27  RF_GDIAB     object 
 28  RF_PHYPE     object 
 29  RF_GHYPE     object 
 30  RF_EHYPE     object 
 31  RF_PPTERM    object 
 32

In [16]:
df.to_csv('./NatalityData.csv')
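
By default, to_csv also writes the row index as an unnamed first column, which then shows up as `Unnamed: 0` when the file is read back. The sketch below illustrates the difference on a made-up frame, using in-memory buffers instead of files; whether the subsequent notebooks expect that extra column is not something this example assumes.

```python
import io
import pandas as pd

demo = pd.DataFrame({"MAGER": [29, 40], "BMI": [23.1, 30.4]})

# Default: the RangeIndex is written as an unnamed first column.
with_index = io.StringIO()
demo.to_csv(with_index)
with_index.seek(0)
cols_with_index = list(pd.read_csv(with_index).columns)

# index=False writes only the data columns.
no_index = io.StringIO()
demo.to_csv(no_index, index=False)
no_index.seek(0)
cols_no_index = list(pd.read_csv(no_index).columns)

print(cols_with_index)  # ['Unnamed: 0', 'MAGER', 'BMI']
print(cols_no_index)    # ['MAGER', 'BMI']
```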