## Matching employment-population ratio data from the CPS with BLS Summary Data

This is a proof of concept that the technique for using the CPS to find the employment to population ratio for one subgroup of the population will match with the BLS summary statistics for the same group/time. Specifically, I look at 2010 and Women aged 25-54, (BLS series ID: LNS12300062)

In [1]:
import pandas as pd

In [2]:
cols = ['year', 'female', 'age', 'educ', 'empl', 'fnlwgt']
year = '2015'

In [3]:
df = pd.read_stata('data/cepr_org_{}.dta'.format(year), columns=cols)

In [4]:
df = df[(df['age'] >= 24) & 
        (df['age'] <=54) &
        (df['female'] == 1)]

In [5]:
emp = df.groupby('empl').sum()['fnlwgt'].ix[1]
pop = df.groupby('empl').sum()['fnlwgt'].sum()
epop = emp/pop

print '{}: Women, age 25-54: {}'.format(year, str(round(epop * 100, 2)))

2015: Women, age 25-54: 70.33


#### Compare with BLS Summary Statistics

In [6]:
import requests
import json
import config # file called config.py with my API key

# BLS API v1 url
url = 'https://api.bls.gov/publicAPI/v1/timeseries/data/'

series = 'LNS12300062'

In [7]:
# get the data returned by the url and series id
r = requests.get('{}{}'.format(url, series))

# Generate pandas dataframe from the data returned
df2 = pd.DataFrame(r.json()['Results']['series'][0]['data'])

In [8]:
round(df2[df2['year'] == year]['value'].astype(float).mean(), 2)

70.33