<a href="https://colab.research.google.com/github/npr99/PlanningMethods/blob/master/Location_Quotient_Using_BLS_QCEW.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Location Quotient Example Using BLS QCEW

The Bureau of Labor Statistics (BLS) provides employment and wage data by industry and year for counties, MSAs, states, and the nation.

An overview of this data is available on the BLS website for the Quarterly Census of Employment and Wages (QCEW):

https://www.bls.gov/cew/

The BLS provides data in "data slices" - for individual states, MSAs and counties.

https://data.bls.gov/cew/doc/access/csv_data_slices.htm

The BLS provides a complete layout of the datasets. For location quotient analysis, the annual averages data is a good choice:

https://data.bls.gov/cew/doc/access/csv_data_slices.htm#ANNUAL_LAYOUT

The BLS provides a complete list of Areas (states, MSAs and counties) - the "area codes" are based on the state and county FIPS codes:

https://data.bls.gov/cew/doc/titles/area/area_titles.htm
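
As a minimal sketch of how county and statewide area codes are composed, assuming the pattern visible in this notebook (a 2-digit state FIPS code followed by a 3-digit county FIPS code, with `000` for a statewide area), the helper below builds the 5-character code. The name `make_area_fips` is hypothetical and is not part of any BLS tool; MSA codes and special codes such as `US000` follow other patterns and are not covered.

```python
def make_area_fips(state_fips, county_fips="000"):
    """Build a 5-character QCEW area code from state and county FIPS codes.

    Hypothetical helper for illustration; county_fips="000" means statewide.
    """
    # Zero-pad so numeric input such as 1 becomes "01"
    state = str(state_fips).zfill(2)
    county = str(county_fips).zfill(3)
    if len(state) != 2 or len(county) != 3:
        raise ValueError("expected a 2-digit state and a 3-digit county FIPS code")
    return state + county

print(make_area_fips("13"))         # Georgia, statewide
print(make_area_fips("13", "051"))  # Chatham County, Georgia
```

Keeping the result as a string matters because codes such as `01000` (Alabama) begin with a zero.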

### Example Area Slice Files

2016 QCEW Annual averages for Georgia
http://data.bls.gov/cew/data/api/2016/a/area/13000.csv

2016 QCEW Annual averages for Chatham County, GA
http://data.bls.gov/cew/data/api/2016/a/area/13051.csv
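
The two example URLs share one pattern, so a slice URL can be built from a year and an area code. This is a small sketch, assuming the pattern holds for other years and areas; the function name `qcew_area_url` is hypothetical, and the `a` path segment is taken from the annual-average examples above.

```python
def qcew_area_url(year, area_fips):
    """Return the annual-average area slice URL for a year and area code."""
    # Pattern copied from the example URLs; "a" is the annual-average segment
    return f"http://data.bls.gov/cew/data/api/{year}/a/area/{area_fips}.csv"

print(qcew_area_url(2016, "13000"))  # Georgia
print(qcew_area_url(2016, "13051"))  # Chatham County, GA
```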

### File Layout
https://www.bls.gov/cew/about-data/downloadable-file-layouts/annual/naics-based-annual-layout.htm

#### Details on codes for different ownership types:
https://www.bls.gov/cew/classifications/ownerships/ownership-titles.htm


Note that employment totals are the sum of the annual average of monthly employment levels for a given year (`annual_avg_emplvl`) across all ownership codes.

### BLS data as time series
Chatham County QCEW Time Series Data
https://data.bls.gov/timeseries/ENU1305110010


## Step 1: Obtain BLS QCEW Data File
The pandas `read_csv` function is a fast way to read .csv data files from a URL directly into the notebook session.

In [1]:
import pandas as pd # For reading, writing and wrangling data

In [2]:
blsqcew_areatitles = pd.read_csv('https://data.bls.gov/cew/doc/titles/area/area_titles.csv')
blsqcew_areatitles.head()

Unnamed: 0,area_fips,area_title
0,US000,U.S. TOTAL
1,USCMS,U.S. Combined Statistical Areas (combined)
2,USMSA,U.S. Metropolitan Statistical Areas (combined)
3,USNMS,U.S. Nonmetropolitan Area Counties (combined)
4,01000,Alabama -- Statewide


In [3]:
# Find Area FIPS code based on county name
blsqcew_areatitles.loc[blsqcew_areatitles['area_title'] == 'Chatham County, Georgia']


Unnamed: 0,area_fips,area_title
486,13051,"Chatham County, Georgia"


In [4]:
# Save Area Title for later use
area_title_df = blsqcew_areatitles.loc[blsqcew_areatitles['area_fips'] == '13051']
area_title = area_title_df['area_title'].values[0]
area_title

'Chatham County, Georgia'

In [5]:
blsqcew = pd.read_csv('http://data.bls.gov/cew/data/api/2016/a/area/13051.csv')
blsqcew.head()

Unnamed: 0,area_fips,own_code,industry_code,agglvl_code,size_code,year,qtr,disclosure_code,annual_avg_estabs,annual_avg_emplvl,total_annual_wages,taxable_annual_wages,annual_contributions,annual_avg_wkly_wage,avg_annual_pay,lq_disclosure_code,lq_annual_avg_estabs,lq_annual_avg_emplvl,lq_total_annual_wages,lq_taxable_annual_wages,lq_annual_contributions,lq_annual_avg_wkly_wage,lq_avg_annual_pay,oty_disclosure_code,oty_annual_avg_estabs_chg,oty_annual_avg_estabs_pct_chg,oty_annual_avg_emplvl_chg,oty_annual_avg_emplvl_pct_chg,oty_total_annual_wages_chg,oty_total_annual_wages_pct_chg,oty_taxable_annual_wages_chg,oty_taxable_annual_wages_pct_chg,oty_annual_contributions_chg,oty_annual_contributions_pct_chg,oty_annual_avg_wkly_wage_chg,oty_annual_avg_wkly_wage_pct_chg,oty_avg_annual_pay_chg,oty_avg_annual_pay_pct_chg
0,13051,0,10,70,0,2016,A,,8654,149090,6613717155,1354478311,22797107,853,44361,,1.0,1.0,1.0,1.0,1.0,1.0,1.0,,277,3.3,3627,2.5,155753462,2.4,40394517,3.1,-1835966,-7.5,-1,-0.1,-35,-0.1
1,13051,1,10,71,0,2016,A,,64,2603,184370024,0,0,1362,70837,,1.19,0.89,0.97,0.0,0.0,1.09,1.09,,0,0.0,32,1.2,2698690,1.5,0,0.0,0,0.0,3,0.2,180,0.3
2,13051,1,102,72,0,2016,A,,64,2603,184370024,0,0,1362,70837,,1.19,0.9,0.99,0.0,0.0,1.09,1.09,,0,0.0,32,1.2,2698690,1.5,0,0.0,0,0.0,3,0.2,180,0.3
3,13051,1,1021,73,0,2016,A,,15,498,30001695,0,0,1159,60275,,0.56,0.7,0.85,0.0,0.0,1.22,1.22,,0,0.0,8,1.6,-509113,-1.7,0,0.0,0,0.0,-38,-3.2,-1971,-3.2
4,13051,1,1023,73,0,2016,A,,2,17,1615844,0,0,1828,95050,,2.81,1.23,1.26,0.0,0.0,1.03,1.03,,0,0.0,-1,-5.6,-58322,-3.5,0,0.0,0,0.0,48,2.7,2469,2.7


## Step 2: Explore Data
Look at descriptive statistics for key variables.

In [6]:
# Explore the Location quotient of annual average employment relative to the U.S. (Rounded to the hundredths place)
blsqcew[['annual_avg_estabs','annual_avg_emplvl','lq_annual_avg_emplvl','lq_total_annual_wages',]].describe()

Unnamed: 0,annual_avg_estabs,annual_avg_emplvl,lq_annual_avg_emplvl,lq_total_annual_wages
count,1703.0,1703.0,1703.0,1703.0
mean,45.806812,702.179683,0.766753,0.771321
std,365.10619,5845.989363,2.602384,2.100285
min,0.0,0.0,0.0,0.0
25%,2.0,0.0,0.0,0.0
50%,5.0,13.0,0.16,0.14
75%,15.0,197.0,0.94,0.95
max,8654.0,149090.0,56.37,35.96
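
For reference, a location quotient compares an industry's share of local employment with the same industry's share of employment in a reference area (here the U.S.). The sketch below assumes the standard formula; the function name and the employment figures are hypothetical and are not taken from the QCEW file, which already supplies precomputed `lq_` columns.

```python
def location_quotient(local_industry, local_total, ref_industry, ref_total):
    """LQ = (local industry share) / (reference-area industry share)."""
    local_share = local_industry / local_total
    ref_share = ref_industry / ref_total
    return local_share / ref_share

# Hypothetical numbers: 500 of 10,000 local jobs vs 2,000 of 100,000 reference jobs
lq = location_quotient(500, 10_000, 2_000, 100_000)
print(round(lq, 2))  # 2.5
```

An LQ above 1 means the industry accounts for a larger share of local employment than of reference-area employment.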


# Look at top Location Quotients by industry

## Full list of industry codes
For a full list of industry codes use the link below:

https://www.bls.gov/cew/classifications/industry/industry-titles.htm


## More details on NAICS
North American Industry Classification System (NAICS) 

https://www.naics.com/search/

## NAICS FAQ
https://www.naics.com/frequently-asked-questions/#NAICSfaq

## Add Industry Titles
The current industry titles were updated in 2017. The list of industries appears to be backward compatible: it includes codes for all previous years as well as the updated codes. The BLS website has a number of tables that help identify new NAICS codes and how they map to older NAICS codes. Most of the changes appear to be at the 4-, 5-, and 6-digit NAICS code levels.

"For detailed information on QCEW establishment, employment, and wage levels for each industry affected by the NAICS 2017 conversion, please refer to this [QCEW 2017 revision table](https://data.bls.gov/cew/apps/bls_naics/naics2017.xls)."


In [7]:
industry_titles_df = pd.read_csv('https://www.bls.gov/cew/classifications/industry/industry-titles-csv.csv')
industry_titles_df.head()

Unnamed: 0,industry_code,industry_title
0,10,"10 Total, all industries"
1,101,101 Goods-producing
2,1011,1011 Natural resources and mining
3,1012,1012 Construction
4,1013,1013 Manufacturing


In [8]:
industry_titles_df.industry_code.describe()

count       2497
unique      2497
top       423840
freq           1
Name: industry_code, dtype: object

In [9]:
blsqcew_titles = pd.merge(left = industry_titles_df,
                          right = blsqcew,
                          left_on = 'industry_code',
                          right_on = 'industry_code',
                          how = 'outer')
blsqcew_titles.head()

Unnamed: 0,industry_code,industry_title,area_fips,own_code,agglvl_code,size_code,year,qtr,disclosure_code,annual_avg_estabs,annual_avg_emplvl,total_annual_wages,taxable_annual_wages,annual_contributions,annual_avg_wkly_wage,avg_annual_pay,lq_disclosure_code,lq_annual_avg_estabs,lq_annual_avg_emplvl,lq_total_annual_wages,lq_taxable_annual_wages,lq_annual_contributions,lq_annual_avg_wkly_wage,lq_avg_annual_pay,oty_disclosure_code,oty_annual_avg_estabs_chg,oty_annual_avg_estabs_pct_chg,oty_annual_avg_emplvl_chg,oty_annual_avg_emplvl_pct_chg,oty_total_annual_wages_chg,oty_total_annual_wages_pct_chg,oty_taxable_annual_wages_chg,oty_taxable_annual_wages_pct_chg,oty_annual_contributions_chg,oty_annual_contributions_pct_chg,oty_annual_avg_wkly_wage_chg,oty_annual_avg_wkly_wage_pct_chg,oty_avg_annual_pay_chg,oty_avg_annual_pay_pct_chg
0,10,"10 Total, all industries",13051.0,0.0,70.0,0.0,2016.0,A,,8654.0,149090.0,6613717000.0,1354478000.0,22797107.0,853.0,44361.0,,1.0,1.0,1.0,1.0,1.0,1.0,1.0,,277.0,3.3,3627.0,2.5,155753462.0,2.4,40394517.0,3.1,-1835966.0,-7.5,-1.0,-0.1,-35.0,-0.1
1,10,"10 Total, all industries",13051.0,1.0,71.0,0.0,2016.0,A,,64.0,2603.0,184370000.0,0.0,0.0,1362.0,70837.0,,1.19,0.89,0.97,0.0,0.0,1.09,1.09,,0.0,0.0,32.0,1.2,2698690.0,1.5,0.0,0.0,0.0,0.0,3.0,0.2,180.0,0.3
2,10,"10 Total, all industries",13051.0,2.0,71.0,0.0,2016.0,A,,49.0,4774.0,256817800.0,718677.0,13493.0,1035.0,53796.0,,0.79,0.99,1.13,0.09,0.77,1.14,1.14,,0.0,0.0,-89.0,-1.8,9650905.0,3.9,320958.0,80.7,8430.0,166.5,57.0,5.8,2965.0,5.8
3,10,"10 Total, all industries",13051.0,3.0,71.0,0.0,2016.0,A,,16.0,10917.0,482220800.0,2746020.0,63775.0,849.0,44173.0,,0.11,0.74,0.82,0.15,0.49,1.1,1.1,,0.0,0.0,797.0,7.9,41172025.0,9.3,104473.0,4.0,-6593.0,-9.4,11.0,1.3,589.0,1.4
4,10,"10 Total, all industries",13051.0,5.0,71.0,0.0,2016.0,A,,8527.0,130796.0,5690309000.0,1351014000.0,22719839.0,837.0,43505.0,,1.02,1.03,1.01,1.02,1.0,0.98,0.98,,279.0,3.4,2887.0,2.3,102231842.0,1.8,39969086.0,3.0,-1837803.0,-7.5,-3.0,-0.4,-183.0,-0.4


In [10]:
blsqcew_titles.industry_code.describe()

count     2651
unique    2497
top         10
freq         5
Name: industry_code, dtype: object

In [11]:
import numpy as np
blsqcew_titles[['industry_code','industry_title','annual_avg_emplvl']].\
loc[blsqcew_titles['annual_avg_emplvl'].isnull()].head()

Unnamed: 0,industry_code,industry_title,annual_avg_emplvl
36,1111,NAICS 1111 Oilseed and grain farming,
37,11111,NAICS 11111 Soybean farming,
38,111110,NAICS 111110 Soybean farming,
39,11112,"NAICS 11112 Oilseed, except soybean, farming",
40,111120,"NAICS 111120 Oilseed, except soybean, farming",
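
The rows with missing employment come from the outer merge, which keeps industry titles that have no data for this area. One way to make that explicit is the `indicator` option of `pd.merge`; the toy frames below are an illustration only (two employment values are copied from the outputs in this notebook, the frame names are hypothetical).

```python
import pandas as pd

# Toy frames: three titles, but data rows for only two industries
titles = pd.DataFrame({"industry_code": ["10", "1111", "48832"],
                       "industry_title": ["Total, all industries",
                                          "Oilseed and grain farming",
                                          "Marine cargo handling"]})
data = pd.DataFrame({"industry_code": ["10", "48832"],
                     "annual_avg_emplvl": [149090, 3581]})

# indicator=True adds a "_merge" column: "both", "left_only", or "right_only"
merged = pd.merge(titles, data, on="industry_code", how="outer", indicator=True)

# "left_only" marks titles with no matching data row
unmatched = merged.loc[merged["_merge"] == "left_only", "industry_code"].tolist()
print(unmatched)
```

Using `how='right'` instead, as the function later in this notebook does, keeps only industries that appear in the area file.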


In [12]:
blsqcew_titles[['industry_title','annual_avg_estabs','annual_avg_emplvl','lq_annual_avg_emplvl','lq_total_annual_wages']].sort_values(by='lq_annual_avg_emplvl', ascending=False).head(10)

Unnamed: 0,industry_title,annual_avg_estabs,annual_avg_emplvl,lq_annual_avg_emplvl,lq_total_annual_wages
1638,NAICS 48832 Marine cargo handling,13.0,3581.0,56.37,35.96
1639,NAICS 488320 Marine cargo handling,13.0,3581.0,56.37,35.96
1635,NAICS 4883 Support activities for water transp...,28.0,3823.0,39.8,26.33
2228,NAICS 621493 Freestanding emergency medical ce...,1.0,124.0,23.32,26.21
1610,NAICS 48711 Scenic and sightseeing transportat...,10.0,243.0,16.52,14.2
1611,NAICS 487110 Scenic and sightseeing transporta...,10.0,243.0,16.52,14.2
1609,NAICS 4871 Scenic and sightseeing transportati...,10.0,243.0,16.52,14.2
2220,NAICS 62142 Outpatient mental health centers,13.0,196.0,13.71,14.1
2222,NAICS 621420 Outpatient mental health centers,13.0,196.0,13.71,14.1
1640,NAICS 48833 Navigational services to shipping,7.0,181.0,10.13,10.45


Look at top industries by employment.

In [14]:
blsqcew_titles[['industry_title','annual_avg_estabs','annual_avg_emplvl','lq_annual_avg_emplvl','lq_total_annual_wages']].sort_values(by='annual_avg_emplvl', ascending=False).head(10)

Unnamed: 0,industry_title,annual_avg_estabs,annual_avg_emplvl,lq_annual_avg_emplvl,lq_total_annual_wages
0,"10 Total, all industries",8654.0,149090.0,1.0,1.0
4,"10 Total, all industries",8527.0,130796.0,1.03,1.01
12,102 Service-providing,7677.0,110314.0,1.05,0.92
15,"1021 Trade, transportation, and utilities",2126.0,35113.0,1.24,1.26
27,1026 Leisure and hospitality,1103.0,23591.0,1.44,1.48
25,1025 Education and health services,901.0,23336.0,1.03,1.29
2374,NAICS 72 Accommodation and food services,975.0,21645.0,1.55,1.72
5,101 Goods-producing,850.0,20483.0,0.93,1.38
2192,NAICS 62 Health care and social assistance,822.0,19895.0,1.0,1.26
1335,NAICS 44-45 Retail trade,1295.0,18771.0,1.13,1.24


## Identify NAICS Codes by Length

In [22]:
blsqcew_titles.loc[:,'NAICS digits'] = 0
blsqcew_titles.loc[(blsqcew_titles['industry_code'].str.len()==2) |
    (blsqcew_titles['industry_code'].str.contains("-")),'NAICS digits'] = 2
blsqcew_titles.loc[(blsqcew_titles['industry_code'].str.len()==3),'NAICS digits'] = 3
blsqcew_titles.loc[(blsqcew_titles['industry_code'].str.len()==4),'NAICS digits'] = 4
# 5 digit NAICS codes should not include the codes with dashes (48-49 is a 2 digit code)
blsqcew_titles.loc[(blsqcew_titles['industry_code'].str.len()==5) & 
                   ~(blsqcew_titles['industry_code'].str.contains("-")),'NAICS digits'] = 5
blsqcew_titles.loc[(blsqcew_titles['industry_code'].str.len()==6),'NAICS digits'] = 6
blsqcew_titles.loc[(blsqcew_titles['industry_code'].str.startswith('10')),'NAICS digits'] = 0
blsqcew_titles.\
pivot_table(values='annual_avg_emplvl',index = 'NAICS digits',aggfunc='sum')

Unnamed: 0_level_0,annual_avg_emplvl
NAICS digits,Unnamed: 1_level_1
0,590733.0
2,143360.0
3,127072.0
4,121894.0
5,108379.0
6,104374.0
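
The same classification rules can be written as one small function, which is easier to test on individual codes. This is a sketch that mirrors the `.loc` assignments above; the name `naics_level` is hypothetical.

```python
def naics_level(industry_code):
    """Return the NAICS digit level of a QCEW industry code (0 for BLS aggregates)."""
    code = str(industry_code)
    # Codes beginning with "10" are BLS aggregate codes, not NAICS codes
    if code.startswith("10"):
        return 0
    # Hyphenated ranges such as "31-33" are 2-digit sectors
    if "-" in code:
        return 2
    # Otherwise the level is the number of characters, for 2 to 6 digits
    if 2 <= len(code) <= 6:
        return len(code)
    return 0

for code in ["10", "1021", "44-45", "62", "4883", "488320"]:
    print(code, naics_level(code))
```

It could then be applied with `blsqcew_titles['industry_code'].map(naics_level)`, assuming the column holds strings as it does here.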


## Add Ownership Code Titles
For each NAICS code, the BLS reports details by ownership, such as a private company or a government agency.
"The QCEW program stopped publishing data for International Government establishments as a separate ownership group after 1994." (BLS) 

For more information see the BLS website:
https://www.bls.gov/cew/classifications/ownerships/ownership-titles.htm 


In [29]:
own_titles_df = pd.read_csv('https://www.bls.gov/cew/classifications/ownerships/ownership-titles-csv.csv')
own_titles_df.head(8)

Unnamed: 0,own_code,own_title
0,0,Total Covered
1,1,Federal Government
2,2,State Government
3,3,Local Government
4,4,International Government
5,5,Private
6,8,Total Government
7,9,Total U.I. Covered (Excludes Federal Government)


In [30]:
blsqcew_titles_own = pd.merge(left = own_titles_df,
                          right = blsqcew_titles,
                          left_on = 'own_code',
                          right_on = 'own_code',
                          how = 'outer')

In [33]:
blsqcew_titles_own[['NAICS digits','industry_title','own_code','own_title','annual_avg_emplvl']].loc[blsqcew_titles_own['NAICS digits']==2].head(18)

Unnamed: 0,NAICS digits,industry_title,own_code,own_title,annual_avg_emplvl
9,2.0,NAICS 44-45 Retail trade,1.0,Federal Government,45.0
14,2.0,NAICS 48-49 Transportation and warehousing,1.0,Federal Government,453.0
23,2.0,NAICS 52 Finance and insurance,1.0,Federal Government,17.0
32,2.0,NAICS 54 Professional and technical services,1.0,Federal Government,414.0
43,2.0,NAICS 62 Health care and social assistance,1.0,Federal Government,254.0
52,2.0,"NAICS 71 Arts, entertainment, and recreation",1.0,Federal Government,18.0
57,2.0,NAICS 92 Public administration,1.0,Federal Government,1403.0
104,2.0,NAICS 48-49 Transportation and warehousing,2.0,State Government,0.0
109,2.0,NAICS 61 Educational services,2.0,State Government,1910.0
117,2.0,NAICS 62 Health care and social assistance,2.0,State Government,894.0


Look at summary data by 2-digit NAICS code and Ownership Code.

In [42]:
blsqcew_2digit = blsqcew_titles_own[['industry_code','industry_title','own_code','annual_avg_estabs','annual_avg_emplvl']].\
loc[(blsqcew_titles_own['NAICS digits']==2)].sort_values(by=['industry_code','own_code'])
varformat = {('annual_avg_emplvl'): "{:,.0f}", ('annual_avg_estabs'):"{:,.0f}"}
blsqcew_2digit.head(8).style\
     .format(varformat)

Unnamed: 0,industry_code,industry_title,own_code,annual_avg_estabs,annual_avg_emplvl
238,11,"NAICS 11 Agriculture, forestry, fishing and hunting",5.0,15,0
270,21,"NAICS 21 Mining, quarrying, and oil and gas extraction",5.0,2,0
276,22,NAICS 22 Utilities,5.0,14,336
288,23,NAICS 23 Construction,5.0,603,5466
377,31-33,NAICS 31-33 Manufacturing,5.0,231,14913
646,42,NAICS 42 Wholesale trade,5.0,391,5442
9,44-45,NAICS 44-45 Retail trade,1.0,1,45
792,44-45,NAICS 44-45 Retail trade,5.0,1295,18771


Total jobs appears to be the total of own code 9 (Total U.I. Covered (Excludes Federal Government)) and own code 1 (Federal Government). 

In [35]:
# Total covered (own_code == 0) must be kept separate from the ownership detail; mixing them leads to double counting
blsqcew_totalcovered = blsqcew_titles_own[['industry_code','industry_title','own_code','annual_avg_estabs','annual_avg_emplvl']].loc[(blsqcew_titles_own['own_code'] ==0)]
blsqcew_totalcovered.head()

Unnamed: 0,industry_code,industry_title,own_code,annual_avg_estabs,annual_avg_emplvl
0,10,"10 Total, all industries",0.0,8654.0,149090.0
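
A quick arithmetic check shows why the total covered row has to be kept apart. The employment values below are copied from the "10 Total, all industries" rows in the merged output shown earlier in this notebook; the dictionary name is hypothetical.

```python
# annual_avg_emplvl by own_code for "10 Total, all industries"
emp_by_own = {0: 149090, 1: 2603, 2: 4774, 3: 10917, 5: 130796}

# Sum the detailed ownership groups (federal, state, local, private)
detail_sum = sum(v for own, v in emp_by_own.items() if own != 0)
print(detail_sum, emp_by_own[0], detail_sum == emp_by_own[0])
```

Because the detailed ownership groups already add up to the own code 0 total, summing all five rows would count every job twice.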


In [43]:
# Inspect NAICS 92 rows to see the levels of government
blsqcew_2digit.loc[(blsqcew_2digit['industry_code'] =='92')]

Unnamed: 0,industry_code,industry_title,own_code,annual_avg_estabs,annual_avg_emplvl
57,92,NAICS 92 Public administration,1.0,39.0,1403.0
135,92,NAICS 92 Public administration,2.0,27.0,1912.0
202,92,NAICS 92 Public administration,3.0,13.0,5346.0


In [44]:
blsqcew_2digit.loc[(blsqcew_2digit['industry_code'] =='92') &
                   (blsqcew_2digit['own_code'] ==1),'industry_title'] = 'NAICS 92 Public administration 1 Federal Government'
blsqcew_2digit.loc[(blsqcew_2digit['industry_code'] =='92') &
                   (blsqcew_2digit['own_code'] ==2),'industry_title'] = 'NAICS 92 Public administration 2 State Government'
blsqcew_2digit.loc[(blsqcew_2digit['industry_code'] =='92') &
                   (blsqcew_2digit['own_code'] ==3),'industry_title'] = 'NAICS 92 Public administration 3 Local Government'

In [45]:
# Check the updated industry titles for NAICS 92
blsqcew_2digit.loc[(blsqcew_2digit['industry_code'] =='92')]

Unnamed: 0,industry_code,industry_title,own_code,annual_avg_estabs,annual_avg_emplvl
57,92,NAICS 92 Public administration 1 Federal Gover...,1.0,39.0,1403.0
135,92,NAICS 92 Public administration 2 State Government,2.0,27.0,1912.0
202,92,NAICS 92 Public administration 3 Local Government,3.0,13.0,5346.0


In [46]:
table1 = blsqcew_2digit.groupby(by=["industry_code","industry_title"]).sum()
table1.reset_index(inplace = True)
table_title = "Two-digit NAICS, "+area_title
varformat = {('annual_avg_emplvl'): "{:,.0f}", ('annual_avg_estabs'):"{:,.0f}"}
table1.style\
     .set_caption(table_title)\
     .format(varformat)

Unnamed: 0,industry_code,industry_title,own_code,annual_avg_estabs,annual_avg_emplvl
0,11,"NAICS 11 Agriculture, forestry, fishing and hunting",5.0,15,0
1,21,"NAICS 21 Mining, quarrying, and oil and gas extraction",5.0,2,0
2,22,NAICS 22 Utilities,5.0,14,336
3,23,NAICS 23 Construction,5.0,603,5466
4,31-33,NAICS 31-33 Manufacturing,5.0,231,14913
5,42,NAICS 42 Wholesale trade,5.0,391,5442
6,44-45,NAICS 44-45 Retail trade,6.0,1296,18816
7,48-49,NAICS 48-49 Transportation and warehousing,8.0,442,11019
8,51,NAICS 51 Information,8.0,94,1557
9,52,NAICS 52 Finance and insurance,6.0,594,3260
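
Note that the `own_code` column in this table is also summed (for example 1 + 5 = 6 for retail trade), which has no meaning for a category code. A possible variation, sketched here with the two retail trade rows from the 2-digit output above, sums only the measure columns; the variable names are hypothetical.

```python
import pandas as pd

# Retail trade rows copied from the 2-digit output above
rows = pd.DataFrame({
    "industry_code": ["44-45", "44-45"],
    "industry_title": ["NAICS 44-45 Retail trade"] * 2,
    "own_code": [1.0, 5.0],
    "annual_avg_estabs": [1.0, 1295.0],
    "annual_avg_emplvl": [45.0, 18771.0],
})

# Sum only the measures; own_code is a category, so adding it up is not meaningful
measures = ["annual_avg_estabs", "annual_avg_emplvl"]
totals = rows.groupby(["industry_code", "industry_title"], as_index=False)[measures].sum()
print(totals)
```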


# Create a function that obtains and cleans BLS data

In [52]:
import pandas as pd # For reading, writing and wrangling data

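def obtain_clean_bls(area_fips: str = "US000", year: str = "2016"):
  """
  Download BLS QCEW annual averages for one area and summarize employment
  by 2-digit NAICS code.

  area_fips: 5-character area code string, e.g. "13051"
  year: 4-digit year as a string, e.g. "2016"
  Returns a pandas Styler with the summary table and writes two CSV files.
  """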
  blsqcew_areatitles = pd.read_csv('https://data.bls.gov/cew/doc/titles/area/area_titles.csv')
  # Save Area Title for later use
  area_title_df = blsqcew_areatitles.loc[blsqcew_areatitles['area_fips'] == area_fips]
  area_title = area_title_df['area_title'].values[0]

  # Obtain data for area
  blsqcew_url = 'http://data.bls.gov/cew/data/api/'+year+'/a/area/'+area_fips+'.csv'
  print('Obtaining BLS QCEW data for ',area_title)
  print('From: ',blsqcew_url)
  blsqcew = pd.read_csv(blsqcew_url)

  # Add industry titles
  industry_titles_df = pd.read_csv('https://www.bls.gov/cew/classifications/industry/industry-titles-csv.csv')
  blsqcew_titles = pd.merge(left = industry_titles_df,
                            right = blsqcew,
                            left_on = 'industry_code',
                            right_on = 'industry_code',
                            how = 'right')
  
  # Identify NAICS code length
  blsqcew_titles.loc[:,'NAICS digits'] = 0
  blsqcew_titles.loc[(blsqcew_titles['industry_code'].str.len()==2) |
      (blsqcew_titles['industry_code'].str.contains("-")),'NAICS digits'] = 2
  blsqcew_titles.loc[(blsqcew_titles['industry_code'].str.len()==3),'NAICS digits'] = 3
  blsqcew_titles.loc[(blsqcew_titles['industry_code'].str.len()==4),'NAICS digits'] = 4
  # 5 digit NAICS codes should not include the codes with dashes (48-49 is a 2 digit code)
  blsqcew_titles.loc[(blsqcew_titles['industry_code'].str.len()==5) & 
                    ~(blsqcew_titles['industry_code'].str.contains("-")),'NAICS digits'] = 5
  blsqcew_titles.loc[(blsqcew_titles['industry_code'].str.len()==6),'NAICS digits'] = 6
  blsqcew_titles.loc[(blsqcew_titles['industry_code'].str.startswith('10')),'NAICS digits'] = 0

  # Replace Industry code for NAICS 92 to get levels of government
  blsqcew_titles.loc[(blsqcew_titles['industry_code'] =='92') &
                   (blsqcew_titles['own_code'] ==1),'industry_code'] = '92 1 Federal Government'
  blsqcew_titles.loc[(blsqcew_titles['industry_code'] =='92') &
                    (blsqcew_titles['own_code'] ==2),'industry_code'] = '92 2 State Government'
  blsqcew_titles.loc[(blsqcew_titles['industry_code'] =='92') &
                    (blsqcew_titles['own_code'] ==3),'industry_code'] = '92 3 Local Government'

  # Look at summary data by 2-digit NAICS code and Ownership Code
  blsqcew_2digit = blsqcew_titles.loc[(blsqcew_titles['NAICS digits']==2)]

  # Total covered (own_code == 0) is kept separately to avoid double counting
  blsqcew_totalcovered = blsqcew_titles.loc[(blsqcew_titles['own_code'] ==0)]

  # Combine total and 2-digit rows (DataFrame.append was removed in pandas 2.0)
  blsqcew_2digit_total = pd.concat([blsqcew_totalcovered, blsqcew_2digit])

  table1 = blsqcew_2digit_total[['industry_code','industry_title','annual_avg_emplvl']].groupby(by=["industry_code","industry_title"]).sum()
  table1.reset_index(inplace = True)
  # Rename columns
  table1 = table1.rename(columns={"annual_avg_emplvl": area_title+" employment", "industry_code": "NAICS code", "industry_title": year+" NAICS title"},)

  # Format table
  table_title = "Two-digit NAICS, "+area_title
  varformat = {(area_title+" employment"): "{:,.0f}"}
  table1_fmt = table1.style\
      .set_caption(table_title)\
      .format(varformat)

  # Save all data
  csv_filepath = 'BLSQCEW_'+year+"_"+area_fips+'.csv'
  savefile = csv_filepath
  blsqcew_titles.to_csv(savefile, index=False)

  # Save 2 digit results as csv
  csv_filepath = 'BLSQCEW_'+year+"_"+area_fips+'2digittotals.csv'
  savefile = csv_filepath
  table1.to_csv(savefile, index=False)

  return table1_fmt

# Run code for a specific area
The BLS provides a complete list of Areas (states, MSAs and counties) - the "area codes" are based on the state and county FIPS codes:

https://data.bls.gov/cew/doc/titles/area/area_titles.htm


In [54]:
obtain_clean_bls(area_fips = "48041", year = "2019")

Obtaining BLS QCEW data for  Brazos County, Texas
From:  http://data.bls.gov/cew/data/api/2019/a/area/48041.csv


Unnamed: 0,NAICS code,2019 NAICS title,"Brazos County, Texas employment"
0,10,"10 Total, all industries",107676
1,11,"NAICS 11 Agriculture, forestry, fishing and hunting",785
2,21,"NAICS 21 Mining, quarrying, and oil and gas extraction",1357
3,22,NAICS 22 Utilities,441
4,23,NAICS 23 Construction,4821
5,31-33,NAICS 31-33 Manufacturing,5158
6,42,NAICS 42 Wholesale trade,2027
7,44-45,NAICS 44-45 Retail trade,10992
8,48-49,NAICS 48-49 Transportation and warehousing,247
9,51,NAICS 51 Information,1342
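
The function also saves CSV files. When reading such a file back, it may help to read the code columns as strings so that leading zeros and hyphenated codes survive. The sketch below uses a small in-memory stand-in for a saved file; the employment numbers are hypothetical, and `01000` is the Alabama statewide code from the area titles list.

```python
import io
import pandas as pd

# Toy stand-in for a saved file (hypothetical employment values)
csv_text = "area_fips,industry_code,annual_avg_emplvl\n01000,10,100\n01000,44-45,20\n"

# Default parsing turns area_fips into an integer and drops the leading zero
default = pd.read_csv(io.StringIO(csv_text))

# Reading the code columns as strings keeps them intact
as_str = pd.read_csv(io.StringIO(csv_text),
                     dtype={"area_fips": str, "industry_code": str})

print(default["area_fips"].iloc[0], as_str["area_fips"].iloc[0])
```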
