# ACS, BLS and O\*NET Titles

* In this notebook, we consolidate occupational titles from the __ACS (American Community Survey)__, the __BLS (Bureau of Labor Statistics)__ and __O\*NET (Occupational Information Network)__ into a single file so that __ACS__ data can be matched with __O\*NET__ data.
* The structures of __ACS__, __BLS__ and __O\*NET__ are quite similar. However, some titles have to be matched manually.
* __BLS-SOC Occupational Titles__ are the framework for both __ACS__ and __O\*NET__. In other words, __O\*NET__ is a more detailed version of __BLS__, while __ACS__ is a more concise one.

* The structure of __BLS-SOC Occupational Titles__

| SOC Code | Occupation Type | Occupation Title |
| :------- | --------------- | ---------------: |
| 11-0000 | Major | Management Occupations |
| ⁞ | ⁞ | ⁞ |
| 11-2000 | Minor | Advertising, Marketing, Promotions, Public Relations, and Sales Managers |
| 11-2020 | Broad | Marketing and Sales Managers |
| 11-2021 | Detailed | Marketing Managers |
| 11-2022 | Detailed | Sales Managers |
| ⁞ | ⁞ | ⁞ |
| ⁞ | ⁞ | ⁞ |

* __O\*NET-SOC Occupational Titles__ consist of the __BLS-SOC Detailed Occupational Titles__. On top of that, __O\*NET__ adds its own, more specific titles to that list: the __O\*NET-Detailed Occupational Titles__.
* In other words, __O\*NET Occupational Titles__ can be split into two categories:
    1. __O\*NET-SOC Occupational Titles__
    2. __O\*NET-Detailed Occupational Titles__

* The __O\*NET-SOC Occupational Titles__ are taken from the __BLS-SOC Occupational Titles__, and their __O\*NET-SOC Occupational Code__ ends with `.00`.
* The __O\*NET-Detailed Occupational Titles__ are the titles that __O\*NET__ adds to the list, and their __O\*NET-SOC Occupational Code__ does *not* end with `.00`.

An Example:

| O\*NET-SOC Code | Title | O\*NET-SOC | O\*NET-Detailed |
| :-------------- | ----- | :--------: | :-------------: |
| 11-3031.00 | Financial Managers | X | |
| 11-3031.01 | Treasurers and Controllers |  | X |	
| 11-3031.02 | Financial Managers, Branch or Department |  | X |
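
The `.00` rule illustrated in the table can be sketched in plain Python. This is only an illustration of the naming convention described above; the helper name `split_onet_codes` is hypothetical and is not used elsewhere in the notebook.

```python
def split_onet_codes(codes):
    """Partition O*NET-SOC codes by whether they end with '.00'.

    Codes ending with '.00' are treated as O*NET-SOC titles (taken from BLS);
    all other codes are treated as O*NET-Detailed titles (added by O*NET).
    """
    soc_level = [code for code in codes if code.endswith('.00')]
    onet_detailed = [code for code in codes if not code.endswith('.00')]
    return soc_level, onet_detailed


# The three codes from the example table above
codes = ['11-3031.00', '11-3031.01', '11-3031.02']
soc_level, onet_detailed = split_onet_codes(codes)
print(soc_level)      # ['11-3031.00']
print(onet_detailed)  # ['11-3031.01', '11-3031.02']
```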

* __ACS Occupational Titles__ consist mainly of __BLS-SOC Occupational Titles__. However, for reasons that __ACS__ does not explain explicitly, some occupational titles are bundled together. Moreover, the structure of __ACS Occupational Titles__ is not consistent over time. It changes roughly every 7 years, which means that adjustments between different versions of __ACS Occupational Titles__ have to be made manually.
* When __ACS__ bundles occupational titles, it bundles them under one of the following __BLS-SOC Occupational Titles__:
    1. Broad BLS-SOC Occupational Titles
    2. Minor BLS-SOC Occupational Titles
    3. Major BLS-SOC Occupational Titles


* For this reason, we have to track the bundled occupational titles in __ACS__ manually and match them to the corresponding titles in __BLS__.

An Example:

| ACS Code | O\*NET Code | ACS Title | O\*NET-SOC Title |
| :---- | ---- | ----- | ---- |
| 1110XX | | Chief Executives and Legislators | |
| | 11-1011.00 | | Chief Executives |
| | 11-1011.03 | | Chief Sustainability Officers |
| | 11-1031.00 | | Legislators |



# Table of Contents

1. [Read O\*NET Occupational Titles](#Read-O\*NET-Occupational-Titles)
2. [Read ACS Occupational Titles](#Read-ACS-Occupational-Titles)
3. [Read BLS Occupational Titles](#Read-BLS-Occupational-Titles)

In [1]:
from IPython.display import display, HTML
display(HTML('<style>.container { width:80% !important; }</style>'))
import pandas as pd
import numpy as np

# Read O\*NET Occupational Titles

In [2]:
df_onet = pd.read_excel('csv_files/db_22_3_excel/Occupation Data.xlsx')
df_onet.columns = ['onetsoccode', 'onet_title', 'description']
df_onet.drop('description', axis=1, inplace=True)
print('Number of O*NET titles: {}'.format(df_onet.shape[0]))
display(df_onet.head())

Number of O*NET titles: 1110


Unnamed: 0,onetsoccode,onet_title
0,11-1011.00,Chief Executives
1,11-1011.03,Chief Sustainability Officers
2,11-1021.00,General and Operations Managers
3,11-1031.00,Legislators
4,11-2011.00,Advertising and Promotions Managers


# Read ACS Occupational Titles

In [3]:
# ACS raw occupational titles
df_acs = pd.read_csv('csv_files/ACS_2016_Codes.csv')
df_acs.head()

Unnamed: 0,Code,Label,2016
0,,,acs
1,111021,MGR-GENERAL AND OPERATIONS MANAGERS,10126
2,1110XX,MGR-CHIEF EXECUTIVES AND LEGISLATORS *,15456
3,112011,MGR-ADVERTISING AND PROMOTIONS MANAGERS,521
4,112020,MGR-MARKETING AND SALES MANAGERS,10693


In [4]:
# Process ACS occupational titles
df_acs.columns = ['onetsoccode', 'acs_title', 'count']
df_acs.dropna(axis=0, inplace=True)
df_acs = df_acs.loc[df_acs.onetsoccode != 'Code', :]
df_acs.acs_title = df_acs.acs_title.apply(lambda x: x[4:])
df_acs.acs_title = df_acs.acs_title.apply(
                                lambda x: x[:-1].title() if x.endswith('*')
                                                         else x.title())
df_acs.onetsoccode = df_acs.onetsoccode.apply(
                                lambda x: x[:2] + '-' + x[2:] + '.00')
df_acs.head()

Unnamed: 0,onetsoccode,acs_title,count
1,11-1021.00,General And Operations Managers,10126
2,11-10XX.00,Chief Executives And Legislators,15456
3,11-2011.00,Advertising And Promotions Managers,521
4,11-2020.00,Marketing And Sales Managers,10693
5,11-2031.00,Public Relations And Fundraising Managers,701
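
The per-row transformations in the cell above can be sketched as two standalone functions. The names `format_acs_code` and `clean_acs_label` are hypothetical. The sketch assumes, as the notebook does, that every raw label starts with a four-character group prefix such as `MGR-`. Unlike the notebook cell, it also strips the space left behind when a trailing `*` is removed.

```python
def format_acs_code(raw_code):
    """Turn a six-character ACS code such as '1110XX' into '11-10XX.00'."""
    return raw_code[:2] + '-' + raw_code[2:] + '.00'


def clean_acs_label(label):
    """Drop the four-character prefix (e.g. 'MGR-') and any trailing '*'."""
    title = label[4:]
    if title.endswith('*'):
        title = title[:-1]
    # Assumption: surrounding whitespace is not meaningful, so strip it
    return title.strip().title()


print(format_acs_code('1110XX'))
# 11-10XX.00
print(clean_acs_label('MGR-CHIEF EXECUTIVES AND LEGISLATORS *'))
# Chief Executives And Legislators
```

Note that `str.title()` capitalizes every word, including "And", which is why ACS titles differ in case from the BLS titles later on.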


In [5]:
# Manually correct some occupational titles
value = ["Meeting, Convention, And Event Planners",
         "Explosives Workers, Ordnance Handling Experts, And Blasters",
         "Electronic Equipment Installers And Repairers, Motor Vehicles",
         "Electronic Home Entertainment Equipment Installers And Repairers",
         "Helpers--Production Workers",
         "Brickmasons And Blockmasons, Stonemasons, And Reinforcing Iron And Rebar Workers",
         "Healthcare Support Workers, All Other, Including Medical Equipment Preparers",
         "Software Developers, Applications And Systems Software"]       
to_replace=["Meeting Convention, And Event Planners",
            "Explosives Workers, Ordnance Handling Experts, And V",
            "Electronic Equipment Installers And Repairers, Motor",
            "Electronic Home Entertainment Equipment Installers And",
            "Helpers-Production Workers",
            "Brickmasons, Blockmasons, Stonemasons, And Reinforcing Iron And Rebar Workers",
            "Thcare Support Workers, All Other, Including Medical Equipment Preparers",
            "Software Developers,Applications And Systems Software"]
# assign the result back instead of relying on an in-place replace on a column
df_acs['acs_title'] = df_acs['acs_title'].replace(to_replace=to_replace,
                                                  value=value)

In [6]:
df_acs.onetsoccode.replace(to_replace=['99-9920.00', 'BB-BBBB.00', 
                                       '55-9830.00'],
                           value='NA', inplace=True)
df_acs = df_acs.loc[df_acs.onetsoccode != 'NA', :]
print('Number of ACS 2016 titles: {}'.format(df_acs.shape[0]))
display(df_acs.head())

Number of ACS 2016 titles: 477


Unnamed: 0,onetsoccode,acs_title,count
1,11-1021.00,General And Operations Managers,10126
2,11-10XX.00,Chief Executives And Legislators,15456
3,11-2011.00,Advertising And Promotions Managers,521
4,11-2020.00,Marketing And Sales Managers,10693
5,11-2031.00,Public Relations And Fundraising Managers,701


# Read BLS Occupational Titles

In [7]:
# BLS occupational titles raw
df_bls = pd.read_excel('csv_files/SOC_structures/soc_structure_2010.xls')
df_bls.head()

Unnamed: 0,Bureau of Labor Statistics,Unnamed: 1,Unnamed: 2,Unnamed: 3,Unnamed: 4
0,On behalf of the Standard Occupational Classif...,,,,
1,,,,,
2,January 2009,,,,
3,*** This is the final structure for the 2010 S...,,,,
4,,,,,


In [8]:
# Process BLS occupational titles
df_bls.fillna('', inplace=True)
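# drop the first 12 preamble rows; copy so that later column assignments
# do not operate on a slice of the original frame
df_bls = df_bls.iloc[12:, :].copy()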
df_bls.columns = ['major', 'minor', 'broad', 'detailed', 'bls_title']
df_bls['onetsoccode'] = df_bls.iloc[:, :-1].sum(axis=1)
df_bls.onetsoccode = df_bls.onetsoccode.apply(lambda x : x.strip() + '.00')
df_bls.bls_title = df_bls.bls_title.apply(lambda x: x.strip())
df_bls['category'] = np.where(df_bls.major != '', 'major', '')
df_bls['category'] = np.where(df_bls.minor != '', 'minor', df_bls.category)
df_bls['category'] = np.where(df_bls.broad != '', 'broad', df_bls.category)
df_bls['category'] = np.where(df_bls.detailed != '', 'detailed', df_bls.category)
df_bls.reset_index(drop=True, inplace=True)
df_bls.head()

Unnamed: 0,major,minor,broad,detailed,bls_title,onetsoccode,category
0,11-0000,,,,Management Occupations,11-0000.00,major
1,,11-1000,,,Top Executives,11-1000.00,minor
2,,,11-1010,,Chief Executives,11-1010.00,broad
3,,,,11-1011,Chief Executives,11-1011.00,detailed
4,,,11-1020,,General and Operations Managers,11-1020.00,broad


In [9]:
# a table that shows major, minor and broad occupational title for each
# BLS detailed occupational title
df_bls_table = df_bls.copy()
df_bls_table.replace('', np.nan, inplace=True)
df_bls_detailed = df_bls_table.loc[df_bls_table.detailed.notnull(),
                                   ['onetsoccode', 'bls_title']].reset_index(
                                       drop=True)
df_bls_broad = df_bls_table.loc[df_bls_table.broad.notnull(),
                                ['onetsoccode', 'bls_title']].reset_index(
                                    drop=True)
df_bls_minor = df_bls_table.loc[df_bls_table.minor.notnull(),
                                ['onetsoccode', 'bls_title']].reset_index(
                                    drop=True)
df_bls_major = df_bls_table.loc[df_bls_table.major.notnull(),
                                ['onetsoccode', 'bls_title']].reset_index(
                                    drop=True)
# propagate each major/minor/broad code down to the detailed rows below it
df_bls_table.ffill(axis=0, inplace=True)
df_bls_table = df_bls_table.loc[df_bls_table.category == 'detailed', :]
df_bls_table.reset_index(drop=True, inplace=True)
for column in ['major', 'minor', 'broad', 'detailed']:
    df_bls_table[column] = df_bls_table[column].apply(
        lambda x: x.strip() + '.00')
df_bls_table.drop(['onetsoccode', 'category'], axis=1, inplace=True)

In [10]:
display(df_bls_table.head())
df_bls_table.to_csv('title_bls_table.csv', index=False)

Unnamed: 0,major,minor,broad,detailed,bls_title
0,11-0000.00,11-1000.00,11-1010.00,11-1011.00,Chief Executives
1,11-0000.00,11-1000.00,11-1020.00,11-1021.00,General and Operations Managers
2,11-0000.00,11-1000.00,11-1030.00,11-1031.00,Legislators
3,11-0000.00,11-2000.00,11-2010.00,11-2011.00,Advertising and Promotions Managers
4,11-0000.00,11-2000.00,11-2020.00,11-2021.00,Marketing Managers
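
The forward fill used to build this table can be sketched without pandas. The sketch below is only an illustration under simplifying assumptions: rows are plain tuples in file order, empty strings mark missing codes, the `.00` suffix is omitted, and the function name `detailed_with_ancestors` is hypothetical.

```python
def detailed_with_ancestors(rows):
    """Attach the most recent major, minor and broad code to each detailed row.

    Each row is (major, minor, broad, detailed, title), with '' for blanks,
    in the same order as the SOC structure file.
    """
    current = {'major': None, 'minor': None, 'broad': None}
    table = []
    for major, minor, broad, detailed, title in rows:
        # Remember the latest code seen at each higher level
        if major:
            current['major'] = major
        if minor:
            current['minor'] = minor
        if broad:
            current['broad'] = broad
        # Only detailed rows are kept, together with their ancestors
        if detailed:
            table.append({**current, 'detailed': detailed, 'bls_title': title})
    return table


rows = [
    ('11-0000', '', '', '', 'Management Occupations'),
    ('', '11-1000', '', '', 'Top Executives'),
    ('', '', '11-1010', '', 'Chief Executives'),
    ('', '', '', '11-1011', 'Chief Executives'),
    ('', '', '11-1020', '', 'General and Operations Managers'),
    ('', '', '', '11-1021', 'General and Operations Managers'),
]
for record in detailed_with_ancestors(rows):
    print(record)
```

This only works because the file lists every group before its members, so the last code seen at each level is always the correct ancestor.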


In [11]:
code_detailed = df_bls.loc[df_bls.category == 'detailed',
                           ['onetsoccode']].reset_index(drop=True)
code_broad = df_bls.loc[df_bls.category == 'broad',
                        ['onetsoccode']].reset_index(drop=True)
code_minor = df_bls.loc[df_bls.category == 'minor',
                        ['onetsoccode']].reset_index(drop=True)
code_major = df_bls.loc[df_bls.category == 'major',
                        ['onetsoccode']].reset_index(drop=True)

# Merge ACS, BLS, O\*NET Titles

In [12]:
# merge acs titles with detailed, broad and minor bls titles, respectively
# in order to identify which ACS occupational title belongs to which group
df_title = df_acs.merge(code_detailed,
                        on='onetsoccode',
                        indicator='acs_detailed',
                        how='left')
df_title = df_title.merge(code_broad,
                          on='onetsoccode',
                          indicator='acs_broad',
                          how='left')
df_title = df_title.merge(code_minor,
                          on='onetsoccode',
                          indicator='acs_minor',
                          how='left')

In [13]:
# generate binary variables for detailed, broad, and minor bls occupations
df_title.acs_detailed = np.where(df_title.acs_detailed == 'both', 1, 0)
df_title.acs_broad = np.where(df_title.acs_broad == 'both', 1, 0)
df_title.acs_minor = np.where(df_title.acs_minor == 'both', 1, 0)
df_title['acs_combined'] = df_title.iloc[:, -3:].sum(axis=1)
df_title.acs_combined = np.where(df_title.acs_combined == 1, 0, 1)
df_title.head()

Unnamed: 0,onetsoccode,acs_title,count,acs_detailed,acs_broad,acs_minor,acs_combined
0,11-1021.00,General And Operations Managers,10126,1,0,0,0
1,11-10XX.00,Chief Executives And Legislators,15456,0,0,0,1
2,11-2011.00,Advertising And Promotions Managers,521,1,0,0,0
3,11-2020.00,Marketing And Sales Managers,10693,0,1,0,0
4,11-2031.00,Public Relations And Fundraising Managers,701,1,0,0,0
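
The three left merges and the derived `acs_combined` flag amount to a membership test against three sets of BLS codes. A minimal plain-Python sketch follows; it assumes the three sets are disjoint, and the function name `classify_acs_code` and the tiny code sets are illustrative only.

```python
def classify_acs_code(code, detailed, broad, minor):
    """Return the BLS level an ACS code matches, or 'combined' if none."""
    if code in detailed:
        return 'detailed'
    if code in broad:
        return 'broad'
    if code in minor:
        return 'minor'
    # No single BLS title matches: ACS has bundled several titles
    return 'combined'


# Small illustrative subsets of the BLS code lists
detailed = {'11-1021.00', '11-2011.00', '11-2031.00'}
broad = {'11-2020.00'}
minor = {'25-1000.00'}

for code in ['11-1021.00', '11-10XX.00', '11-2020.00', '25-1000.00']:
    print(code, classify_acs_code(code, detailed, broad, minor))
# 11-1021.00 detailed
# 11-10XX.00 combined
# 11-2020.00 broad
# 25-1000.00 minor
```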


In [14]:
n_detailed = df_title.acs_detailed.sum()
n_broad = df_title.acs_broad.sum()
n_minor = df_title.acs_minor.sum()
n_combined = df_acs.shape[0] - (n_detailed + n_broad + n_minor)
print(f'Number of BLS-SOC Detailed occupations in ACS is {n_detailed}')
print(f'Number of BLS-SOC Broad occupations in ACS is {n_broad}')
print(f'Number of BLS-SOC Minor occupations in ACS is {n_minor}')
print(f'Number of ACS combined occupations is {n_combined}')

Number of BLS-SOC Detailed occupations in ACS is 321
Number of BLS-SOC Broad occupations in ACS is 106
Number of BLS-SOC Minor occupations in ACS is 5
Number of ACS combined occupations is 45


## ACS & BLS-SOC Detailed Occupations

In [15]:
def bls_category(category):
    temp = df_title.loc[df_title[category] == 1]
    temp = temp.loc[:, ['onetsoccode', 'acs_title', 'count']]
    return temp.reset_index(drop=True)

In [16]:
def check_titles(df):
    """Return True if ACS and BLS titles agree in every row, ignoring case."""
    acs = df['acs_title'].str.lower()
    bls = df['bls_title'].str.lower()
    return bool((acs == bls).all())

In [17]:
df_acs_detailed = bls_category('acs_detailed')
df_acs_broad = bls_category('acs_broad')
df_acs_minor = bls_category('acs_minor')

In [18]:
df_acs_detailed = df_acs_detailed.merge(df_bls_table, left_on='onetsoccode', 
                                        right_on='detailed', how='left')
df_acs_detailed = df_acs_detailed.loc[:, ['onetsoccode', 'acs_title', 'bls_title']]
df_acs_detailed.head()

Unnamed: 0,onetsoccode,acs_title,bls_title
0,11-1021.00,General And Operations Managers,General and Operations Managers
1,11-2011.00,Advertising And Promotions Managers,Advertising and Promotions Managers
2,11-2031.00,Public Relations And Fundraising Managers,Public Relations and Fundraising Managers
3,11-3011.00,Administrative Services Managers,Administrative Services Managers
4,11-3021.00,Computer And Information Systems Managers,Computer and Information Systems Managers


## ACS & BLS-SOC Broad Occupations

In [19]:
df_acs_broad = df_acs_broad.merge(df_bls_table, left_on='onetsoccode',
                                  right_on='broad', how='left')
df_acs_broad = df_acs_broad.loc[:, ['onetsoccode', 'acs_title', 'bls_title']]
df_acs_broad.head()

Unnamed: 0,onetsoccode,acs_title,bls_title
0,11-2020.00,Marketing And Sales Managers,Marketing Managers
1,11-2020.00,Marketing And Sales Managers,Sales Managers
2,11-9030.00,Education Administrators,"Education Administrators, Preschool and Childc..."
3,11-9030.00,Education Administrators,"Education Administrators, Elementary and Secon..."
4,11-9030.00,Education Administrators,"Education Administrators, Postsecondary"
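
Because one broad code covers several detailed titles, this merge is one-to-many: each ACS row is repeated once per detailed BLS title under it. A plain-Python sketch of that expansion, with a hypothetical helper `expand_broad` and a three-row excerpt of the BLS table:

```python
def expand_broad(acs_code, acs_title, bls_table):
    """Return one (code, ACS title, BLS title) row per detailed title
    whose broad code equals the ACS code."""
    return [(acs_code, acs_title, row['bls_title'])
            for row in bls_table
            if row['broad'] == acs_code]


bls_table = [
    {'broad': '11-2010.00', 'detailed': '11-2011.00',
     'bls_title': 'Advertising and Promotions Managers'},
    {'broad': '11-2020.00', 'detailed': '11-2021.00',
     'bls_title': 'Marketing Managers'},
    {'broad': '11-2020.00', 'detailed': '11-2022.00',
     'bls_title': 'Sales Managers'},
]

for row in expand_broad('11-2020.00', 'Marketing And Sales Managers', bls_table):
    print(row)
# ('11-2020.00', 'Marketing And Sales Managers', 'Marketing Managers')
# ('11-2020.00', 'Marketing And Sales Managers', 'Sales Managers')
```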


## ACS & BLS-SOC Minor Occupations

In [20]:
df_acs_minor = df_acs_minor.merge(df_bls_table, left_on='onetsoccode',
                                  right_on='minor', how='left')
df_acs_minor = df_acs_minor.loc[:, ['onetsoccode', 'acs_title', 'bls_title']]
df_acs_minor.head()

Unnamed: 0,onetsoccode,acs_title,bls_title
0,25-1000.00,Postsecondary Teachers,"Business Teachers, Postsecondary"
1,25-1000.00,Postsecondary Teachers,"Computer Science Teachers, Postsecondary"
2,25-1000.00,Postsecondary Teachers,"Mathematical Science Teachers, Postsecondary"
3,25-1000.00,Postsecondary Teachers,"Architecture Teachers, Postsecondary"
4,25-1000.00,Postsecondary Teachers,"Engineering Teachers, Postsecondary"


## ACS Combined Occupations

* __ACS Combined__ occupations are ACS occupational titles that do not match any single BLS-SOC occupational title.

* So far, we have mapped BLS Detailed, Broad and Minor occupational titles to the corresponding ACS occupational titles. However, several BLS and ACS occupational titles remain unmatched. At this stage, we have to match them manually, since they follow no pattern.

In [21]:
# DataFrame.append was removed in pandas 2.0; pd.concat stacks the frames instead
df_final = pd.concat([df_acs_detailed, df_acs_broad, df_acs_minor])
df_final = df_final.merge(df_bls_table, on='bls_title', how='outer')
df_final = df_final[['onetsoccode', 'acs_title', 'bls_title', 'detailed']]
df_final.sort_values('onetsoccode', inplace=True)
df_final.reset_index(drop=True, inplace=True)
df_final.head()

Unnamed: 0,onetsoccode,acs_title,bls_title,detailed
0,11-1021.00,General And Operations Managers,General and Operations Managers,11-1021.00
1,11-2011.00,Advertising And Promotions Managers,Advertising and Promotions Managers,11-2011.00
2,11-2020.00,Marketing And Sales Managers,Sales Managers,11-2022.00
3,11-2020.00,Marketing And Sales Managers,Marketing Managers,11-2021.00
4,11-2031.00,Public Relations And Fundraising Managers,Public Relations and Fundraising Managers,11-2031.00


In [22]:
def replace_acs_title(onetsoccode, title):
    # fill the ACS title only for the listed detailed codes that are still unmatched
    mask = (df_final.detailed.isin(onetsoccode)) & df_final.acs_title.isnull()
    df_final.loc[mask, 'acs_title'] = title
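
The same fill-only-if-missing logic can be sketched on a list of dictionaries, which makes the two conditions of the mask explicit. The function name `fill_missing_acs_title` and the three sample rows are illustrative only.

```python
def fill_missing_acs_title(rows, detailed_codes, title):
    """Set acs_title for rows whose detailed code is listed and whose
    acs_title is still missing; leave every other row untouched."""
    wanted = set(detailed_codes)
    for row in rows:
        if row['detailed'] in wanted and row['acs_title'] is None:
            row['acs_title'] = title
    return rows


rows = [
    {'detailed': '11-1011.00', 'acs_title': None},
    {'detailed': '11-1021.00', 'acs_title': 'General And Operations Managers'},
    {'detailed': '11-1031.00', 'acs_title': None},
]
fill_missing_acs_title(rows, ['11-1011.00', '11-1031.00'],
                       'Chief Executives And Legislators')
for row in rows:
    print(row['detailed'], row['acs_title'])
# 11-1011.00 Chief Executives And Legislators
# 11-1021.00 General And Operations Managers
# 11-1031.00 Chief Executives And Legislators
```

Requiring the title to be missing means a manual assignment cannot overwrite a title that was already matched automatically.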

In [23]:
replace_acs_title(['11-1011.00', '11-1031.00'],
                  "Chief Executives And Legislators")
replace_acs_title(['11-9199.00', '11-9061.00', '11-9131.00'],
                  "Miscellaneous Managers, Including Funeral Service Managers And Postmasters And Mail Superintendents")
replace_acs_title(['15-1132.00', '15-1133.00'],
                  "Software Developers, Applications and Systems Software")
replace_acs_title(['15-2021.00', '15-2041.00', '15-2091.00', '15-2099.00'],
                  "Miscellaneous Mathematical Science Occupations, Including Mathematicians and Statisticians")
replace_acs_title(['17-2021.00', '17-2031.00'],
                  "Biomedical and Agricultural Engineers")
replace_acs_title(['17-2171.00', '17-2151.00'],
                  "Petroleum, Mining and Geological Engineers, Including Mining Safety Engineers")
replace_acs_title(['17-2161.00', '17-2199.00'],
                  "Miscellaneous Engineers, Including Nuclear Engineers")
replace_acs_title(['19-1041.00', '19-1042.00', '19-1099.00'],
                  "Medical Scientists, and Life Scientists, All Other")
replace_acs_title(['19-3091.00', '19-3092.00', '19-3093.00', '19-3094.00', '19-3099.00', '19-3022.00',
                   '19-3041.00'], "Miscellaneous Social Scientists, Including Survey Researchers and Sociologists")
replace_acs_title(['19-4041.00', '19-4051.00'],
                  "Geological and Petroleum Technicians, and Nuclear Technicians")
replace_acs_title(['19-4091.00', '19-4092.00', '19-4093.00', '19-4099.00', '19-4061.00'],
                  "Miscellaneous Life, Physical, and Social Science Technicians, Including Social Science Research Assistants")
replace_acs_title(['21-1091.00', '21-1094.00', '21-1099.00'],
                  "Miscellaneous Community and Social Service Specialists, Including Health Educators and Community Health Workers")
replace_acs_title(['23-1011.00', '23-1021.00', '23-1022.00', '23-1023.00'],
                  "Lawyers, and Judges, Magistrates, and Other Judicial Workers")
replace_acs_title(['25-9011.00', '25-9021.00', '25-9031.00',
                   '25-9099.00'], "Other Education, Training, and Library Workers")
replace_acs_title(['27-4099.00', '27-4011.00', '27-4012.00', '27-4013.00', '27-4014.00'],
                  "Broadcast and Sound Engineering Technicians and Radio Operators, and Media and Communication Equipment Workers, All Other")
replace_acs_title(['29-1128.00', '29-1129.00'],
                  "Other Therapists, Including Exercise Physiologists")
replace_acs_title(['29-1161.00', '29-1171.00'],
                  "Nurse Practitioners and Nurse Midwives")
replace_acs_title(['31-9093.00', '31-9099.00'],
                  "Healthcare Support Workers, All Other, Including Medical Equipment Preparers")
replace_acs_title(['33-3031.00', '33-3041.00'],
                  "Miscellaneous Law Enforcement Workers")
replace_acs_title(['33-9092.00', '33-9099.00'],
                  "Lifeguards and Other Recreational, and All Other Protective Service Workers")
replace_acs_title(['35-9099.00', '35-9011.00'],
                  "Miscellaneous Food Preparation and Serving Related Workers, Including Dining Room and Cafeteria Attendants and Bartender Helpers")
replace_acs_title(['37-2011.00', '37-2019.00'],
                  "Janitors and Building Cleaners")
replace_acs_title(['39-4011.00', '39-4021.00'],
                  "Embalmers and Funeral Attendants")
replace_acs_title(['43-4021.00', '43-4151.00'],
                  "Correspondence Clerks and Order Clerks")
replace_acs_title(['43-9199.00', '43-9031.00'],
                  "Miscellaneous Office and Administrative Support Workers, Including Desktop Publishers")
replace_acs_title(['45-2021.00', '45-2091.00', '45-2092.00', '45-2093.00',
                   '45-2099.00'], "Miscellaneous Agricultural Workers, Including Animal Breeders")
replace_acs_title(['47-2021.00', '47-2022.00', '47-2171.00'],
                  "Brickmasons and Blockmasons, Stonemasons, and Reinforcing Iron and Rebar Workers")
replace_acs_title(['47-5099.00', '47-5061.00', '47-5081.00', '47-5051.00'],
                  "Miscellaneous Extraction Workers, Including Roof Bolters and Helpers")
replace_acs_title(['47-5011.00', '47-5012.00', '47-5013.00', '47-5071.00'],
                  "Derrick, Rotary Drill, and Service Unit Operators, and Roustabouts, Oil, Gas, and Mining")
replace_acs_title(['47-4099.00', '47-4091.00', '47-4071.00', '47-2231.00'],
                  "Miscellaneous Construction Workers, Including Solar Photovoltaic Installers, Septic Tank Servicers and Sewer Pipe Cleaners")
replace_acs_title(['49-9041.00', '49-9045.00'],
                  "Industrial and Refractory Machinery Mechanics")
replace_acs_title(['49-9092.00', '49-9093.00', '49-9095.00', '49-9097.00', '49-9099.00', '49-9081.00'],
                  "Miscellaneous Installation, Maintenance, and Repair Workers, Including Wind Turbine Service Technicians")
replace_acs_title(['51-4061.00', '51-4062.00', '51-4071.00', '51-4072.00'],
                  "Model Makers, Patternmakers, and Molding Machine Setters, Metal and Plastic")
replace_acs_title(['51-4081.00', '51-4191.00', '51-4192.00', '51-4193.00', '51-4194.00',
                   '51-4199.00'], "Miscellaneous Metal Workers and Plastic Workers, Including Multiple Machine Tool Setters")
replace_acs_title(['51-6061.00', '51-6062.00'],
                  "Textile Bleaching and Dyeing, and Cutting Machine Setters, Operators, and Tenders")
replace_acs_title(['51-6091.00', '51-6092.00', '51-6099.00'],
                  "Miscellaneous Textile, Apparel, And Furnishings Workers, Except Upholsterers")
replace_acs_title(['51-7031.00', '51-7032.00', '51-7099.00'],
                  "Miscellaneous Woodworkers, Including Model Makers and Patternmakers")
replace_acs_title(['51-9141.00', '51-9192.00', '51-9193.00', '51-9199.00'],
                  "Miscellaneous Production Workers, Including Semiconductor Processors")
replace_acs_title(['53-6011.00', '53-6041.00', '53-6099.00'],
                  "Miscellaneous Transportation Workers, Including Bridge and Lock Tenders and Traffic Technicians")
replace_acs_title(['53-7011.00', '53-7041.00'],
                  "Conveyor Operators and Tenders, and Hoist and Winch Operators")
replace_acs_title(['53-7111.00', '53-7121.00', '53-7199.00'],
                  "Miscellaneous Material Moving Workers, Including Mine Shuttle Car Operators, and Tank Car, Truck, and Ship Loaders")
replace_acs_title(['47-2072.00', '47-2073.00'],
                  "Construction Equipment Operators, Except Paving, Surfacing, and Tamping Equipment Operators")
replace_acs_title(['49-2093.00', '49-2094.00', '49-2095.00'],
                  "Electrical and Electronics Repairers, Transportation Equipment, and Industrial and Utility")
replace_acs_title(['53-5011.00', '53-5031.00'],
                  "Sailors and Marine Oilers, and Ship Engineers")
replace_acs_title(['53-4021.00', '53-4041.00', '53-4099.00'],
                  "Subway, Streetcar, and Other Rail Transportation Workers")

In [24]:
df_final = df_final.loc[:, ['detailed', 'acs_title', 'bls_title']]
df_final.rename(columns={'detailed' : 'onetsoccode'}, inplace=True)
df_final.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 840 entries, 0 to 839
Data columns (total 3 columns):
 #   Column       Non-Null Count  Dtype 
---  ------       --------------  ----- 
 0   onetsoccode  840 non-null    object
 1   acs_title    840 non-null    object
 2   bls_title    840 non-null    object
dtypes: object(3)
memory usage: 19.8+ KB


# Merge ACS - BLS - O\*NET Occupations

In [25]:
df_final = df_final.merge(df_onet, on='onetsoccode', how='outer')
df_final.sort_values('onetsoccode', inplace=True)
df_final.reset_index(drop=True, inplace=True)
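# O*NET-only codes (e.g. 11-1011.03) sort right after their .00 parent,
# so a forward fill lets them inherit the parent's BLS and ACS titles
df_final['bls_title'] = df_final['bls_title'].ffill()
df_final['acs_title'] = df_final['acs_title'].ffill()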
df_final.head()

Unnamed: 0,onetsoccode,acs_title,bls_title,onet_title
0,11-1011.00,Chief Executives And Legislators,Chief Executives,Chief Executives
1,11-1011.03,Chief Executives And Legislators,Chief Executives,Chief Sustainability Officers
2,11-1021.00,General And Operations Managers,General and Operations Managers,General and Operations Managers
3,11-1031.00,Chief Executives And Legislators,Legislators,Legislators
4,11-2011.00,Advertising And Promotions Managers,Advertising and Promotions Managers,Advertising and Promotions Managers
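
The sort-then-forward-fill step can be sketched in plain Python to show why O\*NET-Detailed codes pick up the titles of their `.00` parent. The function name `inherit_titles` is hypothetical. The sketch assumes, as the notebook implicitly does, that every O\*NET-Detailed code has its `.00` parent in the table; otherwise it would inherit from an unrelated preceding row.

```python
def inherit_titles(rows):
    """Sort (code, title) pairs by code and give each missing title
    the most recent non-missing title above it."""
    filled = []
    last_title = None
    for code, title in sorted(rows, key=lambda row: row[0]):
        if title is not None:
            last_title = title
        filled.append((code, last_title))
    return filled


rows = [
    ('11-1021.00', 'General And Operations Managers'),
    ('11-1011.03', None),  # O*NET-only code, no ACS title yet
    ('11-1011.00', 'Chief Executives And Legislators'),
    ('11-1031.00', 'Chief Executives And Legislators'),
]
for code, title in inherit_titles(rows):
    print(code, title)
# 11-1011.00 Chief Executives And Legislators
# 11-1011.03 Chief Executives And Legislators
# 11-1021.00 General And Operations Managers
# 11-1031.00 Chief Executives And Legislators
```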


In [26]:
with pd.option_context('display.max_rows', None):
    display(df_final)

Unnamed: 0,onetsoccode,acs_title,bls_title,onet_title
0,11-1011.00,Chief Executives And Legislators,Chief Executives,Chief Executives
1,11-1011.03,Chief Executives And Legislators,Chief Executives,Chief Sustainability Officers
2,11-1021.00,General And Operations Managers,General and Operations Managers,General and Operations Managers
3,11-1031.00,Chief Executives And Legislators,Legislators,Legislators
4,11-2011.00,Advertising And Promotions Managers,Advertising and Promotions Managers,Advertising and Promotions Managers
5,11-2011.01,Advertising And Promotions Managers,Advertising and Promotions Managers,Green Marketers
6,11-2021.00,Marketing And Sales Managers,Marketing Managers,Marketing Managers
7,11-2022.00,Marketing And Sales Managers,Sales Managers,Sales Managers
8,11-2031.00,Public Relations And Fundraising Managers,Public Relations and Fundraising Managers,Public Relations and Fundraising Managers
9,11-3011.00,Administrative Services Managers,Administrative Services Managers,Administrative Services Managers


In [27]:
df_final.to_csv('title_acs_bls_onet.csv', index=False)