# Employment

## BLS: Job Openings and Labor Turnover Survey

The survey's data files are documented in [jt.txt](https://download.bls.gov/pub/time.series/jt/jt.txt).

In [1]:
import os
import pandas as pd

# Working directory: the local "project" folder (machine-specific; adjust as needed)
path = "C:\\Users\\pedro\\OneDrive\\NYU\\CSS\\II. Data Skills\\project"
os.chdir(path)


Import the full JOLTS data file (`jt.data.1.AllItems`) directly from [BLS](https://download.bls.gov/pub/time.series/jt/) and save it as a `.parquet` file:

In [2]:
JOLTS = pd.read_csv("https://download.bls.gov/pub/time.series/jt/jt.data.1.AllItems", delimiter="\t")

In [3]:
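# Strip the whitespace padding from the column names and series IDs
JOLTS.columns = JOLTS.columns.str.strip()
JOLTS["series_id"] = JOLTS["series_id"].str.strip()
# Drop annual-average rows (period "M13") and the footnote column
JOLTS = JOLTS[JOLTS["period"] != "M13"].drop(columns="footnote_codes")
# Build a first-of-month date from year and period ("M01" -> month 01)
JOLTS["period"] = JOLTS["period"].str.replace("M", "")
JOLTS["date"] = JOLTS["year"].astype(str) + "-" + JOLTS["period"] + "-1"
JOLTS["date"] = pd.to_datetime(JOLTS["date"])
# Keep only the identifier, date, and value columns
JOLTS = JOLTS[["series_id", "date", "value"]]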
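
The period-to-date step can be checked on a few hand-made rows. The sketch below is a minimal, self-contained illustration on toy values (not BLS data); the helper name `add_month_start_date` is hypothetical. It passes an explicit `format=` to `pd.to_datetime`, so a malformed period raises an error instead of being parsed by guesswork.

```python
import pandas as pd


def add_month_start_date(df: pd.DataFrame) -> pd.DataFrame:
    """Drop annual-average rows (M13) and add a first-of-month `date` column."""
    monthly = df[df["period"] != "M13"].copy()
    # "M07" -> "07"; combined with the year this gives e.g. "2001-07"
    year_month = monthly["year"].astype(str) + "-" + monthly["period"].str[1:]
    monthly["date"] = pd.to_datetime(year_month, format="%Y-%m")
    return monthly[["series_id", "date", "value"]]


# Toy rows for illustration only (the values are made up)
toy = pd.DataFrame(
    {
        "series_id": ["A", "A", "A"],
        "year": [2001, 2001, 2001],
        "period": ["M01", "M02", "M13"],
        "value": [1.0, 2.0, 1.5],
    }
)
result = add_month_start_date(toy)
print(result)
```

The `.copy()` keeps the helper from modifying the caller's frame, which makes it safe to rerun in a notebook.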

In [4]:
JOLTS.head()

| | series_id | date | value |
|---|---|---|---|
| 0 | JTS000000000000000HIL | 2000-12-01 | 5426.0 |
| 1 | JTS000000000000000HIL | 2001-01-01 | 5722.0 |
| 2 | JTS000000000000000HIL | 2001-02-01 | 5303.0 |
| 3 | JTS000000000000000HIL | 2001-03-01 | 5528.0 |
| 4 | JTS000000000000000HIL | 2001-04-01 | 5204.0 |
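
Each `series_id` packs several codes into one fixed-width string. The sketch below decodes an ID by position. The field widths are an assumption based on the layout described in `jt.txt` (survey prefix 2, seasonal 1, industry 6, state 2, area 5, size class 2, data element 2, rate/level 1, for 21 characters in total) and should be checked against the current documentation. The helper name `decode_series_id` is hypothetical.

```python
# Field widths are an assumption taken from the layout described in jt.txt;
# verify them against the current documentation before relying on this.
FIELDS = [
    ("survey", 2),
    ("seasonal", 1),
    ("industry_code", 6),
    ("state_code", 2),
    ("area_code", 5),
    ("sizeclass_code", 2),
    ("dataelement_code", 2),
    ("ratelevel_code", 1),
]


def decode_series_id(series_id: str) -> dict:
    """Split a series_id into its component codes by position."""
    expected = sum(width for _, width in FIELDS)
    if len(series_id) != expected:
        raise ValueError(f"expected {expected} characters, got {len(series_id)}")
    parts, start = {}, 0
    for name, width in FIELDS:
        parts[name] = series_id[start:start + width]
        start += width
    return parts


decoded = decode_series_id("JTS000000000000000HIL")
print(decoded)
```

The length check makes an ID that still carries whitespace padding fail loudly rather than decode into shifted fields.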


In [5]:
JOLTS.to_parquet(os.path.join("data", "employment", "data_bls_jolts.parquet"))
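
The save step assumes the `employment` folder under `data` already exists in the working directory, since `to_parquet` is not expected to create missing folders. A minimal sketch of building the output path portably with `pathlib` and creating the folder first; the helper name `prepare_output_path` is hypothetical, and the demonstration runs in a temporary directory rather than in the project folder.

```python
import tempfile
from pathlib import Path


def prepare_output_path(base: Path, *parts: str) -> Path:
    """Build base/parts... and create the parent folder if it is missing."""
    target = base.joinpath(*parts)
    target.parent.mkdir(parents=True, exist_ok=True)
    return target


# Demonstrated in a temporary directory so nothing in the project is touched
with tempfile.TemporaryDirectory() as tmp:
    out = prepare_output_path(Path(tmp), "data", "employment", "data_bls_jolts.parquet")
    parent_exists = out.parent.is_dir()
    relative = out.relative_to(tmp).as_posix()

print(parent_exists, relative)
```

In the notebook, the returned path could be passed straight to `to_parquet`, which accepts path-like objects.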


Building the data dictionary, which joins the series metadata with the text label for each code:

In [6]:
# Import the series metadata and the lookup table for each code
series = pd.read_csv("https://download.bls.gov/pub/time.series/jt/jt.series", delimiter="\t")
industry = pd.read_csv("https://download.bls.gov/pub/time.series/jt/jt.industry", delimiter="\t")
state = pd.read_csv("https://download.bls.gov/pub/time.series/jt/jt.state", delimiter="\t")
area = pd.read_csv("https://download.bls.gov/pub/time.series/jt/jt.area", delimiter="\t")
size = pd.read_csv("https://download.bls.gov/pub/time.series/jt/jt.sizeclass", delimiter="\t")
data_element = pd.read_csv("https://download.bls.gov/pub/time.series/jt/jt.dataelement", delimiter="\t")
rate = pd.read_csv("https://download.bls.gov/pub/time.series/jt/jt.ratelevel", delimiter="\t")

In [7]:
# Strip the whitespace padding from the column names
series.columns = series.columns.str.strip()
industry.columns = industry.columns.str.strip()
state.columns = state.columns.str.strip()
area.columns = area.columns.str.strip()
size.columns = size.columns.str.strip()
data_element.columns = data_element.columns.str.strip()
rate.columns = rate.columns.str.strip()
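
The lookup files are all read and cleaned the same way, so the pattern can be written once. The sketch below uses an in-memory tab-delimited string in place of a download; the sample text is made up to imitate padded headers and values, and the helper name `read_bls_table` is hypothetical. Reading with `dtype=str` is a design choice that differs from the notebook: it keeps codes as text, so leading zeros survive.

```python
import io

import pandas as pd


def read_bls_table(source) -> pd.DataFrame:
    """Read a tab-delimited BLS-style file and strip padding from headers and text."""
    df = pd.read_csv(source, sep="\t", dtype=str)
    df.columns = df.columns.str.strip()
    # Strip padding from every column (all are strings because of dtype=str)
    return df.apply(lambda col: col.str.strip())


# Made-up sample imitating the whitespace padding; not real BLS content
sample = "code_col   \ttext_col  \nAA  \tFirst label \nBB  \tSecond label\n"
table = read_bls_table(io.StringIO(sample))
print(table)
```

`read_csv` accepts a URL as well as a file-like object, so the same helper could be called once per lookup file.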

In [8]:
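# Left-join each lookup table onto the series metadata.
# NOTE: with no on= argument, pd.merge joins on every column the two frames
# share, not only the *_code key. If the lookup tables share other columns
# (e.g. display_level, selectable, sort_sequence), those become join keys too,
# and rows that differ on them are left with missing text labels.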
jolts_dict = pd.merge(series, industry, how = "left")
jolts_dict = pd.merge(jolts_dict, state, how = "left")
jolts_dict = pd.merge(jolts_dict, area, how = "left")
jolts_dict = pd.merge(jolts_dict, size, how = "left")
jolts_dict = pd.merge(jolts_dict, data_element, how = "left")
jolts_dict = pd.merge(jolts_dict, rate, how = "left")

jolts_dict["series_id"] = jolts_dict["series_id"].str.strip()
jolts_dict.head()

| | series_id | seasonal | industry_code | state_code | area_code | sizeclass_code | dataelement_code | ratelevel_code | footnote_codes | begin_year | ... | end_period | industry_text | display_level | selectable | sort_sequence | state_text | area_text | sizeclass_text | dataelement_text | ratelevel_text |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | JTS000000000000000HIL | S | 0 | 0 | 0 | 0 | HI | L | | 2000 | ... | M09 | Total nonfarm | 0 | T | 1 | Total US | All areas | All size classes | | |
| 1 | JTS000000000000000HIR | S | 0 | 0 | 0 | 0 | HI | R | | 2000 | ... | M09 | Total nonfarm | 0 | T | 1 | Total US | All areas | All size classes | | Rate |
| 2 | JTS000000000000000JOL | S | 0 | 0 | 0 | 0 | JO | L | | 2000 | ... | M09 | Total nonfarm | 0 | T | 1 | Total US | All areas | All size classes | Job openings | |
| 3 | JTS000000000000000JOR | S | 0 | 0 | 0 | 0 | JO | R | | 2000 | ... | M09 | Total nonfarm | 0 | T | 1 | Total US | All areas | All size classes | Job openings | Rate |
| 4 | JTS000000000000000LDL | S | 0 | 0 | 0 | 0 | LD | L | | 2000 | ... | M09 | Total nonfarm | 0 | T | 1 | Total US | All areas | All size classes | | |
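
Several `dataelement_text` and `ratelevel_text` cells above are empty. One possible cause is that `pd.merge` without `on=` joins on every shared column, so a lookup table that shares a bookkeeping column with the left frame can silently fail to match. The sketch below reproduces that effect on toy frames (made-up values, not the BLS files) and shows one way to avoid it: pass `on=` explicitly and take only the key and label columns from the lookup. The helper name `add_label` is hypothetical.

```python
import pandas as pd

# Toy frames for illustration only. Both carry a sort_sequence column,
# as a stand-in for bookkeeping columns shared between lookup tables.
series = pd.DataFrame(
    {"series_id": ["X1", "X2"], "element_code": ["HI", "JO"], "sort_sequence": [1, 1]}
)
elements = pd.DataFrame(
    {
        "element_code": ["HI", "JO"],
        "element_text": ["Hires", "Job openings"],
        "sort_sequence": [2, 1],
    }
)

# Implicit keys: joins on element_code AND sort_sequence, so "HI" finds no match
implicit = pd.merge(series, elements, how="left")


def add_label(left: pd.DataFrame, lookup: pd.DataFrame, key: str, label: str) -> pd.DataFrame:
    """Left-join a single label column from `lookup` using only `key`."""
    return left.merge(lookup[[key, label]], on=key, how="left", validate="many_to_one")


explicit = add_label(series, elements, "element_code", "element_text")
print(implicit["element_text"].tolist())
print(explicit["element_text"].tolist())
```

`validate="many_to_one"` makes the merge raise if a code appears more than once in the lookup, which guards against accidentally duplicating rows.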


In [9]:
jolts_dict.drop(columns="footnote_codes").to_parquet(os.path.join("data", "employment", "dict_bls_jolts.parquet"))