### Capstone Project Proposal (Module 3.6)
### Phai Phongthiengtham




### How do successful firms post their job vacancies for managers?

#### Broad Idea:
Management has long been emphasized as an important factor in a company's success. Nevertheless, empirical evidence is scarce, mostly because of limited data. This project contributes to the literature by examining what successful firms look for when they hire a manager. Using 15 million management job postings from across the US from 2012 to 2016, the objective is to explore which features of management job postings can predict a firm's performance (for example, profitability or stock market movement).


#### Potential Clients: 
* Human resource departments 
* Job board websites

#### Data:
(1.) Vacancy postings

Vacancy postings are provided by CareerBuilder. This project uses all management postings in the US from 2012-2016. Management positions are defined by the occupation code provided by the Bureau of Labor Statistics.  
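
As a minimal sketch of how an occupation-code definition of management could be applied, the snippet below keeps rows whose `onet` code starts with `11-`. This assumes that management corresponds to SOC major group 11 (Management Occupations), which is consistent with the codes in the sample postings but is not stated explicitly in this proposal. The toy rows, including the non-management row, are illustrative only.

```python
import pandas as pd

# Toy rows mimicking the `onet` and `cb_jobtitle` columns.
# The third row is an illustrative non-management occupation.
toy = pd.DataFrame({
    "cb_jobtitle": ["Marketing Managers", "Project Managers", "Software Developers"],
    "onet": ["11-2021.00", "11-9041.00", "15-1132.00"],
})

# Assumption: management occupations are SOC major group 11,
# i.e. occupation codes that start with "11-".
is_management = toy["onet"].str.startswith("11-")
management = toy[is_management]

print(management["cb_jobtitle"].tolist())
```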

Relevant variables are:

* posting period: month, year 
* geographic information: city, state, zip code, county
* firm information: firm's name, NAICS
* posting information: job title, education, occupation code and original content of the posting   

In [1]:
import pandas as pd
pd.set_option('display.max_columns', 200)

# --- Data Description: Job Postings --- 
#
#
# all variable names for CB job postings
FieldList = ['obs','yearmonth', 'city', 'state', 'zip', 'county',
             'company_name', 'company_naics', 'company_isstaffingfirm',
             'master_company_name', 'master_company_naics', 'master_company_isstaffingfirm',
             'onet', 'cb_jobtitle_id', 'cb_jobtitle', 'edulevels_name',
             'source', 'subsource', 'original_jobtitle', 'url', 'description']
# read postings (pipe-delimited file without a header row)
df_postings = pd.read_csv('postings/raw/Data_2012.txt', sep='|', names=FieldList)
# select relevant variables
df_postings = df_postings[['obs','yearmonth','city','state','county',
                           'company_name','company_naics',
                           'master_company_name', 'master_company_naics',
                           'edulevels_name', 'onet','cb_jobtitle','description']]

df_postings = df_postings.rename(columns={'company_naics': 'naics'})

# set index
df_postings = df_postings.set_index(['obs','yearmonth'])

In [2]:
df_postings.head()

obs,yearmonth,city,state,county,company_name,naics,master_company_name,master_company_naics,edulevels_name,onet,cb_jobtitle,description
11,201201,"Oak Brook, IL",IL,17043,Inland Real Estate Corporation,531110.0,Inland Real Estate Corporation,531110.0,"['High school or GED', ""Bachelor's degree""]",11-2022.00,Business Development Managers,"The Inland Real Estate Group of Companies, Inc..."
16,201201,"Chicago, IL",IL,17031,Memorial Health Inc,621999.0,Memorial Health Inc,621999.0,"[""Bachelor's degree""]",11-3021.00,Information Technology (IT) Managers,Title : Director of Applications Information T...
20,201201,"Houston, TX",TX,48201,"Oceaneering International, Inc.",213112.0,"Oceaneering International, Inc.",213112.0,"[""Bachelor's degree""]",11-9041.00,Project Managers,Company Profile Oceaneering is a global oilfie...
22,201201,"Denver, CO",CO,8031,"P2 Energy Solutions, Inc.",213112.0,"P2 Energy Solutions, Inc.",213112.0,"[""Bachelor's degree"", ""Master's degree""]",11-2021.00,Product Managers,P2 Energy Solutions seeks a Senior Product Man...
30,201201,"Sacramento, CA",CA,6067,The McClatchy Company,511110.0,The McClatchy Company,511110.0,[],11-2021.00,Marketing Managers,"The McClatchy Company, the third-largest newsp..."
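
Two of the columns above need light parsing before they can serve as features: `edulevels_name` holds a list stored as a string, and `yearmonth` packs year and month into one integer. The sketch below shows one possible way to unpack them using only the standard library. The helper names `parse_edulevels` and `split_yearmonth` are hypothetical and not part of the project code.

```python
import ast

def parse_edulevels(raw):
    """Turn the stringified list in `edulevels_name` into a Python list."""
    return ast.literal_eval(raw)

def split_yearmonth(yearmonth):
    """Split an integer such as 201201 into a (year, month) tuple."""
    return divmod(int(yearmonth), 100)

# Value copied from the first sample posting above.
levels = parse_edulevels("""['High school or GED', "Bachelor's degree"]""")
requires_bachelors = "Bachelor's degree" in levels
year, month = split_yearmonth(201201)

print(levels, requires_bachelors, year, month)
```

Using `ast.literal_eval` rather than `eval` means that only Python literals are accepted, which is safer for text that comes from an external file.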


In [3]:
df_postings['description'].iloc[0] # example of original posting content (positional lookup, since the index is (obs, yearmonth))

"The Inland Real Estate Group of Companies, Inc currently has a full-time opening for a Director of Leasing for one of the Inland REIT ' s. This position will oversee and direct one of the Inland REIT ' s leasing policies and strategies within a retail portfolio. Additional responsibilities include directing the day to day leasing functions and activities through delegation and may also assist with complex leasing transactions. This position will be base din Oak Brook, IL and qualified candidates should have more than 10 years successful commercial leasing experience, preferably within the retail environment. Job Functions : Manage leasing functions thru Leasing Agent Provide industry training to Leasing Agents Oversee productivity of Leasing Agents Provide direction on deal terms to the Leasing Agents Set deal thresholds for Leasing Agents Coordinate with the Marketing Department and Leasing Agents portfolio based marketing efforts Coordinate Retailer presentations Coordinate with the

In [4]:
df_postings['description'].iloc[1] # example of original posting content (positional lookup, since the index is (obs, yearmonth))

'Title : Director of Applications Information Technology Location : Savannah, GA Description : Memorial Health has an immediate need for a Director of Applications in Savannah, GA. This is a great opportunity for the right candidate to deliver, develop, and implement business and clinical applications. Responsibilities : Responsible for overseeing the performance of several departments focused on different application portfolios and limited software development. Translates client requirements into applications that create business value and actively promote the use of the application to end-users. Assists in the development and implementation of the application strategy and communicates the value of technology / applications to clients. Provides direction and leadership in the review of present IT systems and methods, while forming new and revised systems. Demonstrates creative IT solutions for resolving complex business issues within the organization. Assesses potential technological 
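
The proposal does not yet fix which features will be extracted from the posting text. As one possible starting point, the sketch below builds simple 0/1 keyword flags from a description. The keyword groups in `KEYWORDS` and the function `keyword_features` are hypothetical, chosen only to illustrate the idea, and the sample text is a shortened paraphrase of the first posting above.

```python
import re

# Hypothetical keyword groups, chosen only for illustration.
KEYWORDS = {
    "leadership": r"\b(lead|leads|leadership|direct|directs|direction)\b",
    "strategy": r"\b(strategy|strategies|strategic)\b",
    "experience_years": r"\b\d+\s+years\b",
}

def keyword_features(description):
    """Return a dict of 0/1 flags, one per keyword group."""
    text = description.lower()
    return {
        name: int(re.search(pattern, text) is not None)
        for name, pattern in KEYWORDS.items()
    }

# Shortened paraphrase of the first sample posting.
sample = (
    "This position will oversee and direct leasing policies and strategies. "
    "Qualified candidates should have more than 10 years successful "
    "commercial leasing experience."
)
features = keyword_features(sample)
print(features)
```

Flags like these are easy to inspect and to aggregate at the firm level, although a richer text representation may turn out to be more predictive.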

(2.) COMPUSTAT (North America) Database provided by Wharton Research Data Services

All publicly traded companies in the US are required to track accounting and balance sheet data. The Compustat database therefore provides excellent information on firms. The full Compustat dataset contains 1,842 variables. Potentially relevant variables include:

* North American Industry Classification System code (NAICS)
* Company name (CONM)
* Company legal name (CONML)
* Earnings before interest and taxes (EBIT)
* Earnings per share (EPSFI)
* Book value per share (BKVLPS)
* Market value (MKVALT)
* Net income (NI)

Additional information on stock prices will be added as well (coming soon).

In [8]:
# --- Data Description: Compustat --- 
#
#
# compustat data
df_compustat = pd.read_csv('compustat/compustat_extracted_small.txt',sep='\t')
df_compustat.head(10)

Unnamed: 0,bkvlps,conm,conml,ebit,epsfi,mkvalt,naics,ni
0,21.4697,AAR CORP,AAR Corp,142.36,1.65,485.2897,423860,67.723
1,23.3254,AAR CORP,AAR Corp,136.6,1.38,790.0029,423860,55.0
2,25.2654,AAR CORP,AAR Corp,142.6,1.83,961.308,423860,72.9
3,23.8574,AAR CORP,AAR Corp,-8.6,0.24,1046.3954,423860,10.2
4,25.0847,AAR CORP,AAR Corp,66.1,1.37,842.5112,423860,47.7
5,-23.821,AMERICAN AIRLINES GROUP INC,American Airlines Group Inc,494.0,-5.6,266.5571,481111,-1876.0
6,-10.4608,AMERICAN AIRLINES GROUP INC,American Airlines Group Inc,1935.0,-11.25,6591.9923,481111,-1834.0
7,2.8976,AMERICAN AIRLINES GROUP INC,American Airlines Group Inc,5073.0,3.93,37405.5843,481111,2882.0
8,9.0215,AMERICAN AIRLINES GROUP INC,American Airlines Group Inc,7284.0,11.07,26452.7417,481111,7610.0
9,7.4612,AMERICAN AIRLINES GROUP INC,American Airlines Group Inc,6007.0,4.81,23685.5569,481111,2676.0
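
This proposal does not specify how postings will be linked to Compustat firms. One possible approach, assumed here purely for illustration, is to match on a normalized company name, since the same firm can appear as `AAR CORP`, `AAR Corp`, or with punctuation such as `Oceaneering International, Inc.`. The suffix list `SUFFIXES` and the function `normalize_name` are hypothetical.

```python
import re

# Hypothetical list of legal suffixes to drop; a real list would be longer.
SUFFIXES = {"inc", "corp", "corporation", "co", "company", "llc", "ltd"}

def normalize_name(name):
    """Lowercase, replace punctuation with spaces, and drop trailing legal suffixes."""
    tokens = re.sub(r"[^a-z0-9 ]", " ", name.lower()).split()
    while tokens and tokens[-1] in SUFFIXES:
        tokens.pop()
    return " ".join(tokens)

# Names taken from the sample outputs in this document.
print(normalize_name("AAR CORP"))
print(normalize_name("American Airlines Group Inc"))
print(normalize_name("Oceaneering International, Inc."))
```

Exact matching on normalized names will miss abbreviations and subsidiaries, so the match rate would need to be checked before relying on it.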


Firm performance indicators:
* Change in net income (YoY)
* Change in earnings per share (YoY) 
* Change in stock prices
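
The year-over-year indicators could be computed along the following lines. This is a minimal sketch that assumes the Compustat rows are consecutive fiscal years in order within each firm (the sample output does not show a year column). The values are copied from the first three rows of each firm in the sample above, and the column names `ni_change` and `epsfi_change` are hypothetical.

```python
import pandas as pd

# Values copied from the Compustat sample output (three rows per firm).
# Assumption: rows are consecutive fiscal years, in order, within each firm.
df = pd.DataFrame({
    "conm": ["AAR CORP"] * 3 + ["AMERICAN AIRLINES GROUP INC"] * 3,
    "ni": [67.723, 55.0, 72.9, -1876.0, -1834.0, 2882.0],
    "epsfi": [1.65, 1.38, 1.83, -5.6, -11.25, 3.93],
})

# Year-over-year change within each firm. The first row of each firm
# has no prior year, so its change is NaN.
grouped = df.groupby("conm")
df["ni_change"] = grouped["ni"].diff()
df["epsfi_change"] = grouped["epsfi"].diff()

print(df)
```

The sketch uses level differences rather than percentage changes because net income and earnings per share can be negative, in which case a percentage change is hard to interpret.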