H1B Disclosure Dataset: Predicting the Case Status
Problem Statement:
    The H-1B dataset selected for this project contains data from employers' Labor Condition
Applications and the case certification determinations processed by the Office of Foreign Labor
Certification (OFLC), where the determination was issued on or after October 1,
2016, and on or before June 30, 2017.
    The Labor Condition Application (LCA) is a document that a prospective H-1B employer files
with the U.S. Department of Labor Employment and Training Administration (DOLETA) when it
seeks to employ nonimmigrant workers in a specific occupation in an area of intended
employment for not more than three years.
    The goal of this project is to predict the case status of an application submitted by an employer
to hire nonimmigrant workers under the H-1B visa program. The employer can hire
nonimmigrant workers only after its LCA petition is approved. The approved LCA petition is
then submitted as part of the Petition for a Nonimmigrant Worker application for work
authorization under H-1B visa status.

Download the dataset:
https://drive.google.com/drive/folders/1sIjRnbrIvrDaSkj8TIi-myErAx258iGf?usp=sharing
Data Set Information:
The H-1B dataset from OFLC is described as containing 40 attributes and 528,147 instances.
The CSV file loaded in this notebook reports a shape of (624650, 53).

In [1]:
import pandas as pd
import numpy as np 


In [2]:
df = pd.read_csv("H-1B_Disclosure_Data_FY17.csv")

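Wide CSV exports such as this one can lead pandas to infer mixed types for some columns. The following is a minimal sketch, assuming a tiny in-memory CSV (the two rows and the names `csv_text` and `sample` are illustrative, not taken from the file), of how the `dtype` argument can pin postal codes to strings so that leading zeros survive the load:

```python
import io
import pandas as pd

# Hypothetical two-row sample standing in for the real CSV file.
csv_text = (
    "CASE_STATUS,WORKSITE_POSTAL_CODE\n"
    "WITHDRAWN,07302\n"
    "CERTIFIED-WITHDRAWN,60015\n"
)

# Without dtype=, pandas would parse the postal codes as integers (7302, 60015).
sample = pd.read_csv(
    io.StringIO(csv_text),
    dtype={"WORKSITE_POSTAL_CODE": str},  # keep leading zeros
    low_memory=False,                     # infer types from the whole file, not per chunk
)
print(sample["WORKSITE_POSTAL_CODE"].tolist())
```

The same two arguments could be passed to the `pd.read_csv` call on the real file; which columns need an explicit `dtype` is an assumption to check against the data.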


In [4]:
df

Unnamed: 0.1,Unnamed: 0,CASE_NUMBER,CASE_STATUS,CASE_SUBMITTED,DECISION_DATE,VISA_CLASS,EMPLOYMENT_START_DATE,EMPLOYMENT_END_DATE,EMPLOYER_NAME,EMPLOYER_BUSINESS_DBA,...,H1B_DEPENDENT,WILLFUL_VIOLATOR,SUPPORT_H1B,LABOR_CON_AGREE,PUBLIC_DISCLOSURE_LOCATION,WORKSITE_CITY,WORKSITE_COUNTY,WORKSITE_STATE,WORKSITE_POSTAL_CODE,ORIGINAL_CERT_DATE
0,0,I-200-16055-173457,CERTIFIED-WITHDRAWN,2016-02-24,2016-10-01,H-1B,2016-08-10,2019-08-10,DISCOVER PRODUCTS INC.,,...,N,N,,Y,,RIVERWOODS,LAKE,IL,60015,2016-03-01
1,1,I-200-16064-557834,CERTIFIED-WITHDRAWN,2016-03-04,2016-10-01,H-1B,2016-08-16,2019-08-16,DFS SERVICES LLC,,...,N,N,,Y,,RIVERWOODS,LAKE,IL,60015,2016-03-08
2,2,I-200-16063-996093,CERTIFIED-WITHDRAWN,2016-03-10,2016-10-01,H-1B,2016-09-09,2019-09-09,EASTBANC TECHNOLOGIES LLC,,...,Y,N,Y,,,WASHINGTON,,DC,20007,2016-03-16
3,3,I-200-16272-196340,WITHDRAWN,2016-09-28,2016-10-01,H-1B,2017-01-26,2020-01-25,INFO SERVICES LLC,,...,Y,N,Y,,,JERSEY CITY,HUDSON,NJ,07302,
4,4,I-200-15053-636744,CERTIFIED-WITHDRAWN,2015-02-22,2016-10-02,H-1B,2015-03-01,2018-03-01,BB&T CORPORATION,,...,N,N,,Y,,NEW YORK,NEW YORK,NY,10036,2015-02-26
5,5,I-200-15071-336195,CERTIFIED-WITHDRAWN,2015-03-12,2016-10-02,H-1B,2015-09-11,2018-09-11,"SUNTRUST BANKS, INC.",,...,N,N,,Y,,ATLANTA,FULTON,GA,30303,2015-03-18
6,6,I-200-16056-842817,CERTIFIED-WITHDRAWN,2016-02-25,2016-10-02,H-1B,2016-08-25,2019-08-24,CITADEL INFORMATION SERVICES INC.,CITADEL,...,Y,N,Y,Y,,EDISON,MIDDLESEX,NJ,08837,2016-03-02
7,7,I-200-16056-757335,CERTIFIED-WITHDRAWN,2016-02-25,2016-10-02,H-1B,2016-08-26,2019-08-25,CITADEL INFORMATION SERVICES INC.,CITADEL,...,Y,N,Y,Y,,EDISON,MIDDLESEX,NJ,08837,2016-03-02
8,8,I-200-16058-469533,CERTIFIED-WITHDRAWN,2016-02-27,2016-10-02,H-1B,2016-08-26,2019-08-25,CITADEL INFORMATION SERVICES INC.,CITADEL,...,Y,N,Y,Y,,NEW YORK,NEW YORK,NY,10005,2016-03-03
9,9,I-200-16059-084066,CERTIFIED-WITHDRAWN,2016-02-28,2016-10-02,H-1B,2016-08-29,2019-08-26,CITADEL INFORMATION SERVICES INC.,CITADEL,...,Y,N,Y,Y,,ISELIN,MIDDLESEX,NJ,08830,2016-03-03


In [5]:
df.shape

(624650, 53)

In [6]:
df.CASE_STATUS.value_counts()

CERTIFIED              545694
CERTIFIED-WITHDRAWN     49704
WITHDRAWN               20772
DENIED                   8480
Name: CASE_STATUS, dtype: int64
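The counts above show a strongly imbalanced target. As a quick check, the sketch below turns the four counts (copied from the output above) into percentages; the names `counts`, `total` and `shares` are illustrative.

```python
# Counts copied from the value_counts() output above.
counts = {
    "CERTIFIED": 545694,
    "CERTIFIED-WITHDRAWN": 49704,
    "WITHDRAWN": 20772,
    "DENIED": 8480,
}

total = sum(counts.values())  # should equal the number of rows in df
shares = {status: round(100 * n / total, 2) for status, n in counts.items()}

for status, pct in shares.items():
    print(f"{status:<20} {pct:6.2f}%")
```

Since roughly 87% of cases are CERTIFIED, a model that always predicts CERTIFIED would already reach about that accuracy, so accuracy alone is a weak yardstick here.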

In [11]:
(df.isnull().sum()/len(df))*100

Unnamed: 0                       0.000000
CASE_NUMBER                      0.000000
CASE_STATUS                      0.000000
CASE_SUBMITTED                   0.000000
DECISION_DATE                    0.000000
VISA_CLASS                       0.000000
EMPLOYMENT_START_DATE            0.004643
EMPLOYMENT_END_DATE              0.004803
EMPLOYER_NAME                    0.008965
EMPLOYER_BUSINESS_DBA           93.072921
EMPLOYER_ADDRESS                 0.001121
EMPLOYER_CITY                    0.002401
EMPLOYER_STATE                   0.002882
EMPLOYER_POSTAL_CODE             0.002882
EMPLOYER_COUNTRY                15.449772
EMPLOYER_PROVINCE               99.020892
EMPLOYER_PHONE                  15.449932
EMPLOYER_PHONE_EXT              95.537981
AGENT_REPRESENTING_EMPLOYER     15.449612
AGENT_ATTORNEY_NAME              0.000000
AGENT_ATTORNEY_CITY             43.753462
AGENT_ATTORNEY_STATE            46.208437
JOB_TITLE                        0.000800
SOC_CODE                         0

In [None]:
# df.drop(columns=["Unnamed: 0", "EMPLOYER_BUSINESS_DBA", "ORIGINAL_CERT_DATE", "EMPLOYER_PROVINCE", "EMPLOYER_PHONE_EXT"])

In [15]:
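# Keep only columns with at least 60% non-null values.
# thresh is passed on its own (as an integer): recent pandas versions reject thresh combined with how=.
df1 = df.dropna(thresh=int(df.shape[0] * 0.6), axis=1)
# Drop the leftover index column written by an earlier CSV export.
df1 = df1.drop(columns=["Unnamed: 0"])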
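To make the threshold rule concrete, here is a small sketch on a hypothetical five-row frame (the frame `toy` and its column names are invented for illustration). It computes the missing percentage per column and then keeps only the columns that have at least 60% non-null values:

```python
import numpy as np
import pandas as pd

# Hypothetical frame: "mostly_empty" is 80% missing, "mostly_full" is 20% missing.
toy = pd.DataFrame({
    "full": [1, 2, 3, 4, 5],
    "mostly_full": [1.0, np.nan, 3.0, 4.0, 5.0],
    "mostly_empty": [np.nan, np.nan, np.nan, np.nan, 5.0],
})

# Percentage of missing values per column (equivalent to isnull().sum() / len(toy) * 100).
missing_pct = toy.isnull().mean() * 100

# A column is kept only if it has at least this many non-null values.
min_non_null = int(len(toy) * 0.6)
kept = toy.dropna(thresh=min_non_null, axis=1)

print(missing_pct)
print(kept.columns.tolist())
```

Only `mostly_empty` is dropped, because it has a single non-null value while the threshold requires three.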

In [16]:
df1

Unnamed: 0,CASE_NUMBER,CASE_STATUS,CASE_SUBMITTED,DECISION_DATE,VISA_CLASS,EMPLOYMENT_START_DATE,EMPLOYMENT_END_DATE,EMPLOYER_NAME,EMPLOYER_ADDRESS,EMPLOYER_CITY,...,PW_SOURCE_OTHER,WAGE_RATE_OF_PAY_FROM,WAGE_RATE_OF_PAY_TO,WAGE_UNIT_OF_PAY,H1B_DEPENDENT,WILLFUL_VIOLATOR,WORKSITE_CITY,WORKSITE_COUNTY,WORKSITE_STATE,WORKSITE_POSTAL_CODE
0,I-200-16055-173457,CERTIFIED-WITHDRAWN,2016-02-24,2016-10-01,H-1B,2016-08-10,2019-08-10,DISCOVER PRODUCTS INC.,2500 LAKE COOK ROAD,RIVERWOODS,...,OFLC ONLINE DATA CENTER,65811.00,67320.0,Year,N,N,RIVERWOODS,LAKE,IL,60015
1,I-200-16064-557834,CERTIFIED-WITHDRAWN,2016-03-04,2016-10-01,H-1B,2016-08-16,2019-08-16,DFS SERVICES LLC,2500 LAKE COOK ROAD,RIVERWOODS,...,TOWERS WATSON DATA SERVICES 2015 CSR PROFESSIO...,53000.00,57200.0,Year,N,N,RIVERWOODS,LAKE,IL,60015
2,I-200-16063-996093,CERTIFIED-WITHDRAWN,2016-03-10,2016-10-01,H-1B,2016-09-09,2019-09-09,EASTBANC TECHNOLOGIES LLC,1211 31ST ST. NW,WASHINGTON,...,OFLC ONLINE DATA CENTER,77000.00,0.0,Year,Y,N,WASHINGTON,,DC,20007
3,I-200-16272-196340,WITHDRAWN,2016-09-28,2016-10-01,H-1B,2017-01-26,2020-01-25,INFO SERVICES LLC,17177 NORTH LAUREL PARK DR,LIVONIA,...,OFLC ONLINE DATA CENTER,102000.00,0.0,Year,Y,N,JERSEY CITY,HUDSON,NJ,07302
4,I-200-15053-636744,CERTIFIED-WITHDRAWN,2015-02-22,2016-10-02,H-1B,2015-03-01,2018-03-01,BB&T CORPORATION,223 WEST NASH STREET,WILSON,...,OFLC ONLINE DATA CENTER,132500.00,0.0,Year,N,N,NEW YORK,NEW YORK,NY,10036
5,I-200-15071-336195,CERTIFIED-WITHDRAWN,2015-03-12,2016-10-02,H-1B,2015-09-11,2018-09-11,"SUNTRUST BANKS, INC.","303 PEACHTREE STREET, NE",ATLANTA,...,OFLC ONLINE DATA CENTER,71750.00,0.0,Year,N,N,ATLANTA,FULTON,GA,30303
6,I-200-16056-842817,CERTIFIED-WITHDRAWN,2016-02-25,2016-10-02,H-1B,2016-08-25,2019-08-24,CITADEL INFORMATION SERVICES INC.,33 WOOD AVENUE SOUTH,ISELIN,...,ONLINE WAGE LIBRARY,61000.00,0.0,Year,Y,N,EDISON,MIDDLESEX,NJ,08837
7,I-200-16056-757335,CERTIFIED-WITHDRAWN,2016-02-25,2016-10-02,H-1B,2016-08-26,2019-08-25,CITADEL INFORMATION SERVICES INC.,33 WOOD AVENUE SOUTH,ISELIN,...,ONLINE WAGE LIBRARY,60500.00,0.0,Year,Y,N,EDISON,MIDDLESEX,NJ,08837
8,I-200-16058-469533,CERTIFIED-WITHDRAWN,2016-02-27,2016-10-02,H-1B,2016-08-26,2019-08-25,CITADEL INFORMATION SERVICES INC.,33 WOOD AVENUE SOUTH,ISELIN,...,ONLINE WAGE LIBRARY,60450.00,0.0,Year,Y,N,NEW YORK,NEW YORK,NY,10005
9,I-200-16059-084066,CERTIFIED-WITHDRAWN,2016-02-28,2016-10-02,H-1B,2016-08-29,2019-08-26,CITADEL INFORMATION SERVICES INC.,33 WOOD AVENUE SOUTH,ISELIN,...,ONLINE WAGE LIBRARY,50000.00,0.0,Year,Y,N,ISELIN,MIDDLESEX,NJ,08830


In [17]:
(df1.isnull().sum()/len(df1))*100

CASE_NUMBER                     0.000000
CASE_STATUS                     0.000000
CASE_SUBMITTED                  0.000000
DECISION_DATE                   0.000000
VISA_CLASS                      0.000000
EMPLOYMENT_START_DATE           0.004643
EMPLOYMENT_END_DATE             0.004803
EMPLOYER_NAME                   0.008965
EMPLOYER_ADDRESS                0.001121
EMPLOYER_CITY                   0.002401
EMPLOYER_STATE                  0.002882
EMPLOYER_POSTAL_CODE            0.002882
EMPLOYER_COUNTRY               15.449772
EMPLOYER_PHONE                 15.449932
AGENT_REPRESENTING_EMPLOYER    15.449612
AGENT_ATTORNEY_NAME             0.000000
JOB_TITLE                       0.000800
SOC_CODE                        0.000320
SOC_NAME                        0.000480
NAICS_CODE                      0.001121
TOTAL_WORKERS                   0.000000
NEW_EMPLOYMENT                  0.000000
CONTINUED_EMPLOYMENT            0.000000
CHANGE_PREVIOUS_EMPLOYMENT      0.000000
NEW_CONCURRENT_E

In [18]:
df1.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 624650 entries, 0 to 624649
Data columns (total 43 columns):
CASE_NUMBER                    624650 non-null object
CASE_STATUS                    624650 non-null object
CASE_SUBMITTED                 624650 non-null object
DECISION_DATE                  624650 non-null object
VISA_CLASS                     624650 non-null object
EMPLOYMENT_START_DATE          624621 non-null object
EMPLOYMENT_END_DATE            624620 non-null object
EMPLOYER_NAME                  624594 non-null object
EMPLOYER_ADDRESS               624643 non-null object
EMPLOYER_CITY                  624635 non-null object
EMPLOYER_STATE                 624632 non-null object
EMPLOYER_POSTAL_CODE           624632 non-null object
EMPLOYER_COUNTRY               528143 non-null object
EMPLOYER_PHONE                 528142 non-null object
AGENT_REPRESENTING_EMPLOYER    528144 non-null object
AGENT_ATTORNEY_NAME            624650 non-null object
JOB_TITLE                

In [30]:
df1[["EMPLOYER_COUNTRY","EMPLOYER_PHONE","AGENT_REPRESENTING_EMPLOYER"]].info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 624650 entries, 0 to 624649
Data columns (total 3 columns):
EMPLOYER_COUNTRY               528143 non-null object
EMPLOYER_PHONE                 528142 non-null object
AGENT_REPRESENTING_EMPLOYER    528144 non-null object
dtypes: object(3)
memory usage: 14.3+ MB


In [32]:
df.shape[0]*0.60

374790.0

In [37]:
df.EMPLOYER_PHONE

0          2244050900
1          2244050900
2          2022953000
3          7343776007
4          2522462031
5          4048137888
6          7322380072
7          7322380072
8          7322380072
9          7322380072
10         7322380072
11         6092695555
12         6092695555
13         6092695555
14         6092695555
15         6092695555
16         3175758950
17         5093753959
18         7134920214
19         2547517878
20         4083830155
21         5093753959
22         5034942169
23         7135866500
24         8173543000
25         8314584458
26         7737027752
27         3108251681
28         5132296469
29         5093354508
             ...     
624620            NaN
624621            NaN
624622            NaN
624623            NaN
624624            NaN
624625            NaN
624626            NaN
624627            NaN
624628            NaN
624629            NaN
624630            NaN
624631            NaN
624632            NaN
624633            NaN
624634    

In [38]:
df2 = df1.copy(deep=True)

In [39]:
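# Replace every missing value with 0. Note that this also puts the integer 0
# into string columns, which leaves those columns with mixed str/int values.
df2 = df2.fillna(0)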

In [57]:
df2=df2.drop(columns=["CASE_NUMBER"])

In [66]:
df2["DECISION_DATE"]=pd.to_datetime(df2["DECISION_DATE"])
df2["CASE_SUBMITTED"]=pd.to_datetime(df2["CASE_SUBMITTED"])

In [67]:
# Number of days between case submission and decision (.dt is the pandas datetime accessor).
df2['Time_to_Review'] = (df2['DECISION_DATE']-df2['CASE_SUBMITTED']).dt.days
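The date arithmetic can be checked by hand on two date pairs copied from the rows displayed earlier (the frame name `toy` is illustrative):

```python
import pandas as pd

# Two (submitted, decided) pairs copied from the first rows shown above.
toy = pd.DataFrame({
    "CASE_SUBMITTED": ["2016-02-24", "2016-09-28"],
    "DECISION_DATE": ["2016-10-01", "2016-10-01"],
})
toy["CASE_SUBMITTED"] = pd.to_datetime(toy["CASE_SUBMITTED"])
toy["DECISION_DATE"] = pd.to_datetime(toy["DECISION_DATE"])

# Subtracting datetime columns yields Timedelta values; .dt.days extracts whole days.
toy["Time_to_Review"] = (toy["DECISION_DATE"] - toy["CASE_SUBMITTED"]).dt.days
print(toy["Time_to_Review"].tolist())  # [220, 3]
```

Both values agree with the `Time_to_Review` column shown for the corresponding rows later in the notebook.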

In [70]:
df2=df2.drop(columns=["DECISION_DATE","CASE_SUBMITTED"])

In [71]:
df2.columns

Index(['CASE_STATUS', 'VISA_CLASS', 'EMPLOYMENT_START_DATE',
       'EMPLOYMENT_END_DATE', 'EMPLOYER_NAME', 'EMPLOYER_ADDRESS',
       'EMPLOYER_CITY', 'EMPLOYER_STATE', 'EMPLOYER_POSTAL_CODE',
       'EMPLOYER_COUNTRY', 'EMPLOYER_PHONE', 'AGENT_REPRESENTING_EMPLOYER',
       'AGENT_ATTORNEY_NAME', 'JOB_TITLE', 'SOC_CODE', 'SOC_NAME',
       'NAICS_CODE', 'TOTAL_WORKERS', 'NEW_EMPLOYMENT', 'CONTINUED_EMPLOYMENT',
       'CHANGE_PREVIOUS_EMPLOYMENT', 'NEW_CONCURRENT_EMPLOYMENT',
       'CHANGE_EMPLOYER', 'AMENDED_PETITION', 'FULL_TIME_POSITION',
       'PREVAILING_WAGE', 'PW_UNIT_OF_PAY', 'PW_WAGE_LEVEL', 'PW_SOURCE',
       'PW_SOURCE_YEAR', 'PW_SOURCE_OTHER', 'WAGE_RATE_OF_PAY_FROM',
       'WAGE_RATE_OF_PAY_TO', 'WAGE_UNIT_OF_PAY', 'H1B_DEPENDENT',
       'WILLFUL_VIOLATOR', 'WORKSITE_CITY', 'WORKSITE_COUNTY',
       'WORKSITE_STATE', 'WORKSITE_POSTAL_CODE', 'Time_to_Review'],
      dtype='object')

In [72]:
df2 = df2.drop(columns=[
    'SOC_CODE', 'EMPLOYER_ADDRESS', 'EMPLOYER_CITY',
    'EMPLOYER_STATE',
    'EMPLOYER_COUNTRY',
    'WORKSITE_CITY', 'WORKSITE_COUNTY', 'WORKSITE_STATE',
])

In [73]:
X = df2.drop(columns=["CASE_STATUS"])
y = df2["CASE_STATUS"]

In [87]:
from sklearn.preprocessing import LabelEncoder

In [88]:
le =LabelEncoder()

In [89]:
df2["VISA_CLASS"]=le.fit_transform(df2["VISA_CLASS"])

In [93]:
df2

Unnamed: 0,CASE_STATUS,VISA_CLASS,EMPLOYMENT_START_DATE,EMPLOYMENT_END_DATE,EMPLOYER_NAME,EMPLOYER_POSTAL_CODE,EMPLOYER_PHONE,AGENT_REPRESENTING_EMPLOYER,AGENT_ATTORNEY_NAME,JOB_TITLE,...,PW_SOURCE,PW_SOURCE_YEAR,PW_SOURCE_OTHER,WAGE_RATE_OF_PAY_FROM,WAGE_RATE_OF_PAY_TO,WAGE_UNIT_OF_PAY,H1B_DEPENDENT,WILLFUL_VIOLATOR,WORKSITE_POSTAL_CODE,Time_to_Review
0,CERTIFIED-WITHDRAWN,1,2016-08-10,2019-08-10,DISCOVER PRODUCTS INC.,60015,2244050900,Y,"ELLSWORTH, CHAD",ASSOCIATE DATA INTEGRATION,...,OES,2015.0,OFLC ONLINE DATA CENTER,65811.00,67320.0,Year,N,N,60015,220
1,CERTIFIED-WITHDRAWN,1,2016-08-16,2019-08-16,DFS SERVICES LLC,60015,2244050900,Y,"ELLSWORTH, CHAD",SENIOR ASSOCIATE,...,Other,2015.0,TOWERS WATSON DATA SERVICES 2015 CSR PROFESSIO...,53000.00,57200.0,Year,N,N,60015,211
2,CERTIFIED-WITHDRAWN,1,2016-09-09,2019-09-09,EASTBANC TECHNOLOGIES LLC,20007,2022953000,Y,"BURKE, KAREN",.NET SOFTWARE PROGRAMMER,...,OES,2015.0,OFLC ONLINE DATA CENTER,77000.00,0.0,Year,Y,N,20007,205
3,WITHDRAWN,1,2017-01-26,2020-01-25,INFO SERVICES LLC,48152,7343776007,N,",",PROJECT MANAGER,...,OES,2016.0,OFLC ONLINE DATA CENTER,102000.00,0.0,Year,Y,N,07302,3
4,CERTIFIED-WITHDRAWN,1,2015-03-01,2018-03-01,BB&T CORPORATION,27893,2522462031,Y,"SCOFIELD, EILEEN",ASSOCIATE - ESOTERIC ASSET BACKED SECURITIES,...,OES,2015.0,OFLC ONLINE DATA CENTER,132500.00,0.0,Year,N,N,10036,588
5,CERTIFIED-WITHDRAWN,1,2015-09-11,2018-09-11,"SUNTRUST BANKS, INC.",30308,4048137888,Y,"SCOFIELD, EILEEN",CREDIT RISK METRICS SPECIALIST,...,OES,2015.0,OFLC ONLINE DATA CENTER,71750.00,0.0,Year,N,N,30303,570
6,CERTIFIED-WITHDRAWN,1,2016-08-25,2019-08-24,CITADEL INFORMATION SERVICES INC.,08830,7322380072,N,",",BUSINESS SYSTEMS ANALYST,...,Other,2015.0,ONLINE WAGE LIBRARY,61000.00,0.0,Year,Y,N,08837,220
7,CERTIFIED-WITHDRAWN,1,2016-08-26,2019-08-25,CITADEL INFORMATION SERVICES INC.,08830,7322380072,N,",",PROGRAMMER ANALYST,...,Other,2015.0,ONLINE WAGE LIBRARY,60500.00,0.0,Year,Y,N,08837,220
8,CERTIFIED-WITHDRAWN,1,2016-08-26,2019-08-25,CITADEL INFORMATION SERVICES INC.,08830,7322380072,N,",",PROGRAMMER ANALYST,...,Other,2015.0,ONLINE WAGE LIBRARY,60450.00,0.0,Year,Y,N,10005,218
9,CERTIFIED-WITHDRAWN,1,2016-08-29,2019-08-26,CITADEL INFORMATION SERVICES INC.,08830,7322380072,N,",",MARKET RESEARCH ANALYST,...,Other,2015.0,ONLINE WAGE LIBRARY,50000.00,0.0,Year,Y,N,08830,217


In [94]:
df2["EMPLOYMENT_END_DATE"]=pd.to_datetime(df2["EMPLOYMENT_END_DATE"])
df2["EMPLOYMENT_START_DATE"]=pd.to_datetime(df2["EMPLOYMENT_START_DATE"])

df2['Emp_Time'] = (df2['EMPLOYMENT_END_DATE']-df2['EMPLOYMENT_START_DATE']).dt.days

In [95]:
df2 = df2.drop(columns=["EMPLOYMENT_START_DATE"])

In [103]:
df2= df2.drop(columns=["EMPLOYMENT_END_DATE"])

In [105]:
df2.dtypes

CASE_STATUS                     object
VISA_CLASS                       int64
EMPLOYER_NAME                   object
EMPLOYER_POSTAL_CODE            object
EMPLOYER_PHONE                  object
AGENT_REPRESENTING_EMPLOYER     object
AGENT_ATTORNEY_NAME             object
JOB_TITLE                       object
SOC_NAME                        object
NAICS_CODE                      object
TOTAL_WORKERS                    int64
NEW_EMPLOYMENT                   int64
CONTINUED_EMPLOYMENT             int64
CHANGE_PREVIOUS_EMPLOYMENT       int64
NEW_CONCURRENT_EMPLOYMENT        int64
CHANGE_EMPLOYER                  int64
AMENDED_PETITION                 int64
FULL_TIME_POSITION              object
PREVAILING_WAGE                float64
PW_UNIT_OF_PAY                  object
PW_WAGE_LEVEL                   object
PW_SOURCE                       object
PW_SOURCE_YEAR                 float64
PW_SOURCE_OTHER                 object
WAGE_RATE_OF_PAY_FROM          float64
WAGE_RATE_OF_PAY_TO      

In [108]:
df2["Time_to_Review"]=df2["Time_to_Review"].astype(float)

In [109]:
df2["Emp_Time"]=df2["Emp_Time"].astype(float)

In [115]:
X = df2.drop(columns=["CASE_STATUS"])
y = df2["CASE_STATUS"]

In [118]:
X.dtypes

VISA_CLASS                       int64
EMPLOYER_NAME                   object
EMPLOYER_POSTAL_CODE            object
EMPLOYER_PHONE                  object
AGENT_REPRESENTING_EMPLOYER     object
AGENT_ATTORNEY_NAME             object
JOB_TITLE                       object
SOC_NAME                        object
NAICS_CODE                      object
TOTAL_WORKERS                    int64
NEW_EMPLOYMENT                   int64
CONTINUED_EMPLOYMENT             int64
CHANGE_PREVIOUS_EMPLOYMENT       int64
NEW_CONCURRENT_EMPLOYMENT        int64
CHANGE_EMPLOYER                  int64
AMENDED_PETITION                 int64
FULL_TIME_POSITION              object
PREVAILING_WAGE                float64
PW_UNIT_OF_PAY                  object
PW_WAGE_LEVEL                   object
PW_SOURCE                       object
PW_SOURCE_YEAR                 float64
PW_SOURCE_OTHER                 object
WAGE_RATE_OF_PAY_FROM          float64
WAGE_RATE_OF_PAY_TO            float64
WAGE_UNIT_OF_PAY         

In [110]:
from sklearn.model_selection import train_test_split as tts
X_train, X_test, y_train, y_test = tts(X,y, test_size = 0.3, random_state= 42)
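Because the target classes are heavily imbalanced, one option is to stratify the split so that each part keeps the class ratio. This is a sketch on hypothetical toy data (`X_toy`, `y_toy` and the 80/20 ratio are assumptions for illustration), not a change the notebook itself makes:

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical 20-row target: 16 CERTIFIED and 4 DENIED (an 80/20 imbalance).
y_toy = pd.Series(["CERTIFIED"] * 16 + ["DENIED"] * 4)
X_toy = pd.DataFrame({"feature": range(20)})

# stratify=y_toy asks the splitter to preserve the class ratio in both parts.
X_tr, X_te, y_tr, y_te = train_test_split(
    X_toy, y_toy, test_size=0.25, random_state=42, stratify=y_toy
)

print(y_tr.value_counts().to_dict())
print(y_te.value_counts().to_dict())
```

With stratification, the rare class appears in the test part in the same proportion as in the full data, which keeps evaluation on that class meaningful.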

In [130]:
# DT=DecisionTreeClassifier(max_depth=15, min_samples_leaf=100)

# DT.fit(X_train,y_train)

In [139]:
# One-hot encode the Y/N flag columns.
df_with_dummies = pd.get_dummies(
    df2,
    columns=["AGENT_REPRESENTING_EMPLOYER", "FULL_TIME_POSITION", "H1B_DEPENDENT", "WILLFUL_VIOLATOR"],
    drop_first=False,
)
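As an illustration of what `pd.get_dummies` produces, the sketch below encodes one flag column of a hypothetical three-row frame (the frame `toy` is invented; the column names come from the dataset):

```python
import pandas as pd

# Hypothetical three-row frame with one of the Y/N flag columns.
toy = pd.DataFrame({"H1B_DEPENDENT": ["N", "Y", "Y"], "TOTAL_WORKERS": [1, 2, 1]})

# Each category becomes its own indicator column; drop_first=False keeps both.
encoded = pd.get_dummies(toy, columns=["H1B_DEPENDENT"], drop_first=False)
print(encoded.columns.tolist())
# ['TOTAL_WORKERS', 'H1B_DEPENDENT_N', 'H1B_DEPENDENT_Y']
```

The untouched columns come first and the indicator columns are appended, named `<column>_<category>`.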


In [142]:
df_with_dummies.select_dtypes(include=[object])

Unnamed: 0,CASE_STATUS,EMPLOYER_NAME,EMPLOYER_POSTAL_CODE,EMPLOYER_PHONE,AGENT_ATTORNEY_NAME,JOB_TITLE,SOC_NAME,NAICS_CODE,PW_UNIT_OF_PAY,PW_WAGE_LEVEL,PW_SOURCE,PW_SOURCE_OTHER,WAGE_UNIT_OF_PAY,WORKSITE_POSTAL_CODE
0,CERTIFIED-WITHDRAWN,DISCOVER PRODUCTS INC.,60015,2244050900,"ELLSWORTH, CHAD",ASSOCIATE DATA INTEGRATION,COMPUTER SYSTEMS ANALYSTS,522210,Year,Level I,OES,OFLC ONLINE DATA CENTER,Year,60015
1,CERTIFIED-WITHDRAWN,DFS SERVICES LLC,60015,2244050900,"ELLSWORTH, CHAD",SENIOR ASSOCIATE,OPERATIONS RESEARCH ANALYSTS,522210,Year,0,Other,TOWERS WATSON DATA SERVICES 2015 CSR PROFESSIO...,Year,60015
2,CERTIFIED-WITHDRAWN,EASTBANC TECHNOLOGIES LLC,20007,2022953000,"BURKE, KAREN",.NET SOFTWARE PROGRAMMER,COMPUTER PROGRAMMERS,541511,Year,Level II,OES,OFLC ONLINE DATA CENTER,Year,20007
3,WITHDRAWN,INFO SERVICES LLC,48152,7343776007,",",PROJECT MANAGER,"COMPUTER OCCUPATIONS, ALL OTHER",541511,Year,Level III,OES,OFLC ONLINE DATA CENTER,Year,07302
4,CERTIFIED-WITHDRAWN,BB&T CORPORATION,27893,2522462031,"SCOFIELD, EILEEN",ASSOCIATE - ESOTERIC ASSET BACKED SECURITIES,CREDIT ANALYSTS,522110,Year,Level III,OES,OFLC ONLINE DATA CENTER,Year,10036
5,CERTIFIED-WITHDRAWN,"SUNTRUST BANKS, INC.",30308,4048137888,"SCOFIELD, EILEEN",CREDIT RISK METRICS SPECIALIST,"FINANCIAL SPECIALISTS, ALL OTHER",522110,Year,Level III,OES,OFLC ONLINE DATA CENTER,Year,30303
6,CERTIFIED-WITHDRAWN,CITADEL INFORMATION SERVICES INC.,08830,7322380072,",",BUSINESS SYSTEMS ANALYST,MANAGEMENT ANALYSTS,541511,Year,Level I,Other,ONLINE WAGE LIBRARY,Year,08837
7,CERTIFIED-WITHDRAWN,CITADEL INFORMATION SERVICES INC.,08830,7322380072,",",PROGRAMMER ANALYST,COMPUTER PROGRAMMERS,541511,Year,Level I,Other,ONLINE WAGE LIBRARY,Year,08837
8,CERTIFIED-WITHDRAWN,CITADEL INFORMATION SERVICES INC.,08830,7322380072,",",PROGRAMMER ANALYST,COMPUTER PROGRAMMERS,541511,Year,Level I,Other,ONLINE WAGE LIBRARY,Year,10005
9,CERTIFIED-WITHDRAWN,CITADEL INFORMATION SERVICES INC.,08830,7322380072,",",MARKET RESEARCH ANALYST,MARKET RESEARCH ANALYSTS AND MARKETING SPECIAL...,541511,Year,Level I,Other,ONLINE WAGE LIBRARY,Year,08830


In [137]:
df2.select_dtypes(include=[object])

Unnamed: 0,CASE_STATUS,EMPLOYER_NAME,EMPLOYER_POSTAL_CODE,EMPLOYER_PHONE,AGENT_REPRESENTING_EMPLOYER,AGENT_ATTORNEY_NAME,JOB_TITLE,SOC_NAME,NAICS_CODE,FULL_TIME_POSITION,PW_UNIT_OF_PAY,PW_WAGE_LEVEL,PW_SOURCE,PW_SOURCE_OTHER,WAGE_UNIT_OF_PAY,H1B_DEPENDENT,WILLFUL_VIOLATOR,WORKSITE_POSTAL_CODE
0,CERTIFIED-WITHDRAWN,DISCOVER PRODUCTS INC.,60015,2244050900,Y,"ELLSWORTH, CHAD",ASSOCIATE DATA INTEGRATION,COMPUTER SYSTEMS ANALYSTS,522210,Y,Year,Level I,OES,OFLC ONLINE DATA CENTER,Year,N,N,60015
1,CERTIFIED-WITHDRAWN,DFS SERVICES LLC,60015,2244050900,Y,"ELLSWORTH, CHAD",SENIOR ASSOCIATE,OPERATIONS RESEARCH ANALYSTS,522210,Y,Year,0,Other,TOWERS WATSON DATA SERVICES 2015 CSR PROFESSIO...,Year,N,N,60015
2,CERTIFIED-WITHDRAWN,EASTBANC TECHNOLOGIES LLC,20007,2022953000,Y,"BURKE, KAREN",.NET SOFTWARE PROGRAMMER,COMPUTER PROGRAMMERS,541511,Y,Year,Level II,OES,OFLC ONLINE DATA CENTER,Year,Y,N,20007
3,WITHDRAWN,INFO SERVICES LLC,48152,7343776007,N,",",PROJECT MANAGER,"COMPUTER OCCUPATIONS, ALL OTHER",541511,Y,Year,Level III,OES,OFLC ONLINE DATA CENTER,Year,Y,N,07302
4,CERTIFIED-WITHDRAWN,BB&T CORPORATION,27893,2522462031,Y,"SCOFIELD, EILEEN",ASSOCIATE - ESOTERIC ASSET BACKED SECURITIES,CREDIT ANALYSTS,522110,Y,Year,Level III,OES,OFLC ONLINE DATA CENTER,Year,N,N,10036
5,CERTIFIED-WITHDRAWN,"SUNTRUST BANKS, INC.",30308,4048137888,Y,"SCOFIELD, EILEEN",CREDIT RISK METRICS SPECIALIST,"FINANCIAL SPECIALISTS, ALL OTHER",522110,Y,Year,Level III,OES,OFLC ONLINE DATA CENTER,Year,N,N,30303
6,CERTIFIED-WITHDRAWN,CITADEL INFORMATION SERVICES INC.,08830,7322380072,N,",",BUSINESS SYSTEMS ANALYST,MANAGEMENT ANALYSTS,541511,Y,Year,Level I,Other,ONLINE WAGE LIBRARY,Year,Y,N,08837
7,CERTIFIED-WITHDRAWN,CITADEL INFORMATION SERVICES INC.,08830,7322380072,N,",",PROGRAMMER ANALYST,COMPUTER PROGRAMMERS,541511,Y,Year,Level I,Other,ONLINE WAGE LIBRARY,Year,Y,N,08837
8,CERTIFIED-WITHDRAWN,CITADEL INFORMATION SERVICES INC.,08830,7322380072,N,",",PROGRAMMER ANALYST,COMPUTER PROGRAMMERS,541511,Y,Year,Level I,Other,ONLINE WAGE LIBRARY,Year,Y,N,10005
9,CERTIFIED-WITHDRAWN,CITADEL INFORMATION SERVICES INC.,08830,7322380072,N,",",MARKET RESEARCH ANALYST,MARKET RESEARCH ANALYSTS AND MARKETING SPECIAL...,541511,Y,Year,Level I,Other,ONLINE WAGE LIBRARY,Year,Y,N,08830


In [138]:
df2.select_dtypes(exclude=[object])

Unnamed: 0,VISA_CLASS,TOTAL_WORKERS,NEW_EMPLOYMENT,CONTINUED_EMPLOYMENT,CHANGE_PREVIOUS_EMPLOYMENT,NEW_CONCURRENT_EMPLOYMENT,CHANGE_EMPLOYER,AMENDED_PETITION,PREVAILING_WAGE,PW_SOURCE_YEAR,WAGE_RATE_OF_PAY_FROM,WAGE_RATE_OF_PAY_TO,Time_to_Review,Emp_Time
0,1,1,1,0,0,0,0,0,59197.00,2015.0,65811.00,67320.0,220.0,1095.0
1,1,1,1,0,0,0,0,0,49800.00,2015.0,53000.00,57200.0,211.0,1095.0
2,1,2,2,0,0,0,0,0,76502.00,2015.0,77000.00,0.0,205.0,1095.0
3,1,1,1,0,0,0,0,0,90376.00,2016.0,102000.00,0.0,3.0,1094.0
4,1,1,0,0,0,0,1,0,116605.00,2015.0,132500.00,0.0,588.0,1096.0
5,1,1,1,0,0,0,0,0,59405.00,2015.0,71750.00,0.0,570.0,1096.0
6,1,1,1,0,0,0,0,0,52915.00,2015.0,61000.00,0.0,220.0,1094.0
7,1,1,1,0,0,0,0,0,51730.00,2015.0,60500.00,0.0,220.0,1094.0
8,1,1,1,0,0,0,0,0,58053.00,2015.0,60450.00,0.0,218.0,1094.0
9,1,1,1,0,0,0,0,0,46821.00,2015.0,50000.00,0.0,217.0,1092.0


In [143]:
df2

Unnamed: 0,CASE_STATUS,VISA_CLASS,EMPLOYER_NAME,EMPLOYER_POSTAL_CODE,EMPLOYER_PHONE,AGENT_REPRESENTING_EMPLOYER,AGENT_ATTORNEY_NAME,JOB_TITLE,SOC_NAME,NAICS_CODE,...,PW_SOURCE_YEAR,PW_SOURCE_OTHER,WAGE_RATE_OF_PAY_FROM,WAGE_RATE_OF_PAY_TO,WAGE_UNIT_OF_PAY,H1B_DEPENDENT,WILLFUL_VIOLATOR,WORKSITE_POSTAL_CODE,Time_to_Review,Emp_Time
0,CERTIFIED-WITHDRAWN,1,DISCOVER PRODUCTS INC.,60015,2244050900,Y,"ELLSWORTH, CHAD",ASSOCIATE DATA INTEGRATION,COMPUTER SYSTEMS ANALYSTS,522210,...,2015.0,OFLC ONLINE DATA CENTER,65811.00,67320.0,Year,N,N,60015,220.0,1095.0
1,CERTIFIED-WITHDRAWN,1,DFS SERVICES LLC,60015,2244050900,Y,"ELLSWORTH, CHAD",SENIOR ASSOCIATE,OPERATIONS RESEARCH ANALYSTS,522210,...,2015.0,TOWERS WATSON DATA SERVICES 2015 CSR PROFESSIO...,53000.00,57200.0,Year,N,N,60015,211.0,1095.0
2,CERTIFIED-WITHDRAWN,1,EASTBANC TECHNOLOGIES LLC,20007,2022953000,Y,"BURKE, KAREN",.NET SOFTWARE PROGRAMMER,COMPUTER PROGRAMMERS,541511,...,2015.0,OFLC ONLINE DATA CENTER,77000.00,0.0,Year,Y,N,20007,205.0,1095.0
3,WITHDRAWN,1,INFO SERVICES LLC,48152,7343776007,N,",",PROJECT MANAGER,"COMPUTER OCCUPATIONS, ALL OTHER",541511,...,2016.0,OFLC ONLINE DATA CENTER,102000.00,0.0,Year,Y,N,07302,3.0,1094.0
4,CERTIFIED-WITHDRAWN,1,BB&T CORPORATION,27893,2522462031,Y,"SCOFIELD, EILEEN",ASSOCIATE - ESOTERIC ASSET BACKED SECURITIES,CREDIT ANALYSTS,522110,...,2015.0,OFLC ONLINE DATA CENTER,132500.00,0.0,Year,N,N,10036,588.0,1096.0
5,CERTIFIED-WITHDRAWN,1,"SUNTRUST BANKS, INC.",30308,4048137888,Y,"SCOFIELD, EILEEN",CREDIT RISK METRICS SPECIALIST,"FINANCIAL SPECIALISTS, ALL OTHER",522110,...,2015.0,OFLC ONLINE DATA CENTER,71750.00,0.0,Year,N,N,30303,570.0,1096.0
6,CERTIFIED-WITHDRAWN,1,CITADEL INFORMATION SERVICES INC.,08830,7322380072,N,",",BUSINESS SYSTEMS ANALYST,MANAGEMENT ANALYSTS,541511,...,2015.0,ONLINE WAGE LIBRARY,61000.00,0.0,Year,Y,N,08837,220.0,1094.0
7,CERTIFIED-WITHDRAWN,1,CITADEL INFORMATION SERVICES INC.,08830,7322380072,N,",",PROGRAMMER ANALYST,COMPUTER PROGRAMMERS,541511,...,2015.0,ONLINE WAGE LIBRARY,60500.00,0.0,Year,Y,N,08837,220.0,1094.0
8,CERTIFIED-WITHDRAWN,1,CITADEL INFORMATION SERVICES INC.,08830,7322380072,N,",",PROGRAMMER ANALYST,COMPUTER PROGRAMMERS,541511,...,2015.0,ONLINE WAGE LIBRARY,60450.00,0.0,Year,Y,N,10005,218.0,1094.0
9,CERTIFIED-WITHDRAWN,1,CITADEL INFORMATION SERVICES INC.,08830,7322380072,N,",",MARKET RESEARCH ANALYST,MARKET RESEARCH ANALYSTS AND MARKETING SPECIAL...,541511,...,2015.0,ONLINE WAGE LIBRARY,50000.00,0.0,Year,Y,N,08830,217.0,1092.0


In [144]:
X

Unnamed: 0,VISA_CLASS,EMPLOYER_NAME,EMPLOYER_POSTAL_CODE,EMPLOYER_PHONE,AGENT_REPRESENTING_EMPLOYER,AGENT_ATTORNEY_NAME,JOB_TITLE,SOC_NAME,NAICS_CODE,TOTAL_WORKERS,...,PW_SOURCE_YEAR,PW_SOURCE_OTHER,WAGE_RATE_OF_PAY_FROM,WAGE_RATE_OF_PAY_TO,WAGE_UNIT_OF_PAY,H1B_DEPENDENT,WILLFUL_VIOLATOR,WORKSITE_POSTAL_CODE,Time_to_Review,Emp_Time
0,1,DISCOVER PRODUCTS INC.,60015,2244050900,Y,"ELLSWORTH, CHAD",ASSOCIATE DATA INTEGRATION,COMPUTER SYSTEMS ANALYSTS,522210,1,...,2015.0,OFLC ONLINE DATA CENTER,65811.00,67320.0,Year,N,N,60015,220.0,1095.0
1,1,DFS SERVICES LLC,60015,2244050900,Y,"ELLSWORTH, CHAD",SENIOR ASSOCIATE,OPERATIONS RESEARCH ANALYSTS,522210,1,...,2015.0,TOWERS WATSON DATA SERVICES 2015 CSR PROFESSIO...,53000.00,57200.0,Year,N,N,60015,211.0,1095.0
2,1,EASTBANC TECHNOLOGIES LLC,20007,2022953000,Y,"BURKE, KAREN",.NET SOFTWARE PROGRAMMER,COMPUTER PROGRAMMERS,541511,2,...,2015.0,OFLC ONLINE DATA CENTER,77000.00,0.0,Year,Y,N,20007,205.0,1095.0
3,1,INFO SERVICES LLC,48152,7343776007,N,",",PROJECT MANAGER,"COMPUTER OCCUPATIONS, ALL OTHER",541511,1,...,2016.0,OFLC ONLINE DATA CENTER,102000.00,0.0,Year,Y,N,07302,3.0,1094.0
4,1,BB&T CORPORATION,27893,2522462031,Y,"SCOFIELD, EILEEN",ASSOCIATE - ESOTERIC ASSET BACKED SECURITIES,CREDIT ANALYSTS,522110,1,...,2015.0,OFLC ONLINE DATA CENTER,132500.00,0.0,Year,N,N,10036,588.0,1096.0
5,1,"SUNTRUST BANKS, INC.",30308,4048137888,Y,"SCOFIELD, EILEEN",CREDIT RISK METRICS SPECIALIST,"FINANCIAL SPECIALISTS, ALL OTHER",522110,1,...,2015.0,OFLC ONLINE DATA CENTER,71750.00,0.0,Year,N,N,30303,570.0,1096.0
6,1,CITADEL INFORMATION SERVICES INC.,08830,7322380072,N,",",BUSINESS SYSTEMS ANALYST,MANAGEMENT ANALYSTS,541511,1,...,2015.0,ONLINE WAGE LIBRARY,61000.00,0.0,Year,Y,N,08837,220.0,1094.0
7,1,CITADEL INFORMATION SERVICES INC.,08830,7322380072,N,",",PROGRAMMER ANALYST,COMPUTER PROGRAMMERS,541511,1,...,2015.0,ONLINE WAGE LIBRARY,60500.00,0.0,Year,Y,N,08837,220.0,1094.0
8,1,CITADEL INFORMATION SERVICES INC.,08830,7322380072,N,",",PROGRAMMER ANALYST,COMPUTER PROGRAMMERS,541511,1,...,2015.0,ONLINE WAGE LIBRARY,60450.00,0.0,Year,Y,N,10005,218.0,1094.0
9,1,CITADEL INFORMATION SERVICES INC.,08830,7322380072,N,",",MARKET RESEARCH ANALYST,MARKET RESEARCH ANALYSTS AND MARKETING SPECIAL...,541511,1,...,2015.0,ONLINE WAGE LIBRARY,50000.00,0.0,Year,Y,N,08830,217.0,1092.0


In [145]:
obj_col = X.columns[X.dtypes == 'object']

In [146]:
obj_col

Index(['EMPLOYER_NAME', 'EMPLOYER_POSTAL_CODE', 'EMPLOYER_PHONE',
       'AGENT_REPRESENTING_EMPLOYER', 'AGENT_ATTORNEY_NAME', 'JOB_TITLE',
       'SOC_NAME', 'NAICS_CODE', 'FULL_TIME_POSITION', 'PW_UNIT_OF_PAY',
       'PW_WAGE_LEVEL', 'PW_SOURCE', 'PW_SOURCE_OTHER', 'WAGE_UNIT_OF_PAY',
       'H1B_DEPENDENT', 'WILLFUL_VIOLATOR', 'WORKSITE_POSTAL_CODE'],
      dtype='object')

In [150]:
# for i in obj_col:
#     try:
#         X[i] = le.fit_transform(X[i])
#         print(i)
#     except:
#         print('not categorical')

In [162]:
X = df2.drop(columns=["CASE_STATUS"])
y = df2["CASE_STATUS"]

In [160]:
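# Try to label-encode every object column; print the column name on success, 'FO' on failure.
# Columns that mix strings with the integer 0 (left by the NaN replacement) cannot be sorted by the encoder.
for i in X.select_dtypes(include=[object]).columns:
    try:
        X[i] = le.fit_transform(X[i])
        print(i)
    except TypeError:
        print('FO')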

FO
FO
FO
FO
AGENT_ATTORNEY_NAME
FO
FO
FO
FO
FO
FO
FO
FO
FO
FO
FO
FO
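The loop above succeeds only for AGENT_ATTORNEY_NAME, the one object column with no missing values. A likely cause, assuming the failures stem from the earlier replacement of NaN by the integer 0, is that `LabelEncoder` has to sort the unique values and cannot order strings against integers. The sketch below reproduces this on a hypothetical four-value column (`mixed` is invented for illustration) and shows one possible workaround, casting to `str` first:

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical column: a missing value was replaced by the integer 0.
mixed = pd.Series(["Y", "N", 0, "Y"], dtype=object)

le = LabelEncoder()
try:
    le.fit_transform(mixed)
    failed = False
except TypeError:
    failed = True  # str and int cannot be ordered against each other

# Casting to str first gives the encoder a uniform type to sort.
codes = le.fit_transform(mixed.astype(str))
print(failed, list(codes), list(le.classes_))
```

After the cast, the placeholder becomes the string "0" and is treated as its own category. Whether that is the desired handling of missing values is a modelling decision, not something the encoder decides.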


In [161]:
X

Unnamed: 0,VISA_CLASS,EMPLOYER_NAME,EMPLOYER_POSTAL_CODE,EMPLOYER_PHONE,AGENT_REPRESENTING_EMPLOYER,AGENT_ATTORNEY_NAME,JOB_TITLE,SOC_NAME,NAICS_CODE,TOTAL_WORKERS,...,PW_SOURCE_YEAR,PW_SOURCE_OTHER,WAGE_RATE_OF_PAY_FROM,WAGE_RATE_OF_PAY_TO,WAGE_UNIT_OF_PAY,H1B_DEPENDENT,WILLFUL_VIOLATOR,WORKSITE_POSTAL_CODE,Time_to_Review,Emp_Time
0,1,DISCOVER PRODUCTS INC.,60015,2244050900,Y,1580,ASSOCIATE DATA INTEGRATION,COMPUTER SYSTEMS ANALYSTS,522210,1,...,2015.0,OFLC ONLINE DATA CENTER,65811.00,67320.0,Year,N,N,60015,220.0,1095.0
1,1,DFS SERVICES LLC,60015,2244050900,Y,1580,SENIOR ASSOCIATE,OPERATIONS RESEARCH ANALYSTS,522210,1,...,2015.0,TOWERS WATSON DATA SERVICES 2015 CSR PROFESSIO...,53000.00,57200.0,Year,N,N,60015,211.0,1095.0
2,1,EASTBANC TECHNOLOGIES LLC,20007,2022953000,Y,699,.NET SOFTWARE PROGRAMMER,COMPUTER PROGRAMMERS,541511,2,...,2015.0,OFLC ONLINE DATA CENTER,77000.00,0.0,Year,Y,N,20007,205.0,1095.0
3,1,INFO SERVICES LLC,48152,7343776007,N,0,PROJECT MANAGER,"COMPUTER OCCUPATIONS, ALL OTHER",541511,1,...,2016.0,OFLC ONLINE DATA CENTER,102000.00,0.0,Year,Y,N,07302,3.0,1094.0
4,1,BB&T CORPORATION,27893,2522462031,Y,5254,ASSOCIATE - ESOTERIC ASSET BACKED SECURITIES,CREDIT ANALYSTS,522110,1,...,2015.0,OFLC ONLINE DATA CENTER,132500.00,0.0,Year,N,N,10036,588.0,1096.0
5,1,"SUNTRUST BANKS, INC.",30308,4048137888,Y,5254,CREDIT RISK METRICS SPECIALIST,"FINANCIAL SPECIALISTS, ALL OTHER",522110,1,...,2015.0,OFLC ONLINE DATA CENTER,71750.00,0.0,Year,N,N,30303,570.0,1096.0
6,1,CITADEL INFORMATION SERVICES INC.,08830,7322380072,N,0,BUSINESS SYSTEMS ANALYST,MANAGEMENT ANALYSTS,541511,1,...,2015.0,ONLINE WAGE LIBRARY,61000.00,0.0,Year,Y,N,08837,220.0,1094.0
7,1,CITADEL INFORMATION SERVICES INC.,08830,7322380072,N,0,PROGRAMMER ANALYST,COMPUTER PROGRAMMERS,541511,1,...,2015.0,ONLINE WAGE LIBRARY,60500.00,0.0,Year,Y,N,08837,220.0,1094.0
8,1,CITADEL INFORMATION SERVICES INC.,08830,7322380072,N,0,PROGRAMMER ANALYST,COMPUTER PROGRAMMERS,541511,1,...,2015.0,ONLINE WAGE LIBRARY,60450.00,0.0,Year,Y,N,10005,218.0,1094.0
9,1,CITADEL INFORMATION SERVICES INC.,08830,7322380072,N,0,MARKET RESEARCH ANALYST,MARKET RESEARCH ANALYSTS AND MARKETING SPECIAL...,541511,1,...,2015.0,ONLINE WAGE LIBRARY,50000.00,0.0,Year,Y,N,08830,217.0,1092.0


In [163]:
X

Unnamed: 0,VISA_CLASS,EMPLOYER_NAME,EMPLOYER_POSTAL_CODE,EMPLOYER_PHONE,AGENT_REPRESENTING_EMPLOYER,AGENT_ATTORNEY_NAME,JOB_TITLE,SOC_NAME,NAICS_CODE,TOTAL_WORKERS,...,PW_SOURCE_YEAR,PW_SOURCE_OTHER,WAGE_RATE_OF_PAY_FROM,WAGE_RATE_OF_PAY_TO,WAGE_UNIT_OF_PAY,H1B_DEPENDENT,WILLFUL_VIOLATOR,WORKSITE_POSTAL_CODE,Time_to_Review,Emp_Time
0,1,DISCOVER PRODUCTS INC.,60015,2244050900,Y,"ELLSWORTH, CHAD",ASSOCIATE DATA INTEGRATION,COMPUTER SYSTEMS ANALYSTS,522210,1,...,2015.0,OFLC ONLINE DATA CENTER,65811.00,67320.0,Year,N,N,60015,220.0,1095.0
1,1,DFS SERVICES LLC,60015,2244050900,Y,"ELLSWORTH, CHAD",SENIOR ASSOCIATE,OPERATIONS RESEARCH ANALYSTS,522210,1,...,2015.0,TOWERS WATSON DATA SERVICES 2015 CSR PROFESSIO...,53000.00,57200.0,Year,N,N,60015,211.0,1095.0
2,1,EASTBANC TECHNOLOGIES LLC,20007,2022953000,Y,"BURKE, KAREN",.NET SOFTWARE PROGRAMMER,COMPUTER PROGRAMMERS,541511,2,...,2015.0,OFLC ONLINE DATA CENTER,77000.00,0.0,Year,Y,N,20007,205.0,1095.0
3,1,INFO SERVICES LLC,48152,7343776007,N,",",PROJECT MANAGER,"COMPUTER OCCUPATIONS, ALL OTHER",541511,1,...,2016.0,OFLC ONLINE DATA CENTER,102000.00,0.0,Year,Y,N,07302,3.0,1094.0
4,1,BB&T CORPORATION,27893,2522462031,Y,"SCOFIELD, EILEEN",ASSOCIATE - ESOTERIC ASSET BACKED SECURITIES,CREDIT ANALYSTS,522110,1,...,2015.0,OFLC ONLINE DATA CENTER,132500.00,0.0,Year,N,N,10036,588.0,1096.0
5,1,"SUNTRUST BANKS, INC.",30308,4048137888,Y,"SCOFIELD, EILEEN",CREDIT RISK METRICS SPECIALIST,"FINANCIAL SPECIALISTS, ALL OTHER",522110,1,...,2015.0,OFLC ONLINE DATA CENTER,71750.00,0.0,Year,N,N,30303,570.0,1096.0
6,1,CITADEL INFORMATION SERVICES INC.,08830,7322380072,N,",",BUSINESS SYSTEMS ANALYST,MANAGEMENT ANALYSTS,541511,1,...,2015.0,ONLINE WAGE LIBRARY,61000.00,0.0,Year,Y,N,08837,220.0,1094.0
7,1,CITADEL INFORMATION SERVICES INC.,08830,7322380072,N,",",PROGRAMMER ANALYST,COMPUTER PROGRAMMERS,541511,1,...,2015.0,ONLINE WAGE LIBRARY,60500.00,0.0,Year,Y,N,08837,220.0,1094.0
8,1,CITADEL INFORMATION SERVICES INC.,08830,7322380072,N,",",PROGRAMMER ANALYST,COMPUTER PROGRAMMERS,541511,1,...,2015.0,ONLINE WAGE LIBRARY,60450.00,0.0,Year,Y,N,10005,218.0,1094.0
9,1,CITADEL INFORMATION SERVICES INC.,08830,7322380072,N,",",MARKET RESEARCH ANALYST,MARKET RESEARCH ANALYSTS AND MARKETING SPECIAL...,541511,1,...,2015.0,ONLINE WAGE LIBRARY,50000.00,0.0,Year,Y,N,08830,217.0,1092.0
