# Exploratory Data Analysis 

### Problem Statement:
- To handle the challenges in importing the "job_postings" table into MySQL Workbench.
- To analyse the 32 problematic rows out of a total of 15887 rows.

### Task Description:

- The job postings table is part of the LinkedIn dataset and contains 15887 rows.

- While importing the file into MySQL Workbench, several issues were encountered: unreadable ASCII characters, improper column separators, NULL rows, improper datatypes, etc.

- To import the table successfully, the issues persisting in the 32 rows need to be cleaned up so that the data can be analysed effectively.
 
### Solution:

- For a successful table import into MySQL Workbench, the following were handled:
    - Analysis of the log files generated by MySQL Workbench while attempting the import
    - Removal of unnecessary ASCII characters
    - Handling of missing data (NULL values) by assigning a meaningful/most relevant value
    - Analysis of the datatypes in the raw dataset, with type casting wherever necessary
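
The character clean-up step can be sketched as follows. This is a minimal illustration, not necessarily the exact procedure used here: the helper name `strip_unreadable` and the rule "keep printable ASCII plus tab, line feed and carriage return" are assumptions.

```python
import re

# Assumption: "unreadable" means anything outside printable ASCII,
# except tab, line feed and carriage return, which are kept.
_UNREADABLE = re.compile(r"[^\x20-\x7E\t\n\r]")

def strip_unreadable(text: str) -> str:
    """Return text with control characters and non-ASCII characters removed."""
    return _UNREADABLE.sub("", text)

# Made-up sample containing a NUL byte, an en dash and a BEL character.
sample = "Buyer/Planner\x00 I \u2013 Florence\x07, KY"
cleaned = strip_unreadable(sample)
print(cleaned)
```

Note that this rule also drops legitimate non-ASCII characters (such as the en dash in the sample), so it is only appropriate if the target column is expected to be plain ASCII.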

In [1]:
%matplotlib inline
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
import csv
import re

### Running the code below splits the input file into 5 CSV files and writes them to the current working directory

In [2]:
input_file = 'job_posting_RowsHavingIssueInImport.csv'
df = pd.read_csv(input_file)

total_rows = len(df)
rows_per_part = total_rows // 5

parts = []
for i in range(5):
    start_idx = i * rows_per_part
    end_idx = (i + 1) * rows_per_part if i < 4 else None
    part = df.iloc[start_idx:end_idx]
    parts.append(part)

for i, part_df in enumerate(parts):
    output_file = f'JobPostings_5part_{i + 1}.csv'
    part_df.to_csv(output_file, index=False)

print("CSV file split into 5 parts evenly.")


CSV file split into 5 parts evenly.
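
With integer division, the split above gives the remainder rows to the last file (for 32 rows: four parts of 6 rows and one of 8). If more even part sizes are wanted, one option is to split the row positions with `numpy.array_split`. The sketch below uses a small stand-in DataFrame (`demo_df`, hypothetical) rather than the real CSV.

```python
import numpy as np
import pandas as pd

# Stand-in for the 32-row problem file (hypothetical data).
demo_df = pd.DataFrame({"job_id": range(32)})

# Split the row positions, not the frame itself, into 5 near-equal chunks.
position_chunks = np.array_split(np.arange(len(demo_df)), 5)
demo_parts = [demo_df.iloc[chunk] for chunk in position_chunks]

part_sizes = [len(part) for part in demo_parts]
print(part_sizes)
```

Each element of `demo_parts` can then be written out with `to_csv` in the same way as in the loop above.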


### Handling the error in the posting_domain column (row 1)

In [3]:
# Checking on the job_postings table from the raw dataset.
df2 = pd.read_csv('job_postings_raw.csv')
partial_value_filter = 'Regal' 
filtered_df = df2[df2['posting_domain'].str.contains(partial_value_filter, case=False, na=False)]
filtered_df

Unnamed: 0,job_id,company_id,title,description,max_salary,med_salary,min_salary,pay_period,formatted_work_type,location,...,expiry,closed_time,formatted_experience_level,skills_desc,listed_time,posting_domain,sponsored,work_type,currency,compensation_type
7680,3697350683,80911138.0,Buyer/Planner I,Position Summary\nThe Regal Rexnord Couplings ...,,,,,Full-time,"Florence, KY",...,1700000000000.0,,Entry level,,1690000000000.0,regalrexnord.wd1.myworkdayjobs.com,0,FULL_TIME,,
7703,3697351605,80911138.0,Data Analyst III,Provides analysis to different functions such ...,,,,,Full-time,"Chicago, IL",...,1700000000000.0,,Mid-Senior level,,1690000000000.0,regalrexnord.wd1.myworkdayjobs.com,0,FULL_TIME,,
7738,3697353299,80911138.0,"Manager, Site Controller",Key Accountabilities\nPerform all FP&A reporti...,,,,,Full-time,"Stuarts Draft, VA",...,1700000000000.0,,Mid-Senior level,,1690000000000.0,regalrexnord.wd1.myworkdayjobs.com,0,FULL_TIME,,
7739,3697353300,80911138.0,Assembler,Position Summary\nDisassemble custom and/or st...,,,,,Full-time,"Milwaukee, WI",...,1700000000000.0,,Entry level,,1690000000000.0,regalrexnord.wd1.myworkdayjobs.com,1,FULL_TIME,,
7781,3697354194,80911138.0,Fulfillment Center Operations Manager,The Fulfillment Center Operations Manager I in...,,,,,Full-time,"Indianapolis, IN",...,1700000000000.0,,Mid-Senior level,,1690000000000.0,regalrexnord.wd1.myworkdayjobs.com,0,FULL_TIME,,
7783,3697354197,80911138.0,"Global Category Manager, Laminations",Scope of Role:\nThis position reports to the D...,,,,,Full-time,"Rosemont, IL",...,1700000000000.0,,Mid-Senior level,,1690000000000.0,regalrexnord.wd1.myworkdayjobs.com,0,FULL_TIME,,
7823,3697355154,80911138.0,Sr. Financial Analyst,Summary\nReporting to the Thomson Director FP&...,,,,,Full-time,"Marengo, IL",...,1700000000000.0,,Mid-Senior level,,1690000000000.0,regalrexnord.wd1.myworkdayjobs.com,0,FULL_TIME,,
11406,3699098832,80911138.0,Area Account Manager - Dallas,This position will be responsible for working ...,,,,,Full-time,"Dallas, TX",...,1700000000000.0,,Mid-Senior level,,1690000000000.0,regalrexnord.wd1.myworkdayjobs.com,0,FULL_TIME,,
11407,3699098833,80911138.0,Area Account Manager - Baltimore,This position will be responsible for working ...,,,,,Full-time,"Baltimore, MD",...,1700000000000.0,,Mid-Senior level,,1690000000000.0,regalrexnord.wd1.myworkdayjobs.com,0,FULL_TIME,,
11547,3699403577,80911138.0,Inside Sales Specialist - OEM,The Inside Sales Specialist is responsible for...,,,,,Full-time,"Florence, KY",...,1700000000000.0,,Entry level,,1690000000000.0,regalrexnord.wd1.myworkdayjobs.com,0,FULL_TIME,,


In [4]:
# Applying the change to the CSV file containing the 32 problematic rows (job_posting_RowsHavingIssueInImport).

row_index = 0
column_name = 'posting_domain'
new_value = 'www.regalrexnord.wd1.myworkdayjobs.com'
df.at[row_index, column_name] = new_value
print("\nDataFrame with Updated Cell:")
print(df)


DataFrame with Updated Cell:
        job_id  company_id                                              title  \
0   3697354194    80911138              Fulfillment Center Operations Manager   
1   3693583923     3681497                        OFFICE TECHNICIAN (GENERAL)   
2   3693584867     3681497                        OFFICE TECHNICIAN (GENERAL)   
3   3693585813     3681497                        OFFICE TECHNICIAN (GENERAL)   
4   3693590054     3681497                        OFFICE TECHNICIAN (GENERAL)   
5   3694100381     2972588  ***Chatham Hiring Event*** Youth Counselor - A...   
6   3697352992      164885                                   Project Engineer   
7   3697359103      164885               Engineer - Power Systems Maintenance   
8   3699076815       88016  Patient Access Associate - Full Time Evenings ...   
9   3699078675       88016  Registered Nurse Medical Surgical Telemetry Pa...   
10  3699078713       88016  Registered Nurse Medical Surgical Telemetry FT...  

In [5]:
df.head()

Unnamed: 0,job_id,company_id,title,description,max_salary,med_salary,min_salary,pay_period,formatted_work_type,location,...,expiry,closed_time,formatted_experience_level,skills_desc,listed_time,posting_domain,sponsored,work_type,currency,compensation_type
0,3697354194,80911138,Fulfillment Center Operations Manager,The Fulfillment Center Operations Manager I in...,0,0.0,0,,Full-time,"Indianapolis, IN",...,1700000000000.0,0,Mid-Senior level,,1690000000000.0,www.regalrexnord.wd1.myworkdayjobs.com,0,FULL_TIME,,
1,3693583923,3681497,OFFICE TECHNICIAN (GENERAL),Equal Opportunity Employer\nThe State of Calif...,4145,0.0,3308,MONTHLY,Full-time,"Glendale, CA",...,1700000000000.0,0,Entry level,,1690000000000.0,www.calcareers.ca.gov,0,FULL_TIME,USD,BASE_SALARY
2,3693584867,3681497,OFFICE TECHNICIAN (GENERAL),Equal Opportunity Employer\nThe State of Calif...,4145,0.0,3308,MONTHLY,Full-time,"Long Beach, CA",...,1700000000000.0,0,Entry level,,1690000000000.0,www.calcareers.ca.gov,0,FULL_TIME,USD,BASE_SALARY
3,3693585813,3681497,OFFICE TECHNICIAN (GENERAL),Equal Opportunity Employer\nThe State of Calif...,4145,0.0,3308,MONTHLY,Full-time,"Los Angeles, CA",...,1700000000000.0,0,Entry level,,1690000000000.0,www.calcareers.ca.gov,0,FULL_TIME,USD,BASE_SALARY
4,3693590054,3681497,OFFICE TECHNICIAN (GENERAL),Equal Opportunity Employer\nThe State of Calif...,4145,0.0,3308,MONTHLY,Full-time,"Los Angeles, California, United States",...,1700000000000.0,0,Entry level,,1690000000000.0,www.calcareers.ca.gov,0,FULL_TIME,USD,BASE_SALARY


In [6]:
df['application_url']

0     https://regalrexnord.wd1.myworkdayjobs.com/Car...
1     https://www.calcareers.ca.gov/CalHrPublic/Jobs...
2     https://www.calcareers.ca.gov/CalHrPublic/Jobs...
3     https://www.calcareers.ca.gov/CalHrPublic/Jobs...
4     https://www.calcareers.ca.gov/CalHrPublic/Jobs...
5     https://www.governmentjobs.com/careers/northca...
6     https://www.governmentjobs.com/careers/mbta/jo...
7     https://www.governmentjobs.com/careers/mbta/jo...
8     https://jobs.silkroad.com/licommunityhospital/...
9     https://jobs.silkroad.com/licommunityhospital/...
10    https://jobs.silkroad.com/licommunityhospital/...
11    https://jobs.silkroad.com/licommunityhospital/...
12    https://jobs.silkroad.com/licommunityhospital/...
13    https://jobs.silkroad.com/licommunityhospital/...
14    https://jobs.silkroad.com/licommunityhospital/...
15    https://jobs.silkroad.com/licommunityhospital/...
16    https://jobs.silkroad.com/licommunityhospital/...
17    https://jobs.silkroad.com/licommunityhospi

In [7]:
# Applying the change to the CSV file containing the 32 problematic rows (job_posting_RowsHavingIssueInImport).

row_index = 0
column_name = 'application_url'
new_value = 'https://www.regalrexnord.wd1.myworkdayjobs.com/Car...'
df.at[row_index, column_name] = new_value
print("\nDataFrame with Updated Cell")



DataFrame with Updated Cell


In [8]:
df['application_url']

0     https://www.regalrexnord.wd1.myworkdayjobs.com...
1     https://www.calcareers.ca.gov/CalHrPublic/Jobs...
2     https://www.calcareers.ca.gov/CalHrPublic/Jobs...
3     https://www.calcareers.ca.gov/CalHrPublic/Jobs...
4     https://www.calcareers.ca.gov/CalHrPublic/Jobs...
5     https://www.governmentjobs.com/careers/northca...
6     https://www.governmentjobs.com/careers/mbta/jo...
7     https://www.governmentjobs.com/careers/mbta/jo...
8     https://jobs.silkroad.com/licommunityhospital/...
9     https://jobs.silkroad.com/licommunityhospital/...
10    https://jobs.silkroad.com/licommunityhospital/...
11    https://jobs.silkroad.com/licommunityhospital/...
12    https://jobs.silkroad.com/licommunityhospital/...
13    https://jobs.silkroad.com/licommunityhospital/...
14    https://jobs.silkroad.com/licommunityhospital/...
15    https://jobs.silkroad.com/licommunityhospital/...
16    https://jobs.silkroad.com/licommunityhospital/...
17    https://jobs.silkroad.com/licommunityhospi

In [9]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 32 entries, 0 to 31
Data columns (total 27 columns):
 #   Column                      Non-Null Count  Dtype  
---  ------                      --------------  -----  
 0   job_id                      32 non-null     int64  
 1   company_id                  32 non-null     int64  
 2   title                       32 non-null     object 
 3   description                 32 non-null     object 
 4   max_salary                  32 non-null     int64  
 5   med_salary                  32 non-null     float64
 6   min_salary                  32 non-null     int64  
 7   pay_period                  14 non-null     object 
 8   formatted_work_type         32 non-null     object 
 9   location                    32 non-null     object 
 10  applies                     32 non-null     int64  
 11  original_listed_time        32 non-null     float64
 12  remote_allowed              32 non-null     int64  
 13  views                       32 non-nu

In [10]:
df['max_salary']

0          0
1       4145
2       4145
3       4145
4       4145
5      64194
6          0
7          0
8          0
9          0
10         0
11         0
12         0
13         0
14         0
15         0
16         0
17         0
18         0
19         0
20         0
21         0
22    143175
23        67
24         0
25    133120
26         0
27    101764
28         0
29         0
30    150000
31    191000
Name: max_salary, dtype: int64

In [11]:
df2 = pd.read_csv('job_postings_raw.csv')

In [12]:
df2

Unnamed: 0,job_id,company_id,title,description,max_salary,med_salary,min_salary,pay_period,formatted_work_type,location,...,expiry,closed_time,formatted_experience_level,skills_desc,listed_time,posting_domain,sponsored,work_type,currency,compensation_type
0,85008768,,Licensed Insurance Agent,While many industries were hurt by the last fe...,52000.0,,45760.0,YEARLY,Full-time,"Chico, CA",...,1.710000e+12,,,,1.690000e+12,,1,FULL_TIME,USD,BASE_SALARY
1,133114754,77766802.0,Sales Manager,Are you a dynamic and creative marketing profe...,,,,,Full-time,"Santa Clarita, CA",...,1.700000e+12,,,,1.690000e+12,,0,FULL_TIME,,
2,133196985,1089558.0,Model Risk Auditor,Join Us as a Model Risk Auditor – Showcase You...,,,,,Contract,"New York, NY",...,1.700000e+12,,,,1.690000e+12,,0,CONTRACT,,
3,381055942,96654609.0,Business Manager,Business ManagerFirst Baptist Church ForneyFor...,,,,,Full-time,"Forney, TX",...,1.700000e+12,,,,1.690000e+12,,0,FULL_TIME,,
4,529257371,1244539.0,NY Studio Assistant,YOU COULD BE ONE OF THE MAGIC MAKERS\nKen Fulk...,,,,,Full-time,"New York, NY",...,1.710000e+12,,,,1.690000e+12,,1,FULL_TIME,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
15881,3701373516,74718032.0,Sanitation Technician,"Location:\n\nWest Columbia, SC, US, 29172\n\n2...",,,,,Part-time,"West Columbia, SC",...,1.700000e+12,,Entry level,,1.690000e+12,aspirebakeriescareers.com,0,PART_TIME,,
15882,3701373522,38897.0,Unit Secretary,Job Title: Unit Secretary\nDepartment: Nursing...,,,,,Full-time,"Teaneck, NJ",...,1.700000e+12,,Entry level,,1.690000e+12,recruiting.ultipro.com,0,FULL_TIME,,
15883,3701373523,38897.0,"Radiology Aide, Perdiem","Job Title: Radiology Aide, Perdiem\nDepartment...",,,,,Part-time,"Teaneck, NJ",...,1.700000e+12,,Entry level,,1.690000e+12,recruiting.ultipro.com,0,PART_TIME,,
15884,3701373524,2623.0,MRI Manager,Grade 105\nJob Type: Officer of Administration...,135000.0,,110000.0,YEARLY,Full-time,"New York, NY",...,1.700000e+12,,Mid-Senior level,,1.690000e+12,opportunities.columbia.edu,0,FULL_TIME,USD,BASE_SALARY


In [13]:
df2['med_salary'].mode()

0    201182.0
Name: med_salary, dtype: float64

In [14]:
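# Replace missing skills_desc values with the literal string 'NaN' so the column contains no NULLs.
df['skills_desc'] = df['skills_desc'].fillna('NaN')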

In [15]:
df

Unnamed: 0,job_id,company_id,title,description,max_salary,med_salary,min_salary,pay_period,formatted_work_type,location,...,expiry,closed_time,formatted_experience_level,skills_desc,listed_time,posting_domain,sponsored,work_type,currency,compensation_type
0,3697354194,80911138,Fulfillment Center Operations Manager,The Fulfillment Center Operations Manager I in...,0,0.0,0,,Full-time,"Indianapolis, IN",...,1700000000000.0,0,Mid-Senior level,,1690000000000.0,www.regalrexnord.wd1.myworkdayjobs.com,0,FULL_TIME,,
1,3693583923,3681497,OFFICE TECHNICIAN (GENERAL),Equal Opportunity Employer\nThe State of Calif...,4145,0.0,3308,MONTHLY,Full-time,"Glendale, CA",...,1700000000000.0,0,Entry level,,1690000000000.0,www.calcareers.ca.gov,0,FULL_TIME,USD,BASE_SALARY
2,3693584867,3681497,OFFICE TECHNICIAN (GENERAL),Equal Opportunity Employer\nThe State of Calif...,4145,0.0,3308,MONTHLY,Full-time,"Long Beach, CA",...,1700000000000.0,0,Entry level,,1690000000000.0,www.calcareers.ca.gov,0,FULL_TIME,USD,BASE_SALARY
3,3693585813,3681497,OFFICE TECHNICIAN (GENERAL),Equal Opportunity Employer\nThe State of Calif...,4145,0.0,3308,MONTHLY,Full-time,"Los Angeles, CA",...,1700000000000.0,0,Entry level,,1690000000000.0,www.calcareers.ca.gov,0,FULL_TIME,USD,BASE_SALARY
4,3693590054,3681497,OFFICE TECHNICIAN (GENERAL),Equal Opportunity Employer\nThe State of Calif...,4145,0.0,3308,MONTHLY,Full-time,"Los Angeles, California, United States",...,1700000000000.0,0,Entry level,,1690000000000.0,www.calcareers.ca.gov,0,FULL_TIME,USD,BASE_SALARY
5,3694100381,2972588,***Chatham Hiring Event*** Youth Counselor - A...,"Description of Work\nAs of August 16, 2023, th...",64194,0.0,44500,YEARLY,Full-time,"Chatham County, NC",...,1700000000000.0,0,Mid-Senior level,,1690000000000.0,www.governmentjobs.com,0,FULL_TIME,USD,BASE_SALARY
6,3697352992,164885,Project Engineer,"At the MBTA, we envision a thriving region ena...",0,0.0,0,,Full-time,"Boston, MA",...,1700000000000.0,0,Entry level,,1690000000000.0,www.governmentjobs.com,1,FULL_TIME,,
7,3697359103,164885,Engineer - Power Systems Maintenance,"At the MBTA, we envision a thriving region ena...",0,0.0,0,,Full-time,"Boston, MA",...,1700000000000.0,0,Entry level,,1690000000000.0,www.governmentjobs.com,0,FULL_TIME,,
8,3699076815,88016,Patient Access Associate - Full Time Evenings ...,"Research --> Support Staff\nPatchogue, NY\nID:...",0,0.0,0,,Full-time,"Patchogue, NY",...,1700000000000.0,0,Entry level,,1690000000000.0,jobs.silkroad.com,0,FULL_TIME,,
9,3699078675,88016,Registered Nurse Medical Surgical Telemetry Pa...,"Nursing --> Nursing\nPatchogue, NY\nID: 111608...",0,0.0,0,,Part-time,"Patchogue, NY",...,1700000000000.0,0,Mid-Senior level,,1690000000000.0,jobs.silkroad.com,0,PART_TIME,,


In [16]:
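# Select rows where all three salary columns are missing (NaN).
condition = (df['max_salary'].isna()) & (df['min_salary'].isna()) & (df['med_salary'].isna())
# Most frequent med_salary in the raw table, used as the fill value.
mode_value = df2['med_salary'].mode().iloc[0]
df.loc[condition, 'med_salary'] = mode_value
df.to_csv('job_posting_32rows_final_imputed.csv', index=False)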


In [17]:
df['med_salary'].info()

<class 'pandas.core.series.Series'>
RangeIndex: 32 entries, 0 to 31
Series name: med_salary
Non-Null Count  Dtype  
--------------  -----  
32 non-null     float64
dtypes: float64(1)
memory usage: 388.0 bytes


In [18]:
df['max_salary'].info()

<class 'pandas.core.series.Series'>
RangeIndex: 32 entries, 0 to 31
Series name: max_salary
Non-Null Count  Dtype
--------------  -----
32 non-null     int64
dtypes: int64(1)
memory usage: 388.0 bytes


In [19]:
df['min_salary'].info()

<class 'pandas.core.series.Series'>
RangeIndex: 32 entries, 0 to 31
Series name: min_salary
Non-Null Count  Dtype
--------------  -----
32 non-null     int64
dtypes: int64(1)
memory usage: 388.0 bytes


In [20]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 32 entries, 0 to 31
Data columns (total 27 columns):
 #   Column                      Non-Null Count  Dtype  
---  ------                      --------------  -----  
 0   job_id                      32 non-null     int64  
 1   company_id                  32 non-null     int64  
 2   title                       32 non-null     object 
 3   description                 32 non-null     object 
 4   max_salary                  32 non-null     int64  
 5   med_salary                  32 non-null     float64
 6   min_salary                  32 non-null     int64  
 7   pay_period                  14 non-null     object 
 8   formatted_work_type         32 non-null     object 
 9   location                    32 non-null     object 
 10  applies                     32 non-null     int64  
 11  original_listed_time        32 non-null     float64
 12  remote_allowed              32 non-null     int64  
 13  views                       32 non-nu

**Note:**

The condition above matched no rows in `df`, most likely because the salary columns hold `0` rather than NaN (all 32 values are non-null), so the imputation had no visible effect in Python.
As an alternative, the fix was applied in Excel.
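
A minimal sketch of the same imputation with zeros counted as missing. The frames `demo` and `reference` are hypothetical stand-ins for `df` and `df2`, and treating `0` as "salary not provided" is an assumption about this dataset, not something the data confirms.

```python
import pandas as pd

# Hypothetical stand-ins for df (problem rows) and df2 (raw table).
demo = pd.DataFrame({
    "max_salary": [0, 4145, 0],
    "med_salary": [0.0, 0.0, 48308.0],
    "min_salary": [0, 3308, 0],
})
reference = pd.DataFrame({"med_salary": [201182.0, 201182.0, 50000.0, None]})

salary_cols = ["max_salary", "med_salary", "min_salary"]

# Assumption: a 0 in a salary column means "not provided".
# A row qualifies only if every salary column is NaN or 0.
all_missing = (demo[salary_cols].isna() | demo[salary_cols].eq(0)).all(axis=1)

# Most frequent med_salary in the reference table (NaN is ignored by mode()).
mode_value = reference["med_salary"].mode().iloc[0]
demo.loc[all_missing, "med_salary"] = mode_value
print(demo)
```

Only the first stand-in row has no salary information at all, so only that row receives the mode value.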

In [21]:
from sklearn.impute import SimpleImputer
ori_file  = 'companies.csv'
df3 = pd.read_csv(ori_file)

In [22]:
# Note: the imputation below runs on df (the 32 job-posting rows), not on df3.
num_columns = df.select_dtypes(include=['number']).columns
cat_columns = df.select_dtypes(include=['object']).columns

# Fill numeric gaps with the column median.
num_imputer = SimpleImputer(strategy='median')
df[num_columns] = num_imputer.fit_transform(df[num_columns])

# Fill text gaps with the most frequent value in the column.
cat_imputer = SimpleImputer(strategy='most_frequent')
df[cat_columns] = cat_imputer.fit_transform(df[cat_columns])

Result_dataset = 'Companies_final_imputed_dataset.csv'
df.to_csv(Result_dataset, index=False)
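
With the `median` strategy, `SimpleImputer.fit_transform` returns a floating-point array, so integer columns such as `job_id` come back as floats (later outputs show values like `3697354000.0`). If the integer types should survive for the MySQL import, a pandas-only variant is one option. The sketch below runs on a hypothetical stand-in frame `demo`; it is an alternative technique, not the method used in this notebook.

```python
import pandas as pd

# Hypothetical stand-in with one integer, one float and one text column.
demo = pd.DataFrame({
    "job_id": [3697354194, 3693583923, 3693584867],
    "med_salary": [48308.0, None, 50000.0],
    "pay_period": ["YEARLY", None, "YEARLY"],
})

num_cols = demo.select_dtypes(include="number").columns
cat_cols = demo.select_dtypes(exclude="number").columns

# Median for numeric columns; columns without gaps keep their dtype.
for col in num_cols:
    demo[col] = demo[col].fillna(demo[col].median())

# Most frequent value for the remaining (text) columns.
for col in cat_cols:
    demo[col] = demo[col].fillna(demo[col].mode().iloc[0])

print(demo.dtypes)
```

Because `job_id` has no gaps, `fillna` leaves it as an integer column, while `med_salary` and `pay_period` are filled in place.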

### Converting epoch timestamps to datetime format

#### Handling the datetime format issue

In [23]:
df['listed_time']

0     1.690000e+12
1     1.690000e+12
2     1.690000e+12
3     1.690000e+12
4     1.690000e+12
5     1.690000e+12
6     1.690000e+12
7     1.690000e+12
8     1.690000e+12
9     1.690000e+12
10    1.690000e+12
11    1.690000e+12
12    1.690000e+12
13    1.690000e+12
14    1.690000e+12
15    1.690000e+12
16    1.690000e+12
17    1.690000e+12
18    1.690000e+12
19    1.690000e+12
20    1.690000e+12
21    1.690000e+12
22    1.690000e+12
23    1.690000e+12
24    1.690000e+12
25    1.690000e+12
26    1.690000e+12
27    1.690000e+12
28    1.690000e+12
29    1.690000e+12
30    1.690000e+12
31    1.690000e+12
Name: listed_time, dtype: float64

In [24]:
df2['listed_time'].info()

<class 'pandas.core.series.Series'>
RangeIndex: 15886 entries, 0 to 15885
Series name: listed_time
Non-Null Count  Dtype  
--------------  -----  
15886 non-null  float64
dtypes: float64(1)
memory usage: 124.2 KB


In [25]:
df2.head()

Unnamed: 0,job_id,company_id,title,description,max_salary,med_salary,min_salary,pay_period,formatted_work_type,location,...,expiry,closed_time,formatted_experience_level,skills_desc,listed_time,posting_domain,sponsored,work_type,currency,compensation_type
0,85008768,,Licensed Insurance Agent,While many industries were hurt by the last fe...,52000.0,,45760.0,YEARLY,Full-time,"Chico, CA",...,1710000000000.0,,,,1690000000000.0,,1,FULL_TIME,USD,BASE_SALARY
1,133114754,77766802.0,Sales Manager,Are you a dynamic and creative marketing profe...,,,,,Full-time,"Santa Clarita, CA",...,1700000000000.0,,,,1690000000000.0,,0,FULL_TIME,,
2,133196985,1089558.0,Model Risk Auditor,Join Us as a Model Risk Auditor – Showcase You...,,,,,Contract,"New York, NY",...,1700000000000.0,,,,1690000000000.0,,0,CONTRACT,,
3,381055942,96654609.0,Business Manager,Business ManagerFirst Baptist Church ForneyFor...,,,,,Full-time,"Forney, TX",...,1700000000000.0,,,,1690000000000.0,,0,FULL_TIME,,
4,529257371,1244539.0,NY Studio Assistant,YOU COULD BE ONE OF THE MAGIC MAKERS\nKen Fulk...,,,,,Full-time,"New York, NY",...,1710000000000.0,,,,1690000000000.0,,1,FULL_TIME,,


In [26]:
df2.tail()

Unnamed: 0,job_id,company_id,title,description,max_salary,med_salary,min_salary,pay_period,formatted_work_type,location,...,expiry,closed_time,formatted_experience_level,skills_desc,listed_time,posting_domain,sponsored,work_type,currency,compensation_type
15881,3701373516,74718032.0,Sanitation Technician,"Location:\n\nWest Columbia, SC, US, 29172\n\n2...",,,,,Part-time,"West Columbia, SC",...,1700000000000.0,,Entry level,,1690000000000.0,aspirebakeriescareers.com,0,PART_TIME,,
15882,3701373522,38897.0,Unit Secretary,Job Title: Unit Secretary\nDepartment: Nursing...,,,,,Full-time,"Teaneck, NJ",...,1700000000000.0,,Entry level,,1690000000000.0,recruiting.ultipro.com,0,FULL_TIME,,
15883,3701373523,38897.0,"Radiology Aide, Perdiem","Job Title: Radiology Aide, Perdiem\nDepartment...",,,,,Part-time,"Teaneck, NJ",...,1700000000000.0,,Entry level,,1690000000000.0,recruiting.ultipro.com,0,PART_TIME,,
15884,3701373524,2623.0,MRI Manager,Grade 105\nJob Type: Officer of Administration...,135000.0,,110000.0,YEARLY,Full-time,"New York, NY",...,1700000000000.0,,Mid-Senior level,,1690000000000.0,opportunities.columbia.edu,0,FULL_TIME,USD,BASE_SALARY
15885,3701373527,84659.0,Area Director of Business Development,Nexion Health Management affiliates operate 56...,,,,,Full-time,"Vicksburg, MS",...,1700000000000.0,,,,1690000000000.0,,0,FULL_TIME,,


In [27]:
df.tail()

Unnamed: 0,job_id,company_id,title,description,max_salary,med_salary,min_salary,pay_period,formatted_work_type,location,...,expiry,closed_time,formatted_experience_level,skills_desc,listed_time,posting_domain,sponsored,work_type,currency,compensation_type
27,3699057000.0,6104.0,Ambulatory Care Nurse - **Gyn/GYN exp highly p...,1116619_RR00080994 Job ID: 1116619_RR00080994\...,101764.0,0.0,100000.0,YEARLY,Full-time,"New York, NY",...,1700000000000.0,0.0,Mid-Senior level,,1690000000000.0,nyu.contacthr.com,0.0,FULL_TIME,USD,BASE_SALARY
28,3699406000.0,6104.0,Faculty Group Practice Secretary I Intake/Sche...,1110989_RR00076289 Job ID: 1110989_RR00076289\...,0.0,48308.0,0.0,YEARLY,Full-time,"Huntington, NY",...,1700000000000.0,0.0,Mid-Senior level,,1690000000000.0,nyu.contacthr.com,0.0,FULL_TIME,USD,BASE_SALARY
29,3697392000.0,3626.0,EMS Coordinator,Description\nIntroduction\nDo you have the car...,0.0,0.0,0.0,YEARLY,Full-time,"Dallas, TX",...,1700000000000.0,0.0,Mid-Senior level,,1690000000000.0,careers.hcahealthcare.com,1.0,FULL_TIME,USD,BASE_SALARY
30,3693067000.0,2859.0,Audit Manager,Job Description\nJob Summary:\nThe Assurance M...,150000.0,0.0,75996.0,YEARLY,Full-time,"Pittsburgh, PA",...,1700000000000.0,0.0,Mid-Senior level,,1690000000000.0,ebqb.fa.us2.oraclecloud.com,0.0,FULL_TIME,USD,BASE_SALARY
31,3700546000.0,1264.0,Senior Business Manager - Legal CAO,Position Overview\nEmployer: DWS Group\nTitle:...,191000.0,0.0,114000.0,YEARLY,Full-time,"New York, NY",...,1700000000000.0,0.0,Mid-Senior level,,1690000000000.0,careers.db.com,0.0,FULL_TIME,USD,BASE_SALARY


In [28]:
df_listed_time = pd.DataFrame()
df_listed_time['epoch_timestamp'] = df['listed_time']
df_listed_time['datetime'] = pd.to_datetime(df_listed_time['epoch_timestamp'], unit='ms')  # 'ms' for milliseconds
print(df_listed_time)


    epoch_timestamp            datetime
0      1.690000e+12 2023-07-22 04:26:40
1      1.690000e+12 2023-07-22 04:26:40
2      1.690000e+12 2023-07-22 04:26:40
3      1.690000e+12 2023-07-22 04:26:40
4      1.690000e+12 2023-07-22 04:26:40
5      1.690000e+12 2023-07-22 04:26:40
6      1.690000e+12 2023-07-22 04:26:40
7      1.690000e+12 2023-07-22 04:26:40
8      1.690000e+12 2023-07-22 04:26:40
9      1.690000e+12 2023-07-22 04:26:40
10     1.690000e+12 2023-07-22 04:26:40
11     1.690000e+12 2023-07-22 04:26:40
12     1.690000e+12 2023-07-22 04:26:40
13     1.690000e+12 2023-07-22 04:26:40
14     1.690000e+12 2023-07-22 04:26:40
15     1.690000e+12 2023-07-22 04:26:40
16     1.690000e+12 2023-07-22 04:26:40
17     1.690000e+12 2023-07-22 04:26:40
18     1.690000e+12 2023-07-22 04:26:40
19     1.690000e+12 2023-07-22 04:26:40
20     1.690000e+12 2023-07-22 04:26:40
21     1.690000e+12 2023-07-22 04:26:40
22     1.690000e+12 2023-07-22 04:26:40
23     1.690000e+12 2023-07-22 04:26:40
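
As a cross-check on the `unit='ms'` choice, the same conversion can be done with the standard library. The sketch assumes the epoch values are milliseconds since 1970-01-01 UTC, which is consistent with the output above; the helper name `epoch_ms_to_utc` is illustrative.

```python
from datetime import datetime, timezone

def epoch_ms_to_utc(epoch_ms: float) -> datetime:
    """Convert milliseconds since the Unix epoch to a UTC datetime."""
    return datetime.fromtimestamp(epoch_ms / 1000, tz=timezone.utc)

# The two rounded epoch values that appear in listed_time and expiry.
listed = epoch_ms_to_utc(1.69e12)
expiry = epoch_ms_to_utc(1.70e12)

# Same 'YYYY-MM-DD HH:MM:SS' layout as the pandas output.
print(listed.strftime("%Y-%m-%d %H:%M:%S"))  # 2023-07-22 04:26:40
print(expiry.strftime("%Y-%m-%d %H:%M:%S"))  # 2023-11-14 22:13:20
```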


In [29]:
df_listed_time

Unnamed: 0,epoch_timestamp,datetime
0,1690000000000.0,2023-07-22 04:26:40
1,1690000000000.0,2023-07-22 04:26:40
2,1690000000000.0,2023-07-22 04:26:40
3,1690000000000.0,2023-07-22 04:26:40
4,1690000000000.0,2023-07-22 04:26:40
5,1690000000000.0,2023-07-22 04:26:40
6,1690000000000.0,2023-07-22 04:26:40
7,1690000000000.0,2023-07-22 04:26:40
8,1690000000000.0,2023-07-22 04:26:40
9,1690000000000.0,2023-07-22 04:26:40


In [30]:
df['listed_time'] = df_listed_time['datetime']

In [31]:
df.head()

Unnamed: 0,job_id,company_id,title,description,max_salary,med_salary,min_salary,pay_period,formatted_work_type,location,...,expiry,closed_time,formatted_experience_level,skills_desc,listed_time,posting_domain,sponsored,work_type,currency,compensation_type
0,3697354000.0,80911138.0,Fulfillment Center Operations Manager,The Fulfillment Center Operations Manager I in...,0.0,0.0,0.0,YEARLY,Full-time,"Indianapolis, IN",...,1700000000000.0,0.0,Mid-Senior level,,2023-07-22 04:26:40,www.regalrexnord.wd1.myworkdayjobs.com,0.0,FULL_TIME,USD,BASE_SALARY
1,3693584000.0,3681497.0,OFFICE TECHNICIAN (GENERAL),Equal Opportunity Employer\nThe State of Calif...,4145.0,0.0,3308.0,MONTHLY,Full-time,"Glendale, CA",...,1700000000000.0,0.0,Entry level,,2023-07-22 04:26:40,www.calcareers.ca.gov,0.0,FULL_TIME,USD,BASE_SALARY
2,3693585000.0,3681497.0,OFFICE TECHNICIAN (GENERAL),Equal Opportunity Employer\nThe State of Calif...,4145.0,0.0,3308.0,MONTHLY,Full-time,"Long Beach, CA",...,1700000000000.0,0.0,Entry level,,2023-07-22 04:26:40,www.calcareers.ca.gov,0.0,FULL_TIME,USD,BASE_SALARY
3,3693586000.0,3681497.0,OFFICE TECHNICIAN (GENERAL),Equal Opportunity Employer\nThe State of Calif...,4145.0,0.0,3308.0,MONTHLY,Full-time,"Los Angeles, CA",...,1700000000000.0,0.0,Entry level,,2023-07-22 04:26:40,www.calcareers.ca.gov,0.0,FULL_TIME,USD,BASE_SALARY
4,3693590000.0,3681497.0,OFFICE TECHNICIAN (GENERAL),Equal Opportunity Employer\nThe State of Calif...,4145.0,0.0,3308.0,MONTHLY,Full-time,"Los Angeles, California, United States",...,1700000000000.0,0.0,Entry level,,2023-07-22 04:26:40,www.calcareers.ca.gov,0.0,FULL_TIME,USD,BASE_SALARY


In [32]:
df_listed_time['datetime']

0    2023-07-22 04:26:40
1    2023-07-22 04:26:40
2    2023-07-22 04:26:40
3    2023-07-22 04:26:40
4    2023-07-22 04:26:40
5    2023-07-22 04:26:40
6    2023-07-22 04:26:40
7    2023-07-22 04:26:40
8    2023-07-22 04:26:40
9    2023-07-22 04:26:40
10   2023-07-22 04:26:40
11   2023-07-22 04:26:40
12   2023-07-22 04:26:40
13   2023-07-22 04:26:40
14   2023-07-22 04:26:40
15   2023-07-22 04:26:40
16   2023-07-22 04:26:40
17   2023-07-22 04:26:40
18   2023-07-22 04:26:40
19   2023-07-22 04:26:40
20   2023-07-22 04:26:40
21   2023-07-22 04:26:40
22   2023-07-22 04:26:40
23   2023-07-22 04:26:40
24   2023-07-22 04:26:40
25   2023-07-22 04:26:40
26   2023-07-22 04:26:40
27   2023-07-22 04:26:40
28   2023-07-22 04:26:40
29   2023-07-22 04:26:40
30   2023-07-22 04:26:40
31   2023-07-22 04:26:40
Name: datetime, dtype: datetime64[ns]

In [33]:
df.head()

Unnamed: 0,job_id,company_id,title,description,max_salary,med_salary,min_salary,pay_period,formatted_work_type,location,...,expiry,closed_time,formatted_experience_level,skills_desc,listed_time,posting_domain,sponsored,work_type,currency,compensation_type
0,3697354000.0,80911138.0,Fulfillment Center Operations Manager,The Fulfillment Center Operations Manager I in...,0.0,0.0,0.0,YEARLY,Full-time,"Indianapolis, IN",...,1700000000000.0,0.0,Mid-Senior level,,2023-07-22 04:26:40,www.regalrexnord.wd1.myworkdayjobs.com,0.0,FULL_TIME,USD,BASE_SALARY
1,3693584000.0,3681497.0,OFFICE TECHNICIAN (GENERAL),Equal Opportunity Employer\nThe State of Calif...,4145.0,0.0,3308.0,MONTHLY,Full-time,"Glendale, CA",...,1700000000000.0,0.0,Entry level,,2023-07-22 04:26:40,www.calcareers.ca.gov,0.0,FULL_TIME,USD,BASE_SALARY
2,3693585000.0,3681497.0,OFFICE TECHNICIAN (GENERAL),Equal Opportunity Employer\nThe State of Calif...,4145.0,0.0,3308.0,MONTHLY,Full-time,"Long Beach, CA",...,1700000000000.0,0.0,Entry level,,2023-07-22 04:26:40,www.calcareers.ca.gov,0.0,FULL_TIME,USD,BASE_SALARY
3,3693586000.0,3681497.0,OFFICE TECHNICIAN (GENERAL),Equal Opportunity Employer\nThe State of Calif...,4145.0,0.0,3308.0,MONTHLY,Full-time,"Los Angeles, CA",...,1700000000000.0,0.0,Entry level,,2023-07-22 04:26:40,www.calcareers.ca.gov,0.0,FULL_TIME,USD,BASE_SALARY
4,3693590000.0,3681497.0,OFFICE TECHNICIAN (GENERAL),Equal Opportunity Employer\nThe State of Calif...,4145.0,0.0,3308.0,MONTHLY,Full-time,"Los Angeles, California, United States",...,1700000000000.0,0.0,Entry level,,2023-07-22 04:26:40,www.calcareers.ca.gov,0.0,FULL_TIME,USD,BASE_SALARY


In [34]:
df_expiry_time = pd.DataFrame()
df_expiry_time['epoch_timestamp'] = df['expiry']
df_expiry_time['datetime'] = pd.to_datetime(df_expiry_time['epoch_timestamp'], unit='ms')
print(df_expiry_time)

    epoch_timestamp            datetime
0      1.700000e+12 2023-11-14 22:13:20
1      1.700000e+12 2023-11-14 22:13:20
2      1.700000e+12 2023-11-14 22:13:20
3      1.700000e+12 2023-11-14 22:13:20
4      1.700000e+12 2023-11-14 22:13:20
5      1.700000e+12 2023-11-14 22:13:20
6      1.700000e+12 2023-11-14 22:13:20
7      1.700000e+12 2023-11-14 22:13:20
8      1.700000e+12 2023-11-14 22:13:20
9      1.700000e+12 2023-11-14 22:13:20
10     1.700000e+12 2023-11-14 22:13:20
11     1.700000e+12 2023-11-14 22:13:20
12     1.700000e+12 2023-11-14 22:13:20
13     1.700000e+12 2023-11-14 22:13:20
14     1.700000e+12 2023-11-14 22:13:20
15     1.700000e+12 2023-11-14 22:13:20
16     1.700000e+12 2023-11-14 22:13:20
17     1.700000e+12 2023-11-14 22:13:20
18     1.700000e+12 2023-11-14 22:13:20
19     1.700000e+12 2023-11-14 22:13:20
20     1.700000e+12 2023-11-14 22:13:20
21     1.700000e+12 2023-11-14 22:13:20
22     1.700000e+12 2023-11-14 22:13:20
23     1.700000e+12 2023-11-14 22:13:20


In [35]:
df['expiry']

0     1.700000e+12
1     1.700000e+12
2     1.700000e+12
3     1.700000e+12
4     1.700000e+12
5     1.700000e+12
6     1.700000e+12
7     1.700000e+12
8     1.700000e+12
9     1.700000e+12
10    1.700000e+12
11    1.700000e+12
12    1.700000e+12
13    1.700000e+12
14    1.700000e+12
15    1.700000e+12
16    1.700000e+12
17    1.700000e+12
18    1.700000e+12
19    1.700000e+12
20    1.700000e+12
21    1.700000e+12
22    1.700000e+12
23    1.700000e+12
24    1.700000e+12
25    1.700000e+12
26    1.700000e+12
27    1.700000e+12
28    1.700000e+12
29    1.700000e+12
30    1.700000e+12
31    1.700000e+12
Name: expiry, dtype: float64

In [36]:
df['expiry'] = df_expiry_time['datetime']

In [37]:
df['expiry']

0    2023-11-14 22:13:20
1    2023-11-14 22:13:20
2    2023-11-14 22:13:20
3    2023-11-14 22:13:20
4    2023-11-14 22:13:20
5    2023-11-14 22:13:20
6    2023-11-14 22:13:20
7    2023-11-14 22:13:20
8    2023-11-14 22:13:20
9    2023-11-14 22:13:20
10   2023-11-14 22:13:20
11   2023-11-14 22:13:20
12   2023-11-14 22:13:20
13   2023-11-14 22:13:20
14   2023-11-14 22:13:20
15   2023-11-14 22:13:20
16   2023-11-14 22:13:20
17   2023-11-14 22:13:20
18   2023-11-14 22:13:20
19   2023-11-14 22:13:20
20   2023-11-14 22:13:20
21   2023-11-14 22:13:20
22   2023-11-14 22:13:20
23   2023-11-14 22:13:20
24   2023-11-14 22:13:20
25   2023-11-14 22:13:20
26   2023-11-14 22:13:20
27   2023-11-14 22:13:20
28   2023-11-14 22:13:20
29   2023-11-14 22:13:20
30   2023-11-14 22:13:20
31   2023-11-14 22:13:20
Name: expiry, dtype: datetime64[ns]

In [38]:
df2['closed_time'].info()

<class 'pandas.core.series.Series'>
RangeIndex: 15886 entries, 0 to 15885
Series name: closed_time
Non-Null Count  Dtype  
--------------  -----  
928 non-null    float64
dtypes: float64(1)
memory usage: 124.2 KB


In [39]:
non_null_values = df2['closed_time'].dropna()
print(non_null_values)

30       1.690000e+12
39       1.690000e+12
75       1.690000e+12
86       1.690000e+12
109      1.690000e+12
             ...     
15803    1.690000e+12
15820    1.690000e+12
15826    1.690000e+12
15829    1.690000e+12
15873    1.690000e+12
Name: closed_time, Length: 928, dtype: float64


In [40]:
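# 'closed_time' was filled with 0.0 for these rows, so every value converts to
# the Unix epoch origin (1970-01-01) rather than to a real closing date.
df_closed_time = pd.DataFrame()
df_closed_time['epoch_timestamp'] = df['closed_time']
df_closed_time['datetime'] = pd.to_datetime(df_closed_time['epoch_timestamp'], unit='ms')
print(df_closed_time)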

    epoch_timestamp   datetime
0               0.0 1970-01-01
1               0.0 1970-01-01
2               0.0 1970-01-01
3               0.0 1970-01-01
4               0.0 1970-01-01
5               0.0 1970-01-01
6               0.0 1970-01-01
7               0.0 1970-01-01
8               0.0 1970-01-01
9               0.0 1970-01-01
10              0.0 1970-01-01
11              0.0 1970-01-01
12              0.0 1970-01-01
13              0.0 1970-01-01
14              0.0 1970-01-01
15              0.0 1970-01-01
16              0.0 1970-01-01
17              0.0 1970-01-01
18              0.0 1970-01-01
19              0.0 1970-01-01
20              0.0 1970-01-01
21              0.0 1970-01-01
22              0.0 1970-01-01
23              0.0 1970-01-01
24              0.0 1970-01-01
25              0.0 1970-01-01
26              0.0 1970-01-01
27              0.0 1970-01-01
28              0.0 1970-01-01
29              0.0 1970-01-01
30              0.0 1970-01-01
31              0.0 1970-01-01

In [41]:
df['closed_time'] = df_closed_time['datetime']

In [42]:
df['closed_time']

0    1970-01-01
1    1970-01-01
2    1970-01-01
3    1970-01-01
4    1970-01-01
5    1970-01-01
6    1970-01-01
7    1970-01-01
8    1970-01-01
9    1970-01-01
10   1970-01-01
11   1970-01-01
12   1970-01-01
13   1970-01-01
14   1970-01-01
15   1970-01-01
16   1970-01-01
17   1970-01-01
18   1970-01-01
19   1970-01-01
20   1970-01-01
21   1970-01-01
22   1970-01-01
23   1970-01-01
24   1970-01-01
25   1970-01-01
26   1970-01-01
27   1970-01-01
28   1970-01-01
29   1970-01-01
30   1970-01-01
31   1970-01-01
Name: closed_time, dtype: datetime64[ns]
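
Every `closed_time` comes out as 1970-01-01 because the column holds 0.0, and epoch 0 is midnight on 1 January 1970 (UTC). In the raw table only 928 of the 15886 `closed_time` entries are non-null, so the 0.0 here most likely stands in for a missing value. That reading is an assumption. Under it, a minimal sketch of an alternative is to turn the placeholder back into NaN before converting, so the result is `NaT` instead of a misleading date. The sample series is hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical sample: 0.0 marks "no closing time", 1.69e12 is a real epoch in ms.
closed_ms = pd.Series([0.0, 1.69e12, 0.0], name="closed_time")

# Direct conversion maps the 0.0 placeholder to the epoch origin.
naive = pd.to_datetime(closed_ms, unit="ms")

# Replacing the placeholder with NaN first yields NaT for the missing rows.
cleaned = pd.to_datetime(closed_ms.replace(0.0, np.nan), unit="ms")

print(naive)
print(cleaned)
```

Whether `NaT` or a sentinel date is preferable depends on how the target MySQL column is declared (nullable or not).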

In [43]:
df.head()

Unnamed: 0,job_id,company_id,title,description,max_salary,med_salary,min_salary,pay_period,formatted_work_type,location,...,expiry,closed_time,formatted_experience_level,skills_desc,listed_time,posting_domain,sponsored,work_type,currency,compensation_type
0,3697354000.0,80911138.0,Fulfillment Center Operations Manager,The Fulfillment Center Operations Manager I in...,0.0,0.0,0.0,YEARLY,Full-time,"Indianapolis, IN",...,2023-11-14 22:13:20,1970-01-01,Mid-Senior level,,2023-07-22 04:26:40,www.regalrexnord.wd1.myworkdayjobs.com,0.0,FULL_TIME,USD,BASE_SALARY
1,3693584000.0,3681497.0,OFFICE TECHNICIAN (GENERAL),Equal Opportunity Employer\nThe State of Calif...,4145.0,0.0,3308.0,MONTHLY,Full-time,"Glendale, CA",...,2023-11-14 22:13:20,1970-01-01,Entry level,,2023-07-22 04:26:40,www.calcareers.ca.gov,0.0,FULL_TIME,USD,BASE_SALARY
2,3693585000.0,3681497.0,OFFICE TECHNICIAN (GENERAL),Equal Opportunity Employer\nThe State of Calif...,4145.0,0.0,3308.0,MONTHLY,Full-time,"Long Beach, CA",...,2023-11-14 22:13:20,1970-01-01,Entry level,,2023-07-22 04:26:40,www.calcareers.ca.gov,0.0,FULL_TIME,USD,BASE_SALARY
3,3693586000.0,3681497.0,OFFICE TECHNICIAN (GENERAL),Equal Opportunity Employer\nThe State of Calif...,4145.0,0.0,3308.0,MONTHLY,Full-time,"Los Angeles, CA",...,2023-11-14 22:13:20,1970-01-01,Entry level,,2023-07-22 04:26:40,www.calcareers.ca.gov,0.0,FULL_TIME,USD,BASE_SALARY
4,3693590000.0,3681497.0,OFFICE TECHNICIAN (GENERAL),Equal Opportunity Employer\nThe State of Calif...,4145.0,0.0,3308.0,MONTHLY,Full-time,"Los Angeles, California, United States",...,2023-11-14 22:13:20,1970-01-01,Entry level,,2023-07-22 04:26:40,www.calcareers.ca.gov,0.0,FULL_TIME,USD,BASE_SALARY


### Displaying the converted epoch columns. Note that the values are repetitive and closed_time is dated 1970.

In [44]:
df_corrected = df[['closed_time', 'listed_time', 'expiry']]
print(df_corrected)

   closed_time         listed_time              expiry
0   1970-01-01 2023-07-22 04:26:40 2023-11-14 22:13:20
1   1970-01-01 2023-07-22 04:26:40 2023-11-14 22:13:20
2   1970-01-01 2023-07-22 04:26:40 2023-11-14 22:13:20
3   1970-01-01 2023-07-22 04:26:40 2023-11-14 22:13:20
4   1970-01-01 2023-07-22 04:26:40 2023-11-14 22:13:20
5   1970-01-01 2023-07-22 04:26:40 2023-11-14 22:13:20
6   1970-01-01 2023-07-22 04:26:40 2023-11-14 22:13:20
7   1970-01-01 2023-07-22 04:26:40 2023-11-14 22:13:20
8   1970-01-01 2023-07-22 04:26:40 2023-11-14 22:13:20
9   1970-01-01 2023-07-22 04:26:40 2023-11-14 22:13:20
10  1970-01-01 2023-07-22 04:26:40 2023-11-14 22:13:20
11  1970-01-01 2023-07-22 04:26:40 2023-11-14 22:13:20
12  1970-01-01 2023-07-22 04:26:40 2023-11-14 22:13:20
13  1970-01-01 2023-07-22 04:26:40 2023-11-14 22:13:20
14  1970-01-01 2023-07-22 04:26:40 2023-11-14 22:13:20
15  1970-01-01 2023-07-22 04:26:40 2023-11-14 22:13:20
16  1970-01-01 2023-07-22 04:26:40 2023-11-14 22:13:20
17  1970-0

In [45]:
df_corrected

Unnamed: 0,closed_time,listed_time,expiry
0,1970-01-01,2023-07-22 04:26:40,2023-11-14 22:13:20
1,1970-01-01,2023-07-22 04:26:40,2023-11-14 22:13:20
2,1970-01-01,2023-07-22 04:26:40,2023-11-14 22:13:20
3,1970-01-01,2023-07-22 04:26:40,2023-11-14 22:13:20
4,1970-01-01,2023-07-22 04:26:40,2023-11-14 22:13:20
5,1970-01-01,2023-07-22 04:26:40,2023-11-14 22:13:20
6,1970-01-01,2023-07-22 04:26:40,2023-11-14 22:13:20
7,1970-01-01,2023-07-22 04:26:40,2023-11-14 22:13:20
8,1970-01-01,2023-07-22 04:26:40,2023-11-14 22:13:20
9,1970-01-01,2023-07-22 04:26:40,2023-11-14 22:13:20


In [46]:
# Point df_corrected at the full cleaned frame (this is an alias of df, not a copy).
df_corrected = df

In [47]:
df

Unnamed: 0,job_id,company_id,title,description,max_salary,med_salary,min_salary,pay_period,formatted_work_type,location,...,expiry,closed_time,formatted_experience_level,skills_desc,listed_time,posting_domain,sponsored,work_type,currency,compensation_type
0,3697354000.0,80911138.0,Fulfillment Center Operations Manager,The Fulfillment Center Operations Manager I in...,0.0,0.0,0.0,YEARLY,Full-time,"Indianapolis, IN",...,2023-11-14 22:13:20,1970-01-01,Mid-Senior level,,2023-07-22 04:26:40,www.regalrexnord.wd1.myworkdayjobs.com,0.0,FULL_TIME,USD,BASE_SALARY
1,3693584000.0,3681497.0,OFFICE TECHNICIAN (GENERAL),Equal Opportunity Employer\nThe State of Calif...,4145.0,0.0,3308.0,MONTHLY,Full-time,"Glendale, CA",...,2023-11-14 22:13:20,1970-01-01,Entry level,,2023-07-22 04:26:40,www.calcareers.ca.gov,0.0,FULL_TIME,USD,BASE_SALARY
2,3693585000.0,3681497.0,OFFICE TECHNICIAN (GENERAL),Equal Opportunity Employer\nThe State of Calif...,4145.0,0.0,3308.0,MONTHLY,Full-time,"Long Beach, CA",...,2023-11-14 22:13:20,1970-01-01,Entry level,,2023-07-22 04:26:40,www.calcareers.ca.gov,0.0,FULL_TIME,USD,BASE_SALARY
3,3693586000.0,3681497.0,OFFICE TECHNICIAN (GENERAL),Equal Opportunity Employer\nThe State of Calif...,4145.0,0.0,3308.0,MONTHLY,Full-time,"Los Angeles, CA",...,2023-11-14 22:13:20,1970-01-01,Entry level,,2023-07-22 04:26:40,www.calcareers.ca.gov,0.0,FULL_TIME,USD,BASE_SALARY
4,3693590000.0,3681497.0,OFFICE TECHNICIAN (GENERAL),Equal Opportunity Employer\nThe State of Calif...,4145.0,0.0,3308.0,MONTHLY,Full-time,"Los Angeles, California, United States",...,2023-11-14 22:13:20,1970-01-01,Entry level,,2023-07-22 04:26:40,www.calcareers.ca.gov,0.0,FULL_TIME,USD,BASE_SALARY
5,3694100000.0,2972588.0,***Chatham Hiring Event*** Youth Counselor - A...,"Description of Work\nAs of August 16, 2023, th...",64194.0,0.0,44500.0,YEARLY,Full-time,"Chatham County, NC",...,2023-11-14 22:13:20,1970-01-01,Mid-Senior level,,2023-07-22 04:26:40,www.governmentjobs.com,0.0,FULL_TIME,USD,BASE_SALARY
6,3697353000.0,164885.0,Project Engineer,"At the MBTA, we envision a thriving region ena...",0.0,0.0,0.0,YEARLY,Full-time,"Boston, MA",...,2023-11-14 22:13:20,1970-01-01,Entry level,,2023-07-22 04:26:40,www.governmentjobs.com,1.0,FULL_TIME,USD,BASE_SALARY
7,3697359000.0,164885.0,Engineer - Power Systems Maintenance,"At the MBTA, we envision a thriving region ena...",0.0,0.0,0.0,YEARLY,Full-time,"Boston, MA",...,2023-11-14 22:13:20,1970-01-01,Entry level,,2023-07-22 04:26:40,www.governmentjobs.com,0.0,FULL_TIME,USD,BASE_SALARY
8,3699077000.0,88016.0,Patient Access Associate - Full Time Evenings ...,"Research --> Support Staff\nPatchogue, NY\nID:...",0.0,0.0,0.0,YEARLY,Full-time,"Patchogue, NY",...,2023-11-14 22:13:20,1970-01-01,Entry level,,2023-07-22 04:26:40,jobs.silkroad.com,0.0,FULL_TIME,USD,BASE_SALARY
9,3699079000.0,88016.0,Registered Nurse Medical Surgical Telemetry Pa...,"Nursing --> Nursing\nPatchogue, NY\nID: 111608...",0.0,0.0,0.0,YEARLY,Part-time,"Patchogue, NY",...,2023-11-14 22:13:20,1970-01-01,Mid-Senior level,,2023-07-22 04:26:40,jobs.silkroad.com,0.0,PART_TIME,USD,BASE_SALARY


In [57]:
df3

Unnamed: 0,job_id,company_id,title,description,skills_desc,work_type,location,currency,remote_allowed,sponsored,...,applies,views,original_listed_time,listed_time,expiry,closed_time,posting_domain,job_posting_url,application_url,application_type
0,8.500877e+07,267032.5,Licensed Insurance Agent,While many industries were hurt by the last fe...,EducationBachelors or better in Education or r...,FULL_TIME,"Chico, CA",USD,0.0,1.0,...,6.0,5.0,1970-01-20 13:26:40,1970-01-20 13:26:40,1970-01-20 19:00:00,1970-01-20 13:26:40,jobs.smartrecruiters.com,https://www.linkedin.com/jobs/view/85008768/?t...,https://jobs.jedunn.com/job/kansas-city-genera...,ComplexOnsiteApply
1,1.331148e+08,77766802.0,Sales Manager,Are you a dynamic and creative marketing profe...,EducationBachelors or better in Education or r...,FULL_TIME,"Santa Clarita, CA",USD,0.0,0.0,...,6.0,25.0,1970-01-20 13:26:40,1970-01-20 13:26:40,1970-01-20 16:13:20,1970-01-20 13:26:40,jobs.smartrecruiters.com,https://www.linkedin.com/jobs/view/133114754/?...,https://jobs.jedunn.com/job/kansas-city-genera...,ComplexOnsiteApply
2,1.331970e+08,1089558.0,Model Risk Auditor,Join Us as a Model Risk Auditor Showcase Your...,EducationBachelors or better in Education or r...,CONTRACT,"New York, NY",USD,0.0,0.0,...,1.0,17.0,1970-01-20 13:26:40,1970-01-20 13:26:40,1970-01-20 16:13:20,1970-01-20 13:26:40,jobs.smartrecruiters.com,https://www.linkedin.com/jobs/view/133196985/?...,https://jobs.jedunn.com/job/kansas-city-genera...,ComplexOnsiteApply
3,3.810559e+08,96654609.0,Business Manager,Business ManagerFirst Baptist Church ForneyFor...,EducationBachelors or better in Education or r...,FULL_TIME,"Forney, TX",USD,0.0,0.0,...,6.0,25.0,1970-01-20 13:26:40,1970-01-20 13:26:40,1970-01-20 16:13:20,1970-01-20 13:26:40,jobs.smartrecruiters.com,https://www.linkedin.com/jobs/view/381055942/?...,https://jobs.jedunn.com/job/kansas-city-genera...,ComplexOnsiteApply
4,5.292574e+08,1244539.0,NY Studio Assistant,YOU COULD BE ONE OF THE MAGIC MAKERS\nKen Fulk...,EducationBachelors or better in Education or r...,FULL_TIME,"New York, NY",USD,0.0,1.0,...,6.0,2.0,1970-01-20 13:26:40,1970-01-20 13:26:40,1970-01-20 19:00:00,1970-01-20 13:26:40,jobs.smartrecruiters.com,https://www.linkedin.com/jobs/view/529257371/?...,https://jobs.jedunn.com/job/kansas-city-genera...,ComplexOnsiteApply
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
15881,3.701374e+09,74718032.0,Sanitation Technician,"Location:\n\nWest Columbia, SC, US, 29172\n\n2...",EducationBachelors or better in Education or r...,PART_TIME,"West Columbia, SC",USD,0.0,0.0,...,6.0,1.0,1970-01-20 13:26:40,1970-01-20 13:26:40,1970-01-20 16:13:20,1970-01-20 13:26:40,aspirebakeriescareers.com,https://www.linkedin.com/jobs/view/3701373516/...,https://aspirebakeriescareers.com/job/west-col...,OffsiteApply
15882,3.701374e+09,38897.0,Unit Secretary,Job Title: Unit Secretary\nDepartment: Nursing...,EducationBachelors or better in Education or r...,FULL_TIME,"Teaneck, NJ",USD,0.0,0.0,...,2.0,7.0,1970-01-20 13:26:40,1970-01-20 13:26:40,1970-01-20 16:13:20,1970-01-20 13:26:40,recruiting.ultipro.com,https://www.linkedin.com/jobs/view/3701373522/...,https://recruiting.ultipro.com/hol1005hnmc/job...,OffsiteApply
15883,3.701374e+09,38897.0,"Radiology Aide, Perdiem","Job Title: Radiology Aide, Perdiem\nDepartment...",EducationBachelors or better in Education or r...,PART_TIME,"Teaneck, NJ",USD,0.0,0.0,...,6.0,3.0,1970-01-20 13:26:40,1970-01-20 13:26:40,1970-01-20 16:13:20,1970-01-20 13:26:40,recruiting.ultipro.com,https://www.linkedin.com/jobs/view/3701373523/...,https://recruiting.ultipro.com/hol1005hnmc/job...,OffsiteApply
15884,3.701374e+09,2623.0,MRI Manager,Grade 105\nJob Type: Officer of Administration...,EducationBachelors or better in Education or r...,FULL_TIME,"New York, NY",USD,0.0,0.0,...,6.0,10.0,1970-01-20 13:26:40,1970-01-20 13:26:40,1970-01-20 16:13:20,1970-01-20 13:26:40,opportunities.columbia.edu,https://www.linkedin.com/jobs/view/3701373524/...,https://opportunities.columbia.edu/jobs/mri-ma...,OffsiteApply
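
The time columns of `df3` all fall on 20 January 1970. 1970-01-20 13:26:40 is exactly 1,690,000 seconds after the epoch, a thousand times smaller than the 1.69e9 seconds behind 2023-07-22 04:26:40. That pattern is what a factor-of-1000 unit mismatch would produce, for example epoch milliseconds parsed as microseconds. The sketch below only illustrates the arithmetic; which unit was actually passed when `df3` was built is an assumption.

```python
import pandas as pd

# Rounded epoch value as it appears for listed_time in the raw data.
raw_listed = 1.69e12

# Interpreted as milliseconds: the expected 2023 date.
as_ms = pd.to_datetime(raw_listed, unit="ms")

# Interpreted as microseconds: a date in January 1970.
as_us = pd.to_datetime(raw_listed, unit="us")

print(as_ms)
print(as_us)
```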


In [52]:
df

Unnamed: 0,job_id,company_id,title,description,max_salary,med_salary,min_salary,pay_period,formatted_work_type,location,...,expiry,closed_time,formatted_experience_level,skills_desc,listed_time,posting_domain,sponsored,work_type,currency,compensation_type
0,3697354000.0,80911138.0,Fulfillment Center Operations Manager,The Fulfillment Center Operations Manager I in...,0.0,0.0,0.0,YEARLY,Full-time,"Indianapolis, IN",...,2023-11-14 22:13:20,1970-01-01,Mid-Senior level,,2023-07-22 04:26:40,www.regalrexnord.wd1.myworkdayjobs.com,0.0,FULL_TIME,USD,BASE_SALARY
1,3693584000.0,3681497.0,OFFICE TECHNICIAN (GENERAL),Equal Opportunity Employer\nThe State of Calif...,4145.0,0.0,3308.0,MONTHLY,Full-time,"Glendale, CA",...,2023-11-14 22:13:20,1970-01-01,Entry level,,2023-07-22 04:26:40,www.calcareers.ca.gov,0.0,FULL_TIME,USD,BASE_SALARY
2,3693585000.0,3681497.0,OFFICE TECHNICIAN (GENERAL),Equal Opportunity Employer\nThe State of Calif...,4145.0,0.0,3308.0,MONTHLY,Full-time,"Long Beach, CA",...,2023-11-14 22:13:20,1970-01-01,Entry level,,2023-07-22 04:26:40,www.calcareers.ca.gov,0.0,FULL_TIME,USD,BASE_SALARY
3,3693586000.0,3681497.0,OFFICE TECHNICIAN (GENERAL),Equal Opportunity Employer\nThe State of Calif...,4145.0,0.0,3308.0,MONTHLY,Full-time,"Los Angeles, CA",...,2023-11-14 22:13:20,1970-01-01,Entry level,,2023-07-22 04:26:40,www.calcareers.ca.gov,0.0,FULL_TIME,USD,BASE_SALARY
4,3693590000.0,3681497.0,OFFICE TECHNICIAN (GENERAL),Equal Opportunity Employer\nThe State of Calif...,4145.0,0.0,3308.0,MONTHLY,Full-time,"Los Angeles, California, United States",...,2023-11-14 22:13:20,1970-01-01,Entry level,,2023-07-22 04:26:40,www.calcareers.ca.gov,0.0,FULL_TIME,USD,BASE_SALARY
5,3694100000.0,2972588.0,***Chatham Hiring Event*** Youth Counselor - A...,"Description of Work\nAs of August 16, 2023, th...",64194.0,0.0,44500.0,YEARLY,Full-time,"Chatham County, NC",...,2023-11-14 22:13:20,1970-01-01,Mid-Senior level,,2023-07-22 04:26:40,www.governmentjobs.com,0.0,FULL_TIME,USD,BASE_SALARY
6,3697353000.0,164885.0,Project Engineer,"At the MBTA, we envision a thriving region ena...",0.0,0.0,0.0,YEARLY,Full-time,"Boston, MA",...,2023-11-14 22:13:20,1970-01-01,Entry level,,2023-07-22 04:26:40,www.governmentjobs.com,1.0,FULL_TIME,USD,BASE_SALARY
7,3697359000.0,164885.0,Engineer - Power Systems Maintenance,"At the MBTA, we envision a thriving region ena...",0.0,0.0,0.0,YEARLY,Full-time,"Boston, MA",...,2023-11-14 22:13:20,1970-01-01,Entry level,,2023-07-22 04:26:40,www.governmentjobs.com,0.0,FULL_TIME,USD,BASE_SALARY
8,3699077000.0,88016.0,Patient Access Associate - Full Time Evenings ...,"Research --> Support Staff\nPatchogue, NY\nID:...",0.0,0.0,0.0,YEARLY,Full-time,"Patchogue, NY",...,2023-11-14 22:13:20,1970-01-01,Entry level,,2023-07-22 04:26:40,jobs.silkroad.com,0.0,FULL_TIME,USD,BASE_SALARY
9,3699079000.0,88016.0,Registered Nurse Medical Surgical Telemetry Pa...,"Nursing --> Nursing\nPatchogue, NY\nID: 111608...",0.0,0.0,0.0,YEARLY,Part-time,"Patchogue, NY",...,2023-11-14 22:13:20,1970-01-01,Mid-Senior level,,2023-07-22 04:26:40,jobs.silkroad.com,0.0,PART_TIME,USD,BASE_SALARY
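
Since the goal is a clean import into MySQL, the converted columns need to be written back in a form the importer can read. A minimal sketch, assuming a plain CSV export: `date_format` writes datetimes as `YYYY-MM-DD HH:MM:SS`, which matches MySQL's `DATETIME` literal format, and `na_rep` writes missing values as `\N`, which `LOAD DATA INFILE` reads as NULL. Whether the Workbench import wizard treats `\N` the same way is not verified here. The sample frame is hypothetical.

```python
import io

import pandas as pd

# Hypothetical sample with one missing closed_time.
frame = pd.DataFrame({
    "job_id": [1, 2],
    "expiry": pd.to_datetime(pd.Series([1.70e12, 1.70e12]), unit="ms"),
    "closed_time": pd.to_datetime(pd.Series([float("nan"), 1.69e12]), unit="ms"),
})

# Write to an in-memory buffer; pass a file path instead to create a real file.
buffer = io.StringIO()
frame.to_csv(
    buffer,
    index=False,
    date_format="%Y-%m-%d %H:%M:%S",
    na_rep="\\N",
)
csv_text = buffer.getvalue()
print(csv_text)
```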


In [None]:
# Issues to fix:
    