# Data Cleaning: Normalize individual functions

## Background

Quite a few notebooks in this competition focus on mining the data with various Python string manipulations, most notably regular expressions (regex for short). While these notebooks achieve highly satisfactory outcomes and open directions for future research, they quickly hit their limits because of the poor quality of the given dataset(s). Recall that we are given a collection of .txt files describing past City of Los Angeles jobs, along with PDF versions of the actual job postings. While the City has done an excellent job of composing these text files, the **source** that generated them is not revealed, which makes it very hard for participants to anticipate all possible inconsistencies in the data. For example, a diligent search shows that the file FIRE ASSISTANT CHIEF 2166 011218.txt has no JOB_DUTIES field, which is very questionable since `kaggle_data_dictionary.csv` explicitly states that every job must have this field (see the *Accepts Null Values?* column).

By now, hopefully we can see that such inconsistencies will break any attempt to write data-mining code **before** data cleaning. Even if we agree to hand-wave this problem (e.g., by violating the non-null requirement of the JOB_DUTIES field), a person with a strong scientific mindset cannot just hand in the solution and walk away without at least noting where, what, and why values are missing. On the other hand, since it is well known that exploratory data analysis (EDA) and data cleaning typically take up much of a data scientist's time, it is understandable that very few kernels (if any at all) are dedicated to tackling data cleaning. For unstructured data like .txt files, this could mean actively reading the descriptions of several jobs before being able to figure out what cleaning is needed.

<font size=4, color='red'>**Important Details.**</font> This series of notebooks is dedicated to data cleaning, as I've realized that this dataset is very rich in information. A first complete and accurate [csv file](https://www.kaggle.com/c/data-science-for-good-city-of-los-angeles/overview), which advanced analyses can be built upon, will help the City of Los Angeles restructure its job postings to attract more talent in the coming years. The approach taken here is quite novel, at least compared to other kernels. First, I focus on writing code to parse SYSTEMS ANALYST 1596 102717.txt into a csv file that matches `sample job class export template.csv` exactly. This is done by writing 25 main functions, each dedicated to extracting exactly one of the 25 field names, plus two helper functions. Then I use these functions to check the consistency of the other .txt files and manually modify the files to fit the pattern of SYSTEMS ANALYST 1596 102717.txt. For example, the requirement section of ACCOUNTANT 1513 062218.txt has no itemizer, e.g., 1) or 1., so I manually add a 1. before the word Graduation to match the pattern in SYSTEMS ANALYST 1596 102717.txt.

Admittedly, the steps taken here are extremely labor-intensive. While I agree that there might be easier ways, I do this on purpose: I've learned that it is the best way to familiarize myself with this unstructured data. As mentioned above, since **the source that generated these text files is not known**, we really have no idea how to reliably retrieve the relevant information by pattern matching on the raw data. For example, one might attempt something such as `job[job.find('REQUIREMENTS/MINIMUM QUALIFICATIONS'):job.find('PROCESS NOTES')]` or a similar regex expression. Although this statement may retrieve **only** the relevant information regarding school type, education majors, etc., this is not guaranteed! For instance, I found that in some jobs the phrase PROCESS NOTES comes after the phrase WHERE TO APPLY, which causes severe headaches later. The only way to avoid this, as far as I can tell, is to patiently do some manual data cleaning before any analysis.
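
To make this pitfall concrete, here is a minimal sketch on two made-up bulletin strings. The strings and the helper name `slice_between` are illustrative assumptions, not taken from the dataset. It shows that reordered sections leak unrelated text into the slice, and that a missing marker makes `str.find` return -1, which silently produces an empty slice instead of an error.

```python
def slice_between(text, start_marker, end_marker):
    """Slice text between two markers with str.find, as the naive one-liner does."""
    return text[text.find(start_marker):text.find(end_marker)]

# Hypothetical bulletin with the sections in the expected order.
ordered = ("DUTIES ... REQUIREMENTS/MINIMUM QUALIFICATIONS 1. Graduation "
           "PROCESS NOTES ... WHERE TO APPLY ...")
# Hypothetical bulletin where PROCESS NOTES comes after WHERE TO APPLY.
shuffled = ("DUTIES ... REQUIREMENTS/MINIMUM QUALIFICATIONS 1. Graduation "
            "WHERE TO APPLY ... PROCESS NOTES ...")

good = slice_between(ordered, "REQUIREMENTS/MINIMUM QUALIFICATIONS", "PROCESS NOTES")
leaky = slice_between(shuffled, "REQUIREMENTS/MINIMUM QUALIFICATIONS", "PROCESS NOTES")
# A missing start marker gives find() == -1, so the slice starts at the last character.
missing = slice_between(ordered, "NO SUCH HEADING", "PROCESS NOTES")

print("WHERE TO APPLY" in good)   # False: only the requirements were captured
print("WHERE TO APPLY" in leaky)  # True: an unrelated section leaked in
print(repr(missing))              # '': no error was raised
```

None of these cases raises an exception, which is why such slicing is hard to trust on files whose layout has not been checked first.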

Mention job bulletins_clean

## Import relevant modules

In [1]:
import os                       # module to interface with the underlying OS
import numpy as np              # linear algebra
import pandas as pd             # dataframe
import re                       # regular expression
import matplotlib.pyplot as plt # data visualization
%matplotlib inline
import toolkits as tk           # user-defined module for efficiently reading files

## Get paths and names of files in each path

In [2]:
# Path and list of jobs in Job Bulletins.
# NOTE 1: These are raw data
(raw_path, raw_jobs) = tk.get_raw_jobs() # tk is a user-defined module

# Path and list of jobs in JobBulletins_cleaned
# NOTE 2: These are cleaned data
(cleaned_path, cleaned_jobs) = tk.get_cleaned_jobs()

## Normalize JOB_CLASS_TITLE (jct)

In [3]:
# This is a helper function
def job_class_title(job):
    '''Returns the field JOB_CLASS_TITLE (jct)'''
    # The title is located between the beginning of the file and the phrase 'Class Code'
    temp = job[:job.index('Class Code')]
    # Split on whitespace. With no arguments, split() also discards newlines, tabs, and repeated spaces.
    jct = temp.split()
    
    # Returns
    jct = ' '.join(jct) # join back words with white spaces
    return jct
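
As a quick check of how the helper behaves, the sketch below runs the same logic on a made-up header; the sample strings are assumptions for illustration, not lines from a real bulletin. It shows that `split()` followed by `' '.join(...)` collapses tabs, newlines, and repeated spaces, and that a header without the exact phrase `Class Code` raises `ValueError`, which is what lets a `try`/`except` loop flag the file.

```python
def job_class_title(job):
    """Same logic as the helper above: text before 'Class Code', whitespace-normalized."""
    return ' '.join(job[:job.index('Class Code')].split())

# Made-up header with a tab, trailing spaces, and blank lines.
sample = "SYSTEMS\t ANALYST  \n\n   Class Code:       1596\nOpen Date:  10-27-17\n"
title = job_class_title(sample)
print(title)  # SYSTEMS ANALYST

# Two spaces inside 'Class  Code' mean the exact phrase is absent.
try:
    job_class_title("SOME JOB TITLE\nClass  Code: 0000\n")
    flagged = False
except ValueError:
    flagged = True
print(flagged)  # True
```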

In [4]:
# Normalization strategy:
# print jct in the try clause and print job_path in the except clause,
# then look at the printouts to detect unusual jct's.
for file_name in raw_jobs:
    job_path = raw_path + file_name        # define path to file_name
    with open(job_path, 'rt') as f:        # read in job as a string
        raw_job = f.read()
    try:
        print(job_class_title(job=raw_job))
    except Exception:                      # print a banner so that failing files stand out
        ## define some useful variables
        border_line = '##############################################################################################'
        how_many    = int((len(border_line) - len(job_path))/2)
        print(border_line)
        ## do pretty printings
        print('#'*how_many + job_path + '#'*how_many)
        print(border_line)

311 DIRECTOR
ACCOUNTANT
ACCOUNTING CLERK
ACCOUNTING RECORDS SUPERVISOR
ADMINISTRATIVE ANALYST
ADMINISTRATIVE CLERK
ADMINISTRATIVE HEARING EXAMINER
ADVANCE PRACTICE PROVIDER CORRECTIONAL CARE
AIR CONDITIONING MECHANIC
AIR CONDITIONING MECHANIC SUPERVISOR
AIRPORT AIDE
AIRPORT CHIEF INFORMATION SECURITY OFFICER
AIRPORT ENGINEER
AIRPORT GUIDE
AIRPORT INFORMATION SPECIALIST
AIRPORT LABOR RELATIONS ADVOCATE
AIRPORT MANAGER
AIRPORT POLICE CAPTAIN
AIRPORT POLICE LIEUTENANT
AIRPORT POLICE OFFICER
AIRPORT POLICE SPECIALIST
AIRPORT SUPERINTENDENT OF OPERATIONS
AIRPORTS MAINTENANCE SUPERINTENDENT
AIRPORTS MAINTENANCE SUPERVISOR
AIRPORTS PUBLIC AND COMMUNITY RELATIONS DIRECTOR
ANIMAL CARE ASSISTANT
ANIMAL CARE TECHNICIAN
WATER TREATMENT OPERATOR
ANIMAL CONTROL OFFICER
ANIMAL KEEPER
APPARATUS OPERATOR
APPLICATIONS PROGRAMMER
APPRENTICE - METAL TRADES
APPRENTICE MACHINIST
AQUARIST
AQUARIUM EDUCATOR
AQUATIC DIRECTOR
AQUATIC FACILITY MANAGER
AQUEDUCT AND RESERVOIR KEEPER
AQUEDUCT AND RESERVOIR SUPERVIS

PAINTER SUPERVISOR
PARK MAINTENANCE SUPERVISOR
PARK RANGER
PARK SERVICES ATTENDANT
PARK SERVICES SUPERVISOR
PARKING ATTENDANT
PARKING ENFORCEMENT MANAGER
PARKING MANAGER
PARKING METER TECHNICIAN
PARKING METER TECHNICIAN SUPERVISOR
PAYROLL ANALYST
PAYROLL SUPERVISOR
PERFORMING ARTS DIRECTOR
PERSONNEL ANALYST
PERSONNEL DIRECTOR
PERSONNEL RECORDS SUPERVISOR
PERSONNEL RESEARCH ANALYST
PHOTOGRAPHER
PILE DRIVER WORKER
PIPEFITTER
PIPEFITTER SUPERVISOR
PLANNING ASSISTANT
PLUMBER
PLUMBER SUPERVISOR
PLUMBING INSPECTOR
POLICE ADMINISTRATOR
POLICE CAPTAIN
POLICE COMMANDER
POLICE DETECTIVE
POLICE LIEUTENANT
POLICE OFFICER
POLICE PERFORMANCE AUDITOR
POLICE SERGEANT
POLICE SERVICE REPRESENTATIVE
POLICE SPECIAL INVESTIGATOR
POLICE SPECIALIST
POLICE SURVEILLANCE SPECIALIST
POLYGRAPH EXAMINER
PORT ELECTRICAL MECHANIC
PORT ELECTRICAL MECHANIC SUPERVISOR
PORT MAINTENANCE SUPERVISOR
PORT PILOT
PORT POLICE CAPTAIN
PORT POLICE LIEUTENANT
PORT POLICE OFFICER
PORT POLICE SERGEANT
PORTFOLIO MANAGER
POWER ENGINE

A few things to notice here:
1. The function `job_class_title` was able to read all raw jobs (no errors were caught).
2. At first glance, two jobs clearly fall outside the pattern because their extracted titles are too long: DISTRICT SUPERVISOR ANIMAL SERVICES 4320 022318.txt and MARINE ENVIRONMENTAL SUPERVISOR 9433 071114 (1).txt.
    * Of course, this stems from a limitation of our code, which is modeled solely on SYSTEMS ANALYST 1596 102717.txt. One could go back to `job_class_title` and modify it, using regex for example, to capture all titles. However, we will **not** do that, since the approach taken here is **data normalization**: we make sure every job follows the same pattern as SYSTEMS ANALYST 1596 102717.txt by modifying its content.
    * This may sound unexciting; however, the real benefit of this approach only shows up later, when we need to parse information about job requirements. At that point, we'll see that the data normalization approach outperforms most of the traditional ones, which focus on writing functions that fit all jobs. On top of that, by patiently modifying these jobs manually, we familiarize ourselves with this type of unstructured data and become more alert to where missing values occur.
    * These two jobs were modified by deleting unnecessary information (DISTRICT SUPERVISOR ANIMAL SERVICES 4320 022318.txt) and by inserting a line break (MARINE ENVIRONMENTAL SUPERVISOR 9433 071114 (1).txt):
        * DISTRICT SUPERVISOR ANIMAL SERVICES 4320 022318.txt
        * MARINE ENVIRONMENTAL SUPERVISOR 9433 071114 (1).txt
3. On closer inspection, some jobs have suspicious titles, such as CAMPUS INTERVIEW ONLY.
    * The following list names the jobs that were modified. The phrase CAMPUS INTERVIEW ONLY, which used to appear before the job title, was moved to the NOTES: section and wrapped in markers resembling the Python prompt symbol, i.e. `>>>CAMPUS INTERVIEW ONLY<<<`
        * CityofLA/Job Bulletins/ARCHITECTURAL ASSOCIATE 7926 013114 REV 032916.txt
        * CityofLA/Job Bulletins/ENVIRONMENTAL ENGINEERING ASSOCIATE  7871 020113 REV 032916.txt
        * CityofLA/Job Bulletins/STREET LIGHTING ENGINEERING ASSOCIATE 7527 101102 REV 032916.txt
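
The visual inspection described above could be supported by a small filter. The sketch below is only one possible aid: the function name `flag_suspicious_titles`, the word-count threshold of 8, and the sample titles are all assumptions made for illustration, and the flagged files would still be fixed by hand.

```python
def flag_suspicious_titles(titles, max_words=8, banned_phrases=("CAMPUS INTERVIEW ONLY",)):
    """Return the titles that are unusually long or contain a known non-title phrase."""
    flagged = []
    for title in titles:
        too_long = len(title.split()) > max_words          # assumed threshold
        has_banned = any(phrase in title for phrase in banned_phrases)
        if too_long or has_banned:
            flagged.append(title)
    return flagged

# Made-up extraction results: two plausible titles and two anomalies.
titles = [
    "ACCOUNTANT",
    "AIRPORTS PUBLIC AND COMMUNITY RELATIONS DIRECTOR",
    "CAMPUS INTERVIEW ONLY ARCHITECTURAL ASSOCIATE",
    "SOME TITLE FOLLOWED BY A LONG RUN OF TEXT THAT CLEARLY IS NOT PART OF IT",
]
suspicious = flag_suspicious_titles(titles)
print(suspicious)
```

A filter like this only narrows down which printouts to read; it does not replace reading them.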

**Let's rerun the function and observe the changes. Note we use .txt files in the JobBulletins_cleaned folder.**

Observing the printouts carefully this time, we see that all of the irregularities above have been resolved.

In [5]:
# Rerun the function job_class_title using cleaned data
for file_name in cleaned_jobs:
    job_path = cleaned_path + file_name         # define path to file_name
    with open(job_path, 'rt') as f:             # read in job as a string
        cleaned_job = f.read()
    try:
        print(job_class_title(job=cleaned_job))
    except Exception:                           # print a banner so that failing files stand out
        ## define some useful variables
        border_line = '##############################################################################################'
        how_many    = int((len(border_line) - len(job_path))/2)
        print(border_line)
        ## do pretty printings
        print('#'*how_many + job_path + '#'*how_many)
        print(border_line)

311 DIRECTOR
ACCOUNTANT
ACCOUNTING CLERK
ACCOUNTING RECORDS SUPERVISOR
ADMINISTRATIVE ANALYST
ADMINISTRATIVE CLERK
ADMINISTRATIVE HEARING EXAMINER
ADVANCE PRACTICE PROVIDER CORRECTIONAL CARE
AIR CONDITIONING MECHANIC
AIR CONDITIONING MECHANIC SUPERVISOR
AIRPORT AIDE
AIRPORT CHIEF INFORMATION SECURITY OFFICER
AIRPORT ENGINEER
AIRPORT GUIDE
AIRPORT INFORMATION SPECIALIST
AIRPORT LABOR RELATIONS ADVOCATE
AIRPORT MANAGER
AIRPORT POLICE CAPTAIN
AIRPORT POLICE LIEUTENANT
AIRPORT POLICE OFFICER
AIRPORT POLICE SPECIALIST
AIRPORT SUPERINTENDENT OF OPERATIONS
AIRPORTS MAINTENANCE SUPERINTENDENT
AIRPORTS MAINTENANCE SUPERVISOR
AIRPORTS PUBLIC AND COMMUNITY RELATIONS DIRECTOR
ANIMAL CARE ASSISTANT
ANIMAL CARE TECHNICIAN
WATER TREATMENT OPERATOR
ANIMAL CONTROL OFFICER
ANIMAL KEEPER
APPARATUS OPERATOR
APPLICATIONS PROGRAMMER
APPRENTICE - METAL TRADES
APPRENTICE MACHINIST
AQUARIST
AQUARIUM EDUCATOR
AQUATIC DIRECTOR
AQUATIC FACILITY MANAGER
AQUEDUCT AND RESERVOIR KEEPER
AQUEDUCT AND RESERVOIR SUPERVIS

POLICE SERGEANT
POLICE SERVICE REPRESENTATIVE
POLICE SPECIAL INVESTIGATOR
POLICE SPECIALIST
POLICE SURVEILLANCE SPECIALIST
POLYGRAPH EXAMINER
PORT ELECTRICAL MECHANIC
PORT ELECTRICAL MECHANIC SUPERVISOR
PORT MAINTENANCE SUPERVISOR
PORT PILOT
PORT POLICE CAPTAIN
PORT POLICE LIEUTENANT
PORT POLICE OFFICER
PORT POLICE SERGEANT
PORTFOLIO MANAGER
POWER ENGINEERING MANAGER
POWER SHOVEL OPERATOR
PRE-PRESS OPERATOR
PRINCIPAL ACCOUNTANT
PRINCIPAL ANIMAL KEEPER
PRINCIPAL CITY PLANNER
PRINCIPAL CIVIL ENGINEER
PRINCIPAL CIVIL ENGINEERING DRAFTING TECHNICIAN
PRINCIPAL CLERK
PRINCIPAL CLERK POLICE
PRINCIPAL CLERK UTILITY
PRINCIPAL COMMUNICATIONS OPERATOR
PRINCIPAL CONSTRUCTION INSPECTOR
PRINCIPAL DEPUTY CONTROLLER
PRINCIPAL DETENTION OFFICER
PRINCIPAL ELECTRIC TROUBLE DISPATCHER
PRINCIPAL ELECTRICAL ENGINEERING DRAFTING TECHNICIAN
PRINCIPAL ENVIRONMENTAL ENGINEER
PRINCIPAL GROUNDS MAINTENANCE SUPERVISOR
PRINCIPAL INSPECTOR
PRINCIPAL LIBRARIAN
PRINCIPAL MECHANICAL ENGINEERING DRAFTING TECHNICIAN
PRIN

## Normalize JOB_CLASS_NO (jcn)

In [6]:
# This is a helper function
def job_class_no(job):
    '''Returns the field JOB_CLASS_NO (jcn)'''
    # The class code is located between the phrase 'Class Code' and the phrase 'Open Date'.
    temp = job[job.index('Class Code'):job.index('Open Date')]
    # Keep only the tokens made up entirely of digits via isdigit()
    jcn  = [e for e in temp.split() if e.isdigit()][0] # first element is what we want
    # Per requirement, a Class Code with only 3 digits is left-padded with a zero to become 0###
    if len(jcn) <= 3:
        jcn = '0'+jcn
    
    # Returns
    return jcn
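
The padding rule can also be checked in isolation. The sketch below uses the standard string method `str.zfill` as an alternative to the length test; the helper name `pad_class_code` and the fixed width of 4 are assumptions based on the `0###` rule quoted in the comment above.

```python
def pad_class_code(code, width=4):
    """Left-pad a numeric class code with zeros to a fixed width (assumed to be 4)."""
    return code.zfill(width)

print(pad_class_code("845"))   # 0845
print(pad_class_code("1596"))  # 1596
```

Unlike prepending a single `'0'`, `zfill` would also bring a one- or two-digit input up to four characters.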

In [7]:
# Normalization strategy:
# print jcn in the try clause and print job_path in the except clause,
# then look at the printouts to detect unusual jcn's.
for file_name in raw_jobs:
    job_path = raw_path + file_name        # define path to file_name
    with open(job_path, 'rt') as f:        # read in job as a string
        raw_job = f.read()
    try:
        print(job_class_no(job=raw_job))
    except Exception:                      # print a banner so that failing files stand out
        ## define some useful variables
        border_line = '##############################################################################################'
        how_many    = int((len(border_line) - len(job_path))/2)
        print(border_line)
        ## do pretty printings
        print('#'*how_many + job_path + '#'*how_many)
        print(border_line)

9206
1513
1223
1119
1590
1358
9135
2325
3774
3781
1540
1404
7256
0845
1783
9210
7260
3228
3227
3225
3236
7268
3331
3336
1788
4323
4310
5885
4311
4304
2121
1429
3789
3764
2400
2493
2419
2423
5813
5816
7925
7926
7922
1191
2478
2448
2447
2454
2455
3440
3435
4143
4145
7259
3808
3684
4219
9377
3142
4208
9415
3818
3809
3150
1860
7998
6147
1517
3704
3706
3707
3721
3595
3714
3565
1759
1764
1203
3733
3735
3737
7244
3124
7543
4211
3190
7561
4251
5923
3338
3333
3588
3589
1801
3344
3346
3418
3353
3354
3351
7833
1554
7274
9151
5927
1253
1260
1249
1249
1466
7296
3182
5237
4289
9230
2237
9286
4254
1619
9182
7945
7271
7258
9180
1968
5154
4260
3187
4286
1211
4275
7944
7941
7237
7246
7232
1767
1600
1603
1213
9734
3800
3802
3686
3689
7610
7607
1461
2496
8500
2501
9165
3129
3127
3541
3341
7291
9168
7230
2317
2236
2234
3149
3156
3176
1230
1229
1136
1470
5131
1121
1593
3211
1768
9304
9302
7625
4266
4321
1568
7270
3722
3123
1488
3208
9375
4320
6157
3521
1493
3879
3873
3822
7520
5224
3828
3799
7525
7532
4221


The two jobs listed below were caught with errors. MARINE ENVIRONMENTAL SUPERVISOR 9433 071114 (1).txt has 'Class  Code' (two spaces) instead of 'Class Code' (one space), and PUBLIC INFORMATION DIRECTOR 1800 030317.txt has 'Open date' instead of 'Open Date'. Both files were modified to fix these deviations.
* CityofLA/Job Bulletins/PUBLIC INFORMATION DIRECTOR 1800 030317.txt
* CityofLA/Job Bulletins/MARINE ENVIRONMENTAL SUPERVISOR 9433 071114 (1).txt
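
Deviations of this kind (extra spaces, different letter case) can be located with a tolerant pattern before the files are edited by hand. The sketch below is one way to do so; the function name `header_deviations` and the sample strings are assumptions for illustration. It reports a header phrase only when the canonical spelling is absent but a loose match exists.

```python
import re

# Tolerate any run of whitespace between the two words, and any letter case.
CLASS_CODE_LOOSE = re.compile(r'class\s+code', re.IGNORECASE)
OPEN_DATE_LOOSE = re.compile(r'open\s+date', re.IGNORECASE)

def header_deviations(job):
    """List (canonical, found) pairs for headers present only in a non-canonical spelling."""
    deviations = []
    for canonical, loose in (('Class Code', CLASS_CODE_LOOSE),
                             ('Open Date', OPEN_DATE_LOOSE)):
        match = loose.search(job)
        if canonical not in job and match:
            deviations.append((canonical, match.group(0)))
    return deviations

# Made-up headers mimicking the two kinds of deviation described above.
double_space = "SOME TITLE\nClass  Code: 0000\nOpen Date: 01-01-18\n"
lower_case = "SOME TITLE\nClass Code: 0000\nOpen date: 01-01-18\n"
print(header_deviations(double_space))  # [('Class Code', 'Class  Code')]
print(header_deviations(lower_case))    # [('Open Date', 'Open date')]
```

The regex is used here only to find files that need attention; the extraction functions themselves still rely on the canonical phrases, in keeping with the normalization approach.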

**Let's rerun the function and observe the changes. Note we use .txt files in the JobBulletins_cleaned folder.**

Observing the printouts carefully this time, we see that all of the irregularities above have been resolved.

In [8]:
# Rerun the function job_class_no using cleaned data
for file_name in cleaned_jobs:
    job_path = cleaned_path + file_name         # define path to file_name
    with open(job_path, 'rt') as f:             # read in job as a string
        cleaned_job = f.read()
    try:
        print(job_class_no(job=cleaned_job))
    except Exception:                           # print a banner so that failing files stand out
        ## define some useful variables
        border_line = '##############################################################################################'
        how_many    = int((len(border_line) - len(job_path))/2)
        print(border_line)
        ## do pretty printings
        print('#'*how_many + job_path + '#'*how_many)
        print(border_line)

9206
1513
1223
1119
1590
1358
9135
2325
3774
3781
1540
1404
7256
0845
1783
9210
7260
3228
3227
3225
3236
7268
3331
3336
1788
4323
4310
5885
4311
4304
2121
1429
3789
3764
2400
2493
2419
2423
5813
5816
7925
7926
7922
1191
2478
2448
2447
2454
2455
3440
3435
4143
4145
7259
3808
3684
4219
9377
3142
4208
9415
3818
3809
3150
1860
7998
6147
1517
3704
3706
3707
3721
3595
3714
3565
1759
1764
1203
3733
3735
3737
7244
3124
7543
4211
3190
7561
4251
5923
3338
3333
3588
3589
1801
3344
3346
3418
3353
3354
3351
7833
1554
7274
9151
5927
1253
1260
1249
1249
1466
7296
3182
5237
4289
9230
2237
9286
4254
1619
9182
7945
7271
7258
9180
1968
5154
4260
3187
4286
1211
4275
7944
7941
7237
7246
7232
1767
1600
1603
1213
9734
3800
3802
3686
3689
7610
7607
1461
2496
8500
2501
9165
3129
3127
3541
3341
7291
9168
7230
2317
2236
2234
3149
3156
3176
1230
1229
1136
1470
5131
1121
1593
3211
1768
9304
9302
7625
4266
4321
1568
7270
3722
3123
1488
3208
9375
4320
6157
3521
1493
3879
3873
3822
7520
5224
3828
3799
7525
7532
4221


## Get DUTIES

In [9]:
# First, make sure the word 'DUTIES' can be found in the job postings.
# 'DUTIES' must appear as a standalone whitespace-delimited token;
# files where it does not are printed.
for file_name in raw_jobs:
    job_path = raw_path + file_name        # define path to file_name
    with open(job_path, 'rt') as f:        # read in job as a string
        raw_job = f.read()
    try:
        if 'DUTIES' not in raw_job.split():
            print(job_path)
    except Exception:                      # print a banner so that failing files stand out
        ## define some useful variables
        border_line = '##############################################################################################'
        how_many    = int((len(border_line) - len(job_path))/2)
        print(border_line)
        ## do pretty printings
        print('#'*how_many + job_path + '#'*how_many)
        print(border_line)

CityofLA/Job Bulletins/APPARATUS OPERATOR 2121 071417 (1).txt
CityofLA/Job Bulletins/ENGINEER OF FIRE DEPARTMENT 2131 111116.txt
CityofLA/Job Bulletins/FIRE ASSISTANT CHIEF 2166 011218.txt
CityofLA/Job Bulletins/FIRE BATTALION CHIEF 2152 030918.txt
CityofLA/Job Bulletins/FIRE HELICOPTER PILOT 3563 081415 REV. 081815.txt
CityofLA/Job Bulletins/FIRE INSPECTOR 2128 031717.txt


Below is a list of jobs which don't have the word DUTIES in them. 
* CityofLA/Job Bulletins/APPARATUS OPERATOR 2121 071417 (1).txt
* CityofLA/Job Bulletins/ENGINEER OF FIRE DEPARTMENT 2131 111116.txt
* CityofLA/Job Bulletins/FIRE ASSISTANT CHIEF 2166 011218.txt
* CityofLA/Job Bulletins/FIRE BATTALION CHIEF 2152 030918.txt
* CityofLA/Job Bulletins/FIRE HELICOPTER PILOT 3563 081415 REV. 081815.txt
* CityofLA/Job Bulletins/FIRE INSPECTOR 2128 031717.txt

This is very questionable, as discussed in the Background above. However, we'll proceed for the moment and note that a missing value here indicates that a job posting has no DUTIES section. To normalize these files, we deliberately add the word DUTIES before the phrase REQUIREMENTS/MINIMUM QUALIFICATIONS.
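
Note that the check used here is stricter than a substring search. The sketch below, on made-up fragments, shows the difference: `'DUTIES' in job.split()` is true only when DUTIES stands alone as a whitespace-delimited token, so a variant such as `DUTIES:` would not count. The helper name `has_duties_heading` is an assumption for illustration.

```python
def has_duties_heading(job):
    """True only if 'DUTIES' appears as a standalone whitespace-delimited token."""
    return 'DUTIES' in job.split()

# Made-up fragments; real bulletins are much longer.
with_heading = "ANNUAL SALARY\n$50,000\n\nDUTIES\n\nA Systems Analyst analyzes ...\n"
without_heading = "ANNUAL SALARY\n$50,000\n\nThe DUTIES: described below ...\n"

print(has_duties_heading(with_heading))     # True
print(has_duties_heading(without_heading))  # False: the token is 'DUTIES:'
print('DUTIES' in without_heading)          # True: a substring search is looser
```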

**Let's rerun the check and observe the changes. Note we use .txt files in the JobBulletins_cleaned folder.**

Observing the printouts carefully this time, we see that all of the irregularities above have been resolved.

In [10]:
# Rerun the code above to check if every job has the word DUTIES in it.
# If nothing gets printed out, that means we pass the test.
for file_name in cleaned_jobs:
    job_path = cleaned_path + file_name         # define path to file_name
    with open(job_path, 'rt') as f:             # read in job as a string
        cleaned_job = f.read()
    try:
        if 'DUTIES' not in cleaned_job.split(): # check the cleaned text, not the last raw job
            print(job_path)
    except Exception:                           # print a banner so that failing files stand out
        ## define some useful variables
        border_line = '##############################################################################################'
        how_many    = int((len(border_line) - len(job_path))/2)
        print(border_line)
        ## do pretty printings
        print('#'*how_many + job_path + '#'*how_many)
        print(border_line)

In [11]:
# Second, make sure the phrase 'REQUIREMENTS/MINIMUM QUALIFICATIONS' can be found in the job postings.
# The phrase must match an entire line exactly; files where it does not are printed.
for file_name in raw_jobs:
    job_path = raw_path + file_name        # define path to file_name
    with open(job_path, 'rt') as f:        # read in job as a string
        raw_job = f.read()
    try:
        if 'REQUIREMENTS/MINIMUM QUALIFICATIONS' not in raw_job.split('\n'):
            print(job_path)
    except Exception:                      # print a banner so that failing files stand out
        ## define some useful variables
        border_line = '##############################################################################################'
        how_many    = int((len(border_line) - len(job_path))/2)
        print(border_line)
        ## do pretty printings
        print('#'*how_many + job_path + '#'*how_many)
        print(border_line)

CityofLA/Job Bulletins/311 DIRECTOR  9206 041814.txt
CityofLA/Job Bulletins/ACCOUNTANT 1513 062218.txt
CityofLA/Job Bulletins/ACCOUNTING CLERK 1223 071318.txt
CityofLA/Job Bulletins/ACCOUNTING RECORDS SUPERVISOR 1119 072718.txt
CityofLA/Job Bulletins/ADMINISTRATIVE CLERK 1358 033018 (2).txt
CityofLA/Job Bulletins/ADMINISTRATIVE HEARING EXAMINER 9135 100915.txt
CityofLA/Job Bulletins/ADVANCE PRACTICE PROVIDER CORRECTIONAL CARE 2325 020808 REV 111214.txt
CityofLA/Job Bulletins/AIR CONDITIONING MECHANIC SUPERVISOR 3781 111618 2.txt
CityofLA/Job Bulletins/AIRPORT AIDE 1540 081018.txt
CityofLA/Job Bulletins/AIRPORT CHIEF INFORMATION SECURITY OFFICER 1404 120415_Modified.txt
CityofLA/Job Bulletins/AIRPORT ENGINEER 7256 070618.txt
CityofLA/Job Bulletins/AIRPORT GUIDE 0845 042018.txt
CityofLA/Job Bulletins/AIRPORT INFORMATION SPECIALIST 1783 121115.txt
CityofLA/Job Bulletins/AIRPORT POLICE CAPTAIN 3228 021618.txt
CityofLA/Job Bulletins/AIRPORT POLICE LIEUTENANT 3227 091616.txt
CityofLA/Job Bul

This time the list is quite long, but again, patience will pay off later. We'll modify the contents of the following jobs by manually adding the phrase REQUIREMENTS/MINIMUM QUALIFICATIONS to them.
* CityofLA/Job Bulletins/311 DIRECTOR  9206 041814.txt
* CityofLA/Job Bulletins/ACCOUNTANT 1513 062218.txt
* CityofLA/Job Bulletins/ACCOUNTING CLERK 1223 071318.txt
* CityofLA/Job Bulletins/ACCOUNTING RECORDS SUPERVISOR 1119 072718.txt
* CityofLA/Job Bulletins/ADMINISTRATIVE CLERK 1358 033018 (2).txt
* CityofLA/Job Bulletins/ADMINISTRATIVE HEARING EXAMINER 9135 100915.txt
* CityofLA/Job Bulletins/ADVANCE PRACTICE PROVIDER CORRECTIONAL CARE 2325 020808 REV 111214.txt
* CityofLA/Job Bulletins/AIR CONDITIONING MECHANIC SUPERVISOR 3781 111618 2.txt
* CityofLA/Job Bulletins/AIRPORT AIDE 1540 081018.txt
* CityofLA/Job Bulletins/AIRPORT CHIEF INFORMATION SECURITY OFFICER 1404 120415_Modified.txt
* CityofLA/Job Bulletins/AIRPORT ENGINEER 7256 070618.txt
* CityofLA/Job Bulletins/AIRPORT GUIDE 0845 042018.txt
* CityofLA/Job Bulletins/AIRPORT INFORMATION SPECIALIST 1783 121115.txt
* CityofLA/Job Bulletins/AIRPORT POLICE CAPTAIN 3228 021618.txt
* CityofLA/Job Bulletins/AIRPORT POLICE LIEUTENANT 3227 091616.txt
* CityofLA/Job Bulletins/AIRPORT POLICE OFFICER 3225 110906 Rev 060115.txt
* CityofLA/Job Bulletins/AIRPORT POLICE SPECIALIST 3236 063017 (2).txt
* CityofLA/Job Bulletins/AIRPORT SUPERINTENDENT OF OPERATIONS 7268 121815 (1).txt
* CityofLA/Job Bulletins/ANIMAL CARE TECHNICIAN 4310 040116 REV 041416.txt
* CityofLA/Job Bulletins/ANIMAL CARE TECHNICIAN SUPERVISOR 4313 122118.txt
* CityofLA/Job Bulletins/APPRENTICE - METAL TRADES 3789 070816.txt
* CityofLA/Job Bulletins/APPRENTICE MACHINIST 3764 071516.txt
* CityofLA/Job Bulletins/AQUARIST 2400 050214.txt
* CityofLA/Job Bulletins/AQUARIUM EDUCATOR 2493 010816.txt
* CityofLA/Job Bulletins/AQUATIC FACILITY MANAGER 2423 052915 REVISED 060915.txt
* CityofLA/Job Bulletins/AQUEDUCT AND RESERVOIR KEEPER 5813 063017 (1).txt
* CityofLA/Job Bulletins/AQUEDUCT AND RESERVOIR SUPERVISOR 5816 091115.txt
* CityofLA/Job Bulletins/ARCHITECTURAL ASSOCIATE 7926 013114 REV 032916.txt
* CityofLA/Job Bulletins/ARCHIVIST 1191 020918.txt
* CityofLA/Job Bulletins/ART CENTER DIRECTOR 2478 053014.txt
* CityofLA/Job Bulletins/ART INSTRUCTOR 2447 051316.txt
* CityofLA/Job Bulletins/ASBESTOS SUPERVISOR 3440 012916.txt
* CityofLA/Job Bulletins/ASPHALT PLANT OPERATOR 4143 102414.txt
* CityofLA/Job Bulletins/ASPHALT PLANT SUPERVISOR 4145 110317.txt
* CityofLA/Job Bulletins/ASSISTANT DIRECTOR INFORMATION SYSTEMS 9377 030218.txt
* CityofLA/Job Bulletins/ASSISTANT INSPECTOR 4208 111315.txt
* CityofLA/Job Bulletins/ASSISTANT RETIREMENT PLAN MANAGER 9415 050616.txt
* CityofLA/Job Bulletins/ASSISTANT SIGNAL SYSTEMS ELECTRICIAN 3818 073115_REVISED.txt
* CityofLA/Job Bulletins/ASSISTANT STREET LIGHTING ELECTRICIAN 3809 072117 REV 080818.txt
* CityofLA/Job Bulletins/AUDIO VISUAL TECHNICIAN 6147 062014.txt
* CityofLA/Job Bulletins/AUDITOR 1517 031816 (1).txt
* CityofLA/Job Bulletins/AUTO BODY REPAIR SUPERVISOR 3706 051515.txt
* CityofLA/Job Bulletins/AUTO ELECTRICIAN 3707 052215.txt
* CityofLA/Job Bulletins/AUTOMOTIVE DISPATCHER 3595 102017 revised.txt
* CityofLA/Job Bulletins/AUTOMOTIVE SUPERVISOR 3714 062416.txt
* CityofLA/Job Bulletins/AVIONICS SPECIALIST 3565 103114revised.txt
* CityofLA/Job Bulletins/BENEFITS SPECIALIST 1203 011918.txt
* CityofLA/Job Bulletins/BOILERMAKER 3735 110714.txt
* CityofLA/Job Bulletins/BOILERMAKER SUPERVISOR 3737 101714.txt
* CityofLA/Job Bulletins/BUILDING CIVIL ENGINEER 7244 032318.txt
* CityofLA/Job Bulletins/BUILDING CONSTRUCTION AND MAINTENANCE SUPERINTENDENT 3124 122818.txt
* CityofLA/Job Bulletins/BUILDING ELECTRICAL ENGINEER 7543 071516 REV 071816.txt
* CityofLA/Job Bulletins/BUILDING OPERATING ENGINEER 5923 111618 REV 112818.txt
* CityofLA/Job Bulletins/BUILDING REPAIR SUPERVISOR 3338 111816.txt
* CityofLA/Job Bulletins/BUILDING REPAIRER 3333 030218.txt
* CityofLA/Job Bulletins/BUS OPERATOR 3588 090216.txt
* CityofLA/Job Bulletins/BUS OPERATOR SUPERVISOR 3589 012216.txt
* CityofLA/Job Bulletins/CARPENTER SUPERVISOR 3346 051316.txt
* CityofLA/Job Bulletins/CARPET LAYER 3418 061915.txt
* CityofLA/Job Bulletins/CEMENT FINISHER SUPERVISOR 3354 120916.txt
* CityofLA/Job Bulletins/CEMENT FINISHER WORKER 3351 103015.txt
* CityofLA/Job Bulletins/CHIEF ADMINISTRATIVE ANALYST 1554 062416.txt
* CityofLA/Job Bulletins/CHIEF AIRPORTS ENGINEER 7274 051515 (1).txt
* CityofLA/Job Bulletins/CHIEF BUILDING OPERATING ENGINEER 5927 080516.txt
* CityofLA/Job Bulletins/CHIEF CLERK PERSONNEL 1260 042117.txt
* CityofLA/Job Bulletins/CHIEF CLERK POLICE 1219 061215.txt
* CityofLA/Job Bulletins/CHIEF CLERK POLICE 1249 083118.txt
* CityofLA/Job Bulletins/CHIEF CONSTRUCTION INSPECTOR 7296 122818.txt
* CityofLA/Job Bulletins/CHIEF CUSTODIAN SUPERVISOR 3182 041015.txt
* CityofLA/Job Bulletins/CHIEF ELECTRIC PLANT OPERATOR 5237 121115.txt
* CityofLA/Job Bulletins/CHIEF ENVIRONMENTAL COMPLIANCE INSPECTOR 4289 033018.txt
* CityofLA/Job Bulletins/CHIEF FINANCIAL OFFICER 9230 041114.txt
* CityofLA/Job Bulletins/CHIEF HARBOR ENGINEER 9286 112015.txt
* CityofLA/Job Bulletins/CHIEF INSPECTOR 4254 082517.txt
* CityofLA/Job Bulletins/CHIEF INTERNAL AUDITOR 1619 090916 (5).txt
* CityofLA/Job Bulletins/CHIEF MANAGEMENT ANALYST 9182 020918.txt
* CityofLA/Job Bulletins/CHIEF OF DRAFTING OPERATIONS 7271  042018.txt
* CityofLA/Job Bulletins/CHIEF OF PARKING ENFORCEMENT OPERATIONS 9180 031618.txt
* CityofLA/Job Bulletins/CHIEF PARK RANGER 1968 120106 REV 121306.txt
* CityofLA/Job Bulletins/CHIEF PORT PILOT 5154 031816.txt
* CityofLA/Job Bulletins/CHIEF STREET SERVICES INVESTIGATOR 4286 2017Revised 11.21.txt
* CityofLA/Job Bulletins/CHIEF TAX COMPLIANCE OFFICER 1211 041814.txt
* CityofLA/Job Bulletins/CHIEF TRANSPORTATION INVESTIGATOR 4275 103114.txt
* CityofLA/Job Bulletins/CIVIL ENGINEER 7237 020918.txt
* CityofLA/Job Bulletins/CLAIMS AGENT 1767 020317.txt
* CityofLA/Job Bulletins/COMMERCIAL SERVICE SUPERVISOR  1213 061617.txt
* CityofLA/Job Bulletins/COMMISSION EXECUTIVE ASSISTANT 9734 092118.txt
* CityofLA/Job Bulletins/COMMUNICATIONS CABLE SUPERVISOR 3800 051917 REVISED 060117.txt
* CityofLA/Job Bulletins/COMMUNICATIONS CABLE WORKER 3802 11816.txt
* CityofLA/Job Bulletins/COMMUNITY AFFAIRS ADVOCATE 2496 111414.txt
* CityofLA/Job Bulletins/COMMUNITY HOUSING PROGRAMS MANAGER 8500 072018 REV 080918 (2).txt
* CityofLA/Job Bulletins/CONSTRUCTION AND MAINTENANCE SUPERINTENDENT 3129 082616 REV 090816.txt
* CityofLA/Job Bulletins/CONSTRUCTION AND MAINTENANCE SUPERVISOR 3127 030416.txt
* CityofLA/Job Bulletins/CONSTRUCTION ESTIMATOR 3341 070816 REVISED 072116 (1).txt
* CityofLA/Job Bulletins/CORRECTIONAL NURSE 2317 101615.txt
* CityofLA/Job Bulletins/CRIMINALIST 2234 030918.txt
* CityofLA/Job Bulletins/CUSTODIAN SUPERVISOR 3176 042817 051117 REV.txt
* CityofLA/Job Bulletins/CUSTOMER SERVICE REPRESENTATIVE 1230 020918.txt
* CityofLA/Job Bulletins/DECK HAND 5131 093016.txt
* CityofLA/Job Bulletins/DEPARTMENTAL CHIEF ACCOUNTANT 1593 111717 revised 11.21.txt
* CityofLA/Job Bulletins/DIRECTOR OF HOUSING 1568 062317.txt
* CityofLA/Job Bulletins/DIRECTOR OF MAINTENANCE AIRPORTS 7270 041516.txt
* CityofLA/Job Bulletins/DIRECTOR OF POLICE TRANSPORTATION 3722 061915.txt
* CityofLA/Job Bulletins/DIRECTOR OF PORT CONSTRUCTION AND MAINTENANCE 3123 030416.txt
* CityofLA/Job Bulletins/ELECTRIC DISTRIBUTION MECHANIC SUPERVISOR 3873 102816.txt
* CityofLA/Job Bulletins/ELECTRIC SERVICE REPRESENTATIVE 7520 020317.txt
* CityofLA/Job Bulletins/ELECTRIC TROUBLE DISPATCHER 3828 063017 (1).txt
* CityofLA/Job Bulletins/ELECTRICAL ENGINEERING DRAFTING TECHNICIAN 7532 113018.txt
* CityofLA/Job Bulletins/ELECTRICAL INSPECTOR 4221 030218.txt
* CityofLA/Job Bulletins/ELECTRICAL MECHANIC SUPERVISOR 3835 072216 REVISED 080416.txt
* CityofLA/Job Bulletins/ELECTRICAL REPAIR SUPERVISOR 3855 092217 (7).txt
* CityofLA/Job Bulletins/ELEVATOR MECHANIC 3866 012717 REV 080718.txt
* CityofLA/Job Bulletins/ELEVATOR REPAIR SUPERVISOR 032516 REVISED 040516.txt
* CityofLA/Job Bulletins/EMERGENCY MEDICAL SERVICES EDUCATOR  2322 110615 REV 112515.txt
* CityofLA/Job Bulletins/EMS NURSE PRACTITIONER SUPERVISOR 2340 031116 REV 032316 (1).txt
* CityofLA/Job Bulletins/ENGINEER OF SURVEYS 9486 101615.txt
* CityofLA/Job Bulletins/ENGINEERING DESIGNER 7217 082517 REV 090717.txt
* CityofLA/Job Bulletins/ENGINEERING GEOLOGIST 7255 022318.txt
* CityofLA/Job Bulletins/ENGINEERING GEOLOGIST ASSOCIATE 7253 082517.txt
* CityofLA/Job Bulletins/ENVIRONMENTAL COMPLIANCE INSPECTOR 4292 080516 REV 081616.txt
* CityofLA/Job Bulletins/ENVIRONMENTAL ENGINEER  7872 082616 REV 090116.txt
* CityofLA/Job Bulletins/ENVIRONMENTAL ENGINEERING ASSOCIATE  7871 020113 REV 032916.txt
* CityofLA/Job Bulletins/ENVIRONMENTAL SPECIALIST 7310 012916.txt
* CityofLA/Job Bulletins/ENVIRONMENTAL SUPERVISOR 7304 052518 (1).txt
* CityofLA/Job Bulletins/EQUIPMENT REPAIR SUPERVISOR 3746 012717.txt
* CityofLA/Job Bulletins/EQUIPMENT SUPERINTENDENT 3750 121914.txt
* CityofLA/Job Bulletins/EQUIPMENT SUPERVISOR 3527 041318.txt
* CityofLA/Job Bulletins/EXAMINER OF QUESTIONED DOCUMENTS 3229 120415.txt
* CityofLA/Job Bulletins/EXECUTIVE ADMINISTRATIVE ASSISTANT 1117 083118.txt
* CityofLA/Job Bulletins/EXECUTIVE ASSISTANT AIRPORTS 9186 060917 (1).txt
* CityofLA/Job Bulletins/EXHIBIT PREPARATOR 2444 062416.txt
* CityofLA/Job Bulletins/FINANCIAL MANAGER 1557 070116 Rev.txt
* CityofLA/Job Bulletins/FINGERPRINT IDENTIFICATION EXPERT 1157 052915.txt
* CityofLA/Job Bulletins/FIRE BATTALION CHIEF 2152 030918.txt
* CityofLA/Job Bulletins/FIRE CAPTAIN 2142 033018.txt
* CityofLA/Job Bulletins/FIRE HELICOPTER PILOT 3563 081415 REV. 081815.txt
* CityofLA/Job Bulletins/FIRE INSPECTOR 2128 031717.txt
* CityofLA/Job Bulletins/FIRE PROTECTION ENGINEERING ASSOCIATE 7978 041318.txt
* CityofLA/Job Bulletins/FIRE SPECIAL INVESTIGATOR 1632 021216.txt
* CityofLA/Job Bulletins/FIREARMS EXAMINER 2233 062416.txt
* CityofLA/Job Bulletins/FIREBOAT MATE 5125 102315 rev110515.txt
* CityofLA/Job Bulletins/FIREBOAT PILOT 5127 102315 rev110515 (1).txt
* CityofLA/Job Bulletins/GALLERY ATTENDANT 2442 092515.txt
* CityofLA/Job Bulletins/GARAGE ASSISTANT 3538 012017.txt
* CityofLA/Job Bulletins/GARAGE ATTENDANT 3531 013015.txt
* CityofLA/Job Bulletins/GENERAL AUTOMOTIVE SUPERVISOR 3718 061915.txt
* CityofLA/Job Bulletins/GENERAL SERVICES MANAGER  9601 042117.txt
* CityofLA/Job Bulletins/GEOGRAPHIC INFORMATION SYSTEMS CHIEF  7211 030416.txt
* CityofLA/Job Bulletins/GEOGRAPHIC INFORMATION SYSTEMS SPECIALIST 7213 012414 revised.txt
* CityofLA/Job Bulletins/GEOTECHNICAL ENGINEER 7239 090718 REV 092018.txt
* CityofLA/Job Bulletins/GOLF STARTER 2453 121115.txt
* CityofLA/Job Bulletins/GOLF STARTER SUPERVISOR 2479 120817.txt
* CityofLA/Job Bulletins/GRAPHICS SUPERVISOR 7935 052617 (4).txt
* CityofLA/Job Bulletins/HARBOR PLANNING AND ECONOMIC ANALYST 9224 111816 REV 112916.txt
* CityofLA/Job Bulletins/HEATING AND REFRIGERATION INSPECTOR 4245 121115.txt
* CityofLA/Job Bulletins/HEAVY DUTY EQUIPMENT MECHANIC 3743 021717.txt
* CityofLA/Job Bulletins/HELICOPTER MECHANIC 3742 072206 REV 020818.txt
* CityofLA/Job Bulletins/HELICOPTER MECHANIC SUPERVISOR 3749 121616 REV 122216.txt
* CityofLA/Job Bulletins/HYDROGRAPHER 7263 012717.txt
* CityofLA/Job Bulletins/IMPROVEMENT ASSESSOR SUPERVISOR 1564 100215.txt
* CityofLA/Job Bulletins/INDUSTRIAL AND COMMERCIAL FINANCE OFFICER 9191 rev051515.txt
* CityofLA/Job Bulletins/INDUSTRIAL CHEMIST 7834 020714.txt
* CityofLA/Job Bulletins/INFORMATION SYSTEMS MANAGER 1409 090117 (2).txt
* CityofLA/Job Bulletins/INSTRUMENT MECHANIC SUPERVISOR 3844 051917 final.txt
* CityofLA/Job Bulletins/INTERNAL AUDITOR 1625 011918.txt
* CityofLA/Job Bulletins/IRRIGATION SPECIALIST 3913 020615.txt
* CityofLA/Job Bulletins/LABOR SUPERVISOR 3126 121815.txt
* CityofLA/Job Bulletins/LABORATORY TECHNICIAN 7854 030416.txt
* CityofLA/Job Bulletins/LAND SURVEYING ASSISTANT 7283 120817.txt
* CityofLA/Job Bulletins/LANDSCAPE ARCHITECT 7929 090718.txt
* CityofLA/Job Bulletins/LANDSCAPE ARCHITECTURAL ASSOCIATE  7933 07222016.txt
* CityofLA/Job Bulletins/LEGISLATIVE ASSISTANT 1182 091815.txt
* CityofLA/Job Bulletins/LIBRARIAN 6152 051217 REV 020218.txt
* CityofLA/Job Bulletins/LIBRARY ASSISTANT 1172 051118 (2).txt
* CityofLA/Job Bulletins/LICENSED VOCATIONAL NURSE  2332 042415.txt
* CityofLA/Job Bulletins/LINE MAINTENANCE ASSISTANT 3882 122818.txt
* CityofLA/Job Bulletins/MACHINIST 3763 061016.txt
* CityofLA/Job Bulletins/MACHINIST SUPERVISOR 3766 121815.txt
* CityofLA/Job Bulletins/MANAGEMENT ASSISTANT 1539 032318.txt
* CityofLA/Job Bulletins/MARINE AQUARIUM PROGRAM DIRECTOR 2403 082517.txt
* CityofLA/Job Bulletins/MARINE ENVIRONMENTAL MANAGER 9437 060614.txt
* CityofLA/Job Bulletins/MARINE ENVIRONMENTAL SUPERVISOR 9433 071114 (1).txt
* CityofLA/Job Bulletins/MATERIALS TESTING TECHNICIAN 7968 081318.txt
* CityofLA/Job Bulletins/MECHANICAL HELPER 3771 011317.txt
* CityofLA/Job Bulletins/MECHANICAL REPAIR GENERAL SUPERVISOR 3731 040116 REV 041416.txt
* CityofLA/Job Bulletins/MECHANICAL REPAIR SUPERVISOR 3795 051818.txt
* CityofLA/Job Bulletins/MECHANICAL REPAIRER 3773 092118.txt
* CityofLA/Job Bulletins/METER READER 1611 080715.txt
* CityofLA/Job Bulletins/MOTOR SWEEPER OPERATOR 3585 031618.txt
* CityofLA/Job Bulletins/OCCUPATIONAL HEALTH NURSE  2314 020317 REV 022317 (1).txt
* CityofLA/Job Bulletins/OFFICE ENGINEERING TECHNICIAN 7212 110218.txt
* CityofLA/Job Bulletins/OFFICE TRAINEE 1101 012017.txt
* CityofLA/Job Bulletins/PAINTER SUPERVISOR 3426 120514.txt
* CityofLA/Job Bulletins/PARK SERVICES ATTENDANT 2412 032219.txt
* CityofLA/Job Bulletins/PARK SERVICES SUPERVISOR 2426 072018.txt
* CityofLA/Job Bulletins/PARKING ENFORCEMENT MANAGER 9025 021916 rev022516.txt
* CityofLA/Job Bulletins/PARKING MANAGER 9170 020714.txt
* CityofLA/Job Bulletins/PARKING METER TECHNICIAN 3738 110615 (1).txt
* CityofLA/Job Bulletins/PARKING METER TECHNICIAN SUPERVISOR 3757 2017.txt
* CityofLA/Job Bulletins/PAYROLL ANALYST 1630 031816.txt
* CityofLA/Job Bulletins/PAYROLL SUPERVISOR 1170 102618.txt
* CityofLA/Job Bulletins/PERSONNEL DIRECTOR 1714 050418.txt
* CityofLA/Job Bulletins/PERSONNEL RECORDS SUPERVISOR 1129 041318.txt
* CityofLA/Job Bulletins/PHOTOGRAPHER 1793 041516.txt
* CityofLA/Job Bulletins/PILE DRIVER WORKER 3553 041417.txt
* CityofLA/Job Bulletins/PIPEFITTER SUPERVISOR 3438 081216.txt
* CityofLA/Job Bulletins/PLUMBER 3443 113018.txt
* CityofLA/Job Bulletins/POLICE COMMANDER 2251 092917.txt
* CityofLA/Job Bulletins/POLICE DETECTIVE 2223 033018.txt
* CityofLA/Job Bulletins/POLICE OFFICER 2214 110906 Rev 060115.txt
* CityofLA/Job Bulletins/POLICE PERFORMANCE AUDITOR 1627 120216.txt
* CityofLA/Job Bulletins/POLICE SERGEANT 2227 102116.txt
* CityofLA/Job Bulletins/POLICE SERVICE REPRESENTATIVE 2207 051316 REV 051716.txt
* CityofLA/Job Bulletins/POLICE SPECIAL INVESTIGATOR 1640 072018 REV 011019.txt
* CityofLA/Job Bulletins/POLICE SPECIALIST 2217 110906 Rev 060115.txt
* CityofLA/Job Bulletins/POLICE SURVEILLANCE SPECIALIST 3687 052215.txt
* CityofLA/Job Bulletins/POLYGRAPH EXAMINER 2240 121517.txt
* CityofLA/Job Bulletins/PORT ELECTRICAL MECHANIC 3758 022616.txt
* CityofLA/Job Bulletins/PORT ELECTRICAL MECHANIC SUPERVISOR 3759 031816.txt
* CityofLA/Job Bulletins/PORT MAINTENANCE SUPERVISOR 3128 052016 REV 060216.txt
* CityofLA/Job Bulletins/PORT POLICE CAPTAIN 3224 110416.txt
* CityofLA/Job Bulletins/PORT POLICE LIEUTENANT 3223 120916.txt
* CityofLA/Job Bulletins/PORT POLICE OFFICER 3221 110906 Rev 060115.txt
* CityofLA/Job Bulletins/PORT POLICE SERGEANT 3222 121616.txt
* CityofLA/Job Bulletins/POWER SHOVEL OPERATOR 3558 062416.txt
* CityofLA/Job Bulletins/PRE-PRESS OPERATOR 1481 072817 (4).txt
* CityofLA/Job Bulletins/PRINCIPAL ACCOUNTANT 1525 121517.txt
* CityofLA/Job Bulletins/PRINCIPAL ANIMAL KEEPER 4312 070618.txt
* CityofLA/Job Bulletins/PRINCIPAL CIVIL ENGINEER 9489 022318.txt
* CityofLA/Job Bulletins/PRINCIPAL CIVIL ENGINEERING DRAFTING TECHNICIAN 7219 110218.txt
* CityofLA/Job Bulletins/PRINCIPAL CLERK 1201 021618.txt
* CityofLA/Job Bulletins/PRINCIPAL CLERK POLICE 1152 121815.txt
* CityofLA/Job Bulletins/PRINCIPAL COMMUNICATIONS OPERATOR 1458 072514.txt
* CityofLA/Job Bulletins/PRINCIPAL CONSTRUCTION INSPECTOR 7297 021618.txt
* CityofLA/Job Bulletins/PRINCIPAL DEPUTY CONTROLLER 7260 032814.txt
* CityofLA/Job Bulletins/PRINCIPAL DETENTION OFFICER 3215 101218.txt
* CityofLA/Job Bulletins/PRINCIPAL ELECTRIC TROUBLE DISPATCHER 3830 022616.txt
* CityofLA/Job Bulletins/PRINCIPAL ELECTRICAL ENGINEERING DRAFTING TECHNICIAN 7531 090916 TRACK CHANGES.txt
* CityofLA/Job Bulletins/PRINCIPAL ENVIRONMENTAL ENGINEER 7875 092118.txt
* CityofLA/Job Bulletins/PRINCIPAL GROUNDS MAINTENANCE SUPERVISOR  3147 111315.txt
* CityofLA/Job Bulletins/PRINCIPAL INSPECTOR 4226 061617.txt
* CityofLA/Job Bulletins/PRINCIPAL MECHANICAL ENGINEERING DRAFTING TECHNICIAN 7550 081415.txt
* CityofLA/Job Bulletins/PRINCIPAL PHOTOGRAPHER 1794 040116.txt
* CityofLA/Job Bulletins/PRINCIPAL PROPERTY OFFICER 3210 121517.txt
* CityofLA/Job Bulletins/PRINCIPAL RECREATION SUPERVISOR 2464 021618.txt
* CityofLA/Job Bulletins/PRINCIPAL STOREKEEPER 1839 072718.txt
* CityofLA/Job Bulletins/PRINCIPAL TAX AUDITOR 1524 110416.txt
* CityofLA/Job Bulletins/PRINCIPAL TAX COMPLIANCE OFFICER 1195 030218.txt
* CityofLA/Job Bulletins/PRINCIPAL UTILITY ACCOUNTANT 1589 030218 updated.txt
* CityofLA/Job Bulletins/PRINCIPAL WORKERS_ COMPENSATION ANALYST 1777 071814.txt
* CityofLA/Job Bulletins/PRINTING PRESS OPERATOR 1494 092515.txt
* CityofLA/Job Bulletins/PROPERTY OFFICER 3207 071417 (1).txt
* CityofLA/Job Bulletins/PROTECTIVE COATING WORKER 3463 082115.txt
* CityofLA/Job Bulletins/PUBLIC RELATIONS SPECIALIST 1785 012017.txt
* CityofLA/Job Bulletins/RATES MANAGER 5601 012017.txt
* CityofLA/Job Bulletins/REAL ESTATE ASSOCIATE 1941 052716.txt
* CityofLA/Job Bulletins/REAL ESTATE OFFICER 1960 051118.txt
* CityofLA/Job Bulletins/RECREATION COORDINATOR 2469 091517(1).txt
* CityofLA/Job Bulletins/RECREATION SUPERVISOR 2460 101416 REVISED 102716.txt
* CityofLA/Job Bulletins/REFUSE COLLECTION SUPERVISOR 4101 033117.txt
* CityofLA/Job Bulletins/REHABILITATION PROJECT COORDINATOR 8502 032715.txt
* CityofLA/Job Bulletins/REINFORCING STEEL WORKER 3483 022318.txt
* CityofLA/Job Bulletins/REPROGRAPHICS OPERATOR 3162 110615.txt
* CityofLA/Job Bulletins/REPROGRAPHICS SUPERVISOR  3163 091517.txt
* CityofLA/Job Bulletins/RETIREMENT PLAN MANAGER 9149 052314 (1).txt
* CityofLA/Job Bulletins/RISK AND INSURANCE ASSISTANT 1645 072718.txt
* CityofLA/Job Bulletins/RISK MANAGER 1530 2016 061716_REVISED.txt
* CityofLA/Job Bulletins/ROOFER 3476 121214.txt
* CityofLA/Job Bulletins/SAFETY ADMINISTRATOR 1728 101615.txt
* CityofLA/Job Bulletins/SAFETY ENGINEER 1727 021717.txt
* CityofLA/Job Bulletins/SAFETY ENGINEER ELEVATORS 4263 112015 REV 120215.txt
* CityofLA/Job Bulletins/SANITATION SOLID RESOURCES MANAGER 4126 060515.txt
* CityofLA/Job Bulletins/SECRETARY 1116 030317.txt
* CityofLA/Job Bulletins/SECRETARY LEGAL  1924 081718.txt
* CityofLA/Job Bulletins/SECURITY AIDE  3199 090415.txt
* CityofLA/Job Bulletins/SENIOR ACCOUNTANT 1523 030218.txt
* CityofLA/Job Bulletins/SENIOR ADMINISTRATIVE CLERK 1368 062918 REV 091718.txt
* CityofLA/Job Bulletins/SENIOR ANIMAL CONTROL OFFICER 4316 111618.txt
* CityofLA/Job Bulletins/SENIOR ANIMAL KEEPER 4305 022616.txt
* CityofLA/Job Bulletins/SENIOR ARCHITECTURAL DRAFTING TECHNICIAN 7208 091418.txt
* CityofLA/Job Bulletins/SENIOR AUDITOR 1518 102618.txt
* CityofLA/Job Bulletins/SENIOR AUTOMOTIVE SUPERVISOR 3716 112015.txt
* CityofLA/Job Bulletins/SENIOR BUILDING INSPECTOR 4213 010816.txt
* CityofLA/Job Bulletins/SENIOR BUILDING MECHANICAL INSPECTOR 4253 2017 REV (1).txt
* CityofLA/Job Bulletins/SENIOR BUILDING OPERATING ENGINEER 5925 011615 (1).txt
* CityofLA/Job Bulletins/SENIOR CARPENTER  3345 081117 REV 082417.txt
* CityofLA/Job Bulletins/SENIOR CHEMIST 7830 030416.txt
* CityofLA/Job Bulletins/SENIOR CIVIL ENGINEERING DRAFTING TECHNICIAN 7207 081718.txt
* CityofLA/Job Bulletins/SENIOR CLAIMS REPRESENTATIVE 1770 070717 (1).txt
* CityofLA/Job Bulletins/SENIOR COMMUNICATIONS CABLE WORKER 3801 102116 draft.txt
* CityofLA/Job Bulletins/SENIOR COMMUNICATIONS ELECTRICIAN 3638 030317 (1).txt
* CityofLA/Job Bulletins/SENIOR COMMUNICATIONS ELECTRICIAN SUPERVISOR 3691 041318.txt
* CityofLA/Job Bulletins/SENIOR COMMUNICATIONS OPERATOR 1467 122118.txt
* CityofLA/Job Bulletins/SENIOR COMPUTER OPERATOR 1428 102017.txt
* CityofLA/Job Bulletins/SENIOR CONSTRUCTION ENGINEER 7289 042514.txt
* CityofLA/Job Bulletins/SENIOR DATA PROCESSING TECHNICIAN 1139 081117.txt
* CityofLA/Job Bulletins/SENIOR DETENTION OFFICER 3212 012017.txt
* CityofLA/Job Bulletins/SENIOR ELECTRIC TROUBLE DISPATCHER 3829 100716.txt
* CityofLA/Job Bulletins/SENIOR ELECTRICAL ENGINEERING DRAFTING TECHNICIAN 7209 042817 REV 051117.txt
* CityofLA/Job Bulletins/SENIOR ELECTRICAL INSPECTOR 4223 042718.txt
* CityofLA/Job Bulletins/SENIOR ELECTRICAL MECHANIC 3834 060217 (2) REVISED.txt
* CityofLA/Job Bulletins/SENIOR ELECTRICAL MECHANIC SUPERVISOR 3836 080417.txt
* CityofLA/Job Bulletins/SENIOR ELECTRICAL REPAIR SUPERVISOR 3856 060118.txt
* CityofLA/Job Bulletins/SENIOR ELECTRICAL TEST TECHNICIAN  7515 092917 REV 101117.txt
* CityofLA/Job Bulletins/SENIOR ELECTRICIAN  3864 102116 Rev 110216.txt
* CityofLA/Job Bulletins/SENIOR ENVIRONMENTAL COMPLIANCE INSPECTOR 4293 042916 REV 051916.txt
* CityofLA/Job Bulletins/SENIOR ENVIRONMENTAL ENGINEER 7874 121815.txt
* CityofLA/Job Bulletins/SENIOR EQUIPMENT MECHANIC 3712 010518 REV 080718.txt
* CityofLA/Job Bulletins/SENIOR FIRE PROTECTION ENGINEER 7981 021916 rev022516.txt
* CityofLA/Job Bulletins/SENIOR FORENSIC PRINT SPECIALIST 2201 090718.txt
* CityofLA/Job Bulletins/SENIOR GARDENER 3143 121517 (1)revised.txt
* CityofLA/Job Bulletins/SENIOR HEATING AND REFRIGERATION INSPECTOR 4247 121115.txt
* CityofLA/Job Bulletins/SENIOR HEAVY DUTY EQUIPMENT MECHANIC 3745 012017.txt
* CityofLA/Job Bulletins/SENIOR HOUSING INSPECTOR 4244 042718.txt
* CityofLA/Job Bulletins/SENIOR HYDROGRAPHER 7264 030714.txt
* CityofLA/Job Bulletins/SENIOR LIBRARIAN 6153 033117.txt
* CityofLA/Job Bulletins/SENIOR LOAD DISPATCHER 5235 060118.txt
* CityofLA/Job Bulletins/SENIOR MACHINIST SUPERVISOR 3768 051016.txt
* CityofLA/Job Bulletins/SENIOR MANAGEMENT ANALYST 9171 040618.txt
* CityofLA/Job Bulletins/SENIOR MECHANICAL ENGINEERING DRAFTING TECHNICIAN 7210 110416.txt
* CityofLA/Job Bulletins/SENIOR MECHANICAL REPAIRER 3772 030416.txt
* CityofLA/Job Bulletins/SENIOR PAINTER 3424 041318.txt
* CityofLA/Job Bulletins/SENIOR PARK MAINTENANCE SUPERVISOR 3146 101416.txt
* CityofLA/Job Bulletins/SENIOR PARK RANGER 1967 091815.txt
* CityofLA/Job Bulletins/SENIOR PARKING ATTENDANT 3529 032417.txt
* CityofLA/Job Bulletins/SENIOR PHOTOGRAPHER 1795 041516 REVISED 042816.txt
* CityofLA/Job Bulletins/SENIOR PLUMBER 3444 020516.txt
* CityofLA/Job Bulletins/SENIOR PLUMBING INSPECTOR 4233 051818.txt
* CityofLA/Job Bulletins/SENIOR POLICE SERVICE REPRESENTATIVE 2209 020918.txt
* CityofLA/Job Bulletins/SENIOR PROPERTY OFFICER 3209 012618.txt
* CityofLA/Job Bulletins/SENIOR RECREATION DIRECTOR 2446 050517 REV 051117.txt
* CityofLA/Job Bulletins/SENIOR ROOFER 3477 101708 REV 110608.txt
* CityofLA/Job Bulletins/SENIOR SAFETY ENGINEER ELEVATORS 4264  042718.txt
* CityofLA/Job Bulletins/SENIOR SECURITY OFFICER 3184 122818.txt
* CityofLA/Job Bulletins/SENIOR STOREKEEPER 1837 052518.txt
* CityofLA/Job Bulletins/SENIOR SYSTEMS ANALYST 1597 100617.txt
* CityofLA/Job Bulletins/SENIOR TITLE EXAMINER 1947 121517.txt
* CityofLA/Job Bulletins/SENIOR TRAFFIC SUPERVISOR 3218 121517.txt
* CityofLA/Job Bulletins/SENIOR TRANSPORTATION INVESTIGATOR 4273 070717 (2).txt
* CityofLA/Job Bulletins/SENIOR UNDERGROUND DISTRIBUTION CONSTRUCTION SUPERVISOR 3815 072817.txt
* CityofLA/Job Bulletins/SENIOR UTILITY ACCOUNTANT 1521 100716.txt
* CityofLA/Job Bulletins/SENIOR UTILITY BUYER 1862 052518.txt
* CityofLA/Job Bulletins/SENIOR UTILITY SERVICES SPECIALIST 3573 113018.txt
* CityofLA/Job Bulletins/SENIOR UTILITY SERVICES SPECIALIST 3753 121815 (1).txt
* CityofLA/Job Bulletins/SENIOR WINDOW CLEANER 3174 013114 Rev021314.txt
* CityofLA/Job Bulletins/SHEET METAL SUPERVISOR 3777 061314.txt
* CityofLA/Job Bulletins/SHEET METAL WORKER 3775 093016.txt
* CityofLA/Job Bulletins/SHIFT SUPERINTENDENT WASTEWATER TREATMENT 7242 072415.txt
* CityofLA/Job Bulletins/SHOPS SUPERINTENDENT 3780 051118.txt
* CityofLA/Job Bulletins/SIGN PAINTER 3428 121214.txt
* CityofLA/Job Bulletins/SIGN SHOP SUPERVISOR 3419 030615.txt
* CityofLA/Job Bulletins/SIGNAL SYSTEMS SUPERINTENDENT 3832 110416.txt
* CityofLA/Job Bulletins/SIGNAL SYSTEMS SUPERVISOR 3839 092818.txt
* CityofLA/Job Bulletins/SOCIAL WORKER 2385 102717  revised.txt
* CityofLA/Job Bulletins/SOLID RESOURCES SUPERINTENDENT 4102 031017 REV 032317 (2).txt
* CityofLA/Job Bulletins/SPECIAL INVESTIGATOR 0602 042216.txt
* CityofLA/Job Bulletins/SR CRIME _ INTELLIGENCE ANALYST 2241 011516.txt
* CityofLA/Job Bulletins/STAFF ASSISTANT TO GENERAL MANAGER WATER AND POWER 9185 032715.txt
* CityofLA/Job Bulletins/STEAM PLANT MAINTENANCE MECHANIC 5630 0902116.txt
* CityofLA/Job Bulletins/STEAM PLANT MAINTENANCE SUPERVISOR 3786 033117.txt
* CityofLA/Job Bulletins/STEAM PLANT OPERATOR 5624 101416.txt
* CityofLA/Job Bulletins/STORES SUPERVISOR 1866 122917.txt
* CityofLA/Job Bulletins/STREET LIGHTING CONSTRUCTION AND MAINTENANCE SUPERINTENDENT 3820 051818.txt
* CityofLA/Job Bulletins/STREET LIGHTING ELECTRICIAN SUPERVISOR 3840 031717.txt
* CityofLA/Job Bulletins/STREET LIGHTING ENGINEERING ASSOCIATE 7527 101102 REV 032916.txt
* CityofLA/Job Bulletins/STREET SERVICES GENERAL SUPERINTENDENT 4160 042916.txt
* CityofLA/Job Bulletins/STREET SERVICES INVESTIGATOR 4283 102315 REV 110315.txt
* CityofLA/Job Bulletins/STREET SERVICES SUPERVISOR 4152 082815.txt
* CityofLA/Job Bulletins/STREET SERVICES WORKER 4150 032318.txt
* CityofLA/Job Bulletins/STREET TREE SUPERINTENDENT 3160 060917.txt
* CityofLA/Job Bulletins/STRUCTURAL ENGINEER 7956 101918.txt
* CityofLA/Job Bulletins/STRUCTURAL STEEL FABRICATOR 3793 122316.txt
* CityofLA/Job Bulletins/STRUCTURAL STEEL FABRICATOR SUPERVISOR 3794 060217.txt
* CityofLA/Job Bulletins/SUPERINTENDENT OF RECREATION AND PARKS OPERATIONS 2472 012618.txt
* CityofLA/Job Bulletins/SUPERVISING CRIMINALIST 2235 030416.txt
* CityofLA/Job Bulletins/SUPERVISING OCCUPATIONAL HEALTH  2315 111414.txt
* CityofLA/Job Bulletins/SUPERVISING TRANSPORTATION PLANNER 2481 072216.txt
* CityofLA/Job Bulletins/SUPERVISING WATER SERVICE REPRESENTATIVE 1697 081318.txt
* CityofLA/Job Bulletins/SURVEY PARTY CHIEF 7286 093016.txt
* CityofLA/Job Bulletins/SURVEY SUPERVISOR 7287 110918.txt
* CityofLA/Job Bulletins/SYSTEMS AIDE 1599 070116.txt
* CityofLA/Job Bulletins/SYSTEMS PROGRAMMER 1455 091616 REV 100416.txt
* CityofLA/Job Bulletins/TAX COMPLIANCE AIDE 1173 061215 (1).txt
* CityofLA/Job Bulletins/TAX COMPLIANCE OFFICER 1179 111816.txt
* CityofLA/Job Bulletins/TILE SETTER 3493 090415.txt
* CityofLA/Job Bulletins/TITLE EXAMINER 1943 032318 REV 040518.txt
* CityofLA/Job Bulletins/TRAFFIC MARKING AND SIGN SUPERINTENDENT 3430 032219.txt
* CityofLA/Job Bulletins/TRAFFIC OFFICER 3214 040116.txt
* CityofLA/Job Bulletins/TRAFFIC PAINTER AND SIGN POSTER 3421 033117.txt
* CityofLA/Job Bulletins/TRANSMISSION AND DISTRIBUTION DISTRICT SUPERVISOR 3875 050418 REV 051718.txt
* CityofLA/Job Bulletins/TRANSPORTATION ENGINEER 7278 092917.txt
* CityofLA/Job Bulletins/TRANSPORTATION ENGINEERING AIDE 7285 100915.txt
* CityofLA/Job Bulletins/TRANSPORTATION ENGINEERING ASSOCIATE 7280 072415.txt
* CityofLA/Job Bulletins/TRANSPORTATION INVESTIGATOR 4271 061016.txt
* CityofLA/Job Bulletins/TRANSPORTATION PLANNING ASSOCIATE 2480 072018.txt
* CityofLA/Job Bulletins/TREE SURGEON ASSISTANT 3151 060316.txt
* CityofLA/Job Bulletins/TRUCK OPERATOR 3583 012618.txt
* CityofLA/Job Bulletins/UNDERGROUND DISTRIBUTION CONSTRUCTION SUPERVISOR 3814 121418 REV 122718 (1).txt
* CityofLA/Job Bulletins/UPHOLSTERER 3723 041715.txt
* CityofLA/Job Bulletins/UTILITIES SERVICE INVESTIGATOR 1631 101615 (1).txt
* CityofLA/Job Bulletins/UTILITY ACCOUNTANT 1511 092818.txt
* CityofLA/Job Bulletins/UTILITY ADMINISTRATOR 9105 060217.txt
* CityofLA/Job Bulletins/UTILITY BUYER 1861 090718.txt
* CityofLA/Job Bulletins/UTILITY EXECUTIVE SECRETARY 1336 042817 (1).txt
* CityofLA/Job Bulletins/UTILITY SERVICES SPECIALIST 3755 072117 (1).txt
* CityofLA/Job Bulletins/VETERINARY TECHNICIAN 2369 020599 REV 120417.txt
* CityofLA/Job Bulletins/VIDEO TECHNICIAN 6145 012717.txt
* CityofLA/Job Bulletins/WASTEWATER TREATMENT ELECTRICIAN SUPERVISOR 5613 060515.txt
* CityofLA/Job Bulletins/WATER BIOLOGIST 7856 120216.txt
* CityofLA/Job Bulletins/WATER MICROBIOLOGIST  7857 072514 rev073114.txt
* CityofLA/Job Bulletins/WATER SERVICE REPRESENTATIVE 1693 111717.txt
* CityofLA/Job Bulletins/WATER SERVICE SUPERVISOR 3930 012717.txt
* CityofLA/Job Bulletins/WATER TREATMENT OPERATOR 5885 122118.txt
* CityofLA/Job Bulletins/WATER TREATMENT SUPERVISOR 5887 072018.txt
* CityofLA/Job Bulletins/WATER UTILITY SUPERINTENDENT 3980 121418.txt
* CityofLA/Job Bulletins/WATERSHED RESOURCES SPECIALIST  7862 080516 (1).txt
* CityofLA/Job Bulletins/WATERWORKS ENGINEER 7248 071516 (1).txt
* CityofLA/Job Bulletins/WATERWORKS MECHANIC SUPERVISOR 3987 051614 (1).txt
* CityofLA/Job Bulletins/WELDER SUPERVISOR 3798 120817.txt
* CityofLA/Job Bulletins/WORKERS_ COMPENSATION ANALYST 1774 032417R.txt
* CityofLA/Job Bulletins/WORKERS_ COMPENSATION CLAIMS ASSISTANT 1775 041114.txt
* CityofLA/Job Bulletins/X-RAY AND LABORATORY TECHNICIAN 2358 012916.txt
* CityofLA/Job Bulletins/ZOO CURATOR 4297 040816.txt

**Let's rerun the function and observe the changes. Note we use .txt files in the JobBulletins_cleaned folder.**

Looking carefully at the printouts this time, we see that all of the issues above have been resolved.

In [12]:
# Rerun the code above to check if every job has the phrase 'REQUIREMENTS/MINIMUM QUALIFICATIONS' in it.
# If nothing gets printed out, that means we pass the test.
for file_name in cleaned_jobs:
    job_path     = cleaned_path + file_name    # define path to file_name
    cleaned_job  = open(job_path, 'rt').read() # read in job as a string
    try:
        if 'REQUIREMENTS/MINIMUM QUALIFICATIONS' not in cleaned_job.split('\n'):
            print(job_path)
    except:                                    # do some pretty printings here to help our eyes from pain
        ## define some useful variables
        border_line = '##############################################################################################'
        how_many    = int((len(border_line) - len(job_path))/2)
        print(border_line)
        ## do pretty printings
        print('#'*how_many + job_path + '#'*how_many)
        print(border_line)
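
The check above uses list membership on `split('\n')`, so it passes only when the heading occupies a whole line by itself. A plain substring test would also accept the phrase when it is embedded in a sentence or followed by a stray space. Here is a minimal sketch of the difference; the sample strings and the `has_heading_line` helper are made up for illustration and are not part of the notebook.

```python
HEADING = 'REQUIREMENTS/MINIMUM QUALIFICATIONS'

def has_heading_line(text, heading=HEADING):
    """Return True only if `heading` is a complete line of `text`."""
    return heading in text.split('\n')

# Hypothetical posting snippets (not taken from the dataset)
good   = 'DUTIES\nDoes things.\n\n' + HEADING + '\n1. Graduation ...'
padded = 'DUTIES\nDoes things.\n\n' + HEADING + ' \n1. Graduation ...'  # trailing space
inline = 'DUTIES\nSee ' + HEADING + ' below.'

for name, text in [('good', good), ('padded', padded), ('inline', inline)]:
    # first flag: exact-line test, second flag: substring test
    print(name, has_heading_line(text), HEADING in text)
```

Only `good` passes the exact-line test, while all three pass the substring test. This is why a file can contain the phrase and still be printed by the cell above.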

Finally, since our code depends on a strict logical order of sections in a job posting, we want to make sure that there is no other section between DUTIES and REQUIREMENTS/MINIMUM QUALIFICATIONS. This is where the labor-intensive work we've done so far starts to pay off. Experience shows that a NOTE: or NOTES: section sometimes follows DUTIES and precedes REQUIREMENTS/MINIMUM QUALIFICATIONS, although this is rare. The code below therefore identifies the jobs that have this pattern. If there is one, we'll move the section to **before** DUTIES and rename it NOTES (without the colon, to avoid confusion with the other NOTES:). Also note that we're verifying this on the .txt files in JobBulletins_cleaned.

In [13]:
# Make sure there is no other section between DUTIES and REQUIREMENTS/MINIMUM QUALIFICATIONS.
for file_name in cleaned_jobs:
    job_path     = cleaned_path + file_name    # define path to file_name
    cleaned_job  = open(job_path, 'rt').read() # read in job as a string
    try:
        txt = (cleaned_job[cleaned_job.index('DUTIES')
                            :cleaned_job.index('REQUIREMENTS/MINIMUM QUALIFICATIONS')]) # extract relevant text
        txt = txt.replace(':', ' ')             # replace colons with spaces
        if 'NOTE' in txt:                       # 'NOTE' also matches 'NOTES'
            print(job_path)
    except ValueError:                         # a section heading was not found; print the path in a banner
        ## define some useful variables
        border_line = '##############################################################################################'
        how_many    = int((len(border_line) - len(job_path))/2)
        print(border_line)
        ## do pretty printings
        print('#'*how_many + job_path + '#'*how_many)
        print(border_line)
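
The move described above was done by hand. As a rough sketch of what it amounts to, the hypothetical helper below (`move_note_before_duties` is not part of the notebook) relocates a NOTE:/NOTES: block that sits between DUTIES and REQUIREMENTS/MINIMUM QUALIFICATIONS and renames its heading to NOTES. It assumes every heading occupies its own line, which may not hold for every raw file.

```python
import re

# A line consisting only of NOTE, NOTE:, NOTES or NOTES: (assumed heading form)
NOTE_HEADING = re.compile(r'^NOTES?:?\s*$')

def move_note_before_duties(text,
                            duties='DUTIES',
                            requirements='REQUIREMENTS/MINIMUM QUALIFICATIONS'):
    """Move a NOTE(S): block found after DUTIES to just before DUTIES."""
    lines = text.split('\n')
    start = lines.index(duties)        # raises ValueError if the heading is missing
    end = lines.index(requirements)
    note_at = next((i for i in range(start + 1, end)
                    if NOTE_HEADING.match(lines[i])), None)
    if note_at is None:                # nothing to move
        return text
    note_block = ['NOTES'] + lines[note_at + 1:end]   # renamed heading + its body
    return '\n'.join(lines[:start] + note_block + lines[start:note_at] + lines[end:])

# Made-up posting text for illustration
sample = ('ANNUAL SALARY\n$1\n\nDUTIES\nDoes things.\n\n'
          'NOTE:\nNight shifts.\n\n'
          'REQUIREMENTS/MINIMUM QUALIFICATIONS\n1. X')
print(move_note_before_duties(sample))
```

Doing the edit manually, as in this notebook, has the advantage that each file is actually read. A helper like this would only be safe after confirming that the NOTE block really belongs to DUTIES.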

**Let's rerun the function and observe the changes. Note we use .txt files in the JobBulletins_cleaned folder.**

Looking carefully at the printouts this time, we see that all of the issues above have been resolved.

In [14]:
# Make sure there is no other section between DUTIES and REQUIREMENTS/MINIMUM QUALIFICATIONS.
for file_name in cleaned_jobs:
    job_path     = cleaned_path + file_name    # define path to file_name
    cleaned_job  = open(job_path, 'rt').read() # read in job as a string
    try:
        txt = (cleaned_job[cleaned_job.index('DUTIES')
                            :cleaned_job.index('REQUIREMENTS/MINIMUM QUALIFICATIONS')]) # extract relevant text
        txt = txt.replace(':', ' ')             # replace colons with spaces
        if 'NOTE' in txt:                       # 'NOTE' also matches 'NOTES'
            print(job_path)
    except ValueError:                         # a section heading was not found; print the path in a banner
        ## define some useful variables
        border_line = '##############################################################################################'
        how_many    = int((len(border_line) - len(job_path))/2)
        print(border_line)
        ## do pretty printings
        print('#'*how_many + job_path + '#'*how_many)
        print(border_line)

## Get ENTRY_SALARY_GEN (esg)
Since the helper function for this field is closely related to the helper function for ENTRY_SALARY_DWP, we'll write a sub-helper function (its name starts with an underscore, _) to assist the process. Recall from above that we have the following section order: ... ANNUAL SALARY - NOTES: - NOTES - DUTIES. The first NOTES: has a colon and is most often used to explain salary information, while the second NOTES has no colon and is used to explain specifics of DUTIES (e.g., night shift is required). Since the salary information sits between ANNUAL SALARY and NOTES:, our first job is to make sure that NOTES: is in every job. We therefore check which jobs don't have NOTES: between ANNUAL SALARY and DUTIES and, for each such job, manually add NOTES: in the order mentioned.

We should also take this opportunity to check for any inconsistency that the functions above haven't encountered so far. This is the true power of the normalization approach: jobs are cross-checked in so many ways that we are very unlikely to miss a job with an unknown inconsistency, which makes the final .csv output very accurate.

In [22]:
# Find which job doesn't have NOTES: in between ANNUAL SALARY and DUTIES
for file_name in raw_jobs:
    job_path = raw_path + file_name        # define path to file_name
    raw_job  = open(job_path, 'rt').read() # read in job as a string
    try:
        txt  = raw_job[raw_job.index('ANNUAL SALARY'):raw_job.index('DUTIES')]
        if 'NOTES:' not in txt:
            print(job_path)
    except ValueError:                     # a section heading was not found; print the path in a banner
        ## define some useful variables
        border_line = '##############################################################################################'
        how_many    = int((len(border_line) - len(job_path))/2)
        print(border_line)
        ## do pretty printings
        print('#'*how_many + job_path + '#'*how_many)
        print(border_line)

CityofLA/Job Bulletins/311 DIRECTOR  9206 041814.txt
CityofLA/Job Bulletins/ACCOUNTING CLERK 1223 071318.txt
CityofLA/Job Bulletins/ADMINISTRATIVE HEARING EXAMINER 9135 100915.txt
CityofLA/Job Bulletins/ADVANCE PRACTICE PROVIDER CORRECTIONAL CARE 2325 020808 REV 111214.txt
CityofLA/Job Bulletins/AIRPORT AIDE 1540 081018.txt
CityofLA/Job Bulletins/AIRPORT CHIEF INFORMATION SECURITY OFFICER 1404 120415_Modified.txt
CityofLA/Job Bulletins/AIRPORT LABOR RELATIONS ADVOCATE 9210 020119.txt
CityofLA/Job Bulletins/AIRPORT POLICE CAPTAIN 3228 021618.txt
CityofLA/Job Bulletins/AIRPORT POLICE LIEUTENANT 3227 091616.txt
CityofLA/Job Bulletins/AIRPORT POLICE OFFICER 3225 110906 Rev 060115.txt
CityofLA/Job Bulletins/AIRPORT POLICE SPECIALIST 3236 063017 (2).txt
CityofLA/Job Bulletins/AIRPORTS MAINTENANCE SUPERINTENDENT 3331 021518.txt
CityofLA/Job Bulletins/AIRPORTS PUBLIC AND COMMUNITY RELATIONS DIRECTOR 1788 120817.txt
CityofLA/Job Bulletins/ANIMAL CARE ASSISTANT 4323 020119.txt
CityofLA/Job Bulle

**Let's rerun the function and observe the changes. Note we use .txt files in the JobBulletins_cleaned folder.**

Looking carefully at the printouts this time, we see that all of the issues above have been resolved.

In [24]:
# Rerun the code above to check if every job has NOTES: between ANNUAL SALARY and DUTIES
# If nothing gets printed out, that means we pass the test.
for file_name in cleaned_jobs:
    job_path     = cleaned_path + file_name    # define path to file_name
    cleaned_job  = open(job_path, 'rt').read() # read in job as a string
    try:
        txt  = cleaned_job[cleaned_job.index('ANNUAL SALARY'):cleaned_job.index('DUTIES')]
        if 'NOTES:' not in txt:
            print(job_path)
    except ValueError:                         # a section heading was not found; print the path in a banner
        ## define some useful variables
        border_line = '##############################################################################################'
        how_many    = int((len(border_line) - len(job_path))/2)
        print(border_line)
        ## do pretty printings
        print('#'*how_many + job_path + '#'*how_many)
        print(border_line)
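
The cells in this section all follow the same pattern: slice the text between two headings, test for a marker, and print a banner when a heading is missing. A minimal sketch of that pattern as reusable helpers follows. The names `missing_between` and `report`, the `jobs` mapping of path to file contents, and the banner width are assumptions made for illustration, not code from the notebook.

```python
def missing_between(text, marker, start, end):
    """Return True if `marker` is absent from the text between `start` and `end`.

    Raises ValueError (from str.index) when either boundary heading is missing.
    """
    return marker not in text[text.index(start):text.index(end)]

def report(jobs, marker, start, end, width=94):
    """Print results like the cells above; `jobs` maps path -> file contents.

    Returns (paths lacking the marker, paths lacking a boundary heading).
    """
    missing, broken = [], []
    for path, text in jobs.items():
        try:
            if missing_between(text, marker, start, end):
                missing.append(path)
                print(path)
        except ValueError:
            broken.append(path)
            print('#' * width)
            print(path.center(width, '#'))   # path centered inside a '#' banner
            print('#' * width)
    return missing, broken

# Made-up file contents for illustration
jobs = {
    'a.txt': 'ANNUAL SALARY\n$1\nNOTES:\nx\nDUTIES\ny',
    'b.txt': 'ANNUAL SALARY\n$1\nDUTIES\ny',
    'c.txt': 'ANNUALSALARY\n$1\nDUTIES\ny',   # misspelled heading
}
missing, broken = report(jobs, 'NOTES:', 'ANNUAL SALARY', 'DUTIES')
```

Returning the two lists in addition to printing makes it easy to count how many files still need manual edits after each cleaning pass.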

## Get OPEN_DATE (od)

Since SYSTEMS ANALYST 1596 102717.txt has the order Open Date - (Exam Open to All, including Current City Employees) - ANNUAL SALARY, we'll get the open date by extracting the text from Open Date up to (, i.e., the opening parenthesis. If a job posting doesn't have one, we'll manually add it to its .txt file.
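
The extraction just described can be sketched as follows. The function name `extract_open_date` and the sample text are hypothetical. The sketch returns the raw text between the two markers and does not parse it into a date, since the date format is not specified here.

```python
def extract_open_date(text):
    """Return the raw text between 'Open Date' and the next '('.

    Raises ValueError if either marker is missing, which signals that
    the .txt file needs a manual fix.
    """
    start = text.index('Open Date') + len('Open Date')
    stop = text.index('(', start)            # first '(' after the marker
    return text[start:stop].strip(' :\t\n')  # drop the colon and surrounding whitespace

# Made-up posting text in the layout described above
sample = ('EXAMPLE JOB\n'
          'Open Date:  01-02-19\n'
          '(Exam Open to All, including Current City Employees)\n'
          'ANNUAL SALARY\n')
print(extract_open_date(sample))
```

Because `str.index` is case-sensitive, a file that spells the marker as "Open date" raises `ValueError` instead of silently returning a wrong value.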

To test whether the opening parenthesis exists, we search for it in the text from Open Date to ANNUAL SALARY.

In [15]:
# Test whether the opening parenthesis is in the text from Open Date to ANNUAL SALARY
for file_name in raw_jobs:
    job_path = raw_path + file_name        # define path to file_name
    raw_job  = open(job_path, 'rt').read() # read in job as a string
    try:
        txt = raw_job[raw_job.index('Open Date'):raw_job.index('ANNUAL SALARY')]
        if '(' not in txt:
            print(job_path)
    except ValueError:                     # a section heading was not found; print the path in a banner
        ## define some useful variables
        border_line = '##############################################################################################'
        how_many    = int((len(border_line) - len(job_path))/2)
        print(border_line)
        ## do pretty printings
        print('#'*how_many + job_path + '#'*how_many)
        print(border_line)

CityofLA/Job Bulletins/311 DIRECTOR  9206 041814.txt
CityofLA/Job Bulletins/ADVANCE PRACTICE PROVIDER CORRECTIONAL CARE 2325 020808 REV 111214.txt
CityofLA/Job Bulletins/AQUARIST 2400 050214.txt
CityofLA/Job Bulletins/ART CENTER DIRECTOR 2478 053014.txt
CityofLA/Job Bulletins/ASPHALT PLANT OPERATOR 4143 102414.txt
CityofLA/Job Bulletins/AUDIO VISUAL TECHNICIAN 6147 062014.txt
CityofLA/Job Bulletins/AVIONICS SPECIALIST 3565 103114revised.txt
CityofLA/Job Bulletins/BOILERMAKER 3735 110714.txt
CityofLA/Job Bulletins/BOILERMAKER SUPERVISOR 3737 101714.txt
CityofLA/Job Bulletins/CHIEF FINANCIAL OFFICER 9230 041114.txt
CityofLA/Job Bulletins/CHIEF PARK RANGER 1968 120106 REV 121306.txt
CityofLA/Job Bulletins/CHIEF TAX COMPLIANCE OFFICER 1211 041814.txt
CityofLA/Job Bulletins/CHIEF TRANSPORTATION INVESTIGATOR 4275 103114.txt
CityofLA/Job Bulletins/COMMUNITY AFFAIRS ADVOCATE 2496 111414.txt
CityofLA/Job Bulletins/GEOGRAPHIC INFORMATION SYSTEMS SPECIALIST 7213 012414 revised.txt
CityofLA/Job Bu

A few things to notice here:
1. Two jobs raised errors:
    * CityofLA/Job Bulletins/PUBLIC INFORMATION DIRECTOR 1800 030317.txt. It has the phrase Open date (lower case d) instead of Open Date (upper case D).
    * CityofLA/Job Bulletins/SENIOR REAL ESTATE OFFICER 1961 0413018 (2).txt. It has the word ANNUALSALARY instead of ANNUAL SALARY.
2. The following jobs don't have the opening parenthesis in them:
    * CityofLA/Job Bulletins/311 DIRECTOR  9206 041814.txt
    * CityofLA/Job Bulletins/ADVANCE PRACTICE PROVIDER CORRECTIONAL CARE 2325 020808 REV 111214.txt
    * CityofLA/Job Bulletins/AQUARIST 2400 050214.txt
    * CityofLA/Job Bulletins/ART CENTER DIRECTOR 2478 053014.txt
    * CityofLA/Job Bulletins/ASPHALT PLANT OPERATOR 4143 102414.txt
    * CityofLA/Job Bulletins/AUDIO VISUAL TECHNICIAN 6147 062014.txt
    * CityofLA/Job Bulletins/AVIONICS SPECIALIST 3565 103114revised.txt
    * CityofLA/Job Bulletins/BOILERMAKER 3735 110714.txt
    * CityofLA/Job Bulletins/BOILERMAKER SUPERVISOR 3737 101714.txt
    * CityofLA/Job Bulletins/CHIEF FINANCIAL OFFICER 9230 041114.txt
    * CityofLA/Job Bulletins/CHIEF PARK RANGER 1968 120106 REV 121306.txt
    * CityofLA/Job Bulletins/CHIEF TAX COMPLIANCE OFFICER 1211 041814.txt
    * CityofLA/Job Bulletins/CHIEF TRANSPORTATION INVESTIGATOR 4275 103114.txt
    * CityofLA/Job Bulletins/COMMUNITY AFFAIRS ADVOCATE 2496 111414.txt
    * CityofLA/Job Bulletins/GEOGRAPHIC INFORMATION SYSTEMS SPECIALIST 7213 012414 revised.txt
    * CityofLA/Job Bulletins/INDUSTRIAL CHEMIST 7834 020714.txt
    * CityofLA/Job Bulletins/MARINE ENVIRONMENTAL MANAGER 9437 060614.txt
    * CityofLA/Job Bulletins/MARINE ENVIRONMENTAL SUPERVISOR 9433 071114 (1).txt
    * CityofLA/Job Bulletins/PAINTER SUPERVISOR 3426 120514.txt
    * CityofLA/Job Bulletins/PARKING MANAGER 9170 020714.txt
    * CityofLA/Job Bulletins/PRINCIPAL COMMUNICATIONS OPERATOR 1458 072514.txt
    * CityofLA/Job Bulletins/PRINCIPAL DEPUTY CONTROLLER 7260 032814.txt
    * CityofLA/Job Bulletins/PRINCIPAL WORKERS_ COMPENSATION ANALYST 1777 071814.txt
    * CityofLA/Job Bulletins/RETIREMENT PLAN MANAGER 9149 052314 (1).txt
    * CityofLA/Job Bulletins/ROOFER 3476 121214.txt
    * CityofLA/Job Bulletins/SENIOR CONSTRUCTION ENGINEER 7289 042514.txt
    * CityofLA/Job Bulletins/SENIOR HYDROGRAPHER 7264 030714.txt
    * CityofLA/Job Bulletins/SENIOR ROOFER 3477 101708 REV 110608.txt
    * CityofLA/Job Bulletins/SENIOR WINDOW CLEANER 3174 013114 Rev021314.txt
    * CityofLA/Job Bulletins/SHEET METAL SUPERVISOR 3777 061314.txt
    * CityofLA/Job Bulletins/SIGN PAINTER 3428 121214.txt
    * CityofLA/Job Bulletins/SUPERVISING OCCUPATIONAL HEALTH  2315 111414.txt
    * CityofLA/Job Bulletins/WATER MICROBIOLOGIST  7857 072514 rev073114.txt
    * CityofLA/Job Bulletins/WATERWORKS MECHANIC SUPERVISOR 3987 051614 (1).txt
    * CityofLA/Job Bulletins/WORKERS_ COMPENSATION CLAIMS ASSISTANT 1775 041114.txt
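
The two errors in item 1 come from exact-match lookups, so a more tolerant search is one way to flag such files without tripping on them. The sketch below is only an illustration: `HEADER_SPAN`, `open_date_block`, and the sample strings are made up for this example and are not part of the notebook's pipeline.

```python
import re

# Hypothetical pattern: ignore case and allow missing or extra whitespace inside the headers
HEADER_SPAN = re.compile(r'open\s*date(.*?)annual\s*salary', re.IGNORECASE | re.DOTALL)

def open_date_block(job):
    """Return the text from 'Open Date' through 'ANNUAL SALARY', or None if either is missing."""
    match = HEADER_SPAN.search(job)
    return match.group(0) if match else None

# Made-up snippets that mimic the three situations described above
samples = {
    'lower-case d':   'Open date: 03-03-17\n(Exam Open to All)\nANNUAL SALARY\n$100',
    'fused header':   'Open Date: 04-13-18\n(Exam Open to All)\nANNUALSALARY\n$100',
    'no parenthesis': 'Open Date: 04-18-14\nANNUAL SALARY\n$100',
}

for label, text in samples.items():
    block = open_date_block(text)
    has_paren = block is not None and '(' in block
    print(label, '-> has opening parenthesis:', has_paren)
```

A tolerant search like this only helps with detection; the files themselves would still need to be corrected so that the exact-match helper functions keep working.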

**Let's rerun the check and observe the changes. Note that we now use the .txt files in the JobBulletins_cleaned folder.**

Looking carefully at the printouts this time, we see that all of the issues above have been resolved.

In [16]:
# Rerun the code above to check that every job has an opening parenthesis
# between Open Date and ANNUAL SALARY.
# If nothing gets printed out, that means we pass the test.
for file_name in cleaned_jobs:
    job_path     = cleaned_path + file_name    # define path to file_name
    cleaned_job  = open(job_path, 'rt').read() # read in job as a string
    try:
        txt = cleaned_job[cleaned_job.index('Open Date'):cleaned_job.index('ANNUAL SALARY')]
        if '(' not in txt:
            print(job_path)
    except:                                    # do some pretty printings here to help our eyes from pain
        ## define some useful variables
        border_line = '##############################################################################################'
        how_many    = int((len(border_line) - len(job_path))/2)
        print(border_line)
        ## do pretty printings
        print('#'*how_many + job_path + '#'*how_many)
        print(border_line)

Now, while fixing the jobs listed above, we noticed that the phrase Revised: ##-##-## sometimes comes before the opening parenthesis. This can break our code: since the open date is taken as the last token before the parenthesis, the revision date would be returned instead. Let's fix this by listing all the jobs that have the word Revised between Open Date and ANNUAL SALARY. For each such job, we'll move the phrase to the **end** of the job.

In [17]:
for file_name in cleaned_jobs:
    job_path     = cleaned_path + file_name    # define path to file_name
    cleaned_job  = open(job_path, 'rt').read() # read in job as a string
    try:
        txt = cleaned_job[cleaned_job.index('Open Date'):cleaned_job.index('ANNUAL SALARY')].lower()
        txt = txt.replace(':', ' ')            # strip off colons
        if 'revised' in txt:
            print(job_path)
    except:                                    # do some pretty printings here to help our eyes from pain
        ## define some useful variables
        border_line = '##############################################################################################'
        how_many    = int((len(border_line) - len(job_path))/2)
        print(border_line)
        ## do pretty printings
        print('#'*how_many + job_path + '#'*how_many)
        print(border_line)
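
The move itself is done by hand in this notebook. As a rough sketch of how it could be automated, the snippet below relocates a Revised line found between Open Date and ANNUAL SALARY to the end of the text. It assumes the phrase sits on its own line in the form Revised: ##-##-##; `REVISED_LINE`, `move_revised_to_end`, and the sample string are hypothetical, and real files may need a looser pattern.

```python
import re

# Assumed layout: the phrase occupies its own line, e.g. 'Revised: 11-12-14'
REVISED_LINE = re.compile(
    r'^[ \t]*revised:?[ \t]*\d{1,2}-\d{1,2}-\d{2,4}[ \t]*\n?',
    re.IGNORECASE | re.MULTILINE,
)

def move_revised_to_end(job):
    """Move a 'Revised: ##-##-##' line found between the two headers to the end of job."""
    start = job.find('Open Date')
    end = job.find('ANNUAL SALARY')
    if start == -1 or end == -1:
        return job                                # headers missing: leave the text alone
    match = REVISED_LINE.search(job, start, end)  # search only between the headers
    if match is None:
        return job
    remainder = job[:match.start()] + job[match.end():]
    return remainder.rstrip('\n') + '\n' + match.group(0).strip() + '\n'

# Made-up job text for illustration
sample = 'Open Date: 02-08-08\nRevised: 11-12-14\n(Exam Open to All)\nANNUAL SALARY\n$100\n'
print(move_revised_to_end(sample))
```

Writing the result back to the JobBulletins_cleaned folder is left out on purpose, so that each change can still be reviewed before it is saved.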

Finally, we can extract Open Date by using the helper function defined below.

In [18]:
# This is a helper function
def open_date(job):
    '''Returns OPEN_DATE (od)'''
    # Example: Open Date: 10-27-17\n(Exam Open to All
    # The open date is located between 'Open Date' and the first '('
    temp = job[job.index('Open Date'):job.index('(')]
    # The date is the last whitespace-separated token
    od   = temp.split()[-1]
    
    return od
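
Because `open_date` returns the last token as-is, an unpadded date such as the `4-20-18` in the printout below, or a stray token that is not a date at all, passes through unchecked. A minimal sketch of a follow-up check, assuming every open date is meant to be in MM-DD-YY form (`normalize_open_date` is a hypothetical name, not one of the notebook's functions):

```python
from datetime import datetime

def normalize_open_date(od):
    """Return od as zero-padded MM-DD-YY, or None if it does not parse as a date."""
    try:
        return datetime.strptime(od, '%m-%d-%y').strftime('%m-%d-%y')
    except ValueError:
        return None

# A padded date, an unpadded date, and a token that is not a date
for raw in ['10-27-17', '4-20-18', 'Date:']:
    print(repr(raw), '->', normalize_open_date(raw))
```

A `None` result points to a file that still needs manual attention.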

In [19]:
for file_name in cleaned_jobs:
    job_path = cleaned_path + file_name        # define path to file_name
    cleaned_job  = open(job_path, 'rt').read() # read in job as a string
    try:
        print(open_date(job=cleaned_job))
    except:                                # do some pretty printings here to help our eyes from pain
        ## define some useful variables
        border_line = '##############################################################################################'
        how_many    = int((len(border_line) - len(job_path))/2)
        print(border_line)
        ## do pretty printings
        print('#'*how_many + job_path + '#'*how_many)
        print(border_line)

04-18-14
06-22-18
07-13-18
07-27-18
06-01-18
03-30-18
10-09-15
02-08-08
04-14-17
11-16-18
08-10-18
12-04-15
07-06-18
4-20-18
12-11-15
02-01-19
12-02-16
02-16-18
09-16-16
11-09-06
06-30-17
12-18-15
02-16-18
11-16-18
12-08-17
02-01-19
04-01-16
12-21-18
06-15-18
08-31-18
07-14-17
06-29-18
07-08-16
07-15-16
05-02-14
01-08-16
08-25-17
05-29-15
06-30-17
09-11-15
03-30-18
01-31-14
08-04-17
02-09-18
05-30-14
07-15-16
05-13-16
07-21-17
09-28-18
01-29-16
10-05-18
10-24-14
11-03-17
01-26-18
08-04-17
09-09-16
04-28-17
03-02-18
01-20-17
11-13-15
05-06-16
07-31-15
07-21-17
01-20-17
05-18-18
10-26-18
06-20-14
03-18-16
10-13-17
05-15-15
05-22-15
02-24-17
10-20-17
06-24-16
10-31-14
06-09-17
10-26-18
01-19-18
08-25-17
11-07-14
10-17-14
03-23-18
12-28-18
07-15-16
10-19-18
04-28-17
04-07-17
08-28-15
11-16-18
11-18-16
03-02-18
09-02-16
01-22-16
12-14-18
01-12-18
05-13-16
06-19-15
03-09-18
12-09-16
10-30-15
10-20-17
06-24-16
05-15-15
09-23-16
08-05-16
01-20-17
04-21-17
06-12-15
08-31-18
12-22-17
12-28-18
04

Note that two jobs threw an error during extraction:
* EMERGENCY MEDICAL SERVICES EDUCATOR  2322 110615 REV 112515.txt
* EQUIPMENT MECHANIC 3711 051818.txt

The reason is that these two jobs have parentheses in their titles, so the first ( appears before Open Date. The slice from Open Date to ( is then empty, and taking the last element of an empty list raises the IndexError. For the moment, we'll fix these jobs by replacing the parentheses in their titles with square brackets.
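
To make the failure concrete, the sketch below reproduces it on a made-up snippet and shows one possible alternative to editing the titles: starting the search for ( at the position of Open Date. `open_date_after_label` and the sample text are hypothetical and are not the approach taken in this notebook.

```python
def open_date_after_label(job):
    """Variant of open_date that ignores any '(' appearing before 'Open Date'."""
    start = job.index('Open Date')
    stop = job.index('(', start)        # search only from the label onward
    return job[start:stop].split()[-1]

# Made-up job text with parentheses in the title
sample = 'SAMPLE JOB (WITH PARENTHESES)\nOpen Date: 05-18-18\n(Exam Open to All)\nANNUAL SALARY'

# Original slicing: the first '(' comes before 'Open Date', so the slice is empty
temp = sample[sample.index('Open Date'):sample.index('(')]
print(repr(temp))                       # an empty string; temp.split()[-1] would raise IndexError

print(open_date_after_label(sample))
```

Replacing the parentheses in the titles, as done here, has the advantage that any other function relying on the first ( keeps working without changes.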

**Let's rerun the function and observe the changes. Note we use .txt files in the JobBulletins_cleaned folder.**

Looking carefully at the printouts this time, we see that all of the issues above have been resolved.

In [20]:
for file_name in cleaned_jobs:
    job_path = cleaned_path + file_name        # define path to file_name
    cleaned_job  = open(job_path, 'rt').read() # read in job as a string
    try:
        print(open_date(job=cleaned_job))
    except:                                # do some pretty printings here to help our eyes from pain
        ## define some useful variables
        border_line = '##############################################################################################'
        how_many    = int((len(border_line) - len(job_path))/2)
        print(border_line)
        ## do pretty printings
        print('#'*how_many + job_path + '#'*how_many)
        print(border_line)

04-18-14
06-22-18
07-13-18
07-27-18
06-01-18
03-30-18
10-09-15
02-08-08
04-14-17
11-16-18
08-10-18
12-04-15
07-06-18
4-20-18
12-11-15
02-01-19
12-02-16
02-16-18
09-16-16
11-09-06
06-30-17
12-18-15
02-16-18
11-16-18
12-08-17
02-01-19
04-01-16
12-21-18
06-15-18
08-31-18
07-14-17
06-29-18
07-08-16
07-15-16
05-02-14
01-08-16
08-25-17
05-29-15
06-30-17
09-11-15
03-30-18
01-31-14
08-04-17
02-09-18
05-30-14
07-15-16
05-13-16
07-21-17
09-28-18
01-29-16
10-05-18
10-24-14
11-03-17
01-26-18
08-04-17
09-09-16
04-28-17
03-02-18
01-20-17
11-13-15
05-06-16
07-31-15
07-21-17
01-20-17
05-18-18
10-26-18
06-20-14
03-18-16
10-13-17
05-15-15
05-22-15
02-24-17
10-20-17
06-24-16
10-31-14
06-09-17
10-26-18
01-19-18
08-25-17
11-07-14
10-17-14
03-23-18
12-28-18
07-15-16
10-19-18
04-28-17
04-07-17
08-28-15
11-16-18
11-18-16
03-02-18
09-02-16
01-22-16
12-14-18
01-12-18
05-13-16
06-19-15
03-09-18
12-09-16
10-30-15
10-20-17
06-24-16
05-15-15
09-23-16
08-05-16
01-20-17
04-21-17
06-12-15
08-31-18
12-22-17
12-28-18
04