## Public-Use Sample of Occupation and Industry Write-ins from ACS 2009   

Description : This public-use file shows industry and occupation write-in and code information from a sample of American Community Survey (ACS) 2009 records. The file provides data users with an idea of the types and variety of raw answers (write-ins) provided by respondents and the corresponding 4-digit Census classification codes given by the U.S. Census Bureau’s clerical coders. The Census classification codes are based on the SOC 2000 and the NAICS 2007. It should be noted that in determining the codes, clerical coders are provided with more information than is shown in this file, such as employer name, job duties, education, sex, and age.  

The file started from an initial sample of 10,000 records, but records with potentially personally identifiable information were removed, resulting in a final total of 9,860 records.  

**File Layout**  

    Obs # : Observation number in this data file  
    OCC code : Occupation code given, Census code based on SOC 2000  
    OCW1 : Occupation write-in (ACS 2009 question 45)  
    IND code : Industry code given, Census code based on NAICS 2007   
    INW3 : Industry write-in (ACS 2009 question 43)  

*Sorted in the following order: occupation code, occupation write-in, industry code, industry write-in*  
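
The layout and sort order above can be checked with a few lines of Python. This is only a sketch under stated assumptions: the file is assumed to be available as CSV with the five column names from the layout, and the rows in `SAMPLE_CSV` are made-up placeholders, not records from the actual file. The helper names `read_rows` and `is_sorted_as_documented` are hypothetical.

```python
import csv
import io

# Made-up placeholder rows for illustration only; NOT taken from the real file.
SAMPLE_CSV = """Obs #,OCC code,OCW1,IND code,INW3
1,0010,placeholder occupation a,0170,placeholder industry a
2,0010,placeholder occupation b,0170,placeholder industry b
3,0020,placeholder occupation a,0180,placeholder industry a
"""

# Sort keys in the documented order.
SORT_KEYS = ["OCC code", "OCW1", "IND code", "INW3"]


def read_rows(text):
    """Parse CSV text into a list of dicts, keeping the codes as strings."""
    return list(csv.DictReader(io.StringIO(text)))


def is_sorted_as_documented(rows):
    """Return True if the rows follow the four-key sort order (plain string comparison)."""
    keys = [tuple(row[key] for key in SORT_KEYS) for row in rows]
    return keys == sorted(keys)


rows = read_rows(SAMPLE_CSV)
print(len(rows), is_sorted_as_documented(rows))
```

Reading the codes as strings keeps any leading zeros in the 4-digit Census codes. The real file may use a different comparison (for example, case-insensitive write-ins), so a failed check is a prompt to inspect the data rather than proof of an error.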

**RELATED LINKS**  

[The ACS 2009 Questionnaire](http://www.census.gov/acs/www/Downloads/questionnaires/2009/Quest09.pdf)  

[General Information about the ACS](http://www.census.gov/acs/www/)  
[The Standard Occupational Classification (SOC) System](http://www.bls.gov/soc/)  
[The North American Industry Classification System (NAICS)](http://www.census.gov/eos/www/naics/)  
[The Census Bureau Industry and Occupation Statistics Topic Page](http://www.census.gov/hhes/www/ioindex/ioindex.html)  


 

## STANDARD OCCUPATIONAL CLASSIFICATION  

[BLS 2018 Manual](https://www.bls.gov/soc/2018/soc_2018_manual.pdf)

### Classification principles  

The SOC Classification Principles form the basis on which the SOC system is structured. 
1. The SOC covers all occupations in which work is performed for pay or profit, including
work performed in family-operated enterprises by family members who are not directly
compensated. It excludes occupations unique to volunteers. Each occupation is assigned to
only one occupational category at the most detailed level of the classification.  
2. Occupations are classified based on work performed and, in some cases, on the skills,
education and/or training needed to perform the work.  
3. Workers primarily engaged in planning and the directing of resources are classified in
management occupations in Major Group 11–0000. Duties of these workers may include
supervision.  
4. Supervisors of workers in Major Groups 13–0000 through 29–0000 usually have work
experience and perform activities similar to those of the workers they supervise, and
therefore are classified with the workers they supervise.  
5. Workers in Major Group 31–0000 Healthcare Support Occupations assist and are usually
supervised by workers in Major Group 29–0000 Healthcare Practitioners and Technical
Occupations, and therefore there are no first-line supervisor occupations in Major Group
31–0000.  
6. Workers in Major Groups 33–0000 through 53–0000 whose primary duty is supervising are
classified in the appropriate first-line supervisor category because their work activities are
distinct from those of the workers they supervise.   
7. Apprentices and trainees are classified with the occupations for which they are being
trained, while helpers and aides are classified separately because they are not in training for
the occupation they are helping.  
8. If an occupation is not included as a distinct detailed occupation in the structure, it is
classified in an appropriate “All Other” occupation. “All Other” occupations are placed in
the structure when it is determined that the detailed occupations comprising a broad
occupation group do not account for all of the workers in the group, even though such
workers may perform a distinct set of work activities. These occupations appear as the last
occupation in the group with a code ending in “9” and are identified in their title by
having “All Other” appear at the end.  
9. The U.S. Bureau of Labor Statistics and the U.S. Census Bureau are charged with
collecting and reporting data on total U.S. employment across the full spectrum of SOC
Major Groups. Thus, for a detailed occupation to be included in the SOC, either the Bureau
of Labor Statistics or the Census Bureau must be able to collect and report data on that
occupation.  
10. To maximize the comparability of data, time series continuity is maintained to the extent
possible.


### Coding guidelines  

The following SOC coding guidelines are intended to assist users in consistently assigning SOC 
codes and titles to survey responses and in other coding activities.  
1. A worker should be assigned to an SOC occupation code based on work performed.  
2. When workers in a single job could be coded in more than one occupation, they should be
coded in the occupation that requires the highest level of skill. If there is no measurable
difference in skill requirements, workers should be coded in the occupation in which they
spend the most time. Workers whose job is to teach at different levels (e.g., elementary,
middle, or secondary) should be coded in the occupation corresponding to the highest
educational level they teach.  
3. Data collection and reporting agencies should assign workers to the most detailed
occupation possible. Different agencies may use different levels of aggregation, depending
on their ability to collect data.  
4. Workers who perform activities not described in any distinct detailed occupation in the
SOC structure should be coded in an appropriate “All Other” occupation. These
occupations appear as the last occupation in a group with a code ending in “9” and are
identified by having the words “All Other” appear at the end of the title.  
5. Workers in Major Groups 33–0000 through 53–0000 who spend 80 percent or more of their
time performing supervisory activities are coded in the appropriate first-line supervisor
category in the SOC. In these same Major Groups (33–0000 through 53–0000), persons
with supervisory duties who spend less than 80 percent of their time supervising are coded
with the workers they supervise.  
6. Licensed and non-licensed workers performing the same work should be coded together in
the same detailed occupation, except where specified otherwise in the SOC definition.  
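
Guidelines 2 and 5 are the most mechanical of the rules above, so they can be sketched in code. The following is an illustration, not an official coding tool. The dictionary keys `skill_level` and `time_share`, the numeric skill scale, and both function names are assumptions made for this example; the SOC itself does not define a numeric skill score.

```python
SUPERVISOR_THRESHOLD = 0.80  # guideline 5: 80 percent or more of working time


def pick_occupation(candidates):
    """Guideline 2: choose the highest-skill occupation, breaking ties by time spent.

    `candidates` is a list of dicts with the hypothetical keys
    'code', 'skill_level' (higher means more skill), and 'time_share'.
    """
    best = max(candidates, key=lambda c: (c["skill_level"], c["time_share"]))
    return best["code"]


def code_as_supervisor(major_group, supervising_share):
    """Guideline 5: True if the worker belongs in a first-line supervisor category.

    `major_group` is the two-digit major group as an integer (for example 41).
    The rule only applies to Major Groups 33-0000 through 53-0000.
    """
    in_scope = 33 <= major_group <= 53
    return in_scope and supervising_share >= SUPERVISOR_THRESHOLD


# Placeholder labels stand in for real SOC codes.
jobs = [
    {"code": "occupation A", "skill_level": 2, "time_share": 0.3},
    {"code": "occupation B", "skill_level": 1, "time_share": 0.7},
]
print(pick_occupation(jobs))
print(code_as_supervisor(41, 0.85), code_as_supervisor(41, 0.50))
```

Outside Major Groups 33 through 53 the function returns `False` because the 80 percent rule does not apply there; how supervisors in other groups are handled is covered by Classification Principles 3 through 5.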

### SOC coding structure  

The occupations in the SOC are classified at four levels of aggregation to suit the needs of 
various data users: major group, minor group, broad occupation, and detailed occupation. Each 
lower level of detail identifies a more specific group of occupations. The 23 major groups, listed 
below, are divided into 98 minor groups, 459 broad occupations, and 867 detailed occupations.  

2018 SOC major groups  

| Code | Title |
| --- | --- |
| 11-0000 | Management Occupations |
| 13-0000 | Business and Financial Operations Occupations |
| 15-0000 | Computer and Mathematical Occupations |
| 17-0000 | Architecture and Engineering Occupations |
| 19-0000 | Life, Physical, and Social Science Occupations |
| 21-0000 | Community and Social Service Occupations |
| 23-0000 | Legal Occupations |
| 25-0000 | Educational Instruction and Library Occupations |
| 27-0000 | Arts, Design, Entertainment, Sports, and Media Occupations |
| 29-0000 | Healthcare Practitioners and Technical Occupations |
| 31-0000 | Healthcare Support Occupations |
| 33-0000 | Protective Service Occupations |
| 35-0000 | Food Preparation and Serving Related Occupations |
| 37-0000 | Building and Grounds Cleaning and Maintenance Occupations |
| 39-0000 | Personal Care and Service Occupations |
| 41-0000 | Sales and Related Occupations |
| 43-0000 | Office and Administrative Support Occupations |
| 45-0000 | Farming, Fishing, and Forestry Occupations |
| 47-0000 | Construction and Extraction Occupations |
| 49-0000 | Installation, Maintenance, and Repair Occupations |
| 51-0000 | Production Occupations |
| 53-0000 | Transportation and Material Moving Occupations |
| 55-0000 | Military Specific Occupations |

Some users may require aggregations other than the SOC system built on these major groups. 
Further details on alternate occupational aggregations and approved modifications to the SOC 
structure are provided in the following section, Approved modifications to the structure.  

Major groups are broken into minor groups, which, in turn, are divided into broad occupations. 
Broad occupations are then divided into one or more detailed occupations, as follows:  
    29-0000 Healthcare Practitioners and Technical Occupations  
    29-1000 Health Diagnosing or Treating Practitioners  
    29-1020 Dentists  
    29-1022 Oral and Maxillofacial Surgeons  

* Major group codes end with 0000 (e.g., 29-0000 Healthcare Practitioners and Technical 
Occupations).   
* Minor groups generally end with 000 (e.g., 29-1000 Health Diagnosing or Treating 
Practitioners). The exceptions are minor groups 15-1200 Computer Occupations, 
31-1100 Home Health and Personal Care Aides; and Nursing Assistants, Orderlies, and 
Psychiatric Aides, and 51-5100 Printing Workers, which end with 00.  
* Broad occupations end with 0 (e.g., 29-1020 Dentists).   
* Detailed occupations end with a number other than 0 (e.g., 29-1022 Oral and 
Maxillofacial Surgeons).  
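
The ending rules in the list above can be expressed as a small classifier. This is a minimal sketch rather than anything from the SOC manual: the function name `soc_level` is illustrative, and the only exceptions it knows about are the three minor groups named in the text.

```python
import re

# Minor groups the text lists as ending in 00 rather than 000.
MINOR_GROUP_EXCEPTIONS = {"15-1200", "31-1100", "51-5100"}

# Two digits, a hyphen, then four digits (for example 29-1022).
SOC_PATTERN = re.compile(r"\d{2}-\d{4}")


def soc_level(code):
    """Return the aggregation level implied by the ending of a SOC code."""
    if SOC_PATTERN.fullmatch(code) is None:
        raise ValueError(f"not a six-digit SOC code: {code!r}")
    if code.endswith("0000"):
        return "major group"
    if code.endswith("000") or code in MINOR_GROUP_EXCEPTIONS:
        return "minor group"
    if code.endswith("0"):
        return "broad occupation"
    return "detailed occupation"


for example in ["29-0000", "29-1000", "15-1200", "29-1020", "29-1022"]:
    print(example, soc_level(example))
```

The exception set has to be checked before the generic "ends with 0" rule; otherwise 15-1200 and the other two exceptions would be reported as broad occupations.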

Each item in the SOC is designated by a six-digit code. The hyphen between the second and third 
digit is used only for clarity. (See figure 1).  

![Figure 1](MarkdownFingures/BLS_2018_Figure1.png)

As shown in figure 2, “All Other” occupations (and “Other” or “Miscellaneous” occupations), 
whether at the detailed or broad occupation or minor group level, contain a “9” at the level of the 
“All Other” occupation. Minor groups that are major group “All Other” occupations end in 9000 
(e.g., 33-9000, Other Protective Service Workers). Broad occupations that are minor group “All 
Other” occupations end in 90 (e.g., 33-9090, Miscellaneous Protective Service Workers). 
Detailed “All Other” occupations end in 9 (e.g., 33-9099, Protective Service Workers, All 
Other).  

![Figure 2](MarkdownFingures/BLS_2018_Figure2.png)
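
The “9” convention described above lends itself to a simple pattern check. The sketch below only inspects the digits; it does not look at titles, which remain the authoritative way to identify an “All Other” occupation. The function name `all_other_level` is an assumption made for this example.

```python
def all_other_level(code):
    """Return the level at which a code matches the "All Other" 9 pattern, or None."""
    digits = code.replace("-", "")
    if len(digits) != 6 or not digits.isdigit():
        raise ValueError(f"not a six-digit SOC code: {code!r}")
    if digits.endswith("9000"):
        return "minor group"          # e.g., 33-9000
    if digits.endswith("90"):
        return "broad occupation"     # e.g., 33-9090
    if digits.endswith("9"):
        return "detailed occupation"  # e.g., 33-9099
    return None


for example in ["33-9000", "33-9090", "33-9099", "29-1022"]:
    print(example, all_other_level(example))
```

The checks run from the most aggregated level down, so a code such as 33-9000 is reported once, at the minor group level.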

If there are more than nine broad occupations in a minor group (e.g., 51-9000 Other Production 
Occupations); or more than eight, if there is no “All Other” occupation (e.g., 47-2000 
Construction Trades Workers), then the code xx-x090 is skipped (reserved for “All Other” 
occupations), the code xx-x000 is skipped (reserved for minor groups), and the numbering 
system will continue with code xx-x110. The “All Other” broad occupation is then code xx-x190 
or xx-x290 (e.g., 51-9190, Miscellaneous Production Workers).   

The structure is comprehensive, and encompasses all occupations in the U.S. economy. If a 
specific occupation is not listed, it is included in an “All Other” category with similar 
occupations.   

Detailed occupations are identified and defined so that each occupation includes workers who 
perform similar job tasks as described in Classification Principle 2. Definitions begin with the 
duties that all workers in the occupation perform. Some definitions include a sentence at the end 
describing tasks workers in an occupation may, but do not necessarily have to perform, in order 
to be included in the occupation. Where the definitions include tasks also performed by workers 
in another occupation, cross-references to that occupation are provided in the definition.   

Figure 3 identifies the eight elements that appear in detailed SOC occupations. All six-digit 
detailed occupations have a (1) SOC code, (2) title, and (3) definition. All workers classified in 
an occupation are required to perform the duties described in (4) the first sentence(s) of each 
definition that do not start with “May.” Some definitions also have a (5) “May” statement, a (6) 
“Includes” statement, and/or a (7) “Excludes” statement. Almost all occupations have one or 
more (8) “Illustrative Examples.” Illustrative examples are job titles classified in only that 
occupation, and were selected from the Direct Match Title File.   

![Figure 3](MarkdownFingures/BLS_2018_Figure3.png)  

In the figure 3 example, the “May” statement describes tasks that workers may, but are not 
required to, perform in order to be classified with Survey Researchers. The “Includes” statement 
identifies particular workers who should be classified with Survey Researchers. The “Excludes” 
statement indicates other detailed occupations that may be similar to Survey Researchers and 
clarifies that workers who fall into those occupations should be excluded from Survey Researchers.  

### Approved modifications to the structure   
Agencies may use the SOC or parts of the SOC at varying levels of the system. For example, 
data may be collected at the broad occupation level in some areas and at the detailed level in 
others.  

### Occupations below the detailed level  

The coding system is designed to allow SOC users desiring a delineation of occupations below 
the detailed occupation level to use a decimal point and additional digit(s) after the sixth digit. 
For example, Secondary School Teachers, Except Special and Career/Technical Education 
(25-2031) is a detailed occupation. Agencies wishing to collect more particular information on 
teachers by subject matter might use 25-2031.01 for secondary school science teachers or 
25-2031.02 for secondary school mathematics teachers. Additional levels of detail also may be used 
to distinguish workers who have different training or years of experience.   

OMB recommends that SOC users needing extra detail should employ the structure of the 
Department of Labor’s Employment and Training Administration’s Occupational Information 
Network (O*NET). For more information, see https://online.onetcenter.org.   
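
The decimal extension described in this section can be separated from the six-digit detailed code with a short parser. This is a sketch under the assumption that extended codes follow the `25-2031.01` shape used in the example above; the function name `split_extended` is hypothetical.

```python
import re

# Six-digit SOC code, optionally followed by a decimal point and more digits.
EXTENDED_PATTERN = re.compile(r"(\d{2}-\d{4})(?:\.(\d+))?")


def split_extended(code):
    """Split an extended code into (detailed SOC code, extension or None)."""
    match = EXTENDED_PATTERN.fullmatch(code)
    if match is None:
        raise ValueError(f"unrecognized SOC code: {code!r}")
    return match.group(1), match.group(2)


print(split_extended("25-2031.01"))
print(split_extended("25-2031"))
```

Keeping the extension as a string preserves leading zeros, so `01` and `1` are not confused when an agency defines its own sub-codes.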
 

### Higher levels of aggregation  

Some users may wish to present occupational data at higher levels of aggregation than the SOC 
major groups. To meet this need and to maintain consistency and comparability across datasets, 
BLS recommends that either the intermediate or the high-level aggregations presented in tables 5 
and 6 should be used for data tabulation purposes.   

Table 5. Intermediate aggregation to 13 groups, 2018 SOC  

| Intermediate aggregation | Major groups included | Intermediate aggregation title |
| --- | --- | --- |
| 1 | 11–13 | Management, Business, and Financial Occupations |
| 2 | 15–19 | Computer, Engineering, and Science Occupations |
| 3 | 21–27 | Education, Legal, Community Service, Arts, and Media Occupations |
| 4 | 29 | Healthcare Practitioners and Technical Occupations |
| 5 | 31–39 | Service Occupations |
| 6 | 41 | Sales and Related Occupations |
| 7 | 43 | Office and Administrative Support Occupations |
| 8 | 45 | Farming, Fishing, and Forestry Occupations |
| 9 | 47 | Construction and Extraction Occupations |
| 10 | 49 | Installation, Maintenance, and Repair Occupations |
| 11 | 51 | Production Occupations |
| 12 | 53 | Transportation and Material Moving Occupations |
| 13 | 55 | Military Specific Occupations |



Table 6. High-level aggregation to 6 groups, 2018 SOC  

| High-level aggregation | Major groups included | High-level aggregation title |
| --- | --- | --- |
| 1 | 11–29 | Management, Business, Science, and Arts Occupations |
| 2 | 31–39 | Service Occupations |
| 3 | 41–43 | Sales and Office Occupations |
| 4 | 45–49 | Natural Resources, Construction, and Maintenance Occupations |
| 5 | 51–53 | Production, Transportation, and Material Moving Occupations |
| 6 | 55 | Military Specific Occupations |
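
Tables 5 and 6 translate directly into lookup tables keyed on the two-digit major group. The sketch below is one possible encoding; the names `INTERMEDIATE`, `HIGH_LEVEL`, and `aggregate` are illustrative choices, while the ranges and titles are copied from the two tables.

```python
# Each entry: (first major group, last major group, aggregate number, aggregate title)
INTERMEDIATE = [
    (11, 13, 1, "Management, Business, and Financial Occupations"),
    (15, 19, 2, "Computer, Engineering, and Science Occupations"),
    (21, 27, 3, "Education, Legal, Community Service, Arts, and Media Occupations"),
    (29, 29, 4, "Healthcare Practitioners and Technical Occupations"),
    (31, 39, 5, "Service Occupations"),
    (41, 41, 6, "Sales and Related Occupations"),
    (43, 43, 7, "Office and Administrative Support Occupations"),
    (45, 45, 8, "Farming, Fishing, and Forestry Occupations"),
    (47, 47, 9, "Construction and Extraction Occupations"),
    (49, 49, 10, "Installation, Maintenance, and Repair Occupations"),
    (51, 51, 11, "Production Occupations"),
    (53, 53, 12, "Transportation and Material Moving Occupations"),
    (55, 55, 13, "Military Specific Occupations"),
]

HIGH_LEVEL = [
    (11, 29, 1, "Management, Business, Science, and Arts Occupations"),
    (31, 39, 2, "Service Occupations"),
    (41, 43, 3, "Sales and Office Occupations"),
    (45, 49, 4, "Natural Resources, Construction, and Maintenance Occupations"),
    (51, 53, 5, "Production, Transportation, and Material Moving Occupations"),
    (55, 55, 6, "Military Specific Occupations"),
]

# The 23 major groups use the odd numbers 11 through 55.
VALID_MAJOR_GROUPS = set(range(11, 56, 2))


def aggregate(soc_code, table):
    """Return (aggregate number, title) for a SOC code such as '29-1022'."""
    major = int(soc_code[:2])
    if major not in VALID_MAJOR_GROUPS:
        raise ValueError(f"unknown major group in {soc_code!r}")
    for first, last, number, title in table:
        if first <= major <= last:
            return number, title
    raise ValueError(f"no aggregate covers {soc_code!r}")


print(aggregate("29-1022", INTERMEDIATE))
print(aggregate("29-1022", HIGH_LEVEL))
```

Because both tables are keyed on the same major groups, every intermediate group falls inside exactly one high-level group, which is what keeps tabulations at the two levels comparable.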

### Alternate aggregations   

Data collection issues or confidentiality concerns may prevent agencies from reporting all the 
detail indicated in the SOC. For example, an agency might report the detail of at least one 
occupational category at a particular level of the SOC structure but must aggregate the other 
occupations at that level. In such cases, the agency may adjust the occupational categories so 
long as these adjustments permit aggregation to the next higher SOC level. In such a situation, 
agencies must distinguish such groups from the official SOC aggregation. If agencies choose this 
option, they must obtain approval from the SOCPC for their proposed aggregation scheme.   
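
One way to read the requirement that adjustments "permit aggregation to the next higher SOC level" is that every custom category must sit entirely inside a single group at that higher level. The sketch below checks this for the simplest case, custom groupings of detailed occupations that should roll up to broad occupations. That interpretation, the function names, and the group labels are assumptions for illustration; the codes other than 29-1022 are used only as examples of the numbering pattern.

```python
def broad_parent(detailed_code):
    """Return the broad occupation code for a detailed code (last digit set to 0)."""
    return detailed_code[:-1] + "0"


def aggregates_cleanly(custom_groups):
    """True if the detailed codes in every custom group share one broad occupation.

    `custom_groups` maps a hypothetical group label to a list of detailed SOC codes.
    """
    return all(
        len({broad_parent(code) for code in codes}) == 1
        for codes in custom_groups.values()
    )


nested = {"combined group": ["29-1021", "29-1022", "29-1023"]}
straddling = {"combined group": ["29-1022", "29-1031"]}
print(aggregates_cleanly(nested), aggregates_cleanly(straddling))
```

A check like this can support a proposal but does not replace it: under the text above, any alternate aggregation still needs approval from the SOCPC.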


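    # Notebook scratch cell: confirm the working directory for this project.
    import os

    # import pandas as pd
    os.getcwd()
    # Output: 'h:\\home\\GIT\\NLP-Community\\Project - Standard Occ Classification'

    # Unused pandas scratch, kept commented out:
    # dat = pd.DataFrame({"col1": ["Apple", "Banana"]})
    # dat.sort_values("col1")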