## Introduction ##

What did the top wage earners (by income percentile) in select states study in college?  Are they currently working in their field of study?  Furthermore, were they born in this country, are they married, and have they moved from state to state in recent years?  What is the distribution of income in their fields, and how does earning potential differ between men and women, by race, and so on?

This is one of Kaggle's larger datasets.  In addition, these records are weighted.  For example, one record may represent 30 to 70 people.  Since I want to look at percentiles, it's going to be necessary to expand these records, which might limit how far a single kernel can go in doing this type of analysis.

Again, from my understanding of the documentation, **PWGTP** must be taken into account.
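
To make the weighting concrete, here is a minimal sketch on made-up numbers (the `toy` frame, `weighted_percentile`, and `expanded_percentile` are hypothetical names for illustration, not part of the dataset or of this notebook's code). It compares two routes to a percentile: literally expanding each record **PWGTP** times, and walking the cumulative weights without expanding. The second route is an alternative I am assuming would be useful when memory is tight; both use the same definition here, the smallest income such that at least q percent of the represented people earn that much or less.

```python
import numpy as np
import pandas as pd

# Made-up records: each row stands for PWGTP people (illustrative values only).
toy = pd.DataFrame({"INCOME": [20000, 50000, 90000, 250000],
                    "PWGTP": [70, 30, 50, 10]})

def weighted_percentile(values, weights, q):
    """Smallest value whose cumulative weight reaches q percent of the total weight."""
    values = np.asarray(values, dtype=float)
    weights = np.asarray(weights, dtype=float)
    order = np.argsort(values, kind="stable")
    values, weights = values[order], weights[order]
    cum = np.cumsum(weights)                  # people represented up to each value
    cutoff = q / 100.0 * cum[-1]              # number of people at the q-th percentile
    idx = np.searchsorted(cum, cutoff, side="left")
    return values[min(idx, len(values) - 1)]

# Route 1: expand every record PWGTP times (simple, but memory hungry on real data).
expanded = np.sort(np.repeat(toy["INCOME"].to_numpy(), toy["PWGTP"].to_numpy()))

def expanded_percentile(q):
    """Same percentile definition, read directly off the expanded, sorted array."""
    k = int(np.ceil(q / 100.0 * len(expanded))) - 1
    return expanded[max(k, 0)]

# Route 2: no expansion, only a sort and a cumulative sum over the weights.
for q in (50, 90, 95):
    print(q, expanded_percentile(q), weighted_percentile(toy["INCOME"], toy["PWGTP"], q))
```

On the toy data the two routes agree, which suggests the cumulative-weight version could stand in for the expansion if a kernel runs short of memory.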


By default the code is hidden in this notebook, because it can get kind of messy.  But, if you click
the button below, it should expose the code.


<button onclick="code_toggle()" class="button">Show Code</button>

In [None]:
"""
Note:  To see the full code in this notebook, you may have to
       fork and edit, since unhiding the DIV tags expands them only
       to the default length of the page in the browser.
"""

from IPython.display import display
from IPython.display import HTML
import IPython.core.display as di 

htmlCode="""
<style>
    .button {
        background-color: #008CBA;
        border: none;
        color: white;
        padding: 8px 22px;
        text-align: center;
        text-decoration: none;
        display: inline-block;
        font-size: 16px;
        margin: 4px 2px;
        cursor: pointer;
    }
</style>
<script>
    code_show=true;
    function code_toggle() {
        if (code_show){
            $('div.input').hide();
        } else {
            $('div.input').show();
        }
        code_show = !code_show
    }
    $( document ).ready(code_toggle);
</script>

"""
HTML(htmlCode)

In [None]:
from IPython.display import HTML

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)

import warnings
warnings.filterwarnings("ignore")
import seaborn as sns
import matplotlib.pyplot as plt
import matplotlib.dates as mdates

sns.set(style="white", color_codes=True)

from subprocess import check_output
#print(check_output(["ls", "../input"]).decode("utf8"))

# PUMA   Public use microdata area code
# ST     State code
# SEX    1 Male, 2 Female
# PWGTP  Person weight (weighting of record. Very important)
# AGEP   Age
# MSP    Married, spouse present/spouse absent
# FOD1P  Recoded field of degree - first entry (College)
# PERNP  Total person's earnings
# ADJINC Use ADJINC to adjust PERNP to constant dollars.
# SCHL   Educational attainment
# SCH    School enrollment
# ESR    Employment status recode
# WKHP   Usual hours worked per week past 12 months
# RAC1P  Recoded detailed race code 1) White  2) Black ... see below
# OCCP   Occupation code
             

# Careful -- you don't want to select every column in the Dataset. That's too much.
fields = ["PUMA", "ST", "SEX","PWGTP","AGEP", "MSP","FOD1P","PERNP","ADJINC","SCHL",
          "SCH","ESR","WKHP","RAC1P","OCCP"]

# Two files...each file is about 1.5 GB in size.
data_a=pd.read_csv("../input/ss15pusa.csv",skipinitialspace=True, usecols=fields)
data_b=pd.read_csv("../input/ss15pusb.csv",skipinitialspace=True, usecols=fields)

In [None]:
# Concatenate -- we'll work with everything.
# ignore_index=True gives the combined frame a fresh, unique row index.
d=pd.concat([data_a,data_b], ignore_index=True)

# Get the correct income in 2015 dollars.
# ADJINC is stored with six implied decimal places, hence the division by 1,000,000.
d['INCOME']=d['ADJINC']*d['PERNP']/1000000

del d['ADJINC']


d.head()

## Dictionary for State and College Majors ##

See the Data Dictionary for the Dataset for the full code lists.  If you unhide the code, you'll see
the state codes (ST), first-listed college majors (FOD1P), and occupation codes (OCCP) put into dictionaries.
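
As a rough sketch of how such dictionaries can be used, the snippet below turns numeric codes into readable labels with `Series.map` and keeps only the states of interest with `Series.isin`. The `_demo` dictionaries are small excerpts and the `toy` frame is made up for illustration; the `STATE` and `MAJOR` column names are my own assumptions, not columns in the dataset.

```python
import pandas as pd

# Small excerpts of the lookup dictionaries (the full versions are in the code cell below).
states_demo = {6: "California/CA", 36: "New York/NY"}
fod1p_demo = {2102: "COMPUTER SCIENCE", 6207: "FINANCE"}

# Made-up records; code 48 is deliberately absent from states_demo.
toy = pd.DataFrame({"ST": [6, 36, 6, 48],
                    "FOD1P": [2102, 6207, 6207, 2102]})

# Codes missing from a dictionary come back as NaN rather than raising an error.
toy["STATE"] = toy["ST"].map(states_demo)
toy["MAJOR"] = toy["FOD1P"].map(fod1p_demo)

# Keep only the rows whose state code appears in the dictionary.
selected = toy[toy["ST"].isin(list(states_demo))]
print(selected)
```

Because the state dictionary lists only the selected states, the same dictionary can serve both as a label lookup and as the filter for which records to keep.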

In [None]:


#ST codes
states={6:"California/CA",
        8:"Colorado/CO",
        10:"Delaware/DE",
        24:"Maryland/MD",
        25:"Massachusetts/MA",
        34:"New Jersey/NJ",        
        36:"New York/NY",
        42:"Pennsylvania/PA",
        49:"Utah/UT",
        53:"Washington/WA",
        54:"West Virginia/WV"}
 
 
# College Majors
fod1p={0: 'PRODUCTION', 4101: 'PHYSICAL FITNESS PARKS RECREATION AND LEISURE', 
       5200: 'PSYCHOLOGY', 6402: 'HISTORY', 3600: 'BIOLOGY', 3601: 'BIOCHEMICAL SCIENCES', 
       3602: 'BOTANY', 3603: 'MOLECULAR BIOLOGY', 3604: 'ECOLOGY', 3605: 'GENETICS', 
       3606: 'MICROBIOLOGY', 3607: 'PHARMACOLOGY', 3608: 'PHYSIOLOGY', 3609: 'ZOOLOGY', 
       3611: 'NEUROSCIENCE', 2599: 'MISCELLANEOUS ENGINEERING TECHNOLOGIES', 
       2601: 'LINGUISTICS AND COMPARATIVE LANGUAGE AND LITERATURE', 
       2602: 'FRENCH GERMAN LATIN AND OTHER COMMON FOREIGN LANGUAGE', 
       2603: 'OTHER FOREIGN LANGUAGES', 2100: 'COMPUTER AND INFORMATION SYSTEMS', 
       2101: 'COMPUTER PROGRAMMING AND DATA PROCESSING', 2102: 'COMPUTER SCIENCE', 
       6199: 'MISCELLANEOUS HEALTH MEDICAL PROFESSIONS', 6200: 'GENERAL BUSINESS', 
       2105: 'INFORMATION SCIENCES', 2106: 'COMPUTER ADMINISTRATION MANAGEMENT AND SECURITY', 
       2107: 'COMPUTER NETWORKING AND TELECOMMUNICATIONS', 
       6204: 'OPERATIONS LOGISTICS AND E-COMMERCE', 6205: 'BUSINESS ECONOMICS', 
       6206: 'MARKETING AND MARKETING RESEARCH', 6207: 'FINANCE', 
       6209: 'HUMAN RESOURCES AND PERSONNEL MANAGEMENT', 6210: 'INTERNATIONAL BUSINESS', 
       6211: 'HOSPITALITY MANAGEMENT', 6212: 'MANAGEMENT INFORMATION SYSTEMS AND STATISTICS', 
       5701: 'ELECTRICAL', 1100: 'GENERAL AGRICULTURE', 
       1101: 'AGRICULTURE PRODUCTION AND MANAGEMENT', 1102: 'AGRICULTURAL ECONOMICS', 
       1103: 'ANIMAL SCIENCES', 1104: 'FOOD SCIENCE', 1105: 'PLANT SCIENCE AND AGRONOMY', 
       1106: 'SOIL SCIENCE', 5203: 'COUNSELING PSYCHOLOGY', 
       5205: 'INDUSTRIAL AND ORGANIZATIONAL PSYCHOLOGY', 5206: 'SOCIAL PSYCHOLOGY', 
       3699: 'MISCELLANEOUS BIOLOGY', 3700: 'MATHEMATICS', 3701: 'APPLIED MATHEMATICS', 
       3702: 'STATISTICS AND DECISION SCIENCE', 3201: 'COURT REPORTING', 
       3202: 'PRE-LAW AND LEGAL STUDIES', 2413: 'MATERIALS ENGINEERING AND MATERIALS SCIENCE', 
       2414: 'MECHANICAL ENGINEERING', 2201: 'COSMETOLOGY SERVICES AND CULINARY ARTS', 
       2415: 'METALLURGICAL ENGINEERING', 1904: 'ADVERTISING AND PUBLIC RELATIONS', 
       6001: 'DRAMA AND THEATER ARTS', 2418: 'NUCLEAR ENGINEERING', 
       1199: 'MISCELLANEOUS AGRICULTURE', 5299: 'MISCELLANEOUS PSYCHOLOGY', 
       5301: 'CRIMINAL JUSTICE AND FIRE PROTECTION', 4801: 'PHILOSOPHY AND RELIGIOUS STUDIES', 
       3801: 'MILITARY TECHNOLOGIES', 3301: 'ENGLISH LANGUAGE AND LITERATURE', 
       3302: 'COMPOSITION AND RHETORIC', 2300: 'GENERAL EDUCATION', 
       2301: 'EDUCATIONAL ADMINISTRATION AND SUPERVISION', 2303: 'SCHOOL STUDENT COUNSELING', 
       2304: 'ELEMENTARY EDUCATION', 2305: 'MATHEMATICS TEACHER EDUCATION', 
       2306: 'PHYSICAL AND HEALTH EDUCATION TEACHING', 2307: 'EARLY CHILDHOOD EDUCATION', 
       2308: 'SCIENCE AND COMPUTER TEACHER EDUCATION', 2309: 'SECONDARY TEACHER EDUCATION', 
       2310: 'SPECIAL NEEDS EDUCATION', 2311: 'SOCIAL SCIENCE OR HISTORY TEACHER EDUCATION', 
       2312: 'TEACHER EDUCATION: MULTIPLE LEVELS', 2313: 'LANGUAGE AND DRAMA EDUCATION', 
       2314: 'ART AND MUSIC EDUCATION', 5901: 'TRANSPORTATION SCIENCES AND TECHNOLOGIES', 
       1301: 'ENVIRONMENTAL SCIENCE', 1302: 'FORESTRY', 1303: 'NATURAL RESOURCES MANAGEMENT', 
       5401: 'PUBLIC ADMINISTRATION', 5402: 'PUBLIC POLICY', 
       5403: 'HUMAN SERVICES AND COMMUNITY ORGANIZATION', 5404: 'SOCIAL WORK', 
       4901: 'THEOLOGY AND RELIGIOUS VOCATIONS', 6403: 'UNITED STATES HISTORY', 
       1501: 'AREA ETHNIC AND CIVILIZATION STUDIES', 2416: 'MINING AND MINERAL ENGINEERING', 
       3401: 'LIBERAL ARTS', 3402: 'HUMANITIES', 2901: 'FAMILY AND CONSUMER SCIENCES', 
       6201: 'ACCOUNTING', 6202: 'ACTUARIAL SCIENCE', 2399: 'MISCELLANEOUS EDUCATION', 
       2400: 'GENERAL ENGINEERING', 2401: 'AEROSPACE ENGINEERING', 2402: 'BIOLOGICAL ENGINEERING', 
       2403: 'ARCHITECTURAL ENGINEERING', 2404: 'BIOMEDICAL ENGINEERING', 
       2405: 'CHEMICAL ENGINEERING', 2406: 'CIVIL ENGINEERING', 2407: 'COMPUTER ENGINEERING', 
       2408: 'ELECTRICAL ENGINEERING', 2409: 'ENGINEERING MECHANICS PHYSICS AND SCIENCE', 
       2410: 'ENVIRONMENTAL ENGINEERING', 2411: 'GEOLOGICAL AND GEOPHYSICAL ENGINEERING', 
       2412: 'INDUSTRIAL AND MANUFACTURING ENGINEERING', 1901: 'COMMUNICATIONS', 
       1902: 'JOURNALISM', 1903: 'MASS MEDIA', 6000: 'FINE ARTS', 
       2417: 'NAVAL ARCHITECTURE AND MARINE ENGINEERING', 6002: 'MUSIC', 
       6003: 'VISUAL AND PERFORMING ARTS', 6004: 'COMMERCIAL ART AND GRAPHIC DESIGN', 
       6005: 'FILM VIDEO AND PHOTOGRAPHIC ARTS', 6006: 'ART HISTORY AND CRITICISM', 
       6007: 'STUDIO ARTS', 1401: 'ARCHITECTURE', 5500: 'GENERAL SOCIAL SCIENCES', 
       5501: 'ECONOMICS', 5502: 'ANTHROPOLOGY AND ARCHEOLOGY', 5503: 'CRIMINOLOGY', 
       5504: 'GEOGRAPHY', 5505: 'INTERNATIONAL RELATIONS', 
       5506: 'POLITICAL SCIENCE AND GOVERNMENT', 5507: 'SOCIOLOGY', 
       5000: 'PHYSICAL SCIENCES', 5001: 'ASTRONOMY AND ASTROPHYSICS', 
       5002: 'ATMOSPHERIC SCIENCES AND METEOROLOGY', 5003: 'CHEMISTRY', 
       5004: 'GEOLOGY AND EARTH SCIENCE', 5005: 'GEOSCIENCES', 5006: 'OCEANOGRAPHY', 
       5007: 'PHYSICS', 5008: 'MATERIALS SCIENCE', 4000: 'MULTI/INTERDISCIPLINARY STUDIES', 
       4001: 'INTERCULTURAL AND INTERNATIONAL STUDIES', 4002: 'NUTRITION SCIENCES', 
       6299: 'MISCELLANEOUS BUSINESS & MEDICAL ADMINISTRATION', 
       4005: 'MATHEMATICS AND COMPUTER SCIENCE', 4006: 'COGNITIVE SCIENCE AND BIOPSYCHOLOGY', 
       4007: 'INTERDISCIPLINARY SOCIAL SCIENCES', 3501: 'LIBRARY SCIENCE', 
       6203: 'BUSINESS MANAGEMENT AND ADMINISTRATION', 2499: 'MISCELLANEOUS ENGINEERING', 
       2500: 'ENGINEERING TECHNOLOGIES', 2501: 'ENGINEERING AND INDUSTRIAL MANAGEMENT', 
       2502: 'ELECTRICAL ENGINEERING TECHNOLOGY', 2503: 'INDUSTRIAL PRODUCTION TECHNOLOGIES', 
       2504: 'MECHANICAL ENGINEERING RELATED TECHNOLOGIES', 2419: 'PETROLEUM ENGINEERING', 
       2001: 'COMMUNICATION TECHNOLOGIES', 6099: 'MISCELLANEOUS FINE ARTS', 
       6100: 'GENERAL MEDICAL AND HEALTH SERVICES', 
       6102: 'COMMUNICATION DISORDERS SCIENCES AND SERVICES', 
       6103: 'HEALTH AND MEDICAL ADMINISTRATIVE SERVICES', 
       6104: 'MEDICAL ASSISTING SERVICES', 6105: 'MEDICAL TECHNOLOGIES TECHNICIANS', 
       6106: 'HEALTH AND MEDICAL PREPARATORY PROGRAMS', 6107: 'NURSING', 
       6108: 'PHARMACY PHARMACEUTICAL SCIENCES AND ADMINISTRATION', 
       6109: 'TREATMENT THERAPY PROFESSIONS', 6110: 'COMMUNITY AND PUBLIC HEALTH', 
       5599: 'MISCELLANEOUS SOCIAL SCIENCES', 5601: 'CONSTRUCTION SERVICES', 
       5201: 'EDUCATIONAL PSYCHOLOGY', 5098: 'MULTI-DISCIPLINARY OR GENERAL SCIENCE', 
       5202: 'CLINICAL PSYCHOLOGY', 5102: 'NUCLEAR'}

# Occupation codes (OCCP)
occp = {10:'MGR-CHIEF EXECUTIVES AND LEGISLATORS',
       20:'MGR-GENERAL AND OPERATIONS MANAGERS',
       40:'MGR-ADVERTISING AND PROMOTIONS MANAGERS',
       50:'MGR-MARKETING AND SALES MANAGERS',
       60:'MGR-PUBLIC RELATIONS AND FUNDRAISING MANAGERS',
       100:'MGR-ADMINISTRATIVE SERVICES MANAGERS',
       110:'MGR-COMPUTER AND INFORMATION SYSTEMS MANAGERS',
       120:'MGR-FINANCIAL MANAGERS',
       135:'MGR-COMPENSATION AND BENEFITS MANAGERS',
       136:'MGR-HUMAN RESOURCES MANAGERS',
       137:'MGR-TRAINING AND DEVELOPMENT MANAGERS',
       140:'MGR-INDUSTRIAL PRODUCTION MANAGERS',
       150:'MGR-PURCHASING MANAGERS',
       160:'MGR-TRANSPORTATION, STORAGE, AND DISTRIBUTION MANAGERS',
       205:'MGR-FARMERS, RANCHERS, AND OTHER AGRICULTURAL MANAGERS',
       220:'MGR-CONSTRUCTION MANAGERS',
       230:'MGR-EDUCATION ADMINISTRATORS',
       300:'MGR-ARCHITECTURAL AND ENGINEERING MANAGERS',
       310:'MGR-FOOD SERVICE MANAGERS',
       330:'MGR-GAMING MANAGERS',
       340:'MGR-LODGING MANAGERS',
       350:'MGR-MEDICAL AND HEALTH SERVICES MANAGERS',
       360:'MGR-NATURAL SCIENCES MANAGERS',
       410:'MGR-PROPERTY, REAL ESTATE, AND COMMUNITY ASSOCIATION',
       420:'MGR-SOCIAL AND COMMUNITY SERVICE MANAGERS',
       425:'MGR-EMERGENCY MANAGEMENT DIRECTORS',
       430:'MGR-MISCELLANEOUS MANAGERS, INCLUDING FUNERAL SERVICE',
       500:'BUS-AGENTS AND BUSINESS MANAGERS OF ARTISTS, PERFORMERS,',
       510:'BUS-BUYERS AND PURCHASING AGENTS, FARM PRODUCTS',
       520:'BUS-WHOLESALE AND RETAIL BUYERS, EXCEPT FARM PRODUCTS',
       530:'BUS-PURCHASING AGENTS, EXCEPT WHOLESALE, RETAIL, AND FARM',
       540:'BUS-CLAIMS ADJUSTERS, APPRAISERS, EXAMINERS, AND',
       565:'BUS-COMPLIANCE OFFICERS',
       600:'BUS-COST ESTIMATORS',
       630:'BUS-HUMAN RESOURCES WORKERS',
       640:'BUS-COMPENSATION, BENEFITS, AND JOB ANALYSIS SPECIALISTS',
       650:'BUS-TRAINING AND DEVELOPMENT SPECIALISTS',
       700:'BUS-LOGISTICIANS',
       710:'BUS-MANAGEMENT ANALYSTS',
       725:'BUS-MEETING, CONVENTION, AND EVENT PLANNERS',
       726:'BUS-FUNDRAISERS',
       735:'BUS-MARKET RESEARCH ANALYSTS AND MARKETING SPECIALISTS',
       740:'BUS-BUSINESS OPERATIONS SPECIALISTS, ALL OTHER',
       800:'FIN-ACCOUNTANTS AND AUDITORS',
       810:'FIN-APPRAISERS AND ASSESSORS OF REAL ESTATE',
       820:'FIN-BUDGET ANALYSTS',
       830:'FIN-CREDIT ANALYSTS',
       840:'FIN-FINANCIAL ANALYSTS',
       850:'FIN-PERSONAL FINANCIAL ADVISORS',
       860:'FIN-INSURANCE UNDERWRITERS',
       900:'FIN-FINANCIAL EXAMINERS',
       910:'FIN-CREDIT COUNSELORS AND LOAN OFFICERS',
       930:'FIN-TAX EXAMINERS AND COLLECTORS, AND REVENUE AGENTS',
       940:'FIN-TAX PREPARERS',
       950:'FIN-FINANCIAL SPECIALISTS, ALL OTHER',
       1005:'CMM-COMPUTER AND INFORMATION RESEARCH SCIENTISTS',
       1006:'CMM-COMPUTER SYSTEMS ANALYSTS',
       1007:'CMM-INFORMATION SECURITY ANALYSTS',
       1010:'CMM-COMPUTER PROGRAMMERS',
       1020:'CMM-SOFTWARE DEVELOPERS, APPLICATIONS AND SYSTEMS SOFTWARE',
       1030:'CMM-WEB DEVELOPERS',
       1050:'CMM-COMPUTER SUPPORT SPECIALISTS',
       1060:'CMM-DATABASE ADMINISTRATORS',
       1105:'CMM-NETWORK AND COMPUTER SYSTEMS ADMINISTRATORS',
       1106:'CMM-COMPUTER NETWORK ARCHITECTS',
       1107:'CMM-COMPUTER OCCUPATIONS, ALL OTHER',
       1200:'CMM-ACTUARIES',
       1220:'CMM-OPERATIONS RESEARCH ANALYSTS',
       1240:'CMM-MISCELLANEOUS MATHEMATICAL SCIENCE OCCUPATIONS,',
       1300:'ENG-ARCHITECTS, EXCEPT NAVAL',
       1310:'ENG-SURVEYORS, CARTOGRAPHERS, AND PHOTOGRAMMETRISTS',
       1320:'ENG-AEROSPACE ENGINEERS',
       1340:'ENG-BIOMEDICAL AND AGRICULTURAL ENGINEERS',
       1350:'ENG-CHEMICAL ENGINEERS',
       1360:'ENG-CIVIL ENGINEERS',
       1400:'ENG-COMPUTER HARDWARE ENGINEERS',
       1410:'ENG-ELECTRICAL AND ELECTRONICS ENGINEERS',
       1420:'ENG-ENVIRONMENTAL ENGINEERS',
       1430:'ENG-INDUSTRIAL ENGINEERS, INCLUDING HEALTH AND SAFETY',
       1440:'ENG-MARINE ENGINEERS AND NAVAL ARCHITECTS',
       1450:'ENG-MATERIALS ENGINEERS',
       1460:'ENG-MECHANICAL ENGINEERS',
       1520:'ENG-PETROLEUM, MINING AND GEOLOGICAL ENGINEERS, INCLUDING',
       1530:'ENG-MISCELLANEOUS ENGINEERS, INCLUDING NUCLEAR ENGINEERS',
       1540:'ENG-DRAFTERS',
       1550:'ENG-ENGINEERING TECHNICIANS, EXCEPT DRAFTERS',
       1560:'ENG-SURVEYING AND MAPPING TECHNICIANS',
       1600:'SCI-AGRICULTURAL AND FOOD SCIENTISTS',
       1610:'SCI-BIOLOGICAL SCIENTISTS',
       1640:'SCI-CONSERVATION SCIENTISTS AND FORESTERS',
       1650:'SCI-MEDICAL SCIENTISTS, AND LIFE SCIENTISTS, ALL OTHER',
       1700:'SCI-ASTRONOMERS AND PHYSICISTS',
       1710:'SCI-ATMOSPHERIC AND SPACE SCIENTISTS',
       1720:'SCI-CHEMISTS AND MATERIALS SCIENTISTS',
       1740:'SCI-ENVIRONMENTAL SCIENTISTS AND GEOSCIENTISTS',
       1760:'SCI-PHYSICAL SCIENTISTS, ALL OTHER',
       1800:'SCI-ECONOMISTS',
       1820:'SCI-PSYCHOLOGISTS',
       1840:'SCI-URBAN AND REGIONAL PLANNERS',
       1860:'SCI-MISCELLANEOUS SOCIAL SCIENTISTS, INCLUDING SURVEY',
       1900:'SCI-AGRICULTURAL AND FOOD SCIENCE TECHNICIANS',
       1910:'SCI-BIOLOGICAL TECHNICIANS',
       1920:'SCI-CHEMICAL TECHNICIANS',
       1930:'SCI-GEOLOGICAL AND PETROLEUM TECHNICIANS, AND NUCLEAR',
       1965:'SCI-MISCELLANEOUS LIFE, PHYSICAL, AND SOCIAL SCIENCE',
       2000:'CMS-COUNSELORS',
       2010:'CMS-SOCIAL WORKERS',
       2015:'CMS-PROBATION OFFICERS AND CORRECTIONAL TREATMENT',
       2016:'CMS-SOCIAL AND HUMAN SERVICE ASSISTANTS',
       2025:'CMS-MISCELLANEOUS COMMUNITY AND SOCIAL SERVICE',
       2040:'CMS-CLERGY',
       2050:'CMS-DIRECTORS, RELIGIOUS ACTIVITIES AND EDUCATION',
       2060:'CMS-RELIGIOUS WORKERS, ALL OTHER',
       2100:'LGL-LAWYERS, AND JUDGES, MAGISTRATES, AND OTHER JUDICIAL',
       2105:'LGL-JUDICIAL LAW CLERKS',
       2145:'LGL-PARALEGALS AND LEGAL ASSISTANTS',
       2160:'LGL-MISCELLANEOUS LEGAL SUPPORT WORKERS',
       2200:'EDU-POSTSECONDARY TEACHERS',
       2300:'EDU-PRESCHOOL AND KINDERGARTEN TEACHERS',
       2310:'EDU-ELEMENTARY AND MIDDLE SCHOOL TEACHERS',
       2320:'EDU-SECONDARY SCHOOL TEACHERS',
       2330:'EDU-SPECIAL EDUCATION TEACHERS',
       2340:'EDU-OTHER TEACHERS AND INSTRUCTORS',
       2400:'EDU-ARCHIVISTS, CURATORS, AND MUSEUM TECHNICIANS',
       2430:'EDU-LIBRARIANS',
       2440:'EDU-LIBRARY TECHNICIANS',
       2540:'EDU-TEACHER ASSISTANTS',
       2550:'EDU-OTHER EDUCATION, TRAINING, AND LIBRARY WORKERS',
       2600:'ENT-ARTISTS AND RELATED WORKERS',
       2630:'ENT-DESIGNERS',
       2700:'ENT-ACTORS',
       2710:'ENT-PRODUCERS AND DIRECTORS',
       2720:'ENT-ATHLETES, COACHES, UMPIRES, AND RELATED WORKERS',
       2740:'ENT-DANCERS AND CHOREOGRAPHERS',
       2750:'ENT-MUSICIANS, SINGERS, AND RELATED WORKERS',
       2760:'ENT-ENTERTAINERS AND PERFORMERS, SPORTS AND RELATED',
       2800:'ENT-ANNOUNCERS',
       2810:'ENT-NEWS ANALYSTS, REPORTERS AND CORRESPONDENTS',
       2825:'ENT-PUBLIC RELATIONS SPECIALISTS',
       2830:'ENT-EDITORS',
       2840:'ENT-TECHNICAL WRITERS',
       2850:'ENT-WRITERS AND AUTHORS',
       2860:'ENT-MISCELLANEOUS MEDIA AND COMMUNICATION WORKERS',
       2900:'ENT-BROADCAST AND SOUND ENGINEERING TECHNICIANS AND RADIO',
       2910:'ENT-PHOTOGRAPHERS',
       2920:'ENT-TELEVISION, VIDEO, AND MOTION PICTURE CAMERA OPERATORS',
       3000:'MED-CHIROPRACTORS',
       3010:'MED-DENTISTS',
       3030:'MED-DIETITIANS AND NUTRITIONISTS',
       3040:'MED-OPTOMETRISTS',
       3050:'MED-PHARMACISTS',
       3060:'MED-PHYSICIANS AND SURGEONS',
       3110:'MED-PHYSICIAN ASSISTANTS',
       3120:'MED-PODIATRISTS',
       3140:'MED-AUDIOLOGISTS',
       3150:'MED-OCCUPATIONAL THERAPISTS',
       3160:'MED-PHYSICAL THERAPISTS',
       3200:'MED-RADIATION THERAPISTS',
       3210:'MED-RECREATIONAL THERAPISTS',
       3220:'MED-RESPIRATORY THERAPISTS',
       3230:'MED-SPEECH-LANGUAGE PATHOLOGISTS',
       3245:'MED-OTHER THERAPISTS, INCLUDING EXERCISE PHYSIOLOGISTS',
       3250:'MED-VETERINARIANS',
       3255:'MED-REGISTERED NURSES',
       3256:'MED-NURSE ANESTHETISTS',
       3258:'MED-NURSE PRACTITIONERS, AND NURSE MIDWIVES',
       3260:'MED-HEALTH DIAGNOSING AND TREATING PRACTITIONERS, ALL',
       3300:'MED-CLINICAL LABORATORY TECHNOLOGISTS AND TECHNICIANS',
       3310:'MED-DENTAL HYGIENISTS',
       3320:'MED-DIAGNOSTIC RELATED TECHNOLOGISTS AND TECHNICIANS',
       3400:'MED-EMERGENCY MEDICAL TECHNICIANS AND PARAMEDICS',
       3420:'MED-HEALTH PRACTITIONER SUPPORT TECHNOLOGISTS AND',
       3500:'MED-LICENSED PRACTICAL AND LICENSED VOCATIONAL NURSES',
       3510:'MED-MEDICAL RECORDS AND HEALTH INFORMATION TECHNICIANS',
       3520:'MED-OPTICIANS, DISPENSING',
       3535:'MED-MISCELLANEOUS HEALTH TECHNOLOGISTS AND TECHNICIANS',
       3540:'MED-OTHER HEALTHCARE PRACTITIONERS AND TECHNICAL',
       3600:'HLS-NURSING, PSYCHIATRIC, AND HOME HEALTH AIDES',
       3610:'HLS-OCCUPATIONAL THERAPY ASSISTANTS AND AIDES',
       3620:'HLS-PHYSICAL THERAPIST ASSISTANTS AND AIDES',
       3630:'HLS-MASSAGE THERAPISTS',
       3640:'HLS-DENTAL ASSISTANTS',
       3645:'HLS-MEDICAL ASSISTANTS',
       3646:'HLS-MEDICAL TRANSCRIPTIONISTS',
       3647:'HLS-PHARMACY AIDES',
       3648:'HLS-VETERINARY ASSISTANTS AND LABORATORY ANIMAL CARETAKERS',
       3649:'HLS-PHLEBOTOMISTS',
       3655:'HLS-HEALTHCARE SUPPORT WORKERS, ALL OTHER, INCLUDING',
       3700:'PRT-FIRST-LINE SUPERVISORS OF CORRECTIONAL OFFICERS',
       3710:'PRT-FIRST-LINE SUPERVISORS OF POLICE AND DETECTIVES',
       3720:'PRT-FIRST-LINE SUPERVISORS OF FIRE FIGHTING AND PREVENTION',
       3730:'PRT-FIRST-LINE SUPERVISORS OF PROTECTIVE SERVICE WORKERS,',
       3740:'PRT-FIREFIGHTERS',
       3750:'PRT-FIRE INSPECTORS',
       3800:'PRT-BAILIFFS, CORRECTIONAL OFFICERS, AND JAILERS',
       3820:'PRT-DETECTIVES AND CRIMINAL INVESTIGATORS',
       3840:'PRT-MISCELLANEOUS LAW ENFORCEMENT WORKERS',
       3850:'PRT-POLICE OFFICERS',
       3900:'PRT-ANIMAL CONTROL WORKERS',
       3910:'PRT-PRIVATE DETECTIVES AND INVESTIGATORS',
       3930:'PRT-SECURITY GUARDS AND GAMING SURVEILLANCE OFFICERS',
       3940:'PRT-CROSSING GUARDS',
       3945:'PRT-TRANSPORTATION SECURITY SCREENERS',
       3955:'PRT-LIFEGUARDS AND OTHER RECREATIONAL, AND ALL OTHER',
       4000:'EAT-CHEFS AND HEAD COOKS',
       4010:'EAT-FIRST-LINE SUPERVISORS OF FOOD PREPARATION AND SERVING',
       4020:'EAT-COOKS',
       4030:'EAT-FOOD PREPARATION WORKERS',
       4040:'EAT-BARTENDERS',
       4050:'EAT-COMBINED FOOD PREPARATION AND SERVING WORKERS,',
       4060:'EAT-COUNTER ATTENDANTS, CAFETERIA, FOOD CONCESSION, AND',
       4110:'EAT-WAITERS AND WAITRESSES',
       4120:'EAT-FOOD SERVERS, NONRESTAURANT',
       4130:'EAT-MISCELLANEOUS FOOD PREPARATION AND SERVING RELATED',
       4140:'EAT-DISHWASHERS',
       4150:'EAT-HOSTS AND HOSTESSES, RESTAURANT, LOUNGE, AND COFFEE',
       4200:'CLN-FIRST-LINE SUPERVISORS OF HOUSEKEEPING AND JANITORIAL',
       4210:'CLN-FIRST-LINE SUPERVISORS OF LANDSCAPING, LAWN SERVICE,',
       4220:'CLN-JANITORS AND BUILDING CLEANERS',
       4230:'CLN-MAIDS AND HOUSEKEEPING CLEANERS',
       4240:'CLN-PEST CONTROL WORKERS',
       4250:'CLN-GROUNDS MAINTENANCE WORKERS',
       4300:'PRS-FIRST-LINE SUPERVISORS OF GAMING WORKERS',
       4320:'PRS-FIRST-LINE SUPERVISORS OF PERSONAL SERVICE WORKERS',
       4340:'PRS-ANIMAL TRAINERS',
       4350:'PRS-NONFARM ANIMAL CARETAKERS',
       4400:'PRS-GAMING SERVICES WORKERS',
       4410:'PRS-MOTION PICTURE PROJECTIONISTS',
       4420:'PRS-USHERS, LOBBY ATTENDANTS, AND TICKET TAKERS',
       4430:'PRS-MISCELLANEOUS ENTERTAINMENT ATTENDANTS AND RELATED',
       4460:'PRS-EMBALMERS AND FUNERAL ATTENDANTS',
       4465:'PRS-MORTICIANS, UNDERTAKERS, AND FUNERAL DIRECTORS',
       4500:'PRS-BARBERS',
       4510:'PRS-HAIRDRESSERS, HAIRSTYLISTS, AND COSMETOLOGISTS',
       4520:'PRS-MISCELLANEOUS PERSONAL APPEARANCE WORKERS',
       4530:'PRS-BAGGAGE PORTERS, BELLHOPS, AND CONCIERGES',
       4540:'PRS-TOUR AND TRAVEL GUIDES',
       4600:'PRS-CHILDCARE WORKERS',
       4610:'PRS-PERSONAL CARE AIDES',
       4620:'PRS-RECREATION AND FITNESS WORKERS',
       4640:'PRS-RESIDENTIAL ADVISORS',
       4650:'PRS-PERSONAL CARE AND SERVICE WORKERS, ALL OTHER',
       4700:'SAL-FIRST-LINE SUPERVISORS OF RETAIL SALES WORKERS',
       4710:'SAL-FIRST-LINE SUPERVISORS OF NON-RETAIL SALES WORKERS',
       4720:'SAL-CASHIERS',
       4740:'SAL-COUNTER AND RENTAL CLERKS',
       4750:'SAL-PARTS SALESPERSONS',
       4760:'SAL-RETAIL SALESPERSONS',
       4800:'SAL-ADVERTISING SALES AGENTS',
       4810:'SAL-INSURANCE SALES AGENTS',
       4820:'SAL-SECURITIES, COMMODITIES, AND FINANCIAL SERVICES SALES',
       4830:'SAL-TRAVEL AGENTS',
       4840:'SAL-SALES REPRESENTATIVES, SERVICES, ALL OTHER',
       4850:'SAL-SALES REPRESENTATIVES, WHOLESALE AND MANUFACTURING',
       4900:'SAL-MODELS, DEMONSTRATORS, AND PRODUCT PROMOTERS',
       4920:'SAL-REAL ESTATE BROKERS AND SALES AGENTS',
       4930:'SAL-SALES ENGINEERS',
       4940:'SAL-TELEMARKETERS',
       4950:'SAL-DOOR-TO-DOOR SALES WORKERS, NEWS AND STREET VENDORS,',
       4965:'SAL-SALES AND RELATED WORKERS, ALL OTHER',
       5000:'OFF-FIRST-LINE SUPERVISORS OF OFFICE AND ADMINISTRATIVE',
       5010:'OFF-SWITCHBOARD OPERATORS, INCLUDING ANSWERING SERVICE',
       5020:'OFF-TELEPHONE OPERATORS',
       5030:'OFF-COMMUNICATIONS EQUIPMENT OPERATORS, ALL OTHER',
       5100:'OFF-BILL AND ACCOUNT COLLECTORS',
       5110:'OFF-BILLING AND POSTING CLERKS',
       5120:'OFF-BOOKKEEPING, ACCOUNTING, AND AUDITING CLERKS',
       5130:'OFF-GAMING CAGE WORKERS',
       5140:'OFF-PAYROLL AND TIMEKEEPING CLERKS',
       5150:'OFF-PROCUREMENT CLERKS',
       5160:'OFF-TELLERS',
       5165:'OFF-FINANCIAL CLERKS, ALL OTHER',
       5200:'OFF-BROKERAGE CLERKS',
       5220:'OFF-COURT, MUNICIPAL, AND LICENSE CLERKS',
       5230:'OFF-CREDIT AUTHORIZERS, CHECKERS, AND CLERKS',
       5240:'OFF-CUSTOMER SERVICE REPRESENTATIVES',
       5250:'OFF-ELIGIBILITY INTERVIEWERS, GOVERNMENT PROGRAMS',
       5260:'OFF-FILE CLERKS',
       5300:'OFF-HOTEL, MOTEL, AND RESORT DESK CLERKS',
       5310:'OFF-INTERVIEWERS, EXCEPT ELIGIBILITY AND LOAN',
       5320:'OFF-LIBRARY ASSISTANTS, CLERICAL',
       5330:'OFF-LOAN INTERVIEWERS AND CLERKS',
       5340:'OFF-NEW ACCOUNTS CLERKS',
       5350:'OFF-CORRESPONDENCE CLERKS AND ORDER CLERKS',
       5360:'OFF-HUMAN RESOURCES ASSISTANTS, EXCEPT PAYROLL AND',
       5400:'OFF-RECEPTIONISTS AND INFORMATION CLERKS',
       5410:'OFF-RESERVATION AND TRANSPORTATION TICKET AGENTS AND',
       5420:'OFF-INFORMATION AND RECORD CLERKS, ALL OTHER',
       5500:'OFF-CARGO AND FREIGHT AGENTS',
       5510:'OFF-COURIERS AND MESSENGERS',
       5520:'OFF-DISPATCHERS',
       5530:'OFF-METER READERS, UTILITIES',
       5540:'OFF-POSTAL SERVICE CLERKS',
       5550:'OFF-POSTAL SERVICE MAIL CARRIERS',
       5560:'OFF-POSTAL SERVICE MAIL SORTERS, PROCESSORS, AND',
       5600:'OFF-PRODUCTION, PLANNING, AND EXPEDITING CLERKS',
       5610:'OFF-SHIPPING, RECEIVING, AND TRAFFIC CLERKS',
       5620:'OFF-STOCK CLERKS AND ORDER FILLERS',
       5630:'OFF-WEIGHERS, MEASURERS, CHECKERS, AND SAMPLERS,',
       5700:'OFF-SECRETARIES AND ADMINISTRATIVE ASSISTANTS',
       5800:'OFF-COMPUTER OPERATORS',
       5810:'OFF-DATA ENTRY KEYERS',
       5820:'OFF-WORD PROCESSORS AND TYPISTS',
       5840:'OFF-INSURANCE CLAIMS AND POLICY PROCESSING CLERKS',
       5850:'OFF-MAIL CLERKS AND MAIL MACHINE OPERATORS, EXCEPT POSTAL',
       5860:'OFF-OFFICE CLERKS, GENERAL',
       5900:'OFF-OFFICE MACHINE OPERATORS, EXCEPT COMPUTER',
       5910:'OFF-PROOFREADERS AND COPY MARKERS',
       5920:'OFF-STATISTICAL ASSISTANTS',
       5940:'OFF-MISCELLANEOUS OFFICE AND ADMINISTRATIVE SUPPORT',
       6005:'FFF-FIRST-LINE SUPERVISORS OF FARMING, FISHING, AND',
       6010:'FFF-AGRICULTURAL INSPECTORS',
       6040:'FFF-GRADERS AND SORTERS, AGRICULTURAL PRODUCTS',
       6050:'FFF-MISCELLANEOUS AGRICULTURAL WORKERS, INCLUDING ANIMAL',
       6100:'FFF-FISHING AND HUNTING WORKERS',
       6120:'FFF-FOREST AND CONSERVATION WORKERS',
       6130:'FFF-LOGGING WORKERS',
       6200:'CON-FIRST-LINE SUPERVISORS OF CONSTRUCTION TRADES AND',
       6210:'CON-BOILERMAKERS',
       6220:'CON-BRICKMASONS, BLOCKMASONS, STONEMASONS, AND',
       6230:'CON-CARPENTERS',
       6240:'CON-CARPET, FLOOR, AND TILE INSTALLERS AND FINISHERS',
       6250:'CON-CEMENT MASONS, CONCRETE FINISHERS, AND TERRAZZO',
       6260:'CON-CONSTRUCTION LABORERS',
       6300:'CON-PAVING, SURFACING, AND TAMPING EQUIPMENT OPERATORS',
       6320:'CON-CONSTRUCTION EQUIPMENT OPERATORS, EXCEPT PAVING,',
       6330:'CON-DRYWALL INSTALLERS, CEILING TILE INSTALLERS, AND',
       6355:'CON-ELECTRICIANS',
       6360:'CON-GLAZIERS',
       6400:'CON-INSULATION WORKERS',
       6420:'CON-PAINTERS AND PAPERHANGERS',
       6440:'CON-PIPELAYERS, PLUMBERS, PIPEFITTERS, AND STEAMFITTERS',
       6460:'CON-PLASTERERS AND STUCCO MASONS',
       6515:'CON-ROOFERS',
       6520:'CON-SHEET METAL WORKERS',
       6530:'CON-STRUCTURAL IRON AND STEEL WORKERS',
       6600:'CON-HELPERS, CONSTRUCTION TRADES',
       6660:'CON-CONSTRUCTION AND BUILDING INSPECTORS',
       6700:'CON-ELEVATOR INSTALLERS AND REPAIRERS',
       6710:'CON-FENCE ERECTORS',
       6720:'CON-HAZARDOUS MATERIALS REMOVAL WORKERS',
       6730:'CON-HIGHWAY MAINTENANCE WORKERS',
       6740:'CON-RAIL-TRACK LAYING AND MAINTENANCE EQUIPMENT OPERATORS',
       6765:'CON-MISCELLANEOUS CONSTRUCTION WORKERS, INCLUDING SOLAR',
       6800:'EXT-DERRICK, ROTARY DRILL, AND SERVICE UNIT OPERATORS, AND',
       6820:'EXT-EARTH DRILLERS, EXCEPT OIL AND GAS',
       6830:'EXT-EXPLOSIVES WORKERS, ORDNANCE HANDLING EXPERTS, AND',
       6840:'EXT-MINING MACHINE OPERATORS',
       6940:'EXT-MISCELLANEOUS EXTRACTION WORKERS, INCLUDING ROOF',
       7000:'RPR-FIRST-LINE SUPERVISORS OF MECHANICS, INSTALLERS, AND',
       7010:'RPR-COMPUTER, AUTOMATED TELLER, AND OFFICE MACHINE',
       7020:'RPR-RADIO AND TELECOMMUNICATIONS EQUIPMENT INSTALLERS AND',
       7030:'RPR-AVIONICS TECHNICIANS',
       7040:'RPR-ELECTRIC MOTOR, POWER TOOL, AND RELATED REPAIRERS',
       7100:'RPR-ELECTRICAL AND ELECTRONICS REPAIRERS, TRANSPORTATION',
       7110:'RPR-ELECTRONIC EQUIPMENT INSTALLERS AND REPAIRERS, MOTOR',
       7120:'RPR-ELECTRONIC HOME ENTERTAINMENT EQUIPMENT INSTALLERS AND',
       7130:'RPR-SECURITY AND FIRE ALARM SYSTEMS INSTALLERS',
       7140:'RPR-AIRCRAFT MECHANICS AND SERVICE TECHNICIANS',
       7150:'RPR-AUTOMOTIVE BODY AND RELATED REPAIRERS',
       7160:'RPR-AUTOMOTIVE GLASS INSTALLERS AND REPAIRERS',
       7200:'RPR-AUTOMOTIVE SERVICE TECHNICIANS AND MECHANICS',
       7210:'RPR-BUS AND TRUCK MECHANICS AND DIESEL ENGINE SPECIALISTS',
       7220:'RPR-HEAVY VEHICLE AND MOBILE EQUIPMENT SERVICE TECHNICIANS',
       7240:'RPR-SMALL ENGINE MECHANICS',
       7260:'RPR-MISCELLANEOUS VEHICLE AND MOBILE EQUIPMENT MECHANICS,',
       7300:'RPR-CONTROL AND VALVE INSTALLERS AND REPAIRERS',
       7315:'RPR-HEATING, AIR CONDITIONING, AND REFRIGERATION MECHANICS',
       7320:'RPR-HOME APPLIANCE REPAIRERS',
       7330:'RPR-INDUSTRIAL AND REFRACTORY MACHINERY MECHANICS',
       7340:'RPR-MAINTENANCE AND REPAIR WORKERS, GENERAL',
       7350:'RPR-MAINTENANCE WORKERS, MACHINERY',
       7360:'RPR-MILLWRIGHTS',
       7410:'RPR-ELECTRICAL POWER-LINE INSTALLERS AND REPAIRERS',
       7420:'RPR-TELECOMMUNICATIONS LINE INSTALLERS AND REPAIRERS',
       7430:'RPR-PRECISION INSTRUMENT AND EQUIPMENT REPAIRERS',
       7510:'RPR-COIN, VENDING, AND AMUSEMENT MACHINE SERVICERS AND',
       7540:'RPR-LOCKSMITHS AND SAFE REPAIRERS',
       7560:'RPR-RIGGERS',
       7610:'RPR-HELPERS--INSTALLATION, MAINTENANCE, AND REPAIR WORKERS',
       7630:'RPR-MISCELLANEOUS INSTALLATION, MAINTENANCE, AND REPAIR',
       7700:'PRD-FIRST-LINE SUPERVISORS OF PRODUCTION AND OPERATING',
       7710:'PRD-AIRCRAFT STRUCTURE, SURFACES, RIGGING, AND SYSTEMS',
       7720:'PRD-ELECTRICAL, ELECTRONICS, AND ELECTROMECHANICAL',
       7730:'PRD-ENGINE AND OTHER MACHINE ASSEMBLERS',
       7740:'PRD-STRUCTURAL METAL FABRICATORS AND FITTERS',
       7750:'PRD-MISCELLANEOUS ASSEMBLERS AND FABRICATORS',
       7800:'PRD-BAKERS',
       7810:'PRD-BUTCHERS AND OTHER MEAT, POULTRY, AND FISH PROCESSING',
       7830:'PRD-FOOD AND TOBACCO ROASTING, BAKING, AND DRYING MACHINE',
       7840:'PRD-FOOD BATCHMAKERS',
       7850:'PRD-FOOD COOKING MACHINE OPERATORS AND TENDERS',
       7855:'PRD-FOOD PROCESSING WORKERS, ALL OTHER',
       7900:'PRD-COMPUTER CONTROL PROGRAMMERS AND OPERATORS',
       7920:'PRD-EXTRUDING AND DRAWING MACHINE SETTERS, OPERATORS, AND',
       7930:'PRD-FORGING MACHINE SETTERS, OPERATORS, AND TENDERS, METAL',
       7940:'PRD-ROLLING MACHINE SETTERS, OPERATORS, AND TENDERS, METAL',
       7950:'PRD-MACHINE TOOL CUTTING SETTERS, OPERATORS, AND TENDERS,',
       8030:'PRD-MACHINISTS',
       8040:'PRD-METAL FURNACE OPERATORS, TENDERS, POURERS, AND CASTERS',
       8100:'PRD-MODEL MAKERS, PATTERNMAKERS, AND MOLDING MACHINE',
       8130:'PRD-TOOL AND DIE MAKERS',
       8140:'PRD-WELDING, SOLDERING, AND BRAZING WORKERS',
       8220:'PRD-MISCELLANEOUS METAL WORKERS AND PLASTIC WORKERS,',
       8250:'PRD-PREPRESS TECHNICIANS AND WORKERS',
       8255:'PRD-PRINTING PRESS OPERATORS',
       8256:'PRD-PRINT BINDING AND FINISHING WORKERS',
       8300:'PRD-LAUNDRY AND DRY-CLEANING WORKERS',
       8310:'PRD-PRESSERS, TEXTILE, GARMENT, AND RELATED MATERIALS',
       8320:'PRD-SEWING MACHINE OPERATORS',
       8330:'PRD-SHOE AND LEATHER WORKERS',
       8350:'PRD-TAILORS, DRESSMAKERS, AND SEWERS',
       8400:'PRD-TEXTILE BLEACHING AND DYEING, AND CUTTING MACHINE',
       8410:'PRD-TEXTILE KNITTING AND WEAVING MACHINE SETTERS,',
       8420:'PRD-TEXTILE WINDING, TWISTING, AND DRAWING OUT MACHINE',
       8450:'PRD-UPHOLSTERERS',
       8460:'PRD-MISCELLANEOUS TEXTILE, APPAREL, AND FURNISHINGS',
       8500:'PRD-CABINETMAKERS AND BENCH CARPENTERS',
       8510:'PRD-FURNITURE FINISHERS',
       8530:'PRD-SAWING MACHINE SETTERS, OPERATORS, AND TENDERS, WOOD',
       8540:'PRD-WOODWORKING MACHINE SETTERS, OPERATORS, AND TENDERS,',
       8550:'PRD-MISCELLANEOUS WOODWORKERS, INCLUDING MODEL MAKERS AND',
       8600:'PRD-POWER PLANT OPERATORS, DISTRIBUTORS, AND DISPATCHERS',
       8610:'PRD-STATIONARY ENGINEERS AND BOILER OPERATORS',
       8620:'PRD-WATER AND WASTEWATER TREATMENT PLANT AND SYSTEM',
       8630:'PRD-MISCELLANEOUS PLANT AND SYSTEM OPERATORS',
       8640:'PRD-CHEMICAL PROCESSING MACHINE SETTERS, OPERATORS, AND',
       8650:'PRD-CRUSHING, GRINDING, POLISHING, MIXING, AND BLENDING',
       8710:'PRD-CUTTING WORKERS',
       8720:'PRD-EXTRUDING, FORMING, PRESSING, AND COMPACTING MACHINE',
       8730:'PRD-FURNACE, KILN, OVEN, DRIER, AND KETTLE OPERATORS AND',
       8740:'PRD-INSPECTORS, TESTERS, SORTERS, SAMPLERS, AND WEIGHERS',
       8750:'PRD-JEWELERS AND PRECIOUS STONE AND METAL WORKERS',
       8760:'PRD-MEDICAL, DENTAL, AND OPHTHALMIC LABORATORY TECHNICIANS',
       8800:'PRD-PACKAGING AND FILLING MACHINE OPERATORS AND TENDERS',
       8810:'PRD-PAINTING WORKERS',
       8830:'PRD-PHOTOGRAPHIC PROCESS WORKERS AND PROCESSING MACHINE',
       8850:'PRD-ADHESIVE BONDING MACHINE OPERATORS AND TENDERS',
       8910:'PRD-ETCHERS AND ENGRAVERS',
       8920:'PRD-MOLDERS, SHAPERS, AND CASTERS, EXCEPT METAL AND',
       8930:'PRD-PAPER GOODS MACHINE SETTERS, OPERATORS, AND TENDERS',
       8940:'PRD-TIRE BUILDERS',
       8950:'PRD-HELPERS-PRODUCTION WORKERS',
       8965:'PRD-MISCELLANEOUS PRODUCTION WORKERS, INCLUDING',
       9000:'TRN-SUPERVISORS OF TRANSPORTATION AND MATERIAL MOVING',
       9030:'TRN-AIRCRAFT PILOTS AND FLIGHT ENGINEERS',
       9040:'TRN-AIR TRAFFIC CONTROLLERS AND AIRFIELD OPERATIONS',
       9050:'TRN-FLIGHT ATTENDANTS',
       9110:'TRN-AMBULANCE DRIVERS AND ATTENDANTS, EXCEPT EMERGENCY',
       9120:'TRN-BUS DRIVERS',
       9130:'TRN-DRIVER/SALES WORKERS AND TRUCK DRIVERS',
       9140:'TRN-TAXI DRIVERS AND CHAUFFEURS',
       9150:'TRN-MOTOR VEHICLE OPERATORS, ALL OTHER',
       9200:'TRN-LOCOMOTIVE ENGINEERS AND OPERATORS',
       9240:'TRN-RAILROAD CONDUCTORS AND YARDMASTERS',
       9260:'TRN-SUBWAY, STREETCAR, AND OTHER RAIL TRANSPORTATION',
       9300:'TRN-SAILORS AND MARINE OILERS, AND SHIP ENGINEERS',
       9310:'TRN-SHIP AND BOAT CAPTAINS AND OPERATORS',
       9350:'TRN-PARKING LOT ATTENDANTS',
       9360:'TRN-AUTOMOTIVE AND WATERCRAFT SERVICE ATTENDANTS',
       9410:'TRN-TRANSPORTATION INSPECTORS',
       9415:'TRN-TRANSPORTATION ATTENDANTS, EXCEPT FLIGHT ATTENDANTS',
       9420:'TRN-MISCELLANEOUS TRANSPORTATION WORKERS, INCLUDING BRIDGE',
       9510:'TRN-CRANE AND TOWER OPERATORS',
       9520:'TRN-DREDGE, EXCAVATING, AND LOADING MACHINE OPERATORS',
       9560:'TRN-CONVEYOR OPERATORS AND TENDERS, AND HOIST AND WINCH',
       9600:'TRN-INDUSTRIAL TRUCK AND TRACTOR OPERATORS',
       9610:'TRN-CLEANERS OF VEHICLES AND EQUIPMENT',
       9620:'TRN-LABORERS AND FREIGHT, STOCK, AND MATERIAL MOVERS, HAND',
       9630:'TRN-MACHINE FEEDERS AND OFFBEARERS',
       9640:'TRN-PACKERS AND PACKAGERS, HAND',
       9650:'TRN-PUMPING STATION OPERATORS',
       9720:'TRN-REFUSE AND RECYCLABLE MATERIAL COLLECTORS',
       9750:'TRN-MISCELLANEOUS MATERIAL MOVING WORKERS, INCLUDING',
       9800:'MIL-MILITARY OFFICER SPECIAL AND TACTICAL OPERATIONS',
       9810:'MIL-FIRST-LINE ENLISTED MILITARY SUPERVISORS',
       9820:'MIL-MILITARY ENLISTED TACTICAL OPERATIONS AND AIR/WEAPONS',
       9830:'MIL-MILITARY, RANK NOT SPECIFIED **',
       9920:'UNEMPLOYED AND LAST WORKED 5 YEARS AGO OR EARLIER OR NEVER',}


This dataset is big, so narrow it down to a handful of states.

In [None]:
# We only want certain states (FIPS codes):
# 6 CA, 8 CO, 10 DE, 24 MD, 25 MA, 34 NJ, 36 NY, 42 PA, 49 UT, 53 WA, 54 WV
d=d[d['ST'].isin([6,8,10,24,25,34,36,42,49,53,54])]




## PWGTP - Expanding the Data ##

Okay, here is what's going on.  Each record has a **PWGTP** value, the person weight. Say it's 13. That means
the record stands for 13 people, so it needs to appear 13 times in the pandas DataFrame.  I'm going to
build the expanded rows in a preallocated NumPy array instead of appending to the DataFrame.  So far it seems
to be working.  Some of the other methods tried bumped up against memory and time limitations.

Summary:  Each row is repeated **PWGTP** times in the expanded DataFrame.

In [None]:
# What's the most efficient way to do this?
# Keep white (RAC1P==1) males (SEX==1) aged 20 to 60, and only the columns used below.
cols=['AGEP','RAC1P','ST','SEX','SCH','SCHL','INCOME','FOD1P','OCCP']
d=d[(d['AGEP']>=20) & (d['AGEP']<=60) & (d['SEX']==1) &
    (d['RAC1P']==1)][['PWGTP']+cols]
numberOfRows=int(d['PWGTP'].sum())

I=0  # Next free row in A
A=np.zeros((numberOfRows,len(cols)),dtype=np.int64)
def f(t):
    global I
    n=int(t['PWGTP'])
    # Slice assignment copies this record into n consecutive rows of A
    A[I:I+n]=[int(t[c]) for c in cols]
    I+=n

d=d.fillna(-1) # Can't have NaN when we go to int
d.apply(f,axis=1);

d = pd.DataFrame(A,columns=cols)
d = d[d['INCOME']>=0]  # Drops rows whose INCOME was missing (filled with -1 above)
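
As for the "most efficient way" question in the cell above, one option worth trying is `np.repeat`, which repeats row positions by weight without a Python-level loop. This is only a sketch on a toy frame; the function name `expand_by_weight` and the toy values are made up for illustration, and I haven't measured it against the full dataset.

```python
import numpy as np
import pandas as pd

def expand_by_weight(df, weight_col="PWGTP"):
    """Repeat each row df[weight_col] times and drop the weight column."""
    weights = df[weight_col].to_numpy(dtype=np.int64)
    # Each row position appears `weight` times in the result
    positions = np.repeat(np.arange(len(df)), weights)
    return df.iloc[positions].drop(columns=weight_col).reset_index(drop=True)

# Toy data (hypothetical values): three records standing for 3, 1 and 2 people
toy = pd.DataFrame({
    "PWGTP": [3, 1, 2],
    "AGEP": [25, 40, 33],
    "INCOME": [50000, 90000, 70000],
})
expanded = expand_by_weight(toy)
print(len(expanded))
```

The expanded frame has one row per person represented, so its length equals the sum of the weights.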

## Overview - General Look ##

It probably makes sense to look at occupation codes, **OCCP**, first, and then dig into what
people majored in at college.

In [None]:
def percentile(n):
    def percentile_(x):
        return np.percentile(x, n)
    percentile_.__name__ = 'percentile_%s' % n
    return percentile_
g = d.groupby(['OCCP','ST']).INCOME.agg([percentile(75),percentile(50),percentile(35)])
g = g.reset_index()
# We're sorting on 75th percentile
g.sort_values(by=['ST','percentile_75','percentile_50'],ascending=False,inplace=True)


def f(x):
    if x in occp:
        return occp[x]
    return ''

def f2(x):
    if x in states:
        return states[x]
    return ''


g['occp']=g['OCCP'].apply(lambda x: f(x))
g['states']=g['ST'].apply(lambda x: f2(x))


d['occp']=d['OCCP'].apply(lambda x: f(x))
d['states']=d['ST'].apply(lambda x: f2(x))

# Quick look
g.head(8)

## Pennsylvania ##



In [None]:
ss=42 # Select state

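# Quick look: occupations with a median income of at least $75K, highest 75th percentile first.
# (sort_values with inplace=True on a filtered copy is thrown away, so show the sorted result.)
g[(g['ST']==ss)&(g['percentile_50']>=75000)].sort_values(
    by=['ST','percentile_75','percentile_50'],ascending=False).head()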

In [None]:
ss=42 # Select state
t=d[(d['OCCP'].isin(g[g['ST']==ss]['OCCP'].head(50).values)) &
 (d['ST']==ss)]

# Use this to check data
t.head()

tt=t.groupby(['ST','OCCP']).count().reset_index()

tt=tt[['ST','OCCP','SEX']]

tt.columns = ['ST','OCCP','count']
tt[tt['ST']==ss].head(5)

tx=pd.merge(tt, g, left_on = ['OCCP','ST'], right_on = ['OCCP','ST'])
tx.head()

# Note: this sort returns a copy. Do not sort tx in place here; it would scramble the tick order in the graph.
tx[(tx['percentile_50']>85000)].sort_values(by=['count','percentile_75'],ascending=False,inplace=False)
tx.head()

In [None]:
plt.style.use('fivethirtyeight')
# Careful - using t and tx. Keep data sorted
ax = sns.stripplot(x="OCCP", y="INCOME", data=t);
ax.set_xticklabels(ax.xaxis.get_majorticklabels(), rotation=90);
ax.set_xlabel('Occupation')
plt.ylim(50000, 800000)
plt.yticks([50000,200000,400000,600000], ['$50K', '$200K', '$400K', '$600K'])

# ss is the selected state
# The plot above is ordered by x= (OCCP), so sort tx the same way
tx.sort_values(by=['ST','OCCP'],ascending=True,inplace=True)

xtickName=tx[tx['ST']==ss]['occp'].head(50).tolist()

# Use len(xtickName) so this still works with fewer than 50 labels
plt.xticks(range(len(xtickName)),xtickName,fontsize=9)


plt.title("Pennsylvania\nTop Paying Occupations",  fontsize=16,
            fontstyle='italic', fontweight='bold');


## California ##


In [None]:
ss=6 # Select state

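# Quick look: occupations with a median income of at least $75K, highest 75th percentile first.
# (sort_values with inplace=True on a filtered copy is thrown away, so show the sorted result.)
g[(g['ST']==ss)&(g['percentile_50']>=75000)].sort_values(
    by=['ST','percentile_75','percentile_50'],ascending=False).head()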

In [None]:
ss=6 # Select state
t=d[(d['OCCP'].isin(g[g['ST']==ss]['OCCP'].head(50).values)) &
 (d['ST']==ss)]

# Use this to check data
t.head()

tt=t.groupby(['ST','OCCP']).count().reset_index()

tt=tt[['ST','OCCP','SEX']]

tt.columns = ['ST','OCCP','count']
tt[tt['ST']==ss].head(5)

tx=pd.merge(tt, g, left_on = ['OCCP','ST'], right_on = ['OCCP','ST'])
tx.head()

# Note: this sort returns a copy. Do not sort tx in place here; it would scramble the tick order in the graph.
tx[(tx['percentile_50']>85000)].sort_values(by=['count','percentile_75'],ascending=False,inplace=False)
tx.head()

In [None]:
plt.style.use('fivethirtyeight')
# Careful - using t and tx. Keep data sorted
ax = sns.stripplot(x="OCCP", y="INCOME", data=t);
ax.set_xticklabels(ax.xaxis.get_majorticklabels(), rotation=90);
ax.set_xlabel('Occupation')
plt.ylim(50000, 800000)
plt.yticks([50000,200000,400000,600000], ['$50K', '$200K', '$400K', '$600K'])

# ss is the selected state
# The plot above is ordered by x= (OCCP), so sort tx the same way
tx.sort_values(by=['ST','OCCP'],ascending=True,inplace=True)

xtickName=tx[tx['ST']==ss]['occp'].head(50).tolist()

# Use len(xtickName) so this still works with fewer than 50 labels
plt.xticks(range(len(xtickName)),xtickName,fontsize=9)


plt.title("California\nTop Paying Occupations",  fontsize=16,
            fontstyle='italic', fontweight='bold');




## Percentile ##

I don't think averages are any good when dealing with income, because it's hard to see the individual points behind the data.  I'll use percentiles to choose the top majors to work with.  This is a first shot.  Perhaps other methods will be necessary.

**75th Percentile**

Currently, the majors shown in the graphs below are picked by highest 75th-percentile income.  A cautionary note: The graphs below list the primary major people chose in college.  This doesn't necessarily mean they're still working in that chosen field.  In addition, some fields have very few people.
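
If expanding the records ever gets too heavy, a percentile can also be computed straight from the weights. The sketch below is an assumption on my part, not what this notebook does: `weighted_percentile` is a made-up helper that returns the smallest observed value whose cumulative weight reaches q percent of the total. It doesn't interpolate, so it can differ slightly from `np.percentile` run on the expanded rows.

```python
import numpy as np

def weighted_percentile(values, weights, q):
    """Smallest value whose cumulative weight reaches q percent of the total weight."""
    values = np.asarray(values, dtype=float)
    weights = np.asarray(weights, dtype=float)
    order = np.argsort(values, kind="stable")
    values, weights = values[order], weights[order]
    cum = np.cumsum(weights)
    cutoff = q / 100.0 * cum[-1]
    # First position where the running weight is at least the cutoff
    idx = np.searchsorted(cum, cutoff, side="left")
    return values[min(idx, len(values) - 1)]

# Toy data (hypothetical): three records standing for 1, 1 and 2 people
incomes = [10000, 20000, 30000]
pwgtp = [1, 1, 2]
median = weighted_percentile(incomes, pwgtp, 50)
p75 = weighted_percentile(incomes, pwgtp, 75)
print(median, p75)
```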

In [None]:

def percentile(n):
    def percentile_(x):
        return np.percentile(x, n)
    percentile_.__name__ = 'percentile_%s' % n
    return percentile_
g = d.groupby(['FOD1P','ST']).INCOME.agg([percentile(75),percentile(50),percentile(35)])
g = g.reset_index()
g.sort_values(by=['ST','percentile_75','percentile_50'],ascending=False,inplace=True)


def f(x):
    if x in fod1p:
        return fod1p[x]
    return ''

def f2(x):
    if x in states:
        return states[x]
    return ''


g['major']=g['FOD1P'].apply(lambda x: f(x))
g['states']=g['ST'].apply(lambda x: f2(x))


d['major']=d['FOD1P'].apply(lambda x: f(x))
d['states']=d['ST'].apply(lambda x: f2(x))

# Quick look
g.head(10)

## First Look at States ##

Just to recap: the graphs and charts are based on the highest 75th-percentile earnings, in the respective states, grouped by what people studied in college.  This doesn't mean they are currently employed in that industry.  For example, you could study "History" but be employed in finance making upwards of 600K.

**Salaries Cap Out at 600K**

According to the documentation, very high earners are pushed into larger groups.  If I understand the documentation correctly, this is done to protect identity.

In [None]:
# California is 6
ss=6 # Select state
t=d[(d['FOD1P'].isin(g[g['ST']==ss]['FOD1P'].head(50).values)) &
 (d['ST']==ss)]

# Use this to check data
t.head()

tt=t.groupby(['ST','FOD1P','major']).count().reset_index()

tt=tt[['ST','FOD1P','major','SEX']]
tt.columns = ['ST','FOD1P','major','count']
tt[tt['ST']==ss].head(5)

tx=pd.merge(tt, g, left_on = ['FOD1P','ST','major'], right_on = ['FOD1P','ST','major'])
tx.head()


In [None]:
plt.style.use('fivethirtyeight')
# Careful - using t and tx. Keep data sorted
ax = sns.stripplot(x="FOD1P", y="INCOME", data=t);
ax.set_xticklabels(ax.xaxis.get_majorticklabels(), rotation=90);
ax.set_xlabel('Primary Major in College')
plt.ylim(50000, 800000)
plt.yticks([50000,200000,400000,600000], ['$50K', '$200K', '$400K', '$600K'])

# ss is the selected state
xtickName=tx[tx['ST']==ss]['major'].head(50).tolist()

# Use len(xtickName) so this still works with fewer than 50 labels
plt.xticks(range(len(xtickName)),xtickName,fontsize=9)


plt.title("California\nTop Earners",  fontsize=16,
            fontstyle='italic', fontweight='bold');

In [None]:
ss=42 # Select state
t=d[(d['FOD1P'].isin(g[g['ST']==ss]['FOD1P'].head(50).values)) &
 (d['ST']==ss)]

# Use this to check data
t.head()

tt=t.groupby(['ST','FOD1P','major']).count().reset_index()

tt=tt[['ST','FOD1P','major','SEX']]
tt.columns = ['ST','FOD1P','major','count']
tt[tt['ST']==ss].head(5)

tx=pd.merge(tt, g, left_on = ['FOD1P','ST','major'], right_on = ['FOD1P','ST','major'])
tx.head(20)

In [None]:
plt.style.use('fivethirtyeight')
# Careful - using t and tx. Keep data sorted
ax = sns.stripplot(x="FOD1P", y="INCOME", data=t);
ax.set_xticklabels(ax.xaxis.get_majorticklabels(), rotation=90);
ax.set_xlabel('Primary Major in College')
plt.ylim(50000, 800000)
plt.yticks([50000,200000,400000,600000], ['$50K', '$200K', '$400K', '$600K'])

xtickName=tx[tx['ST']==ss]['major'].head(50).tolist()

# Use len(xtickName) so this still works with fewer than 50 labels
plt.xticks(range(len(xtickName)),xtickName,fontsize=9)


plt.title("Pennsylvania\nTop Earners",  fontsize=16,
            fontstyle='italic', fontweight='bold');

In [None]:
ss=34 # Select state
t=d[(d['FOD1P'].isin(g[g['ST']==ss]['FOD1P'].head(50).values)) &
 (d['ST']==ss)]

# Use this to check data
t.head()

tt=t.groupby(['ST','FOD1P','major']).count().reset_index()

tt=tt[['ST','FOD1P','major','SEX']]
tt.columns = ['ST','FOD1P','major','count']
tt[tt['ST']==ss].head(5)

tx=pd.merge(tt, g, left_on = ['FOD1P','ST','major'], right_on = ['FOD1P','ST','major'])
tx.head(20)

In [None]:
plt.style.use('fivethirtyeight')
# Careful - using t and tx. Keep data sorted
ax = sns.stripplot(x="FOD1P", y="INCOME", data=t);
ax.set_xticklabels(ax.xaxis.get_majorticklabels(), rotation=90);
ax.set_xlabel('Primary Major in College')
plt.ylim(50000, 800000)
plt.yticks([50000,200000,400000,600000], ['$50K', '$200K', '$400K', '$600K'])

xtickName=tx[tx['ST']==ss]['major'].head(50).tolist()

# Use len(xtickName) so this still works with fewer than 50 labels
plt.xticks(range(len(xtickName)),xtickName,fontsize=9)


plt.title("New Jersey\nTop Earners",  fontsize=16,
            fontstyle='italic', fontweight='bold');

In [None]:
ss=10 # Select state
t=d[(d['FOD1P'].isin(g[g['ST']==ss]['FOD1P'].head(50).values)) &
 (d['ST']==ss)]

# Use this to check data
t.head()

tt=t.groupby(['ST','FOD1P','major']).count().reset_index()

tt=tt[['ST','FOD1P','major','SEX']]
tt.columns = ['ST','FOD1P','major','count']
tt[tt['ST']==ss].head(5)

tx=pd.merge(tt, g, left_on = ['FOD1P','ST','major'], right_on = ['FOD1P','ST','major'])
tx.head(20)

In [None]:
plt.style.use('fivethirtyeight')
# Careful - using t and tx. Keep data sorted
ax = sns.stripplot(x="FOD1P", y="INCOME", data=t);
ax.set_xticklabels(ax.xaxis.get_majorticklabels(), rotation=90);
ax.set_xlabel('Primary Major in College')
plt.ylim(50000, 800000)
plt.yticks([50000,200000,400000,600000], ['$50K', '$200K', '$400K', '$600K'])

xtickName=tx[tx['ST']==ss]['major'].head(50).tolist()

# Use len(xtickName) so this still works with fewer than 50 labels
plt.xticks(range(len(xtickName)),xtickName,fontsize=9)


plt.title("Delaware\nTop Earners",  fontsize=16,
            fontstyle='italic', fontweight='bold');
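
The per-state cells above repeat the same preparation steps with a different `ss`. A small helper could cut down on the copy and paste. This is just a sketch: `top_groups_for_state` is a hypothetical name, and it assumes `g` is already sorted so that the highest 75th percentile comes first within each state, as in the cells above. The toy frames are made up.

```python
import pandas as pd

def top_groups_for_state(d, g, state, key, n=50):
    """Return (t, tx): the rows of d in the state's top-n groups, and per-group counts merged with g."""
    # g is assumed sorted with the best-paying groups first, so head(n) picks the top n
    top_keys = g.loc[g["ST"] == state, key].head(n)
    t = d[(d["ST"] == state) & (d[key].isin(top_keys))]
    counts = t.groupby(["ST", key]).size().reset_index(name="count")
    tx = counts.merge(g, on=["ST", key])
    return t, tx

# Toy data (hypothetical values)
d_toy = pd.DataFrame({
    "ST": [6, 6, 6, 42],
    "FOD1P": [1, 1, 2, 1],
    "INCOME": [10, 20, 30, 40],
})
g_toy = pd.DataFrame({
    "FOD1P": [2, 1, 1],
    "ST": [6, 6, 42],
    "percentile_75": [30.0, 17.5, 40.0],
})
t_toy, tx_toy = top_groups_for_state(d_toy, g_toy, state=6, key="FOD1P")
print(tx_toy)
```

With the real data the call would look like `t, tx = top_groups_for_state(d, g, ss, 'FOD1P')` for majors, or with `'OCCP'` for occupations.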

In [None]:
# Hack for showcode
s="<br>"*1000
HTML(s)