# Comparison of Employment Across College Majors

This notebook compares employment statistics across majors for recent graduates and for the general population.
The first step is to import the data and confirm that the major categories are the same in both tables.

In [1]:
import pandas as pd
import numpy as np


# Set up the DataFrames used in this project
f_ages = 'all-ages.csv'
f_recent = 'recent-grads.csv'

all_ages = pd.read_csv(f_ages)
recent_grads = pd.read_csv(f_recent)

Confirm that `Major_category` contains the same set of values in both tables.

In [2]:
set(all_ages['Major_category'].unique()) == set(recent_grads['Major_category'].unique())

True
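Had the check returned `False`, the next question would be which categories differ. A minimal sketch using set differences follows; the category labels are made up for illustration and stand in for the two `Major_category` columns.

```python
# Hypothetical labels standing in for the unique 'Major_category' values
# of the all-ages table and the recent-grads table.
aa_categories = {'Arts', 'Business', 'Education'}
rg_categories = {'Arts', 'Business', 'Health'}

# Labels that appear in one table but not in the other.
only_in_all_ages = aa_categories - rg_categories
only_in_recent = rg_categories - aa_categories

print(sorted(only_in_all_ages))
print(sorted(only_in_recent))
```

Two empty results would correspond to the `True` seen above.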

In [3]:
cats = all_ages['Major_category'].unique()

# Total number of people per major category in each table
aa_cat_counts = {}
rg_cat_counts = {}

for cat in cats:
    aa_cat_counts[cat] = all_ages[all_ages['Major_category'] == cat]['Total'].sum()
    rg_cat_counts[cat] = recent_grads[recent_grads['Major_category'] == cat]['Total'].sum()
aa_cat_counts

{'Agriculture & Natural Resources': 632437,
 'Arts': 1805865,
 'Biology & Life Science': 1338186,
 'Business': 9858741,
 'Communications & Journalism': 1803822,
 'Computers & Mathematics': 1781378,
 'Education': 4700118,
 'Engineering': 3576013,
 'Health': 2950859,
 'Humanities & Liberal Arts': 3738335,
 'Industrial Arts & Consumer Services': 1033798,
 'Interdisciplinary': 45199,
 'Law & Public Policy': 902926,
 'Physical Sciences': 1025318,
 'Psychology & Social Work': 1987278,
 'Social Science': 2654125}

Compute how many more people each category has in the *all ages* table than in the *recent grads* table.

In [7]:
for cat in cats:
    print(cat,': ',aa_cat_counts[cat]-rg_cat_counts[cat])

Agriculture & Natural Resources :  556817.0
Biology & Life Science :  884324.0
Engineering :  3038430.0
Humanities & Liberal Arts :  3024867.0
Communications & Journalism :  1411221.0
Computers & Mathematics :  1482370.0
Industrial Arts & Consumer Services :  804006.0
Education :  4140989.0
Law & Public Policy :  723819.0
Interdisciplinary :  32903.0
Health :  2487629.0
Social Science :  2124159.0
Physical Sciences :  839839.0
Psychology & Social Work :  1506271.0
Arts :  1448735.0
Business :  8556365.0
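The per-category totals and their differences can also be computed without an explicit loop. The sketch below uses `groupby` on two small made-up tables (the `_demo` names and all numbers are illustrative, not taken from the CSV files); it is one possible alternative, not a replacement for the cells above.

```python
import pandas as pd

# Made-up rows for illustration; the real tables are loaded from the CSV files.
all_ages_demo = pd.DataFrame({
    'Major_category': ['Arts', 'Arts', 'Business'],
    'Total': [100, 50, 400],
})
recent_demo = pd.DataFrame({
    'Major_category': ['Arts', 'Business', 'Business'],
    'Total': [30, 60, 40],
})

# Sum 'Total' within each category; the result is a Series indexed by category.
aa_totals = all_ages_demo.groupby('Major_category')['Total'].sum()
rg_totals = recent_demo.groupby('Major_category')['Total'].sum()

# Subtraction aligns the two Series on the category label.
difference = aa_totals - rg_totals
print(difference.to_dict())
```

Because the subtraction aligns on the index, a category missing from one table would show up as `NaN` rather than being silently skipped.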


In [4]:
# Share of recent graduates, summed over all majors, who hold low-wage jobs
low_wage_proportion = recent_grads['Low_wage_jobs'].sum()/recent_grads['Total'].sum()
low_wage_proportion

0.09858891195563152

## Comparison of Unemployment Rates for Recent Grads vs General Population

In [9]:
majors = recent_grads['Major'].unique()

lower = []

rg_lower_count = 0
for major in majors:
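    # Each filter returns a one-element Series; take its single value as a float
    aa = float(all_ages.loc[all_ages['Major'] == major, 'Unemployment_rate'].iloc[0])
    rg = float(recent_grads.loc[recent_grads['Major'] == major, 'Unemployment_rate'].iloc[0])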
    if rg < aa:
        rg_lower_count += 1
        lower.append(major)
        
print(rg_lower_count)

44


**Note to Self:**
To compare `aa` and `rg`, each must first be reduced to a single *float*. The filters return one-element Series objects, and two Series with different index labels cannot be compared directly.
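One way to avoid the per-major cast altogether is to join the two tables on `Major` and compare the rate columns in a single vectorized step. This is a sketch under assumptions: the `_demo` tables, their values, and the `_aa`/`_rg` suffixes are invented for illustration.

```python
import pandas as pd

# Made-up unemployment rates for illustration only.
all_ages_demo = pd.DataFrame({
    'Major': ['MATHEMATICS', 'PHYSICS', 'NURSING'],
    'Unemployment_rate': [0.05, 0.04, 0.03],
})
recent_demo = pd.DataFrame({
    'Major': ['MATHEMATICS', 'PHYSICS', 'NURSING'],
    'Unemployment_rate': [0.04, 0.06, 0.02],
})

# Join on 'Major'; the suffixes distinguish the two rate columns.
merged = all_ages_demo.merge(recent_demo, on='Major', suffixes=('_aa', '_rg'))

# Keep majors where the recent-grad rate is below the all-ages rate.
is_lower = merged['Unemployment_rate_rg'] < merged['Unemployment_rate_aa']
lower_demo = merged.loc[is_lower, 'Major'].tolist()
print(lower_demo)
```

The default inner join drops any major that is missing from either table, so the comparison only covers majors present in both.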

The following 44 majors have a lower unemployment rate among recent grads than in the general population.

In [10]:
print(lower)

['PETROLEUM ENGINEERING', 'METALLURGICAL ENGINEERING', 'ASTRONOMY AND ASTROPHYSICS', 'ENGINEERING MECHANICS PHYSICS AND SCIENCE', 'INDUSTRIAL AND MANUFACTURING ENGINEERING', 'ARCHITECTURAL ENGINEERING', 'COURT REPORTING', 'MATERIALS ENGINEERING AND MATERIALS SCIENCE', 'MISCELLANEOUS FINE ARTS', 'INDUSTRIAL PRODUCTION TECHNOLOGIES', 'MATHEMATICS', 'PHYSICS', 'ENGINEERING AND INDUSTRIAL MANAGEMENT', 'MATHEMATICS AND COMPUTER SCIENCE', 'GENERAL AGRICULTURE', 'MISCELLANEOUS ENGINEERING TECHNOLOGIES', 'GENETICS', 'UNITED STATES HISTORY', 'PHYSICAL SCIENCES', 'MILITARY TECHNOLOGIES', 'CHEMISTRY', 'ELECTRICAL, MECHANICAL, AND PRECISION TECHNOLOGIES AND PRODUCTION', 'BOTANY', 'HUMAN RESOURCES AND PERSONNEL MANAGEMENT', 'GEOSCIENCES', 'SOCIAL PSYCHOLOGY', 'AREA ETHNIC AND CIVILIZATION STUDIES', 'SPECIAL NEEDS EDUCATION', 'NEUROSCIENCE', 'MULTI/INTERDISCIPLINARY STUDIES', 'ATMOSPHERIC SCIENCES AND METEOROLOGY', 'SOIL SCIENCE', 'MATHEMATICS TEACHER EDUCATION', 'HEALTH AND MEDICAL PREPARATORY PROG