# What are the top 3 fields of study that yielded the highest median employment income 2 years after graduation in each year the study was conducted?


**Introduction:** The median employment income of graduates 2 years after graduation in Canada varies across fields of study. This analysis identifies the top 3 fields of study with the highest median employment income 2 years after graduation, for each year in which the study was conducted.

## Import Data

In [0]:
import pandas as pd

In [0]:
data = pd.read_csv("graduate_income.csv")

In [0]:
data.head()

Unnamed: 0,REF_DATE,GEO,DGUID,Educational qualification,Field of study,Gender,Age group,Status of student in Canada,Characteristics after graduation,Graduate statistics,UOM,UOM_ID,SCALAR_FACTOR,SCALAR_ID,VECTOR,COORDINATE,VALUE,STATUS,SYMBOL,TERMINATED,DECIMALS
0,2010,Canada,2016A11124,"Total, educational qualification","Total, field of study","Total, gender",15 to 64 years,Canadian and international students,All graduates,Number of graduates,Number,223,units,0,v1007918655,1.1.1.1.1.1.1.1,321930.0,,,,0
1,2010,Canada,2016A11124,"Total, educational qualification","Total, field of study","Total, gender",15 to 64 years,Canadian and international students,Graduates with no income information,Number of graduates,Number,223,units,0,v1007918656,1.1.1.1.1.1.2.1,57460.0,,,,0
2,2010,Canada,2016A11124,"Total, educational qualification","Total, field of study","Total, gender",15 to 64 years,Canadian and international students,Full-time students,Number of graduates,Number,223,units,0,v1007918657,1.1.1.1.1.1.3.1,50920.0,,,,0
3,2010,Canada,2016A11124,"Total, educational qualification","Total, field of study","Total, gender",15 to 64 years,Canadian and international students,Graduates reporting employment income,Number of graduates,Number,223,units,0,v1007918658,1.1.1.1.1.1.4.1,213545.0,,,,0
4,2010,Canada,2016A11124,"Total, educational qualification","Total, field of study","Total, gender",15 to 64 years,Canadian and international students,"Graduates reporting wages, salaries and commis...",Number of graduates,Number,223,units,0,v1007918659,1.1.1.1.1.1.5.1,190120.0,,,,0


## Filter Data

In [0]:
#How many years of data are there? 
print(data['REF_DATE'].unique())

[2010 2011 2012 2013 2014 2015]


In [0]:
#Filter for the relevant data and rename columns of interest
data_filtered = data[(data['GEO'] == 'Canada') &
                     (data['Field of study'] != "Total, field of study") &
                     (data['Gender'] == "Total, gender") &
                     (data['Age group'] == "15 to 64 years") &
                     (data['Status of student in Canada'] == "Canadian and international students") &
                     (data['Characteristics after graduation'] == "Graduates reporting employment income") &
                     (data['UOM'] == "2017 constant dollars")]

#Assign the result of rename instead of using inplace=True, which raises SettingWithCopyWarning on a filtered slice
data_filtered = data_filtered.rename(columns={'REF_DATE': 'Year', 'VALUE': 'Median Employment Income'})


In [0]:
data_filtered.head()

Unnamed: 0,Year,GEO,DGUID,Educational qualification,Field of study,Gender,Age group,Status of student in Canada,Characteristics after graduation,Graduate statistics,UOM,UOM_ID,SCALAR_FACTOR,SCALAR_ID,VECTOR,COORDINATE,Median Employment Income,STATUS,SYMBOL,TERMINATED,DECIMALS
5389,2010,Canada,2016A11124,"Career, technical or professional training sho...",Education [1],"Total, gender",15 to 64 years,Canadian and international students,Graduates reporting employment income,Median employment income,2017 constant dollars,387,units,0,v1007944759,1.4.2.1.1.1.4.2,30400.0,,,,0
5505,2010,Canada,2016A11124,"Career, technical or professional training sho...",Education [13],"Total, gender",15 to 64 years,Canadian and international students,Graduates reporting employment income,Median employment income,2017 constant dollars,387,units,0,v1007939971,1.4.3.1.1.1.4.2,30400.0,,,,0
5621,2010,Canada,2016A11124,"Career, technical or professional training sho...","Visual and performing arts, and communications...","Total, gender",15 to 64 years,Canadian and international students,Graduates reporting employment income,Median employment income,2017 constant dollars,387,units,0,v1007944885,1.4.4.1.1.1.4.2,27000.0,,,,0
5747,2010,Canada,2016A11124,"Career, technical or professional training sho...",Communications technologies/technicians and su...,"Total, gender",15 to 64 years,Canadian and international students,Graduates reporting employment income,Median employment income,2017 constant dollars,387,units,0,v1007939593,1.4.5.1.1.1.4.2,25700.0,,,,0
5871,2010,Canada,2016A11124,"Career, technical or professional training sho...",Visual and performing arts [50],"Total, gender",15 to 64 years,Canadian and international students,Graduates reporting employment income,Median employment income,2017 constant dollars,387,units,0,v1007944003,1.4.6.1.1.1.4.2,28400.0,,,,0


In [0]:
#Divide the dataset into the different years in which data was collected (2010, 2011, 2012, 2013, 2014, 2015)

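#Build each mask from data_filtered itself (REF_DATE was renamed to Year above) so the mask index matches the frame being filtered

data2010 = data_filtered[data_filtered['Year'] == 2010]

data2011 = data_filtered[data_filtered['Year'] == 2011]

data2012 = data_filtered[data_filtered['Year'] == 2012]

data2013 = data_filtered[data_filtered['Year'] == 2013]

data2014 = data_filtered[data_filtered['Year'] == 2014]

data2015 = data_filtered[data_filtered['Year'] == 2015]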
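
As a possible alternative to keeping one variable per year, the split can be sketched with `groupby`, which yields one sub-frame per distinct value of `Year`. The frame `toy` and the dictionary `frames_by_year` are hypothetical names, and the values are made up to stand in for `data_filtered`; this is an illustration under those assumptions, not the notebook's own method.

```python
import pandas as pd

# Toy stand-in for data_filtered; the values are invented for illustration.
toy = pd.DataFrame({
    "Year": [2010, 2010, 2011, 2011, 2011],
    "Field of study": ["A", "B", "A", "B", "C"],
    "Median Employment Income": [50000.0, 42000.0, 51000.0, 43000.0, 39000.0],
})

# Iterating over a groupby yields (key, sub-frame) pairs, so a dict
# comprehension gives one DataFrame per year, keyed by the year.
frames_by_year = {year: frame for year, frame in toy.groupby("Year")}

# Look up a single year much like data2011 in the cell above.
print(frames_by_year[2011])
```

A dictionary keyed by year keeps working if more years are added to the source file, whereas separate variables need a new line for each year.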

  

### 2010 Top 3 Fields of Study That Yielded Highest Income

In [0]:
#To find the top 3 fields that yielded the highest median employment income, order median employment income in descending order
#Then select only the first 3 rows, which represent the 3 highest median employment income in that year
#Only columns of interest were extracted

data2010_descending = data2010[['Year','Field of study','Median Employment Income']].sort_values('Median Employment Income', ascending=False)   
data2010_descending.head(n=3)

Unnamed: 0,Year,Field of study,Median Employment Income
48855,2010,Health and related fields [10],198200.0
48975,2010,"Dental, medical and veterinary residency progr...",198200.0
66432,2010,"Business, management, marketing and related su...",114000.0


### 2011 Top 3 Fields of Study That Yielded Highest Income

In [0]:
data2011_descending = data2011[['Year','Field of study','Median Employment Income']].sort_values('Median Employment Income', ascending=False)   
data2011_descending.head(n=3)

Unnamed: 0,Year,Field of study,Median Employment Income
391392,2011,"Dental, medical and veterinary residency progr...",193600.0
391268,2011,Health and related fields [10],193600.0
408998,2011,"Business, management, marketing and related su...",104300.0


### 2012 Top 3 Fields of Study That Yielded Highest Income

In [0]:
data2012_descending = data2012[['Year','Field of study','Median Employment Income']].sort_values('Median Employment Income', ascending=False)   
data2012_descending.head(n=3)

Unnamed: 0,Year,Field of study,Median Employment Income
752584,2012,Health and related fields [10],164500.0
752707,2012,"Dental, medical and veterinary residency progr...",164500.0
770017,2012,"Business, management, marketing and related su...",103200.0


### 2013 Top 3 Fields of Study That Yielded Highest Income



In [0]:
data2013_descending = data2013[['Year','Field of study','Median Employment Income']].sort_values('Median Employment Income', ascending=False)
data2013_descending.head(n=3) 

Unnamed: 0,Year,Field of study,Median Employment Income
1123808,2013,"Dental, medical and veterinary residency progr...",146600.0
1123686,2013,Health and related fields [10],146600.0
1130490,2013,Security and protective services [43],109300.0


### 2014 Top 3 Fields of Study That Yielded Highest Income

In [0]:
data2014_descending = data2014[['Year','Field of study','Median Employment Income']].sort_values('Median Employment Income', ascending=False)   
data2014_descending.head(n=3)

Unnamed: 0,Year,Field of study,Median Employment Income
1495901,2014,Health and related fields [10],145900.0
1496027,2014,"Dental, medical and veterinary residency progr...",145900.0
1513439,2014,"Business, management, marketing and related su...",103800.0


### 2015 Top 3 Fields of Study That Yielded Highest Income

In [0]:
data2015_descending = data2015[['Year','Field of study','Median Employment Income']].sort_values('Median Employment Income', ascending=False) 
data2015_descending.head(n=3)

Unnamed: 0,Year,Field of study,Median Employment Income
1871136,2015,"Dental, medical and veterinary residency progr...",139500.0
1871010,2015,Health and related fields [10],139500.0
1856053,2015,Public administration and social service profe...,105800.0
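
The six per-year cells above repeat the same sort-and-head step, so they could plausibly be collapsed into a single pass. The sketch below assumes a frame with the same three columns as `data_filtered`; `toy`, `top3` and `appearances` are hypothetical names and the income values are invented. Tied values are kept in whatever order the sort leaves them.

```python
import pandas as pd

# Toy stand-in for data_filtered; the values are invented for illustration.
toy = pd.DataFrame({
    "Year": [2010] * 4 + [2011] * 4,
    "Field of study": ["A", "B", "C", "D", "A", "B", "C", "D"],
    "Median Employment Income": [90.0, 80.0, 70.0, 60.0, 95.0, 60.0, 85.0, 75.0],
})

# Sort by year, then by income from highest to lowest, and keep the
# first 3 rows of each year group.
top3 = (
    toy.sort_values(["Year", "Median Employment Income"], ascending=[True, False])
       .groupby("Year")
       .head(3)
)

# Count how many years each field appears in the top 3.
appearances = top3["Field of study"].value_counts()

print(top3)
print(appearances)
```

The tally in `appearances` gives a direct check on whether the same fields rank in the top 3 in every year.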


**Conclusion:**
**"Health and related fields"** and **"Dental, medical and veterinary residency programs"** tied for the highest median employment income 2 years after graduation in Canada in every year of the study (2010 to 2015). The third-ranked field varied by year: **"Business, management, marketing and related support services"** placed third in 2010, 2011, 2012 and 2014, **"Security and protective services"** placed third in 2013, and **"Public administration and social service profe..."** (name truncated in the output above) placed third in 2015.