# Where it Pays to Attend College

##### Report by Matt Wong, Rhea Chen, ZK Lin, and Demi Tu

## Problem Overview

Our research fits into the broader problem domain because it addresses problems that are not only close to home but are also issues at the national level.

First of all, does it matter where students go to college? According to [this article](https://www.theatlantic.com/business/archive/2012/05/does-it-matter-where-you-go-to-college/257227/), the answer is yes when it comes to students’ future paychecks, and years of research support this. Of course, factors that differ among individual students will cause outcomes to vary, but [this research paper](https://cdn.theatlantic.com/static/mt/assets/business/Ehrenberg-JHR_Does_It_Pay_to_Attend_an_Elite_Private_College.pdf) supports the claim that the more elite a school, the higher its alumni’s paychecks. University rankings matter a great deal, and the effect becomes more evident over time.

Settling on a college is not the only difficult decision students face. According to the book “The Undecided College Student: An Academic and Career Advising Challenge” by Virginia N. Gordon, about 20 to 50 percent of students enter college as “undecided”, and an estimated 75 percent of students change their major at least once before graduation. It is also important to note that “decided” students do not necessarily base their choice of major on factual research. A College Student Journal survey, later cited in [this journal](https://dus.psu.edu/mentor/2013/06/disconnect-choosing-major/), asked more than 800 students to elaborate on their career decision-making process. The factors that played a role included a general interest in the chosen subject; family and peer influence; and assumptions about introductory courses, potential job characteristics, and characteristics of the major.

While students might fall into a never-ending cycle when choosing colleges and majors, another problem we address is student loan debt, which relates directly to our target variable, salary. According to a [study](https://evolllution.com/attracting-students/todays_learner/newly-released-student-loan-data-bust-several-myths-about-student-loan-repayment/) that analyzes student loan data, “even among those who borrowed only for their undergraduate education...only half of students had paid off all their federal student loans 20 years after beginning college in 1995-96.” Instead, average borrowers in “this group still owed approximately $10,000 in principle and interest, about half of what was originally borrowed, 20 years after beginning college.” Because it is very difficult for recent graduates to find well-paying jobs at the outset of their careers, their ability to repay their student loans is significantly limited.

## Purpose



## Variables

### Data from Kaggle

- `School Name` - Name of college
- `School Type` - Type of college
- `Region` - Location of college by region
- `Undergraduate Major` - Undergraduate major
- `Starting Median Salary` - Median salary at the start of career
- `Mid-Career Median Salary` - Median salary at the middle of career
- `Percent change from Starting to Mid-Career Salary` - Change in salary (measured in percentage) from starting to mid-career
- `Mid-Career 10th Percentile Salary` - Mid-career salary of the 10th percentile
- `Mid-Career 25th Percentile Salary` - Mid-career salary of the 25th percentile
- `Mid-Career 75th Percentile Salary` - Mid-career salary of the 75th percentile
- `Mid-Career 90th Percentile Salary` - Mid-career salary of the 90th percentile
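
The percent-change column can be related to the two median salary columns with a short calculation. The sketch below assumes the usual definition (mid-career minus starting, divided by starting, times 100); the dataset's own documentation is not quoted here, so treat that definition as an assumption. The helper name `percent_change` is hypothetical and the numbers are made up for illustration.

```python
def percent_change(starting: float, mid_career: float) -> float:
    """Percent change from starting to mid-career salary (assumed definition)."""
    if starting <= 0:
        raise ValueError("starting salary must be positive")
    return (mid_career - starting) / starting * 100.0


# Illustrative values only, not taken from the Kaggle data
example = percent_change(50_000, 80_000)
print(round(example, 1))
```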

### External Data

- `name` - Institution name
- `location` - Location of college
- `rank` - National rank of college
- `description` - Snippet of text overview from U.S. News
- `tuition_and_fees` - Combined tuition and fees
- `in_state` - In-state tuition
- `undergrad_enrollment` - Number of enrolled undergraduate students
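
One way the two sources could be combined is a join on the school name, sketched below. This is an assumption about how the tables relate, not a description of the analysis in this report: the rows are hypothetical stand-ins, and in practice the names in `School Name` and `name` may not match exactly, which is why the sketch also lists the unmatched schools.

```python
import pandas as pd

# Hypothetical rows standing in for the Kaggle and U.S. News tables
salaries = pd.DataFrame({
    "School Name": ["Alpha University", "Beta College"],
    "Starting Median Salary": [50000.0, 45000.0],
})
rankings = pd.DataFrame({
    "name": ["Alpha University", "Gamma Institute"],
    "rank": [10, 25],
})

# Left join keeps every salary row; indicator=True adds a `_merge` column
# that records whether a matching ranking row was found
merged = salaries.merge(
    rankings, left_on="School Name", right_on="name", how="left", indicator=True
)

# Schools with salary data but no ranking match
unmatched = merged.loc[merged["_merge"] == "left_only", "School Name"].tolist()
print(unmatched)
```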

# Set up
import pandas as pd

# Import data
major = pd.read_csv('./data/degrees-that-pay-back.csv')
college_type = pd.read_csv('./data/salaries-by-college-type.csv')
college_region = pd.read_csv('./data/salaries-by-region.csv')
college_ranking = pd.read_csv('./data/national-universities-rankings.csv')
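
After loading, it can help to check whether the salary columns were read as numbers. The sketch below assumes the salary values are stored as text such as `$46,000.00`; that is an assumption about the CSV contents and can be checked with `major.dtypes`. The helper name `to_number` and the sample rows are hypothetical.

```python
import pandas as pd


def to_number(series: pd.Series) -> pd.Series:
    """Strip '$' and ',' from currency strings and convert to float."""
    cleaned = series.astype(str).str.replace(r"[$,]", "", regex=True)
    # Values that still cannot be parsed become NaN instead of raising
    return pd.to_numeric(cleaned, errors="coerce")


# Made-up sample rows, not values from the dataset
sample = pd.DataFrame(
    {"Starting Median Salary": ["$46,000.00", "$57,700.00", "N/A"]}
)
sample["Starting Median Salary"] = to_number(sample["Starting Median Salary"])
print(sample["Starting Median Salary"].tolist())
```

The same helper could then be applied to each salary column of `major`, `college_type`, and `college_region` if they turn out to be text.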