PredictingWages_Regression

A Python implementation of the Module 2 case study from the MITProfessionalX course "Data Science: Data to Insights".

The case study is "Module 2 Case Study - Regression and prediction". The original performs linear regression on wage data in R; this repository reproduces it in Python using two different libraries.

Points of interest

The analysis itself is relatively simple (linear regression), but it shows how to carry it out with two different toolchains: sklearn, and patsy combined with statsmodels.
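As a rough illustration of the two approaches, the sketch below fits the same ordinary least squares model with both toolchains. It uses synthetic data, and the column names (`wage`, `exp1`, `female`) are hypothetical stand-ins, not necessarily the columns of data.csv or the code in the notebook.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from sklearn.linear_model import LinearRegression

# Synthetic stand-in for the wage data; column names are hypothetical.
rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({
    "exp1": rng.uniform(0, 30, n),    # years of experience
    "female": rng.integers(0, 2, n),  # 0/1 indicator
})
df["wage"] = 10 + 0.3 * df["exp1"] - 2.0 * df["female"] + rng.normal(0, 3, n)

# sklearn: pass the feature matrix and the target explicitly.
X = df[["exp1", "female"]]
sk_model = LinearRegression().fit(X, df["wage"])

# patsy + statsmodels: describe the model with an R-style formula;
# the intercept is added automatically.
sm_model = smf.ols("wage ~ exp1 + female", data=df).fit()

# Both libraries solve the same least squares problem,
# so the estimated coefficients should agree.
sk_coefs = np.r_[sk_model.intercept_, sk_model.coef_]
sm_coefs = sm_model.params[["Intercept", "exp1", "female"]].to_numpy()
print(np.allclose(sk_coefs, sm_coefs))
```

The formula interface is closer to the R original, while sklearn fits more naturally into a prediction pipeline; statsmodels additionally reports standard errors and adjusted R^2 through `sm_model.summary()`.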

The "flexible" model is also worth a look: its features are built from the existing ones by computing interactions between them, a task for which patsy was really handy.
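A minimal sketch of how patsy can generate interaction features from a formula. The variables (`exp1`, `female`, `college`) and the formula are illustrative assumptions; the notebook's "flexible" specification may differ.

```python
import numpy as np
import pandas as pd
from patsy import dmatrix

# Tiny hypothetical data set, only to show the shape of the design matrix.
df = pd.DataFrame({
    "exp1": [1.0, 5.0, 10.0, 20.0],
    "female": [0, 1, 0, 1],
    "college": [1, 1, 0, 0],
})

# (a + b + c)**2 expands to the main effects plus all two-way interactions;
# patsy names each interaction column "a:b".
X_flex = dmatrix("(exp1 + female + college)**2", data=df, return_type="dataframe")
print(list(X_flex.columns))
```

Each interaction column is the elementwise product of its two parent columns, so the same features could be built by hand, but the formula keeps the specification short and readable.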

Cross-validation does not fully make sense in this case, but it is interesting to see anyway.
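For reference, k-fold cross-validation of a linear regression can be sketched with sklearn as follows. The data are synthetic, and the choice of 5 shuffled folds is an assumption for illustration rather than the notebook's setting.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_val_score

# Synthetic regression problem (not the wage data).
rng = np.random.default_rng(1)
n = 400
X = rng.normal(size=(n, 3))
y = 1.0 + X @ np.array([2.0, -1.0, 0.5]) + rng.normal(0, 1.0, n)

# Five shuffled folds: each fold is held out once while the model
# is fitted on the remaining four.
cv = KFold(n_splits=5, shuffle=True, random_state=0)
r2_scores = cross_val_score(LinearRegression(), X, y, cv=cv, scoring="r2")

# sklearn reports the negated MSE so that larger is always better; flip the sign.
mse_scores = -cross_val_score(
    LinearRegression(), X, y, cv=cv, scoring="neg_mean_squared_error"
)
print(r2_scores.mean(), mse_scores.mean())
```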

Project description

Our goals are:

Predict wages using various characteristics of workers.
Assess predictive performance using adjusted MSE and adjusted R^2, and out-of-sample MSE and R^2.
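The metrics named above can be sketched with plain NumPy. This assumes one common degrees-of-freedom convention, with p counting all estimated coefficients including the intercept; the course and the notebook may use a slightly different adjustment. The data are synthetic and the 50/50 split is an arbitrary choice.

```python
import numpy as np

# Synthetic data: an intercept column plus k features.
rng = np.random.default_rng(2)
n, k = 1000, 4
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
beta = np.array([5.0, 1.0, -0.5, 0.25, 0.0])
y = X @ beta + rng.normal(0, 2.0, n)

def fit_ols(X, y):
    """Least squares coefficients for y ~ X (X already contains the intercept)."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def mse_r2(X, y, coef):
    """Mean squared error and R^2 of the predictions X @ coef on (X, y)."""
    resid = y - X @ coef
    mse = np.mean(resid ** 2)
    r2 = 1.0 - mse / np.var(y)  # np.var uses 1/n, so this equals 1 - RSS/TSS
    return mse, r2

# In-sample fit, then penalize for the p estimated parameters.
coef = fit_ols(X, y)
mse, r2 = mse_r2(X, y, coef)
p = X.shape[1]
adj_mse = mse * n / (n - p)                     # RSS / (n - p)
adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - p)   # 1 - [RSS/(n-p)] / [TSS/(n-1)]

# Out-of-sample: fit on a random half, evaluate on the held-out half.
idx = rng.permutation(n)
train, test = idx[: n // 2], idx[n // 2 :]
coef_tr = fit_ols(X[train], y[train])
oos_mse, oos_r2 = mse_r2(X[test], y[test], coef_tr)

print(adj_mse, adj_r2, oos_mse, oos_r2)
```

The adjusted figures correct the in-sample fit for the number of parameters, while the out-of-sample figures measure performance directly on data the model did not see; here the out-of-sample R^2 is computed against the mean of the held-out half.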

The data

The data come from the March Supplement of the U.S. Current Population Survey for 2012.

The analysis focuses on single (never married) workers whose education level is high school, some college, or college graduate.
The sample size is n ≈ 4,000.
The outcome Y is the hourly wage, and X contains various characteristics of the workers.

The notebook

Linear Regression.ipynb

