kevinidea/Predict_Income

Predict_Income

Predict a person's income with scikit-learn (sklearn) and object-oriented programming in Python.

Assumptions:

  1. Training dataset: the Adult Data Set, available from UCI at https://archive.ics.uci.edu/ml/datasets/Adult

  2. Testing dataset: adult.test.txt, also available from UCI

  3. The outcome is a binary classification: the person makes either <=50K or >50K annually
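
As a rough illustration of these assumptions, the sketch below reads rows in the raw comma-separated layout with pandas and derives a 0/1 target from the income label. It is not taken from this repo's code: the `load_adult` helper and the `high_income` column are hypothetical names, the two sample rows are illustrative stand-ins for the downloaded file, and the column names follow the attribute list published with the UCI data set. It also assumes that labels in the test file may end with a period, which is stripped before comparison.

```python
import io

import pandas as pd

# Column names from the UCI attribute list (the raw files have no header row).
COLUMNS = [
    "age", "workclass", "fnlwgt", "education", "education-num",
    "marital-status", "occupation", "relationship", "race", "sex",
    "capital-gain", "capital-loss", "hours-per-week", "native-country",
    "income",
]

# Illustrative rows in the raw layout; a stand-in for the real file.
SAMPLE = (
    "39, State-gov, 77516, Bachelors, 13, Never-married, Adm-clerical, "
    "Not-in-family, White, Male, 2174, 0, 40, United-States, <=50K\n"
    "52, Self-emp-not-inc, 209642, HS-grad, 9, Married-civ-spouse, "
    "Exec-managerial, Husband, White, Male, 0, 0, 45, United-States, >50K.\n"
)


def load_adult(source):
    """Hypothetical helper: read Adult-style rows and add a binary target."""
    df = pd.read_csv(
        source,
        header=None,
        names=COLUMNS,
        skipinitialspace=True,  # values are written as ", value"
        na_values="?",          # missing values are marked with "?"
    )
    # Assumption: test-file labels may carry a trailing period.
    df["income"] = df["income"].str.rstrip(".")
    # 1 means the person makes >50K, 0 means <=50K.
    df["high_income"] = (df["income"] == ">50K").astype(int)
    return df


df = load_adult(io.StringIO(SAMPLE))
print(df[["age", "income", "high_income"]])
```

Passing a file path instead of the `io.StringIO` object would read a downloaded copy of the data the same way.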

The project runs in 4 stages:

  1. Data Preparation: download, extract, clean and store the data. Result: adult.csv, cleanData.csv and cleanData.sqlite

  2. Data Exploration: visualize the distribution of each variable and its relationship with the income outcome. Result: all the PNG images

  3. Data Modeling: build a predictive model with logistic regression, then improve on it with a random forest classifier

  4. Prediction: transform the test data so it can be fed into the predictive model
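
The modeling and prediction stages can be sketched roughly as follows. This is a minimal illustration, not the code in Main.py: the tiny frames are made up, the `encode` helper is a hypothetical name, and one-hot encoding with `pd.get_dummies` is an assumed choice of transformation. The point it shows is that the test data has to be encoded into exactly the same columns as the training data before either model can predict on it.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression

# Tiny made-up frames standing in for the cleaned training and test data.
train = pd.DataFrame({
    "age": [25, 38, 52, 46, 23, 60, 31, 44],
    "education": ["HS-grad", "Bachelors", "Masters", "Bachelors",
                  "HS-grad", "Masters", "HS-grad", "Bachelors"],
    "high_income": [0, 0, 1, 1, 0, 1, 0, 1],
})
test = pd.DataFrame({
    "age": [29, 55],
    "education": ["Doctorate", "Masters"],  # "Doctorate" is unseen in train
})


def encode(frame, columns=None):
    """Hypothetical helper: one-hot encode, optionally aligned to `columns`."""
    X = pd.get_dummies(frame, columns=["education"], dtype=int)
    if columns is not None:
        # Drop categories the model never saw; add missing ones as zeros.
        X = X.reindex(columns=columns, fill_value=0)
    return X


X_train = encode(train.drop(columns="high_income"))
y_train = train["high_income"]
X_test = encode(test, columns=X_train.columns)

# Baseline first, then the random forest as the candidate improvement.
models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

predictions = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    predictions[name] = model.predict(X_test)
    print(name, predictions[name])
```

Reindexing the test frame against the training columns keeps the feature order stable, so a category that appears only in the test data does not change the input shape the models expect.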

Running this program is as simple as 1, 2, 3:

  1. Pull/download this repo

  2. Make sure you have Python 2 and the NumPy, pandas and scikit-learn (sklearn) modules installed

  3. Run the Main.py script (browse to the project directory and run the command python Main.py)
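
Before running Main.py, a quick check along these lines can confirm that the required modules are importable. It is only a convenience sketch and not part of the repo; `missing_modules` is a hypothetical helper, and the import names `numpy`, `pandas` and `sklearn` are the usual ones for those packages.

```python
import importlib


def missing_modules(names):
    """Hypothetical helper: return the names that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


# An empty list means all three modules are available.
print(missing_modules(["numpy", "pandas", "sklearn"]))
```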
