cihanyatbaz/Income-Status-Classification

Income Status Classification

About Us


The Goal and Dataset:

  • The dataset used in this project is an extract from the 1994 Census database. It contains demographic features such as age, workclass, education, education-num, marital-status, occupation, relationship, race, sex, capital-gain, capital-loss, hours-per-week, and native-country, plus final-weight (fnlwgt), a census sampling weight derived from demographic characteristics. Our main goal is to predict whether a person's income exceeds $50K/yr based on this census data.

  • For this machine learning project, we trained several supervised learning models: Gradient Boosting, SVM, Logistic Regression, Naive Bayes, and Decision Tree.

Dataset

Dataset: http://archive.ics.uci.edu/ml/datasets/Census+Income
Task: Supervised Learning / Binary Classification

Results:

  • After pre-processing and the other preparation steps, we computed ROC-AUC scores for each model.

ROC-AUC Comparison for Different Models
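As a reminder of what the ROC-AUC score measures, here is a minimal pure-Python sketch that computes it with the rank-sum (Mann-Whitney U) formulation. The function name `roc_auc` and the toy labels and scores are hypothetical and only illustrate the metric; they are not taken from the notebook, which may well use a library implementation instead.

```python
def roc_auc(y_true, scores):
    """ROC-AUC via the rank-sum formulation; tied scores share an average rank."""
    pairs = sorted(zip(scores, y_true))
    n = len(pairs)
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the end of the group of tied scores starting at position i.
        j = i
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1

    n_pos = sum(1 for _, y in pairs if y == 1)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("ROC-AUC needs at least one positive and one negative label")

    pos_rank_sum = sum(r for r, (_, y) in zip(ranks, pairs) if y == 1)
    # Fraction of (positive, negative) pairs where the positive is ranked higher.
    return (pos_rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


# Toy example (made-up values): 1 means income > $50K, 0 means <= $50K.
labels = [0, 0, 1, 1]
predicted_scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(labels, predicted_scores))
```

A score of 1.0 means every positive example is ranked above every negative one, while 0.5 corresponds to random ranking, which is why the metric is convenient for comparing models on an imbalanced target.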


  • Finally, we compared the scores obtained under different pre-processing choices.

  • Binary Encoding vs One-Hot Encoding

Binary Encoding vs One-Hot Encoding
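To make the difference between the two encodings concrete, the following sketch encodes one categorical column both ways in plain Python. The helper names `one_hot_encode` and `binary_encode` are hypothetical, the category codes are assigned alphabetically as an assumption, and the sample workclass values are only illustrative; the notebook may rely on a library encoder with a different code assignment.

```python
def one_hot_encode(values):
    """One column per category; exactly one 1 per row."""
    categories = sorted(set(values))
    index = {c: i for i, c in enumerate(categories)}
    rows = []
    for v in values:
        row = [0] * len(categories)
        row[index[v]] = 1
        rows.append(row)
    return rows, categories


def binary_encode(values):
    """Give each category an integer code, then write the code in binary digits."""
    categories = sorted(set(values))
    index = {c: i for i, c in enumerate(categories)}
    # Number of bits needed to represent the largest code (at least one column).
    width = max(1, (len(categories) - 1).bit_length())
    rows = [[(index[v] >> b) & 1 for b in reversed(range(width))] for v in values]
    return rows, width


# Illustrative workclass values.
workclass = ["Private", "State-gov", "Private", "Self-emp", "Federal-gov", "Local-gov"]

one_hot_rows, categories = one_hot_encode(workclass)
binary_rows, width = binary_encode(workclass)

print(len(categories))  # 5 one-hot columns
print(width)            # 3 binary columns
```

One-hot encoding grows linearly with the number of categories, while binary encoding grows logarithmically, so the gap matters most for high-cardinality columns such as native-country.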

  • Simple & Distribution Based Imputation Comparison

Simple & Distribution Based Imputation Comparison
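The two imputation strategies can be sketched as follows, under two assumptions that should be checked against the notebook: that "simple" imputation means filling with the most frequent value, and that "distribution based" imputation means sampling replacements in proportion to the observed category frequencies. The function names and the `"?"` missing-value marker are hypothetical choices for this illustration.

```python
import random
from collections import Counter


def impute_mode(values, missing="?"):
    """Simple imputation: replace every missing entry with the most frequent value."""
    observed = [v for v in values if v != missing]
    mode = Counter(observed).most_common(1)[0][0]
    return [mode if v == missing else v for v in values]


def impute_from_distribution(values, missing="?", seed=0):
    """Distribution-based imputation: sample replacements by observed frequency."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    counts = Counter(v for v in values if v != missing)
    categories = list(counts)
    weights = [counts[c] for c in categories]
    return [
        rng.choices(categories, weights=weights)[0] if v == missing else v
        for v in values
    ]


# Illustrative column with missing entries marked as "?".
workclass = ["Private", "?", "Private", "State-gov", "?", "Self-emp"]

print(impute_mode(workclass))
print(impute_from_distribution(workclass))
```

Mode imputation inflates the share of the most common category, whereas sampling from the observed distribution roughly preserves the original category proportions at the cost of adding randomness.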

  • Column Drop Comparison

Column Drop Comparison


*See the Jupyter notebook for a more detailed explanation of the project and the model comparisons.

Thank You
