
IncomePredict is a Python-based data science project that analyzes income levels using data visualization, encoding techniques, and machine learning models.

RobbenWijanathan/IncomePredict

IncomePredict: A Simple Data Science Project

IncomePredict is a lightweight Python data science project, built in a Jupyter Notebook, that predicts income levels using data visualization, categorical encoding, and machine learning models.

Background

Why Income?

Primary Indicator

Income is a primary indicator of well-being in our daily lives, helping to meet our essential needs.

Various Factors

Income levels are shaped by many factors, such as education, occupation, and household composition, which influence how income is distributed among individuals.

Impact

Studying the factors that influence income deepens our understanding of how strongly each one affects income levels.


Features

1. Data Visualization & Insights

  • Generate graphical representations of income distribution.
  • Understand key variables affecting income levels.
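
As a rough illustration of the income-distribution idea, here is a dependency-free sketch that counts records per income bracket and renders a text bar chart. The `incomes` list copies the INCOME values from the example rows in this README; the function names `income_distribution` and `text_bar_chart` are hypothetical and are not taken from the project's notebook, which may well use a plotting library instead.

```python
from collections import Counter

# INCOME labels copied from the example rows in this README.
incomes = [
    "[75.000-", "[75.000-", "[75.000-", "-10.000)", "-10.000)",
    "[50.000-75.000)", "-10.000)", "[30.000-40.000)",
]


def income_distribution(labels):
    """Count how many records fall into each income bracket."""
    return Counter(labels)


def text_bar_chart(counts):
    """Render one line per bracket, most frequent first."""
    width = max(len(label) for label in counts)
    lines = []
    for label, n in counts.most_common():
        lines.append(f"{label.ljust(width)} | {'#' * n} ({n})")
    return "\n".join(lines)


counts = income_distribution(incomes)
print(text_bar_chart(counts))
```

The same counts could be passed to a plotting library to produce an actual chart; the text rendering only keeps the sketch free of external dependencies.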

2. Encoding & Decoding

  • Ordinal Encoding for ordered categorical variables.
  • One-Hot Encoding for categorical variables with no inherent order.
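
To make the two encodings concrete, the following is a minimal pure-Python sketch. It assumes AGE is treated as an ordered column and HOME.TYPE as an unordered one; the `AGE_ORDER` list only contains the brackets visible in the example rows, and the helper names are hypothetical. Libraries such as scikit-learn provide `OrdinalEncoder` and `OneHotEncoder` for the same purpose; whether the notebook uses them is not stated here.

```python
# Assumed ordering for AGE; only brackets seen in the README sample are listed.
AGE_ORDER = ["14-17", "18-24", "25-34", "45-54", "55-64"]


def ordinal_encode(values, order):
    """Map each category to its rank in `order` (0 = lowest)."""
    rank = {category: i for i, category in enumerate(order)}
    return [rank[v] for v in values]


def ordinal_decode(codes, order):
    """Invert ordinal_encode: map each rank back to its category."""
    return [order[c] for c in codes]


def one_hot_encode(values):
    """Return (categories, rows) with one 0/1 column per distinct category."""
    categories = sorted(set(values))
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, rows


ages = ["45-54", "25-34", "14-17", "18-24"]
codes = ordinal_encode(ages, AGE_ORDER)
print(codes)                             # [3, 2, 0, 1]
print(ordinal_decode(codes, AGE_ORDER))  # back to the original labels

home_types = ["House", "Apartment", "House"]
categories, rows = one_hot_encode(home_types)
print(categories)  # ['Apartment', 'House']
print(rows)        # [[0, 1], [1, 0], [0, 1]]
```

Ordinal codes preserve the ranking of the categories, while one-hot columns avoid implying any order between them.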

3. K-Fold Cross-Validation Prediction

  • Use K-Fold cross-validation to obtain a more reliable estimate of model performance.
  • Evaluate model quality with recall and precision metrics.
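
The sketch below shows, under simplifying assumptions, how K-Fold splitting and the precision and recall metrics work. It does not shuffle the data and does not train a model; the function names and the `"high"`/`"low"` labels are hypothetical and chosen only for illustration. In practice a library such as scikit-learn (for example its `KFold` class) would usually handle this.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs; each sample is in exactly one test fold."""
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


def precision_recall(y_true, y_pred, positive):
    """Compute precision and recall for the class named `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if p == positive and t == positive)
    fp = sum(1 for t, p in pairs if p == positive and t != positive)
    fn = sum(1 for t, p in pairs if p != positive and t == positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


folds = list(k_fold_indices(10, 3))
print([len(test) for _, test in folds])  # [4, 3, 3]

# Hypothetical labels and predictions for a single fold.
y_true = ["high", "low", "high", "low", "high"]
y_pred = ["high", "high", "low", "low", "high"]
precision, recall = precision_recall(y_true, y_pred, positive="high")
print(precision, recall)
```

Averaging the per-fold metrics gives a performance estimate that depends less on any single train-test split.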

4. Basic Data Merging & Splitting

  • Combine multiple datasets.
  • Perform train-test splits for machine learning models.
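
A minimal sketch of merging and splitting follows. It assumes "combine" means row-wise concatenation of datasets that share the same columns (a key-based join would look different), and the 80/20 ratio, the seed, and the helper names are illustrative assumptions rather than the project's actual settings.

```python
import random


def merge_datasets(*datasets):
    """Concatenate row lists that share the same columns."""
    merged = []
    for rows in datasets:
        merged.extend(rows)
    return merged


def train_test_split(rows, test_ratio=0.2, seed=42):
    """Shuffle a copy of `rows` reproducibly and return (train, test)."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_ratio))
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical miniature datasets with two of the README's columns.
part_a = [{"SEX": "F", "AGE": "45-54"} for _ in range(6)]
part_b = [{"SEX": "M", "AGE": "25-34"} for _ in range(4)]

rows = merge_datasets(part_a, part_b)
train, test = train_test_split(rows)
print(len(rows), len(train), len(test))  # 10 8 2
```

Fixing the seed keeps the split reproducible between notebook runs.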

Example Dataset

This project utilizes the income dataset from Selva86's Datasets:

"INCOME","SEX","MARITAL.STATUS","AGE","EDUCATION","OCCUPATION","AREA","DUAL.INCOMES","HOUSEHOLD.SIZE","UNDER18","HOUSEHOLDER","HOME.TYPE","ETHNIC.CLASS","LANGUAGE"
"[75.000-","F","Married","45-54","1 to 3 years of college","Homemaker","10+ years","No","Three","None","Own","House","White",NA
"[75.000-","M","Married","45-54","College graduate","Homemaker","10+ years","No","Five","Two","Own","House","White","English"
"[75.000-","F","Married","25-34","College graduate","Professional/Managerial","10+ years","Yes","Three","One","Rent","Apartment","White","English"
"-10.000)","F","Single","14-17","Grades 9 to 11","Student, HS or College","10+ years","Not Married","Four","Two","Family","House","White","English"
"-10.000)","F","Single","14-17","Grades 9 to 11","Student, HS or College","4-6 years","Not Married","Four","Two","Family","House","White","English"
"[50.000-75.000)","M","Married","55-64","1 to 3 years of college","Retired","10+ years","No","Two","None","Own","House","White","English"
"-10.000)","M","Single","18-24","Graduated High Scool","Unemployed","7-10 years","Not Married","Three","One","Rent","Apartment","White","English"
"[30.000-40.000)","M","Divorced","25-34","1 to 3 years of college","Factory Worker/Laborer/Driver","10+ years","Not Married","One","None","Rent","Apartment","White","English"

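The rows above can be parsed with Python's standard `csv` module, as in this sketch. It embeds the header and two of the sample rows as a string rather than assuming a file path, and it treats the unquoted `NA` token as a missing value; the `load_rows` helper is hypothetical. Note that the quoted string `"None"` in the UNDER18 column is a real category and is left untouched.

```python
import csv
import io

# Header plus two rows copied from the README sample.
SAMPLE = '''"INCOME","SEX","MARITAL.STATUS","AGE","EDUCATION","OCCUPATION","AREA","DUAL.INCOMES","HOUSEHOLD.SIZE","UNDER18","HOUSEHOLDER","HOME.TYPE","ETHNIC.CLASS","LANGUAGE"
"[75.000-","F","Married","45-54","1 to 3 years of college","Homemaker","10+ years","No","Three","None","Own","House","White",NA
"-10.000)","F","Single","14-17","Grades 9 to 11","Student, HS or College","10+ years","Not Married","Four","Two","Family","House","White","English"
'''


def load_rows(text):
    """Parse CSV text into dicts, converting the NA token to None."""
    reader = csv.DictReader(io.StringIO(text))
    rows = []
    for row in reader:
        rows.append({key: (None if value == "NA" else value) for key, value in row.items()})
    return rows


rows = load_rows(SAMPLE)
print(len(rows), "rows,", len(rows[0]), "columns")
print(rows[0]["LANGUAGE"])    # None (was NA)
print(rows[1]["OCCUPATION"])  # Student, HS or College
```

The quoted fields matter here: the comma inside "Student, HS or College" is kept as part of the value instead of splitting the column.
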
License

This project is not currently licensed. Feel free to use or contribute to it, but please contact me for clarification or permission before using it in a commercial setting.


Contributing

Feel free to fork this repository, open issues, or submit pull requests to improve IncomePredict!


Contact

📧 Email: robbenwijanathan@gmail.com
🐙 GitHub: RobbenWijanathan
