Skip to content

Lakshmikeerthi-22/Model-to-predict-whether-a-person-makes-over-50k-dollars-a-year

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

5 Commits
 
 
 
 

Repository files navigation

Classification model to predict whether a person makes over $50k a year

Problem statement :

Create a classification model to predict whether a person makes over $50k a year.

Context :

This data was extracted from the 1994 Census bureau database by Ronny Kohavi and Barry Becker (Data Mining and Visualization, Silicon Graphics).

Details of features :

The columns are described as follows:

  • Age
  • Workclass
  • Fnlwgt
  • Education
  • education_num
  • marital_status
  • occupation
  • relationship
  • race
  • sex
  • capital_gain
  • capital_loss
  • native_country
  • income

Dataset :

https://drive.google.com/file/d/1iT33AiIyE2_vg8eMCtIDPJPJfKkjvdQh/view?usp=sharing

Steps to create the model :

  • Data Collection : Gather a dataset that includes features such as age, education, occupation, marital status, and capital gain/loss, as well as a label indicating whether the person makes over $50k a year.

  • Data Preprocessing : Preprocess the dataset by removing missing values, converting categorical variables to numerical values, and scaling/normalizing the features as needed.

  • Model Selection : Choose a classification algorithm to use for the model, such as logistic regression, decision tree, random forest, KNN or support vector machines (SVM).

  • Model Training : Split the preprocessed dataset into training and testing sets, and use the training set to train the chosen classification model.

  • Model Evaluation : Use the trained model to make predictions on the testing set, and evaluate its performance using metrics such as accuracy, precision, recall, and F1 score.

Results :

image