Perform Big Data Analytics on UCI Census Income Dataset for Income Prediction.
DataAnalysis.R
Decisiontreebigdata.R
MyData.csv
README.md
Readme.txt
census.csv
census2.csv


Census-Income-BigData-Analytics


Steps to run the code:

  1. For data analysis, run DataAnalysis.R, which takes census.csv (48842 rows) as input.
  2. For the big data perspective, we replicated census.csv to twice its size (97684 rows) and named the result census2.csv. Decisiontreebigdata.R takes census2.csv as input. It also uses MyData.csv, which is the processed and cleaned form of census2.csv.
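
The replication in step 2 could be reproduced roughly as follows. This is a minimal sketch, not the code from the repository: `replicate_rows` is a hypothetical helper, the toy data frame and its column names are illustrative assumptions, and the commented-out lines assume census.csv sits in the working directory and has a header row.

```r
# Hypothetical helper (not from the repository): stack `times` copies of a data frame.
replicate_rows <- function(df, times = 2) {
  idx <- rep(seq_len(nrow(df)), times = times)  # e.g. 1 2 3 1 2 3 for 3 rows
  out <- df[idx, , drop = FALSE]
  rownames(out) <- NULL  # reset the row names changed by repeated indexing
  out
}

# Toy stand-in for census.csv; the column names are assumptions for illustration.
toy <- data.frame(
  age = c(39, 50, 38),
  income = c("<=50K", ">50K", "<=50K"),
  stringsAsFactors = FALSE
)
doubled <- replicate_rows(toy, times = 2)

# With the real file (assumed location and header row):
# census  <- read.csv("census.csv")
# census2 <- replicate_rows(census, times = 2)
# write.csv(census2, "census2.csv", row.names = FALSE)
```

Applied to the 48842-row census.csv with `times = 2`, this approach would yield the 97684 rows stated above.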

References:

  1. https://mathematicaforprediction.wordpress.com/2014/03/30/classification-and-association-rules-for-census-income-data/
  2. https://www.knowbigdata.com/blog/predicting-income-level-analytics-casestudy-r
  3. https://archive.ics.uci.edu/ml/datasets/Census+Income
  4. http://scg.sdsu.edu/dataset-adult_r/