Projects that use regression as the primary prediction method.
The assignment is based on cross-section data originating from the March 1988 Current Population Survey by the US Census Bureau. The sample consists of men aged 18 to 70 with positive annual income greater than USD 50 in 1992, who are neither self-employed nor working without pay. Wages are deflated by the deflator of Personal Consumption Expenditure for 1992. The data are provided as a data frame containing 28,155 observations on 7 variables:
- Wage: Wage (in dollars per week).
- Education: Number of years of education.
- Experience: Number of years of potential work experience.
- Ethnicity: Factor with levels "cauc" and "afam" (African-American).
- Smsa: Factor. Does the individual reside in a Standard Metropolitan Statistical Area (SMSA)?
- Region: Factor with levels "northeast", "midwest", "south", "west".
- Parttime: Factor. Does the individual work part-time?

Project: Build an appropriate model to predict wages from the data provided in the dataset. The assignment should include exploratory data analysis, data preparation, building a predictive model for wages, and measuring accuracy with metrics appropriate to the algorithm used.
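
The workflow described above (prepare the data, fit a regression, measure accuracy on held-out rows) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the project's actual solution: it uses NumPy alone, models the logarithm of `Wage` (a common choice for wage data, assumed here), and runs on a small synthetic sample generated by the hypothetical helper `make_synthetic_sample`, which merely mimics some of the column names and contains no real survey values. All function names are illustrative.

```python
import numpy as np


def make_synthetic_sample(n=500, seed=0):
    """Hypothetical stand-in data; NOT real Current Population Survey values."""
    rng = np.random.default_rng(seed)
    education = rng.integers(8, 19, size=n).astype(float)    # years, 8..18
    experience = rng.integers(0, 41, size=n).astype(float)   # years, 0..40
    parttime = rng.random(n) < 0.1                           # about 10% part-time
    # Assumed data-generating process, chosen only so the example has signal.
    log_wage = (4.5 + 0.08 * education + 0.03 * experience
                - 0.0005 * experience ** 2 - 0.9 * parttime
                + rng.normal(0.0, 0.5, size=n))
    return {"Wage": np.exp(log_wage), "Education": education,
            "Experience": experience, "Parttime": parttime}


def design_matrix(data):
    """Columns: intercept, education, experience, experience^2, part-time dummy."""
    experience = data["Experience"]
    return np.column_stack([
        np.ones(len(experience)),
        data["Education"],
        experience,
        experience ** 2,
        data["Parttime"].astype(float),
    ])


def fit_ols(X, y):
    """Ordinary least squares coefficients via a least-squares solve."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def r_squared(y_true, y_pred):
    """Coefficient of determination."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return float(1.0 - ss_res / ss_tot)


def run_example(seed=0):
    data = make_synthetic_sample(seed=seed)
    X = design_matrix(data)
    y = np.log(data["Wage"])  # model log-wage rather than the raw wage
    # Hold out the last 20% of rows; the synthetic rows are already in random order.
    n_train = int(0.8 * len(y))
    coef = fit_ols(X[:n_train], y[:n_train])
    pred = X[n_train:] @ coef
    return {"coef": coef,
            "rmse": rmse(y[n_train:], pred),
            "r2": r_squared(y[n_train:], pred)}


if __name__ == "__main__":
    result = run_example()
    print("coefficients:", np.round(result["coef"], 4))
    print("test RMSE (log scale):", round(result["rmse"], 3))
    print("test R^2 (log scale):", round(result["r2"], 3))
```

RMSE and R² are reported on held-out rows because both are standard accuracy measures for a regression fitted by least squares. On the real data, the remaining factors (Ethnicity, Smsa, Region) would need dummy encoding in the same way as Parttime, and the rows should be shuffled before splitting.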