Uses a Random Forest Classifier to identify the features (education, health, age, sex, race, etc.) that lead to high wages. Techniques include feature engineering (StringIndexing, One-Hot Encoding, VectorAssembler) and modeling (Random Forest Classifier with cross-validation and hyperparameter optimization over MaxDepth and number of trees).

harshdarji23/Wage-Analysis

Wage-Analysis

Wage analysis is the process of comparing salaries based on the attributes of the employee. Several other factors, such as the company and its location, also contribute to the wage. However, this project analyzes the Mid-Atlantic wage dataset, which is available at https://rdrr.io/cran/ISLR/man/Wage.html

Read the entire article and analysis on Medium: https://medium.com/towards-artificial-intelligence/will-your-education-pay-you-well-d6aeb44248fa

Libraries Used:

  1. PySpark
  2. Spark ML
  3. Pandas
  4. NumPy
  5. Spark SQL

Techniques:

  1. Feature Engineering: StringIndexing, One-Hot Encoding, VectorAssembler
  2. Modeling: Random Forest Classifier with cross-validation (CV) and hyperparameter optimization (MaxDepth, number of trees)
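
To make the feature-engineering steps concrete, here is a minimal plain-Python sketch of what the three transformations compute. It is an illustration only, not the code from this repository, and it does not call Spark ML. The helper names (`fit_string_index`, `one_hot`, `assemble`) and the sample rows are hypothetical. The sketch assumes Spark ML's default behaviour: StringIndexer gives index 0 to the most frequent label, and OneHotEncoder drops the last category so that it is encoded as all zeros.

```python
from collections import Counter


def fit_string_index(values):
    """Map each label to an index, most frequent label first.

    Ties are broken alphabetically here (an assumption of this sketch).
    """
    counts = Counter(values)
    ordered = sorted(counts, key=lambda label: (-counts[label], label))
    return {label: i for i, label in enumerate(ordered)}


def one_hot(index, n_categories, drop_last=True):
    """Encode an index as a 0/1 list.

    With drop_last=True the vector has n_categories - 1 slots and the
    final category is represented by all zeros.
    """
    size = n_categories - 1 if drop_last else n_categories
    vec = [0.0] * size
    if index < size:
        vec[index] = 1.0
    return vec


def assemble(*parts):
    """Concatenate scalars and lists into one flat feature vector."""
    features = []
    for part in parts:
        if isinstance(part, (list, tuple)):
            features.extend(part)
        else:
            features.append(float(part))
    return features


# Hypothetical (age, education) rows, not taken from the Wage dataset.
rows = [
    (25, "HS Grad"),
    (41, "College Grad"),
    (33, "HS Grad"),
    (52, "Advanced Degree"),
    (29, "HS Grad"),
    (47, "College Grad"),
]

# Step 1: string indexing of the categorical column.
education_index = fit_string_index(education for _, education in rows)

# Steps 2 and 3: one-hot encode the index, then assemble with the numeric column.
feature_vectors = [
    assemble(age, one_hot(education_index[education], len(education_index)))
    for age, education in rows
]
```

Each row ends up as a single numeric vector (age followed by the one-hot slots), which is the shape of input a classifier such as a random forest consumes.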
