Dataset: IBM HR Analytics Employee Attrition & Performance, a fictional data set created by IBM data scientists.
I analyzed IBM's HR Analytics dataset. The data covers 1,470 employees, with information on job satisfaction, work-life balance, tenure, experience, salary, and demographics. Below is a brief overview and summary statistics of the data.
I trained and compared two models:
- k-nearest neighbors (KNN)
- Decision tree
The best score was 0.85 for the decision tree and 0.84 for KNN, so the decision tree performed slightly better of the two.
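A comparison like this can be sketched with scikit-learn as follows. This is a minimal illustration, not the code used for the results above: the data is a synthetic stand-in generated with `make_classification` (the class balance is an arbitrary assumption), the hyperparameters are placeholders, and the metric is assumed to be accuracy because the summary does not say which score was reported.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the HR data (assumption): 1,470 rows, imbalanced classes.
X, y = make_classification(
    n_samples=1470,
    n_features=20,
    n_informative=8,
    weights=[0.84],  # assumed share of the majority class, for illustration only
    random_state=0,
)

# Placeholder hyperparameters; KNN is distance-based, so features are scaled first.
models = {
    "knn": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=15)),
    "decision_tree": DecisionTreeClassifier(max_depth=4, random_state=0),
}

# Mean 5-fold cross-validated accuracy for each model.
scores = {
    name: cross_val_score(model, X, y, cv=5, scoring="accuracy").mean()
    for name, model in models.items()
}

for name, score in scores.items():
    print(f"{name}: {score:.3f}")
```

With imbalanced classes, accuracy can look high even for a model that rarely predicts the minority class, so a metric such as F1 or ROC AUC (via the `scoring` argument) may be worth checking alongside it.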
What can be added:
- Visualizations for exploratory data analysis
- LogisticRegression Model
- RandomForestClassifier Model
- Compare all models
- Hypothesis testing:
  - Independent t-test
  - Chi-square test of independence
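The two hypothesis tests listed above could look roughly like this with SciPy. It is a sketch under stated assumptions: all numbers are invented for illustration and are not taken from the dataset, the variables (`income_left`, `income_stayed`, the overtime-by-attrition table) are hypothetical choices, and the t-test shown is Welch's variant of the independent two-sample t-test, which does not assume equal variances.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical samples (invented numbers): monthly income of employees
# who left versus employees who stayed.
income_left = rng.normal(loc=4800, scale=1500, size=200)
income_stayed = rng.normal(loc=6800, scale=2000, size=1000)

# Independent two-sample t-test; equal_var=False selects Welch's variant.
t_stat, t_p = stats.ttest_ind(income_left, income_stayed, equal_var=False)
print(f"t = {t_stat:.2f}, p = {t_p:.3g}")

# Hypothetical contingency table (invented counts):
# rows = overtime (no, yes), columns = attrition (no, yes).
table = np.array([
    [900, 100],
    [300, 170],
])

# Chi-square test of independence between the two categorical variables.
chi2, chi_p, dof, expected = stats.chi2_contingency(table)
print(f"chi2 = {chi2:.2f}, dof = {dof}, p = {chi_p:.3g}")
```

On the real data, the samples would come from filtering a numeric column by the attrition label, and the contingency table from a cross-tabulation of two categorical columns.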