Employee-Absenteeism

Problem Statement

XYZ is a courier company. Human capital plays an important role in collection, transportation and delivery, and the company is facing a genuine absenteeism problem. It has shared its dataset and asked for answers to the following questions:

  1. What changes should the company make to reduce absenteeism?
  2. How much loss per month can we project for 2011 if the current absenteeism trend continues?

This is a regression problem.

All the steps implemented in this project

  1. Data Pre-processing.
  2. Data Visualization.
  3. Outlier Analysis.
  4. Missing Value Analysis.
  5. Feature Selection.
    • Correlation analysis.
    • Chi-Square test.
    • Analysis of Variance (ANOVA) test.
    • Multicollinearity test.
  6. Feature Scaling.
    • Normalization.
  7. Splitting into Train and Test Dataset.
  8. Dimensionality Reduction using PCA technique.
  9. Hyperparameter Optimization.
  10. Model Development.
    I. Linear Regression
    II. Decision Tree
    III. Random Forest
    IV. XGBoost
  11. Model Performance - Without PCA.
  12. Model Performance - With PCA.
  13. Conclusion.
  14. Python Code.
  15. R Code.
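
To make the middle of the step list concrete, here is a minimal sketch of splitting, normalization, PCA and a linear regression baseline, evaluated with and without PCA. It is not the project's code: it uses only NumPy, runs on synthetic data rather than the absenteeism dataset, and every function name, the 0.95 variance threshold and the 80/20 split are assumptions chosen for illustration.

```python
import numpy as np


def train_test_split(X, y, test_fraction=0.2, seed=0):
    """Shuffle row indices and hold out a fraction of rows for testing."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    n_test = int(round(len(X) * test_fraction))
    test, train = idx[:n_test], idx[n_test:]
    return X[train], X[test], y[train], y[test]


def min_max_fit(X):
    """Return per-column minimum and range, learned from training data only."""
    lo = X.min(axis=0)
    span = X.max(axis=0) - lo
    span[span == 0] = 1.0  # avoid division by zero for constant columns
    return lo, span


def pca_fit(X, variance_to_keep=0.95):
    """Return the column means and the leading principal directions."""
    mean = X.mean(axis=0)
    _, s, vt = np.linalg.svd(X - mean, full_matrices=False)
    explained = np.cumsum(s ** 2) / np.sum(s ** 2)
    k = min(int(np.searchsorted(explained, variance_to_keep)) + 1, len(s))
    return mean, vt[:k].T


def fit_linear(X, y):
    """Ordinary least squares with an intercept term."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict_linear(coef, X):
    return coef[0] + X @ coef[1:]


def rmse(y_true, y_pred):
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def run_demo(seed=0):
    # Synthetic stand-in data (assumption): 6 features, one nearly redundant.
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(200, 6))
    X[:, 5] = X[:, 0] + 0.01 * rng.normal(size=200)
    y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 0.1 * rng.normal(size=200)

    X_train, X_test, y_train, y_test = train_test_split(X, y)

    # Normalization: scale both splits with statistics from the training split.
    lo, span = min_max_fit(X_train)
    X_train_s = (X_train - lo) / span
    X_test_s = (X_test - lo) / span

    # Model performance without PCA.
    coef = fit_linear(X_train_s, y_train)
    rmse_plain = rmse(y_test, predict_linear(coef, X_test_s))

    # Model performance with PCA: project onto the retained components.
    mean, components = pca_fit(X_train_s)
    Z_train = (X_train_s - mean) @ components
    Z_test = (X_test_s - mean) @ components
    coef_pca = fit_linear(Z_train, y_train)
    rmse_pca = rmse(y_test, predict_linear(coef_pca, Z_test))

    return rmse_plain, rmse_pca, components.shape[1]


if __name__ == "__main__":
    plain, with_pca, k = run_demo()
    print(f"RMSE without PCA: {plain:.3f}")
    print(f"RMSE with PCA ({k} components): {with_pca:.3f}")
```

The scaler and the PCA projection are both fitted on the training split only and then applied to the test split, so the held-out rows do not leak into preprocessing. The tree-based models in the list could be swapped in for `fit_linear` without changing the rest of the flow.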

See the Project Report for more details.