The OSEMN Data Science Pipeline, Logistic Regression, SVM, kNN, and Random Forest

RalphGradien/Employee-Turnover-and-HR-data-Exploration

Employee-Turnover-and-HR-data-Exploration

Project Overview

This project aims to understand the factors that contribute to employee turnover and to build a predictive model that identifies employees at risk of leaving the company.

OSEMN Data Science Pipeline

  1. Obtaining the Data:

    • Downloaded the dataset from Kaggle.
    • Imported the data into the working environment.
  2. Scrubbing the Data:

    • Checked for missing values (dataset was clean).
    • Examined the dataset for readability and appropriate feature names.
    • Converted categorical features (department, salary) to numeric types.
  3. Exploratory Data Analysis (EDA):

    • Produced a statistical overview and summary of the dataset.
    • Explored correlations among features using a correlation matrix and heatmap.
    • Analyzed turnover patterns in relation to department, salary, promotion, years at the company, project count, evaluation, average monthly hours, etc.
  4. Modeling the Data:

    • Split the data into training and testing sets.
    • Implemented various machine learning models (Logistic Regression, SVM, kNN, Random Forest).
    • Evaluated each model by comparing its training and testing scores.
  5. Interpreting the Data:

    • Summarized findings from EDA.
    • Highlighted trends related to turnover, satisfaction, salary, project count, and evaluations.
    • Raised questions for further consideration about the impact of losing employees and factors affecting satisfaction and turnover.
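
The scrubbing and modeling steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual notebook code: the data is a small synthetic stand-in, and every column name (`satisfaction`, `evaluation`, `project_count`, `average_monthly_hours`, `department`, `salary`, `turnover`) as well as the 80/20 split, the scaling step, and the model settings are assumptions made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the HR data; all column names are hypothetical.
rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({
    "satisfaction": rng.uniform(0, 1, n),
    "evaluation": rng.uniform(0, 1, n),
    "project_count": rng.integers(2, 8, n),
    "average_monthly_hours": rng.integers(90, 311, n),
    "department": rng.choice(["sales", "technical", "support"], n),
    "salary": rng.choice(["low", "medium", "high"], n),
})
# Illustrative label only: lower satisfaction makes leaving more likely.
df["turnover"] = (rng.uniform(0, 1, n) > df["satisfaction"]).astype(int)

# Scrubbing: count missing values, then convert categorical features to numbers.
print("Missing values:", int(df.isnull().sum().sum()))
df["department"] = df["department"].astype("category").cat.codes
df["salary"] = df["salary"].map({"low": 0, "medium": 1, "high": 2})

# Modeling: split into training and testing sets (80/20 is an assumption).
X = df.drop(columns="turnover")
y = df["turnover"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Distance- and margin-based models are wrapped with a scaler here;
# the random forest does not need feature scaling.
models = {
    "Logistic Regression": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "SVM": make_pipeline(StandardScaler(), SVC()),
    "kNN": make_pipeline(StandardScaler(), KNeighborsClassifier()),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# Evaluate each model with its training and testing accuracy.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = (model.score(X_train, y_train), model.score(X_test, y_test))
    print(f"{name}: train={scores[name][0]:.3f}, test={scores[name][1]:.3f}")
```

A large gap between a model's training and testing score is a hint of overfitting, which is one reason to report both numbers rather than the testing score alone.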

Conclusion and Questions

  • Noted trends related to working hours, salary, promotion, and project count.
  • Highlighted correlations between turnover, satisfaction, and salary.
  • Posed questions about the impact of losing employees and factors influencing turnover and satisfaction.
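
As a rough illustration of how correlations between turnover, satisfaction, and salary could be checked, the sketch below builds a correlation matrix and a turnover rate per salary band with pandas. The data is synthetic and the column names (`satisfaction`, `salary`, `turnover`) are assumptions for the example, so the numbers it prints say nothing about the real dataset.

```python
import numpy as np
import pandas as pd

# Synthetic data; column names and the satisfaction/turnover link are hypothetical.
rng = np.random.default_rng(42)
n = 1000
df = pd.DataFrame({
    "satisfaction": rng.uniform(0, 1, n),
    "salary": rng.choice(["low", "medium", "high"], n),
})
df["turnover"] = (rng.uniform(0, 1, n) > df["satisfaction"]).astype(int)

# Encode salary as an ordered numeric level so it can enter the correlation matrix.
df["salary_level"] = df["salary"].map({"low": 0, "medium": 1, "high": 2})

# Pearson correlation matrix over the numeric columns.
corr = df[["satisfaction", "salary_level", "turnover"]].corr()
print(corr.round(2))

# Share of employees who left within each salary band.
turnover_by_salary = df.groupby("salary")["turnover"].mean()
print(turnover_by_salary.round(2))
```

The `corr` frame can be passed to a plotting library to draw a heatmap like the one mentioned in the EDA step.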

Note: The code sections may need to be reformatted and executed in a Python environment for full functionality.