Analytics Vidhya 🔎 Job-A-Thon Nov 2021 | Employee Attrition Prediction

Ambatkar/Employee-Attrition-Prediction-Job-A-Thon

Data-Science-Job-A-Thon 🏌️‍♀️

Problem :

You are working as a data scientist with the HR department of a large insurance company, focused on sales team attrition. Insurance sales teams help insurance companies generate new business by contacting potential customers and selling one or more types of insurance. The department generally sees high attrition, so staffing becomes a crucial concern.

To aid staffing, you are provided with monthly information for a segment of employees for 2016 and 2017 and tasked with predicting whether a current employee will leave the organization in the upcoming two quarters (01 Jan 2018 - 01 July 2018), given:

  • Demographics of the employee (city, age, gender etc.)
  • Tenure information (joining date, Last Date)
  • Historical data regarding the performance of the employee (Quarterly rating, Monthly business acquired, designation, salary)

Approach :

🛀 Cleaning: the data needed little cleaning. The only NaN values were in the last working date, which is empty in every monthly log row written while the employee had not yet left.

Adding Features before EDA

  • Total Number of working Months
  • Employee Attrited
  • Education converted to an ordinal variable (Master - 3, Bachelors - 2, College - 1)
  • Whether the employee got a promotion
  • Sum of total business
  • Mean of quarterly rating
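The per-employee features listed above could be derived from the monthly logs roughly as follows. This is a minimal sketch with pandas, not the code from the repository: the column names (`emp_id`, `report_month`, `last_date`, and so on), the toy rows, and the exact education labels are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical monthly log: one row per employee per reporting month.
# Column names and values are illustrative, not the competition schema.
logs = pd.DataFrame({
    "emp_id": [1, 1, 1, 2, 2],
    "report_month": pd.to_datetime(
        ["2016-01-01", "2016-02-01", "2016-03-01", "2016-01-01", "2016-02-01"]
    ),
    "last_date": pd.to_datetime([None, None, None, None, "2016-02-15"]),
    "education": ["Master", "Master", "Master", "College", "College"],
    "designation": [1, 1, 2, 1, 1],
    "total_business": [100.0, 250.0, 0.0, 50.0, 75.0],
    "quarterly_rating": [2, 2, 3, 1, 1],
})

# Ordinal mapping from the README; the label spellings are assumed.
education_rank = {"College": 1, "Bachelor": 2, "Master": 3}

# Collapse the monthly rows into one feature row per employee.
features = logs.groupby("emp_id").agg(
    working_months=("report_month", "nunique"),
    attrited=("last_date", lambda s: int(s.notna().any())),
    education=("education", lambda s: education_rank[s.iloc[-1]]),
    promoted=("designation", lambda s: int(s.iloc[-1] > s.iloc[0])),
    total_business=("total_business", "sum"),
    mean_quarterly_rating=("quarterly_rating", "mean"),
)
print(features)
```

Here an employee counts as attrited if any of their rows has a last working date, and as promoted if their final designation is higher than their first; both rules are simplifying assumptions.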

📅 EDA: a first review of the data shows that

  • Employees who have worked at the company for a long time tend not to leave.
  • Employees aged 21-24 and 48+ have higher attrition rates.
  • Employees with relatively low salaries left the company.
  • Male employees with higher business value tend to stay.
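An observation such as the age-band attrition rates can be checked with a grouped mean. The sketch below uses made-up toy rows and assumed column names (`age`, `attrited`); it only illustrates the computation and does not reproduce the competition data or its results.

```python
import pandas as pd

# Toy per-employee table; values are invented for illustration only.
employees = pd.DataFrame({
    "age": [22, 23, 30, 35, 41, 50, 52, 29],
    "attrited": [1, 1, 0, 0, 0, 1, 0, 0],
})

# Right-closed bins: (20, 24], (24, 47], (47, 100].
bins = [20, 24, 47, 100]
labels = ["21-24", "25-47", "48+"]
employees["age_band"] = pd.cut(employees["age"], bins=bins, labels=labels)

# Mean of a 0/1 flag is the attrition rate within each band.
attrition_by_band = (
    employees.groupby("age_band", observed=True)["attrited"].mean()
)
print(attrition_by_band)
```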

👷‍♂️ Feature Engineering: one-hot encoded Gender and City. Added 6 months to the total working months so that predictions cover the future window.
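Both steps can be sketched with pandas. The column names (`gender`, `city`, `working_months`) and the toy rows are assumptions; the repository code may be organized differently.

```python
import pandas as pd

# Toy per-employee table with assumed column names.
train = pd.DataFrame({
    "gender": ["Male", "Female", "Male"],
    "city": ["C1", "C2", "C1"],
    "working_months": [24, 5, 12],
})

# One-hot encode the two categorical columns as 0/1 integers.
encoded = pd.get_dummies(train, columns=["gender", "city"], dtype=int)

# For scoring, shift tenure forward by the 6-month prediction window.
future = encoded.copy()
future["working_months"] = future["working_months"] + 6
print(future)
```

Shifting tenure before scoring lets the model see each employee as they would look at the end of the prediction window rather than at the last logged month.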

Using the scikit-learn library, trained SVM, Logistic Regression, AdaBoostClassifier and RandomForestRegressor models. Among these, SVM tuned with GridSearchCV performed best, scoring 76.78%.
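A grid search over an SVM might look like the sketch below. It runs on synthetic data from `make_classification`; the parameter grid, the scaling step, and the `f1_macro` scoring metric are all assumptions, since the README does not state which settings or metric were used.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the engineered employee features.
X, y = make_classification(n_samples=300, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Scale inside the pipeline so each CV fold is scaled on its own training part.
pipeline = make_pipeline(StandardScaler(), SVC(kernel="rbf"))

# Assumed search space; make_pipeline names the SVC step "svc".
param_grid = {
    "svc__C": [0.1, 1, 10],
    "svc__gamma": ["scale", 0.01, 0.1],
}

search = GridSearchCV(pipeline, param_grid, cv=5, scoring="f1_macro")
search.fit(X_train, y_train)

# score() applies the same metric that was used for the search.
test_score = search.score(X_test, y_test)
print(search.best_params_, round(test_score, 4))
```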

Ending note and what I learned:

  • I scored 65.05% on the leaderboard using SVM with GridSearchCV, but my earlier predictions with the default SVM model (with Education added to the features) gave 70.10% on the leaderboard.
  • The code file does not use the employee Education feature. It initially gave better results in the competition, but the final leaderboard says otherwise.
  • I could have used more efficient algorithms such as GradientBoost or CatBoost.
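As a rough illustration of the gradient boosting alternative mentioned above, the sketch below cross-validates scikit-learn's `GradientBoostingClassifier` on synthetic data. The data, the default hyperparameters, and the `f1_macro` metric are assumptions, and nothing here indicates how such a model would have scored in the competition.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for the engineered employee features.
X, y = make_classification(n_samples=300, n_features=8, random_state=0)

# Default hyperparameters; tree ensembles need no feature scaling.
model = GradientBoostingClassifier(random_state=0)

# 5-fold cross-validated score with an assumed metric.
scores = cross_val_score(model, X, y, cv=5, scoring="f1_macro")
mean_score = scores.mean()
print(round(mean_score, 4))
```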
