# Employee Attrition Prediction: Final Model

## Introduction
This notebook presents the final Random Forest model for predicting employee attrition. The objective is to provide actionable insights for HR teams to reduce attrition rates, improve employee satisfaction, and lower costs associated with turnover.

The notebook is structured as follows:
1. Data Preparation
2. Final Model Implementation
3. Model Evaluation
4. Feature Importance Analysis
5. Prediction Pipeline
6. Key Findings
7. Model Saving

**Key business question:** How can we use machine learning to predict and mitigate employee attrition effectively?

## Data Preparation

The data has been pre-engineered to ensure quality and relevance. Here, we load the dataset and prepare it for training and testing. The target variable is `Attrition`, representing whether an employee left the company (1) or stayed (0).

Steps:
1. Load the dataset.
2. Split the data into training and testing sets.
3. Scale or normalize features if necessary.

## Final Model

We use the Random Forest Classifier as the final model due to its robust performance during experimentation. This model balances accuracy, interpretability, and efficiency, making it a suitable choice for real-world applications.

The model is trained using the following hyperparameters:
- Number of trees (`n_estimators`): 300
- Maximum depth of trees (`max_depth`): 20
- Minimum samples required to split a node (`min_samples_split`): 2
- Minimum samples required in a leaf node (`min_samples_leaf`): 1
- Class weight handling (`class_weight`): Balanced (subsampled)

These parameters were fine-tuned through grid search cross-validation to achieve optimal performance.

## Model Evaluation

We evaluate the model using the following metrics:
1. **ROC-AUC Score:** Measures the model's ability to distinguish between classes.
2. **Classification Report:** Provides precision, recall, and F1-score for both classes.
3. **Confusion Matrix:** Highlights the true positives, false positives, true negatives, and false negatives.

These metrics help assess the model's real-world usability and identify potential areas for improvement.

## Feature Importance Analysis

Understanding feature importance helps us:
1. Interpret the model's decisions.
2. Identify key drivers of employee attrition.
3. Provide actionable insights for HR teams.

The plot below shows the most important features influencing the model's predictions. Features with higher importance values contribute more to the prediction of attrition.

## Prediction Pipeline

To use this model in production, we define a prediction pipeline. This pipeline takes input features and outputs the probability of attrition.

The pipeline ensures that the model is easy to deploy and use in real-world scenarios. Below, we demonstrate how to use the pipeline to make predictions with sample input data.

## Key Findings

### Summary of Results:
- **ROC-AUC (Test Set):** 0.980
- **Accuracy:** 93%
- **Precision/Recall (Attrition Class):** High precision and recall for predicting attrition.

### Feature Insights:
The most important predictors of attrition include:
1. **Work-Life Impact (OT_WorkLifeImpact):** Employees with poor work-life balance are at the highest risk of attrition.
2. **Overtime (OverTime_Yes):** Employees working overtime are significantly more likely to leave.
3. **Stock Option Level:** Employees with higher stock options are less likely to leave.
4. **Marital Status (Single):** Single employees show a higher risk of attrition compared to their married counterparts.
5. **Job Level:** Employees in lower job levels tend to leave more frequently.
6. **Total Working Years:** Employees with fewer years of experience show higher attrition risk.
7. **Age:** Younger employees are more likely to leave the organization.
8. **Distance from Home:** Longer commutes are associated with higher attrition risk.
9. **Seniority Impact:** Lower seniority levels correlate with higher attrition.
10. **Environment Satisfaction:** Employees dissatisfied with their work environment are more likely to leave.

### Business Recommendations:
1. **Work-Life Balance Programs:** Invest in flexible work arrangements and workload management to improve work-life balance.
2. **Manage Overtime:** Monitor and reduce excessive overtime to minimize burnout and dissatisfaction.
3. **Incentives:** Offer attractive stock options or equivalent benefits to retain key talent.
4. **Support Younger and Single Employees:** Provide targeted programs and career development opportunities for younger and single employees.
5. **Address Commute Challenges:** Explore remote work or relocation support for employees with long commutes.
6. **Employee Satisfaction Initiatives:** Conduct regular surveys to measure satisfaction and implement actionable improvements.