# Employee Attrition Prediction
## Notebook 04: Model Evaluation, Registry & Deployment

This notebook covers:
- Loading trained SageMaker XGBoost model
- Evaluating performance on validation and test data
- Registering the model in SageMaker Model Registry
- Deploying a real-time inference endpoint
- Running sample predictions
- Cleaning up AWS resources


In [1]:
import boto3
import sagemaker
import pandas as pd
import numpy as np

from sagemaker.model import Model
from sagemaker.serializers import CSVSerializer
from sagemaker.deserializers import JSONDeserializer

from sklearn.metrics import (
    accuracy_score,
    precision_score,
    recall_score,
    f1_score,
    roc_auc_score,
    confusion_matrix
)



In [2]:
# sagemaker setup

sagemaker_session = sagemaker.Session()
role = sagemaker.get_execution_role()
bucket = sagemaker_session.default_bucket()
region = boto3.Session().region_name

sm_client = boto3.client("sagemaker")

print("Bucket:", bucket)
print("Region:", region)

Bucket: sagemaker-us-east-1-952878272094
Region: us-east-1


In [3]:
# Loading Test Data from S3

processed_prefix = "employee-attrition/processed"

X_test_path = f"s3://{bucket}/{processed_prefix}/X_test.csv"
y_test_path = f"s3://{bucket}/{processed_prefix}/y_test.csv"

X_test = pd.read_csv(X_test_path)
y_test = pd.read_csv(y_test_path).values.ravel()

print("X_test shape:", X_test.shape)
print("y_test shape:", y_test.shape)


X_test shape: (8940, 29)
y_test shape: (8940,)


In [4]:
training_job_name = "sagemaker-xgboost-2026-02-19-21-45-42-632"

training_desc = sm_client.describe_training_job(
    TrainingJobName=training_job_name
)

model_artifact = training_desc["ModelArtifacts"]["S3ModelArtifacts"]
training_image = training_desc["AlgorithmSpecification"]["TrainingImage"]

print("Model artifact:", model_artifact)
print("Training image:", training_image)

Model artifact: s3://sagemaker-us-east-1-952878272094/employee-attrition/model-artifacts/sagemaker-xgboost-2026-02-19-21-45-42-632/output/model.tar.gz
Training image: 683313688378.dkr.ecr.us-east-1.amazonaws.com/sagemaker-xgboost:1.7-1


In [5]:
# Creating SageMaker Model Object

xgb_model = Model(
    image_uri=training_image,
    model_data=model_artifact,
    role=role,
    sagemaker_session=sagemaker_session
)

In [6]:
# Registering Model in Model Registry

model_package = xgb_model.register(
    content_types=["text/csv"],
    response_types=["application/json"],
    inference_instances=["ml.m5.large"],
    transform_instances=["ml.m5.large"],
    model_package_group_name="EmployeeAttritionModel",
    approval_status="Approved"
)

In [11]:
from sagemaker.predictor import Predictor

endpoint_name = "employee-attrition-xgb-endpoint"

predictor = Predictor(
    endpoint_name=endpoint_name,
    sagemaker_session=sagemaker_session,
    serializer=CSVSerializer(),
    deserializer=JSONDeserializer()
)

In [12]:
print(predictor)

Predictor: {'endpoint_name': 'employee-attrition-xgb-endpoint', 'sagemaker_session': <sagemaker.session.Session object at 0x7fbf48218650>, 'serializer': <sagemaker.base_serializers.CSVSerializer object at 0x7fbf422e3b90>, 'deserializer': <sagemaker.base_deserializers.JSONDeserializer object at 0x7fbf422e3680>}


In [14]:
X_test.dtypes[X_test.dtypes == "bool"]

Gender_Male                          bool
Job Role_Finance                     bool
Job Role_Healthcare                  bool
Job Role_Media                       bool
Job Role_Technology                  bool
Education Level_Bachelor’s Degree    bool
Education Level_High School          bool
Education Level_Master’s Degree      bool
Education Level_PhD                  bool
Marital Status_Married               bool
Marital Status_Single                bool
dtype: object

In [15]:
# Converting boolean columns to int (True/False → 1/0)
bool_cols = X_test.select_dtypes(include=["bool"]).columns

X_test[bool_cols] = X_test[bool_cols].astype(int)

print("Converted boolean columns:")
print(bool_cols.tolist())

Converted boolean columns:
['Gender_Male', 'Job Role_Finance', 'Job Role_Healthcare', 'Job Role_Media', 'Job Role_Technology', 'Education Level_Bachelor’s Degree', 'Education Level_High School', 'Education Level_Master’s Degree', 'Education Level_PhD', 'Marital Status_Married', 'Marital Status_Single']
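
The cell that assigns `response` for the extraction step is not part of this export. As a minimal sketch of how the test rows might be sent to the endpoint in batches, the helpers below split rows into header-less CSV payloads and merge the results. `to_csv_payloads` and `predict_in_batches` are hypothetical names, `predict_fn` stands in for a call such as `predictor.predict`, and the `{"predictions": [...]}` response shape is an assumption.

```python
def to_csv_payloads(rows, max_rows=500):
    """Yield header-less CSV strings, each holding at most max_rows rows."""
    for start in range(0, len(rows), max_rows):
        chunk = rows[start:start + max_rows]
        yield "\n".join(",".join(str(value) for value in row) for row in chunk)


def predict_in_batches(predict_fn, rows, max_rows=500):
    """Call predict_fn once per CSV payload and merge the results.

    Assumption: predict_fn returns {"predictions": [...]} for every payload.
    """
    predictions = []
    for payload in to_csv_payloads(rows, max_rows):
        predictions.extend(predict_fn(payload)["predictions"])
    return {"predictions": predictions}


# Stand-in for the endpoint call, so the sketch runs without AWS access.
def fake_predict(payload):
    return {"predictions": [{"score": 0.5} for _ in payload.split("\n")]}


sample_rows = [[1, 0, 3.5], [0, 1, 2.0], [1, 1, 0.5]]
sample_response = predict_in_batches(fake_predict, sample_rows, max_rows=2)
print(len(sample_response["predictions"]))
```

In the notebook this might be invoked as `response = predict_in_batches(predictor.predict, X_test.values.tolist())`, assuming the serializer forwards a pre-formatted CSV string unchanged.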


In [19]:
# Robust extraction of prediction probabilities from SageMaker XGBoost response

if isinstance(response, dict) and "predictions" in response:
    preds = response["predictions"]

    # Case 1: list of dicts 
    if isinstance(preds[0], dict):
        # Try common keys
        if "score" in preds[0]:
            y_pred_prob = np.array([p["score"] for p in preds], dtype=float)
        elif "probability" in preds[0]:
            y_pred_prob = np.array([p["probability"] for p in preds], dtype=float)
        else:
            raise ValueError(f"Unknown prediction dict format: {preds[0]}")
    else:
        # Case 2: list of floats
        y_pred_prob = np.array(preds, dtype=float)

else:
    # Fallback: raw list
    y_pred_prob = np.array(response, dtype=float)

# Convert probabilities to class labels
y_pred = (y_pred_prob >= 0.5).astype(int)

In [21]:
# Ensuring y_test is encoded consistently with training labels
y_test = pd.Series(y_test).map({
    "Stayed": 0,
    "Left": 1
}).values

In [23]:
print("y_test unique:", np.unique(y_test))
print("y_pred unique:", np.unique(y_pred))

y_test unique: [0 1]
y_pred unique: [0 1]


In [22]:
print("Accuracy:", accuracy_score(y_test, y_pred))
print("Precision:", precision_score(y_test, y_pred))
print("Recall:", recall_score(y_test, y_pred))
print("F1 Score:", f1_score(y_test, y_pred))
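# NOTE: roc_auc_score receives hard 0/1 labels here, so the value equals balanced accuracy
# at the 0.5 threshold; pass y_pred_prob instead for a threshold-independent ROC AUC.
print("ROC AUC:", roc_auc_score(y_test, y_pred))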
confusion_matrix(y_test, y_pred)

Accuracy: 0.6539149888143176
Precision: 0.6422424391443324
Recall: 0.6144436603152199
F1 Score: 0.6280355854772782
ROC AUC: 0.6520714782702139


array([[3234, 1455],
       [1639, 2612]])

In [24]:
predictor.delete_endpoint()

## 📌 Project Summary and Conclusion

In this project, an end-to-end machine learning system was designed and implemented to predict employee attrition using structured HR data. The objective was to identify employees who are likely to leave the organization, enabling data-driven decision-making for employee retention.

### 🔹 Problem Statement
Employee attrition is a critical challenge for organizations, leading to increased hiring costs and loss of experienced talent. This project aims to build a binary classification model that predicts whether an employee is likely to leave the company based on demographic, professional, and organizational attributes.

### 🔹 Dataset
The dataset was sourced from Kaggle and includes features such as age, job role, monthly income, work-life balance, job satisfaction, company tenure, remote work status, leadership opportunities, and company reputation. The target variable, **Attrition**, indicates whether an employee stayed (`0`) or left (`1`) the organization.

### 🔹 Methodology
The project followed a structured machine learning lifecycle:
- Data ingestion and exploratory data analysis
- Data cleaning and preprocessing
- Feature encoding and scaling
- Train, validation, and test split
- Model training using **XGBoost** with **AWS SageMaker managed training**
- Model evaluation using a held-out test set
- Model registration and real-time deployment using **AWS SageMaker**
- Inference and evaluation via a temporary endpoint
- Proper cleanup of cloud resources

### 🔹 Model and Tools
- **Model**: XGBoost Classifier
- **Cloud Platform**: AWS SageMaker
- **Training Mode**: Managed SageMaker training job
- **Deployment**: Real-time inference endpoint
- **Version Control**: GitHub

### 🔹 Evaluation Results
The trained model was evaluated on unseen test data using standard classification metrics:

- **Accuracy**: 65.39%
- **Precision**: 64.22%
- **Recall**: 61.44%
- **F1 Score**: 62.80%
- **ROC-AUC**: 65.21% (computed from labels thresholded at 0.5, so equal to balanced accuracy)

**Confusion Matrix:**

[[3234 1455]
[1639 2612]]

## 📊 Detailed Model Evaluation Metrics

To evaluate the performance of the employee attrition prediction model, multiple classification metrics were used. Since employee attrition is a binary classification problem in which false positives and false negatives carry different costs, relying on a single metric such as accuracy is insufficient. The test set itself is roughly balanced (4,689 employees stayed and 4,251 left). All metrics below were computed from class labels obtained at a 0.5 probability threshold.

### 🔹 Accuracy (65.39%)
Accuracy represents the proportion of total predictions that were correct.

An accuracy of **65.39%** indicates that the model correctly classified approximately two-thirds of employees. While accuracy provides a general sense of performance, it does not distinguish between different types of classification errors, which is important in attrition prediction.

---

### 🔹 Precision (64.22%)
Precision measures how many employees predicted to leave the company actually left.

A precision score of **64.22%** means that roughly two out of three attrition predictions are correct (2,612 of 4,067). This is particularly important in HR contexts, as false positives may lead to unnecessary retention interventions for employees who are not actually at risk.

---

### 🔹 Recall (61.44%)
Recall measures how many employees who actually left the company were correctly identified by the model.

With a recall of **61.44%**, the model is able to identify a majority of employees who are at risk of leaving. In attrition use cases, recall is critical because failing to identify at-risk employees (false negatives) can result in missed retention opportunities.
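
When recall matters more than precision, the decision threshold can be lowered below 0.5. The sketch below illustrates the trade-off on a small made-up set of labels and scores (not the notebook's data); `precision_recall_at` is a hypothetical helper.

```python
def precision_recall_at(labels, scores, threshold):
    """Return (precision, recall) for the positive class at a given threshold."""
    preds = [1 if score >= threshold else 0 for score in scores]
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Illustrative values only: 1 = left, 0 = stayed.
demo_labels = [0, 0, 0, 1, 0, 1, 1, 1]
demo_scores = [0.05, 0.20, 0.45, 0.30, 0.60, 0.55, 0.80, 0.95]

for threshold in (0.5, 0.3):
    precision, recall = precision_recall_at(demo_labels, demo_scores, threshold)
    print(f"threshold={threshold}: precision={precision:.2f}, recall={recall:.2f}")
```

On this toy data, lowering the threshold from 0.5 to 0.3 catches every leaver but flags more employees who stayed, which is the kind of trade-off an HR team would need to weigh.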

---

### 🔹 F1 Score (62.80%)
The F1 score is the harmonic mean of precision and recall.

An F1 score of **62.80%** indicates a balanced trade-off between precision and recall, making the model suitable as a baseline classifier where both false positives and false negatives have practical implications.

---

### 🔹 ROC-AUC (65.21%)
The Receiver Operating Characteristic – Area Under the Curve (ROC-AUC) measures the model’s ability to distinguish between employees who leave and those who stay across all classification thresholds.

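A ROC-AUC score of **65.21%** is modestly better than random guessing (ROC-AUC = 50%). Note that this value was computed from the thresholded class labels (`y_pred`) rather than the predicted probabilities (`y_pred_prob`), so it reflects a single operating point and equals the balanced accuracy at the 0.5 threshold. A ROC-AUC computed from the probabilities would be the appropriate figure in scenarios where the decision threshold may change based on business requirements.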
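
The reported figure matches what results when hard 0/1 labels, rather than probabilities, are passed to the AUC computation, as `roc_auc_score(y_test, y_pred)` does in the evaluation cell. In that case AUC reduces to balanced accuracy, (TPR + TNR) / 2. The sketch below shows this with a pairwise-ranking definition of AUC on made-up toy data, then applies the identity to the notebook's confusion matrix; `pairwise_auc` is a hypothetical helper.

```python
def pairwise_auc(labels, scores):
    """ROC AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Illustrative values only, not taken from the notebook.
auc_labels = [0, 0, 0, 1, 1, 1]
auc_scores = [0.10, 0.40, 0.60, 0.35, 0.70, 0.90]
auc_hard = [1 if s >= 0.5 else 0 for s in auc_scores]

auc_from_scores = pairwise_auc(auc_labels, auc_scores)  # uses the full ranking
auc_from_labels = pairwise_auc(auc_labels, auc_hard)    # uses one operating point

# Balanced accuracy from the notebook's confusion matrix.
tn, fp, fn, tp = 3234, 1455, 1639, 2612
balanced_accuracy = 0.5 * (tp / (tp + fn) + tn / (tn + fp))

print("AUC from scores:", round(auc_from_scores, 4))
print("AUC from hard labels:", round(auc_from_labels, 4))
print("Balanced accuracy (notebook matrix):", round(balanced_accuracy, 4))
```

Passing `y_pred_prob` to `roc_auc_score` instead would give the threshold-independent value; that number is not available in this export, so it is not shown here.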

---

### 🔹 Confusion Matrix Interpretation


- **True Negatives (3234):** Employees who stayed and were correctly predicted to stay  
- **False Positives (1455):** Employees predicted to leave but actually stayed  
- **False Negatives (1639):** Employees who left but were predicted to stay  
- **True Positives (2612):** Employees who left and were correctly predicted to leave  

This confusion matrix shows that, at the 0.5 threshold, the model misses about 39% of employees who left (1,639 of 4,251) and raises a false alarm for about 31% of employees who stayed (1,455 of 4,689).
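
As a cross-check, the headline metrics can be recomputed directly from the four confusion-matrix counts. This is a plain-Python sketch using only the values reported above.

```python
# Counts from the confusion matrix (rows: actual, columns: predicted).
tn, fp, fn, tp = 3234, 1455, 1639, 2612
total = tn + fp + fn + tp

accuracy = (tp + tn) / total          # share of all predictions that are correct
precision = tp / (tp + fp)            # share of predicted leavers who actually left
recall = tp / (tp + fn)               # share of actual leavers who were flagged
f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of the two

print(f"Accuracy:  {accuracy:.4f}")
print(f"Precision: {precision:.4f}")
print(f"Recall:    {recall:.4f}")
print(f"F1 Score:  {f1:.4f}")
```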

---

### 🔹 Overall Evaluation Summary
The evaluation results show that the model performs consistently across metrics, with every score falling between 61% and 66%, and provides a baseline for employee attrition prediction. There is substantial room for improvement, but the model performs better than chance and can support informed HR decision-making when used alongside human judgment.


### 🔹 Key Takeaways
- AWS SageMaker enables scalable and reproducible model training and deployment.
- Proper preprocessing and label consistency are critical for reliable evaluation.
- XGBoost handles structured HR data with mixed feature types, although this baseline reaches only about 65% accuracy.
- The project successfully demonstrates the full machine learning lifecycle in a cloud environment.

### 🔹 Limitations and Future Improvements
- Model performance can be improved through advanced hyperparameter tuning.
- Additional behavioral and temporal features could enhance predictive power.
- Cross-validation and ensemble methods may further improve robustness.
- In a production system, preprocessing and inference pipelines would be fully automated.

### 🔹 Conclusion
This project successfully demonstrates an end-to-end machine learning workflow for employee attrition prediction using AWS SageMaker. It highlights the practical application of cloud-based ML services, from data preprocessing to deployment and evaluation, and provides a solid foundation for further experimentation and real-world implementation.
