# 🔮 Model Prediction

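This notebook loads the trained model and applies it to employee data to predict which employees are likely to have low job satisfaction. The predictions support HR decision-making, and running the notebook end to end confirms that the saved model can be loaded and used for deployment.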

In [3]:
# 🔮 Load Model and Data

import pandas as pd
import joblib

# Load model
model = joblib.load('../../src/models/rf_model.pkl')

# Load new or test data
new_data = pd.read_csv('../../data/processed/employee_data_cleaned.csv')
new_data.head()

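| | EmpNumber | Age | Gender | EducationBackground | MaritalStatus | EmpDepartment | EmpJobRole | BusinessTravelFrequency | DistanceFromHome | EmpEducationLevel | ... | EmpRelationshipSatisfaction | TotalWorkExperienceInYears | TrainingTimesLastYear | EmpWorkLifeBalance | ExperienceYearsAtThisCompany | ExperienceYearsInCurrentRole | YearsSinceLastPromotion | YearsWithCurrManager | Attrition | PerformanceRating |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 40 | 1 | 1 | 1 | 3 | 9 | 1 | 5 | 4 | ... | 3 | 20 | 2 | 3 | 18 | 13 | 1 | 12 | 0 | 4 |
| 1 | 1 | 30 | 1 | 2 | 0 | 3 | 9 | 2 | 27 | 5 | ... | 4 | 10 | 2 | 2 | 8 | 7 | 7 | 7 | 0 | 4 |
| 2 | 2 | 52 | 1 | 2 | 1 | 3 | 4 | 2 | 3 | 4 | ... | 1 | 34 | 3 | 4 | 34 | 6 | 1 | 16 | 0 | 4 |
| 3 | 3 | 25 | 0 | 3 | 2 | 3 | 9 | 2 | 26 | 1 | ... | 2 | 6 | 5 | 2 | 6 | 5 | 1 | 4 | 0 | 4 |
| 4 | 4 | 34 | 1 | 4 | 2 | 3 | 9 | 2 | 2 | 3 | ... | 3 | 6 | 5 | 3 | 6 | 5 | 1 | 4 | 0 | 4 |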


### 🧹 Prepare Features

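We drop the identifier column (EmpNumber) and, if present, the EmpJobSatisfaction column, so that the remaining columns are the model inputs. The feature set must match the one used in the training pipeline.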

In [4]:
# 🧹 Prepare Input Features

X_new = new_data.drop(columns=['EmpNumber', 'EmpJobSatisfaction'], errors='ignore')
X_new.head()

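| | Age | Gender | EducationBackground | MaritalStatus | EmpDepartment | EmpJobRole | BusinessTravelFrequency | DistanceFromHome | EmpEducationLevel | EmpEnvironmentSatisfaction | ... | EmpRelationshipSatisfaction | TotalWorkExperienceInYears | TrainingTimesLastYear | EmpWorkLifeBalance | ExperienceYearsAtThisCompany | ExperienceYearsInCurrentRole | YearsSinceLastPromotion | YearsWithCurrManager | Attrition | PerformanceRating |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 40 | 1 | 1 | 1 | 3 | 9 | 1 | 5 | 4 | 4 | ... | 3 | 20 | 2 | 3 | 18 | 13 | 1 | 12 | 0 | 4 |
| 1 | 30 | 1 | 2 | 0 | 3 | 9 | 2 | 27 | 5 | 3 | ... | 4 | 10 | 2 | 2 | 8 | 7 | 7 | 7 | 0 | 4 |
| 2 | 52 | 1 | 2 | 1 | 3 | 4 | 2 | 3 | 4 | 3 | ... | 1 | 34 | 3 | 4 | 34 | 6 | 1 | 16 | 0 | 4 |
| 3 | 25 | 0 | 3 | 2 | 3 | 9 | 2 | 26 | 1 | 3 | ... | 2 | 6 | 5 | 2 | 6 | 5 | 1 | 4 | 0 | 4 |
| 4 | 34 | 1 | 4 | 2 | 3 | 9 | 2 | 2 | 3 | 4 | ... | 3 | 6 | 5 | 3 | 6 | 5 | 1 | 4 | 0 | 4 |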
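
The cell above drops columns but does not itself check that what remains matches the training features. A minimal sketch of such a check follows. The helper `align_features` and the toy frame are hypothetical and not part of the original notebook. In the notebook, the expected column list could plausibly come from `model.feature_names_in_`, which scikit-learn (1.0 and later) sets on estimators that were fit on a DataFrame with string column names; whether this particular model has that attribute is an assumption.

```python
import pandas as pd


def align_features(df, expected_columns):
    """Return df restricted to expected_columns, in that order.

    Raises ValueError if any expected column is missing.
    (Hypothetical helper, not part of the original notebook.)
    """
    missing = [c for c in expected_columns if c not in df.columns]
    if missing:
        raise ValueError(f"Missing feature columns: {missing}")
    extra = [c for c in df.columns if c not in expected_columns]
    if extra:
        print("Dropping columns not seen in training:", extra)
    return df[list(expected_columns)]


# Toy frame standing in for X_new (illustrative values only)
X_demo = pd.DataFrame({"Gender": [1, 0], "Age": [40, 25], "Unexpected": [7, 8]})

# In the notebook this list might be model.feature_names_in_ (assumption)
expected = ["Age", "Gender"]

X_aligned = align_features(X_demo, expected)
print(X_aligned.columns.tolist())
```

Failing loudly on a missing column is a deliberate choice here: `errors='ignore'` in the drop call hides schema drift, so a separate check makes a mismatch visible before prediction.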


### 📈 Predict Low Satisfaction

We apply the trained model to predict which employees are at risk of low satisfaction.

In [5]:
# 📈 Predict and Append Results

new_data['PredictedLowSatisfaction'] = model.predict(X_new)
new_data[['EmpNumber', 'PredictedLowSatisfaction']].head()

| | EmpNumber | PredictedLowSatisfaction |
|---|---|---|
| 0 | 0 | 1 |
| 1 | 1 | 0 |
| 2 | 2 | 1 |
| 3 | 3 | 0 |
| 4 | 4 | 0 |
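
Hard 0/1 labels do not say how confident the model is. If the loaded model exposes `predict_proba` (scikit-learn classifiers such as random forests do), employees could also be ranked by predicted risk. The sketch below is only an illustration: `rank_by_risk`, the stub model, the 0.5 threshold, and the assumption that class label 1 means low satisfaction are all hypothetical and not taken from the notebook.

```python
import numpy as np
import pandas as pd


def rank_by_risk(model, X, ids, threshold=0.5):
    """Return IDs with the class-1 probability, sorted from highest to lowest risk.

    Assumes class label 1 means "low satisfaction" (an assumption).
    """
    # Find which predict_proba column corresponds to class label 1
    positive_idx = list(model.classes_).index(1)
    proba = model.predict_proba(X)[:, positive_idx]
    out = pd.DataFrame({"EmpNumber": list(ids), "LowSatisfactionProba": proba})
    # Apply the threshold to recover a 0/1 flag
    out["PredictedLowSatisfaction"] = (
        out["LowSatisfactionProba"] >= threshold
    ).astype(int)
    return out.sort_values("LowSatisfactionProba", ascending=False).reset_index(drop=True)


class _StubModel:
    """Stand-in for the loaded model so the sketch runs on its own."""

    classes_ = np.array([0, 1])

    def predict_proba(self, X):
        # Fake probability that grows with the first feature; illustration only
        p1 = np.clip(X.iloc[:, 0].to_numpy() / 10.0, 0.0, 1.0)
        return np.column_stack([1.0 - p1, p1])


X_demo = pd.DataFrame({"feature_a": [2, 9, 5]})
ranked = rank_by_risk(_StubModel(), X_demo, ids=[10, 11, 12], threshold=0.5)
print(ranked)
```

In the notebook itself, the call would presumably be `rank_by_risk(model, X_new, new_data['EmpNumber'])`, provided the loaded model supports `predict_proba`.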


### 💾 Save Predictions

We save the prediction results to a CSV file for HR review or dashboard integration.

In [6]:
# 💾 Save Output

output_path = '../../data/processed/predictions.csv'
new_data.to_csv(output_path, index=False)

print("✅ Predictions saved to:", output_path)

✅ Predictions saved to: ../../data/processed/predictions.csv
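
Before handing the file to HR or a dashboard, it can be worth reading it back to confirm that it contains what was written. The following is a minimal sketch of such a round-trip check. It uses a small made-up frame and a temporary directory rather than the notebook's real output path, so all values here are illustrative.

```python
import os
import tempfile

import pandas as pd

# Small stand-in for the notebook's results (illustrative values only)
results = pd.DataFrame(
    {"EmpNumber": [0, 1, 2], "PredictedLowSatisfaction": [1, 0, 1]}
)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "predictions.csv")
    # index=False keeps the row index out of the file, as in the cell above
    results.to_csv(path, index=False)
    reloaded = pd.read_csv(path)

# Basic sanity checks: same shape, prediction column present
same_shape = reloaded.shape == results.shape
has_prediction = "PredictedLowSatisfaction" in reloaded.columns
print("Round trip OK:", same_shape and has_prediction)
```

Writing with `index=False` matters for this check: without it, pandas writes the row index as an extra unnamed column, and the reloaded frame would no longer match the original shape.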
