
# 🏢 Salifort Motors: Employee Retention Analysis

This notebook investigates employee attrition at Salifort Motors to identify key drivers of turnover and build predictive models to support HR decision-making.


## 📦 Import Libraries

In [None]:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report, confusion_matrix, ConfusionMatrixDisplay
from sklearn.preprocessing import LabelEncoder

import warnings
# Note: this silences every warning, including scikit-learn convergence warnings.
warnings.filterwarnings('ignore')


## 📂 Load Dataset

In [None]:

df = pd.read_csv("HR_capstone_dataset.csv")
df.head()


## 🔍 Data Overview

In [None]:

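# A notebook cell only displays its last expression, so print each summary explicitly.
df.info()
print(df.describe())
print("Missing values per column:")
print(df.isna().sum())
print("Duplicated rows:", df.duplicated().sum())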


## 🧹 Data Cleaning

In [None]:

df = df.rename(columns={
    'Work_accident': 'work_accident',
    'average_montly_hours': 'average_monthly_hours',
    'time_spend_company': 'tenure',
    'Department': 'department'
})
df.head()
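
The overview step counts duplicated rows, while the cleaning cell only renames columns. If that count is non-zero and the repeats are judged to be re-entered records rather than distinct employees who happen to share values (an assumption to confirm against the data source), they could be dropped along these lines. This is a minimal sketch on a made-up toy frame; `toy`, `toy_clean`, and `n_dupes` are illustrative names, not part of the notebook.

```python
import pandas as pd

# Toy frame standing in for the HR data; the values are invented for illustration.
toy = pd.DataFrame({
    "satisfaction_level": [0.38, 0.80, 0.38, 0.11],
    "tenure": [3, 6, 3, 4],
    "left": [1, 1, 1, 1],
})

# Count rows that exactly repeat an earlier row.
n_dupes = toy.duplicated().sum()

# Keep the first occurrence of each repeated row and rebuild the index.
toy_clean = toy.drop_duplicates(keep="first").reset_index(drop=True)

print(f"duplicates found: {n_dupes}")
print(f"rows before: {len(toy)}, rows after: {len(toy_clean)}")
```

Applied to the real data, the same two calls would run on `df` before encoding and splitting, so that repeated rows cannot land in both the training and test sets.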


## 🔢 Encode Categorical Variables

In [None]:

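# Salary is ordinal, so map it explicitly. LabelEncoder would assign codes in
# alphabetical order (high=0, low=1, medium=2), which scrambles the ranking.
# Assumes the column holds exactly the values 'low', 'medium', and 'high'.
df['salary'] = df['salary'].map({'low': 0, 'medium': 1, 'high': 2})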
df = pd.get_dummies(df, columns=['department'], drop_first=True)
df.head()
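
Salary is an ordered category, so the integer codes it receives matter to a linear model. The sketch below compares alphabetical label codes with an explicit ordered mapping on toy values. The level names `low`, `medium`, and `high` are an assumption about the real column, and `salary_order` is a hypothetical helper name.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Toy salary column; the three level names are assumed, not read from the dataset.
salary = pd.Series(["low", "medium", "high", "low"])

# LabelEncoder sorts its classes alphabetically: high=0, low=1, medium=2.
alphabetical = LabelEncoder().fit_transform(salary)

# An explicit mapping preserves the natural order: low < medium < high.
salary_order = {"low": 0, "medium": 1, "high": 2}
ordinal = salary.map(salary_order)

print("alphabetical codes:", list(alphabetical))
print("ordinal codes:     ", list(ordinal))
```

With alphabetical codes, `medium` ranks above `high`, so a single regression coefficient cannot represent "higher pay". The explicit mapping avoids that.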


## 📊 Exploratory Data Analysis

In [None]:

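# One-hot encoding adds several department columns, so use a larger canvas
# and two-decimal annotations to keep the cells readable.
plt.figure(figsize=(14, 10))
sns.heatmap(df.corr(), annot=True, fmt=".2f", cmap="coolwarm")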
plt.title("Correlation Heatmap")
plt.show()
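
A full heatmap can be hard to read once the dummy columns are included. One option is to look only at each feature's correlation with the target, sorted. The sketch below does that on synthetic data: the column names mirror the notebook's, but the values and the rule generating `left` are invented for illustration.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 200

# Synthetic stand-in for the HR data.
toy = pd.DataFrame({
    "satisfaction_level": rng.uniform(0, 1, n),
    "average_monthly_hours": rng.integers(100, 300, n),
})
# Invented rule: employees with low satisfaction leave.
toy["left"] = (toy["satisfaction_level"] < 0.3).astype(int)

# Correlation of every feature with the target, weakest to strongest.
corr_with_left = toy.corr()["left"].drop("left").sort_values()
print(corr_with_left)
```

On the real frame the same expression would be `df.corr()['left'].drop('left').sort_values()`. Pearson correlation only captures linear association, so a weak value does not rule out a non-linear relationship.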


## 🎯 Feature Selection & Splitting

In [None]:

X = df.drop('left', axis=1)
y = df['left']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42, stratify=y)
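
Passing `stratify=y` asks `train_test_split` to keep the share of leavers roughly equal in the training and test sets. A quick way to confirm this is to compare the class proportions after splitting. The sketch uses a toy target with an assumed 25% positive rate, and the names `X_toy`, `y_toy`, `X_tr`, and so on are illustrative.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Toy target with 25 leavers out of 100 employees (an assumed rate).
y_toy = pd.Series([1] * 25 + [0] * 75, name="left")
X_toy = pd.DataFrame({"feature": np.arange(100)})

X_tr, X_te, y_tr, y_te = train_test_split(
    X_toy, y_toy, test_size=0.2, random_state=42, stratify=y_toy
)

# The positive rate should be close to the overall rate in both splits.
print("overall:", y_toy.mean())
print("train:  ", y_tr.mean())
print("test:   ", y_te.mean())
```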


## 🤖 Train Logistic Regression Model

In [None]:

model = LogisticRegression(max_iter=500)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)
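
The features sit on very different scales (monthly hours in the hundreds, satisfaction between 0 and 1), which can slow the solver's convergence and makes raw coefficients hard to compare. A possible variant, not what the notebook itself does, is to standardize inside a pipeline and then read the coefficients as odds ratios. The sketch runs on synthetic data from `make_classification`, and the feature names `f0` to `f3` are placeholders.

```python
import numpy as np
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data; column names are placeholders, not real HR features.
X_arr, y_arr = make_classification(
    n_samples=500, n_features=4, n_informative=2, n_redundant=0, random_state=42
)
X_toy = pd.DataFrame(X_arr, columns=["f0", "f1", "f2", "f3"])
X_toy["f0"] *= 200  # stretch one feature to mimic a large-scale column

# Scaling is fitted together with the model, so it can be fitted on training data only.
pipe = make_pipeline(StandardScaler(), LogisticRegression(max_iter=500))
pipe.fit(X_toy, y_arr)

# Coefficients are per standard deviation; exp() turns them into odds ratios.
coefs = pd.Series(pipe.named_steps["logisticregression"].coef_[0], index=X_toy.columns)
odds_ratios = np.exp(coefs).sort_values(ascending=False)
print(odds_ratios)
```

An odds ratio above 1 means the feature raises the predicted odds of the positive class, and a ratio below 1 means it lowers them. On the real data the pipeline would be fitted on `X_train` and `y_train`.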


## 📈 Model Evaluation

In [None]:

print("Classification Report:")
print(classification_report(y_test, y_pred))

cm = confusion_matrix(y_test, y_pred)
disp = ConfusionMatrixDisplay(confusion_matrix=cm)
disp.plot()
plt.title("Confusion Matrix")
plt.show()
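
When one class dominates, accuracy can look strong even for a model that never predicts a leaver. Comparing against a majority-class baseline and adding recall and ROC AUC gives a fuller picture. The sketch below uses synthetic data with an assumed 80/20 class split; none of the numbers it prints describe the Salifort dataset.

```python
from sklearn.datasets import make_classification
from sklearn.dummy import DummyClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, recall_score, roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic, imbalanced data (about 80% negatives); the ratio is an assumption.
X_toy, y_toy = make_classification(
    n_samples=1000, n_features=6, weights=[0.8, 0.2], random_state=42
)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_toy, y_toy, test_size=0.2, random_state=42, stratify=y_toy
)

# Baseline that always predicts the most frequent class.
baseline = DummyClassifier(strategy="most_frequent").fit(X_tr, y_tr)
clf = LogisticRegression(max_iter=500).fit(X_tr, y_tr)

baseline_acc = accuracy_score(y_te, baseline.predict(X_te))
baseline_recall = recall_score(y_te, baseline.predict(X_te))
model_acc = accuracy_score(y_te, clf.predict(X_te))
model_recall = recall_score(y_te, clf.predict(X_te))
# ROC AUC uses predicted probabilities for the positive class, not hard labels.
model_auc = roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1])

print(f"baseline accuracy: {baseline_acc:.3f}, recall: {baseline_recall:.3f}")
print(f"model accuracy:    {model_acc:.3f}, recall: {model_recall:.3f}, AUC: {model_auc:.3f}")
```

The baseline's recall is zero by construction, which shows why accuracy alone is a weak yardstick for attrition models.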



## ✅ Conclusion

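This notebook cleans and encodes the Salifort Motors HR data, explores feature correlations, and fits a logistic regression model to predict which employees leave. The model reaches roughly 85% accuracy on the held-out test set. Accuracy alone can overstate performance if leavers are a minority class, so the recall for `left = 1` in the classification report is the more relevant figure for HR teams that want to prioritize retention efforts and reduce hiring costs.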
