We will use 360-degree data on a company's employees to predict their attrition. We will then apply deep learning classification to better understand the data.

The dataset contains the following attributes:

- EmployeeID: unique employee ID
- TotalMonthsOfExp: total months of experience
- TotalOrgsWorked: total organizations worked so far
- MonthsInOrg: months in current organization
- LastPayIncrementBand: last pay increment band on a scale of 1 to 5 (1 = highest, 5 = lowest)
- AverageFeedback: average score from the 360-degree feedback survey (1 = highest, 5 = lowest)
- LastPromotionYears: years since the last promotion
- Attrition: attrition status (1 = yes, 0 = no)
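
To make the attribute order explicit, one possible sketch models a single employee as a small record type. `EmployeeRecord` and `to_features` are hypothetical names introduced here for illustration and are not part of the notebook. Leaving out `EmployeeID` and `Attrition` is an assumption about which attributes serve as predictors.

```python
from dataclasses import dataclass


@dataclass
class EmployeeRecord:
    """Hypothetical container for one employee's predictor attributes."""
    total_months_of_exp: int
    total_orgs_worked: int
    months_in_org: int
    last_pay_increment_band: int  # 1 = highest, 5 = lowest
    average_feedback: int         # 1 = highest, 5 = lowest
    last_promotion_years: int

    def to_features(self) -> list:
        """Return the attributes in the same order as the list above."""
        return [
            self.total_months_of_exp,
            self.total_orgs_worked,
            self.months_in_org,
            self.last_pay_increment_band,
            self.average_feedback,
            self.last_promotion_years,
        ]


# Values taken from the dataset row with EmployeeID 1.
record = EmployeeRecord(110, 4, 9, 5, 4, 4)
features = record.to_features()
print(features)
```

Keeping the order in one place like this is only a convenience; a plain list works just as well as long as the order stays consistent.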

#### 1. Preprocessing Attrition Data

In [22]:
# load the dataset and analyze it
import pandas as pd
import numpy as np
import tensorflow as tf

attrition_data = pd.read_csv('./dataset.csv')

print("attrition dataset loaded:")

attrition_data.head()

attrition dataset loaded:


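| | EmployeeID | TotalMonthsOfExp | TotalOrgsWorked | MonthsInOrg | LastPayIncrementBand | AverageFeedback | LastPromotionYears | Attrition |
|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 110 | 4 | 9 | 5 | 4 | 4 | 1 |
| 1 | 2 | 103 | 3 | 51 | 1 | 4 | 2 | 0 |
| 2 | 3 | 41 | 4 | 16 | 5 | 4 | 4 | 1 |
| 3 | 4 | 32 | 4 | 17 | 5 | 2 | 3 | 0 |
| 4 | 5 | 80 | 3 | 16 | 3 | 4 | 2 | 0 |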


In [23]:
# correlation analysis of the target attribute
attrition_data.corr()['Attrition']

EmployeeID             -0.036630
TotalMonthsOfExp        0.019702
TotalOrgsWorked         0.008706
MonthsInOrg             0.012605
LastPayIncrementBand    0.108528
AverageFeedback        -0.008253
LastPromotionYears      0.765641
Attrition               1.000000
Name: Attrition, dtype: float64
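
As a rough illustration of what `corr()` computes by default (the Pearson correlation coefficient), the sketch below recomputes it by hand for `LastPromotionYears` against `Attrition`, using only the five rows printed by `head()`. Five rows are far too few to reproduce the 0.765641 obtained on the full dataset, so the number here is only meant to show the mechanics.

```python
import numpy as np

# LastPromotionYears and Attrition for the five rows printed by head().
last_promotion_years = np.array([4, 2, 4, 3, 2], dtype=float)
attrition = np.array([1, 0, 1, 0, 0], dtype=float)

# Pearson r: sum of products of centered values, divided by the
# square root of the product of the summed squared deviations.
dx = last_promotion_years - last_promotion_years.mean()
dy = attrition - attrition.mean()
pearson_r = (dx * dy).sum() / np.sqrt((dx ** 2).sum() * (dy ** 2).sum())

# np.corrcoef computes the same quantity and serves as a cross-check.
reference_r = np.corrcoef(last_promotion_years, attrition)[0, 1]

print(round(pearson_r, 4), round(reference_r, 4))
```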

In [24]:
# convert the DataFrame to a numpy array
converted_attrition_data = attrition_data.to_numpy().astype(float)

# create X_train with the six feature attributes (columns 1-6, skipping EmployeeID)
X_train = converted_attrition_data[:, 1:7]

# create Y_train with attrition attribute
Y_train = converted_attrition_data[:, 7]

# convert Y_train to one-hot encoding
Y_train = tf.keras.utils.to_categorical(Y_train, 2)

print("X_train Shape: ", X_train.shape)
print("Y_train Shape: ", Y_train.shape)

X_train Shape:  (1000, 6)
Y_train Shape:  (1000, 2)
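
To see what the one-hot step produces, here is a small NumPy-only sketch that encodes the five `Attrition` labels printed by `head()`. Indexing an identity matrix is used as a stand-in for `to_categorical`; for integer labels 0 and 1 it is expected to yield the same rows.

```python
import numpy as np

# Attrition labels of the five rows printed by head().
labels = np.array([1, 0, 1, 0, 0])
num_classes = 2

# Row i of the identity matrix is the one-hot vector for class i.
one_hot = np.eye(num_classes)[labels]

print(one_hot)
print(one_hot.shape)
```

Each row has a single 1: in column 0 for employees who stayed and in column 1 for employees who left.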


#### 2. Building The Attrition Model

In [25]:
from tensorflow import keras

# set up hyperparameters for deep learning
EPOCHS = 100
BATCH_SIZE = 100
VERBOSE = 1
NB_CLASSES = 2
N_HIDDEN = 128
VALIDATION_SPLIT = 0.2

# create keras model
model = keras.models.Sequential()

# add first hidden dense layer
model.add(
  keras.layers.Dense(
    N_HIDDEN,
    input_shape=(6,),
    name='Dense-Layer-1',
    activation='relu',
  )
)

# add second hidden dense layer
model.add(
  keras.layers.Dense(
    N_HIDDEN,
    name='Dense-Layer-2',
    activation='relu',
  )
)

# add a final layer with softmax
model.add(
  keras.layers.Dense(
    NB_CLASSES,
    name='Final',
    activation='softmax',
  )
)

# compile the model
model.compile(
  optimizer='adam',
  loss='categorical_crossentropy',
  metrics=['accuracy'],
)

# train the model, holding out 20% of the data for validation
model.fit(
  X_train,
  Y_train,
  epochs=EPOCHS,
  batch_size=BATCH_SIZE,
  verbose=VERBOSE,
  validation_split=VALIDATION_SPLIT,
)

model.save(filepath='./attrition_model.h5')

Epoch 1/100
...
Epoch 78
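
As a sanity check on the architecture defined above, the number of trainable parameters can be worked out by hand: a dense layer with `n_in` inputs and `n_out` units has `n_in * n_out` weights plus `n_out` biases. The helper `dense_params` is a hypothetical name used only for this sketch; the totals should match what `model.summary()` reports for this model.

```python
def dense_params(n_in: int, n_out: int) -> int:
    """Weights (n_in * n_out) plus one bias per output unit."""
    return n_in * n_out + n_out


N_FEATURES = 6   # matches input_shape=(6,)
N_HIDDEN = 128
NB_CLASSES = 2

layer_params = {
    "Dense-Layer-1": dense_params(N_FEATURES, N_HIDDEN),
    "Dense-Layer-2": dense_params(N_HIDDEN, N_HIDDEN),
    "Final": dense_params(N_HIDDEN, NB_CLASSES),
}
total_params = sum(layer_params.values())

print(layer_params)
print("total:", total_params)
```

Most of the parameters sit in the second hidden layer, which connects 128 units to 128 units.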

#### 3. Predicting Attrition

In [26]:
def get_prediction_result(results):
  """Map predicted class indices to labels: 0 -> 'No', anything else -> 'Yes'."""
  return ['No' if result == 0 else 'Yes' for result in results]
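
An alternative sketch of the same mapping uses NumPy indexing instead of iterating over the results. `LABELS` and `labels_from_indices` are hypothetical names for illustration, and the sketch assumes every prediction is exactly 0 or 1.

```python
import numpy as np

# Index 0 -> 'No', index 1 -> 'Yes' (assumes class indices are only 0 or 1).
LABELS = np.array(['No', 'Yes'])


def labels_from_indices(indices):
    """Look up the label for each predicted class index."""
    return LABELS[np.asarray(indices, dtype=int)].tolist()


print(labels_from_indices([0, 1, 1, 0]))
```

Unlike `get_prediction_result`, this version raises an `IndexError` for an index outside 0 and 1 rather than mapping it to 'Yes'.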

In [27]:
# define test data to predict attrition for a single employee
TOTAL_MONTHS_OF_EXP = 40
TOTAL_ORGS_WORKED = 4
MONTHS_IN_ORG = 20
LAST_PAY_INCREMENT_BAND = 5
AVERAGE_FEEDBACK_BAND = 4
LAST_PROMOTION_YEARS = 4

TEST_DATA = [
  [
    TOTAL_MONTHS_OF_EXP, 
    TOTAL_ORGS_WORKED, 
    MONTHS_IN_ORG, 
    LAST_PAY_INCREMENT_BAND, 
    AVERAGE_FEEDBACK_BAND, 
    LAST_PROMOTION_YEARS
  ]
]

# convert the nested list to a NumPy array before passing it to the model
single_employee_prediction = model.predict(np.array(TEST_DATA, dtype=float))
single_employee_prediction_result = np.argmax(single_employee_prediction, axis=1)

print('Will the employee leave the company?', get_prediction_result(single_employee_prediction_result)[0])

Will the employee leave the company? Yes
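
The step from softmax output to a class label can be sketched without the model. The probability values below are made up for illustration and are not the model's actual output. The point is only that `np.argmax(..., axis=1)` picks the column with the highest probability in each row, where column 0 stands for 'No' and column 1 for 'Yes'.

```python
import numpy as np

# Hypothetical softmax outputs for three employees: [P(stay), P(leave)].
probabilities = np.array([
    [0.90, 0.10],
    [0.25, 0.75],
    [0.40, 0.60],
])

# Index of the largest probability in each row.
predicted_classes = np.argmax(probabilities, axis=1)

# Probability the model assigned to the chosen class.
confidence = probabilities.max(axis=1)

print(predicted_classes)
print(confidence)
```

Inspecting the probability of the chosen class alongside the label can help distinguish a borderline prediction from a clear one.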


In [28]:
# define test data to predict attrition for multiple employees
TEST_DATA = [
  [111, 5, 85, 3, 2, 2],
  [31, 2, 15, 4, 1, 4],
  [61, 4, 24, 1, 4, 3],
  [77, 4, 35, 3, 1, 1],
  [81, 5, 7, 1, 2, 3],
  [113, 4, 112, 5, 4, 1],
  [101, 2, 48, 5, 1, 4],
  [45, 4, 22, 5, 3, 1],
  [25, 2, 2, 2, 3, 2],
  [97, 3, 15, 3, 2, 4]
]

# convert the nested list to a NumPy array before passing it to the model
multiple_employees_prediction = model.predict(np.array(TEST_DATA, dtype=float))
multiple_employees_prediction_result = np.argmax(multiple_employees_prediction, axis=1)

print('Will the employees leave the company?', get_prediction_result(multiple_employees_prediction_result))

Will the employees leave the company? ['No', 'Yes', 'No', 'No', 'No', 'No', 'Yes', 'No', 'No', 'Yes']
