<a href="https://colab.research.google.com/github/abdyraman/hr-deep-learning/blob/main/deep_hr.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

Employee retention strategies are integral to a company's success and well-being. Employees leave an organization for many reasons, and in this case study I will explore some of the key drivers of employee attrition. Attrition measures how many workers have left an organization, and companies commonly track it when assessing their performance. Turnover rates vary from industry to industry; the [Bureau of Labor Statistics reported](https://www.bls.gov/news.release/jolts.t18.htm#) an overall turnover rate of 25% for voluntary separations in 2020.


In this notebook, I will explore [IBM's HR Analytics dataset](https://www.kaggle.com/datasets/pavansubhasht/ibm-hr-analytics-attrition-dataset). The data covers 1,470 current and former employees, with information on job satisfaction, work-life balance, tenure, experience, salary, and demographics.
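To make the attrition metric concrete, here is a minimal sketch of how an attrition rate can be computed from the `Attrition` labels. The helper name `attrition_rate` is hypothetical and not part of the notebook; the counts (1,233 "No", 237 "Yes") are the ones printed by the value counts later in this notebook.

```python
from collections import Counter


def attrition_rate(labels, positive="Yes"):
    """Return the share of entries equal to `positive` (0.0 for empty input)."""
    counts = Counter(labels)
    total = sum(counts.values())
    return counts[positive] / total if total else 0.0


# Label counts taken from the Attrition value counts shown in this notebook.
labels = ["No"] * 1233 + ["Yes"] * 237
rate = attrition_rate(labels)
print(f"Attrition rate: {rate:.1%}")  # Attrition rate: 16.1%
```

Because the function only iterates over its input, it should also accept a pandas column such as `df['Attrition']`. The result shows that the two classes are imbalanced, which is worth keeping in mind for any later modeling.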

**Employee Attrition Analysis**

In [10]:
import pandas as pd
import numpy as np
import hvplot.pandas  # Import hvplot for DataFrame plotting
import holoviews as hv
import panel as pn
pn.extension('tabulator')


In [11]:
df_full = pd.read_csv('WA_Fn-UseC_-HR-Employee-Attrition.csv')

**Data cleaning**

In [12]:
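# Drop four columns not needed for the analysis
# (an employee ID and three fields that appear constant in this dataset)
df = df_full.drop(columns=['Over18', 'EmployeeNumber', 'EmployeeCount', 'StandardHours'])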
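The notebook does not say why these columns are dropped. One plausible reason, stated here as an assumption, is that some of them hold a single value for every row and so carry no information. The sketch below shows one way to check for such columns; `constant_columns` is a hypothetical helper and the toy values are illustrative, not read from the CSV.

```python
def constant_columns(columns):
    """Return the names of columns that contain exactly one distinct value."""
    return [name for name, values in columns.items() if len(set(values)) == 1]


# Toy stand-in for the raw table: column name -> list of values.
# These values are made up for illustration only.
toy = {
    "Over18": ["Y", "Y", "Y"],
    "StandardHours": [80, 80, 80],
    "Age": [41, 49, 37],
}
print(constant_columns(toy))  # ['Over18', 'StandardHours']
```

With pandas, the same check can be written as `df_full.columns[df_full.nunique() == 1]`. An identifier column such as `EmployeeNumber` would not be caught by this test, since every value is distinct; it is usually dropped for a different reason, namely that it identifies a row rather than describing it.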

In [18]:
# Check the distinct values of each categorical (object-dtype) feature
categorical_cols = df.select_dtypes('object').columns

# Print each categorical variable's values and their counts
for col in categorical_cols:
    print(f'Unique values of {col}:')
    print(df[col].value_counts())
    print()

Unique values of Attrition:
Attrition
No     1233
Yes     237
Name: count, dtype: int64

Unique values of BusinessTravel:
BusinessTravel
Travel_Rarely        1043
Travel_Frequently     277
Non-Travel            150
Name: count, dtype: int64

Unique values of Department:
Department
Research & Development    961
Sales                     446
Human Resources            63
Name: count, dtype: int64

Unique values of EducationField:
EducationField
Life Sciences       606
Medical             464
Marketing           159
Technical Degree    132
Other                82
Human Resources      27
Name: count, dtype: int64

Unique values of Gender:
Gender
Male      882
Female    588
Name: count, dtype: int64

Unique values of JobRole:
JobRole
Sales Executive              326
Research Scientist           292
Laboratory Technician        259
Manufacturing Director       145
Healthcare Representative    131
Manager                      102
Sales Representative          83
Research Director             

In [25]:
# Summary statistics for the numeric columns
num = df.select_dtypes(include=['int64', 'float64'])
num.describe().T

| | count | mean | std | min | 25% | 50% | 75% | max |
|---|---|---|---|---|---|---|---|---|
| Age | 1470.0 | 36.92381 | 9.135373 | 18.0 | 30.0 | 36.0 | 43.0 | 60.0 |
| DailyRate | 1470.0 | 802.485714 | 403.5091 | 102.0 | 465.0 | 802.0 | 1157.0 | 1499.0 |
| DistanceFromHome | 1470.0 | 9.192517 | 8.106864 | 1.0 | 2.0 | 7.0 | 14.0 | 29.0 |
| Education | 1470.0 | 2.912925 | 1.024165 | 1.0 | 2.0 | 3.0 | 4.0 | 5.0 |
| EnvironmentSatisfaction | 1470.0 | 2.721769 | 1.093082 | 1.0 | 2.0 | 3.0 | 4.0 | 4.0 |
| HourlyRate | 1470.0 | 65.891156 | 20.329428 | 30.0 | 48.0 | 66.0 | 83.75 | 100.0 |
| JobInvolvement | 1470.0 | 2.729932 | 0.711561 | 1.0 | 2.0 | 3.0 | 3.0 | 4.0 |
| JobLevel | 1470.0 | 2.063946 | 1.10694 | 1.0 | 1.0 | 2.0 | 3.0 | 5.0 |
| JobSatisfaction | 1470.0 | 2.728571 | 1.102846 | 1.0 | 2.0 | 3.0 | 4.0 | 4.0 |
| MonthlyIncome | 1470.0 | 6502.931293 | 4707.956783 | 1009.0 | 2911.0 | 4919.0 | 8379.0 | 19999.0 |


In [19]:
# Wrap the DataFrame so that Panel widgets can drive pandas pipelines
idf = df.interactive()

**Descriptive statistics**

Text data analysis: categorical values

Numeric data analysis

In [16]:
# Define Panel widgets
# Ages in the data run from 18 to 60; step=1 keeps every value reachable
age_slider = pn.widgets.IntSlider(name='Age', start=18, end=60, step=1, value=50)
age_slider

In [27]:
# step=1 so that both the default value (5) and the end point (40) lie on the slider grid
years_with_company = pn.widgets.IntSlider(name='YearsAtCompany', start=0, end=40, step=1, value=5)
years_with_company

In [22]:
# Define Panel buttons for gender selection
male_button = pn.widgets.Button(name='Male', button_type='primary')
female_button = pn.widgets.Button(name='Female', button_type='primary')

# Pane to display the gender-filtered data
# (named distinctly so that other cells cannot overwrite it)
gender_pane = pn.pane.DataFrame(width=400)

# Function to filter data based on selected gender
def filter_by_gender(gender):
    gender_pane.object = df[df['Gender'] == gender]

# Set up button click events
male_button.on_click(lambda event: filter_by_gender('Male'))
female_button.on_click(lambda event: filter_by_gender('Female'))

# Display buttons and filtered data pane
pn.Column(
    male_button,
    female_button,
    gender_pane
).servable()

In [36]:
# Define a Toggle button to filter by Attrition "Yes"
attrition_toggle = pn.widgets.Toggle(name='Show Attrition: Yes Only', button_type='success')

# Pane to display the attrition-filtered data
# (named distinctly so it does not clash with panes defined in other cells)
attrition_pane = pn.pane.DataFrame(width=400)

# Callback that filters data based on the toggle state
def filter_by_attrition(event):
    if attrition_toggle.value:
        # Show only rows where Attrition is 'Yes'
        filtered_data = df[df['Attrition'] == 'Yes']
    else:
        # Show an empty DataFrame as the default state (use df here to show all rows instead)
        filtered_data = pd.DataFrame(columns=df.columns)

    attrition_pane.object = filtered_data

# Call filter_by_attrition whenever the toggle's state changes
attrition_toggle.param.watch(filter_by_attrition, 'value')

# Display the toggle button and filtered data pane
pn.Column(attrition_toggle, attrition_pane).servable()

**Analysis**

In [37]:
# Create a color mapping for departments
# (the original mapping is not shown; these hex values are illustrative placeholders)
color_mapping = {
    'Research & Development': '#88d8b0',
    'Sales': '#ffcc5c',
    'Human Resources': '#ff6f69',
}

# Create the bar chart for departments
def create_department_chart():
    # Count the number of employees in each department
    department_counts = df['Department'].value_counts().reset_index()
    department_counts.columns = ['Department', 'Count']

    # Create a bar chart using hvplot with custom colors
    bar_chart = department_counts.hvplot.bar(
        x='Department',
        y='Count',
        title='Number of Employees by Department',
        xlabel='Department',
        ylabel='Count',
        color=department_counts['Department'].map(color_mapping)
    )
    return bar_chart

# Create a Panel layout for the dashboard
template = pn.template.FastListTemplate(
    title='Department Visualization Dashboard', 
    sidebar=[
        pn.pane.Markdown("# Employee Departments"), 
        pn.pane.Markdown("#### This dashboard shows the number of employees in each department.")
    ],
    main=[
        create_department_chart()  # Call the function to display the chart
    ],
    accent_base_color="#88d8b0",
    header_background="#88d8b0",
)

# Serve the template
template.servable()