# Creating a Human Resources analytics dashboard using Tableau 

## 📖 Background
In this first ever BI competition, you will be exploring a dataset for a fictitious software company called Atlas Labs. You'll have to import, analyze, and visualize the data using Tableau.

The end goal will be a clean, self-explanatory, and interactive dashboard for the HR team of Atlas Labs. By conducting a thorough exploratory data analysis and creating visualizations, you'll dive deeper into attrition and what factors impact attrition. This analysis will help the organization determine what action they will need to take to retain more employees.

## 💪 Challenge
Using Tableau, create an interactive dashboard to summarize your research. Things to consider:

1. Make your dashboard stand out! Give it a unique look and feel.
2. Visualize the biggest differences in salary, attrition, job satisfaction level, and work-life balance across demographics and other dimensions. Don't feel limited by these; you're encouraged to use your skills to consolidate as much information as possible.
3. Create a screenshot of your (main) Tableau dashboard, and paste that in the designated field.
4. Summarize your findings in an executive summary.

## 💾 The base data

The following files were created.

### Employee (`employee.csv`)
|   Column name  |   Description | 
|---------------|-----------|
| Employee ID |	A unique ID that identifies an employee, connects to the **Performance Rating** table  |
|FirstName | First name of an employee  |
|LastName |	Last name / surname of an employee |
|Gender| Self-defined employee gender identity|
|Age|Current age of an employee|
|BusinessTravel|Frequency of business travel|
|Department | Most recent department that employee belongs/belonged to |
|DistanceFromHome (KM)|Kilometer distance between an employee’s home and their office|
|State|State where the employee lives|
|Ethnicity| Self-defined employee ethnicity|
|Education | A unique ID that identifies an employee's education level, connects to the **Education Level** table |
|EducationField|Employee field of study|
|JobRole |	Most recent job role that the employee holds/held |
|MaritalStatus|Current/latest employee marital status|
|Salary | Most recent record of employee salary |
|StockOptionLevel|The banding level for stock options that the employee has|
|OverTime|Indicates whether an employee is expected to work overtime in their role|
|HireDate|Date the employee joined the company|
|Attrition|Indicates whether an employee has left the organization|
|YearsAtCompany|Number of years since the employee joined the organization|
|YearsInMostRecentRole|Number of years the employee has been in their most recent role|
|YearsSinceLastPromotion|Number of years since the employee last got promoted|
|YearsWithCurrManager|Number of years the employee has been with their current manager|

### Performance Rating (`performance_rating.csv`)
|   Column name  |   Description | 
|---------------|-----------|
| PerformanceID | A unique id that identifies a performance review|
| EmployeeID |	A unique ID that identifies an employee, connects to the **Employee** table  |
|ReviewDate | Date an employee's review took place  |
|EnvironmentSatisfaction |	Rating for employees' satisfaction with their environment  |
|JobSatisfaction |	Rating for employees' satisfaction with their job role |
|RelationshipSatisfaction|Rating for employees' satisfaction with their relationships at work|
|WorkLifeBalance|Rating for employees' satisfaction with their work-life balance|
|SelfRating|Rating for employees' performance based on their own view|
|ManagerRating|Rating for employees' performance based on their manager’s view|
|TrainingOpportunitiesWithinYear|Number of training opportunities offered in the last 12 months|
|TrainingOpportunitiesTaken|Number of training opportunities taken|



### Education Level (`education_level.csv`)
|   Column name  |   Description | 
|---------------|-----------|
| Education Level ID | A unique ID that identifies an education level |
| Education Level |	Name of the education level; the **Education Level ID** connects to the **Employee** table |


## Broad approach and data preparation

Instead of building an interactive Tableau dashboard and making it 'stand out', I approached this from the perspective of a data analyst presenting findings to Atlas' senior management. The _story_ format was therefore used instead of dashboards, to better explain the analysis and sub-analyses and how the observations are connected.

Certain assumptions were made about the company to choose the sub-analyses to focus on. I assumed that Atlas is primarily organised by business function, and that the Tech and Sales Departments are of particular importance to Atlas Labs' business performance. Atlas' senior management would therefore be more concerned with attrition trends there than in HR. I also assumed that senior management's awareness of its HR trends is relatively low.

Data exploration was mostly done using Tableau, and SQL to a smaller extent.

Data cleaning was minimal as the datasets were almost complete. Some manipulation was done to prepare the data for analysis. Key ones include:
1. Deriving the 'Year of Attrition' from the 'Attrition', 'HireDate' and 'YearsAtCompany' fields, using the DATEADD function.
2. Deriving the staff strength per year after accounting for attrition that year, from the 'HireDate' and new 'Year of Attrition' fields.
3. Deriving the age of employees when they left the company, from the 'Attrition', 'Age' and new 'Year of Attrition' fields, using the MAKEDATE and DATEDIFF functions. This was needed because the raw data records each employee's current age (assumed to be as of 2022).
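
The three derivations above can be sketched outside Tableau as follows. This is a minimal Python illustration of the logic, not the calculated fields used in the story. The sample records are hypothetical, the field names follow the data dictionary, `SNAPSHOT_YEAR = 2022` reflects the assumption stated above, and working at whole-year granularity is a simplification.

```python
from datetime import date

SNAPSHOT_YEAR = 2022  # assumption: 'Age' in employee.csv is the age as of 2022

# Hypothetical sample records using field names from the data dictionary.
employees = [
    {"EmployeeID": "A1", "HireDate": date(2015, 3, 1), "YearsAtCompany": 2,
     "Age": 30, "Attrition": "Yes"},
    {"EmployeeID": "A2", "HireDate": date(2018, 7, 15), "YearsAtCompany": 4,
     "Age": 27, "Attrition": "No"},
    {"EmployeeID": "A3", "HireDate": date(2019, 1, 10), "YearsAtCompany": 2,
     "Age": 24, "Attrition": "Yes"},
]


def year_of_attrition(emp):
    """Hire year plus years at company, for leavers only (None otherwise)."""
    if emp["Attrition"] != "Yes":
        return None
    return emp["HireDate"].year + emp["YearsAtCompany"]


def age_at_attrition(emp):
    """Current age minus the years elapsed since the employee left."""
    left = year_of_attrition(emp)
    if left is None:
        return None
    return emp["Age"] - (SNAPSHOT_YEAR - left)


def headcount_at_year_end(emps, year):
    """Employees hired by the end of `year` who had not left during or before it."""
    count = 0
    for emp in emps:
        left = year_of_attrition(emp)
        hired = emp["HireDate"].year <= year
        still_here = left is None or left > year
        if hired and still_here:
            count += 1
    return count
```

Employees who left in a given year are excluded from that year's headcount, which matches "after accounting for attrition that year".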

Connections of the 3 datasets were straightforward and done through Tableau Public. (Employee ID in `employee.csv` on Employee ID in `performance_rating.csv`, Education in `employee.csv` on Education Level ID in `education_level.csv`)
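
For readers who want to reproduce the connections outside Tableau, the sketch below shows the same key matching in plain Python. The rows are hypothetical, and the exact header spellings (`EmployeeID`, `Education Level ID`) are assumptions based on the data dictionary. It behaves like an inner join from reviews to employees, which is not necessarily how Tableau resolves its relationships.

```python
# Hypothetical rows; only the key columns and one measure are shown.
employees = [
    {"EmployeeID": "A1", "Education": 3},
    {"EmployeeID": "A2", "Education": 4},
]
performance_ratings = [
    {"PerformanceID": "PR01", "EmployeeID": "A1", "JobSatisfaction": 4},
    {"PerformanceID": "PR02", "EmployeeID": "A1", "JobSatisfaction": 3},
    {"PerformanceID": "PR03", "EmployeeID": "A2", "JobSatisfaction": 5},
]
education_levels = [
    {"Education Level ID": 3, "Education Level": "Bachelors"},
    {"Education Level ID": 4, "Education Level": "Masters"},
]


def join_tables(employees, ratings, levels):
    """Return one row per review, enriched with employee and education fields."""
    emp_by_id = {e["EmployeeID"]: e for e in employees}
    level_by_id = {l["Education Level ID"]: l["Education Level"] for l in levels}
    joined = []
    for review in ratings:
        emp = emp_by_id.get(review["EmployeeID"])
        if emp is None:
            continue  # skip reviews that have no matching employee record
        row = {**emp, **review}
        # Education in employee.csv is matched to Education Level ID.
        row["Education Level"] = level_by_id.get(emp["Education"])
        joined.append(row)
    return joined


rows = join_tables(employees, performance_ratings, education_levels)
```

Because one employee can have many reviews, employee-level counts should be taken from the employee table rather than from the joined rows, to avoid double counting.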

## Link to Tableau story
Please click [here](https://public.tableau.com/app/profile/qs1873/viz/HRAnalyticsMay2023/WhyarepeopleleavingAtlas?publish=yes) for the Tableau story. View in full screen mode for the best experience.

## Executive summary

1. Attrition at Atlas Labs is steadily rising, and a recent spike in attrition is a concern. 
2. Younger and fresher employees have been leaving. Attrition has also been highest amongst those with lower remuneration.
3. The attrition in the Tech Department is the highest, but attention should also be paid to attrition trends in the Sales and HR Departments.
4. Of the job roles in Atlas, attrition is the highest amongst Data Scientists, Sales Executives, Software Engineers and Sales Representatives.
5. While it is unclear if remuneration is a main factor, it is clearer that dissatisfaction over work-life balance, working conditions and management could have driven much of these resignations. 

## Detailed findings

### A. Who has left?

1. Attrition at Atlas Labs is steadily rising. To date, 16% of all employees on record (depicted in pale red in all dashboards) have left Atlas Labs. Atlas' staff strength has grown steadily over its past 10 years of existence, and attrition has grown at a comparable trajectory. However, the spike in recent years is a concern and should be looked into, even if it is in line with the Great Resignation phenomenon in the US at the height of the COVID-19 pandemic.
2. Most employees who have quit are relatively young and have spent only a few years in Atlas. 2 in 3 employees left before they turned 24 years old, and 3 in 5 employees left after spending less than 3 years in Atlas.
3. There are no significant attrition trends to flag based on other demographic factors, such as ethnicity, gender, education level and distance from home.
4. Attrition is highest amongst those who have been paid less than 100k, and amongst those who do not own stock options. But this is no surprise, given that most younger and fresher employees have lower remuneration.
5. Attrition is also highest in the Tech Department, followed by Sales and HR. But Sales has an outsized share of all attrition, considering that Sales has half of Tech's total employees on record. A staggering 19% of all HR employees have also quit. The spike in attrition in recent years also appears to have been due to an increase in Tech and Sales employees quitting, above their respective trendlines. Of the job roles in these departments, attrition is the highest amongst Data Scientists, Sales Executives, Software Engineers and Sales Representatives, with a spike in attrition in the latter two job roles over 2021-2022. Specific attention should be paid to these roles, as they appear to be the key engine of Atlas' business.
 
### B. Why?

6. It is unclear whether remuneration is a main factor, since there are no survey results related to remuneration. A comparison with industry benchmarks could also be done.
7. But what's clearer is that dissatisfaction with work conditions is a likely driver. Employees' views on work satisfaction and work-life balance have worsened, amongst both current and former employees. Of the aforementioned job roles with the highest attrition, Sales Representatives are the least satisfied with their work environment and job in recent years, although their work-life balance appears to have improved.
8. 30% of all employees who frequently work overtime have quit, compared with 10% of those who do not. Amongst those who have quit, Software Engineers appear to have the highest incidence of frequent overtime.
9. Specifically, dissatisfaction with managers could have driven the HR Department's higher attrition rate, at least over 2018-2022.


## Personal takeaways from this exercise

### Overall

1. Such exercises helped me further internalise the analysis process and practise Tableau much better than going for courses or refreshers, even if the datasets are not directly related to 'core work'.
2. Important to budget time for data exploration, preparation and analysis, especially where there are multiple fields, that are also related and therefore require sub-analyses. Spent at least 50%-70% of the time doing so. 
3. Important to keep in mind the key business question(s) to be addressed. Then generate hypotheses (the 'framework') along the way while exploring the data, and figure out how to test them.
4. Understanding business context quite important, otherwise difficult to decide what to focus on or how to pitch.
5. Statistics expertise beyond the basics might have been useful - could only rely on Tableau to generate trendlines, without really understanding how the trendlines are derived and whether they were used correctly.
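
On the last point, a linear trendline is typically an ordinary least-squares fit, and the sketch below shows that calculation in plain Python as a way to demystify it. Whether it matches Tableau's output exactly is an assumption, since Tableau's model options were not examined here. The yearly leaver counts are hypothetical, not the Atlas figures.

```python
def linear_trend(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two points and equal-length inputs")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross-deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical yearly leaver counts, for illustration only.
years = [2018, 2019, 2020, 2021, 2022]
leavers = [10, 14, 18, 30, 34]

slope, intercept = linear_trend(years, leavers)

# A positive residual means that year sits above the trendline.
residuals = [y - (slope * x + intercept) for x, y in zip(years, leavers)]
```

Reading the residuals is what a statement such as "attrition is above the linear trend" amounts to: the observed count for that year is higher than the fitted line predicts.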

### HR data and analysis

1. Relatively clean data that is consistent over time is crucial for meaningful comparison and analysis.
2. 'Whys' of attrition generally found through org health surveys, including exit interviews. Comparison with industry standards (e.g. attrition rate, salaries) useful but circumstantial. E.g. remuneration is always a factor, but how much is it really a factor?
3. Job role field provides a good basis for sub-analysis, even in a company like Atlas that already has job roles quite cleanly organised into functional departments.

### Visualisation

1. Good old bar, line & area charts, and highlight tables probably sufficient for most visualisations presenting macro-analyses (i.e. doesn't need a deep-dive into individual records). In some cases, I could have simply used enlarged text. Avoided pie charts because they are terrible for comparison.
2. Colour selection important - should be easy on the eyes, consistent in meaning. Especially where multiple factors need to be introduced. E.g. in this case, those who have quit vs those who have stayed, and different job roles.
3. Prioritise necessity over comprehensiveness in presenting data. (Unless the intent is to allow senior management to explore the data and come to their own conclusions. In which case, interactive dashboards would be much better. E.g. [this](https://public.tableau.com/app/profile/techknomatic.services.pvt.ltd./viz/AttritionDashboard_15576673167290/AttritionAnalysis), or [this](https://public.tableau.com/app/profile/omkarvichare/viz/AttritionAnalysis_16181752081830/AttritionTrends).) 

### Tableau-specific

1. Initial data exploration, to understand the data structure (the data types of the fields, the completeness of records), not easily done with Tableau. Prefer either Excel (filters and pivot tables) or SQL. 
2. Found data analysis easier with Tableau since it is visuals-based. Found it useful to have a default 'analysis' sheet with two template bar charts, one with column as employee count, and another with column as percentage of employees across table. I would then select each field in employee.csv as a row, and layer on the 'attrition' field in colours. Also helps shorten the prep for the final product, since some of these vizes form the basis for it.
3. Found it difficult to format labels to my liking, e.g. labelling only certain points on the graph. Not sure if it's due to limited functions in Tableau Public, or I'm too noob. Seems to be the former based on guides.
4. A downside to the story format is that it limits the space or 'real estate' per storypoint. Therefore need to decide whether it's better to break a point into two storypoints.
5. Tried to use the free ChatGPT version (GPT-3.5) to write code for calculated fields, but the code it produced isn't supported by the latest versions of Tableau Public.


## Workings

**Basic facts**
- Total employees on record: 1467 (excluding 3 employees with no unique Employee ID - filtered out as negligible)
- **Attrition: 230 or 16.16% of employees on record have left, leaving 1237 today.**

**Demographic factors**
- (Current) Age: 18-51 yo. ~51% are below 27 years old (Gen Z). Highest % of attrition amongst 27-29 yrs (27%).
- **Age when left company (Calculation): Taking current age (assumed to be as of 2022) and subtracting the number of years since the year of attrition (also another calculation), we are able to derive the age at which these employees left the company. Based on the data, 64% actually left the company at 23 years old and below.**
- Ethnicity: Predominantly white (59%). No discernible difference in attrition amongst ethnicities - all about 20% ± 4%.
- Gender: Roughly equal, slightly more females than males. No discernible difference in attrition - quite equal amongst male, female and non-binary.
- Marital Status: Higher % of singles have left vs divorced and married. (Expected.)
- Education Level: Mostly Bachelors (39%), followed by Masters (27%). No significant difference amongst different levels.
- State: Attrition % fairly equal across 3 states (CA, IL, NY)

**Job factors**
- **Education Field: Highest % of attrition amongst technical degree (32%) and HR (26%) graduates, but these are the two smallest groups. Computer science, information systems and marketing graduates make up the top 3 fields (in this order) in the company (76.5% combined). But attrition for computer science and information systems is amongst the lowest 3. Marketing is 4th of 8, at 20%.**
- **Job Role: Attrition % particularly high amongst Sales Reps (39.76%), Recruiters (37.5%), Data Scientists (23.75%), Sales Executives (17.54%) and Software Engineers (15.99%). Of these job roles, the ones to be concerned about are Sales Execs (22% of all employees), Software Engineers (20%) and Data Scientists (18%), as they are disproportionately larger than the other job roles (12 in total).**
- Department: Attrition % fairly equal across the 3 functional departments. (Job roles are generally split by department.)
- **Hire Date: Attrition fairly constant across year cohorts, ranging from 11 (2017) to 28 (2020) from 2012-2022. As compared to other cohorts, 2013 and 2016 cohorts had lower average number of years per employee who left, than the linear trend from 2012-2022. 2017 and 2019 cohorts slightly above average.**
- **Year of Attrition (Calculation): Adding Years at Company to Hire Date, we are able to obtain the year of attrition per employee. Data shows that attrition has been increasing exponentially, from 2 in 2015 and peaking at 60 in 2021. While this is partly due to a steady expansion in manpower over the years (from 151 in 2012 to 1237 in 2022), the number of leavers over 2021-2022 is above the linear trend.**
- **Overtime: 30% of all employees who have 'Yes' in OT have left, vs 10% for 'No'. (About 28% of all employees on record OT.)**
- Distance from Home: Attrition fairly constant across bins of 5km.
- **Stock Option Level: Most employees have no stock option, and amongst this group attrition % is highest (25%). But attrition % for Level 3 (18%) is much higher than those in Levels 1 (9%) and 2 (7%).**
- **Salary: Most of the company (64%) is paid <100k a year. Attrition % is generally higher for those who are paid less, and lower for those who are paid higher. (Except for those paid >500k a year (15%), but this is a very small number of employees. Company has been retaining all employees who are paid 400k-500k.)**
- **Years in company: No one has stayed more than 10 years. (Sign of a relatively younger company?) Higher % of attrition amongst those with company for 2 years or less (>20% per cohort), followed by 3-5 years (11-17%), then 6-10 years (<11%).**
- Years in Most Recent Role: Significant number of employees on record are within 1 year of their new role (34%). Highest attrition amongst those who are within 1 year of their new role (28% of all employees within 1 year of their new role, 10% of all employees on record).
- Years since Last Promotion: Significant number of employees in record are within 1 year of their last promotion (21%). Highest attrition amongst those who are also within 1 year of their last promotion (37.87%, 7.77%).
- **Years with Current Manager: Significant number of employees in record are within 1 year with current manager (32%). Highest attrition amongst those who are also within 1 year with current manager (27%, 9%).**
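
Most of the figures in this list come in two flavours: the attrition rate within a group (e.g. 30% of those who work overtime) and that group's leavers as a share of all employees on record. A minimal Python sketch of both calculations follows. The records are hypothetical, and the field names (`OverTime`, `Attrition`) are taken from the data dictionary.

```python
from collections import defaultdict


def attrition_by_group(employees, field):
    """Summarise attrition for each distinct value of `field`."""
    totals = defaultdict(int)
    leavers = defaultdict(int)
    for emp in employees:
        key = emp[field]
        totals[key] += 1
        if emp["Attrition"] == "Yes":
            leavers[key] += 1
    everyone = len(employees)
    return {
        key: {
            "leavers": leavers[key],
            "headcount": totals[key],
            # Share of this group that has left.
            "rate_within_group": leavers[key] / totals[key],
            # This group's leavers as a share of all employees on record.
            "share_of_all": leavers[key] / everyone,
        }
        for key in totals
    }


# Hypothetical records, for illustration only.
employees = [
    {"OverTime": "Yes", "Attrition": "Yes"},
    {"OverTime": "Yes", "Attrition": "No"},
    {"OverTime": "No", "Attrition": "No"},
    {"OverTime": "No", "Attrition": "No"},
    {"OverTime": "No", "Attrition": "Yes"},
    {"OverTime": "No", "Attrition": "No"},
]

rates = attrition_by_group(employees, "OverTime")
```

Passing a different field name, such as `"JobRole"` or `"StockOptionLevel"`, gives the equivalent breakdown for that field, provided every record contains it.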

**Survey**
- 6699 survey results over 2013-2022. Number of survey results increased linearly over the years, probably reflective of the steady growth of manpower. (NMI per year appears roughly constant.)
- 2261 of 6699 survey results (~33%) are of employees who have left. We have survey results of all the 237 former employees.
- 1040 of 1237 employees who have not left have responded to the survey.
- At macro level, no strong difference between average survey results of existing employees and that of ex-employees for following fields, across all years:
1. Job satisfaction (1-5) - Ex-employees higher by 0.032 points 
2. Environmental satisfaction (1-5) - Ex-employees higher by 0.013 points
3. Relationship satisfaction (1-5) - Ex-employees higher by 0.025 points
4. Work life balance (1-5) - Ex-employees higher by 0.007 points
5. Manager rating (1-4) - Ex-employees higher by 0.004 points
6. Self rating (1-4) - Ex-employees lower by 0.016 points
7. Training opportunities within year (0-3) - Ex-employees lower by 0.01 points 
8. Training opportunities taken (0-3) - Ex-employees lower by 0.014 points
- And average and median survey results for fields 1-6 are at least 'neutral' or 'meets expectations'.
- To note that based on results for fields 7-8, half of all training opportunities available are taken.
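
The comparison between ex-employees and existing employees above is a difference of mean ratings. The sketch below shows one way to compute it in Python, assuming each review row carries an `EmployeeID` and that an attrition lookup has been built from the employee table. All values are hypothetical.

```python
from statistics import mean

# Hypothetical lookup built from employee.csv: EmployeeID -> Attrition flag.
attrition_by_employee = {"A1": "Yes", "A2": "No", "A3": "No"}

# Hypothetical review rows from performance_rating.csv.
reviews = [
    {"EmployeeID": "A1", "JobSatisfaction": 4},
    {"EmployeeID": "A1", "JobSatisfaction": 3},
    {"EmployeeID": "A2", "JobSatisfaction": 3},
    {"EmployeeID": "A3", "JobSatisfaction": 3},
]


def mean_gap(reviews, attrition_lookup, field):
    """Mean of `field` for ex-employees minus the mean for current employees."""
    left = [r[field] for r in reviews
            if attrition_lookup[r["EmployeeID"]] == "Yes"]
    stayed = [r[field] for r in reviews
              if attrition_lookup[r["EmployeeID"]] == "No"]
    # statistics.mean raises StatisticsError if either group is empty.
    return mean(left) - mean(stayed)


gap = mean_gap(reviews, attrition_by_employee, "JobSatisfaction")
```

A positive result means ex-employees rated the field higher on average. Averaging over all years can hide year-on-year shifts, so the same function could be applied to reviews filtered by `ReviewDate` year.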