### Effective Dashboarding

Humans are highly visual creatures, while data, even when nicely aggregated, is often captured in a very structured but not very visual way.

Dashboards exist to help convey data in a summarized manner.  There are three types of dashboards:  
📊 Analytical dashboards, which help the user identify patterns or trends  
📊 Operational dashboards, which give a summary of the current state of affairs  
📊 Strategic dashboards, which monitor certain KPIs  

Here, we'll create an analytical dashboard to study patterns and trends in employee attrition, based on HR data sourced from [Kaggle](https://www.kaggle.com/shaktikumarmishraa/Hr-employee-attrition/metadata).  A good employee is a valuable resource, and it is important for a company to know what could potentially cause a talented person to leave.

We'll create a cube from the underlying CSV and use atoti to create our dashboard.  We'll create a session and fix our port so our dashboard is always available from the same place, and we'll set up a place to store our content metadata so our dashboard is stored for future sessions.

<div style="text-align: center;" ><a href="https://www.atoti.io/?utm_source=gallery&utm_content=hr-dashboard" target="_blank" rel="noopener noreferrer"><img src="https://data.atoti.io/notebooks/banners/discover.png" alt="Try atoti"></a></div>

In [1]:
import atoti as tt

In [2]:
session = tt.create_session(
    config={
        "port": 9090,
        "user_content_storage": "./content",
    },
)

In [3]:
hr_attrition = session.read_csv(
    "s3://data.atoti.io/notebooks/hr-dashboard/HR-Employee-Attrition.csv",
    keys={"EmployeeNumber"}
)

In [4]:
cube = session.create_cube(hr_attrition)

Now that we have a cube created, we have two choices going forward.  We could create all our visualizations here within our notebook then publish them to the dashboard page, or we could let users go directly to our dashboarding webapp and create all the visuals there.

If we want to directly access our dashboard, either to start the story or continue from wherever it was last left off, we can do so by running our next cell to expose the link.

In [5]:
session.link()

Open the notebook in JupyterLab with the atoti extension enabled to see this link.

<img src="https://data.atoti.io/notebooks/hr-dashboard/sessionlink.gif" alt="Direct Dashboard Access" width="70%">

Suppose instead we wanted to create a few visualizations up front to help get started.  We can run the following cells to create a few visuals.

Since this is HR data related to attrition, it would be nice to have a few visuals summarizing what type of employee data we have.  For example: how many departments are covered?  What educational backgrounds are represented?  What is the gender ratio?

<img src="https://data.atoti.io/notebooks/hr-dashboard/publishtoapp.gif" alt="Publish to App" width="70%">

Now, some people argue against the use of pie charts, but if a pie chart communicates the data effectively, use it.  Here, the pie charts show gender composition.  We see immediately that there are more men than women in each department, with Sales being closest to equal.

💡 Know your audience.  If the dashboard consumer won't engage with a particular type of chart, or if the chart itself obscures the data, don't use it!

In [6]:
session.visualize("Employee breakdown by gender across departments")

When using bar charts, there are two primary choices for visualizing the data: showing the absolute ("true") counts, or showing each category as a share of 100%.

In the first example, the user can get a sense of the relative size of each department as well as its breakdown: there are more members in R&D than in Sales; Life Sciences graduates are a large component of both Sales and R&D; and the number of Life Sciences graduates in R&D nearly equals the total number of people in Sales.  However, a smaller department like HR can get lost in this view.

In [7]:
session.visualize("Educational background across departments overall")

In the second example, Sales, R&D, and HR are all equally in focus, and one can easily see the educational background for each department, but their size relative to each other is no longer visible.  The size of each department is instead broken out into a separate visualization.  While a user can derive the same information as in the first example, it may require more mental gymnastics.
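
The trade-off between the two views can be sketched in plain Python.  This is only an illustration: the headcounts below are made up rather than taken from the HR dataset, and `to_shares` is a hypothetical helper, not part of atoti.

```python
# Made-up headcounts per department and education field (NOT from the dataset).
headcount = {
    "Sales": {"Life Sciences": 150, "Medical": 90, "Other": 160},
    "R&D": {"Life Sciences": 400, "Medical": 300, "Other": 100},
    "HR": {"Life Sciences": 15, "Medical": 10, "Other": 25},
}


def to_shares(counts):
    """Convert absolute counts into percentages of the group total."""
    total = sum(counts.values())
    if total == 0:
        return {field: 0.0 for field in counts}
    return {field: 100.0 * n / total for field, n in counts.items()}


# View 1: absolute counts keep department size visible.
sizes = {dept: sum(fields.values()) for dept, fields in headcount.items()}

# View 2: percentage shares put every department on the same 0-100 scale,
# so the size information has to be shown separately.
shares = {dept: to_shares(fields) for dept, fields in headcount.items()}

for dept in headcount:
    rounded = {field: round(pct, 1) for field, pct in shares[dept].items()}
    print(dept, sizes[dept], rounded)
```

With these illustrative numbers, HR is a sliver in the absolute view, yet its shares are as easy to read as those of R&D once every bar is normalized.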

💡 Know your intent.  Too many dimensions in one visual may be confusing, but a story spread across too many visuals can make it hard to piece together.

In [8]:
session.visualize("Educational background across departments by %")

In [9]:
session.visualize("Employee distribution across departments")

We can also play with the colors used in a visualization, using colors to help group or classify data.  For example, if we want to see how tenure in each role tracks with overall age, we could choose to use a single color for all dots.  Or we could use colors to distinguish men from women, or one department from another.  We can see the rough trend, and also see whether any subcategory deviates from it.  Of course, if there are too many colors, the message can get overwhelming and hard to parse.  Also, some individuals may struggle to distinguish between certain colors due to color blindness or color vision deficiency.

💡 Be careful with color.  Colors can make a visual compelling, but too many colors can dilute the message.  Also, be aware of accessibility issues when using color-based delineations.
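
One way to act on this tip is to draw category colors from a small palette designed for color vision deficiency and to refuse to reuse colors.  The sketch below uses the Okabe-Ito palette, which is our choice for illustration and not something this notebook or atoti prescribes; `assign_colors` is a hypothetical helper.

```python
# Okabe-Ito palette: eight colors intended to remain distinguishable
# under common forms of color vision deficiency.
OKABE_ITO = [
    "#E69F00",  # orange
    "#56B4E9",  # sky blue
    "#009E73",  # bluish green
    "#F0E442",  # yellow
    "#0072B2",  # blue
    "#D55E00",  # vermillion
    "#CC79A7",  # reddish purple
    "#000000",  # black
]


def assign_colors(categories, palette=OKABE_ITO):
    """Map each distinct category to its own palette color.

    Raises ValueError instead of recycling colors, since a chart with more
    categories than colors is usually too busy to read anyway.
    """
    unique = sorted(set(categories))
    if len(unique) > len(palette):
        raise ValueError(
            f"{len(unique)} categories but only {len(palette)} colors; "
            "group or filter the categories first"
        )
    return dict(zip(unique, palette))


# Department labels as used in the text above.
department_colors = assign_colors(["Sales", "R&D", "HR", "Sales"])
print(department_colors)
```

The resulting hex codes could then be applied by hand in whichever charting tool is in use.  The hard limit of eight categories doubles as a guard against overloading a single visual.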

In [10]:
session.visualize("Average age vs tenure by job role")

In [11]:
session.visualize("Job roles per department")

Once we have a sense of the employee demographics, we can begin to study the data around which employees left and which ones have stayed.  This could be a second page in a dashboard, or a continuation of the same dashboard, depending on user preference.

💡 Avoid cluttering a dashboard.  A user should be able to get a sense of the information it is trying to convey in five seconds or less.
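
The attrition visuals that follow boil down to one ratio per group: employees who left divided by total headcount.  Here is a minimal sketch of that calculation in plain Python.  The rows are made up, and the `JobRole` and `Attrition` field names with a `"Yes"`/`"No"` encoding are assumptions about the CSV rather than something shown in this notebook.

```python
from collections import defaultdict

# Made-up records; field names and the "Yes"/"No" encoding are assumptions.
employees = [
    {"JobRole": "Sales Executive", "Attrition": "Yes"},
    {"JobRole": "Sales Executive", "Attrition": "No"},
    {"JobRole": "Sales Executive", "Attrition": "No"},
    {"JobRole": "Sales Executive", "Attrition": "No"},
    {"JobRole": "Research Scientist", "Attrition": "Yes"},
    {"JobRole": "Research Scientist", "Attrition": "No"},
    {"JobRole": "Manager", "Attrition": "No"},
]


def attrition_rate_by(records, group_field):
    """Return {group: leavers / headcount} for each value of group_field."""
    leavers = defaultdict(int)
    headcount = defaultdict(int)
    for row in records:
        group = row[group_field]
        headcount[group] += 1
        if row["Attrition"] == "Yes":
            leavers[group] += 1
    return {group: leavers[group] / headcount[group] for group in headcount}


rates = attrition_rate_by(employees, "JobRole")
print(rates)
```

Showing a rate rather than a raw count of leavers keeps a large group from looking worse simply because it has more people in it.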

In [12]:
session.visualize("Attrition rates by job titles")

In [13]:
session.visualize("Attrition based off travel frequency")

In [14]:
session.visualize("Tenure vs job satisfaction across role")

Beyond the impact of 'obvious' factors like job role, travel frequency, salary, and job satisfaction on attrition, what about more niche or less commonly considered reasons for attrition, like commute distance or number of previously held positions?  Should these be included in the dashboard?  If we include commute distance, it does not appear to have an appreciable impact.

💡 Arrange your information like an inverted pyramid, with far-reaching or important pieces of information on top, getting more granular or niche as you go.

In [15]:
cube.hierarchies["DistanceHome"] = [hr_attrition["DistanceFromHome"]]

In [16]:
session.visualize("Average distance from home for attrited employees")

In [17]:
session.visualize("Average distance from home for retained employees")
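
The comparison behind the two visuals above is a grouped average.  A minimal sketch in plain Python follows.  The rows are made up for illustration and say nothing about the real data; `DistanceFromHome` is the column used above, while the `Attrition` field and its `"Yes"`/`"No"` values are assumptions about the CSV.

```python
from statistics import mean

# Made-up rows for illustration only.
records = [
    {"Attrition": "Yes", "DistanceFromHome": 10},
    {"Attrition": "Yes", "DistanceFromHome": 6},
    {"Attrition": "No", "DistanceFromHome": 9},
    {"Attrition": "No", "DistanceFromHome": 5},
    {"Attrition": "No", "DistanceFromHome": 7},
]


def mean_distance(rows, attrition_flag):
    """Average commute distance for rows matching the given attrition flag."""
    distances = [r["DistanceFromHome"] for r in rows if r["Attrition"] == attrition_flag]
    return mean(distances) if distances else None


left = mean_distance(records, "Yes")
stayed = mean_distance(records, "No")
gap = left - stayed
print(f"left: {left}, stayed: {stayed}, gap: {gap}")
```

Putting the two averages side by side, rather than in separate visuals, makes it easier to judge whether the gap is large enough to earn a place on the dashboard.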

In [18]:
session.visualize("Tenure and years since last promotion based off attrition")

In [19]:
session.visualize("Tenure and years since last promotion for attrited employees")

In [20]:
session.visualize("Tenure and years since last promotion for retained employees")

In [21]:
cube.hierarchies["Tenure"] = [hr_attrition["YearsAtCompany"]]

In [22]:
session.visualize("Employee departure based off Tenure")

Once we have enough visuals to analyze the problem, we can assemble a dashboard.  From there, users can customize the dashboard to investigate further, or we can use this data to train a model to see if we can predict who will leave the company, though that is a topic for another notebook.
