# Week 2 - Visualize Results

You’ll consider the tradeoffs involved with building a BI visualization, and you’ll practice creating charts and visualizations. You’ll also explore effective ways to organize elements within a dashboard. Finally, you’ll identify factors that contribute to processing speed and how to maximize dashboard performance.

## Learning Objectives

- Understand how to gather requirements from stakeholders to build a dashboard.

- Identify obstacles and limitations that dashboards must overcome.

- Learn strategies for answering questions with the appropriate dashboard information.

- Understand a project’s scalability.

- Explain the difference between high granularity and high detail.
    
- Explore solutions if processing speed could be improved or made more efficient.
    
- Identify contributing factors to processing speed.
    
- Set privacy restrictions based on what's appropriate for internal/external availability.
    
- Translate business needs into dashboard parameters.

## Communicate Clearly With Visuals

### Designing Dashboards

Pre-aggregation is a term BI professionals use to describe the process of performing calculations on data while it's still in the database. This means reducing the number of rows or the size of the dataset before it's used in analysis or a dashboard. Think of it this way: if you pre-aggregate the data, it will be in a state that's closer to what you ultimately need, because some of the necessary calculations happen in the database before the data is sent to the data visualization tool. The trade-off is that your pipeline will involve more steps, but your users will get to the information they need more quickly.
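
As a minimal sketch of the idea, the example below uses Python's built-in `sqlite3` module as a stand-in for the database. The `sales` table, its columns, and the `sales_by_region` summary table are hypothetical names chosen for illustration, not part of any particular BI tool.

```python
import sqlite3

# Hypothetical raw data: one row per sale (region, store_size, amount).
raw_sales = [
    ("North", "large", 120.0),
    ("North", "small", 80.0),
    ("North", "small", 50.0),
    ("South", "large", 200.0),
    ("South", "small", 30.0),
    ("South", "large", 70.0),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, store_size TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", raw_sales)

# Pre-aggregate inside the database: one row per region instead of one per sale.
conn.execute(
    """
    CREATE TABLE sales_by_region AS
    SELECT region, SUM(amount) AS total_amount, COUNT(*) AS sale_count
    FROM sales
    GROUP BY region
    """
)

# The dashboard would read only the small summary table.
summary = conn.execute(
    "SELECT region, total_amount, sale_count FROM sales_by_region ORDER BY region"
).fetchall()
print(summary)
```

The extra `CREATE TABLE ... AS` step is the added pipeline work; in exchange, the visualization tool reads two rows instead of six.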

Also, note that pre-aggregation comes with another cost: pre-aggregated data is less flexible. Imagine that another stakeholder wants to represent sales data by store size. If you had already used pre-aggregation to combine the data by region, putting all stores together no matter their size, the dashboard could no longer break sales out by store size without going back to the unaggregated data.
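
The loss of flexibility can be illustrated with a small, hedged sketch in plain Python. The rows and the `aggregate` helper are hypothetical; the point is only that the region-level result no longer carries the store-size field.

```python
from collections import defaultdict

# Hypothetical raw rows: (region, store_size, amount).
raw_sales = [
    ("North", "large", 120.0),
    ("North", "small", 130.0),
    ("South", "large", 270.0),
    ("South", "small", 30.0),
]


def aggregate(rows, key_indexes):
    """Sum the amount (last field) grouped by the fields at key_indexes."""
    totals = defaultdict(float)
    for row in rows:
        key = tuple(row[i] for i in key_indexes)
        totals[key] += row[-1]
    return dict(totals)


# Pre-aggregated by region only: store_size is absent from the result.
by_region = aggregate(raw_sales, key_indexes=[0])

# Answering the store-size question requires the raw rows again.
by_store_size = aggregate(raw_sales, key_indexes=[1])

print(by_region)
print(by_store_size)
```

Nothing in `by_region` can be used to rebuild `by_store_size`, which is why the choice of aggregation level should follow the questions stakeholders are likely to ask.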

Project Scope | Dashboard Scope
-----------|-------------
Refers to the overall project goals, resources, deliverables, deadlines, collaborators, and stakeholders.| Refers to the breadth of what a dashboard is tracking, including the amount of time and how many metrics it includes.
Determined by team leadership including project sponsors and managers.| Determined by BI teams as they consider project and user requirements.
Outlined at the very beginning of a project to determine the overarching aspects of the project. | Outlined as part of the dashboard creation process based on the specific reporting needs.
Involves working with key sponsors and stakeholders to better understand and align on the entire project and its goals. | Involves choosing KPIs, how much time should be represented, and how to make important data available and understandable to decision makers through the dashboard.

### How To Choose a Chart

Dimensions are qualitative data types that can be used to categorize data. Measures are quantitative data types that can be either discrete or continuous, and encoding is the act of translating dimensions and measures into visualizations. 

Dimensions are inherently qualitative data. This means that they are descriptive, often subjective, records of a quality or characteristic rather than a numerical count. For example:

- Customer names

- Product names

- Geographic locations

- Observations

- Interviews

- Reviews

These examples are descriptive; they indicate characteristics of the data that aren’t necessarily represented by numerical data.

Measures, on the other hand, are quantitative. Measures are what you will use to actually count the data and track changes over time. This data can be discrete or continuous: discrete measures take a limited set of separate values (such as counts), while continuous measures can take any value within a range. For example:

- Temperature

- Revenue

- Distance

- Weight

- Time
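
One way to picture how dimensions and measures work together is the sketch below: a dimension supplies the categories, and a measure is summarized within each category. The records, field names, and the `summarize` helper are hypothetical and exist only for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: "product" and "city" are dimensions;
# "revenue" and "weight_kg" are measures.
records = [
    {"product": "Tea", "city": "Lima", "revenue": 40.0, "weight_kg": 1.5},
    {"product": "Tea", "city": "Oslo", "revenue": 60.0, "weight_kg": 2.5},
    {"product": "Coffee", "city": "Lima", "revenue": 100.0, "weight_kg": 3.5},
    {"product": "Coffee", "city": "Oslo", "revenue": 50.0, "weight_kg": 0.5},
]


def summarize(rows, dimension, measure, func):
    """Group rows by a dimension and apply func to the measure in each group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[dimension]].append(row[measure])
    return {category: func(values) for category, values in groups.items()}


revenue_by_product = summarize(records, "product", "revenue", sum)
avg_weight_by_city = summarize(records, "city", "weight_kg", mean)

print(revenue_by_product)
print(avg_weight_by_city)
```

Swapping the dimension changes how the data is categorized, while swapping the measure or the function changes what is counted.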

As you have been learning, encoding is the act of translating the information represented by your dimensions and measures into visualizations. The artistic elements you choose communicate things about your data:

- Line: Lines in visualizations can be curved or straight; thick or thin; vertical, horizontal, or diagonal. They add visual form to your data and help build the structure for your visualization.

- Shape: Shapes are a great way to add eye-catching contrast—especially size contrast—to your data story.

- Color: Color can help differentiate elements of a visualization and communicate insights.

- Space: Space is the area between, around, and in objects. There should always be space in data visualizations so that the visualization isn’t too cluttered.

- Movement: Movement is used to create a sense of flow or action in a visualization.

## Considerations When Laying Out a Dashboard

### Processing Speed and Dashboard Elements

You've probably figured out that processing speed describes how quickly a program can update and load a specific amount of data. If the load is too high, then processing speeds will be slow, and the tool might even crash. This can make it difficult or frustrating to work with the dashboard. The greatest contributors to high loads and slow processing speeds are the volume of data and the number of measures and dimensions included.

As a rule, you should begin broadly, then narrow your scope. In other words, identify the priority KPIs, then consider and refine supporting information. Along the way, you might find that a metric you were originally asked to track is no longer relevant to your stakeholders' business question. In this case, you can remove it from your dashboard, which will help things move more quickly. In addition, you can optimize processing speeds by performing calculations in your database rather than in the dashboard. This enhances dashboard efficiency because back-end servers are typically more powerful than front-end servers and can process more data faster.

You can also preload data so that there is less strain on the dashboard. But keep in mind that preloading may mean that the insights aren't as current as they could be. Other strategies for speeding up your dashboard include filtering data early on and pre-aggregation.

One of the primary ways you can work to optimize your processing speed is by reducing the processing load. You can do this by:

- Pre-aggregating: This is the process of performing calculations on data while it is still in the database. Pre-aggregating data will transform data into a state that’s closer to what you ultimately need because some necessary calculations will happen before the data is sent to the data visualization tool. The trade-off is that your pipeline will involve more steps and the dataset uploaded into the visualization tool will be less flexible, but your users will get the information they need more quickly.

- Using JOINs: JOINs are used to combine rows from two or more tables based on a related column. This merges tables together before they’re ever used in the dashboard, which can save a lot of processing load in the dashboard itself. However, joining full tables can put more of a burden on the system because of the size of the tables. For example, joining a one-million-row table with a 100-million-row table will most likely generate a lot of overhead every time the dashboard is updated. So it’s important to think carefully about how you use JOINs to reduce processing load.

- Filtering: Filtering is the process of showing only the data that meets specified criteria while hiding the rest. Filtering the data early in your dashboard’s processing means that it doesn’t have to sort through data that isn’t actually going to be used. The trade-off is that less data is available for your users to view on their own.

- Linking to external locations: If some of the context for your dashboard’s data can live outside the dashboard, you can link out to that location for users to explore on their own, which cuts down on the processing load.

- Avoiding user-defined functions: Users making requests of your dashboard can add a lot of load to the processing work it’s doing. Consider the kinds of questions that users might have when designing the dashboard so that you can address them without the users themselves having to input functions repeatedly.

- Deciding between data views and tables: Tables contain actual data. Data views are the result of a stored data query that preserves business logic and can be queried like a database. Data views often require much less processing load because they don’t contain actual data, just a view of the data. This makes them less flexible, so you’ll want to consider how interactive you need the data in your dashboard to be.
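
Several of these strategies can be combined in one place. The sketch below, again using Python's built-in `sqlite3` module as a stand-in for the database, defines a data view that joins two tables, filters early, and aggregates before anything reaches the dashboard. The table names, columns, and the `sales_2023_by_region` view are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE stores (store_id INTEGER PRIMARY KEY, region TEXT);
    CREATE TABLE sales (store_id INTEGER, sale_year INTEGER, amount REAL);

    INSERT INTO stores VALUES (1, 'North'), (2, 'South');
    INSERT INTO sales VALUES
        (1, 2022, 10.0), (1, 2023, 20.0), (1, 2023, 5.0),
        (2, 2022, 40.0), (2, 2023, 15.0);

    -- A view stores the query, not the data: the JOIN, the early filter,
    -- and the aggregation all run in the database when the view is read.
    CREATE VIEW sales_2023_by_region AS
    SELECT st.region, SUM(sa.amount) AS total_amount
    FROM sales AS sa
    JOIN stores AS st ON st.store_id = sa.store_id
    WHERE sa.sale_year = 2023
    GROUP BY st.region;
    """
)

# The dashboard queries the view like a table and receives two rows.
rows = conn.execute(
    "SELECT region, total_amount FROM sales_2023_by_region ORDER BY region"
).fetchall()
print(rows)
```

Because the 2022 rows are filtered out inside the view, users of this view cannot explore them; that is the flexibility cost described in the list above.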


### Privacy 

There are several types of privacy permissions, but we're going to focus on three main levels: public availability, object-level permission, and row-level permission.

If your dashboard is publicly available, it's accessible to anyone. Use this unrestricted setting to share a dashboard with the general public.

The next is object-level permission. This privacy setting controls the availability of a single item, such as a table, dataset, or single visualization. You'll probably employ object-level permissions the most, due to their simplicity. If you give a user access to an object, revoking that access is as easy as removing their permission.

Row-level permission is a privacy setting that controls the availability of specific rows of a table or dataset. This type of privacy setting is a bit more complex, because it must be set up in the database rather than the visualization tool.
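
The three levels can be sketched as a toy model in Python. This is only an illustration of the logic: the user names, the `OBJECT_GRANTS` and `ROW_GRANTS` structures, and the `visible_rows` function are all hypothetical, and in practice row-level rules would be configured in the database as described above.

```python
# Toy model only: real tools and databases implement this with their own settings.
ROWS = [
    {"region": "North", "revenue": 250.0},
    {"region": "South", "revenue": 300.0},
]

# Hypothetical grants: which objects each user may open,
# and which regions' rows each user may see.
OBJECT_GRANTS = {"ana": {"sales_table"}, "ben": {"sales_table"}}
ROW_GRANTS = {"ana": {"North", "South"}, "ben": {"South"}}


def visible_rows(user, obj="sales_table", public=False):
    """Return the rows a user can see under the three permission levels."""
    if public:
        # Public availability: anyone can see everything.
        return list(ROWS)
    if obj not in OBJECT_GRANTS.get(user, set()):
        # Object-level permission: no grant means no access to the item.
        return []
    # Row-level permission: only rows in the user's allowed regions.
    allowed = ROW_GRANTS.get(user, set())
    return [row for row in ROWS if row["region"] in allowed]


ana_rows = visible_rows("ana")
ben_rows = visible_rows("ben")
guest_rows = visible_rows("guest")
public_rows = visible_rows("guest", public=True)

# Revoking object-level access is just removing the grant.
OBJECT_GRANTS["ben"].discard("sales_table")
ben_rows_after_revoke = visible_rows("ben")
```

Here `ben` sees only the South row until the object-level grant is removed, after which `ben` sees nothing, which mirrors how simple object-level permissions are to revoke.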



## Glossary

Dimension (visualization): A qualitative data type that can be used to categorize data

Encoding: The process of translating dimensions and measures into visual representations of the data

Measure: A quantitative data type that can be either discrete or continuous

Object-level permission: A privacy setting that controls the availability of a single item in a dashboard

Pre-aggregation: The process of performing calculations on data while it is still in the database

Processing speed: How quickly a program can update and load a specified amount of data 

Public availability: A privacy setting that allows anyone to access a dashboard

Row-level permission: A privacy setting that controls the availability of specific rows of a table or dataset in a dashboard

Trade-off: Balancing various factors, often by prioritizing one element while sacrificing another, in order to arrive at the best possible result