# Lab 4 - Visualization Design Principles

In this lab, we cover the visualization design principles set forth by Edward Tufte, including graphical integrity, lie factor, data-ink ratio, data density, and chart junk, and briefly introduce the visualization analysis that will be covered in the next module. For this lab, we will refer to the slides in the file [L4_DataViz_Principles.pdf](L4_DataViz_Principles.pdf).


## Graphical Excellence and Design Principles 
Graphical excellence is the well-designed presentation of interesting data. It's a matter of
substance, of statistics, and of design. Graphical excellence consists of complex ideas
communicated with clarity, precision, and efficiency.  

Graphical excellence gives to the viewer the **greatest number of ideas** in the **shortest time** with the **least ink** in the **smallest space**. Graphical excellence is nearly always multivariate and requires telling the truth about the data. 

 * Tell the truth: **Graphical Integrity** 

 * Do it effectively, with clarity and precision: **Design Aesthetics**

### Graphical Integrity

Graphical integrity is simply telling the truth about the data. **The representation of numbers, as physically measured on the surface of the graphic itself, should be directly proportional to the numerical quantities represented.**


Clear, detailed, and thorough labeling should be used to defeat graphical distortion and ambiguity.
Data variation, not design variation, should be shown. The number of information-carrying (variable) dimensions depicted should not exceed the number of dimensions in the data. Graphics must not quote data out of context.


####  Missing Scales, Scale Distortions 

The data has to be shown on **accurate scales** and within context. The following image has no scale or axis, so comparisons between the bars can be misleading.

<img src="../images/missing.png">

In the following plots, the data is shown with axes that do **not** start at zero, which exaggerates the differences between values.

<img src="../images/dist1.png">
<img src="../images/dist2.png">
<img src="../images/dist3.png">

**Also here:**

<img src="../images/int2.png">
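
The effect of a truncated axis can be sketched numerically. The snippet below is only an illustration: the function name `apparent_ratio` and the two values are hypothetical and are not taken from the charts above. It compares how much taller one bar looks than another when the axis starts at zero and when it starts just below the smaller value.

```python
def apparent_ratio(smaller, larger, axis_start=0.0):
    """Ratio of the drawn bar heights when the value axis begins at axis_start."""
    if axis_start >= smaller:
        raise ValueError("axis_start must be below the smaller value")
    return (larger - axis_start) / (smaller - axis_start)


# Hypothetical values, chosen only for illustration.
smaller, larger = 50.0, 55.0

honest = apparent_ratio(smaller, larger)                     # axis starts at zero
truncated = apparent_ratio(smaller, larger, axis_start=45.0)  # axis starts at 45

print(honest)     # 1.1
print(truncated)  # 2.0
```

With these numbers, a 10% difference in the data is drawn as a bar that is twice as tall.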

In the following chart, five different vertical scales show price and two different horizontal scales show time.

<img src="../images/scale2.png">

When using part-whole relationships, the parts should add up to 100% and be proportional. 


<img src="../images/int3.png">
<img src="../images/int4.png">
<img src="../images/int5.png">
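
A quick sanity check for part-whole charts can be sketched as follows. This is a minimal illustration, not a prescribed method: the function names, the rounding tolerance of 0.5 percentage points, and the sample shares are all assumptions.

```python
import math


def shares_sum_to_whole(percentages, tolerance=0.5):
    """Return True if the parts add up to 100% within a rounding tolerance (assumed)."""
    return math.isclose(sum(percentages), 100.0, abs_tol=tolerance)


def slice_angles(percentages):
    """Angle in degrees for each slice, proportional to its share of the whole."""
    return [p * 360 / 100 for p in percentages]


# Hypothetical shares, chosen only for illustration.
consistent = [45, 30, 25]
inconsistent = [60, 63, 70]

print(shares_sum_to_whole(consistent))    # True
print(shares_sum_to_whole(inconsistent))  # False
print(slice_angles(consistent))           # [162.0, 108.0, 90.0]
```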

**Pie chart problems:** Pie charts can be distorted in 3D; perspective skews the relative size and shape of the slices.
It is perceptually harder to judge area than length, so pie charts make quantities harder to compare, and the ranked order of values is difficult to see. 3D effects should be avoided; better yet, a bar chart should be used for easier comparison.

<img src="../images/pie1.png">
<img src="../images/pie2.png">
<img src="../images/pie3.png">

---


#### Lie Factor 

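The representation of numbers, as measured on the graphic, should be proportional to the numerical quantities represented.
The lie factor is the ratio of the size of an effect shown in the graphic to the size of that effect in the data. A lie factor of 1 means the graphic shows the effect without distortion.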

<img src="../images/lie1.png">

In the above graphic, the lengths of the line elements are exaggerated relative to the data. The lie factor can be computed as:

Lie factor = [(5.3-0.6)/0.6] / [(27.5-18)/18] = 14.8
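
The computation above can be reproduced with a small helper. This is a minimal sketch: the function name `lie_factor` and its parameter names are illustrative, and the numbers are the ones from the formula above.

```python
def lie_factor(graphic_start, graphic_end, data_start, data_end):
    """Relative change drawn in the graphic divided by the relative change in the data."""
    effect_in_graphic = (graphic_end - graphic_start) / graphic_start
    effect_in_data = (data_end - data_start) / data_start
    return effect_in_graphic / effect_in_data


# Sizes measured on the graphic: 0.6 -> 5.3; values in the data: 18 -> 27.5
print(round(lie_factor(0.6, 5.3, 18, 27.5), 1))  # 14.8
```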

Another example shows 30% vs 36% of internet traffic:

<img src="../images/lie2.png">

And yet another shows 35% tax rate vs. 39.6%:

<img src="../images/lie3.png">

In this example, movie ticket and popcorn prices are compared:

<img src="../images/lie4.png">

A movie ticket in 1929 was \\$4.32, a ticket in 2009 was \\$7.20 (66% increase).
Popcorn in 1929 was \\$0.62, popcorn in 2009 was \\$4.75  (666% increase).

In the graphic, the increase in cost is encoded as height, but the area of the images plays a larger role in human perception. The graphic shows a 166% increase for the ticket and a 4500% increase for the popcorn.
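
Taking the percentages quoted above at face value, the lie factor of each image pair can be estimated as the increase shown in the graphic divided by the increase in the data. The helper names below are illustrative, and the result is only as accurate as the quoted 166% and 4500% figures.

```python
def percent_increase(old, new):
    """Percentage increase from old to new."""
    return (new - old) / old * 100


def lie_factor_from_percent(shown_pct, actual_pct):
    """Increase shown in the graphic divided by the increase in the data."""
    return shown_pct / actual_pct


ticket_actual = percent_increase(4.32, 7.20)   # about 66.7%
popcorn_actual = percent_increase(0.62, 4.75)  # about 666%

print(round(lie_factor_from_percent(166, ticket_actual), 2))    # 2.49
print(round(lie_factor_from_percent(4500, popcorn_actual), 2))  # 6.76
```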

---


### Design Aesthetics
Design aesthetics are a set of principles to help guide designers in arriving at a visually pleasing result that properly conveys the data.
 

####  Maximize data-ink ratio
The data-ink ratio is the proportion of a graphic's ink devoted to the non-redundant display of data. Non-data ink should be erased within reason. Redundant data ink should also be erased.

Data-ink ratio = Data ink / Total ink used in graphic
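
As a rough illustration of the ratio, ink can be approximated by counting the pixels a chart draws. The sketch below assumes such counts are already available; the function name and the pixel counts are hypothetical.

```python
def data_ink_ratio(data_ink, total_ink):
    """Share of the graphic's ink that encodes data (1.0 means every mark is data)."""
    if total_ink <= 0:
        raise ValueError("total_ink must be positive")
    if not 0 <= data_ink <= total_ink:
        raise ValueError("data_ink must be between 0 and total_ink")
    return data_ink / total_ink


# Hypothetical pixel counts for the same chart before and after cleanup.
before = data_ink_ratio(data_ink=1200, total_ink=4800)  # heavy grid, borders, fills
after = data_ink_ratio(data_ink=1200, total_ink=1500)   # most non-data ink erased

print(before)  # 0.25
print(after)   # 0.8
```

The data ink is unchanged between the two versions; only the non-data ink was removed, which is what raises the ratio.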


Here are two examples: 

<img src="../images/dink.png">

<img src="../images/dink2.png">

3D is almost always useless: it is non-data ink, and it also introduces distortion, so it should be avoided. In general, all unnecessary graphical elements should be avoided.

**BAD:**
<img src="../images/dink3.png">

**BETTER:**

<img src="../images/dink4.png">

Also:

<img src="../images/dink5.png">


####  Maximize Data Density	
Maximize data density and the size of the data matrix within reason. Sparklines are good examples of increasing data density. Using small multiples also increases data density.

Data density = Number of entries in data matrix / Area of data graphic
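
The formula can be applied directly once the size of the data matrix and the printed size of the graphic are known. The following is a small sketch; the function name, the matrix shape, and the figure dimensions are hypothetical.

```python
def data_density(rows, columns, width, height):
    """Entries in the data matrix per unit area of the graphic."""
    area = width * height
    if area <= 0:
        raise ValueError("the graphic must have a positive area")
    return (rows * columns) / area


# Hypothetical example: 12 months x 8 series drawn in a 6 x 4 inch figure.
print(data_density(rows=12, columns=8, width=6, height=4))  # 4.0 entries per square inch
```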

<img src="../images/dense1.png">

<img src="../images/dense2.png">


####  Small Multiples
**Small multiples** are series of similar plots using the same scale and axes, allowing them to be easily compared. They use multiple views to show different partitions/facets of a dataset. Small multiple designs visually enforce comparisons of changes.

<img src="../images/smallmultiples1.png">

<img src="../images/smallmultiples2.png">
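
Because small multiples rely on every panel using the same scale, one practical step is to compute a single axis range from all facets before drawing any of them. The sketch below shows that step only; the function name, the `padding` parameter, and the facet data are hypothetical.

```python
def shared_axis_limits(panels, padding=0.05):
    """Compute one (low, high) range that every panel in the grid can share."""
    all_values = [value for series in panels.values() for value in series]
    low, high = min(all_values), max(all_values)
    margin = (high - low) * padding  # optional breathing room around the data
    return low - margin, high + margin


# Hypothetical facets of one dataset, one series per panel.
panels = {
    "North": [3, 5, 4],
    "South": [10, 12, 11],
    "West": [6, 7, 9],
}

low, high = shared_axis_limits(panels, padding=0.0)
print(low, high)  # 3.0 12.0
```

Applying the same limits to each panel keeps a given bar or line height meaning the same value everywhere in the grid.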

####  Avoid Chart Junk
Chart junk is any extraneous visual element that detracts from the message.

<img src="../images/junk1.png">
<img src="../images/junk2.png">
<img src="../images/junk3.png">
<img src="../images/junk4.png">
<img src="../images/junk5.png">



####  Multifunctioning Elements
"Mobilize every graphical element, perhaps several times over, to show the data".
In other words, try to make every graphical element on the page encode data.


####  Macro/Micro
Provide the user with both views (overview + detail). A carefully designed view can show a macro structure (overview) as well as a micro structure (detail) in one space.

####  Utilize Layering and Separation: supported by Gestalt laws (ideal for maps)
Layering and separation means using color or other differentiation to separate important classes of information. Maps are often very good examples of this technique. Both grouping by color and separation by color should be employed.

<img src="../images/sign1.png">
<img src="../images/sign2.png">

---


### Additional Principles

 * Above all else, show the data: the focus should be on the content of the data, not the visualization technique.
 * Show the context (both for integrity and as a design principle)
 * Avoid separate legends and keys; just have that information in the graphic.
 * Make grids, labeling, etc. very faint so that they recede into the background.
 * Use color effectively.
 * Revise and edit. 
 * CRAP: 
    * Contrast

    * Repetition (Repeat some aspect of the design throughout the entire piece.)

    * Alignment (Nothing should be placed on the page arbitrarily. Every item should have a visual connection with something else.)
     
    <img src="../images/align.png">
    
    * Proximity (Group related items together as physical closeness implies a relationship.)
 
 * The number of dimensions of a graphical element should be less than or equal to the number of dimensions of the data it encodes. Example: do not use area to encode a simple numeric attribute.

**BAD:**

<img src="../images/area.png">


**BETTER:**

<img src="../images/area2.png">
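
The problem with encoding a single number as area can be shown with a little arithmetic. In this sketch, the function names and sample values are hypothetical, and it assumes the value is mapped to the radius of a circle, so the drawn area grows with the square of the value.

```python
def length_ratio(value_a, value_b):
    """How much longer bar b is than bar a when length encodes the value."""
    return value_b / value_a


def area_ratio_when_radius_encodes_value(value_a, value_b):
    """How much larger circle b is than circle a when the radius encodes the value."""
    return (value_b / value_a) ** 2


# Hypothetical values: the second is twice the first.
print(length_ratio(10, 20))                          # 2.0
print(area_ratio_when_radius_encodes_value(10, 20))  # 4.0
```

A value that doubles is drawn with four times the ink, which is why a bar, whose length is one-dimensional, is the safer encoding for a single numeric attribute.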


##### Subjective Dimensions
 * Aesthetics: Attractive things are perceived as more useful than unattractive ones.
 * Style: Communicates brand, process, who the designer is.
 * Playfulness: Encourages experimentation and exploration.
 * Vividness: Can make a visualization more memorable.

---

### Guides for Enhancing Visual Quality

Attractive visualizations:

 * have a properly chosen format and design

 * use words, numbers, and drawing together

 * reflect a balance, a proportion, a sense of relevant scale

 * display an accessible complexity of detail

 * often have a narrative quality, a story to tell about the data

 * are drawn in a professional manner, with the technical details of production done with care

 * avoid content-free decoration, including chart junk

 * induce the viewer to think about the substance, rather than about methodology, graphic design, [or] the technology of graphic productions
 
 * encourage the eye to *compare* different pieces of data
 
 * reveal the data at several levels of detail

---
    
    
### Analysis Questions:

 * Who is the intended audience?
 
 * What information does this visualization represent?
 
 * How many data dimensions does it encode?
 
 * List several tasks, comparisons or evaluations it enables
 
 * What principles of excellence best describe why it is good / bad?
 
 * Can you suggest any improvements?
 
 * Why do you like / dislike this visualization?