# Lab 2 - Color and Preattentive Processing

This is the second lab, in which we cover the fundamentals of preattentive vision and color perception. We start with preattentive processing and then look at color. For this lab, we will refer to the slides in the files [L2_DataViz_Preattentive.pdf](L2_DataViz_Preattentive.pdf) and [L2_DataViz_Color.pdf](L2_DataViz_Color.pdf).

## Visual Perception

**Understanding human visual perception is the key to creating efficient visualizations**.

Whether a feature is perceived rapidly (**preattentive processing**) or requires **conscious processing** affects the success of the visualization and its storytelling.

A small set of basic visual properties is processed preattentively; **these include shape, size, color, motion, orientation, and spatial grouping.**

Refer to **slide #3** in [L2_DataViz_Preattentive.pdf](L2_DataViz_Preattentive.pdf) for an example of visual properties that *pop out*; that is, we notice them immediately.


<img src="../images/preatt.png">

It is easy for humans to notice these basic properties **unless** they are combined, as in the example on the bottom right, in which case we have to **consciously process the visual input in order to distinguish features.**

Efficient visualizations use these basic properties to encode data types with the correct **visual variables** as we will see in the following modules. 

One of the important aspects of human visual perception is **perceptual grouping**. Humans tend to group visual items by using a number of grouping principles:

* **Grouping by visual proximity**: Items that are spatially close to each other are grouped together. We group items not only by their proximity but also by their density. (**slide #4** in [L2_DataViz_Preattentive.pdf](L2_DataViz_Preattentive.pdf))

* **Grouping by similarity**: Items that look similar are grouped together. This similarity can be shape, color, or size. (**slide #5** in [L2_DataViz_Preattentive.pdf](L2_DataViz_Preattentive.pdf))

* **Grouping by connectedness**: Connectedness is a more powerful organizing principle than proximity, color, size, or shape. Items that are connected to each other by visual elements (such as lines) are grouped together. (**slide #6** in [L2_DataViz_Preattentive.pdf](L2_DataViz_Preattentive.pdf))

* **Grouping by area**: Smaller components of a pattern tend to be perceived as an object, whereas larger components are perceived as the background. (**slide #7** in [L2_DataViz_Preattentive.pdf](L2_DataViz_Preattentive.pdf))

* **Figure/Ground**: Humans tend to give meaning to recognizable patterns and assume them to be foreground objects (figure) as opposed to background (ground). Equally balanced cues for figure and ground can result in a bistable perception. (**slide #8** in [L2_DataViz_Preattentive.pdf](L2_DataViz_Preattentive.pdf))

These grouping principles and the basic visual properties that are perceived preattentively should be considered while creating visualizations. The attention of the user has to be actively directed towards the visual items that carry the message of the visualization, without distracting the user with unnecessary visual stimuli.

We should design visualizations that

* are efficiently perceivable

* are quick to give a message

* are unambiguous

* use visual variables that are easy to discriminate

Color is one of the most important visual properties that the human visual system processes preattentively. Next, we will look at color perception and the choice of color schemes for visualization.

---


# Color Perception

Humans are trichromats: **they have three types of cone cells** (photoreceptors) in the retina that allow them to perceive about 10 million colors. The three types of cone cells (L, M, S) are most sensitive to long, medium, and short wavelengths, which roughly correspond to red, green, and blue. **Cones are less sensitive to light than rod cells**, the other type of photoreceptor. Rod cells let humans see in dim light (night vision) but play little role in color vision, which is why humans cannot see colors well in low-light conditions.


## Color Representation

Colors can be represented in a number of different ways by choosing a color space and the coordinates of a color in that space. The most common representation is the **RGB color space**, where a color is represented by **a triple (R,G,B)** whose components usually range over [0,255].

(0,0,0) is black, (255,255,255) is white. 
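As a small illustration of the triple representation, the sketch below uses base R's `rgb()` and `col2rgb()` to move between (R,G,B) triples and the hexadecimal strings that R plotting functions accept. The variable names are only for illustration.

```r
# Build colors from (R, G, B) triples on the 0-255 scale.
black  <- rgb(0, 0, 0, maxColorValue = 255)
white  <- rgb(255, 255, 255, maxColorValue = 255)
red    <- rgb(255, 0, 0, maxColorValue = 255)

# Additive mixing: full red plus full green gives yellow.
yellow <- rgb(255, 255, 0, maxColorValue = 255)

# Each color is returned as a hexadecimal string of the form "#RRGGBB".
c(black = black, white = white, red = red, yellow = yellow)

# Convert color names or hex strings back to their RGB triples
# (one column per color, rows named red, green, blue).
col2rgb(c("orange", red))
```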

RGB is an additive color space. There are other representations such as HSL (hue, saturation, lightness), HSV (hue, saturation, value), and HCL (hue, chroma, luminance).

<img src="../images/rgb.gif"> <img src="../images/rgb_cube.png">

HSV and HSL are cylindrical transformations of RGB. Both describe a color by hue and saturation plus a third coordinate, value in HSV and lightness in HSL, and the two models define saturation differently. In HSL, a color of maximum lightness is always white, regardless of hue and saturation. In HSV, a color of maximum value is the most intense color available for the given hue and saturation (so pure red is a red hue with maximum saturation and value).

<img src="../images/hsl_hsv_models.png">

HSV and HSL suffer from RGB's **lack of perceptual uniformity**, so changing one dimension can result in apparent changes in other dimensions. For example, pure green and pure blue have the same saturation and lightness/value, but green appears to be a much lighter color. Gradients interpolated in the HSV or HSL color space are particularly prone to problems like this when they shift between many different hues.
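The green versus blue example can be checked numerically. The sketch below, which uses only base R (`rgb2hsv()` and `convertColor()` from grDevices), compares the HSV coordinates of the two colors with their CIE L\*a\*b\* lightness, taken here as a stand-in for perceived lightness.

```r
# Pure green and pure blue as columns of an RGB matrix (0-255 scale).
rgb_cols <- cbind(green = c(0, 255, 0), blue = c(0, 0, 255))

# HSV coordinates: rows are h, s, v. Both colors have s = 1 and v = 1.
hsv_coords <- rgb2hsv(rgb_cols, maxColorValue = 255)
hsv_coords

# CIE L*a*b* coordinates. convertColor() expects one color per row,
# with sRGB values scaled to [0, 1]; the first column is lightness L*.
lab_coords <- convertColor(t(rgb_cols) / 255, from = "sRGB", to = "Lab")
lab_coords[, 1]
```

Although the saturation and value entries are identical, the lightness of green comes out far higher than that of blue, which matches what the eye sees.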

**HCL is designed to address deficiencies in these models.** Saturation in HSV and HSL is intended to measure the intensity of colorfulness, but colors with the same HSL or HSV saturation can appear to the eye to have different intensities. HCL is designed around how the human eye perceives color and is described as **perceptually uniform**: equal steps in its coordinates correspond to roughly equal perceived changes.
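To make the difference concrete, here is a minimal sketch that builds the same six hues once with base R's `hsv()` and once with `hcl()`, then measures the lightness of each result. The helper `lightness()` and the chosen chroma and luminance values (50 and 70) are assumptions made for this illustration.

```r
hues <- seq(0, 300, by = 60)

# Same six hues: fixed saturation/value in HSV, fixed chroma/luminance in HCL.
hsv_cols <- hsv(h = hues / 360, s = 1, v = 1)
hcl_cols <- hcl(h = hues, c = 50, l = 70)

# Helper (hypothetical name): CIE L* lightness of a vector of colors,
# on a scale from 0 (black) to 100 (white).
lightness <- function(cols) {
  convertColor(t(col2rgb(cols)) / 255, from = "sRGB", to = "Luv")[, 1]
}

round(lightness(hsv_cols))  # varies widely from hue to hue
round(lightness(hcl_cols))  # stays close to the requested luminance
```

With HCL, changing the hue leaves the lightness essentially untouched, which is the property that makes HCL-based palettes easier to control.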

<img src="../images/rgb_rainbow.png">
<img src="../images/hcl_rainbow.png">

---

## Perceptual Distortions in Color


* **Simultaneous contrast** 

Two colors, side by side, interact with one another and change our perception accordingly. The effect of this interaction is called simultaneous contrast. Since we rarely see colors in isolation, simultaneous contrast affects our sense of the color that we see. Refer to **slide #5** in [L2_DataViz_Color.pdf](L2_DataViz_Color.pdf) to see examples of this perceptual distortion.


* **Contrast Sensitivity** 

Human visual perception is more sensitive to changes in luminance than to changes in hue. We can perceive small differences in luminance (e.g. in grayscale images) better than small differences in hue. Refer to **slide #6** in [L2_DataViz_Color.pdf](L2_DataViz_Color.pdf) to see examples of this phenomenon.

* **Color Size illusion** 

Colors affect the perceived size and weight of shapes. For example, red has the highest visual weight and yellow the least; coloring a shape red can make it appear bigger or more prominent. Refer to **slide #8** in [L2_DataViz_Color.pdf](L2_DataViz_Color.pdf) to see examples of this phenomenon.

Take a look at this [list of optical illusions](https://en.wikipedia.org/wiki/List_of_optical_illusions) that contains color illusions as well as other visual illusions for more examples. 


## Color Blindness 

There are different types of color blindness depending on the type of affected cones in the retina or shifting of their peak sensitivity. 

If the vision is based on only **two types of cones**, it is known as **dichromacy.** 

**About 7% of the male population has some form of color blindness, so choosing colorblind-safe palettes is an important aspect of visualization**.

People often mistakenly think of color blindness as seeing the world in black and white, but total color blindness (monochromacy) is very **rare**.

The three types of dichromacy, **Protanopia, Deuteranopia, and Tritanopia**, correspond to the absence of functional L-cones, M-cones, and S-cones, respectively.

Refer to **slides #9 and #10** in [L2_DataViz_Color.pdf](L2_DataViz_Color.pdf) to see examples of color perception in color blindness. 
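In practice, one way to find colorblind-safe palettes in R is through the RColorBrewer package, which is used throughout the rest of this lab. The sketch below relies on the `colorblind` column of its `brewer.pal.info` table and on the `colorblindFriendly` argument of `display.brewer.all()`; the variable name `safe` is only for illustration.

```r
library(RColorBrewer)

# brewer.pal.info has one row per palette; the logical `colorblind`
# column flags the palettes that Color Brewer marks as colorblind safe.
safe <- brewer.pal.info[brewer.pal.info$colorblind, ]
rownames(safe)

# Plot only those palettes.
display.brewer.all(colorblindFriendly = TRUE)
```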

---


## Color Maps

A **color map or a color palette** is an ordered collection of colors represented by numeric triples to be used in a particular visualization. 

The following are examples of color maps. The top left is known as the rainbow color map. It is one of the best-known color maps, despite a number of problems it creates in visual perception, as we will see later.

The top right is a grayscale color map. As we discussed, the human visual system is more sensitive to grayscale differences, but this color map can diminish the visual perception of higher values, which correspond to the whiter end of the color map.

Bottom left is known as the HSL color map (not to be mistaken for the color space), and bottom right is a color palette obtained by changing both luminance and hue components. 

<img src="../images/maps.png">

In this particular example, **worst to best color maps are rainbow, luminance (grayscale), HSL, and the bottom right color map** which has enough detail to show the subtle changes in the surface of the plot, and helps to distinguish peaks easily without creating artificial gradients. 

**So why is the rainbow color map bad?**

Refer to **slide #13** in [L2_DataViz_Color.pdf](L2_DataViz_Color.pdf) to examine the effects of the rainbow color map.

<img src="../images/rainbow.png">

 - First, it **can't be used for sequential** data that has an inherent order, because its colors do not lend themselves to a natural order.

 - Second, it has lower resolution than, say, the grayscale color map; **small changes can't be easily perceived**.

 - And third, probably most importantly, it creates **artificial gradients that look like jumps in the data** even though the underlying data changes smoothly.


So, the rainbow color map is not a good color map unless it is used for qualitative or categorical data. 
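The lack of a natural order can be made visible by looking at lightness along the map. The following sketch compares base R's `rainbow()` palette with a gray ramp. The helpers `lightness()` and `is_monotonic()` are hypothetical names introduced here, and CIE L\* is used as a rough proxy for perceived lightness.

```r
# Helper (hypothetical name): CIE L* lightness of each color,
# from 0 (black) to 100 (white).
lightness <- function(cols) {
  convertColor(t(col2rgb(cols)) / 255, from = "sRGB", to = "Lab")[, 1]
}

rainbow_L <- lightness(rainbow(7))
gray_L    <- lightness(gray(seq(0, 1, length.out = 7)))

round(rainbow_L)  # rises and falls along the map
round(gray_L)     # increases steadily from dark to light

# A map suits ordered data when lightness changes in one direction only.
is_monotonic <- function(x) all(diff(x) > 0) || all(diff(x) < 0)
c(rainbow = is_monotonic(rainbow_L), gray = is_monotonic(gray_L))
```

Because the rainbow map gets lighter and darker several times, the eye has no consistent cue for "more" or "less", and the lightness reversals show up as edges that are not in the data.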

## Color Brewer

**[Color Brewer](http://colorbrewer2.org) is a web site that helps you choose color maps that meet given criteria.** You can specify the number of data classes; whether the data is sequential, diverging, or qualitative; and whether the palette must be colorblind safe, printer friendly, etc. The site then suggests good color maps for those criteria. **Play with this tool to get a feel for it before going through the practice notebook.**

In [None]:

library(ggplot2)

# Get color palettes from Color Brewer
library(RColorBrewer)

# display the diverging palette named "BrBG" with eight colors 
display.brewer.pal(n=8, name='BrBG')

In [None]:
# explore all the palettes available 
display.brewer.all()
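Some of the criteria offered on the web site can also be applied from R. The sketch below filters the package's `brewer.pal.info` table by palette type and number of classes. The criteria chosen (a diverging palette with at least nine classes) and the name `candidates` are only an example.

```r
library(RColorBrewer)

# Palette metadata: maximum number of classes, category
# ("seq", "div" or "qual"), and colorblind safety.
head(brewer.pal.info)

# Example criteria: diverging palettes that offer at least 9 classes.
candidates <- subset(brewer.pal.info, category == "div" & maxcolors >= 9)
rownames(candidates)

# Show only the diverging palettes.
display.brewer.all(type = "div")
```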

## Choosing Color Maps

When we choose a color map for the visualization of a particular data set, we need to keep in mind the above discussed issues as well as the **type of data attribute we want to visualize**.

**It is extremely important to choose the right color scheme for the data attributes you want to visualize.** 

### 1- Qualitative color schemes:


Qualitative color schemes use differences in hue to represent nominal differences.
If we want to visualize **categorical/qualitative data** such as classes of **things that do not require numbering or do not have inherent ordering** (types of fruits: apples, oranges, bananas, etc. or types of vehicles: trucks, cars, bicycles, etc.) we should choose a qualitative color scheme with distinct hues, and **not** shades of the same color that could suggest an inherent order in the data. 

In [None]:
# You can also find out the hexadecimal codes of colors like this:

display.brewer.pal(n = 5, name = 'Dark2')
brewer.pal(n = 5, name = 'Dark2')


###  2- Sequential color schemes:


If the data we want to visualize is **numeric and inherently sequential**, we use sequential color schemes (below is from **slide #11** in [L2_DataViz_Color.pdf](L2_DataViz_Color.pdf) showing examples of sequential color schemes). 

<img src="../images/seq.png">

**Sequential data is logically ordered**, and it can be continuous or it can have a stepped sequence. Depending on the data set, **low data values can be represented by light colors and high values represented by dark colors** (e.g. visualization of population density on a map). Transitions between hues may also be used in a sequential scheme, but the light-to-dark progression should dominate the scheme. Other data sets may require the opposite, with low values assigned to darker shades.

Below is an example of a **choropleth map** that uses a sequential color scheme:

<img src="../images/choro.png">

In [None]:
# display the sequential palette named "Reds" with nine colors (the maximum for this palette)
display.brewer.pal(n=9, name='Reds')

### 3- Diverging color schemes:


Diverging color schemes allow the emphasis of a quantitative data display to be **progressions outward from a critical midpoint of the data range (such as negative and positive values of population growth).** A typical diverging scheme pairs sequential schemes based on **two different hues so that they diverge from a shared light color**, for the critical midpoint, toward dark colors of different hues at each extreme.

Any kind of percent change (both positive and negative) on a choropleth map is a good example of using this type of color scheme as shown below:


<img src="../images/divex.png">
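A diverging scheme only works if the light middle color actually sits on the critical midpoint. As a minimal sketch, the code below colors the built-in `mtcars` data by how far each car's fuel economy deviates from the mean and uses ggplot2's `scale_color_distiller()` with symmetric limits so that zero maps to the center of the "BrBG" palette. The choice of data set and the names `mt`, `mpg_dev`, `lim` and `p_div` are assumptions made for this illustration.

```r
library(ggplot2)

# Deviation of each car's fuel economy from the data set mean:
# negative and positive values around a meaningful midpoint (zero).
mt <- transform(mtcars, mpg_dev = mpg - mean(mpg))

# Symmetric limits keep the palette's light middle color at zero.
lim <- max(abs(mt$mpg_dev))

p_div <- ggplot(mt, aes(wt, hp)) +
  geom_point(aes(color = mpg_dev), size = 3) +
  scale_color_distiller(palette = "BrBG", limits = c(-lim, lim))

p_div
```

Without the symmetric limits, the scale would stretch from the smallest to the largest deviation, and the lightest color could land on a value other than zero.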

---

**Let's see how we use these palettes in ggplot2.**

In [None]:
# First, use default colors: 
g <- ggplot(iris, aes(Sepal.Length, Sepal.Width)) + 
  geom_point(aes(color = Species))
 
g

**Now, choose a palette from Brewer: note that it is a QUALITATIVE color scheme because we are displaying CATEGORIES.** 


In [None]:
g +  scale_color_brewer(palette = "Dark2")

If the geom is colored through the `fill` aesthetic (for example boxplots or bars) rather than the `color` aesthetic used for points and lines, **use scale_fill_brewer**:


In [None]:
g2 <- ggplot(iris, aes(Species, Sepal.Length)) + 
  geom_boxplot(aes(fill = Species)) + scale_fill_brewer(palette = "Dark2")

g2
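The two plots above map a categorical variable, so a qualitative palette is the right choice. For a numeric variable, a sequential palette is more appropriate. One possible sketch uses ggplot2's `scale_color_distiller()`, which interpolates a Color Brewer palette for continuous data; the choice of `Petal.Length` and the name `g3` are only for illustration.

```r
library(ggplot2)

# Petal.Length is numeric, so a sequential palette is appropriate.
# direction = 1 keeps Color Brewer's order: light for low values,
# dark for high values.
g3 <- ggplot(iris, aes(Sepal.Length, Sepal.Width)) +
  geom_point(aes(color = Petal.Length)) +
  scale_color_distiller(palette = "Reds", direction = 1)

g3
```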

And finally, it also helps to know [what different colors mean in different cultures](../images/culture.png).


# Dig Deeper Reading

These are good reads to dig deeper into the choice of color for visualization and understanding color spaces. 

* [HCL-Based Color Palettes in R](https://cran.r-project.org/web/packages/colorspace/vignettes/hcl-colors.pdf)

* [Escaping RGBland: Selecting Colors for Statistical Graphics](http://statmath.wu.ac.at/~zeileis/papers/Zeileis+Hornik+Murrell-2009.pdf)

* [Frequently Asked Questions about Color](http://www.poynton.com/PDFs/ColorFAQ.pdf)


