# Module 2 Practice

In this notebook, we will look at different ways of choosing color schemes for our visualizations. We will use the ggplot2 and RColorBrewer libraries.

[This cheat sheet can also be handy.](https://www.nceas.ucsb.edu/~frazier/RSpatialGuides/colorPaletteCheatsheet.pdf)  
  * [Local Mirror](https://indigo.sgn.missouri.edu/static/PDF/colorPaletteCheatsheet.pdf)
  
### Please read the code comments!

This notebook highlights various utilities at your disposal for selecting color maps for data visualization.
Please be sure to read the code carefully.
Also remember that you can look up any function you see with the help function.
Just use the notebook menu: `Insert > Insert Cell Above`  and then:
```R
help(function_name)
```

In [None]:
library(ggplot2)
# Color palettes from Color Brewer
library(RColorBrewer)
# Show all palettes with their names: sequential, qualitative, diverging
display.brewer.all()

In [None]:
# display five colors for a qualitative data type, use 'Dark2' palette 
display.brewer.pal(n = 5, name = 'Dark2')


In [None]:
# display color maps with seven colors for a diverging data type, colorblind safe
display.brewer.all(n = 7, type = 'div', colorblindFriendly = TRUE)


The package can provide details of the available color maps.
This includes name (key), max colors, etc.

In [None]:
brewer.pal.info
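
As one way to use this table, the sketch below filters it down to the diverging palettes that are flagged as colorblind friendly. It relies on the `category` and `colorblind` columns of `brewer.pal.info`; the variable name `div_safe` is only illustrative.

```R
library(RColorBrewer)

# Keep only diverging palettes that are marked colorblind friendly
div_safe <- brewer.pal.info[brewer.pal.info$category == "div" &
                            brewer.pal.info$colorblind, ]

# Row names are the palette keys that can be passed to brewer.pal()
rownames(div_safe)
```

The same pattern works for `"seq"` and `"qual"` categories.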

In [None]:
# This library also contains colorblind safe color maps 
library(dichromat)
colorschemes
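
Beyond its ready-made schemes, the package can also simulate how a set of colors may appear under color vision deficiency. The sketch below assumes the `dichromat()` function with `type = "deutan"` (a red-green deficiency); `original` and `simulated` are illustrative names, and the palette choice is arbitrary.

```R
library(RColorBrewer)
library(dichromat)

# A qualitative palette to test
original <- brewer.pal(5, "Set1")

# Approximate how the same colors look with deuteranopia
simulated <- dichromat(original, type = "deutan")

# Draw the two sets of swatches one above the other for comparison
par(mfrow = c(2, 1), mar = c(1, 1, 2, 1))
barplot(rep(1, 5), col = original, axes = FALSE, main = "Original")
barplot(rep(1, 5), col = simulated, axes = FALSE, main = "Simulated deutan view")
```

If two swatches become hard to tell apart in the second row, that palette is probably a poor choice for a general audience.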

In [None]:
# If we want to get more colors than available in the library, we can interpolate like this: 
p <- colorRampPalette(brewer.pal(9,'Blues'))(100)
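
To check what the interpolation produced, it can help to inspect the result and draw it as a color strip. This is a small sketch that uses the illustrative name `blues_100` instead of `p`.

```R
library(RColorBrewer)

# Interpolate the 9 'Blues' colors up to 100 colors
blues_100 <- colorRampPalette(brewer.pal(9, "Blues"))(100)

length(blues_100)    # number of colors generated
head(blues_100, 3)   # colors are returned as hex strings

# Draw the palette as a horizontal strip, one cell per color
image(matrix(1:100, ncol = 1), col = blues_100, axes = FALSE)
```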

In [None]:
# Let's use the built-in cars data (mtcars) to visualize some aspects of the data set.
head(mtcars)
# Pick some variables
data=mtcars[ , c(1,3:6)]
 
# Make a plot to show if there's any visible correlation; use rgb() to choose a color and alpha transparency
plot(data , pch=20 , cex=1.5 , col=rgb(0.5, 0.8, 0.9, 0.7))
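
The four arguments passed to `rgb()` in the plot are red, green, blue, and alpha (opacity), each on a 0 to 1 scale by default. The sketch below shows how such a color can be inspected, along with `adjustcolor()` from base R as an alternative way to add transparency to a named color; the variable names are illustrative.

```R
# rgb() takes red, green, blue and alpha values in [0, 1] by default
semi_blue <- rgb(0.5, 0.8, 0.9, 0.7)
semi_blue   # a hex string of the form #RRGGBBAA

# Decode the color back into 0-255 channel values, including alpha
col2rgb(semi_blue, alpha = TRUE)

# Alternative: make a named color 70% opaque
semi_steel <- adjustcolor("steelblue", alpha.f = 0.7)
semi_steel
```

Transparency is useful in dense scatter plots because overlapping points build up into darker regions.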


In [None]:
#Let's compute all the correlations and look at them 
data=cor(mtcars)
data

In [None]:
# Not very useful to look at numbers, let's use a visualization with the ellipse library
library(ellipse)

# This represents correlations as ellipses; the slope shows the sign,
# and the thickness shows the strength: a thinner ellipse means a stronger correlation
plotcorr(data)

In [None]:
# Again not very clear; let's use an appropriate color scheme to distinguish between strong and weak 
# correlations as well as negative and positive 

# Build a palette of 100 colors with RColorBrewer
my_colors <- brewer.pal(5, "Spectral")
my_colors <- colorRampPalette(my_colors)(100)
 
# Order the correlation matrix
ord <- order(data[1, ])
data_ord <- data[ord, ord]
# Plot and pick a color from the palette based on the correlation value: [-1, 1] --> [1, 100]
# (R vectors are 1-indexed, so the index must never be 0)
plotcorr(data_ord, col = my_colors[round(data_ord * 49.5 + 50.5)], mar = c(1, 1, 1, 1))
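
The color lookup in this cell depends on turning a correlation in [-1, 1] into a position in the palette. Below is a minimal sketch of that mapping as a reusable function. `value_to_index` is a hypothetical helper, not part of any package, and the blue-white-red palette is only a stand-in built with base R.

```R
# Hypothetical helper: map values in [lo, hi] onto integer indices 1..n
value_to_index <- function(x, n, lo = -1, hi = 1) {
  scaled <- (x - lo) / (hi - lo)        # rescale to [0, 1]
  idx <- round(scaled * (n - 1)) + 1    # spread over 1..n
  pmin(pmax(idx, 1), n)                 # clamp out-of-range values
}

# Stand-in diverging palette with 100 colors
palette_100 <- colorRampPalette(c("blue", "white", "red"))(100)

# Look up colors for a strong negative, a zero, and a strong positive correlation
palette_100[value_to_index(c(-1, 0, 1), n = 100)]
```

Because R vectors are 1-indexed, an index of 0 would silently drop an element, so the helper keeps every result between 1 and `n`.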


### This is better. 
The diverging color scheme separates positive from negative correlations, and the strongest correlations stand out in darker colors thanks to preattentive processing of color by the human visual system. Ordering the matrix also helps.

In [None]:
# Let's look at different ways of manipulating color in ggplot2
# get a small sample from diamonds data set 
dsamp <- diamonds[sample(nrow(diamonds), 1000), ]
head(dsamp)

In [None]:


# plot carat vs price and encode 'cut' variable with color

# default color palette: not a good choice 
(gp <- ggplot(dsamp, aes(x=carat, y=price, color=cut)) + geom_point())

# 'cut' is categorical but it does have an inherent ordering. Let's use a sequential color scheme


In [None]:
gp + scale_colour_brewer()


In [None]:
# This might be better if we want to emphasize the ideal cut 
gp + scale_colour_brewer(type="seq", palette=3)

In [None]:
# Again a bad choice: Set1 is a qualitative palette, so the ordering of 'cut' is lost
gp + scale_colour_brewer(palette="Set1")


In [None]:
# We can also assign colors manually using their hexadecimal codes 
gp + scale_color_manual(values=c("#0000FF", "#009F00", "#56B4E9", "#009E73", "#FFFFFF"))


# not a very good color scheme
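
One possible way to make a manual scale easier to reason about is to name each color by the level it belongs to, so the mapping does not depend on the order of the vector. The sketch below assumes the five `cut` levels of the `diamonds` data and borrows a sequential 'Blues' ramp, dropping its lightest shade so points stay visible on a light background. `cut_colors`, `dsamp_demo`, and `gp_demo` are illustrative names.

```R
library(ggplot2)
library(RColorBrewer)

# Levels of 'cut' in their inherent order, from worst to best
cut_levels <- levels(diamonds$cut)

# One sequential color per level, skipping the lightest shade
cut_colors <- setNames(brewer.pal(length(cut_levels) + 1, "Blues")[-1], cut_levels)

# Fresh sample so the example runs on its own
set.seed(1)
dsamp_demo <- diamonds[sample(nrow(diamonds), 1000), ]

gp_demo <- ggplot(dsamp_demo, aes(x = carat, y = price, color = cut)) +
  geom_point() +
  scale_color_manual(values = cut_colors)
gp_demo
```

With a named vector, ggplot2 matches colors to levels by name rather than by position.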

## <span style="background:yellow">Your Turn</span>

Examine some other potential variables of the Diamond sample, `dsamp`.

First, let's take a look at the type assigned to each variable.

In [None]:
str(dsamp)

You may have heard of the **4 C's** of diamonds:
  * Carat
  * Cut
  * Color
  * Clarity

All of these affect price of the diamond.  
In the cells below, create 3 visual renderings which you think best convey a relationship of two of the C's in determining price.
In each case, one C is the X-axis value, the other is the color aesthetic.

#### 1)

In [None]:
# Add your code below this line
# ------------------------------







#### 2)

In [None]:
# Add your code below this line
# ------------------------------







#### 3)

In [None]:
# Add your code below this line
# ------------------------------







# SAVE YOUR NOTEBOOK, and then "Close and Halt"