# Module 2 Practice

In this notebook, we will look at different ways of choosing color schemes for our visualizations. We will use the ggplot2 and RColorBrewer libraries.

[This cheat sheet can also be handy.](http://www.guianaplants.stir.ac.uk/seminar/materials/colorPaletteCheatsheet.pdf)

**Some of the following code cells require you to fill in your code in < YOUR CODE > lines or question marks (???); for the others, run the code cell and study the output to understand what it does.**

In [None]:
library(ggplot2)
# Color palettes from Color Brewer
library(RColorBrewer)

In [None]:
# First, show all palettes with their names: sequential, qualitative, diverging (remember how to do that from lab?)
display.brewer.all()

In [None]:
# display five colors for a qualitative data type, use 'Dark2' palette 
display.brewer.pal(n=5, name='Dark2')

In [None]:
# display color maps with seven colors for a diverging data type, colorblind safe 
# (look up the parameters of the function and color scheme names)


display.brewer.all(n=7, type ='div', colorblindFriendly=TRUE)


In [None]:
brewer.pal.info

In [None]:
# The following library also contains colorblind safe color maps 
library(dichromat)
colorschemes

In [None]:
# If we want to get MORE colors than available in the library, we can interpolate colors like this: 
p <- colorRampPalette(brewer.pal(9,'Blues'))(100)
p
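
As a rough illustration of what `colorRampPalette()` does, the sketch below interpolates between two endpoint colors using only base R. The two hex codes and the variable names are hand-picked for illustration.

```r
# colorRampPalette() returns a function; calling that function with n
# yields n hex colors interpolated between the colors it was given
ramp <- colorRampPalette(c("#F7FBFF", "#08306B"))  # illustrative light and dark blue
blues_100 <- ramp(100)

length(blues_100)      # number of interpolated colors
blues_100[c(1, 100)]   # the first and last entries are the two endpoints
```

The same pattern works with any vector of colors, including the output of `brewer.pal()`.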

In [None]:
# Let's use the cars data to visualize some aspects of the data set.
head(mtcars)
# Pick some variables
data=mtcars[ , c(1,3:6)]
 
# Make a scatterplot matrix to look for visible correlations; rgb() sets the color and its alpha transparency
plot(data , pch=20 , cex=1.5 , col=rgb(0.5, 0.8, 0.9, 0.7))
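
To see what the `rgb()` call in the cell above produces, here is a minimal base R sketch. The variable names `semi_blue` and `faded_red` are made up for this example.

```r
# rgb() takes red, green, blue and an optional alpha, each in [0, 1] by default
semi_blue <- rgb(0.5, 0.8, 0.9, 0.7)
semi_blue                           # a hex string of the form "#RRGGBBAA"

# col2rgb() with alpha = TRUE recovers the four channels on a 0-255 scale
col2rgb(semi_blue, alpha = TRUE)

# adjustcolor() adds transparency to an existing named color
faded_red <- adjustcolor("red", alpha.f = 0.5)
faded_red
```

Semi-transparent points make overplotted regions appear darker, which helps when many points overlap.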


In [None]:
#Let's compute all the correlations and look at them 
data=cor(mtcars)
data

### Raw numbers are not very useful to look at; let's visualize them with the ellipse library.

In [None]:
library(ellipse)

# This represents correlations as ellipses: the slope shows the sign,
# and the thickness shows the strength (a thinner ellipse means a stronger correlation)
plotcorr(data)

### Again not very clear; let's use an adequate color scheme to distinguish between strong and weak correlations as well as negative and positive ones.

**So we are talking about a diverging color scheme, right?**  

In [None]:
# Build a palette of 100 colors with RColorBrewer
my_colors <- brewer.pal(5, "Spectral")
my_colors=colorRampPalette(my_colors)(100)
 
# Order the correlation matrix
ord <- order(data[1, ])
data_ord = data[ord, ord]

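# Plot, picking each color from the palette by correlation value: [-1, 1] --> [0, 100]
# (R truncates fractional indices, and index 0 selects nothing, so a correlation of exactly -1 would get no color)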

#  ( STUDY  the following code to figure out what it's doing with the palette!! )
plotcorr(data_ord , col=my_colors[data_ord*50+50] , mar=c(1,1,1,1)  )
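
One way to make the value-to-index mapping explicit is sketched below in base R. The helper `cor_to_index()` is hypothetical (it is not part of ellipse or RColorBrewer) and assumes a palette of `n` colors.

```r
# Hypothetical helper: map correlations in [-1, 1] to palette indices 1..n
cor_to_index <- function(r, n = 100) {
  idx <- ceiling((r + 1) / 2 * n)  # rescale to (0, n] and round up to an integer
  pmax(idx, 1)                     # r = -1 would give index 0, so clamp it to 1
}

cor_to_index(c(-1, 0, 0.5, 1))
```

With the palette built above, `my_colors[cor_to_index(data_ord)]` would always select a valid color, including for a correlation of exactly -1.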

### This is better. It's a **diverging** color scheme covering both positive and negative correlations, and the strongest correlations stand out through darker colors thanks to preattentive processing of color by the human visual system. Ordering also helps with grouping.

In [None]:
# Let's look at different ways of manipulating color in ggplot2
# get a small sample from diamonds data set 
dsamp <- diamonds[sample(nrow(diamonds), 1000), ]

# plot carat vs price and encode 'cut' variable with color

# default color palette: not a good choice 
(gp <- ggplot(dsamp, aes(x=carat, y=price, color=cut)) + geom_point())

# 'cut' is categorical but it does have an inherent ordering. Let's use a sequential color scheme


In [None]:
gp + scale_colour_brewer()


In [None]:
# This might be better if we want to emphasize the ideal cut 
gp + scale_colour_brewer(type="seq", palette=3)

In [None]:
# Again a bad choice: Set1 is a qualitative palette and ignores the ordering of 'cut'
gp + scale_colour_brewer(palette="Set1")


In [None]:
# We can also assign colors manually using their hexadecimal codes 
gp + scale_color_manual(values=c("#0000FF", "#009F00", "#56B4E9", "#009E73", "#FFFFFF"))


# not a very good color scheme
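
A manual scale can still respect the ordering of `cut` if the colors themselves are ordered. The sketch below builds a named light-to-dark vector in base R; the endpoint hex codes and the names `cut_levels` and `cut_colors` are illustrative assumptions.

```r
# Levels of 'cut' in the diamonds data, from worst to best
cut_levels <- c("Fair", "Good", "Very Good", "Premium", "Ideal")

# Five blues from light to dark (the two endpoints are illustrative choices)
cut_colors <- setNames(colorRampPalette(c("#C6DBEF", "#08306B"))(5), cut_levels)
cut_colors

# With gp from the cells above:
# gp + scale_color_manual(values = cut_colors)
```

Naming the vector ties each color to a level explicitly, so the mapping does not depend on the order in which the values are listed.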

In [None]:
# Let's create a histogram of carat variable 

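# after_stat(count) replaces the deprecated ..count.. notation (requires ggplot2 >= 3.3.0)
(gp2 <- ggplot(data=dsamp, aes(x=carat)) + geom_histogram(binwidth=0.5, aes(fill = after_stat(count))))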

In [None]:
gp2 + scale_fill_gradient("Count", low="blue", high="red")

In [None]:
(gp3 <-ggplot(mtcars, aes(x=wt, y=mpg, color=factor(cyl))) + geom_point())

In [None]:
gp3 +  scale_color_brewer(palette="Reds")