# Nightingale Rose Activity

### Please refer to the in-class handout for a brief history of the classic graphic, Nightingale's Rose. It is a visual display of mortality data from the Crimean War, starting in 1854.

We will be using the **ggplot2**, **HistData** and **reshape** packages. We will also use **RColorBrewer** so we can brew up some nice color schemes.

In [1]:
library(HistData)
library(reshape)
library(ggplot2)
library(RColorBrewer)

This activity is going to involve reshaping our data so let's try to reshape a simple data frame shown below:

In [2]:
df <- data.frame(M = c(5,7), T = c(6,10), W = c(8,15))
df
rownames(df) = c(2009, 2014)
df

M,T,W
5,6,8
7,10,15


Unnamed: 0,M,T,W
2009,5,6,8
2014,7,10,15


### 1.   We would like to make two lines, one for each year with the days on the x-axis.  In Excel you would just highlight this data and click on line chart.  In R we need to reshape the data.  

#### There are 6 data points or observations, so we will need 6 rows in our R data.frame with 3 columns: 
* one for the Year, 
* one for the Day, 
* one for the Value.

Create such a data frame called *df2*:

In [6]:
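# six observations: every (Year, Day) pair gets its own row
df2 <- data.frame(Year  = rep(c(2009, 2014), times = 3),
                  Day   = rep(c("M", "T", "W"), each = 2),
                  Value = c(5, 7, 6, 10, 8, 15))
df2

Year,Day,Value
2009,M,5
2014,M,7
2009,T,6
2014,T,10
2009,W,8
2014,W,15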


### If only there were some function that could do this for us...
* Add a new variable to **df** called *Year* that contains the rownames of **df**
* Use the **melt** function to melt **df** by Year
* Take a look at **df_melt** and see what it did :O)

In [7]:
# add a variable named Year to df...
df$Year<-rownames(df) 
df_melt <- melt(df, "Year")
df_melt

Year,variable,value
2009,M,5
2014,M,7
2009,T,6
2014,T,10
2009,W,8
2014,W,15


### Now we can plot points with Value on the y-axis and Day on x-axis, color and group by Year, use factor on the Year for discrete colors, and add in a geom_line.

In [None]:
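# Year as a number gives a continuous color scale
ggplot(df2, aes(Day, Value, color = Year, group = Year)) + geom_point() + geom_line()
# factor(Year) gives one discrete color per year
ggplot(df2, aes(Day, Value, color = factor(Year), group = Year)) + geom_point() + geom_line()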

### 2.  Too cool for school.  HistData contains a data set called *Nightingale*.  Look at its head and dim.

In [None]:
head(Nightingale)
dim(Nightingale)

###  3.  We are only interested in columns 1 and 8-10.  Create a subset data.frame called *Night* that contains only these columns.  Look at the head and dimension.

In [None]:
Night<- Nightingale[ ,c(1,8:10)]
head(Night)
dim(Night)

### Now we use the melt function to reshape the data automatically like we did manually above for the lines.
* melt Night by Date

In [None]:
N_melt <- melt(Night, "Date")
head(N_melt)
dim(N_melt)

### 4.  Explain what *melt* did to Night.

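Melt took the data from a wide table format (one column per cause) to a long table, where each data point has its own row: one row per combination of Date and cause, with the cause name in one column and its value in another.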
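As a rough illustration of that reshaping, here is a minimal base-R sketch on the toy data frame from earlier. It is not how **melt** is implemented, and the names `wide`, `years` and `long` are made up for this example.

```r
# Toy wide table: one row per year, one column per day (same values as df above)
wide <- data.frame(M = c(5, 7), T = c(6, 10), W = c(8, 15))
years <- c(2009, 2014)

# Long table: one row per (Year, Day) pair.
# unlist() walks the columns in order, so the values come out as M, M, T, T, W, W.
long <- data.frame(
  Year  = rep(years, times = ncol(wide)),
  Day   = rep(names(wide), each = nrow(wide)),
  Value = unlist(wide, use.names = FALSE)
)
long
```

The result has the same six rows that **melt** produced for **df**, just with hand-picked column names.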

In [None]:
names(N_melt) <- c("Date", "Cause", "Deaths")
N_melt$Cause <- sub("\\.rate", "", N_melt$Cause)
N_melt$Regime <- ordered( rep(c(rep('Before', 12), rep('After', 12)), 3), 
                         levels=c('Before', 'After'))
head(N_melt)
dim(N_melt)

###  5.  Ummmm, before and after what?
* What does that **sub** function do?

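"Before" labels the first 12 months of data, before the Sanitary Commission was allowed in; "After" labels the 12 months after it arrived. The **sub** call replaces the first match of the pattern `\\.rate` (a literal `.rate`) with an empty string, so `Disease.rate` becomes `Disease`.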
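To see the **sub** call in isolation, here is a small sketch using the three rate column names from *Nightingale*. The vector names `causes` and `cleaned` are just for illustration.

```r
causes <- c("Disease.rate", "Wounds.rate", "Other.rate")

# "\\." matches a literal dot; a bare "." would match any single character
cleaned <- sub("\\.rate", "", causes)
cleaned

# sub() replaces only the first match in each string; gsub() replaces all of them
first_only <- sub("a", "", "banana")
every_one  <- gsub("a", "", "banana")
c(first_only, every_one)
```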

#### Dates and time have their own special syntax in R, below you see code taking a date format and pulling out the month and year.

In [None]:
N_melt$Month <- format(N_melt$Date, "%b %Y")
head(N_melt)
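A quick sketch of how those format codes behave on a single date may help. The variable `d` is made up for this example, and month abbreviations from `%b` depend on your locale, so that line is shown without an expected value.

```r
d <- as.Date("1854-04-01")

yr <- format(d, "%Y")   # four-digit year as text
mo <- format(d, "%m")   # two-digit month number as text
c(yr, mo)

# %b is the abbreviated month name in the current locale
format(d, "%b %Y")

# Date objects compare chronologically, which is what subset() relies on
d < as.Date("1855-04-01")
```

Note that `format` returns character strings, so *Month* is text and will sort alphabetically rather than by date.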

### Now we see before and after as subsets:

In [None]:
Night1 <- subset(N_melt, Date < as.Date("1855-04-01"))
Night2 <- subset(N_melt, Date >= as.Date("1855-04-01"))
head(Night1)
head(Night2)


### Now for the plotting.  The radial plots are referred to as coxcomb plots.

In [None]:
cxc1 <- ggplot(Night1, aes(x = factor(Date), y=Deaths, fill = Cause)) +
# do it as a stacked bar chart first
  geom_bar(width = 1, stat="identity", color="black") +
# set scale so area ~ Deaths    
  scale_y_sqrt() 

cxc1

In [None]:
# A coxcomb plot = bar chart + polar coordinates
cxc1 + coord_polar(start=3*pi/2) + 
 ggtitle("Causes of Mortality in the Army in the East") + 
 xlab("")

# Wooooohoooooo!!!!!  
This graph is so nice, very nice, it's very very impressive.  We have the best graphs in the world.  It's a wonderful thing what we are able to graph.

### 6.  Make an After plot.
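
One possible sketch for the After plot, assuming the same packages are installed. It simply repeats the Before steps with the other subset, rebuilding the data so the block runs on its own; the name `cxc2` is made up here.

```r
library(HistData)
library(reshape)
library(ggplot2)

# Rebuild the long data exactly as in the earlier cells
Night <- Nightingale[, c(1, 8:10)]
N_melt <- melt(Night, "Date")
names(N_melt) <- c("Date", "Cause", "Deaths")
N_melt$Cause <- sub("\\.rate", "", N_melt$Cause)

# Keep only the months from April 1855 onward
Night2 <- subset(N_melt, Date >= as.Date("1855-04-01"))

# Same stacked bars and square-root scale as cxc1, then wrap into polar coordinates
cxc2 <- ggplot(Night2, aes(x = factor(Date), y = Deaths, fill = Cause)) +
  geom_bar(width = 1, stat = "identity", color = "black") +
  scale_y_sqrt() +
  coord_polar(start = 3*pi/2) +
  ggtitle("Causes of Mortality in the Army in the East (After)") +
  xlab("")

cxc2
```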

## Now make them both together, just like Flo did, and choose your favorite color scheme:

In [None]:
display.brewer.all()

In [None]:
# do both together, with faceting
cxc <- ggplot(N_melt, aes(x = factor(Date), y=Deaths, fill = Cause)) +
  geom_bar(width = 1, stat="identity", color="black") + 
  scale_y_sqrt() +
  facet_grid(. ~ Regime, scales="free", labeller=label_both) +
  scale_fill_brewer(palette = "RdBu")

cxc + coord_polar(start=3*pi/2) +
  ggtitle("Causes of Mortality in the Army in the East") + 
  xlab("") 
    

###  7.  What the facet!?  That's beautiful (kind of squashy but still very very nice).  What's the point?  What story does this graphic tell?

### 8.  What if Flo had chosen line graphs?  Would it have been as compelling?


In [None]:
colors <- c("blue", "red", "black")
with(Nightingale, {
  plot(Date, Disease.rate, type="n", col="blue",
       ylab="Annual Death Rate", xlab="Date", xaxt="n",
       main="Causes of Mortality of the British Army in the East")
  # background, to separate before, after
  rect(as.Date("1854/4/1"), -10, as.Date("1855/3/1"),
       1.02*max(Disease.rate), col="lightgray", border="transparent")
  text(as.Date("1854/4/1"), .98*max(Disease.rate), "Before Sanitary\nCommission", pos=4)
  text(as.Date("1855/4/1"), .98*max(Disease.rate), "After Sanitary\nCommission", pos=4)
  # plot the data
  points(Date, Disease.rate, type="b", col=colors[1])
  points(Date, Wounds.rate, type="b", col=colors[2])
  points(Date, Other.rate, type="b", col=colors[3])
})
# add custom Date axis and legend
axis.Date(1, at=seq(as.Date("1854/4/1"), as.Date("1856/3/1"), "4 months"), format="%b %Y")
legend(as.Date("1855/10/20"), 700, c("Disease", "Wounds", "Other"),
      col=colors, fill=colors, title="Cause")

# Bubbles!
Fun bubble scatterplot to finish up.  Read in the crimedata csv and take a look, find the dim.

### 9.  Create scatterplot of burglary against murder.

### 10.  Find out who that poor fella is way off to the right and remove that row from the data frame.  Replot.
Note I named the new dataframe *crime4* and used that below to make bubble chart :O)

# Now make bubbles!!!

In [None]:
symbols(crime4$murder, crime4$burglary, circles = crime4$population)

radius = sqrt(crime4$population/pi)

symbols(crime4$murder, crime4$burglary, circles = radius, inches = 0.35, 
        fg = "white", bg = "red", xlab = "Murder Rates", ylab = "Burglary Rates", main = "Rates per 100,000")

text(crime4$murder, crime4$burglary, crime4$state, cex = 0.5)

### 11. Last question.  What is the radius line of code doing?  Why is it necessary?
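
A small numeric sketch may help with this one. It uses two invented population values (not taken from the crime data) to show what happens to circle areas when the radius is set to `sqrt(population/pi)`.

```r
# Hypothetical populations: the second is four times the first
population <- c(1e6, 4e6)

# Choose each radius so that the circle's area equals the population
radius <- sqrt(population / pi)
area   <- pi * radius^2

radius[2] / radius[1]   # radii differ by a factor of 2
area[2] / area[1]       # areas differ by a factor of 4, matching the populations
```

Passing `population` straight to `circles` would make the radius, not the area, proportional to population, so the larger circle here would cover 16 times the area of the smaller one. Because `inches = 0.35` rescales the largest circle to a fixed size, only the ratios between radii matter in the final plot.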
