## Module 5 Practice - Bubble Charts

In this practice, we will create **bubble charts** with ggplot2, using its **layered grammar of graphics**. We will also see how to create them with base R graphics and plotly.

A **bubble chart** is a special type of **scatter** plot where **several attributes** of a data item are encoded into a **point mark** using visual variables such as color, size, and (x,y) position. 

By carefully designing the mapping from attributes to visual channels, and by choosing channels that are separable from one another, we can encode several attributes in a single visualization that remains easy to read.

A famous interactive and animated bubble chart is **Gapminder**, which you can explore [here](http://gapminder.org/tools).

**Let's start with the crime data.**

In [None]:
library(ggplot2)
crime = read.csv("/dsa/data/all_datasets/crime.csv")

head(crime)

This is how we would create a bubble chart in **plain R**: first, compute the radius of each circle so that its area represents the population.


In [None]:
radius <- sqrt(crime$population/pi)
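
As a quick check of why the square root is there: the area of a circle is pi * r^2, so choosing r = sqrt(population / pi) makes the area equal to the population. The sketch below uses made-up population values (they are not taken from `crime.csv`) purely for illustration.

```r
# Hypothetical population values, for illustration only (not from crime.csv)
population <- c(1e6, 4e6, 9e6)

# Same transformation as above: radius chosen so that area encodes population
radius <- sqrt(population / pi)

# Recover the area of each circle from its radius
area <- pi * radius^2

# Compare how radius and area grow relative to the smallest value
radius_ratio <- radius / radius[1]
area_ratio <- area / area[1]
print(radius_ratio)
print(area_ratio)
```

If the population were used as the radius directly, a state with four times the population would get a circle with sixteen times the area, which exaggerates the difference.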

Then, draw symbols (in this case, circles) given the two attributes as coordinates, and the third one as the size.

In [None]:
symbols(crime$murder, crime$burglary, circles=radius, inches=0.25,
        fg="white", bg="red", xlab="Murder Rate", ylab="Burglary Rate")

# Add state names to the plot
text(crime$murder, crime$burglary, crime$state, cex=0.5)

**The code is short, but it is not modular: the data, the transformations, and the visual mappings are not expressed as separate components.**

**Let's do the same in Plotly.** 

In [None]:
library(plotly)
plot_ly(crime, x = ~murder, y = ~burglary, type = 'scatter', mode = 'markers', size = ~population,
        sizes = c(10, 50), marker = list(opacity = 0.5, sizemode = 'diameter'),
        hoverinfo = 'text', text = ~paste(population)) %>%
  # Add state names as a separate text trace
  add_text(text = ~state, textposition = 'middle center', size = 8) %>%
  layout(title = 'Crime Rates by State',
         xaxis = list(title = 'Murder Rate'),
         yaxis = list(title = 'Burglary Rate'))

**Plotly is a little better; we can have separate *traces* that act like layers (you may have to run the cell twice).** 

**Let's create the same plot with ggplot2.** 

It is a scatter plot, so we'll use `geom_point()`. We will map murder and burglary to position, and population to size.

In [None]:
g <- ggplot(crime, aes(x=murder, y=burglary, size=population)) +
  geom_point(color="red", alpha=0.5) +
  scale_size(range = c(1,20)) +
  theme(legend.position="none")

g
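
Note that the ggplot2 documentation describes `scale_size()` as scaling the *area* of the points, so no manual square root is needed as in the plain R version; `scale_radius()` scales the radius instead. The following sketch compares the two on a small made-up data frame (`toy` and its columns are hypothetical and not part of the crime data) and uses `layer_data()` to inspect the computed point sizes.

```r
library(ggplot2)

# Hypothetical toy data, for illustration only (not the crime data)
toy <- data.frame(x = 1:3, y = 1:3, value = c(1, 4, 9))

base_plot <- ggplot(toy, aes(x = x, y = y, size = value)) + geom_point()

# scale_size(): documented as scaling point area
p_area <- base_plot + scale_size(range = c(1, 20))

# scale_radius(): documented as scaling point radius
p_radius <- base_plot + scale_radius(range = c(1, 20))

# layer_data() returns the computed aesthetics of a layer, including size
size_area <- layer_data(p_area, 1)$size
size_radius <- layer_data(p_radius, 1)$size

print(data.frame(value = toy$value, size_area, size_radius))
```

Both scales map the smallest and largest values to the ends of `range`; they differ in the sizes assigned to the values in between.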

We can add state names as a separate layer using `geom_text()`, and the axis labels, too.

In [None]:
g <- g + geom_text(size=2, aes(label=state)) +
  xlab("Murder rate") + ylab("Burglary rate") +
  ylim(200,1400) + xlim(0,11)

g

## YOUR TURN:

We can add another visual variable; **let color represent `motor_vehicle_theft`.** Add another geom_point to `g`. **What happens?** 

In [None]:
# Add another geom_point to the plot above:
g + geom_point(<YOUR CODE HERE>) +
  scale_color_continuous(low="yellow", high="purple")  # give a continuous color palette



**To fix it, rewrite the code so that it uses only one `geom_point`.**

In [None]:
# <YOUR CODE HERE> 

**In ggplot2, you can code the different layers of a plot as independent components and keep adding them to the same plot. As we have seen in examples from the previous modules, this approach makes complex plots easier to understand and to write.**
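
To make the layering concrete, the sketch below builds a plot step by step on a small made-up data frame (`toy` and its columns are hypothetical, not the crime data) and counts the layers stored in the plot object after each addition. It assumes the plot object exposes its layers as `p$layers`, which is a common way to inspect ggplot2 plots.

```r
library(ggplot2)

# Hypothetical toy data, for illustration only
toy <- data.frame(
  x = c(1, 2, 3),
  y = c(3, 1, 2),
  label = c("a", "b", "c"),
  weight = c(10, 20, 30)
)

# Start with data and position mappings only: no layers yet
p0 <- ggplot(toy, aes(x = x, y = y))

# Add a point layer, mapping weight to size
p1 <- p0 + geom_point(aes(size = weight), color = "red", alpha = 0.5)

# Add a text layer on top of the points
p2 <- p1 + geom_text(aes(label = label), size = 3)

# Axis labels change the plot but do not add a layer
p3 <- p2 + xlab("x value") + ylab("y value")

layer_counts <- c(length(p0$layers), length(p1$layers),
                  length(p2$layers), length(p3$layers))
print(layer_counts)
```

Each `geom_*()` call adds one layer, while labels, scales, and themes modify the plot without adding layers. Every intermediate object (`p0`, `p1`, `p2`) remains a usable plot on its own.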