# Bubble Charts in Plotly

Now we will create interactive bubble charts with Plotly. As before, keep these references handy to look up how Plotly functions work:


* **Reference** [Plotly R cheat sheet](https://images.plot.ly/plotly-documentation/images/r_cheat_sheet.pdf)
* **Reference** [Plotly R reference](https://plot.ly/r/reference/)

**Run each cell below and examine the output (you may have to run them twice for the plot to appear).** 


In [None]:
library(ggplot2)
library(plotly)

Let's read the school earnings data set from the online source and plot the earnings gap between genders.

In [None]:
data <- read.csv("https://raw.githubusercontent.com/plotly/datasets/master/school_earnings.csv")
head(data)
summary(data)
str(data)

**We will use the plot_ly function to create a scatter plot in which the size of each marker encodes the gap.**

In [None]:
p <- plot_ly(data, x = ~Women, y = ~Men, text = ~School, type = 'scatter', mode = 'markers',
             marker = list(size = ~Gap, opacity = 0.5)) %>%
     layout(title = 'Gender Gap in Earnings per University',
            xaxis = list(showgrid = FALSE), # no grid lines, for clarity and a better data-ink ratio
            yaxis = list(showgrid = FALSE))

p
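
The chart above passes Gap directly to the marker size. Plotly's default marker sizemode is 'diameter', so each value becomes a diameter in pixels and the bubble's area grows with the square of the value. The sketch below uses made-up gap values (not the school data) to show how strong that effect is, and how taking a square root would make the area proportional to the value instead.

```r
# Hypothetical gap values (not taken from the school earnings data)
gap <- c(10, 20, 40)

# Area of a circle whose diameter equals the given value
circle_area <- function(diameter) pi * (diameter / 2)^2

# With value = diameter, doubling the value quadruples the area
area_ratio <- circle_area(gap) / circle_area(gap[1])
print(area_ratio)

# With sqrt(value) = diameter, the area is proportional to the value
area_ratio_sqrt <- circle_area(sqrt(gap)) / circle_area(sqrt(gap[1]))
print(area_ratio_sqrt)
```

Which encoding is preferable depends on whether readers are expected to compare diameters or areas; the Gapminder cell later in this notebook takes the square-root route for population.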

**Let's use color to encode the gap.**

In [None]:
p2 <- plot_ly(data, x = ~Women, y = ~Men, text = ~School, color = ~Gap, type = 'scatter', mode = 'markers', 
              colors = 'Reds',
              marker = list(size = ~Gap, opacity = 0.5)) %>%
      layout(title = 'Gender Gap in Earnings per University',
             xaxis = list(showgrid = FALSE),
             yaxis = list(showgrid = FALSE))
p2

**We can also encode the state by color. Let's add states to the data.**

In [None]:
data$State <- as.factor(c('Massachusetts', 'California', 'Massachusetts', 'Pennsylvania', 'New Jersey', 'Illinois', 'Washington DC',
                          'Massachusetts', 'Connecticut', 'New York', 'North Carolina', 'New Hampshire', 'New York', 'Indiana',
                          'New York', 'Michigan', 'Rhode Island', 'California', 'Georgia', 'California', 'California'))

p3 <- plot_ly(data, x = ~Women, y = ~Men, type = 'scatter', mode = 'markers', size = ~Gap, color = ~State, 
              colors = 'Paired',
              sizes = c(10, 50), marker = list(opacity = 0.5, sizemode = 'diameter'),
              hoverinfo = 'text', text = ~paste('School:', School, '<br>Gender gap:', Gap)) %>%
      layout(title = 'Gender Gap in Earnings per University',
             xaxis = list(showgrid = FALSE),
             yaxis = list(showgrid = FALSE),
             showlegend = FALSE)

p3
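
In p3 the gap is mapped through the top-level size argument, and sizes = c(10, 50) sets the smallest and largest marker size. As a rough mental model (an assumption about the mapping, not Plotly's actual source code), this behaves like a linear rescaling of the data range onto the sizes range. The helper rescale_to_range below is hypothetical and the gap values are made up.

```r
# Hypothetical helper: linearly map x from its own range onto [lo, hi]
rescale_to_range <- function(x, lo, hi) {
  lo + (x - min(x)) / (max(x) - min(x)) * (hi - lo)
}

gap <- c(5, 20, 35, 50)  # made-up values, not the school data

# The smallest gap lands on 10, the largest on 50, the rest in between
marker_size <- rescale_to_range(gap, lo = 10, hi = 50)
print(marker_size)
```

Because the smallest value always maps to the lower bound, this kind of rescaling keeps every bubble visible, at the cost of no longer showing ratios between values faithfully.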

## NOW YOUR TURN: 


A famous interactive, animated bubble chart is **Gapminder**, which you can explore [here](http://gapminder.org/tools).

**Let's create a Gapminder-like plot by using 2007 data only.**


In [None]:
data <- read.csv("https://raw.githubusercontent.com/plotly/datasets/master/gapminderDataFiveYear.csv")
str(data)

In [None]:
# PICK 2007 DATA ONLY 
data_2007 <- data[which(data$year == 2007),]


# SORT IT 
data_2007 <- data_2007[order(data_2007$continent, data_2007$country),]
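
The filter-then-sort pattern in the cell above can be tried on a tiny data frame first. The toy data frame and its values are invented for illustration; only the column names mirror the Gapminder file.

```r
# Toy data frame with invented values; column names mirror the Gapminder file
toy <- data.frame(
  country   = c("B-land", "A-land", "C-land", "A-land"),
  continent = c("Europe", "Europe", "Asia", "Europe"),
  year      = c(2007, 2007, 2007, 2002),
  stringsAsFactors = FALSE
)

# Keep only the 2007 rows
toy_2007 <- toy[toy$year == 2007, ]

# Sort by continent first, then by country within each continent
toy_2007 <- toy_2007[order(toy_2007$continent, toy_2007$country), ]
print(toy_2007)
```

The original cell wraps the condition in which(), which additionally drops rows where year is NA; plain logical indexing would return those rows as all-NA rows.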

In [None]:
slope <- 2.666051223553066e-05
data_2007$size <- sqrt(data_2007$pop * slope)
colors <- c('#4AC6B7', '#1972A4', '#965F8A', '#FF7070', '#C61951')



# X AND Y AXES WILL BE GDP PER CAP AND LIFE EXPECTANCY, RESPECTIVELY. 
# MAP THE CONTINENT ATTRIBUTE TO COLOR VISUAL VARIABLE, 
# AND THE SIZE ATTRIBUTE TO SIZE VISUAL VARIABLE. 

pg <- plot_ly(data_2007, x = ~gdpPercap, y = ~lifeExp, color = ~continent, size = ~size, colors = colors, 
              
              type = 'scatter', mode = 'markers', 

              #sizes = c(min(data_2007$size), max(data_2007$size)),        
              sizes = c(10, 100), # limit the size of the bubbles in order to make them easily visible 
              marker = list(symbol = 'circle', sizemode = 'diameter', line = list(width = 2, color = '#FFFFFF')),

              # add hover text for bubbles               
              text = ~paste('Country:', country, '<br>Life Expectancy:', lifeExp, '<br>GDP:', gdpPercap,
                      '<br>Pop.:', pop)) %>%
      layout(title = 'Life Expectancy v. Per Capita GDP, 2007',
             xaxis = list(title = 'GDP per capita (2000 dollars)',
                      gridcolor = 'rgb(255, 255, 255)',
                      range = c(2.003297660701705, 5.191505530708712),
                      type = 'log',
                      zerolinewidth = 1,
                      ticklen = 5,
                      gridwidth = 2),
             yaxis = list(title = 'Life Expectancy (years)',
                      gridcolor = 'rgb(255, 255, 255)',
                      range = c(36.12621671352166, 91.72921793264332),
                      zerolinewidth = 1,
                      ticklen = 5,
                      gridwidth = 2),
             paper_bgcolor = 'rgb(243, 243, 243)',
             plot_bgcolor = 'rgb(243, 243, 243)')
pg
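
Because the x-axis uses type = 'log', its range is given as base-10 exponents rather than as dollar amounts. The short check below converts the range used above back to GDP per capita values and shows how to build such a range from chosen limits. The limits 100 and 150000 are illustrative assumptions, not values from the original plot.

```r
# Range used for the log x-axis above, expressed as log10 exponents
log_range <- c(2.003297660701705, 5.191505530708712)

# Convert the exponents back to GDP per capita values
gdp_limits <- 10^log_range
print(gdp_limits)

# Going the other way: pick limits in data units and take log10
my_limits <- c(100, 150000)  # illustrative limits, not from the original
my_log_range <- log10(my_limits)
print(my_log_range)
```

The y-axis is linear, so its range is given directly in years of life expectancy.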