# MPA 5830 - Module 06 (Fall 2021)

# Interactive Graphics with Plotly and Highcharter

Interactive graphics are useful in situations where you would like the user/viewer to see the data values or other details by hovering over or clicking on the graphic. Say, for example, I have a scatterplot and want to make it interactive. How can I do that? 

One crude and fast way to do that is by saving my `ggplot2` object and then using `{plotly}` to add a `ggplotly()` wrapper around the plot. 

In the example below I am saving the plot as `pl01`, then wrapping it in `ggplotly` with `ggplotly(pl01) -> lst`. 

In [None]:
install.packages(c("plotly", "highcharter"))

In [None]:
library(plotly)

ggplot() +
  geom_point(
    data = mpg,
    mapping = aes(
        x = cty, 
        y = hwy, 
        color = trans)
    ) +
  labs(
      x = "City Mileage",
      y = "Highway Mileage",
      color = "Transmission"
  ) -> pl01

lst <- list()

ggplotly(pl01) -> lst

htmltools::tagList(lst)

These plots are useful when presenting data to a live audience (in a talk, or on the web). 
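
If the default hover box shows more than you want, `ggplotly()` accepts a `tooltip` argument that limits which aesthetics appear. Here is a minimal sketch that rebuilds the same scatterplot so the chunk runs on its own; the object names `pl_sketch` and `pl_interactive` are placeholders chosen for illustration.

```r
library(ggplot2)
library(plotly)

# Same scatterplot as above, rebuilt so this chunk is self-contained
pl_sketch <- ggplot(mpg, aes(x = cty, y = hwy, color = trans)) +
  geom_point() +
  labs(x = "City Mileage", y = "Highway Mileage", color = "Transmission")

# Keep only the x and y values in the hover box
pl_interactive <- ggplotly(pl_sketch, tooltip = c("x", "y"))

pl_interactive
```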

Rather than use `plotly`, I prefer `{highcharter}` since it does a lot of things well with minimal fuss, and yet the resulting plots are aesthetically pleasing. 

Let us stay with the COVID-19 example. Say I want a bar-chart of the total number of cases by state and want to do this via `highcharter`. 

In [None]:
library(tidyverse)   # dplyr verbs, the pipe, and tidyr::unite() used further down
library(highcharter)

readr::read_csv(
    "https://raw.githubusercontent.com/nytimes/covid-19-data/master/us-counties.csv"
    ) -> covid 

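covid %>%
  filter(
      date == "2020-04-17"
  ) %>%
  # the file has one row per county, so add up the counties within each state
  group_by(state) %>%
  summarise(
      cases = sum(cases)
  ) %>%
  rename(
      State = state, 
      `Total Cases` = cases
  ) -> tab1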

In [None]:
hchart(
    tab1, 
    "bar", 
    hcaes(
        x = State, 
        y = `Total Cases`
        )
    )

Notice the key elements here: the basic function call is `hchart()`, we specify that we want a bar chart, and we provide the variables that should go on the x and y axes, respectively. Note that a Highcharts "bar" chart is drawn horizontally, so the x variable (`State`) ends up on the vertical axis; the "column" type gives vertical bars instead.
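
To see the difference between the two chart types side by side, here is a small sketch. The data frame `toy` and its numbers are invented for illustration and have nothing to do with the COVID-19 file.

```r
library(highcharter)

# Toy data invented for illustration only
toy <- data.frame(
  State = c("Ohio", "Florida", "New York"),
  Total = c(10, 25, 40)
)

# "bar" draws horizontal bars: State runs down the vertical axis
hc_bar <- hchart(toy, "bar", hcaes(x = State, y = Total))

# "column" draws vertical bars: State runs along the horizontal axis
hc_column <- hchart(toy, "column", hcaes(x = State, y = Total))

hc_column
```

Only the type string changes; the `hcaes()` mapping stays the same in both calls.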

What if I wanted a line chart, maybe of the number of cases over time, and just for a few states? We could do that too, as shown below. Note that I am creating `tab2` by summing `cases` across counties for each state and date and then taking the natural log of that total (saved as `log_cases`), so that states with very different case counts can be compared on a common scale where equal vertical distances correspond to equal proportional changes.

In [None]:
covid %>%
  filter(
      state %in% c("Ohio", "Florida", "California", "New Jersey", "New York"),
      date >= "2020-03-01"
  ) %>%
  group_by(state, date) %>%
  mutate(
      log_cases = log(sum(cases))
  ) %>%
    ungroup() -> tab2

In [None]:
head(tab2)

There are duplicate rows per state per date because `mutate()` keeps one row per county, each carrying the same state total. I will keep only the columns I need and run `distinct()` to get rid of the duplicates.

In [None]:
tab2 %>%
    select(state, date, log_cases) %>%
    distinct() -> tab2_nodups

In [None]:
head(tab2_nodups)
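
The same table can likely be reached in one step with `summarise()`, which collapses each state/date group to a single row so there are no duplicates to remove afterwards. A sketch on toy rows (the tibble `toy_covid` and its values are made up for illustration):

```r
library(dplyr)

# Toy county-level rows invented for illustration
toy_covid <- tibble(
  state  = c("Ohio", "Ohio", "Ohio", "Florida"),
  county = c("Athens", "Franklin", "Athens", "Dade"),
  date   = as.Date(c("2020-03-01", "2020-03-01", "2020-03-02", "2020-03-01")),
  cases  = c(2, 8, 5, 20)
)

# One row per state and date; .groups = "drop" returns an ungrouped tibble
toy_totals <- toy_covid %>%
  group_by(state, date) %>%
  summarise(log_cases = log(sum(cases)), .groups = "drop")

toy_totals
```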

In [None]:
hchart(
    tab2_nodups, 
    "line", 
    hcaes(
        x = date, 
        y = log_cases, 
        group = state
        )
    ) 
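
An alternative to taking logs by hand is to plot the raw totals and ask for a logarithmic y-axis with `hc_yAxis(type = "logarithmic")`, which should keep the hover values in actual case counts. This is a sketch on invented numbers; `toy_series` and `total_cases` are names made up for the example.

```r
library(highcharter)

# Toy totals invented for illustration; all values are positive,
# which a logarithmic axis requires
toy_series <- data.frame(
  date        = rep(as.Date("2020-03-01") + 0:3, times = 2),
  state       = rep(c("Ohio", "New York"), each = 4),
  total_cases = c(1, 3, 9, 30, 10, 100, 900, 7000)
)

hc_log <- hchart(
  toy_series, "line",
  hcaes(x = date, y = total_cases, group = state)
) %>%
  hc_yAxis(
    type  = "logarithmic",
    title = list(text = "Total cases (log scale)")
  )

hc_log
```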

Now here is a county-level map that shows the total number of cases as of November 15, 2021. 

The data are stored in `tab3`, created as shown below. Pay attention to this step: we are not just filtering to a single date but also keeping `fips`, the county code that will serve as the key when we join these data to the map data.

In [None]:
covid %>%
  group_by(county, state, fips) %>%
  filter(date == "2021-11-15") %>% 
  unite(
      Location, 
      c(county, state), 
      sep = ", ", 
      remove = TRUE
  ) -> tab3

In [None]:
head(tab3)
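
The join to the map only works if the `fips` values match exactly as text. County FIPS codes are five digits and some begin with a zero, so a file that stores them as numbers silently loses those counties. Below is a sketch of restoring the padding, assuming (as the join on `fips` suggests) that the map file stores the codes as five-character strings; `pad_fips()` is a hypothetical helper written for this example.

```r
# Hypothetical helper: turn numeric FIPS codes back into 5-character strings
pad_fips <- function(x) {
  x <- as.integer(x)
  # Leave missing codes missing instead of turning them into the text "NA"
  ifelse(is.na(x), NA_character_, sprintf("%05d", x))
}

pad_fips(c(1001, 39049, NA))
```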

Here comes the map! 

Note that we are asking for the `cases` column to be used for the values that will color each county, and we are asking for the map to be joined to our data by matching `fips` in the highcharter map file with `fips` in `tab3`. 

The county borders will be in steelblue, and there will be 10 values used to create the fill color palette. 

The legend will be aligned right, and set to be horizontal. 

The color palette will be `magma` from the `{viridis}` package, reversed so that counties with more cases are drawn in darker colors.

In [None]:
library(viridis)

hcmap("countries/us/us-all-all", 
      data = tab3,
      name = "COVID-19 Cases", value = "cases",
      joinBy = c("fips", "fips"),
      borderColor = "steelblue"
     ) %>%
  hc_colorAxis(
      stops = color_stops(
          10, 
          rev(magma(10))
          )
      ) %>% 
  hc_legend(
      layout = "horizontal", 
      align = "right",
      floating = TRUE, 
      valueDecimals = 0, 
      valueSuffix = ""
  ) 

Note that `countries/us/us-all-all` indicates that we want counties. If we wanted the states instead it would have been `countries/us/us-all`.  
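
To see what is being handed to `hc_colorAxis()` in the map call, it can help to build the palette on its own. This sketch repeats the two pieces used above; the names `palette_10` and `stops` are placeholders.

```r
library(highcharter)
library(viridis)

# Ten colors from the magma palette, reversed so the lightest color comes first
palette_10 <- rev(magma(10))

# Pair each color with an evenly spaced position between 0 and 1
stops <- color_stops(10, palette_10)

length(stops)
stops[[1]]
```

Swapping `magma()` for another `{viridis}` function such as `viridis()` or `plasma()` changes the palette without touching the rest of the map code.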

What if we wanted only Ohio? 

Well, in that case we could subset as shown below. 

In [None]:
tab3 %>%
  filter(
      grepl(", Ohio", Location)
  ) -> tab4

In [None]:
head(tab4)

In [None]:
hcmap("countries/us/us-oh-all", 
      data = tab4,
      name = "COVID-19 Cases", value = "cases",
      joinBy = c("fips", "fips"),
      borderColor = "steelblue"
     ) %>%
  hc_colorAxis(
      stops = color_stops(
          10, 
          rev(magma(10))
      )
  ) %>% 
  hc_legend(
      layout = "horizontal", 
      align = "right",
      floating = TRUE, 
      valueDecimals = 0, 
      valueSuffix = ""
  ) 

There you have it! 

The one downside to these interactive charts is that they are best displayed in HTML files; in PDF and Word documents they lose their interactivity. Hence you see them a lot on blogs and in other web-based documents. 

All of these packages have been growing so it is quite likely that as software development continues even that barrier might be eliminated.
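
In the meantime, one way to share an interactive chart outside a notebook is to save it as a web page with `saveWidget()` from `{htmlwidgets}`. This is a sketch on toy data; the chart and the temporary file path are placeholders, and `selfcontained = FALSE` is used here because, as far as I know, the single-file option depends on pandoc being installed.

```r
library(highcharter)
library(htmlwidgets)

# Toy chart invented for illustration
toy <- data.frame(x = 1:5, y = c(2, 4, 3, 6, 5))
hc <- hchart(toy, "line", hcaes(x = x, y = y))

# Write an HTML page plus a sibling folder holding the JavaScript it needs;
# tempfile() is a placeholder, use a real path for your own work
out_file <- tempfile(fileext = ".html")
saveWidget(hc, file = out_file, selfcontained = FALSE)

file.exists(out_file)
```

Opening the saved file in a browser should show the same hover behavior as in the notebook.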

------------

# Exercises for Practice 

## Exercise 01
Create a map of all the counties in New York. Be sure to title the map and to fill in each county with the total number of COVID-19 cases it has seen to date. In addition, draw county borders in white. Use `theme_map()` and make sure the legend is at the bottom. [**Hint:** You will need to calculate the total number of cases per county and then join the resulting file with the counties data file to get the latitudes/longitudes for the counties.]

## Exercise 02 
Run the following code chunk to load data on murder, assault, and rape arrests per 100,000 persons in each state. `UrbanPop` is the percent of the state population that lives in an urban area. 

In [None]:
library(tidyverse)

data(USArrests)
names(USArrests)
USArrests$statename <- rownames(USArrests)

head(USArrests)

Now create a state-level map of the 50 states making sure to use `UrbanPop` to fill each state. Title the map and place the legend at the bottom. 

## Exercise 03

Use the `USArrests` data to draw scatterplots of (a) `Murder` versus `UrbanPop`, (b) `Assault` versus `UrbanPop`, and (c) `Rape` versus `UrbanPop`. Save each of these scatterplots by name and then use `patchwork` to create a single canvas that includes all three plots. Make sure you label the x-axis, y-axis, and title each plot. 

## Exercise 04 

Now create `highcharter` versions of each of the three scatterplots you created in Exercise 03 above. You should end up with three scatterplots, each on its own canvas. 