-
Notifications
You must be signed in to change notification settings - Fork 4
/
adorn-your-plots.Rmd
261 lines (209 loc) · 11.9 KB
/
adorn-your-plots.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
---
title: "How to adorn your plots"
output: rmarkdown::html_vignette
description: |
This vignette covers the core features (SI colors and styles) of the glitr package.
vignette: >
%\VignetteIndexEntry{How to adorn your plots}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
---
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
warning = FALSE,
message = FALSE,
fig.retina = 2)
```
### Introduction
This vignette provides best practices for applying SIEI recommended colors to visualizations using the `glitr` package.
### Getting Started
To get started, load the standard OHA-SI libraries used in analysis.
```{r setup, echo = T, eval = T}
library(dplyr)
library(tidyr)
library(forcats)
library(purrr)
library(ggplot2)
library(glitr)
library(systemfonts)
library(scales)
library(ggtext)
library(patchwork)
library(sf)
```
```{r whats_inside, echo = T, eval = T}
# Take a look at what's inside the package
ls('package:glitr')
```
The package can be divided into three main parts i) __colors__, that come as objects (such as `grey30k` or `genoa`), ii) __SI themes__, that can be used to quickly apply SI plot defaults, and iii) __helper functions__, that interpolate palettes or apply palettes to a visualization. This vignette will focus on exploring colors and the SI themes.
### Colors
A number of pre-defined colors come with the `glitr` package. All objects starting with 'grey' belong to a family of gray colors where `grey10k` is the lightest and `grey90k` the darkest. Objects starting with `usaid_` are the official [USAID colors](https://www.usaid.gov/node/194331) while those starting with `wapo_` are Washington Post inspired colors. `seie_` colors are largely out of style but included for posterity. color_` denote objects that can be used to for filling in captions, gridlines, plot text or plot titles. The family of `color_` objects follows the colors recommended in the style guide.
```{r list_colors}
# Colors belonging to greys, usaid_, wapo_ or siei_.
grep("(grey|siei_|wapo_|usaid_)", ls('package:glitr'), value = T)
```
The set of colors that is probably of most interest is the SIEI recommended colors. As you may recall from the [Data Visualization Style Guild](https://usaid-oha-si.github.io/styleguide/), the SI team has created seven core colors:
![#2057a7](https://via.placeholder.com/15/2057a7/000000?text=+) denim (#2057a7)
![#c43d4d](https://via.placeholder.com/15/c43d4d/000000?text=+) old_rose (#c43d4d)
![#8980cb](https://via.placeholder.com/15/8980cb/000000?text=+) moody_blue (#8980cb)
![#e07653](https://via.placeholder.com/15/e07653/000000?text=+) burnt_sienna (#e07653)
![#1e87a5](https://via.placeholder.com/15/1e87a5/000000?text=+) scooter (#1e87a5)
![#f2bc40](https://via.placeholder.com/15/f2bc40/000000?text=+) golden_sand (#f2bc40)
![#287c6f](https://via.placeholder.com/15/287c6f/000000?text=+) genoa (#287c6f)
![#808080](https://via.placeholder.com/15/808080/000000?text=+) trolley_grey (#808080)
```{r show_colors, echo = T, fig.height = 2, fig.width = 4}
# Any of these colors can be called by typing in the name of the color.
# The `show_col` function is from the `scales` package and is a handy way to preview a color.
show_col(genoa)
show_col(c(denim, old_rose, moody_blue, burnt_sienna), borders = F)
```
To access the full list of discrete or continuous palettes, see the si_palettes list or call the color palette directly using the name (`si_palettes$color_name`). Color names that are singular (`genoa`, `old_rose`) are categorical palettes based on suggested color pairs. Color names that are plural (`genoas`) are continuous palettes that can be applied to continuous variables. If you attempt to apply a discrete palette to a continuous variable, the color pairs will be recycled to the length of the vector you are attempting to encode.
```{r palettes, echo = TRUE, fig.height = 3, fig.width = 6}
# Returns the recommended paired colors with genoa as the base
si_palettes$genoa %>% show_col(labels = F, borders = F)
# Returns an interpolated vector of color values around the base genoa color
si_palettes$genoas %>% show_col(labels = F, borders = F)
```
Finally, if you want to create a custom palette from one of the color sets in the si_palettes list, you can do this using the `si_rampr()` function.
```{r custom_palettes, fig.height = 3, fig.width = 6}
# si_rampr takes a palette name and the number of interpolated colors (n) you wish to return as arguments.
denim_pal <- si_rampr(pal_name = "denims", n = 25)
denim_pal
show_col(denim_pal, labels = F, borders = F)
```
### Applying Colors
To add any of these colors to a graph, pass them as arguments in a ggplot2 call. We will work with one of the sample data sets to create a bar graph where positive testing results are colored with scooter.
```{r bar_graph_colors, fig.height = 4, fig.width = 7.5}
# Munge the hts data down to testing yields for a given year
hts_bar <-
hts %>%
filter(period == "FY50",
period_type == "cumulative") %>%
group_by(indicator, prime_partner_name) %>%
summarise(total = sum(value, na.rm = TRUE),
.groups = "drop") %>%
pivot_wider(names_from = indicator,
values_from = total,
names_glue = "{tolower(indicator)}") %>%
mutate(positivity = (hts_tst_pos / hts_tst),
prime_order = fct_reorder(prime_partner_name, hts_tst)) %>%
arrange(prime_order)
# Define testing results to be grey30k and positive results to be scooter.
# Play close attention to where the fill is placed in the geom_col() call.
# If placed inside the aesthetics, you will need to apply scale_fill_identity() to get the colors to render.
p <- hts_bar %>%
ggplot(aes(y = prime_order)) +
geom_col(aes(x = hts_tst), fill = grey30k) +
geom_col(aes(x = hts_tst_pos), fill = scooter) +
labs(x = NULL, y = NULL, title = "HTS_TST_POS FILLED WITH SCOOTER COLOR",
subtitle = "ggplot2 default settings | Not real data.")
p
```
Another way to use colors in plots is to assign color values to a new column in a data frame. For example, say we would like to highlight the two partners that have conducted the greatest number of tests. First, we will create a ranking variable that ranks partners by total testing. Then we will assign a set of colors based on the rankings.
```{r apply_colors, fig.height = 4, fig.width = 7.5}
# Create a rank column, then assign colors based on threshold (<=2)
hts_bar_rnkd <- hts_bar %>%
mutate(hts_rank = dense_rank(desc(hts_tst)),
rnk_color = case_when(
hts_rank <= 2 ~ denim,
TRUE ~ golden_sand
)
)
hts_bar_rnkd %>% filter(hts_rank < 7)
# Assign colors to top two ranked prime partners and plot.
# Use the scale_fill_identity() to let ggplot2 know where fill is from.
hts_bar_rnkd %>%
ggplot(aes(y = prime_order, x = hts_tst, fill = rnk_color)) +
geom_col() +
scale_fill_identity() +
labs(x = NULL, y = NULL, title = "HTS_TST FILLED BASED ON IDENTITY",
subtitle = "ggplot2 default settings | Not real data.")
```
To apply a continuous SIEI palette to a visualization, you can use the `scale_fill_si()` or `scale_color_si()` function. In the example below, we create a heat map, using `geom_tile()`, to show how testing targets vary across modality.
```{r encode_a_heatmap, fig.height = 5, fig.width = 7.5}
hts_hm <- hts %>%
filter(period_type == "targets", period == "FY50") %>%
group_by(prime_partner_name) %>%
mutate(total_targets = sum(value, na.rm = T)) %>%
ungroup() %>%
mutate(partner_order = fct_reorder(prime_partner_name, total_targets))
hts_hm %>%
ggplot(aes(x = modality, y = partner_order, fill = value)) +
geom_tile(color = "white") +
scale_fill_si(palette = "scooters") +
labs(x = NULL, y = NULL, title = "UNIMPRESSIVE HEAT MAP WITH SCOOTERS FILL",
subtitle = "ggplot2 default settings | Not real data.")
```
As is, this graphic is not too informative. Everything is a light shade of scooter. Why is this the case? If we were to look at the distribution of the data, we would see that it skews heavily toward 0. Many targets appear to be on the lower end, but a few outliers are really mucking up the color encoding. We can do a couple things to make this graphic more informative. Because the `scale_fill_si()` function takes the `...` argument, we can pass a transformation option to the plot, as well as define a new set of breaks.
```{r using_dots, fig.height = 5, fig.width = 7.5}
# Log transform data, clean up legend, apply semi-transparency (alpha) and label
hts_hm %>%
ggplot(aes(x = modality, y = partner_order, fill = value)) +
geom_tile(color = "white") +
scale_fill_si(palette = "scooters", alpha = 0.85, trans = "log",
breaks = c(1 * 10^(1:5)),
labels = comma,
name = "Targets") +
labs(x = NULL, y = NULL, title = "LOG-TRANSFORMED COLOR ENCODING",
subtitle = "ggplot2 default settings | Not real data.",
) +
geom_text(aes(label = ifelse(value > 10000, comma(value), NA_real_)),
size = 8/.pt, family = "Source Sans Pro", color = "white") +
scale_x_discrete(guide = guide_axis(n.dodge = 2), position = "top") +
theme_minimal()
```
### SI Themes
In order to create visualizations that appear to belong to the same family ([Think Baldwin Brothers](https://live.staticflickr.com/4111/5195907338_df50404c30_b.jpg)), `glitr` includes nine ggplot themes that simplify a plot down to its core elements. At the base of these themes is the `si_style()`.
```{r list_themes}
grep("(si_style)", ls('package:glitr'), value = T)
```
We can see how this differs from the default ggplot2 theme by applying the theme to the plot above.
```{r si_style, fig.height = 4, fig.width = 7.5}
p + si_style() +
labs(subtitle = "si_style_() | Title, subtitle and caption formatted.")
```
You will notice the si_style strips the background gray color and converts the grid lines to a light shade of gray. As this plot is oriented horizontally, we can remove the extra y gridlines by using the `si_style_xgrid` theme. All of the si themes also come with pre-formatted titles, subtitles, and captions following the style guild standards.
```{r si_style_xgrid, fig.height = 4, fig.width = 7.5}
p +
si_style_xgrid() +
scale_x_continuous(labels = comma) +
labs(caption = paste("Generated on ", Sys.Date(), " by the SI team."),
subtitle = "si_style_xgrid() | Title, subtitle and caption formatted.") +
coord_cartesian(expand = F) # Move names closer to y-axis
```
### Comparing Themes
Below is a summary graphic showing all the available SI themes.
```{r theme_comparison, fig.height = 7.5, fig.width = 7.5}
# Create a list of all themes to loop over
theme_list <- list('si_style()' = si_style(),
'si_style_xline()' = si_style_xline(),
'si_style_xgrid()' = si_style_xgrid(),
'si_style_xyline()' = si_style_xyline(),
'si_style_yline()' = si_style_yline(),
'si_style_ygrid()' = si_style_ygrid(),
'si_style_nolines()' = si_style_nolines(),
'si_style_void()' = si_style_void()
)
# Custom plot function to incorporate themes
custom_plot <- function(x) {
p + {{x}}
}
# Make all the plots
plots <- map2(theme_list, names(theme_list), ~custom_plot(.x) +
labs(subtitle = paste(.y), title = NULL) +
theme(axis.text = element_text(size = 12/.pt)))
# Create a sample map
hts_map <- hts_geo %>%
ggplot() +
geom_sf(aes(fill = prime_partner_name)) +
si_style_map() +
scale_fill_si(palette = "siei", discrete = T) +
theme(legend.position = "none") +
labs(subtitle = 'si_style_map()')
reduce(plots, `+`) +
hts_map +
plot_annotation(subtitle = "A comparison of si_style() themes") +
plot_layout(ncol = 3)
```