Percentage-y-scales with ..density.. in geom_histogram() works in facets with natural numbers, but not with (downscaled) decimal numbers #2499

Closed
ManuelNeumann opened this issue Mar 26, 2018 · 3 comments

@ManuelNeumann

I have a problem with percentage y-axes for histograms in facet plots (using the scales package). Everything worked fine when I used the original scale, but it failed once I rescaled the items to the interval between 0 and 1.

Here's my example data:

# ----- Facet Plot ------
library(ggplot2)
library(scales)
library(magrittr)
library(dplyr)
library(reshape2)

# Trial 1 ----
set.seed(1234)
id <- c(1:40)

# Create five items with values between 0 and 5
x1 <- runif(n = 40, min = 0, max = 5) %>% round()
x2 <- runif(n = 40, min = 0, max = 5) %>% round()
x3 <- runif(n = 40, min = 0, max = 5) %>% round()
x4 <- runif(n = 40, min = 0, max = 5) %>% round()
x5 <- runif(n = 40, min = 0, max = 5) %>% round()

# Make data
data <- cbind(id, x1, x2, x3, x4, x5) %>% as.data.frame()

# Make Index
data %<>% mutate(x6 = (x1 + x2 + x3 + x4 + x5)/5)

Now I wanted to make a facet plot showing the histogram of each item with the respective percentages on the y-axis. I therefore used aes(y = ..density..) for the histograms (as I found out on Stack Overflow), and everything works perfectly fine. This was exactly the plot I needed:

# Long format data frame for facet plot
long.data <- melt(data, id.vars = "id")
head(long.data)

# And make the plot:
ggplot(data = long.data, aes(value)) + 
  geom_histogram(bins = 6,
                 aes(y = ..density..)) +
  facet_wrap(~variable) +
  scale_y_continuous(labels = percent_format()) +
  theme_bw()

# Everything is just fine: The continuous x-scale and a percentage-y-scale.

[Figure: 01-facets]

But then, I had to re-scale the items to the interval between 0 and 1 for different models and analyses:

# Trial 2 ----

data %<>% mutate(x1_resc = x1/6,
                 x2_resc = x2/6,
                 x3_resc = x3/6,
                 x4_resc = x4/6,
                 x5_resc = x5/6,
                 x6_resc = (x1_resc + x2_resc + x3_resc + x4_resc + x5_resc)/5)

When I ran the code again (the ggplot code is identical to the one above), the y-axes of the histograms changed. Instead of the respective percentages, the axes now show percentage values between 0 and 200%.

# Long format
long.data <- melt(data[, c(1, 8:13)], id.vars = "id")
head(long.data)

# And the plot
ggplot(data = long.data, aes(value)) + 
  geom_histogram(bins = 6,
                 aes(y = ..density..)) +
  facet_wrap(~variable) +
  scale_y_continuous(labels = percent_format()) +
  theme_bw()

[Figure: 02-facets]

I can work with multiple workarounds (e.g. upscaling my items in the long-format and re-labelling the x-axis of the first plot), but I still hope this is an issue worth noticing.

@hadley
Member

hadley commented Apr 27, 2018

That is because the y-axis is a density (the area integrates to 1), not a percentage (the heights sum to 1).
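
To make the distinction concrete, here is a minimal base-R sketch with made-up values (the vector x and the breaks are illustrative assumptions, not data from this issue). It computes both quantities for the same bins and shows that only the density multiplied by the bin width sums to 1.

```r
# Ten made-up values on a 0-1 scale (hypothetical, not the issue's data)
x <- c(0.05, 0.15, 0.15, 0.35, 0.35, 0.35, 0.55, 0.75, 0.75, 0.95)

# Five bins, each 0.2 wide
breaks <- seq(0, 1, by = 0.2)
h <- hist(x, breaks = breaks, plot = FALSE)

width      <- diff(h$breaks)            # bin widths
proportion <- h$counts / sum(h$counts)  # share of observations per bin
density    <- proportion / width        # what ..density.. reports

sum(proportion)       # 1: the heights sum to 1
sum(density)          # 5: the heights do not sum to 1 ...
sum(density * width)  # 1: ... but the area does
max(density)          # 1.5, which a percent label would show as 150%
```

With bins that are 0.2 wide, every density is five times the corresponding proportion, so a percent label on a density axis overstates the share. When the bins are one unit wide, as they appear to be in the first plot on the original 0 to 5 scale, the two quantities coincide, which would explain why that plot looked right.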

@hadley hadley closed this as completed Apr 27, 2018
@lock

lock bot commented Oct 24, 2018

This old issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with reprex) and link to this issue. https://reprex.tidyverse.org/

@lock lock bot locked and limited conversation to collaborators Oct 24, 2018
@clauswilke
Member

clauswilke commented Dec 19, 2018

Since this issue is getting some attention on Twitter today: the solution is to use aes(y = stat(width*density)). Multiplying the density by the bin width converts it back into a proportion, which percent_format() then displays as a percentage. Reprex follows below.

library(ggplot2)
library(scales)
library(magrittr)
library(dplyr)
library(reshape2)

# Trial 1 ----
set.seed(1234)
id <- c(1:40)

# Create five items with values between 0 and 5
x1 <- runif(n = 40, min = 0, max = 5) %>% round()
x2 <- runif(n = 40, min = 0, max = 5) %>% round()
x3 <- runif(n = 40, min = 0, max = 5) %>% round()
x4 <- runif(n = 40, min = 0, max = 5) %>% round()
x5 <- runif(n = 40, min = 0, max = 5) %>% round()

# Make data
data <- cbind(id, x1, x2, x3, x4, x5) %>% as.data.frame()

# Make Index
data %<>% mutate(x6 = (x1 + x2 + x3 + x4 + x5)/5)

# Long format data frame for facet plot
long.data <- melt(data, id.vars = "id")

# And make the plot:
ggplot(data = long.data, aes(value)) + 
  geom_histogram(bins = 6,
                 aes(y = stat(width*density))) +
  facet_wrap(~variable) +
  scale_y_continuous(labels = percent_format()) +
  theme_bw()

data %<>% mutate(x1_resc = x1/6,
                 x2_resc = x2/6,
                 x3_resc = x3/6,
                 x4_resc = x4/6,
                 x5_resc = x5/6,
                 x6_resc = (x1_resc + x2_resc + x3_resc + x4_resc + x5_resc)/5)

# Long format
long.data <- melt(data[, c(1, 8:13)], id.vars = "id")

# And the plot
ggplot(data = long.data, aes(value)) + 
  geom_histogram(bins = 6,
                 aes(y = stat(width*density))) +
  facet_wrap(~variable) +
  scale_y_continuous(labels = percent_format()) +
  theme_bw()

Created on 2018-12-19 by the reprex package (v0.2.1)
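
In ggplot2 3.3.0 and later, the same computed-variable expression can be written with after_stat(), which replaced stat() and the ..density.. notation (both were deprecated in 3.4.0). The following is a sketch on stand-in data (the data frame df and its two groups are hypothetical, not the data from this issue); it also uses ggplot_build() to check that the bar heights in each panel sum to 1.

```r
library(ggplot2)

set.seed(1234)

# Hypothetical stand-in data: two items on a rescaled 0-1 scale
df <- data.frame(
  variable = rep(c("a", "b"), each = 40),
  value    = round(runif(80, min = 0, max = 5)) / 6
)

p <- ggplot(df, aes(value)) +
  # width * density turns the density back into a per-panel proportion
  geom_histogram(bins = 6, aes(y = after_stat(width * density))) +
  facet_wrap(~variable) +
  scale_y_continuous(labels = scales::percent_format()) +
  theme_bw()

# Inspect the computed bar heights: they should sum to 1 within each panel
built <- ggplot_build(p)$data[[1]]
panel_totals <- tapply(built$y, built$PANEL, sum)
panel_totals
```

Because the density is computed separately for each panel, multiplying by the bin width gives proportions that are relative to each facet rather than to the whole data set.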
