Bars become wider when certain factor levels are excluded despite using scale_x_discrete(drop = FALSE) #5400

Closed
isaacvock opened this issue Aug 28, 2023 · 1 comment


isaacvock commented Aug 28, 2023

I found a problem when making bar plots with geom_bar(), geom_col(), and geom_histogram() while using scale_x_discrete(drop = FALSE) to keep the x-axis extent for factor levels that are missing from the data.

When the plotted data frame has a factor column from which some levels have been filtered out, bar widths are scaled inconsistently. If only a single factor level is present in the data, the one bar has the width it would have if all levels were present (that is, it is narrow enough not to overlap the x-axis space reserved for the other levels); this is the expected and desired behavior.

If more than one factor level is present and some are missing, the bars can widen until the bars of the present levels nearly touch, as if those levels were adjacent; this is unexpected. The problem disappears, however, if any two of the present levels are adjacent.
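One possible explanation, stated here as an assumption and not confirmed anywhere in this issue, is that the default bar width is taken as 90% of the smallest gap between the x positions that actually occur in the data, instead of the gap between neighboring factor levels. The base R sketch below only illustrates that arithmetic; `smallest_gap()` and `default_bar_width()` are hypothetical helpers, not ggplot2 functions.

```r
# Hypothetical helper (not part of ggplot2): smallest gap between the
# distinct positions that occur in the data.
smallest_gap <- function(positions) {
  distinct <- sort(unique(positions))
  if (length(distinct) < 2) {
    return(1)  # assumed fallback when only one position is present
  }
  min(diff(distinct))
}

# Assumed default rule: bar width is 90% of the smallest gap.
default_bar_width <- function(positions) 0.9 * smallest_gap(positions)

default_bar_width(5)            # one level present
default_bar_width(c(5, 8))      # two non-adjacent levels
default_bar_width(c(2, 8))      # two levels far apart
default_bar_width(c(9, 10))     # two adjacent levels
default_bar_width(c(3, 4, 9))   # adjacent pair plus one distant level
```

Under this assumption the widths come out as 0.9, 2.7, 5.4, 0.9, and 0.9, which would match the pattern described above: normal bars whenever one level or an adjacent pair is present, and bars that grow with the distance between levels otherwise.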

Here is the code to reproduce the bug:

# Load dependencies
library(ggplot2)
library(dplyr)
library(patchwork)

# Toy data
data <- tibble(x = as.factor(1:10),
               y = 10)

## Only one factor level present in data
# Bar width is scaled the same as if all levels were present; EXPECTED
a <- ggplot(data %>% filter(x == 5),
       aes(x = x, y = y)) + 
  scale_x_discrete(limits = levels(data$x), drop = FALSE) + 
  geom_bar(stat = "identity")

## 2 factor levels present, 8 missing
# Bar width is larger than expected, overlapping area that would be occupied by other factor level bars; NOT EXPECTED
b <- ggplot(data %>% filter(x %in% c(5, 8)),
       aes(x = x, y = y)) + 
  scale_x_discrete(drop = FALSE) + 
  geom_bar(stat = "identity") 

## 2 factor levels present, 8 missing, played around with position_dodge2()
## Also tried preserve = "total" and position_dodge() with similar preserve args
# Bar width is larger than expected, overlapping area that would be occupied by other factor level bars; NOT EXPECTED
c <- ggplot(data %>% filter(x %in% c(5, 8)),
       aes(x = x, y = y)) + 
  scale_x_discrete(drop = FALSE) + 
  geom_bar(stat = "identity",
           position = position_dodge2(preserve = "single")) 

## 2 factor levels present, 8 missing
# Shows that behavior is to scale bar widths so that bars for present factor levels meet; NOT EXPECTED
d <- ggplot(data %>% filter(x %in% c(2, 8)),
       aes(x = x, y = y)) + 
  scale_x_discrete(drop = FALSE) +
  geom_col()

## 2 adjacent factor levels present, 8 missing
# Bar widths are scaled as if other levels present; EXPECTED
e <- ggplot(data %>% filter(x %in% c(9, 10)),
            aes(x = x, y = y)) + 
  scale_x_discrete(drop = FALSE) +
  geom_col()

## 2 adjacent factor levels present and 1 non-adjacent factor level present
# Bar widths are scaled as if other levels present; EXPECTED
f <- ggplot(data %>% filter(x %in% c(3, 4, 9)),
            aes(x = x, y = y)) + 
  scale_x_discrete(drop = FALSE) +
  geom_col()


(a | b) /
  (c | d) /
  (e | f)
  

[Image: repex_plots (panels a through f produced by the code above)]

I am using ggplot2 version 3.4.2 and R version 4.3.0. Judging by the ggplot2 NEWS file, it doesn't seem like this was a bug fixed in the 3.4.3 patch, so it should be reproducible in the newest version as well. I spent some time searching for other posts about this issue and did not find any, but I apologize if I just missed any such posts.
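A possible workaround, assuming the widening comes from how the default width is chosen, is to pass `width` explicitly so the bar size no longer depends on which levels are present. This is a sketch only; the value 0.9 is an assumption meant to mimic the usual default appearance, and the toy data is rebuilt with base R to keep the example self-contained.

```r
library(ggplot2)

# Same toy data as in the reprex, built with base R only
data <- data.frame(x = factor(1:10), y = 10)

# Keep two non-adjacent levels; subsetting a factor keeps its unused levels
two_levels <- data[data$x %in% c("5", "8"), ]

# Explicit width, so bar size does not depend on which levels are present
p <- ggplot(two_levels, aes(x = x, y = y)) +
  scale_x_discrete(drop = FALSE) +
  geom_col(width = 0.9)

p
```

The same `width` argument can be given to geom_bar(stat = "identity") in place of geom_col().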

Best,
Isaac

@isaacvock
Author

Realized I managed to miss the very recent issue post describing the same problem: #5396
