I found a problem when making bar plots with geom_bar(), geom_col(), and geom_histogram() while using scale_x_discrete(drop=FALSE) to retain the x-axis extent of missing factor levels.
When the plot is given a data frame with a factor column, and some of that factor's levels have been filtered out of the data to be visualized, the scaling of bar widths is inconsistent. If a single factor level is present in the data and all other levels are missing, the one bar has the width it would have if all factor levels were present (that is, it is narrow enough not to overlap the x-axis extent of bars for other factor levels); this was the expected and desired behavior.
If more than one factor level is present and some are missing, the bars can widen until the bars of the present levels nearly touch, as if they were adjacent levels; this was unexpected behavior. The problem disappears, though, if any two of the present factor levels are adjacent.
Here is the code to reproduce the bug:
# Load dependencies
library(ggplot2)
library(dplyr)
library(patchwork)
# Toy data
data <- tibble(x = as.factor(1:10),
               y = 10)
## Only one factor level present in data
# Bar width is scaled the same as if all levels were present; EXPECTED
a <- ggplot(data %>% filter(x == 5),
            aes(x = x, y = y)) +
  scale_x_discrete(limits = levels(data$x), drop = FALSE) +
  geom_bar(stat = "identity")
## 2 factor levels present, 8 missing
# Bar width is larger than expected, overlapping area that would be occupied by other factor level bars; NOT EXPECTED
b <- ggplot(data %>% filter(x %in% c(5, 8)),
            aes(x = x, y = y)) +
  scale_x_discrete(drop = FALSE) +
  geom_bar(stat = "identity")
## 2 factor levels present, 8 missing, played around with position_dodge2()
## Also tried preserve = "total" and position_dodge() with similar preserve args
# Bar width is larger than expected, overlapping area that would be occupied by other factor level bars; NOT EXPECTED
c <- ggplot(data %>% filter(x %in% c(5, 8)),
            aes(x = x, y = y)) +
  scale_x_discrete(drop = FALSE) +
  geom_bar(stat = "identity",
           position = position_dodge2(preserve = "single"))
## 2 factor levels present, 8 missing
# Shows that behavior is to scale bar widths so that bars for present factor levels meet; NOT EXPECTED
d <- ggplot(data %>% filter(x %in% c(2, 8)),
            aes(x = x, y = y)) +
  scale_x_discrete(drop = FALSE) +
  geom_col()
## 2 adjacent factor levels present, 8 missing
# Bar widths are scaled as if other levels present; EXPECTED
e <- ggplot(data %>% filter(x %in% c(9, 10)),
            aes(x = x, y = y)) +
  scale_x_discrete(drop = FALSE) +
  geom_col()
## 2 adjacent factor levels present and 1 non-adjacent factor level present
# Bar widths are scaled as if other levels present; EXPECTED
f <- ggplot(data %>% filter(x %in% c(3, 4, 9)),
            aes(x = x, y = y)) +
  scale_x_discrete(drop = FALSE) +
  geom_col()
(a|b) /
(c|d) /
(e|f)
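A possible explanation, offered as a sketch rather than a confirmed diagnosis: the pattern above is what you would see if the default bar width were taken as 0.9 times the smallest gap between the x positions actually present in the data, instead of a fixed 0.9 per level. ggplot2 exports resolution(), which computes that smallest gap, so the idea can be checked on plain numbers. The link between resolution() and the default width is my assumption here, and the names toy, present, p_fixed, and bar_extent are made up for illustration. If the assumption holds, passing an explicit width to geom_col() should sidestep the problem.

```r
library(ggplot2)

# Numeric positions of the present levels on the discrete x scale
# (level "5" sits at 5, level "8" at 8, and so on).
res_nonadjacent <- resolution(c(5, 8), zero = FALSE)     # smallest gap between 5 and 8
res_adjacent    <- resolution(c(3, 4, 9), zero = FALSE)  # smallest gap between 3 and 4

# ASSUMPTION: default bar width = 0.9 * resolution of the x positions in the data.
default_width_nonadjacent <- 0.9 * res_nonadjacent
default_width_adjacent    <- 0.9 * res_adjacent

# Same toy data as in the report, built with base R only (hypothetical names).
toy <- data.frame(x = factor(1:10), y = 10)
present <- toy[toy$x %in% c("5", "8"), ]

# Possible workaround: state the bar width explicitly instead of
# relying on the data-dependent default.
p_fixed <- ggplot(present, aes(x = x, y = y)) +
  scale_x_discrete(drop = FALSE) +
  geom_col(width = 0.9)

# Inspect the computed bar extents without rendering the plot.
bar_extent <- layer_data(p_fixed)[, c("x", "xmin", "xmax")]
print(bar_extent)
```

Under that assumption, plots b, c, and d get wide bars because their smallest gap is greater than 1, while plots e and f contain at least one adjacent pair and therefore fall back to a gap of 1. Plot a has only one position, for which resolution() returns 1.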
I am using ggplot2 version 3.4.2 and R version 4.3.0. Judging by the ggplot2 NEWS file, it doesn't seem like this was a bug fixed in the 3.4.3 patch, so it should be reproducible in the newest version as well. I spent some time searching for other posts about this issue and did not find any, but I apologize if I just missed any such posts.
Best,
Isaac