`geom_bar` inconsistently handling date values #2047

Open
nteetor opened this Issue Feb 15, 2017 · 7 comments

Contributor

nteetor commented Feb 15, 2017

Description

Specifying the fill aesthetic in geom_bar gives unexpected plots when the x aesthetic is a Date or POSIXct value. Below I have listed examples using Date and POSIXct objects, respectively. Similar data represented with these two classes produce rather different plots.

Date Examples

The following examples use Date objects produced with make_date().

1 month with 2 fill values

In this example January fill values are TRUE and FALSE, February and March fill values are only FALSE.

library(ggplot2)
library(lubridate)

d1 <- data.frame(
  dates = rep(make_date(year = 2017, month = 1:3), each = 2),
  highlight = c(TRUE, FALSE, FALSE, FALSE, FALSE, FALSE)
)

# this unexpectedly changes the width of one of January's bars
ggplot(d1, aes(x = dates)) +
  geom_bar(aes(fill = highlight))
## Warning: position_stack requires non-overlapping x intervals

unnamed-chunk-1-1

1 month with distinct fill value

In this example January fill values are only TRUE, both February and March are only FALSE.

library(ggplot2)
library(lubridate)

d2 <- data.frame(
  dates = make_date(year = 2017, month = 1:3),
  highlight = c(TRUE, FALSE, FALSE)
)

# only one bar for January, but as above it is thin
ggplot(d2, aes(x = dates)) +
  geom_bar(aes(fill = highlight))

unnamed-chunk-2-1

3+ fill values

This example introduces a third fill value to the highlight column.

library(ggplot2)
library(lubridate)

d3 <- data.frame(
  dates = make_date(year = 2017, month = 1:3),
  highlight = c(1, 2, 3)
)

# all bars are now equally thin, additional breaks added to x-axis
ggplot(d3, aes(x = dates)) +
  geom_bar(aes(fill = factor(highlight)))

unnamed-chunk-3-1

POSIXct Examples

The following examples use POSIXct objects produced with make_datetime().

1 month with distinct fill value

In this example January's fill value is only TRUE and the February and March fill values are only FALSE, as in the matching Date example. The January bar does not show; it may be too thin to see.

library(ggplot2)
library(lubridate)

d4 <- data.frame(
  datetimes = make_datetime(year = 2017, month = 1:3),
  highlight = c(TRUE, FALSE, FALSE)
)

# January bar is no longer thin; it is missing
ggplot(d4, aes(x = datetimes)) +
  geom_bar(aes(fill = highlight))

unnamed-chunk-4-1

3+ fill values

Similar to the 3+ fill value example above. In this example, however, the bars are not equally thin; they are all missing.

library(ggplot2)
library(lubridate)

d5 <- data.frame(
  datetimes = make_datetime(year = 2017, month = 1:3),
  highlight = c(1, 2, 3)
)

# no bars
ggplot(d5, aes(x = datetimes)) +
  geom_bar(aes(fill = factor(highlight)))

unnamed-chunk-5-1
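
A possible explanation for why Date bars are thin while POSIXct bars vanish, offered as a sketch rather than a confirmed diagnosis. It assumes the default bar width is 0.9 * resolution(x), computed within each fill group. Base as.Date() and as.POSIXct() stand in for the lubridate constructors, and the variable names are illustrative only.

```r
library(ggplot2)

dates <- as.Date(c("2017-01-01", "2017-02-01", "2017-03-01"))
datetimes <- as.POSIXct(c("2017-01-01", "2017-02-01", "2017-03-01"), tz = "UTC")

# Underlying numeric values: days since 1970-01-01 for Date,
# seconds since 1970-01-01 for POSIXct
days <- as.numeric(dates)
secs <- as.numeric(datetimes)

# A group holding a single x value has zero range, so resolution() returns 1:
# one day for Date, but only one second for POSIXct
res_single_date <- resolution(days[1])
res_single_datetime <- resolution(secs[1])

# A group holding all three months gets the shortest gap between values
res_all_date <- resolution(days)      # in days
res_all_datetime <- resolution(secs)  # in seconds

c(res_single_date, res_all_date)
c(res_single_datetime, res_all_datetime)
```

Under that assumption, a single-value group is drawn 0.9 days wide on a Date axis (thin but visible) and 0.9 seconds wide on a POSIXct axis spanning about two months (effectively invisible).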

Conclusion

The geom_bar fill aesthetic is not properly handling Date and POSIXct objects. I am not weighing in yet on whether bars ought to be thin or wide, rather I am hoping to iron out geom_bar so Date and POSIXct values are handled consistently. I hope I have not misunderstood how geom_bar is intended to handle date values.

I will look under the hood to try to identify the problem. For now I am not sure of a workaround.
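
One possible stopgap, sketched under the assumption that a discrete x axis is acceptable: formatting the dates as labels makes every bar sit at an integer position, so all groups get the same width. This is not a fix for the underlying behaviour, and it gives up the continuous date axis. The "%Y-%m" label format is an arbitrary choice for illustration.

```r
library(ggplot2)

d1 <- data.frame(
  dates = rep(as.Date(c("2017-01-01", "2017-02-01", "2017-03-01")), each = 2),
  highlight = c(TRUE, FALSE, FALSE, FALSE, FALSE, FALSE)
)

# Treat each date as a category, so bars sit at positions 1, 2, 3
p <- ggplot(d1, aes(x = format(dates, "%Y-%m"))) +
  geom_bar(aes(fill = highlight)) +
  labs(x = "dates")

# Inspect the computed bar extents without drawing the plot
bars <- layer_data(p)
bar_widths <- bars$xmax - bars$xmin
bar_widths
```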

Member

hadley commented Feb 15, 2017

Looks good - thanks. In the first plot, also note that the gap between the bars is not constant.

(I've deleted the comment thread since it's now no longer relevant)

Contributor

nteetor commented Feb 17, 2017

I came across a case which touches upon the first plot's inconsistent gaps you pointed out.

library(ggplot2)
library(lubridate)

d6 <- data.frame(
  dates = rep(make_date(year = 2017, month = 1:3), each = 2),
  highlight = c(TRUE, FALSE, TRUE, FALSE, FALSE, FALSE)
)

ggplot(d6, aes(x = dates)) +
  geom_bar(aes(fill = highlight))

unnamed-chunk-4-1

I attempted to calculate the width ahead of time using resolution() and the full dates column. The inconsistent gaps issue cropped up again.

# using d6 data frame from above
ggplot(d6, aes(x = dates)) +
  geom_bar(aes(fill = highlight), width = resolution(as.numeric(d6$dates)))

unnamed-chunk-4-2
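
One reading of the uneven gaps with a fixed width, sketched under the assumption that each bar is centred on the first of its month: the month starts are not evenly spaced, so a single width cannot leave equal gaps. The variable names below are illustrative.

```r
library(ggplot2)

dates <- as.Date(c("2017-01-01", "2017-02-01", "2017-03-01"))
x <- as.numeric(dates)

w <- resolution(x)   # shortest gap between values, in days
spacing <- diff(x)   # distance between consecutive month starts
gaps <- spacing - w  # empty space left between neighbouring bars

w
spacing
gaps
```

January has 31 days and February 2017 has 28, so bars that are 28 days wide leave a 3-day gap after January and none after February.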

jensenerik0 commented Aug 2, 2017

I think I have a minimal example of the same bug that doesn't use dates.

library(ggplot2)

x <- c(0, 0, 2, 1)
x_shift <- x+1
y_bool <- c(TRUE, TRUE, TRUE, FALSE)

df <- data.frame(x, x_shift, y_bool)

# This plot returns:
# Warning message:
# position_stack requires non-overlapping x intervals
ggplot(data = df, aes(x = x, fill = y_bool)) +
  geom_bar()

image

# This plot gives no warning and looks as expected
ggplot(data = df, aes(x = x_shift, fill = y_bool)) +
  geom_bar()

image

Manually setting "width" in geom_bar can make the issue appear or disappear in both cases.
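
A possible reason the shifted data behaves differently, sketched under the assumption that the default width comes from resolution() applied to each fill group: resolution() defaults to zero = TRUE, which adds 0 to the values before taking the smallest gap.

```r
library(ggplot2)

x <- c(0, 0, 2, 1)
x_shift <- x + 1
y_bool <- c(TRUE, TRUE, TRUE, FALSE)

# Resolution of each fill group, computed separately
res_x <- sapply(split(x, y_bool), resolution)
res_shift <- sapply(split(x_shift, y_bool), resolution)

res_x      # TRUE group sees only {0, 2}
res_shift  # TRUE group sees {0, 1, 3} once zero is included
```

With x, the TRUE group's smallest gap is 2, so its bars would be twice as wide as the FALSE bar. With x_shift, the added zero brings the smallest gap back to 1 and all groups agree.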

Member

hadley commented May 9, 2018

Minimal reprex:

library(ggplot2)

df <- data.frame(
  x = c(0, 0, 2, 1), 
  fill = c(TRUE, TRUE, TRUE, FALSE)
)

ggplot(df, aes(x, fill = fill)) + geom_bar()
#> Warning: position_stack requires non-overlapping x intervals

ggplot(df, aes(x, fill = fill)) + geom_bar(width = 0.9)

I think the root cause is that stat_count() is computing the width based on the resolution of an individual group rather than the full dataset.
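
The hypothesis can be checked by hand. This is a minimal sketch, assuming the default width is 0.9 * resolution(x) evaluated per fill group. The intervals() and overlaps() helpers are hypothetical, written only for this illustration, and are not part of ggplot2.

```r
library(ggplot2)

df <- data.frame(
  x = c(0, 0, 2, 1),
  fill = c(TRUE, TRUE, TRUE, FALSE)
)

# Hypothetical helper: bar extents for a set of x positions and one width
intervals <- function(x, width) {
  x <- sort(unique(x))
  data.frame(x = x, xmin = x - width / 2, xmax = x + width / 2)
}

# Hypothetical helper: TRUE if any bar starts before the previous one ends
overlaps <- function(iv) any(head(iv$xmax, -1) > tail(iv$xmin, -1))

# Width per fill group (assumed to mirror the per-group default)
w_by_group <- sapply(split(df$x, df$fill), function(x) 0.9 * resolution(x))

# Width from the full dataset
w_full <- 0.9 * resolution(df$x)

per_group <- rbind(
  intervals(df$x[df$fill], w_by_group[["TRUE"]]),
  intervals(df$x[!df$fill], w_by_group[["FALSE"]])
)
per_group <- per_group[order(per_group$x), ]
full <- intervals(df$x, w_full)

overlaps(per_group)  # bars collide when widths are computed per group
overlaps(full)       # no collision with a single dataset-wide width
```

Here the TRUE group only sees x = 0 and x = 2, so its bars come out wide enough to cover the FALSE bar at x = 1, which would be consistent with the position_stack warning.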

thibautjombart commented May 22, 2018

For what it is worth, this bug results in issues with the RECON package incidence:

library(incidence)

set.seed(1)

dates <- as.Date("2018-01-01") + sample(1:20, 100, replace = TRUE)
dates_posix <- as.POSIXct(dates)

plot(incidence(dates))

plot(incidence(dates_posix))

Member

hadley commented May 22, 2018

To be clear, are you stating that this is not a problem with the released version of ggplot2?

woodwards commented Aug 8, 2018

It seems to be an issue with geom_col and geom_bar, not dates or facets as such.

library(ggplot2)
ggplot(mtcars) +
  geom_col(mapping=aes(x=disp, y=mpg)) +
  geom_point(mapping=aes(x=disp, y=mpg))

Created on 2018-08-08 by the reprex package (v0.2.0).
