stat_count gives cryptic error when used on a column of doubles #4609

Closed
arencambre opened this issue Sep 8, 2021 · 7 comments · Fixed by #5520
Labels
bug an unexpected problem or unintended behavior layers 📈

Comments

@arencambre

Run this trivial code (csv is attached):

library(tidyverse)

data <- read_csv("data.csv")

data %>%
  ggplot(aes(x=Tenure)) +
  geom_bar()

You get this warning:

Warning message:
Computation failed in `stat_count()`:
Elements must equal the number of rows or 1 

This is the resulting plot:
image

The data:
data.csv

Expected behavior: it should "just work". Every column in this tibble is a double. I reviewed the geom_bar() documentation and see nothing that says this should not work.

If I have done something wrong here, then this becomes a feature request for a useful error message or improved documentation.


@hadley
Member

hadley commented Mar 15, 2022

Minimal reprex:

library(ggplot2)

df <- data.frame(x = rep(c(1, 2), 5) + rep(c(0, -2.220446e-16), c(4, 1)))
df
#>    x
#> 1  1
#> 2  2
#> 3  1
#> 4  2
#> 5  1
#> 6  2
#> 7  1
#> 8  2
#> 9  1
#> 10 2
ggplot(df, aes(x)) + geom_bar()
#> Warning: Computation failed in `stat_count()`:
#> Elements must equal the number of rows or 1

Created on 2022-03-15 by the reprex package (v2.0.1)

Since this seems like a floating-point buglet, I think it's worth taking a look to see what's going wrong.

@hadley hadley added bug an unexpected problem or unintended behavior layers 📈 labels Mar 15, 2022
@yutannihilation
Member

It seems the problem is that vctrs::vec_unique() (which is used in unique0()) and as.factor() (which is used in tapply()) apply different criteria for treating two values as the "same".

ggplot2/R/stat-count.r

Lines 79 to 89 in a979ffd

count <- as.numeric(tapply(weight, x, sum, na.rm = TRUE))
count[is.na(count)] <- 0
bars <- data_frame0(
  count = count,
  prop = count / sum(abs(count)),
  x = sort(unique0(x)),
  width = width,
  flipped_aes = flipped_aes,
  .size = length(count)
)

df <- data.frame(x = rep(c(1, 2), 5) + rep(c(0, -2.220446e-16), c(4, 1)))
ggplot2:::unique0(df$x)
#> [1] 1 2 1 2

tapply(rep(1, times = nrow(df)), df$x, sum, na.rm = TRUE)
#> 1 2 
#> 5 5
as.factor(df$x)
#>  [1] 1 2 1 2 1 2 1 2 1 2
#> Levels: 1 2

Created on 2022-07-23 by the reprex package (v2.0.1)
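
The mismatch can also be reproduced in base R, without ggplot2 internals. This is only a sketch: it uses base unique() as a stand-in for unique0(), on the assumption that both compare doubles exactly, and the variable names are illustrative. tapply() groups through a factor, whose levels are built from the character representation of the values (15 significant digits), so the two nearly equal pairs collapse into two groups while the exact comparison keeps four distinct values.

```r
x <- rep(c(1, 2), 5) + rep(c(0, -2.220446e-16), c(4, 1))
weight <- rep(1, length(x))

# tapply() converts x to a factor, so values that print identically
# at 15 significant digits end up in the same group
count <- as.numeric(tapply(weight, x, sum, na.rm = TRUE))

# unique() compares the doubles exactly (stand-in for unique0())
positions <- sort(unique(x))

length(count)      # number of groups found by tapply()
length(positions)  # number of distinct values found by unique()

# Printing with 17 significant digits reveals the hidden difference
format(positions, digits = 17)
```

Because the `count` column and the `x` column come out with different lengths, a data frame constructor that enforces `.size = length(count)` cannot combine them, which is consistent with the "Elements must equal the number of rows or 1" message.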

@teunbrand
Collaborator

In that case, should we add a tolerance or treat the values as unequal? If they are treated as unequal, we could replace the tapply() with rowsum().
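
The two options can be compared side by side in base R. This is a rough sketch, not the change that was eventually merged: the `digits` tolerance is a hypothetical value chosen for illustration, and base unique() again stands in for unique0(). The point is that in each option the counts and the x positions are derived from the same notion of equality, so their lengths agree.

```r
x <- rep(c(1, 2), 5) + rep(c(0, -2.220446e-16), c(4, 1))
weight <- rep(1, length(x))

# Option A: treat nearly equal values as distinct.
# rowsum() groups by the exact values of x and, by default, orders the
# result by the sorted group values, matching sort(unique(x)).
count_exact <- as.numeric(rowsum(weight, x, na.rm = TRUE))
x_exact <- sort(unique(x))

# Option B: apply a tolerance by rounding before grouping.
# `digits` is a hypothetical tolerance, not a ggplot2 parameter.
digits <- 10
x_rounded <- signif(x, digits)
count_tol <- as.numeric(rowsum(weight, x_rounded, na.rm = TRUE))
x_tol <- sort(unique(x_rounded))

# In both options the counts line up with the x positions
data.frame(x = format(x_exact, digits = 17), count = count_exact)
data.frame(x = x_tol, count = count_tol)
```

Option A yields four bars, two of which overlap visually, while Option B merges them into two bars at the cost of choosing a tolerance.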

@hadley
Member

hadley commented Nov 13, 2023

I'd say we follow whatever vec_unique() does.
