add_count() and add_tally() should error if output column name is already used #4284
Comments
It seems the behavior of
Now, though, under 0.8.0.1, it overwrites the
I like the new option to specify a column name (!) but it seems that existing code that relied on the
Ran into the problem @nschmucker brings up. I went looking for a mention in
Now:
library(dplyr, warn.conflicts = FALSE)
df <- tibble(g = c(1, 1, 2), n = c(1, 1, 10))
df %>% count(g)
#> Using `n` as weighting variable
#> Error: Column 'n' is already present in output
#> * Use `name = "new_name"` to pick a new name
df %>% add_count(g)
#> Using `n` as weighting variable
#> Error: Column 'n' is already present in output
#> * Use `name = "new_name"` to pick a new name
# Create new variable
df %>% count(g, name = "nn")
#> Using `n` as weighting variable
#> # A tibble: 2 x 2
#>       g    nn
#>   <dbl> <dbl>
#> 1     1     2
#> 2     2    10
df %>% add_count(g, name = "nn")
#> Using `n` as weighting variable
#> # A tibble: 3 x 3
#>       g     n    nn
#>   <dbl> <dbl> <dbl>
#> 1     1     1     2
#> 2     1     1     2
#> 3     2    10    10
# Override existing variable
df %>% count(g, name = "n")
#> Using `n` as weighting variable
#> # A tibble: 2 x 2
#>       g     n
#>   <dbl> <dbl>
#> 1     1     2
#> 2     2    10
df %>% add_count(g, name = "n")
#> Using `n` as weighting variable
#> # A tibble: 3 x 2
#>       g     n
#>   <dbl> <dbl>
#> 1     1     2
#> 2     1     2
#> 3     2    10

Created on 2019-12-31 by the reprex package (v0.3.0)

(and similarly for
I'm wondering if add_count() and add_tally() should throw an error when the output column name (default "n") is already used by a pre-existing column. Otherwise they end up behaving oddly. For instance:

Do we want n to be used as weighting variable? This behaviour seems unexpected to me.

In this case, add_count() silently replaces the pre-existing column "n" with its output, which is likely not the user's intent.

EDIT: I see that the behaviour with add_count is already discussed here. Fair enough on that front.
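For anyone who wants the stricter behaviour in their own code regardless of dplyr version, here is a minimal sketch of a guard around add_count(). The helper name add_count_strict() is hypothetical and not part of dplyr; it only checks the input's column names before delegating, and it assumes a dplyr version where add_count() accepts the name argument.

```r
library(dplyr, warn.conflicts = FALSE)

# Hypothetical helper (not part of dplyr): refuse to add a count column
# whose name already exists in the input data frame.
add_count_strict <- function(df, ..., name = "n") {
  if (name %in% names(df)) {
    stop(
      "Column '", name, "' is already present in the input. ",
      "Use `name = \"new_name\"` to pick a new name.",
      call. = FALSE
    )
  }
  # Delegate to dplyr once the name is known to be free
  add_count(df, ..., name = name)
}

df <- tibble(g = c(1, 1, 2), n = c(1, 1, 10))

# Errors, because `n` already exists in `df`
try(add_count_strict(df, g))

# Adds a new column `nn` and leaves the existing `n` untouched
add_count_strict(df, g, name = "nn")
```

Whether the existing n column is also used as a weighting variable in the second call depends on the installed dplyr version, so the values in nn are not shown here.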