Referencing other columns inside mutate/summarise across is broken in 1.0.4 #5734
Comments
I'm not sure why this worked previously. It seems odd that a function would be instrumented like this. OTOH, functions created from formulas are:
library(dplyr, warn.conflicts = FALSE)
storms %>%
mutate(across(c(wind, pressure), ~ . / lat))
#> # A tibble: 10,010 x 13
#> name year month day hour lat long status category wind pressure
#> <chr> <dbl> <dbl> <int> <dbl> <dbl> <dbl> <chr> <ord> <dbl> <dbl>
#> 1 Amy 1975 6 27 0 27.5 -79 tropi… -1 0.909 36.8
#> 2 Amy 1975 6 27 6 28.5 -79 tropi… -1 0.877 35.5
#> 3 Amy 1975 6 27 12 29.5 -79 tropi… -1 0.847 34.3
#> 4 Amy 1975 6 27 18 30.5 -79 tropi… -1 0.820 33.2
#> 5 Amy 1975 6 28 0 31.5 -78.8 tropi… -1 0.794 32.1
#> 6 Amy 1975 6 28 6 32.4 -78.7 tropi… -1 0.772 31.2
#> 7 Amy 1975 6 28 12 33.3 -78 tropi… -1 0.751 30.4
#> 8 Amy 1975 6 28 18 34 -77 tropi… -1 0.882 29.6
#> 9 Amy 1975 6 29 0 34.4 -75.8 tropi… 0 1.02 29.2
#> 10 Amy 1975 6 29 6 34 -74.8 tropi… 0 1.18 29.5
#> # … with 10,000 more rows, and 2 more variables: ts_diameter <dbl>,
#> # hu_diameter <dbl>
Created on 2021-02-04 by the reprex package (v0.3.0) |
Sometimes it is useful to change the value of a set of columns (the columns inside the across statement), depending on the value of other columns for the corresponding rows. |
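To illustrate the use case described above, here is a minimal sketch, not taken from the thread. It uses the formula (lambda) form, which an earlier comment reports as still working in 1.0.4. The tibble `df` and its columns `flag`, `a`, and `b` are hypothetical names made up for this example.

```r
library(dplyr, warn.conflicts = FALSE)

# Hypothetical toy data: `flag` decides, row by row, whether the
# values in `a` and `b` are kept or replaced by NA.
df <- tibble(
  flag = c(TRUE, FALSE, TRUE),
  a    = c(1, 2, 3),
  b    = c(10, 20, 30)
)

# The formula refers to `flag`, a column that is NOT among the
# columns selected by across().
masked <- df %>%
  mutate(across(c(a, b), ~ if_else(flag, .x, NA_real_)))

masked
```

Each selected column is rewritten using the value of `flag` in the same row, which is the pattern the comment above describes.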
Actually, this can be dealt with in 1.0.4 by doing the following
Now this issue can be closed |
Reopening because the example in the comment above does not work in 1.0.4 |
addresses a change in dplyr 1.0.4 where newly defined columns in a mutate call cannot be accessed in anonymous functions inside of across(). see tidyverse/dplyr#5734.
In dplyr 1.0.3 you can reference other columns in the same data frame/tibble group by name. This functionality is broken in 1.0.4.
To reproduce, the following example works in 1.0.3
In 1.0.4, running the above example results in the following error message
i.e. we can't reference other columns by name. A possible workaround is to use
cur_data()$name_of_column
but this is slower, as the following benchmark demonstrates:
which results in the following output
TL;DR In dplyr 1.0.3, using
cur_data()$column_name
to reference columns instead of directly using the column names can be considerably slower. In 1.0.4, referencing columns by name (without cur_data()) is currently broken.
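For readers who want to see the shape of the `cur_data()` workaround mentioned above, here is a minimal sketch. It is not the original reproduction or benchmark from this issue. The tibble `df` and its columns `g`, `x`, `y`, and `d` are hypothetical, and the sketch assumes a dplyr version in which `cur_data()` is available (later releases may emit a deprecation warning for it).

```r
library(dplyr, warn.conflicts = FALSE)

# Hypothetical toy data: divide `x` and `y` by `d` within each group `g`.
df <- tibble(
  g = c("a", "a", "b"),
  x = c(2, 4, 6),
  y = c(10, 20, 30),
  d = c(2, 2, 3)
)

# Workaround: the function passed to across() looks up `d` through
# cur_data(), which returns the rows of the current group, instead of
# referring to `d` directly by name.
res <- df %>%
  group_by(g) %>%
  mutate(across(c(x, y), function(v) v / cur_data()$d)) %>%
  ungroup()

res
```

Because `cur_data()` rebuilds the current group's data each time the function is called, this form may carry the overhead described in the issue.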