Support case_when function inside mutate without .$ #1965

Closed
kanaugust opened this Issue Jun 25, 2016 · 8 comments

kanaugust commented Jun 25, 2016

This was already discussed in #631, but that issue has since been closed, so I'd like to raise it again as an enhancement request in the hope of getting it supported.

Here is some sample data:

a <- data_frame(price = c(1000, 500, 100, 10))
> a
Source: local data frame [4 x 1]

  price
  <dbl>
1  1000
2   500
3   100
4    10

The following command doesn't work:

a %>% mutate(category = case_when(price > 900 ~ "Super Expensive", 
                                  price >= 500 ~ "Expensive", 
                                  price >= 100 ~ "Mild",
                                  TRUE ~ "Cheap"))
Error: object 'price' not found

The following command, which uses .$, does work:

a %>% mutate(category = case_when(.$price > 900 ~ "Super Expensive", 
                                  .$price >= 500 ~ "Expensive", 
                                  .$price >= 100 ~ "Mild",
                                  TRUE ~ "Cheap"))
Source: local data frame [4 x 2]

  price        category
  <dbl>           <chr>
1  1000 Super Expensive
2   500       Expensive
3   100            Mild
4    10           Cheap

It would be nice if case_when() worked inside mutate() without the .$ notation.
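For readers wondering why the .$ form works: inside a magrittr pipe, `.` stands for the data frame on the left-hand side, so `.$price` is ordinary column extraction and does not rely on mutate() looking up `price` itself. Below is a minimal sketch, assuming dplyr is installed. The names `direct` and `piped` are illustrative only and do not come from the report.

```r
library(dplyr)

# Same sample data as above, built with base R's data.frame()
a <- data.frame(price = c(1000, 500, 100, 10))

# case_when() called directly on a plain vector, outside mutate()
direct <- case_when(
  a$price > 900  ~ "Super Expensive",
  a$price >= 500 ~ "Expensive",
  a$price >= 100 ~ "Mild",
  TRUE           ~ "Cheap"
)

# The workaround from the report: `.` is the piped-in data frame
piped <- a %>%
  mutate(category = case_when(
    .$price > 900  ~ "Super Expensive",
    .$price >= 500 ~ "Expensive",
    .$price >= 100 ~ "Mild",
    TRUE           ~ "Cheap"
  ))

# Both routes should give the same labels
identical(direct, piped$category)
```

One caveat with the workaround: `.` is the whole data frame, so on a grouped data frame `.$price` refers to every row rather than to the current group.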

M-E-Rademaker commented Jun 28, 2016

Agreed. I was just about to request the same thing.

Member

krlmlr commented Jun 29, 2016

Looks like a duplicate of #1719.

hadley closed this Jun 29, 2016

bschneidr commented Mar 3, 2017

Although this issue (#1965) and #1719 were closed, the problem of using case_when() within mutate() remains. For example, the following still doesn't work:

iris %>%
    mutate(versicolor_or_virginica = case_when(Species == "versicolor" ~ TRUE,
                                               Species == "virginica" ~ TRUE,
                                               TRUE ~ FALSE))

and the user must instead do the following:

iris %>%
    mutate(versicolor_or_virginica = case_when(.$Species == "versicolor" ~ TRUE,
                                               .$Species == "virginica" ~ TRUE,
                                               TRUE ~ FALSE))

This is true for the latest versions of dplyr available on CRAN and GitHub.
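As an aside, this particular example does not strictly need case_when(), because the two matching conditions map to the same value. Here is a hedged sketch of an alternative using base R's %in% operator inside mutate(), assuming dplyr is installed. The name `result` is illustrative only.

```r
library(dplyr)

# TRUE for rows whose Species is versicolor or virginica, FALSE otherwise.
# %in% is a base R operator, so no formula evaluation is involved.
result <- iris %>%
  mutate(versicolor_or_virginica = Species %in% c("versicolor", "virginica"))

# Count of flagged rows per species
table(result$Species, result$versicolor_or_virginica)
```

This sidesteps the problem only for simple membership tests. It is not a replacement for case_when() when the branches return different values.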

Member

krlmlr commented Mar 3, 2017

@bschneidr: Strange, this works for me with a recent-ish GitHub version. Could you please submit your output of reprex::reprex(si = TRUE)?

bschneidr commented Mar 3, 2017

Absolutely (and thanks for introducing me to reprex!)

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
iris %>% mutate(versicolor_or_virginica = case_when(Species == "versicolor" ~ 
  TRUE, Species == "virginica" ~ TRUE, TRUE ~ FALSE))
#> Error in mutate_impl(.data, dots): object 'Species' not found
Session info
devtools::session_info()
#> Session info --------------------------------------------------------------
#>  setting  value                       
#>  version  R version 3.3.2 (2016-10-31)
#>  system   x86_64, mingw32             
#>  ui       RTerm                       
#>  language (EN)                        
#>  collate  English_United States.1252  
#>  tz       America/New_York            
#>  date     2017-03-03
#> Packages ------------------------------------------------------------------
#>  package    * version date       source        
#>  assertthat   0.1     2013-12-06 CRAN (R 3.3.1)
#>  backports    1.0.5   2017-01-18 CRAN (R 3.3.2)
#>  DBI          0.5-1   2016-09-10 CRAN (R 3.3.1)
#>  devtools     1.12.0  2016-06-24 CRAN (R 3.3.1)
#>  digest       0.6.12  2017-01-27 CRAN (R 3.3.2)
#>  dplyr      * 0.5.0   2016-06-24 CRAN (R 3.3.2)
#>  evaluate     0.10    2016-10-11 CRAN (R 3.3.1)
#>  formatR      1.4     2016-05-09 CRAN (R 3.3.0)
#>  htmltools    0.3.5   2016-03-21 CRAN (R 3.3.1)
#>  knitr        1.15.1  2016-11-22 CRAN (R 3.3.2)
#>  lazyeval     0.2.0   2016-06-12 CRAN (R 3.3.2)
#>  magrittr     1.5     2014-11-22 CRAN (R 3.3.1)
#>  memoise      1.0.0   2016-01-29 CRAN (R 3.3.1)
#>  R6           2.2.0   2016-10-05 CRAN (R 3.3.1)
#>  Rcpp         0.12.9  2017-01-14 CRAN (R 3.3.2)
#>  rmarkdown    1.3     2016-12-21 CRAN (R 3.3.2)
#>  rprojroot    1.2     2017-01-16 CRAN (R 3.3.2)
#>  stringi      1.1.2   2016-10-01 CRAN (R 3.3.1)
#>  stringr      1.2.0   2017-02-18 CRAN (R 3.3.2)
#>  tibble       1.2     2016-08-26 CRAN (R 3.3.1)
#>  withr        1.0.2   2016-06-20 CRAN (R 3.3.0)
#>  yaml         2.1.14  2016-11-12 CRAN (R 3.3.2)

Member

krlmlr commented Mar 3, 2017

This has only been fixed in the GitHub version so far. Could you please try that as well?

bschneidr commented Mar 3, 2017

I've re-installed the GitHub version using the following code:

devtools::install_github("hadley/dplyr")

and the iris example worked (see the reprex below).

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
iris %>%
  as_tibble() %>%
  mutate(versicolor_or_virginica = case_when(Species == "versicolor" ~ TRUE, 
                                             Species == "virginica" ~ TRUE, 
                                             TRUE ~ FALSE))
#> # A tibble: 150 × 6
#>    Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#>           <dbl>       <dbl>        <dbl>       <dbl>  <fctr>
#> 1           5.1         3.5          1.4         0.2  setosa
#> 2           4.9         3.0          1.4         0.2  setosa
#> 3           4.7         3.2          1.3         0.2  setosa
#> 4           4.6         3.1          1.5         0.2  setosa
#> 5           5.0         3.6          1.4         0.2  setosa
#> 6           5.4         3.9          1.7         0.4  setosa
#> 7           4.6         3.4          1.4         0.3  setosa
#> 8           5.0         3.4          1.5         0.2  setosa
#> 9           4.4         2.9          1.4         0.2  setosa
#> 10          4.9         3.1          1.5         0.1  setosa
#> # ... with 140 more rows, and 1 more variables:
#> #   versicolor_or_virginica <lgl>
Session info
devtools::session_info()
#> Session info --------------------------------------------------------------
#>  setting  value                       
#>  version  R version 3.3.2 (2016-10-31)
#>  system   x86_64, mingw32             
#>  ui       RTerm                       
#>  language (EN)                        
#>  collate  English_United States.1252  
#>  tz       America/New_York            
#>  date     2017-03-03
#> Packages ------------------------------------------------------------------
#>  package    * version    date       source                       
#>  assertthat   0.1        2013-12-06 CRAN (R 3.3.1)               
#>  backports    1.0.5      2017-01-18 CRAN (R 3.3.2)               
#>  bindr        0.1        2016-11-13 CRAN (R 3.3.2)               
#>  bindrcpp   * 0.1        2016-12-11 CRAN (R 3.3.2)               
#>  DBI          0.5-1      2016-09-10 CRAN (R 3.3.1)               
#>  devtools     1.12.0     2016-06-24 CRAN (R 3.3.1)               
#>  digest       0.6.12     2017-01-27 CRAN (R 3.3.2)               
#>  dplyr      * 0.5.0.9000 2017-03-03 Github (hadley/dplyr@8430adc)
#>  evaluate     0.10       2016-10-11 CRAN (R 3.3.1)               
#>  formatR      1.4        2016-05-09 CRAN (R 3.3.0)               
#>  htmltools    0.3.5      2016-03-21 CRAN (R 3.3.1)               
#>  knitr        1.15.1     2016-11-22 CRAN (R 3.3.2)               
#>  lazyeval     0.2.0      2016-06-12 CRAN (R 3.3.2)               
#>  magrittr     1.5        2014-11-22 CRAN (R 3.3.1)               
#>  memoise      1.0.0      2016-01-29 CRAN (R 3.3.1)               
#>  R6           2.2.0      2016-10-05 CRAN (R 3.3.1)               
#>  Rcpp         0.12.9     2017-01-14 CRAN (R 3.3.2)               
#>  rmarkdown    1.3        2016-12-21 CRAN (R 3.3.2)               
#>  rprojroot    1.2        2017-01-16 CRAN (R 3.3.2)               
#>  stringi      1.1.2      2016-10-01 CRAN (R 3.3.1)               
#>  stringr      1.2.0      2017-02-18 CRAN (R 3.3.2)               
#>  tibble       1.2        2016-08-26 CRAN (R 3.3.1)               
#>  withr        1.0.2      2016-06-20 CRAN (R 3.3.0)               
#>  yaml         2.1.14     2016-11-12 CRAN (R 3.3.2)

Somehow it didn't work earlier this morning when I installed dplyr from GitHub using what I think was the same code, but I'm not sure why (I've tried and can't reproduce the error). Maybe somehow I installed dplyr from GitHub without bindrcpp being installed along with it, although I'm not sure how this could've happened. In any case, I think it's probably safe to say that the GitHub version is working as intended.
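For anyone hitting the same confusion, a quick version check can confirm which build is actually loaded. The sketch below assumes, based only on the two session listings in this thread, that any version newer than the CRAN release 0.5.0 (such as 0.5.0.9000) contains the fix. The helper name `has_case_when_fix` is hypothetical.

```r
# Hypothetical helper: TRUE if the given version is newer than CRAN 0.5.0.
# Assumption: builds newer than 0.5.0 include the case_when() fix.
has_case_when_fix <- function(version) {
  package_version(version) > "0.5.0"
}

has_case_when_fix("0.5.0")       # the CRAN release from the first reprex
has_case_when_fix("0.5.0.9000")  # the GitHub build from the second reprex

# Check the installed copy, if dplyr is available at all
if (requireNamespace("dplyr", quietly = TRUE)) {
  print(has_case_when_fix(packageVersion("dplyr")))
}
```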

I'm sorry if this has needlessly disrupted your day.

Member

krlmlr commented Mar 3, 2017

No worries, thanks for your feedback.

lock bot locked this as resolved and limited conversation to collaborators Jun 8, 2018
