[R] transmute() fails after pulling data into R #28943

Closed
asfimport opened this issue Jul 6, 2021 · 5 comments

@asfimport

NOTE: This issue was originally named "[R] Support namespacing of function maps?". @ianmcook renamed it to reflect the resolution described in the comments below.

With our NSE function map setup, we translate various functions into Arrow equivalents:

library(arrow)
library(dplyr)

arrow_table <- Table$create(tibble::tibble(
  v = c("A", "B", "C"),
  w = c("a", "b", "c"),
  x = c("d", NA_character_, "f"),
  y = c(NA_character_, "h", "i"),
  z = c(1.1, 2.2, NA)
))

arrow_table %>%
  transmute(str_c(x, y, sep = " ")) %>%
  collect()
#> # A tibble: 3 x 1
#>   `str_c(x, y, sep = " ")`
#>   <chr>                   
#> 1 <NA>                    
#> 2 <NA>                    
#> 3 f i

This is great. However, if the code uses the stringr:: namespace prefix, it errors:

arrow_table %>%
  transmute(stringr::str_c(x, y, sep = " ")) %>%
  collect()
#> Warning: Expression stringr::str_c(x, y, sep = " ") not supported in Arrow;
#> pulling data into R
#> Error: Problem with `mutate()` input `..1`.
#>  `..1 = ..1`.
#> x object 'x' not found

Should we support this (basically also translate stringr::str_c() to what we have for str_c())? Or warn/error more clearly what happened if we don't support any namespace prefixed functions? Something else?

Reporter: Jonathan Keane / @jonkeane
Assignee: Ian Cook / @ianmcook

Note: This issue was originally created as ARROW-13262. Please see the migration documentation for further details.

@asfimport

Ian Cook / @ianmcook:
This same problem occurs in dbplyr and in other non-data.frame dplyr interface packages. For example, functions from the stringr and lubridate packages don't work with dbplyr if you include stringr:: or lubridate:: before their names.

I think it's fine to close this as "won't fix" at least for now. It's a general limitation affecting dplyr when used on objects that are not R data frames.

Alternatively, we could try to fix this the way I do in tidyquery. When the data object is something other than an R data frame, tidyquery calls its unscope_expression function to remove the package:: prefix by manipulating the AST: https://github.com/ianmcook/tidyquery/blob/master/R/unscope.R

My 2 cents: I don't think this is worth the effort right now.
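
For readers unfamiliar with the unscoping idea, here is a minimal base R sketch of it. This is not the tidyquery implementation; the helper name `unscope()` is hypothetical. It relies on the fact that `pkg::fun` is itself a call to the `::` function, so the prefix can be dropped by replacing that call with its second argument. It assumes the expression contains no empty arguments (such as `x[1, ]`), which this simple recursion does not handle.

```r
# Hypothetical helper: recursively strip `pkg::` prefixes from an expression.
unscope <- function(expr) {
  if (is.call(expr)) {
    # `pkg::fun` parses as the call `::`(pkg, fun); keep only `fun`.
    if (identical(expr[[1]], quote(`::`))) {
      return(expr[[3]])
    }
    # Otherwise rebuild the call from its unscoped components
    # (argument names such as `sep` are preserved).
    return(as.call(lapply(as.list(expr), unscope)))
  }
  # Symbols and constants are returned unchanged.
  expr
}

scoped <- quote(stringr::str_c(x, y, sep = " "))
unscoped <- unscope(scoped)
print(unscoped)
```

After unscoping, the expression refers to a bare `str_c()`, which is the form the function map already knows how to translate.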

@asfimport

Jonathan Keane / @jonkeane:
Yeah, it's certainly a bit of an edge case, and might not be worth unscoping or trying to error more gracefully. What I find surprising, though, is that I expected (in this case, at least) the expression to evaluate correctly once the data was pulled into R, since at that point it should be standard dplyr-on-data.frame.

This works:

> arrow_table %>%
+   collect() %>% 
+   transmute(stringr::str_c(x, y, sep = " "))
# A tibble: 3 x 1
  `stringr::str_c(x, y, sep = " ")`
  <chr>                            
1 NA                               
2 NA                               
3 f i   

But if transmute() comes before collect(), it does not.

@asfimport

Ian Cook / @ianmcook:
Oh, right, the expectation is that it should warn (not supported in Arrow; pulling data into R) but then succeed, not error. Strange.

@asfimport

Ian Cook / @ianmcook:
I found out why the failure was happening after the data is pulled into R. It's because the transmute function was not defusing the dots arguments before passing them to mutate. I will open a PR to resolve this.
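
To illustrate what defusing the dots means here, a minimal sketch follows. It is not the code from the arrow package or from the pull request; the wrapper names `eager_mutate()` and `defused_mutate()` are hypothetical, and the example assumes the dplyr and rlang packages are installed. The point is only that arguments forwarded without defusing get evaluated too early, in an environment where the column names are not defined.

```r
library(dplyr)
library(rlang)

df <- data.frame(x = c("d", NA, "f"), y = c(NA, "h", "i"))

# Hypothetical wrapper that does NOT defuse: list(...) evaluates each
# argument immediately in the caller's environment, where the column
# names x and y do not exist as variables.
eager_mutate <- function(.data, ...) {
  dots <- list(...)
  mutate(.data, !!!dots)
}

# Hypothetical wrapper that defuses: enquos() captures each argument as a
# quosure (the unevaluated expression plus its environment), so mutate()
# can evaluate it later against the columns of .data.
defused_mutate <- function(.data, ...) {
  dots <- enquos(...)
  mutate(.data, !!!dots)
}

# Errors because x is looked up outside the data frame
# (assuming no variable named x exists in the calling environment):
# eager_mutate(df, out = paste(x, y, sep = " "))

# Works, because evaluation is deferred until mutate() sees the data:
defused_mutate(df, out = paste(x, y, sep = " "))
```

The eager version fails with the same kind of "object 'x' not found" message reported above.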

@asfimport

Ian Cook / @ianmcook:
Issue resolved by pull request 10672
#10672

@asfimport asfimport added this to the 5.0.0 milestone Jan 11, 2023