ARROW-17439: [R] Change behavior of pull to compute instead of collect #14330
This is due to ARROW-17738.
On the CHECK issue, IIRC we don't Lines 75 to 79 in 418f115
Thanks for the link, @eitsupi; that change fixes the issue I was having. Re: @wjones127's comment:
I removed the export directive but I would eventually like to understand why the other exported functions in
Both the issues I raised are resolved and this can be reviewed.
One suggestion but otherwise LGTM, thanks!
r/R/arrow-tabular.R
Outdated
@@ -259,3 +259,7 @@ na.omit.ArrowTabular <- function(object, ...) {

#' @export
na.exclude.ArrowTabular <- na.omit.ArrowTabular

pull.ArrowTabular <- function(x, var = -1) {
I'd move this to dplyr-collect.R to keep it with the other pull() method definitions
Sounds good. Done in 9c8691b.
As I mentioned in JIRA, do you implement an argument like
Personally I don't think so, and as I previously commented on the JIRA, I don't think the consistency argument is quite so simple because arrow is different from querying remote databases with dbplyr.
Since the dplyr documentation says that pull returns an R vector, I think the documentation should at least explain that
No longer needed thanks to apache#14160 and @eitsupi's work
Thanks for the look, @nealrichardson. This should be good to merge now.
Benchmark runs are scheduled for baseline = 093a4fe and contender = 20626f8. 20626f8 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
['Python', 'R'] benchmarks have high level of regressions.
I could use some help with a couple of things:
devtools::check warns about pull.ArrowTabular being undocumented. I @export-ed it to stay consistent with other ArrowTabular generics defined in arrow-tabular.R and don't understand why checking doesn't warn on all of these. Does this one just not need exporting?
Relevant devtools::check() output
I found an inconsistency with dplyr::pull and Tables: pulling an ungrouped Table produces a ChunkedArray, whereas pulling a grouped Table produces a Table. This makes a subsequent call to as.vector fail with: Error in as.vector(x, mode) : cannot coerce type 'environment' to vector of type 'any'
Example of the difference
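As a hedged illustration of why the as.vector call fails (this is not the PR author's example), the base-R sketch below assumes only that a Table is an R6 object, which R stores as an environment, and that no as.vector method applies to it. The name fake_table is a hypothetical stand-in and does not come from the arrow package.

```r
# Minimal base-R sketch; no arrow or dplyr required.
# `fake_table` is a hypothetical stand-in for an R6 object such as a Table:
# R6 objects are environments, so a bare environment is used here.
fake_table <- new.env()

# as.vector() has no way to coerce an environment, so it signals an error.
# Capture the message instead of stopping the script.
msg <- tryCatch(
  as.vector(fake_table),
  error = function(e) conditionMessage(e)
)

print(msg)
```

If the grouped case returned a ChunkedArray like the ungrouped case does, the result would no longer be a bare Table object and this particular coercion failure would not be the one reached.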