It is expected that dplyr::compute() will perform the calculation on the arrow dplyr query and convert it to a Table, but it does not seem to work correctly for grouped arrow dplyr queries and does not result in a Table.
SHIMA Tatsuya / @eitsupi:
Ah, is this the intended behavior?
I don't understand why this behavior is intended. I think compute() should return a Table here, just as dbplyr and dtplyr do.
Neal Richardson / @nealrichardson:
They are evaluated and converted to Tables, but then if there are groups, group_by is called on the Table, which results in an arrow_dplyr_query object containing the Table. So, yes, this was intentional. Do you have a use case where this is a problem?
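The wrapping described above can be inspected directly. The following is a minimal sketch, assuming the arrow and dplyr packages are installed. The variable name `q` is illustrative, and the result depends on the installed arrow version, so no output is shown.

```r
library(arrow)
library(dplyr)

# Grouped summarise followed by compute(), as in the report.
q <- mtcars |>
  arrow_table() |>
  group_by(vs, am) |>
  summarise(wt = mean(wt)) |>
  compute()

# With the behavior discussed in this thread, the result is a query
# wrapper rather than a bare Table (version-dependent).
is_query <- inherits(q, "arrow_dplyr_query")
print(class(q))

# The printed query says "See $.data for the source Arrow object":
# the evaluated Table sits inside the wrapper.
if (is_query) {
  print(class(q$.data))
}

# The remaining group (vs) is carried by the returned object.
print(dplyr::group_vars(q))
```

If `is_query` is `TRUE`, the computation has already run and only the remaining grouping is expressed through the wrapper.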
SHIMA Tatsuya / @eitsupi:
I think it is confusing to users when compute() does not return a Table as expected because a group remains after summarise() etc. has been executed.
mtcars |>
  arrow::arrow_table() |>
  dplyr::group_by(vs, am) |>
  dplyr::summarise(wt = mean(wt)) |>
  dplyr::compute()
#> Table (query)
#> vs: double
#> am: double
#> wt: double
#>
#> * Grouped by vs
#> See $.data for the source Arrow object
In contrast, as_arrow_table() works fine. The result seems to be reverted to an arrow dplyr query in the following lines:
arrow/r/R/dplyr-collect.R
Lines 73 to 75 in 7cfdfbb
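Since the report notes that as_arrow_table() works fine, one way to obtain a Table from a grouped query is sketched below. This assumes the arrow and dplyr packages are installed. The variable name `tbl` is illustrative and not part of the original report.

```r
library(arrow)
library(dplyr)

# Same grouped pipeline as in the report, but materialised with
# as_arrow_table() instead of compute().
tbl <- mtcars |>
  arrow_table() |>
  group_by(vs, am) |>
  summarise(wt = mean(wt)) |>
  as_arrow_table()

# The result is an Arrow Table, not an arrow_dplyr_query.
print(class(tbl))
print(inherits(tbl, "Table"))
```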
Reporter: SHIMA Tatsuya / @eitsupi
Assignee: SHIMA Tatsuya / @eitsupi
Related issues:
collect() (is related to)
PRs and other links:
Note: This issue was originally created as ARROW-17738. Please see the migration documentation for further details.