-
Notifications
You must be signed in to change notification settings - Fork 3.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ARROW-17665: [R] Document dplyr and compute functionality #14387
Conversation
eb0c771
to
3e82f21
Compare
r/R/arrow-package.R
Outdated
group_by = NULL, | ||
groups = NULL, | ||
group_vars = NULL, | ||
group_by_drop_default = NULL, | ||
ungroup = NULL, | ||
mutate = NULL, | ||
mutate = "window functions not currently supported", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For users, it may be easier to explain that "mutate cannot be used after being grouped".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, I'll find other words. group_by() |> mutate()
does work in some cases, just not if there are aggregations involved (like x - mean(x)
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a few more words, let me know what you think
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is easier to understand. Thanks.
cbcae88
to
85e3bb4
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One tiny nit, otherwise LGTM
group_by = NULL, | ||
groups = NULL, | ||
group_vars = NULL, | ||
group_by_drop_default = NULL, | ||
ungroup = NULL, | ||
mutate = NULL, | ||
mutate = c( | ||
"window functions (e.g. things that require aggregation within groups)", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"window functions (e.g. things that require aggregation within groups)", | |
"window functions (i.e. things that require aggregation within groups)", |
nit, but only if window functions are literally defined by this, which I wasn't sure about
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Window functions are more than this, though this is in practice how you would trigger them inside mutate()
. I'll leave it as is (if for no other reason than to save the CI cycles). Who knows, maybe I can make this work for 11.0 and we can remove the note altogether ;)
Benchmark runs are scheduled for baseline = 8972ebd and contender = 81e1fbc. 81e1fbc is a master commit associated with this PR. Results will be available as each benchmark for each run completes. |
['Python', 'R'] benchmarks have high level of regressions. |
Includes ARROW-17666 and ARROW-17667
You can just review r/R/dplyr-funcs-doc.R; the rest of the diff is muddled by indentation changes when I added the
notes
to theregister_binding()
calls.