Skip to content

Conversation

@markromanmiller
Copy link
Contributor

@markromanmiller markromanmiller commented Mar 16, 2022

Created a simple version of rename_with that applies the function to the current names of the .data argument and passes the result to rename.

This was a quick writeup on my end and passed a few smoke tests. I believe all the functions I've used in the new code are from base R or current imports like rlang, and the magic in editing the arrow data query is handled by rename.

Let me know what next steps can be, like writing test cases, etc.

@github-actions
Copy link

@github-actions
Copy link

⚠️ Ticket has not been started in JIRA, please click 'Start Progress'.

@thisisnic
Copy link
Member

Hi @MrMallIronmaker - thanks for submitting this PR! Unit tests are definitely the way to go in terms of next steps. Let us know if you have any questions about any of our set-up!

@markromanmiller
Copy link
Contributor Author

I've written a few test cases for rename_with and hit the "todo" for using tidyselect with select, as well. I think this is good enough to be reviewed! Thanks.

Copy link
Member

@thisisnic thisisnic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for submitting this PR - it'll be great to have this functionality added!

A few minor comments and then also, please could you add an extra test where .cols doesn't end up selecting anything?

new_names <- do.call(.fn, list(old_names))
rename_args <- c(list(.data), as.list(old_names))
names(rename_args) <- c(".data", new_names)
do.call(rename, rename_args)
Copy link
Member

@thisisnic thisisnic Mar 22, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
do.call(rename, rename_args)
rename(.data, !!set_names(old_names, new_names))

Minor change here to make this more rlang style in keeping with the other code in the package + avoiding use of do.call().

rename.Dataset <- rename.ArrowTabular <- rename.RecordBatchReader <- rename.arrow_dplyr_query

rename_with.arrow_dplyr_query <- function(.data, .fn, .cols = everything(), ...) {
.fn <- rlang::as_function(.fn)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In arrow/r/R/arrow-package-R we have a load of rlang function listed with @importFrom roxygen tags. Could you add as_function there and then call it directly here?

rename_with.arrow_dplyr_query <- function(.data, .fn, .cols = everything(), ...) {
.fn <- rlang::as_function(.fn)
old_names <- names(select(.data, {{ .cols }}))
new_names <- do.call(.fn, list(old_names))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's remove the do.call() here. Perhaps .fn(old_names) could work?

@markromanmiller
Copy link
Contributor Author

Changes made as requested - thanks!

Copy link
Member

@thisisnic thisisnic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great - thanks for making those changes! One last tiny thing - the linter is complaining about spacing; would you mind running the styler on the files you've changes so that it automatically fixes those? More information on how to do that can be found here in the docs: https://arrow.apache.org/docs/r/articles/developers/workflow.html

@markromanmiller
Copy link
Contributor Author

I changed the spacing manually based on the lintr job output. remotes::install_github wasn't able to download the tarball of the lintr branch mentioned in the workflow setup, but that issue it seems to be a Github thing? I wasn't able to download it either through the web interface. In any case, I was able to run the standard lintr, and there were no linting messages in the files I've changed.

Copy link
Member

@jonkeane jonkeane left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking great. Thank you for your contribution!

@jonkeane jonkeane closed this in b781710 Mar 28, 2022
@ursabot
Copy link

ursabot commented Mar 28, 2022

Benchmark runs are scheduled for baseline = ad7380e and contender = b781710. b781710 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Finished ⬇️0.0% ⬆️0.0%] ec2-t3-xlarge-us-east-2
[Finished ⬇️0.21% ⬆️0.08%] test-mac-arm
[Failed ⬇️0.71% ⬆️0.0%] ursa-i9-9960x
[Finished ⬇️0.26% ⬆️0.04%] ursa-thinkcentre-m75q
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants