Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ARROW-5505: [R] namespace cleanup #4491

Closed

Conversation

nealrichardson
Copy link
Member

This patch:

  • renames array() to arrow_array() and table() to arrow_table() to avoid collision with base R functions of the same name. AFAICT there are no callers out there of those functions (i.e. I checked sparklyr and I'm not aware of anything else depending on arrow in any way).
  • Prunes a trailing - from a test file name
  • Moves crayon and vctrs from Imports to Suggests, in response to a R CMD CHECK note. crayon was only used in one place in the package and makes sense to be optional; vctrs is only used in tests.

@romainfrancois
Copy link
Contributor

🤔 I understand the issue with masking the base package, but I'm not sure I like arrow_array() and arrow_table() for these factories, esp that other factories, e.g. record_batch() or chunked_array() presumably stay the same ?

vctrs used to be more important earlier in the development, specifically when array() took ... and vctrs::vec_c() would collapse the ... into one vector. So yeah it's totally fine to move it to Suggests, perhaps we can even get rid of it for the tests.

We can totally just not use crayon at all.

@nealrichardson
Copy link
Member Author

Fair point, I worried about that too. My preference would be to namespace everything (like sparklyr), e.g. arrow_record_batch, rather than masking the base functions, but that was a bigger, more disruptive change so I wanted to try this. Happy to try that out if you're open to it--I think it also raises some design questions about how we want the R developer interface to Arrow to look and feel, which is a good thing to think about anyway.

@romainfrancois
Copy link
Contributor

Yeah, maybe it's worth have a broader discussion about the interface as a whole. These functions array(), chunked_array(), table(), and record_batch() are factories that make objects, maybe they should be e.g. :

new_Array(), new_ChunkedArray(), new_Table(), new_RecordBatch()

Maybe the R6 classes should be made more user visible and then they would be e.g.

Array$new(), ChunkedArray$new(), Table$new(), RecordBatch$new() ...

In any case, we should be consistent in what we do, and not just prefixing with arrow_ for the cases where we do mask another function.

@nealrichardson
Copy link
Member Author

Closing for now; we can continue the discussion on Jira/elsewhere and revisit.

romainfrancois added a commit that referenced this pull request Jun 14, 2019
This is instead of #4491, without the function naming change that we wanted to think about more intentionally.

It also removes a few lingering references to `tibble` in the package, which were still passing in tests because tibble is in Suggests and the test hosts install all of the Suggests packages.

@romainfrancois

Author: Romain Francois <romain@rstudio.com>
Author: Neal Richardson <neal.p.richardson@gmail.com>

Closes #4566 from nealrichardson/clean-imports and squashes the following commits:

e0cf005 <Romain Francois> not importing glue::glue
002c0f0 <Romain Francois> no need for glue either at this point
25d4873 <Neal Richardson> Prune unused Imports; fix a couple of lingering tibble references
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants