Add index option to compute() #1499

Closed
iangow opened this issue Nov 2, 2015 · 4 comments

@iangow commented Nov 2, 2015

It would be great if collect() could be accompanied by the creation of indexes for performance.

@hadley (Member) commented Nov 2, 2015

Not sure how that would help - collect() brings all the data into R, where there aren't indices.


@iangow (Author) commented Nov 2, 2015

Sorry. I think I should've said compute().

But collect() is what I've ended up doing.

I am merging A with B to produce AB, which is bigger than either, then merging with C to produce something smaller. I figure a server-side merge of AB with C assisted by an index would be faster than bringing the data to R. But if I'm using RStudio Server, then it's no big deal for the data I'm working with. (If I were using RStudio with a remote database, then there'd be more of a difference.)

(Of course, if one could slip arbitrary SQL in through the dplyr-constructed connections, then one could make indexes "by hand".)
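A rough sketch of the "by hand" route, assuming current DBI and dbplyr releases and an in-memory SQLite database standing in for the real server. The table names `ab` and `c`, the column `id`, and the index name `ab_id_idx` are made up for illustration; they are not from this thread.

```r
library(DBI)
library(dplyr)

# Assumption: in-memory SQLite stands in for the remote database.
con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Hypothetical tables standing in for AB and C.
dbWriteTable(con, "ab", data.frame(id = c(1, 2, 2, 3), x = c("a", "b", "c", "d")))
dbWriteTable(con, "c", data.frame(id = c(2, 3), y = c(10, 20)))

# Create the index with plain SQL on the same connection dplyr uses.
dbExecute(con, "CREATE INDEX ab_id_idx ON ab (id)")

# The join is still written in dplyr and executed inside the database;
# only the final (smaller) result is pulled into R.
result <- inner_join(tbl(con, "ab"), tbl(con, "c"), by = "id") %>%
  collect()

dbDisconnect(con)
```

Whether the planner actually uses the index depends on the backend and the size of the tables, so it is worth checking the query plan before relying on it.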


@hadley (Member) commented Nov 2, 2015

Ah yes, there should be an option to do that - it's easy to add.


@iangow changed the title from "Add index option to collect()" to "Add index option to compute()" on Nov 2, 2015

@krlmlr (Member) commented Nov 16, 2015

Checking that a column is unique is another possible application. Perhaps compute() can be taught to create both unique and non-unique indexes. Keep in mind that an index can span multiple columns.
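A sketch of how this can look with the `indexes` and `unique_indexes` arguments that `compute()` accepts in current dbplyr. The in-memory SQLite connection, the tables `a` and `b`, and their columns are assumptions made for illustration only.

```r
library(dplyr)

# Assumption: in-memory SQLite stands in for the real database.
con <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")

# Hypothetical input tables.
DBI::dbWriteTable(con, "a", data.frame(id = 1:3, grp = c("x", "x", "y")))
DBI::dbWriteTable(con, "b", data.frame(id = 1:3, val = c(10, 20, 30)))

# Materialise the join in the database and index the resulting table.
ab <- inner_join(tbl(con, "a"), tbl(con, "b"), by = "id") %>%
  compute(
    name = "ab",
    temporary = FALSE,
    unique_indexes = list("id"),      # creation fails if id is not unique
    indexes = list(c("grp", "val"))   # one index spanning two columns
  )
```

Each list element becomes one index, so a character vector of several column names yields a single multi-column index rather than one index per column.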


@hadley added this to the 0.5 milestone on Mar 1, 2016
@hadley closed this in #1550 on Mar 1, 2016
@lock (bot) locked this as resolved and limited conversation to collaborators on Jun 9, 2018