set_key #1792
If two tables have different keys, the natural join uses the intersection. What's a secondary index?
I guess we don't want to use the intersection if it's empty. There's also the case where it's not a strict subset or superset (A + B x A + C). Secondary index: one that's used only for optimizing query execution but doesn't have the uniqueness property. I don't know if this is the "official" terminology, though.
I now think this is out of scope for dplyr, because it's complex enough that it needs its own package, e.g. https://github.com/krlmlr/dm
This would check that the combination of variables is a valid key (i.e. no duplicates and no missing values), and would store the keys as an attribute. Then joins would use the key attribute (if present) for natural joins, rather than the complete set of variables.
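A rough, language-neutral sketch of the behaviour described here, written in plain Python rather than R so it stays independent of any dplyr internals. Everything in it is hypothetical: `KeyedTable`, `set_key`, `join_columns`, and `natural_join` are illustrative names, not an existing or proposed dplyr API. It also assumes that the key is only used when both tables carry one, and that an empty key intersection falls back to all common columns, as suggested in the comments.

```python
class KeyedTable:
    """Hypothetical container: a list of row dicts plus an optional key."""

    def __init__(self, rows, key=()):
        self.rows = rows
        self.key = tuple(key)  # stored like an attribute on the table

    @property
    def columns(self):
        # Column names are taken from the first row; empty tables have none.
        return list(self.rows[0]) if self.rows else []


def set_key(rows, *cols):
    """Check that `cols` form a valid key, then attach them to the table.

    A valid key has no missing values (None or an absent column)
    and no duplicated combinations.
    """
    seen = set()
    for row in rows:
        values = tuple(row.get(c) for c in cols)
        if any(v is None for v in values):
            raise ValueError(f"missing value in key columns {cols}: {row}")
        if values in seen:
            raise ValueError(f"duplicate key {values} for columns {cols}")
        seen.add(values)
    return KeyedTable(rows, cols)


def join_columns(x, y):
    """Pick the columns a natural join would use.

    Assumption: if both tables have a key and the keys overlap, join on
    the intersection; otherwise fall back to every common column.
    """
    if x.key and y.key:
        shared = [c for c in x.key if c in y.key]
        if shared:
            return shared
    return [c for c in x.columns if c in y.columns]


def natural_join(x, y):
    """Inner join on the columns chosen by join_columns()."""
    by = join_columns(x, y)
    out = []
    for rx in x.rows:
        for ry in y.rows:
            if all(rx[c] == ry[c] for c in by):
                # For overlapping non-join columns, x's value wins.
                out.append({**ry, **rx})
    return out


left_rows = [
    {"id": 1, "name": "a", "updated": "mon"},
    {"id": 2, "name": "b", "updated": "tue"},
]
right_rows = [
    {"id": 1, "score": 10, "updated": "wed"},
    {"id": 2, "score": 20, "updated": "tue"},
]

keyed = natural_join(set_key(left_rows, "id"), set_key(right_rows, "id"))
unkeyed = natural_join(KeyedTable(left_rows), KeyedTable(right_rows))
```

With the key set, the join matches on `id` alone and keeps both rows; without it, the join also compares `updated` and keeps only the row where that column happens to agree.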
cc @jennybc @krlmlr