Add a symmetric difference set function #4811

Closed
felipegerard opened this issue Jan 29, 2020 · 2 comments · Fixed by #6385
Labels
feature (a feature request or enhancement), tables 🧮 (joins and set operations)

Comments

@felipegerard

I often need to find the elements (rows) that two vectors (data frames) don't have in common. Finding these elements with the existing set operations is straightforward, but I wonder whether it would be OK to add a symmetric difference function (i.e. union \ intersection) to save a few keystrokes. Below is my proposed solution. I can open a PR if we agree it makes sense to add this.

symdiff <- function(x, y) {
  setdiff(union(x, y), intersect(x, y))
}
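
For illustration, here is how the proposed function would behave on two small numeric vectors. This is a minimal sketch using only base R set operations; the input vectors `x` and `y` are made up for the example.

```r
# Proposed helper from this issue: elements in exactly one of x or y
symdiff <- function(x, y) {
  setdiff(union(x, y), intersect(x, y))
}

x <- c(1, 2, 3, 4)
y <- c(3, 4, 5)

# 3 and 4 are shared, so they are dropped from the result
result <- symdiff(x, y)
print(result)
#> [1] 1 2 5
```

Swapping the arguments gives the same set of elements, though the order of the result may differ.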

hadley added the feature (a feature request or enhancement) and verbs 🏃‍♀️ labels on Mar 1, 2020
@seasmith

Would it also be worth having a symmetrical anti-join?

sym_anti_join <- function (x, y, by = NULL, copy = FALSE, ...) {
  x_only <- anti_join(x, y, by = by, copy = copy, ...)
  y_only <- anti_join(y, x, by = by, copy = copy, ...)
  bind_rows(x_only, y_only)
}
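
As a rough usage sketch, assuming dplyr is installed and that both tables share the same columns, the helper would return the rows whose key appears on only one side. The tables `x` and `y` and their columns `id` and `v` are hypothetical.

```r
library(dplyr)

# Helper as suggested in this comment
sym_anti_join <- function(x, y, by = NULL, copy = FALSE, ...) {
  x_only <- anti_join(x, y, by = by, copy = copy, ...)
  y_only <- anti_join(y, x, by = by, copy = copy, ...)
  bind_rows(x_only, y_only)
}

# Hypothetical example data with identical column layouts
x <- tibble(id = c(1, 2, 3), v = c("a", "b", "c"))
y <- tibble(id = c(2, 3, 4), v = c("b", "c", "d"))

# ids 2 and 3 match on both sides; ids 1 and 4 remain
out <- sym_anti_join(x, y, by = "id")
print(out)
```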

romainfrancois added the tables 🧮 (joins and set operations) label and removed the verbs 🏃‍♀️ label on Oct 1, 2021
@hadley
Member

hadley commented Jul 22, 2022

Implementation might look something like this:

symdiff <- function(x, y, ...) {
  check_dots_empty()
  UseMethod("symdiff")
}

symdiff.data.frame <- function(x, y) {
  check_compatible(x, y)
  
  cast <- vec_cast_common(x = x, y = y)
  only_x <- vec_slice(cast$x, !vec_in(cast$x, cast$y))
  only_y <- vec_slice(cast$y, !vec_in(cast$y, cast$x))
  
  out <- vec_unique(vec_rbind(only_x, only_y))
  reconstruct_set(out, x)
}
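
The sketch above calls `check_compatible()` and `reconstruct_set()`, which appear to be dplyr-internal helpers and are not available outside the package. To try the core idea in isolation, a standalone version might look like the following. It assumes only that the vctrs package is installed; the name `symdiff_rows` is hypothetical, and the compatibility check and reconstruction step are left out.

```r
library(vctrs)

# Hypothetical standalone variant: rows present in exactly one data frame
symdiff_rows <- function(x, y) {
  # Cast both inputs to a common type so rows can be compared
  cast <- vec_cast_common(x = x, y = y)
  only_x <- vec_slice(cast$x, !vec_in(cast$x, cast$y))
  only_y <- vec_slice(cast$y, !vec_in(cast$y, cast$x))
  # Combine both sides and drop duplicate rows
  vec_unique(vec_rbind(only_x, only_y))
}

x <- data.frame(a = c(1, 2, 2, 3))
y <- data.frame(a = c(3, 4))

# Row 3 is shared; the duplicated row 2 from x appears once in the result
out <- symdiff_rows(x, y)
print(out)
```

Because of the final `vec_unique()` call, the result follows set semantics: duplicates within either input are collapsed.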

@seasmith I don't think a symmetric anti-join makes sense, because there's generally no reason to expect the data frames on either side of a join to look at all alike.
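
One way to read this objection, sketched with made-up tables and assuming dplyr is installed: when the two sides of a join share only the key column, stacking the two anti-join results produces a table padded with `NA`. The tables `people` and `scores` are hypothetical.

```r
library(dplyr)

# Hypothetical tables that share only the key column `id`
people <- tibble(id = c(1L, 2L, 3L), name = c("a", "b", "c"))
scores <- tibble(id = c(2L, 3L, 4L), score = c(20, 30, 40))

people_only <- anti_join(people, scores, by = "id")  # id 1, with a name
scores_only <- anti_join(scores, people, by = "id")  # id 4, with a score

# Stacking the two results fills the non-shared columns with NA
stacked <- bind_rows(people_only, scores_only)
print(stacked)
```

Each row of the stacked result is missing the columns that belong to the other table, so it is not a meaningful row of either input.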

hadley added a commit that referenced this issue Aug 3, 2022
hadley added a commit that referenced this issue Aug 9, 2022
4 participants