Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FR: UPDATE- and UPSERT-like functionality #2075

Closed
krlmlr opened this issue Aug 18, 2016 · 5 comments
Closed

FR: UPDATE- and UPSERT-like functionality #2075

krlmlr opened this issue Aug 18, 2016 · 5 comments
Labels
feature a feature request or enhancement

Comments

@krlmlr
Copy link
Member

krlmlr commented Aug 18, 2016

Sometimes I need to update a bunch of observations in a target table with information from a second table. Currently, I'm doing something like

library(tibble)
library(dplyr)

original <- tribble(
  ~id, ~attr1, ~attr2,
    1,      1,      3,
    2,      2,      4)

new <- tribble(
  ~id, ~attr1, ~attr2,
  2,      5,      6)

combined <-
  original %>%
  left_join(new, by = "id") %>%
  transmute(id,
            attr1 = coalesce(attr1.y, attr1.x),
            attr2 = coalesce(attr2.y, attr2.x))

It would be great if dplyr offered a verb that does this for me. Unlike #1736 and #1216, that verb would create a new table. The code above implements UPDATE, if full_join() is used instead, it's more like UPSERT.

@krlmlr
Copy link
Member Author

krlmlr commented Nov 7, 2016

@hadley: Is that something we want here or in tidyr?

@hadley
Copy link
Member

hadley commented Nov 7, 2016

This feels more like dplyr to me, although there's clearly some connection to tidyverse/tidyr#183

@krlmlr krlmlr added the feature a feature request or enhancement label Dec 15, 2016
@hadley
Copy link
Member

hadley commented Feb 20, 2017

Do you think a single verb upsert() would be sufficient? I imagine it would have the same syntax as a join function, although you'd have to supply the keys.

Might need some variant as to whether an NA on the RHS should be treated as a literal NA, or just means to leave the LHS value as is.

@krlmlr
Copy link
Member Author

krlmlr commented Feb 21, 2017

I think a single verb that always does a full join should be sufficient. See #1792 for a discussion about keys. We'd have to assume (or check) that the join keys are unique for both datasets (#1619).

Treating NA on the RHS as literal NA feels difficult to implement for SQL sources. All in all, because this functionality can be implemented completely using existing dplyr verbs, it may be more suitable for tidyr. (tidyverse/tidyr#183 contains a lot more discussion, too.)

@hadley
Copy link
Member

hadley commented Feb 21, 2017

Ok, lets keep discussion there

@hadley hadley closed this as completed Feb 21, 2017
@lock lock bot locked as resolved and limited conversation to collaborators Jun 8, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
feature a feature request or enhancement
Projects
None yet
Development

No branches or pull requests

2 participants