Database semi_join() doesn't match R's NA semantics #180
Comments
Or use one of the alternatives discussed in https://modern-sql.com/feature/is-distinct-from
i.e. we need to use the
Reprex:
library(dbplyr)
library(dplyr, warn.conflicts = FALSE)
mf1 <- memdb_frame(x = c(1, NA, 3))
mf2 <- memdb_frame(x = c(1, NA))
anti_join(mf1, mf2)
#> Joining, by = "x"
#> # Source: lazy query [?? x 1]
#> # Database: sqlite 3.25.3 [:memory:]
#> x
#> <dbl>
#> 1 NA
#> 2 3
anti_join(collect(mf1), collect(mf2))
#> Joining, by = "x"
#> # A tibble: 1 x 1
#> x
#> <dbl>
#> 1 3
Created on 2019-02-06 by the reprex package (v0.2.1.9000)
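For what it's worth, the same discrepancy can be reproduced at the SQL level without dbplyr. The sketch below is only an approximation: it uses Python's built-in sqlite3 module (my choice for a self-contained check, not part of the reprex) and hand-written SQL that resembles, but is not copied from, what dbplyr generates. In SQLite, IS compares like = except that two NULLs count as equal, so swapping the operator should give R's result.

```python
import sqlite3

def anti_join(op):
    """Run a NOT EXISTS anti-join, splicing `op` in as the comparison operator."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE mf1 (x REAL)")
    con.execute("CREATE TABLE mf2 (x REAL)")
    # Same data as the reprex: None plays the role of NA / NULL.
    con.executemany("INSERT INTO mf1 VALUES (?)", [(1.0,), (None,), (3.0,)])
    con.executemany("INSERT INTO mf2 VALUES (?)", [(1.0,), (None,)])
    sql = f"""
        SELECT x FROM mf1
        WHERE NOT EXISTS (SELECT 1 FROM mf2 WHERE mf1.x {op} mf2.x)
        ORDER BY mf1.rowid
    """
    rows = [row[0] for row in con.execute(sql)]
    con.close()
    return rows

print(anti_join("="))   # NULL = NULL is not true, so the NULL row is kept
print(anti_join("IS"))  # NULL IS NULL is true, so the NULL row is dropped
```

The first call keeps the NULL row (matching the database output above), and the second drops it (matching the local tibble output). IS with arbitrary operands is SQLite-specific, so other backends would need a different spelling.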
Could provide
I was attempting a dplyr semi-join against SQL Server tables. In regular dplyr with local data frames, semi-joins can successfully join on NULL values between the left and right table. However, this is not the default behavior in SQL Server. For example, dbplyr implements semi-joins with the WHERE EXISTS method, where the WHERE clause specifies the join conditions. By default, dbplyr generates the following SQL code:

To make the semi-join on SQL Server allow joins on NULLs, we'd modify the final WHERE clause to:

So far I don't see a way that we'd be able to create the second version from dplyr. Perhaps we need a new parameter in the semi_join()
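As a rough illustration of the kind of change described above (this is not the SQL from the original post, which is not reproduced here), one commonly used portable form of a NULL-safe join condition is `a = b OR (a IS NULL AND b IS NULL)`. The sketch runs both versions against SQLite through Python's sqlite3 module as a stand-in for SQL Server. The table names `lhs` and `rhs` and the helper `semi_join` are hypothetical, chosen just for this example.

```python
import sqlite3

# Plain equality: never true when either side is NULL.
PLAIN = "lhs.x = rhs.x"
# NULL-safe variant: additionally treats NULL on both sides as a match.
NULL_SAFE = "(lhs.x = rhs.x OR (lhs.x IS NULL AND rhs.x IS NULL))"

def semi_join(predicate):
    """Run a WHERE EXISTS semi-join using `predicate` as the join condition."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE lhs (x REAL)")
    con.execute("CREATE TABLE rhs (x REAL)")
    con.executemany("INSERT INTO lhs VALUES (?)", [(1.0,), (None,), (3.0,)])
    con.executemany("INSERT INTO rhs VALUES (?)", [(1.0,), (None,)])
    sql = f"""
        SELECT lhs.x FROM lhs
        WHERE EXISTS (SELECT 1 FROM rhs WHERE {predicate})
        ORDER BY lhs.rowid
    """
    rows = [row[0] for row in con.execute(sql)]
    con.close()
    return rows

print(semi_join(PLAIN))      # only the row with x = 1 matches
print(semi_join(NULL_SAFE))  # the NULL row matches as well
```

With plain equality only the non-NULL match survives. With the NULL-safe condition the NULL row is retained too, which is what a local dplyr semi-join returns.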