update documentation
mgirlich committed May 7, 2021
1 parent 494a0ca commit 4cb3d81
Showing 3 changed files with 26 additions and 17 deletions.
12 changes: 7 additions & 5 deletions R/step-join.R
@@ -103,11 +103,13 @@ dt_call.dtplyr_step_join <- function(x, needs_copy = x$needs_copy) {
 #' Join data tables
 #'
 #' These are methods for the dplyr generics [left_join()], [right_join()],
-#' [inner_join()], [full_join()], [anti_join()], and [semi_join()]. The
-#' mutating joins (left, right, inner, and full) are translated to
-#' [data.table::merge.data.table()], except for the special cases where it's
-#' possible to translate to `[.data.table`. Semi- and anti-joins have no
-#' direct data.table equivalent.
+#' [inner_join()], [full_join()], [anti_join()], and [semi_join()]. Left, right,
+#' inner, and anti join are translated to the `[.data.table` equivalent,
+#' full joins to [data.table::merge.data.table()].
+#' Left, right, and full joins are in some cases followed by calls to
+#' [data.table::setcolorder()] and [data.table::setnames()] to ensure correct
+#' column order and names.
+#' Semi-joins don't have a direct data.table equivalent.
 #'
 #' @param x,y A pair of [lazy_dt()]s.
 #' @inheritParams dplyr::left_join
12 changes: 7 additions & 5 deletions man/left_join.dtplyr_step.Rd

19 changes: 12 additions & 7 deletions vignettes/translation.Rmd
@@ -125,26 +125,31 @@ dt %>% distinct(c = a + b, .keep_all = TRUE) %>% show_query()

 ### Joins

-Most joins use `merge()`:
+Most joins use the `[.data.table` equivalent:

 ```{r}
 dt2 <- lazy_dt(data.frame(a = 1))
-dt %>% right_join(dt2, by = "a") %>% show_query()
 dt %>% inner_join(dt2, by = "a") %>% show_query()
-dt %>% full_join(dt2, by = "a") %>% show_query()
+dt %>% right_join(dt2, by = "a") %>% show_query()
+dt %>% left_join(dt2, by = "a") %>% show_query()
+dt %>% anti_join(dt2, by = "a") %>% show_query()
 ```

-But `left_join()` will use the `i` position where possible:
+But `full_join()` uses `merge()`

 ```{r}
-dt %>% left_join(dt2, by = "a") %>% show_query()
+dt %>% full_join(dt2, by = "a") %>% show_query()
 ```

-Anti-joins are easy to translate because data.table has a specific form for them:
+In some case extra calls to `data.table::setcolorder()` and `data.table::setnames()`
+are required to ensure correct column order and names in:

 ```{r}
-dt %>% anti_join(dt2, by = "a") %>% show_query()
+dt3 <- lazy_dt(data.frame(b = 1, a = 1))
+dt %>% left_join(dt3, by = "a") %>% show_query()
+dt %>% full_join(dt3, by = "b") %>% show_query()
 ```

 Semi-joins are little more complex: