dplyr <-> base R translation vignette #4755

@hadley

One table:

  • arrange(df, x) -> df[order(df$x), , drop = FALSE]
  • distinct(df, x) -> unique(df["x"]); with .keep_all = TRUE, df[!duplicated(df$x), , drop = FALSE]
  • filter(df, x) -> df[df$x & !is.na(df$x), , drop = FALSE]; subset()
  • mutate(df, z = x + y) -> df$z <- df$x + df$y; transform()
  • pull(df, x) -> df$x
  • rename(df, y = x) -> ?
  • select(df, x, y) -> df[c("x", "y")], subset()
  • select(df, starts_with("x")) -> df[grepl("^x", names(df))], etc
  • summarise(df, mean(x)) -> mean(df$x)
  • slice(df, c(1, 2, 5)) -> df[c(1, 2, 5), , drop = FALSE]
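
A rough base R sketch of the one-table translations above, run on a small made-up data frame (the data and the column names x, y, x2 are illustrative assumptions, not part of the proposal). It also shows one possible base R spelling for rename(), which the list leaves open; that is a suggestion rather than a settled answer.

```r
# Hypothetical example data; column names are illustrative only.
df <- data.frame(
  x  = c(3, 1, NA, 2, 1),
  y  = c(10, 20, 30, 40, 50),
  x2 = letters[1:5]
)

# arrange(df, x): order() puts NA last by default
arranged <- df[order(df$x), , drop = FALSE]

# filter(df, x > 1): keep rows where the condition is TRUE, dropping NA
cond <- df$x > 1
filtered <- df[cond & !is.na(cond), , drop = FALSE]

# distinct(df, x, .keep_all = TRUE): first row for each value of x
deduped <- df[!duplicated(df$x), , drop = FALSE]

# mutate(df, z = x + y)
mutated <- transform(df, z = x + y)

# rename(df, w = y): one possible spelling (an assumption, see above)
renamed <- df
names(renamed)[names(renamed) == "y"] <- "w"

# select(df, starts_with("x")): grepl() takes the pattern first
selected <- df[grepl("^x", names(df))]

# slice(df, c(1, 2, 5))
sliced <- df[c(1, 2, 5), , drop = FALSE]
```

The drop = FALSE in the row-subsetting calls matters mostly for single-column data frames, where `[` would otherwise return a bare vector.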

Two table:

  • inner_join(df1, df2) -> merge(df1, df2) (but mention row order)
  • left_join(df1, df2) -> merge(df1, df2, all.x = TRUE)
  • right_join(df1, df2) -> merge(df1, df2, all.y = TRUE)
  • full_join(df1, df2) -> merge(df1, df2, all = TRUE)
  • semi_join(df1, df2) -> df1[df1$x %in% df2$x, , drop = FALSE]
  • anti_join(df1, df2) -> df1[!df1$x %in% df2$x, , drop = FALSE]
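
A small sketch of the two-table translations, again on made-up data (df1, df2 and their columns are hypothetical). It is meant to make the row-order caveat concrete: merge() sorts the result on the key columns by default, whereas the dplyr joins keep the row order of the first table.

```r
# Hypothetical example data; df1 is deliberately not sorted by the key x.
df1 <- data.frame(x = c(3, 1, 2), a = c("a3", "a1", "a2"))
df2 <- data.frame(x = c(2, 3, 4), b = c("b2", "b3", "b4"))

# Mutating joins: merge() joins on the shared column name ("x") by default
inner <- merge(df1, df2)                # keys present in both tables
left  <- merge(df1, df2, all.x = TRUE)  # every row of df1
right <- merge(df1, df2, all.y = TRUE)  # every row of df2
full  <- merge(df1, df2, all = TRUE)    # every row of either table

# Filtering joins: plain subsetting, so df1's row order is kept
semi <- df1[df1$x %in% df2$x, , drop = FALSE]
anti <- df1[!df1$x %in% df2$x, , drop = FALSE]
```

Here semi keeps df1's original order (x = 3, then 2), while inner comes back sorted by x. Passing sort = FALSE to merge() turns the sorting off, but the resulting row order is then unspecified, so it is not a drop-in match for the dplyr behaviour.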

Probably worth a systematic comparison only for ungrouped data frames. But it might be worth showing a couple of special cases (e.g. ave() for grouped mutate) and then illustrating the general split(), lapply(), do.call(rbind) pattern.
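
A minimal sketch of what those two grouped cases could look like, on a made-up data frame (the column names g and x and the summary columns are assumptions for illustration only):

```r
# Hypothetical example data
df <- data.frame(
  g = c("a", "b", "a", "b", "a"),
  x = c(1, 2, 3, 4, 8)
)

# Grouped mutate, roughly df %>% group_by(g) %>% mutate(m = mean(x)):
# ave() returns a vector aligned with the original rows.
# FUN must be passed by name, because ave() takes grouping vectors via `...`.
df$m <- ave(df$x, df$g, FUN = mean)

# General pattern, roughly a grouped summarise:
pieces <- split(df, df$g)                    # one data frame per group
summaries <- lapply(pieces, function(d) {    # compute on each piece
  data.frame(g = d$g[1], mean_x = mean(d$x), n = nrow(d))
})
result <- do.call(rbind, summaries)          # stack the pieces back together
rownames(result) <- NULL                     # drop the group names rbind adds
```

The split(), lapply(), do.call(rbind) version is more verbose than ave(), but it works for any per-group computation that returns a data frame, including ones that change the number of rows.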
