Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

General lead/lag function (allow negative numbers) #5260

Closed
luispfonseca opened this issue May 20, 2020 · 4 comments
Closed

General lead/lag function (allow negative numbers) #5260

luispfonseca opened this issue May 20, 2020 · 4 comments

Comments

@luispfonseca
Copy link

I am suggesting creating a general lead/lag function that can take in both positive and negative integers and generate leads and lags respectively. I understand the idea of not allowing negative integers in the current format, to catch potential user errors, but this situation also duplicates code for what is essentially a single function, just separated in two. I propose an example in the following reprex.

suppressMessages(library(dplyr))
library(purrr)

# I want to create versions of the same dataset where a variable is in leads or lags

# current method
mtcars_leads <- map(0L:3L, ~mtcars %>% mutate(shift_wt = lead(wt, n = .x)))
mtcars_lags <- map(1L:3L,  ~mtcars %>% mutate(shift_wt = lag(wt, n = .x)))
final_list <- map(c(rev(mtcars_lags), mtcars_leads), ~.)

# desired general function
leadlag <- function(x, n, ...) {
  
  # insert appropriate argument and type checking
    
  # core function
    if (n >= 0) {
      return(lead(x, n, ...))
    }
    
    if (n < 0) {
      return(lag(x, abs(n), ...))
    }
}

# single line for same function, avoids code duplication
final_list_desired <- map(-3L:3L, ~mtcars %>% mutate(shift_wt = leadlag(wt, n = .x)))

# check they are identical
identical(final_list, final_list_desired)
#> [1] TRUE

Created on 2020-05-20 by the reprex package (v0.3.0)

@luispfonseca
Copy link
Author

As an additional note, the code of lead and lag is mostly repeated across the 2 functions. See https://github.com/tidyverse/dplyr/blob/09f61457059c4b00d8b9134cfa55f00255a14bf2/R/lead-lag.R
A generic function could address both cases, of which lead and lag are special cases and avoid code duplication. Exporting this function would address the suggestion I am making as well.

@luispfonseca
Copy link
Author

A proposal, made by merging the current versions of the 2 functions. Passes the tests.

#' @export
#' @rdname lead-lag
lead_lag <- function(x, n, default = NA, order_by = NULL, ...) {
  if (!is.null(order_by)) {
    return(with_order(order_by, lead_lag, x, n = n, default = default))
  }

  if (inherits(x, "ts")) {
    bad_args("x", "must be a vector, not a ts object, do you want `stats::lag()`?")
  }

  if (length(n) != 1 || !is.numeric(n)) {
    bad_args(
      "n", "must be an integer scalar, ",
      "not {friendly_type_of(n)} of length {length(n)}."
    )
  }
  if (n == 0) {
    return(x)
  }

  lag_toggle <- (n < 0)

  xlen <- vec_size(x)
  n <- pmin(abs(n), xlen)

  inputs <- vec_cast_common(default = default, x = x)

  if (lag_toggle == TRUE) {
    return(
      vec_c(
        vec_rep(inputs$default, n),
        vec_slice(inputs$x, seq_len(xlen - n))
      )
    )
  }

  if (lag_toggle == FALSE) {
    return(
      vec_c(
        vec_slice(inputs$x, -seq_len(n)),
        vec_rep(inputs$default, n)
      )
    )
  }
}

#' @export
#' @rdname lead-lag
lead <- function(x, n = 1L, ...) {
  if (length(n) != 1 || !is.numeric(n) || n < 0) {
    bad_args(
      "n", "must be a nonnegative integer scalar, ",
      "not {friendly_type_of(n)} of length {length(n)}."
    )
  }

  lead_lag(x, n, ...)
}

#' @export
#' @rdname lead-lag
lag <- function(x, n = 1L, ...) {
  if (length(n) != 1 || !is.numeric(n) || n < 0) {
    bad_args(
      "n", "must be a nonnegative integer scalar, ",
      "not {friendly_type_of(n)} of length {length(n)}."
    )
  }

  lead_lag(x, -n, ...)
}

@romainfrancois
Copy link
Member

We'll probably reimplement lead() and lag() in the funs:: package at some point.

@hadley
Copy link
Member

hadley commented Jun 5, 2020

Tracking in linked issue.

@hadley hadley closed this as completed Jun 5, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants