Skip to content

step_holiday fails with NA values #743

@MrFlick

Description

@MrFlick

The problem

The step_holiday function seems to return errors during baking/juicing in the presence of NA values. For example

Reproducible example

library(recipes)

dd <- data.frame(
  date = as.Date(c("2021-12-04", "2021-12-25", NA, "2021-12-27"))
)

recipe(~date, data=dd) %>% 
  step_holiday(date) %>% 
  prep() %>% 
  juice()

# Error in if (rng.nch[1] != rng.nch[2]) stop("'charvec' has non-NA entries of different number of characters") : 
#   missing value where TRUE/FALSE needed

Tested with recipes_0.1.16

Possible solution

The problem seems to be with the is_holiday function. It's passing a unique vector of years to the timeDate::holiday() function but that vector can include NA values which end up throwing the error the timeDate::holiday() code. It would be better to omit NA years (possibly checking to make sure there is at least one NA-year). It probably also makes sense to propagate the NA values in the returned indicator columns. One possible solution is

is_holiday <- function(hol, dt) {
  # ~~ add na.omit() here to drop NA values ~~
  hdate <- holiday(year = unique(year(na.omit(dt))), Holiday = hol)
  hdate <- as.Date(hdate)
  out <- rep(0, length(dt))
  out[dt %in% hdate] <- 1
  # ~~ carry forward missing values ~~
  out[is.na(dt)] <- NA
  out
}

This updated version of the function would return

# A tibble: 4 x 4
  date       date_LaborDay date_NewYearsDay date_ChristmasDay
  <date>             <dbl>            <dbl>             <dbl>
1 2021-12-04             0                0                 0
2 2021-12-25             0                0                 1
3 NA                    NA               NA                NA
4 2021-12-27             0                0                 0

If that seems like a reasonable solution, I could prepare a pull request.

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugan unexpected problem or unintended behavior

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions