Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The bake.step_log() S3 method breaks legacy recipe objects #1284

Closed
stufield opened this issue Feb 7, 2024 · 5 comments · Fixed by #1285
Closed

The bake.step_log() S3 method breaks legacy recipe objects #1284

stufield opened this issue Feb 7, 2024 · 5 comments · Fixed by #1285
Labels
bug an unexpected problem or unintended behavior

Comments

@stufield
Copy link
Contributor

stufield commented Feb 7, 2024

The problem

The current (CRAN v1.0.9) versions of bake.step_log() breaks older (legacy) recipes made prior to (guessing) v1.0.4, > 1 year ago. When attempting to bake() a prepped recipe that has a log_step, the bake() S3 method used to check for the appropriate columns using the x$columns entry of the object, however it now looks for its names, i.e. names(x$columns), which for older recipes without vector names will be NULL, and breaks backward compatibility of the object.

The line where this happens is: https://github.com/tidymodels/recipes/blob/main/R/log.R#L127

And it appears to have been introduced here: 6155183

Perhaps even worse, the bake() step does not fail when names are NULL, it skips the logging-loop and exits without modifying the any columns, and resulting in incorrect pre-processed data.

Questions:

  1. are there any future plans to address backward compatibility of older objects?
  2. is there a recommended (preferred) workaround?
  3. should breaking changes of legacy recipes be a consideration in semantic versioning?
  4. should I modify the recipe object as necessary for it to function in v1.0.9?

The last option is not ideal since in my use case, these recipes (~45) are production level objects that are frozen with their corresponding model objects and are tied to regulatory controlled customer products.

Reprex

I do not have a reprex per se, but there is a simple workaround:

update_legacy_logstep <- function(obj) {
  idx <- which(vapply(obj$steps, inherits, what = "step_log", NA))
  for (i in seq_len(idx)) {
    step <- obj$steps[[i]]
    if (is.null(names(step$columns))) {
      names(obj$steps[[i]]$columns) <- step$columns
    }
  }
  obj
}
@EmilHvitfeldt
Copy link
Member

Hello!

This is indeed a bug. And should be fixed. It is kinda hard testing recipes steps for backwards compatibility so sometimes these things fall through the cracks. This is a high priority issue

Would you be able to freeze the installed {recipes} versions? Or were the recent update necessary? Your temporary fix would work as well

I'm out of office without computer, coming back on 15th.

@EmilHvitfeldt EmilHvitfeldt added the bug an unexpected problem or unintended behavior label Feb 7, 2024
@stufield
Copy link
Contributor Author

stufield commented Feb 7, 2024

Hi @EmilHvitfeldt ,

In our production environment we intentionally run a few versions behind "live CRAN" and do an annual update, which is when we caught this issue.

Waiting for a bug-fix likely won't work since we cannot use a development version (for reproducibility we must use a released CRAN version). So I'll use the workaround-fix adding the temporary names to the object prior to baking, which should allow backward compatibility internally with our code base too.

I definitely know how things can slip through the cracks 🤷🏽‍♂️
I more or less just wanted to know if this was on your radar and if there was a known fix I should be looking into.

Thanks! 🙌🏽

stufield added a commit to stufield/recipes that referenced this issue Feb 12, 2024
- legacy recipes index the `columns` entry of `object`
  whereas newer recipes index `names(object$columns)`
- this breaks backward compatibility
- This bugfix prefers the 'new' syntax, but if
  `names(object)` is `NULL`, reverts to the 'old' syntax
- fixes tidymodels#1284
@topepo
Copy link
Member

topepo commented Feb 18, 2024

@stufield A new version is on CRAN with your change. Thanks!

@stufield
Copy link
Contributor Author

Thank you @topepo !

Copy link

github-actions bot commented Mar 4, 2024

This issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with a reprex https://reprex.tidyverse.org) and link to this issue.

@github-actions github-actions bot locked and limited conversation to collaborators Mar 4, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
bug an unexpected problem or unintended behavior
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants