This is a cross-post from a `caret` issue that affects `recipes`.
Right now, `caret` preps a recipe twice: once to generate the dataset that `finalModel` is trained on, and then, in a second step outside the `finalModel` training, again to produce the prepped recipe that is returned with the rest of the `train` output. The latter is what `caret` uses to transform new data at prediction time.
This means that any recipe steps that are not deterministic (I think the main cases that would affect users now are `step_upsample` and `step_downsample`) will lead to a `caret` `finalModel` that is not trained on the same dataset as the preprocessing recipe included for later use. This is likely to be an invisible bug, except in cases where the difference between the randomly generated up/down-samples drawn during the two recipe preparations becomes large enough to cause downstream changes in (e.g.) which features are retained.
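A minimal base-R illustration of the underlying problem (not using `recipes` itself): running the same non-deterministic sampling code twice, with no shared seed, generally yields different rows, so two independent preps of a recipe containing a sampling step need not produce the same training set.

```r
# Simulate two independent "preps" of a down-sampling step with no
# shared seed: each draws its own random subset of rows.
rows_prep1 <- sample(nrow(mtcars), 16)
rows_prep2 <- sample(nrow(mtcars), 16)

# The two draws almost certainly select different rows, so a model
# trained on the first subset would not match a recipe baked from
# the second.
identical(sort(rows_prep1), sort(rows_prep2))
```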
The way to solve this would be to add `seed` arguments to the sampling steps (defaulting them to something like `sample.int(10^5, 1)`) and to execute the sampling code using `withr::with_seed()`, so that every prep of the same step draws the same rows.
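A hypothetical sketch of that proposal (the constructor name and structure here are illustrative, not the actual `recipes` internals): the step records a seed at creation time, and every execution of its sampling code is wrapped in `withr::with_seed()`, so both preps draw identical rows.

```r
library(withr)

# Illustrative stand-in for a sampling step: the seed is drawn once,
# when the step is created, and reused on every execution.
make_sampling_step <- function(seed = sample.int(10^5, 1)) {
  step <- list(seed = seed)
  step$bake <- function(data) {
    withr::with_seed(step$seed, {
      # Non-deterministic work, e.g. down-sampling to half the rows.
      data[sample(nrow(data), size = floor(nrow(data) / 2)), , drop = FALSE]
    })
  }
  step
}

step <- make_sampling_step()
a <- step$bake(mtcars)
b <- step$bake(mtcars)
identical(a, b)  # TRUE: the stored seed makes repeated preps agree
```

Because the default seed is itself drawn randomly, users who never set it still get varied samples across separately created steps; the determinism applies only to re-executions of the same step object.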
This issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with a reprex https://reprex.tidyverse.org) and link to this issue.