enable set.seed for any recipe steps that are not deterministic #188

Closed
sheffe opened this issue Aug 30, 2018 · 2 comments

sheffe commented Aug 30, 2018

This is a cross-post from a caret issue that affects recipes.

Right now, caret preps a recipe twice: once to generate the dataset that finalModel is trained on, and again, outside the finalModel training, to produce a second prepped recipe that is returned with the rest of the train output. The latter is what caret uses to transform new data at prediction time.

This means that any recipe steps that are not deterministic (I think the main cases that would affect users now are step_upsample and step_downsample) will lead to a caret finalModel that is trained on a different dataset from the one the returned preprocessing recipe was prepped on. This is likely to be an invisible bug, except in cases where the difference between the randomly generated up/down-samples in the two recipe preparations becomes large enough to cause downstream changes in (e.g.) which features are retained.
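
A minimal base R sketch of the mismatch described above. `downsample_once()` is a hypothetical stand-in for a random down-sampling step, not the recipes or caret code, and the toy data frame is made up for illustration. Calling it twice plays the role of the two separate preps:

```r
# Hypothetical stand-in for a random down-sampling step (not the recipes code):
# every class is reduced to the size of the smallest class.
downsample_once <- function(data, class_col) {
  groups <- split(seq_len(nrow(data)), data[[class_col]])
  n_min <- min(lengths(groups))
  keep <- unlist(
    lapply(groups, function(idx) idx[sample.int(length(idx), n_min)]),
    use.names = FALSE
  )
  data[sort(keep), , drop = FALSE]
}

# Toy imbalanced data: 90 rows of class "a", 10 rows of class "b".
dat <- data.frame(y = rep(c("a", "b"), times = c(90, 10)), x = seq_len(100))

training_rows <- downsample_once(dat, "y")  # first "prep": what the model is fit on
returned_rows <- downsample_once(dat, "y")  # second "prep": what is kept for later

# Both results have the same size, but the rows drawn from the majority
# class will very likely differ between the two calls.
nrow(training_rows) == nrow(returned_rows)
identical(training_rows$x, returned_rows$x)
```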

topepo (Member) commented Sep 10, 2018

The way to solve this would be to add seed arguments (and to default them to something like sample.int(10^5, 1)) and execute the code using withr::with_seed.
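
A rough sketch of that idea, assuming the withr package is installed. `make_downsample_step()` and `apply_downsample_step()` are hypothetical helpers written for illustration, not the recipes API; the point is only that the seed is drawn once when the step is created and then reused every time the step is applied:

```r
library(withr)

# Hypothetical step constructor: the default seed is drawn once, here,
# and stored with the step so every later application reuses it.
make_downsample_step <- function(class_col, seed = sample.int(10^5, 1)) {
  list(class_col = class_col, seed = seed)
}

# Hypothetical step application: the random sampling runs under the stored
# seed, and with_seed() restores the global RNG state afterwards.
apply_downsample_step <- function(step, data) {
  with_seed(step$seed, {
    groups <- split(seq_len(nrow(data)), data[[step$class_col]])
    n_min <- min(lengths(groups))
    keep <- unlist(
      lapply(groups, function(idx) idx[sample.int(length(idx), n_min)]),
      use.names = FALSE
    )
    data[sort(keep), , drop = FALSE]
  })
}

# Toy imbalanced data: 90 rows of class "a", 10 rows of class "b".
dat <- data.frame(y = rep(c("a", "b"), times = c(90, 10)), x = seq_len(100))

step <- make_downsample_step("y")
first_prep <- apply_downsample_step(step, dat)
second_prep <- apply_downsample_step(step, dat)

# The same step object now selects the same rows on both preps.
identical(first_prep, second_prep)
```

Because the seed lives in the step rather than in the global RNG state, preparing the same recipe object twice selects the same rows, while a newly created step still gets its own random seed by default.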

topepo closed this as completed in 3683dc6 on Sep 28, 2018