more explicit seed setting for tune_grid #11
Folks are running into unexpected results with seed setting for
library(tidymodels)
set.seed(123)
tr_te_split <- initial_split(mtcars)
rf_spec <- rand_forest() %>%
  set_mode("regression") %>%
  set_engine("ranger")
rf_wf <- workflow() %>%
  add_model(rf_spec) %>%
  add_formula(mpg ~ .)
set.seed(345)
last_rf_fit <- last_fit(rf_wf, split = tr_te_split)
collect_predictions(last_rf_fit)
#> # A tibble: 8 x 4
#>   id               .pred  .row   mpg
#>   <chr>            <dbl> <int> <dbl>
#> 1 train/test split  24.8     3  22.8
#> 2 train/test split  18.8    10  19.2
#> 3 train/test split  16.8    14  15.2
#> 4 train/test split  13.8    15  10.4
#> 5 train/test split  28.0    18  32.4
#> 6 train/test split  29.0    19  30.4
#> 7 train/test split  16.9    22  15.5
#> 8 train/test split  15.4    31  15
set.seed(345)
last_rf_fit_2 <- fit(rf_wf, training(tr_te_split))
predict(last_rf_fit_2, testing(tr_te_split))
#> # A tibble: 8 x 1
#>   .pred
#>   <dbl>
#> 1  24.3
#> 2  18.6
#> 3  16.8
#> 4  13.5
#> 5  28.0
#> 6  29.1
#> 7  17.0
#> 8  15.4
Created on 2020-08-18 by the reprex package (v0.3.0.9001)
It's worth mentioning that I tried to set the
library(tidymodels)
#> ── Attaching packages ────────────────────────────────────────────────────────────────────────────────────── tidymodels 0.1.1 ──
#> ✓ broom 0.7.0 ✓ recipes 0.1.13
#> ✓ dials 0.0.8 ✓ rsample 0.0.7
#> ✓ dplyr 1.0.0 ✓ tibble 3.0.3
#> ✓ ggplot2 3.3.2 ✓ tidyr 1.1.0
#> ✓ infer 0.5.3 ✓ tune 0.1.1
#> ✓ modeldata 0.0.2 ✓ workflows 0.1.2
#> ✓ parsnip 0.1.2 ✓ yardstick 0.0.7
#> ✓ purrr 0.3.4
#> ── Conflicts ───────────────────────────────────────────────────────────────────────────────────────── tidymodels_conflicts() ──
#> x purrr::discard() masks scales::discard()
#> x dplyr::filter() masks stats::filter()
#> x dplyr::lag() masks stats::lag()
#> x recipes::step() masks stats::step()
set.seed(123)
tr_te_split <- initial_split(mtcars)
rf_spec <- rand_forest() %>%
  set_mode("regression") %>%
  set_engine("ranger", seed = 123)
rf_wf <- workflow() %>%
  add_model(rf_spec) %>%
  add_formula(mpg ~ .)
set.seed(345)
last_rf_fit <- last_fit(rf_wf, split = tr_te_split)
collect_predictions(last_rf_fit)
#> # A tibble: 8 x 4
#>   id               .pred  .row   mpg
#>   <chr>            <dbl> <int> <dbl>
#> 1 train/test split  24.8     3  22.8
#> 2 train/test split  18.8    10  19.2
#> 3 train/test split  16.5    14  15.2
#> 4 train/test split  13.6    15  10.4
#> 5 train/test split  28.2    18  32.4
#> 6 train/test split  29.2    19  30.4
#> 7 train/test split  17.3    22  15.5
#> 8 train/test split  15.3    31  15
set.seed(345)
last_rf_fit_2 <- fit(rf_wf, training(tr_te_split))
predict(last_rf_fit_2, testing(tr_te_split))
#> # A tibble: 8 x 1
#>   .pred
#>   <dbl>
#> 1  24.8
#> 2  18.8
#> 3  16.5
#> 4  13.6
#> 5  28.2
#> 6  29.2
#> 7  17.3
#> 8  15.3
Created on 2020-08-19 by the reprex package (v0.3.0)
Closed in #275 🎉
This issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with a reprex: https://reprex.tidyverse.org) and link to this issue.
Perhaps take the caret::train() route and have a vector of seeds as a control argument and add a lot of with_seed() to the different modules.
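The "vector of seeds plus with_seed()" idea can be sketched roughly as follows. This is only an illustration, not how tune or caret implement it: it assumes the withr package is installed, and `make_seeds()` and `noisy_fit()` are hypothetical stand-ins for a seed-drawing helper and a stochastic model fit.

```r
library(withr)

# Hypothetical helper: draw one integer seed per module from the main RNG stream
make_seeds <- function(n) sample.int(10^5, n)

# Hypothetical stand-in for a stochastic model fit (e.g. a random forest)
noisy_fit <- function(x) mean(sample(x, replace = TRUE))

set.seed(345)
seeds <- make_seeds(3)

# Each "module" runs under its own pre-drawn seed
run_a <- vapply(seeds, function(s) with_seed(s, noisy_fit(mtcars$mpg)), numeric(1))

# Consume some random numbers in between, as unrelated code might
invisible(runif(10))

# Re-running with the same seed vector does not depend on the global RNG state
run_b <- vapply(seeds, function(s) with_seed(s, noisy_fit(mtcars$mpg)), numeric(1))

identical(run_a, run_b)
```

Because each call runs under its own pre-drawn seed, the result of a module should not depend on how many random numbers were consumed before it, which is the kind of mismatch the reprex above shows between last_fit() and fit().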