Update engine-specific args #351

Merged
merged 21 commits into from
Jul 31, 2020

juliasilge
Member

This PR addresses tidymodels/tune#254.

Before this PR, update_dot_check() raised an error if anything, including engine-specific parameters, showed up in the dots of an update method. This PR relaxes that check and implements updating for engine-specific parameters, so that tune() placeholders can be replaced with finalized values.
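
As a rough illustration of the behavior described above, the sketch below calls update() directly on a model specification whose engine argument is marked with tune(). It assumes a parsnip version that includes this change; the values 4 and 0.5 are arbitrary placeholders rather than tuning results, and inspecting the `args` and `eng_args` slots is just one convenient way to check the outcome.

```r
library(tidymodels)

# Spec with one main argument and one engine-specific argument marked for tuning
rf_spec <-
  rand_forest(min_n = tune()) %>%
  set_mode("regression") %>%
  set_engine("ranger", regularization.factor = tune())

# Assumption: with this change in place, an engine argument that already
# exists in the spec can be passed through the dots of update()
updated <- update(rf_spec, min_n = 4, regularization.factor = 0.5)

# Arguments are stored as quosures, so evaluate them to see the new values
min_n_value <- rlang::eval_tidy(updated$args$min_n)
reg_factor_value <- rlang::eval_tidy(updated$eng_args$regularization.factor)

print(updated)
```

finalize_workflow(), used in the results in this thread, applies the same kind of update to the model inside a workflow using the row returned by select_best().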

@juliasilge
Member Author

These are the results for random forest:

library(tidymodels)
#> ── Attaching packages ────────────────────────── tidymodels 0.1.1 ──
#> ✓ broom     0.7.0          ✓ recipes   0.1.13    
#> ✓ dials     0.0.8          ✓ rsample   0.0.7     
#> ✓ dplyr     1.0.0          ✓ tibble    3.0.3     
#> ✓ ggplot2   3.3.2          ✓ tidyr     1.1.0     
#> ✓ infer     0.5.3          ✓ tune      0.1.1.9000
#> ✓ modeldata 0.0.2          ✓ workflows 0.1.2     
#> ✓ parsnip   0.1.2.9000     ✓ yardstick 0.0.7     
#> ✓ purrr     0.3.4
#> ── Conflicts ───────────────────────────── tidymodels_conflicts() ──
#> x purrr::discard() masks scales::discard()
#> x dplyr::filter()  masks stats::filter()
#> x dplyr::lag()     masks stats::lag()
#> x recipes::step()  masks stats::step()

rf_mod1 <- 
  rand_forest(min_n = tune()) %>%
  set_mode("regression") %>%
  set_engine("ranger", 
             regularization.factor = tune(), 
             regularization.usedepth = tune())

rf_mod2 <- 
  rand_forest(min_n = tune()) %>%
  set_mode("regression") %>%
  set_engine("ranger")

set.seed(123)
car_boot <- bootstraps(mtcars, times = 5)

car_wf <- workflow() %>%
  add_formula(mpg ~ .) 

set.seed(123)
best1 <- car_wf %>%
  add_model(rf_mod1) %>%
  tune_grid(
    resamples = car_boot
  ) %>%
  select_best("rmse")
#> ! Bootstrap1: internal: A correlation computation is required, but `estimate` is const...
#> ! Bootstrap2: internal: A correlation computation is required, but `estimate` is const...
#> ! Bootstrap3: internal: A correlation computation is required, but `estimate` is const...
#> ! Bootstrap4: internal: A correlation computation is required, but `estimate` is const...
#> ! Bootstrap5: internal: A correlation computation is required, but `estimate` is const...

best2 <- car_wf %>%
  add_model(rf_mod2) %>%
  tune_grid(
    resamples = car_boot
  ) %>%
  select_best("rmse")
#> ! Bootstrap1: internal: A correlation computation is required, but `estimate` is const...
#> ! Bootstrap2: internal: A correlation computation is required, but `estimate` is const...
#> ! Bootstrap3: internal: A correlation computation is required, but `estimate` is const...
#> ! Bootstrap4: internal: A correlation computation is required, but `estimate` is const...
#> ! Bootstrap5: internal: A correlation computation is required, but `estimate` is const...

car_wf %>%
  add_model(rf_mod1) %>%
  finalize_workflow(best1)
#> ══ Workflow ════════════════════════════════════════════════════════
#> Preprocessor: Formula
#> Model: rand_forest()
#> 
#> ── Preprocessor ────────────────────────────────────────────────────
#> mpg ~ .
#> 
#> ── Model ───────────────────────────────────────────────────────────
#> Random Forest Model Specification (regression)
#> 
#> Main Arguments:
#>   min_n = 4
#> 
#> Engine-Specific Arguments:
#>   regularization.factor = 0.684913042467088
#>   regularization.usedepth = TRUE
#> 
#> Computational engine: ranger

car_wf %>%
  add_model(rf_mod2) %>%
  finalize_workflow(best2)
#> ══ Workflow ════════════════════════════════════════════════════════
#> Preprocessor: Formula
#> Model: rand_forest()
#> 
#> ── Preprocessor ────────────────────────────────────────────────────
#> mpg ~ .
#> 
#> ── Model ───────────────────────────────────────────────────────────
#> Random Forest Model Specification (regression)
#> 
#> Main Arguments:
#>   min_n = 5
#> 
#> Computational engine: ranger

Created on 2020-07-29 by the reprex package (v0.3.0.9001)

@topepo
Member

topepo commented Jul 29, 2020

You probably want to update from main; I just merged a PR that fixed some tests.

@juliasilge
Member Author

Here are some boost_tree() results:

library(tidymodels)
#> ── Attaching packages ──────────────────────────────── tidymodels 0.1.1 ──
#> ✓ broom     0.7.0          ✓ recipes   0.1.13    
#> ✓ dials     0.0.8          ✓ rsample   0.0.7     
#> ✓ dplyr     1.0.0          ✓ tibble    3.0.3     
#> ✓ ggplot2   3.3.2          ✓ tidyr     1.1.0     
#> ✓ infer     0.5.3          ✓ tune      0.1.1.9000
#> ✓ modeldata 0.0.2          ✓ workflows 0.1.2     
#> ✓ parsnip   0.1.2.9000     ✓ yardstick 0.0.7     
#> ✓ purrr     0.3.4
#> ── Conflicts ─────────────────────────────────── tidymodels_conflicts() ──
#> x purrr::discard() masks scales::discard()
#> x dplyr::filter()  masks stats::filter()
#> x dplyr::lag()     masks stats::lag()
#> x recipes::step()  masks stats::step()

data(penguins, package = "palmerpenguins")

set.seed(234)
folds <- vfold_cv(penguins, v = 5)

bands_grid <- tibble(bands = c(2, 50, 100))

c50_spec <- boost_tree() %>% 
  set_mode("classification") %>%
  set_engine("C5.0", rules = TRUE, bands = tune())

c50_wf <- workflow() %>%
  add_formula(species ~ .) %>%
  add_model(c50_spec)

best <- c50_wf %>%
  tune_grid(resamples = folds, grid = bands_grid) %>%
  select_best("roc_auc")

best
#> # A tibble: 1 x 2
#>   bands .config
#>   <dbl> <chr>  
#> 1     2 Model1

finalize_workflow(c50_wf, best)
#> ══ Workflow ══════════════════════════════════════════════════════════════
#> Preprocessor: Formula
#> Model: boost_tree()
#> 
#> ── Preprocessor ──────────────────────────────────────────────────────────
#> species ~ .
#> 
#> ── Model ─────────────────────────────────────────────────────────────────
#> Boosted Tree Model Specification (classification)
#> 
#> Engine-Specific Arguments:
#>   rules = TRUE
#>   bands = 2
#> 
#> Computational engine: C5.0

Created on 2020-07-30 by the reprex package (v0.3.0.9001)

@juliasilge juliasilge marked this pull request as ready for review July 30, 2020 18:29
@juliasilge juliasilge requested a review from topepo July 30, 2020 18:29
@topepo
Member

topepo commented Jul 31, 2020

Looks great! I also tested it on tidymodels/tune#254

@topepo topepo merged commit d251c73 into master Jul 31, 2020
@topepo topepo deleted the update-engine-args branch July 31, 2020 01:50
@github-actions

github-actions bot commented Mar 6, 2021

This pull request has been automatically locked. If you believe you have found a related problem, please file a new issue (with a reprex: https://reprex.tidyverse.org) and link to this issue.

@github-actions github-actions bot locked and limited conversation to collaborators Mar 6, 2021