patch `params` argument with `xgboost` engine in `boost_tree()` #787

simonpcouch · 2022-08-15T14:24:01Z

Closes #774, closes #459. Related to #411.

The goal of this PR is to ensure that folks can pass arguments that live in the `params` argument of `xgb.train()`. The new docs section is probably the best place to start for the big picture here. :)

Some notes-to-self that helped me keep track of arguments:

- `xgb.train()` routes arguments from its dots to the `params` argument. To keep our own argument routing simple, this PR proposes that we pass non-main `params` arguments through the dots rather than through `params`.
- `xgb_train()` previously took `objective` as a main argument. That argument is eventually routed to the `params` argument of `xgb.train()`, so, again to simplify our machinery, I deleted the `objective` argument from `xgb_train()` so that it is passed through the dots. This change isn't user-facing and doesn't change how the argument is passed in practice.
- We'd prefer that users pass elements of the `params` argument directly to `set_engine()` rather than as part of the `params` list, so that the tune machinery works "out of the box." This PR now raises a warning when users supply a non-empty `params` argument, though it now correctly handles patching the `params` argument with the main `boost_tree()` arguments.
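To make the last point concrete, here is a minimal sketch of the two ways an xgboost parameter can reach the engine. The spec names `spec_direct` and `spec_params` are hypothetical, and `eval_metric = "mae"` is only an illustrative value. The snippet builds the specs without fitting them, so it only shows where the argument is stored on the spec.

```r
library(parsnip)

# Preferred: pass the xgboost parameter directly to set_engine(), so it is
# stored as its own engine argument (and can be marked with tune()).
spec_direct <-
  boost_tree(trees = 20) %>%
  set_mode("regression") %>%
  set_engine("xgboost", eval_metric = "mae")

# Discouraged: wrap the same parameter in a `params` list. The whole list is
# then stored as a single engine argument named `params`.
spec_params <-
  boost_tree(trees = 20) %>%
  set_mode("regression") %>%
  set_engine("xgboost", params = list(eval_metric = "mae"))

# Inspect which engine arguments each spec carries.
names(spec_direct$eng_args)
names(spec_params$eng_args)
```

Going by the description above, fitting `spec_params` with this PR should raise the new warning and then patch `params` with the main `boost_tree()` arguments, while `spec_direct` should fit without it.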

An additional unit test checks that arguments passed via `...` to `xgb_train()` can indeed be tuned; I will PR it to extratests after this is merged:

```r
# with this devel parsnip:
library(tidymodels)

# define base model spec, marking an engine argument for tuning
spec_base <-
  boost_tree() %>%
  set_mode("regression") %>%
  set_engine("xgboost", eval_metric = tune())

# tune over a custom qualitative parameter for `eval_metric`
res <-
  tune_grid(
    spec_base,
    preprocessor = mpg ~ .,
    resamples = vfold_cv(mtcars, v = 6),
    param_info =
      extract_parameter_set_dials(spec_base) %>%
      update(eval_metric = new_qual_param(
        type = "character",
        values = c("rmse", "logloss"),
        label = c(eval_metric = "Evaluation Metric")
      )),
    control = control_grid(save_workflow = TRUE)
  )

# both candidate values should appear in the tuning results
res_eval_values <-
  collect_metrics(res) %>%
  pull(eval_metric)

all(c("logloss", "rmse") %in% res_eval_values)
#> [1] TRUE
```

Created on 2022-08-15 by the reprex package (v2.0.1)

topepo · 2022-08-17T13:20:10Z

I'm ready to merge this but it breaks a package. Can you investigate and, if needed, add a PR (or issue) to https://github.com/Harrison4192/autostats/issues?

autostats

Version: 0.3.0
GitHub: https://github.com/Harrison4192/autostats
Source code: https://github.com/cran/autostats
Date/Publication: 2022-02-09 08:50:02 UTC
Number of recursive dependencies: 231

Run revdep_details(, "autostats") for more info

Newly broken

checking examples ... ERROR

```
Running examples in ‘autostats-Ex.R’ failed
The error most likely occurred in:

> ### Name: get_params
> ### Title: get params
> ### Aliases: get_params get_params.xgb.Booster
> 
> ### ** Examples
> 
> 
...
> iris_dummies %>%
+   tidy_formula(target = Petal.Length) -> p_form
> 
> iris_dummies %>%
+   tidy_xgboost(p_form, mtry = .5, trees = 5L, loss_reduction = 2, sample_size = .7) -> xgb
Warning: `early_stop` was reduced to 4.
Warning: `early_stop` was reduced to 4.
Error in if (objective == "multi:softmax") { : argument is of length zero
Calls: %>% ... withCallingHandlers -> %>% -> tidy_predict -> tidy_predict.xgb.Booster
Execution halted
```

simonpcouch · 2022-08-17T13:21:14Z

Yup, sure thing.

simonpcouch · 2022-08-17T14:05:31Z

Submitted a fix! 👍

github-actions · 2022-09-01T01:14:20Z

This pull request has been automatically locked. If you believe you have found a related problem, please file a new issue (with a reprex: https://reprex.tidyverse.org) and link to this issue.

simonpcouch added 3 commits August 15, 2022 10:16

patch params argument with xgboost engine in boost_tree()

ecf2040

remove + add snapshots from previous PRs

0b82f1b

update snaps with new help-page reference

11d91c2

simonpcouch requested a review from topepo August 15, 2022 15:12

simonpcouch mentioned this pull request Aug 17, 2022

check failures with upcoming parsnip release Harrison4192/autostats#1

Closed

Merge branch 'main' into boost-tree-params-774

3fe4f46

topepo merged commit 6c5482a into main Aug 17, 2022

topepo deleted the boost-tree-params-774 branch August 17, 2022 23:32

github-actions bot locked and limited conversation to collaborators Sep 1, 2022
