tuning with list-columns in grid #625

Closed
SHo-JANG opened this issue Feb 28, 2023 · 6 comments · Fixed by #633
Labels
bug an unexpected problem or unintended behavior

Comments

SHo-JANG commented Feb 28, 2023

I want to customize the objective function.

For example, what I want to implement now is focal loss, a loss function that can be used when there is class imbalance.
Focal Loss

Because focal loss is a generalization of cross-entropy, certain hyperparameter settings give exactly the same result as cross-entropy (CE).
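
For reference, the alpha-balanced binary focal loss implemented in focal_loss() below, with p the sigmoid of the raw prediction, is

$$\mathrm{FL}(p, y) = -\bigl[\, y\,\alpha\,(1-p)^{\gamma}\log p \;+\; (1-y)\,(1-\alpha)\,p^{\gamma}\log(1-p) \,\bigr].$$

With focal_gamma = 0 this is just an alpha-weighted cross-entropy, and with alpha = 0.5 it equals 0.5 × CE, so the gradient and hessian are both scaled by the same constant and xgboost's second-order updates are essentially unchanged (up to regularization).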

library(tidymodels)
library(xgboost)
#> 
#> Attaching package: 'xgboost'
#> The following object is masked from 'package:dplyr':
#> 
#>     slice

data(agaricus.train, package = "xgboost")
data(agaricus.test, package = "xgboost")

dtrain <- with(agaricus.train, xgb.DMatrix(data, label = label, nthread = 2))
dtest <- with(agaricus.test, xgb.DMatrix(data, label = label, nthread = 2))
watchlist <- list(train = dtrain, eval = dtest)

# original cross entropy loss
logregobj <- function(preds, dtrain) {
  labels <- getinfo(dtrain, "label")
  preds <- 1/(1 + exp(-preds))
  grad <- preds - labels
  hess <- preds * (1 - preds)
  return(list(grad = grad, hess = hess))
}

# focal_loss --------------------------------------------------------------

focal_loss <- function(preds, dtrain, alpha = 0.5, focal_gamma = 0) {
  labels <- getinfo(dtrain, "label")
  preds <- 1 / (1 + exp(-preds))

  p <- preds
  y <- labels

  grad <- (y * alpha * (1 - p)^focal_gamma * (focal_gamma * log(p) * p - (1 - p)) -
             (1 - y) * (1 - alpha) * p^focal_gamma * (focal_gamma * log(1 - p) * (1 - p) - p))

  du <- y * alpha * (1 - p)^(focal_gamma - 1) *
    (log(p) * (-focal_gamma^2 * p + (1 - p) * focal_gamma) + 2 * focal_gamma * (1 - p) + (1 - p))
  dv <- -(1 - y) * (1 - alpha) * p^(focal_gamma - 1) *
    (log(1 - p) * (focal_gamma^2 * (1 - p) - p * focal_gamma) - 2 * focal_gamma * p - p)

  hess <- (du + dv) * p * (1 - p)

  return(list(grad = grad, hess = hess))
}
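
# Quick check: with alpha = 0.5 and focal_gamma = 0, focal_loss() returns exactly
# 0.5 * the gradient and 0.5 * the hessian of logregobj(); that common factor
# essentially cancels in xgboost's second-order (grad / hess) updates, which is
# why the AUC traces below are nearly identical.
raw_scores <- c(-2, 0, 1.5)
y_check    <- c(0, 1, 1)
d_check    <- xgb.DMatrix(matrix(rnorm(12), nrow = 3), label = y_check, nthread = 2)
ce <- logregobj(raw_scores, d_check)
fl <- focal_loss(raw_scores, d_check, alpha = 0.5, focal_gamma = 0)
all.equal(fl$grad, 0.5 * ce$grad)  # TRUE
all.equal(fl$hess, 0.5 * ce$hess)  # TRUE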


param <- list(max_depth = 2, eta = 1, nthread = 2,
              objective = "binary:logistic", eval_metric = "auc")
bst <- xgb.train(param, dtrain, nrounds = 5, watchlist)
#> [1]  train-auc:0.958228  eval-auc:0.960373 
#> [2]  train-auc:0.981413  eval-auc:0.979930 
#> [3]  train-auc:0.997070  eval-auc:0.998518 
#> [4]  train-auc:0.998757  eval-auc:0.998943 
#> [5]  train-auc:0.999298  eval-auc:0.999830

bst$params$objective
#> [1] "binary:logistic"
#> [1] "binary:logistic"

param$objective <- logregobj

bst <- xgb.train(param, dtrain, nrounds = 5, watchlist)
#> [1]  train-auc:0.958228  eval-auc:0.960373 
#> [2]  train-auc:0.981413  eval-auc:0.979930 
#> [3]  train-auc:0.997070  eval-auc:0.998518 
#> [4]  train-auc:0.998757  eval-auc:0.998943 
#> [5]  train-auc:0.998120  eval-auc:0.999830

param$objective <- focal_loss  # alpha = 0.5, focal_gamma = 0 -> same result!

bst <- xgb.train(param, dtrain, nrounds = 5, watchlist)
#> [1]  train-auc:0.958228  eval-auc:0.960373 
#> [2]  train-auc:0.981413  eval-auc:0.979930 
#> [3]  train-auc:0.997070  eval-auc:0.998518 
#> [4]  train-auc:0.998166  eval-auc:0.998943 
#> [5]  train-auc:0.998823  eval-auc:0.999830

I want to tune alpha and focal_gamma.

param$objective <- partial(focal_loss, alpha = 0.3, focal_gamma = 0)

bst <- xgb.train(param, dtrain, nrounds = 5, watchlist)
#> [1]  train-auc:0.979337  eval-auc:0.980196 
#> [2]  train-auc:0.992593  eval-auc:0.993159 
#> [3]  train-auc:0.999934  eval-auc:0.999917 
#> [4]  train-auc:0.999978  eval-auc:0.999972 
#> [5]  train-auc:0.999978  eval-auc:0.999972

Created on 2023-02-28 with reprex v2.0.2

I confirmed that this works with plain xgb.train(), but I don't know how to make it work with the tune functions.
The code below shows what I tried.

library(tidymodels)
library(xgboost)
#> 
#> Attaching package: 'xgboost'
#> The following object is masked from 'package:dplyr':
#> 
#>     slice
library(scales)
library(dials)



alpha <- function(range = c(0, 1), trans = NULL) {
  new_quant_param(
    type = "double",
    range = range,
    inclusive = c(TRUE, TRUE),
    trans = trans,
    label = c(alpha = "Alpha (class balance)"),
    finalize = NULL
  )
}

focal_gamma <- function(range = c(0, 5), trans = NULL) {
  new_quant_param(
    type = "double",
    range = range,
    inclusive = c(TRUE, TRUE),
    trans = trans,
    label = c(focal_gamma = "Focusing parameter (gamma)"),
    finalize = NULL
  )
}


data <- two_class_dat |> 
  rename(y = Class)

set.seed(100)
splits <- initial_split(data, prop = 0.8, strata = y)
train_data <- training(splits)
test_data <- testing(splits)
resamples <- vfold_cv(data = train_data, v = 5, strata = y)

xgb_model <- boost_tree(mode = "classification",
                        tree_depth = tune(),
                        trees = tune()) |> 
  set_engine(engine = "xgboost",
             objective = partial(focal_loss, focal_gamma = tune()))
# I wanted to tune both hyperparameters together at first, but because of the
# error message below I am trying to tune just one of them for now.

#>Error: Only one tunable value is currently allowed per argument.
#>The current argument has: `partial(focal_loss, focal_gamma = tune(), alpha = tune())`.



xgb_model |> translate()
#> Boosted Tree Model Specification (classification)
#> 
#> Main Arguments:
#>   trees = tune()
#>   tree_depth = tune()
#> 
#> Engine-Specific Arguments:
#>   objective = partial(focal_loss, focal_gamma = tune())
#> 
#> Computational engine: xgboost 
#> 
#> Model fit template:
#> parsnip::xgb_train(x = missing_arg(), y = missing_arg(), weights = missing_arg(), 
#>     nrounds = tune(), max_depth = tune(), objective = partial(focal_loss, 
#>         focal_gamma = tune()), nthread = 1, verbose = 0)



rec_base <- train_data %>% 
  recipe(y ~ .) |> 
  # step_mutate_at(all_numeric_predictors(), fn = list(orig = ~.)) %>%
  step_normalize(all_predictors(), -all_outcomes()) 


xgb_workflow <- 
  workflow() %>% 
  add_recipe(rec_base) %>% 
  add_model(xgb_model)


xgb_workflow %>%
  extract_parameter_set_dials()
#> Collection of 3 parameters for tuning
#> 
#>  identifier       type    object
#>       trees      trees nparam[+]
#>  tree_depth tree_depth nparam[+]
#>   objective  objective   missing
#> The parameter `objective` needs a `param` object. 
#> See `vignette('dials')` to learn more.

Created on 2023-02-28 with reprex v2.0.2

extract_parameter_set_dials() does not recognize the focal_gamma argument.
How can I tune the objective function?
Ultimately, I want to tune not just one parameter but several of them together.

simonpcouch commented Mar 9, 2023

Hey @SHo-JANG, thanks for the issue.

This is a really interesting use case, and I appreciate the helpful reprex! You’ve worked to integrate with the machinery in a very intuitive way.

I think the hitch you’re running into here is that tune expects the thing being tuned to be an argument to the training function; this is where the "Only one tunable value is currently allowed per argument" error is coming from. It's why, for instance, we have a light wrapper around xgboost::xgb.train() (parsnip::xgb_train()) that “lifts” the params list arguments to be main arguments.
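
Purely as a schematic of what "lifting" means here (not parsnip's actual implementation; the real wrapper is parsnip::xgb_train(), which shows up in your translate() output, and the xgb_train_lifted() name below is made up):

# schematic only: engine parameters become formal arguments of the wrapper,
# so tune() sees one named, tunable argument per value
xgb_train_lifted <- function(x, y, max_depth = 6, eta = 0.3, nrounds = 15, nthread = 1, ...) {
  dtrain <- xgboost::xgb.DMatrix(x, label = y, nthread = nthread)
  params <- c(list(max_depth = max_depth, eta = eta, nthread = nthread), list(...))
  xgboost::xgb.train(params = params, data = dtrain, nrounds = nrounds)
}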

As such, you’ll need to specify the partials of the objective function as your grid argument. This will be your entry point for tuning many arguments as well, as you can manually specify any combination of hyperparameters that you’d like.

I realize this is sub-optimal, and means that you have to be careful in keeping track of which partial is which, though I don’t anticipate we’ll develop this interface further as this is an uncommon use case and would require some fundamental changes to tune. You can use the extracts, though, to be sure you’ve kept track of hyperparameters rigorously.
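
For instance, here's a minimal sketch of that idea. It uses the xgb_workflow, xgb_grid, and resamples objects from the reprex below, and it assumes the finalized objective ends up in the engine fit's $params, as in your xgb.train() example above; adjust the accessor if it lands somewhere else:

# sketch only: record which objective each configuration actually used
ctrl <- control_grid(extract = function(x) extract_fit_engine(x)$params$objective)

res_ex <- tune_grid(xgb_workflow, resamples = resamples, grid = xgb_grid, control = ctrl)

# each row of .extracts then carries the partial() of focal_loss used for that config
dplyr::select(res_ex, id, .extracts) %>% tidyr::unnest(.extracts)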

Mirroring your reprex:

library(tidymodels)

# focal_loss --------------------------------------------------------------
focal_loss <- function(preds, dtrain, alpha = 0.5, focal_gamma = 0) {
  labels <- getinfo(dtrain, "label")
  preds <- 1 / (1 + exp(-preds))

  p <- preds
  y <- labels

  grad <- (y * alpha * (1 - p)^focal_gamma * (focal_gamma * log(p) * p - (1 - p)) -
             (1 - y) * (1 - alpha) * p^focal_gamma * (focal_gamma * log(1 - p) * (1 - p) - p))

  du <- y * alpha * (1 - p)^(focal_gamma - 1) *
    (log(p) * (-focal_gamma^2 * p + (1 - p) * focal_gamma) + 2 * focal_gamma * (1 - p) + (1 - p))
  dv <- -(1 - y) * (1 - alpha) * p^(focal_gamma - 1) *
    (log(1 - p) * (focal_gamma^2 * (1 - p) - p * focal_gamma) - 2 * focal_gamma * p - p)

  hess <- (du + dv) * p * (1 - p)

  return(list(grad = grad, hess = hess))
}

# data setup ----------------------------------------------------------------
data <- two_class_dat %>% 
  rename(y = Class)

set.seed(100)

splits <- initial_split(data, prop = 0.8, strata = y)
train_data <- training(splits)
test_data <- testing(splits)
resamples <- vfold_cv(data = train_data, v = 5, strata = y)

# specifications ----------------------------------------------------
xgb_spec <- 
  boost_tree(mode = "classification", tree_depth = tune(), trees = tune()) %>% 
  # note that I just set `objective = tune()` here
  set_engine("xgboost", objective = tune())

xgb_recipe <- train_data %>% 
  recipe(y ~ .) |> 
  #step_mutate_at(all_numeric_predictors(), fn = list(orig = ~.)) %>%
  step_normalize(all_predictors(), -all_outcomes()) 

xgb_workflow <- 
  workflow() %>% 
  add_recipe(xgb_recipe) %>% 
  add_model(xgb_spec)

# grid setup ------------------------------------------------------------------
partial_grid <-
  expand_grid(
    alpha = seq(0, 1, length.out = 3),
    focal_gamma = seq(0, 5, length.out = 3)
  )

partial_grid$partials <-
  map2(
    partial_grid$alpha,
    partial_grid$focal_gamma,
    ~partial(focal_loss, alpha = !!.x, focal_gamma = !!.y)
  )

xgb_grid <-
  extract_parameter_set_dials(xgb_spec) %>%
  filter(id != "objective") %>%
  grid_latin_hypercube(size = 9) %>%
  bind_cols(partial_grid %>% select(objective = partials))

# tune! -----------------------------------------------------------------------
res <- tune_grid(xgb_workflow, resamples = resamples, grid = xgb_grid)
#> → A | error:   Error in xgb.iter.update(bst$handle, dtrain, ...
#> There were issues with some computations   A: x5
#> 
#> Warning: All models failed. Run `show_notes(.Last.tune.result)` for more
#> information.

Created on 2023-03-09 with reprex v2.0.2

That said, as you can see, there's an issue here: supplying partials of the function as part of a tibble column means that we need to wrap that collection of functions as a list. So, tune passes each possible value of objective as a one-element list containing the function, rather than the function itself:

Browse[2]> finalize_workflow_spec(workflow, iter_grid_model)
══ Workflow ══════════════════════════════════════════════════════════════════
Preprocessor: Recipe
Model: boost_tree()

── Preprocessor ──────────────────────────────────────────────────────────────
1 Recipe Step

• step_normalize()

── Model ─────────────────────────────────────────────────────────────────────
Boosted Tree Model Specification (classification)

Main Arguments:
  trees = 1886
  tree_depth = 2

Engine-Specific Arguments:
  objective = list(structure(function (...) {focal_loss <- function (preds,<snip>

Computational engine: xgboost 

and xgboost thus trips up by saying that it doesn't know what to do with a list for an objective function.
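
A minimal illustration of that wrapping, with a made-up one-row grid:

toy_grid <- tibble(objective = list(partial(focal_loss, alpha = 0.3, focal_gamma = 2)))
toy_grid$objective[1]   # a length-one list wrapping the function: what xgboost currently receives
toy_grid$objective[[1]] # the bare function: what xgboost actually needs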

This is where that change from objective = tune() to objective = list(...) happens:

workflow <- finalize_workflow_spec(workflow, iter_grid_model)

but that change might actually need to happen in workflows.

So, for now, this does not work, but we're on it. :)

@simonpcouch simonpcouch added the bug an unexpected problem or unintended behavior label Mar 9, 2023
@simonpcouch simonpcouch changed the title How can I tune the objective function of the xgboost model? tuning with list-columns in grid Mar 9, 2023

simonpcouch commented:

A possible fix in #633! See updated reprex there—you can install those changes with pak::pak("tidymodels/tune@625").

SHo-JANG commented Mar 10, 2023

pak::pak("tidymodels/tune@625")
#> Error: ! error in pak subprocess
#> Caused by error: 
#> ! Could not solve package dependencies:
#> * tidymodels/tune@625: ! pkgdepends resolution error for tidymodels/tune@625.
#> Caused by error: 
#> ! Can't find reference @625 in GitHub repo tidymodels/tune.

Created on 2023-03-10 with reprex v2.0.2

.Last.error
<callr_error/rlib_error_3_0/rlib_error/error>
Error:
! error in pak subprocess
Caused by error:
! Could not solve package dependencies:

  • tidymodels/tune@625: ! pkgdepends resolution error for tidymodels/tune@625.
    Caused by error:
    ! Can't find reference @625 in GitHub repo tidymodels/tune.

Backtrace:

  1. pak::pak("tidymodels/tune@625")
  2. pak::pkg_install(pkg, ...)
  3. pak:::remote(function(...) get("pkg_install_make_plan", asNamespace("pak"))(.…
  4. err$throw(res$error)

Subprocess backtrace:

  1. base::withCallingHandlers(cli_message = function(msg) { …
  2. get("pkg_install_make_plan", asNamespace("pak"))(...)
  3. prop$stop_for_solution_error()
  4. private$plan$stop_for_solve_error()
  5. pkgdepends:::pkgplan_stop_for_solve_error(self, private)
  6. base::throw(new_error("Could not solve package dependencies:\n", msg, …
  7. | base::signalCondition(cond)
  8. global (function (e) …

It doesn't work for me! Thank you for your reply.


simonpcouch commented:

Oh, whoops! That ref is pak::pak("tidymodels/tune@633") 😆


SHo-JANG commented:

It works with pak::pak("tidymodels/tune#633")! 🤣 Thank you very much.
Ultimately, though, I also want to be able to do Bayesian optimization.

I sincerely thank you for creating such a useful ecosystem. The more that can be customized, the better it will become. There are a few other features I didn't mention here that I think would be nice to have; I will organize them and open suggestions that make the case for why they are needed. I'm sorry to bring up other packages, but I hope tidymodels reaches the flexibility that the mlr3 ecosystem or Python has.

I still have a lot to learn in R, but I hope to be able to contribute to the tidymodels ecosystem one day. I'm sorry that I'm always asking for things; I will make good use of the feature you've improved, and I look forward to it being applicable to other models as well! Thank you very much.

(I don't speak English very well, so I am writing through a translator. I'm sorry if I've violated any etiquette.)


github-actions bot commented:

This issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with a reprex: https://reprex.tidyverse.org) and link to this issue.

@github-actions github-actions bot locked and limited conversation to collaborators Mar 30, 2023