The problem
I'm having trouble tuning xgboost parameters with tune_bayes(). Without tuning mtry the function works. Once mtry is added to the parameter set and finalized, I can tune with tune_grid() and random parameter selection without problems, but tune_bayes() throws an error.
Reproducible example
library(tidymodels)
doParallel::registerDoParallel()

xgboost_set <-
  parameters(bike_rf_wkfl) %>%
  update(mtry = finalize(mtry(), bike_training))
## or entered by hand
# xgboost_set <-
#   parameters(bike_rf_wkfl) %>%
#   update(mtry = mtry(c(2, 8)))
# this will work
bike_rf_initial <-
  bike_rf_wkfl %>%
  tune_grid(
    resamples = bike_folds,
    param_info = xgboost_set,
    metrics = bike_metrics,
    grid = 9
  )
# this will not work
bike_rf_rs <-
  bike_rf_wkfl %>%
  tune_bayes(
    initial = 9,
    resamples = bike_folds,
    param_info = xgboost_set,
    metrics = metric_set(mape, rsq)
  )
Created on 2021-11-18 by the reprex package (v2.0.1)
Error message for tune_bayes()
x Gaussian process model: Error in fit_gp(mean_stats %>% dplyr::select(-.iter), pset = param_info, : argument is missing, with no default
Error in eval(expr, p) : no loop for break/next, jumping to top level
x Optimization stopped prematurely; returning current results.
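For anyone trying to reproduce this, a quick sanity check (a sketch I ran mentally against the reprex above, not part of the original code) is to confirm that finalize() actually resolved mtry's range before tune_bayes() is called, to rule out an unknown bound as the cause:

```r
library(dials)
library(purrr)

# Assumes xgboost_set from the reprex above. parameters() returns a tibble
# with a list column `object`; has_unknowns() reports TRUE for any parameter
# whose range still contains "?" (i.e. was never finalized).
xgboost_set
map_lgl(xgboost_set$object, has_unknowns)
```

If every entry is FALSE, the parameter set is fully specified and the failure is happening inside the Gaussian process fit itself.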
My workflow looks like this:
bike_all <-
  read_csv("dane/train.csv", col_types = cols()) %>%
  select(-casual, -registered)

# Create data split object
bike_split <-
  initial_time_split(bike_all, prop = 0.80)

# Create the training data
bike_training <- bike_split %>%
  training()

# Create the test data
bike_testing <- bike_split %>%
  testing()

bike_recipe <- recipe(count ~ ., data = bike_training) %>%
  step_mutate(datetime_hr = as.factor(lubridate::hour(datetime))) %>%
  step_date(datetime, features = c("doy", "dow", "month", "year"), abbr = TRUE) %>%
  step_log(windspeed, base = 10, offset = 1) %>%
  update_role("datetime", new_role = "id_variable") %>%
  step_dummy(all_nominal(), -all_outcomes(), one_hot = TRUE)
bike_folds <-
  timetk::time_series_cv(
    bike_training,
    assess = "1 months",
    initial = "11 months",
    skip = "1 months",
    slice_limit = 5,
    cumulative = TRUE
  )
rf_model <- boost_tree(
  trees = 500,
  tree_depth = tune(),
  min_n = tune(),
  mtry = tune(),
  loss_reduction = tune(),
  sample_size = tune(),
  learn_rate = tune()
) %>%
  set_engine('xgboost',
             objective = 'count:poisson') %>%
  set_mode('regression')
# Create workflow
bike_rf_wkfl <-
  workflow() %>%
  # Add model
  add_model(rf_model) %>%
  # Add recipe
  add_recipe(bike_recipe)
# Create custom metrics function
bike_metrics <- metric_set(mape, rsq)
Created on 2021-11-18 by the reprex package (v2.0.1)
I am using the attached bike-rental dataset: train.csv