boost_tree() reverses 'event' class when converting to xgb.DMatrix #420


## The problem

I consistently get different results from `boost_tree()` and `xgb.train()` in classification mode. Looking through the source code, I noticed that `y` is converted from a factor to numeric via `y <- as.numeric(y) - 1`, which codes the first factor level as `0` and the second factor level as `1`. This is confusing because tidymodels treats the first factor level as the `event` class by default, but when the xgboost model is trained the second factor level is the one represented by `1`.
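To make the coding concrete, here is a minimal base-R sketch of what that conversion does to a two-level factor. The toy vector `y` and the variable name `label` are made up for illustration; only the `as.numeric(y) - 1` expression comes from the source code mentioned above.

```r
# Toy outcome with 'pos' releveled to be the first level, as in the reprex below
y <- factor(c("pos", "neg", "neg", "pos"), levels = c("pos", "neg"))

levels(y)
#> [1] "pos" "neg"

# as.numeric() on a factor returns the integer level codes (1 = first level),
# so subtracting 1 codes the first level as 0 and the second level as 1
label <- as.numeric(y) - 1
label
#> [1] 0 1 1 0

# Cross-tabulate to see which level ends up as the positive (1) label
table(y, label)
```

With this coding, xgboost's `binary:logistic` objective models the probability of `neg`, not `pos`.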

## Reproducible example

```r
library(tidyverse)
library(tidymodels)
library(mlbench)
library(xgboost)

data("PimaIndiansDiabetes")

set.seed(24)
df <- PimaIndiansDiabetes %>%
  mutate(diabetes = fct_relevel(diabetes, 'pos'))

xgb_model_1 <- 
  boost_tree(trees = 10,
             tree_depth = 3
             ) %>%
  set_engine('xgboost', 
             eval_metric = 'aucpr',
             verbose = 1) %>%
  set_mode('classification')


# The model treats 'neg' as the event class: converting the factor to numeric
# codes the first level ('pos', the event by tidymodels convention) as 0.

as.numeric(df$diabetes) - 1  # 'pos' -> 0, 'neg' -> 1
df$diabetes                  # levels: pos neg

xgb_model_1 %>%
  fit(diabetes ~ . , df)

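# Expected result: train xgboost directly with 'pos' coded as 1

# Predictors: every column except the outcome (the last column)
x <- as.matrix(df[, -ncol(df)])

# Label: first factor level ('pos') -> 1, second level ('neg') -> 0
y <- if_else(as.numeric(df$diabetes) == 2, 0, 1)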

xgbmat <- xgb.DMatrix(data = x, label = y)

set.seed(24)

xgboost::xgb.train(params = list(eta = 0.3, max_depth = 3, gamma = 0, 
    colsample_bytree = 1, min_child_weight = 1, subsample = 1), 
    data = xgbmat, nrounds = 10, watchlist = list('train' = xgbmat), verbose = 1, 
    objective = "binary:logistic", eval_metric = "aucpr", 
    nthread = 1)

```
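The label coding used in the expected result can be sketched as a small helper. Note that `first_level_as_event()` is a hypothetical name used only for illustration; it is not part of parsnip or xgboost, and this is one possible way to express the coding, not a proposed patch.

```r
# Hypothetical helper: code the first factor level as 1 (the event), the other as 0
first_level_as_event <- function(y) {
  stopifnot(is.factor(y), nlevels(y) == 2)
  as.numeric(y == levels(y)[1])
}

# Toy outcome with 'pos' as the first level
y_toy <- factor(c("pos", "neg", "neg", "pos"), levels = c("pos", "neg"))

first_level_as_event(y_toy)
#> [1] 1 0 0 1

# For a two-level factor this is the complement of the current coding
all(first_level_as_event(y_toy) == 1 - (as.numeric(y_toy) - 1))
#> [1] TRUE
```

For a two-level factor this gives the same labels as the `if_else()` call in the reprex above.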


