[R] Implement feature weights. #7660

Merged: 1 commit, Feb 16, 2022

trivialfis
Member

Add feature weights to R DMatrix. Please note that the "approx" tree method was rewritten very recently and doesn't support feature weights in released versions.

cc @hetong007 @tibshirani

Close #7657

@trivialfis trivialfis added this to 1.6 In Progress in 2.0 Roadmap via automation Feb 16, 2022
@trivialfis trivialfis merged commit 12949c6 into dmlc:master Feb 16, 2022
2.0 Roadmap automation moved this from 1.6 In Progress to 1.6 Done Feb 16, 2022
@trivialfis trivialfis deleted the r-feature-weights branch February 16, 2022 14:20
@tibshirani

I am a GitHub novice: have you added the feature weights? Is there a new version I can try? Thanks!

rob

@trivialfis
Member Author

Hi @tibshirani,

> have you added the feature weights

Yes, it's added.

> Is there a new version I can try

Not yet, unfortunately. We are still in the development phase and are not ready for a new major release. You need to install xgboost from source, assuming you have a recent R toolchain:

git clone --recursive https://github.com/dmlc/xgboost.git
cd xgboost/R-package/
R CMD INSTALL .

Please let us know if there's anything we can help with.
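
After installing from source, it can help to confirm which build R actually loads. The snippet below is a minimal sketch, not part of the PR: `has_feature_weights` is a hypothetical helper, and the `1.6.0` threshold is an assumption inferred from the "1.6" roadmap entries in this thread rather than a confirmed release number.

```r
# Hypothetical helper: TRUE when `version` is at least `minimum`.
# The "1.6.0" default is an assumption based on the roadmap entries in this
# thread, not a confirmed release number.
has_feature_weights <- function(version, minimum = "1.6.0") {
  package_version(version) >= package_version(minimum)
}

if (requireNamespace("xgboost", quietly = TRUE)) {
  installed <- as.character(utils::packageVersion("xgboost"))
  cat("xgboost", installed, "- feature weights expected:",
      has_feature_weights(installed), "\n")
} else {
  cat("xgboost is not installed\n")
}
```

If the check reports an older version, R is probably still picking up a previously installed copy of the package rather than the source build.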

@tibshirani

tibshirani commented Feb 16, 2022 via email

@tibshirani

tibshirani commented Feb 18, 2022 via email

@trivialfis
Member Author

Hi @tibshirani, I wrote a test for this PR. Can it serve as a starting point?

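library(testthat)
library(xgboost)

context("feature weights")

test_that("training with feature weights works", {
  nrows <- 1000
  ncols <- 9
  set.seed(2022)
  x <- matrix(rnorm(nrows * ncols), nrow = nrows)
  # Every feature contributes equally to the label, so differences in split
  # frequency should come mainly from the feature weights.
  y <- rowSums(x)
  # Feature i gets weight i, so later features carry larger weights.
  weights <- seq(from = 1, to = ncols)

  test <- function(tm) {
    names <- paste0("f", 1:ncols)
    xy <- xgb.DMatrix(data = x, label = y, feature_weights = weights)
    # Feature weights only matter when column sampling is enabled.
    params <- list(colsample_bynode = 0.4, tree_method = tm, nthread = 1)
    model <- xgb.train(params = params, data = xy, nrounds = 32)
    importance <- xgb.importance(model = model, feature_names = names)
    expect_equal(dim(importance), c(ncols, 4))
    # Sort rows by feature name (f1..f9) so they line up with the weights.
    importance <- importance[order(importance$Feature)]
    # The lowest-weighted feature should be split on less often than the
    # highest-weighted one.
    expect_lt(importance[1, Frequency], importance[ncols, Frequency])
  }

  # Run the same check for each tree method.
  for (tm in c("hist", "approx", "exact")) {
    test(tm)
  }
})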
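
The test expects `f1` to be used less often than `f9` because feature weights bias which columns are drawn when column sampling is enabled. The following base R sketch illustrates that idea with the same weights and sampling ratio as the test. It is only an illustration: `sample_features` is a hypothetical helper, and XGBoost's internal sampling procedure may differ from `sample.int()`.

```r
# Illustration only: weighted sampling of feature indices without replacement.
# `sample_features` is a hypothetical helper and is not part of xgboost.
sample_features <- function(weights, colsample) {
  n_features <- length(weights)
  # Number of columns to keep per draw, at least one.
  n_pick <- max(1L, as.integer(round(colsample * n_features)))
  sample.int(n_features, size = n_pick, replace = FALSE, prob = weights)
}

set.seed(2022)
weights <- seq(from = 1, to = 9)

# Draw a feature subset for many hypothetical tree nodes.
draws <- replicate(2000, sample_features(weights, colsample = 0.4))

# Count how often each feature index was selected.
counts <- tabulate(as.vector(draws), nbins = length(weights))
names(counts) <- paste0("f", seq_along(weights))
print(round(counts / sum(counts), 3))
```

Features with larger weights are drawn more often, so they get more chances to be split on. That is the pattern the `Frequency` comparison in the test checks for.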

@tibshirani

tibshirani commented Feb 18, 2022 via email

Successfully merging this pull request may close these issues.

Using/Adding feature_weights in R
3 participants