ranger now accepts weights #414

Closed
LluisRamon opened this issue Apr 18, 2016 · 4 comments

Comments

@LluisRamon LluisRamon commented Apr 18, 2016

Hi Max,

I have seen that the ranger package now accepts weights through the case.weights parameter. It is included in CRAN version 0.4.

If I pass case.weights through the dots (...) of train, I get an error. I created a custom model like the one below and it seems to work fine. When there are no case weights, ranger expects NULL, as train does, which is why I pass wts directly to ranger.

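library(caret)
library(ranger)

# Start from caret's built-in ranger model definition
rangerWeight <- getModelInfo("ranger")$ranger

# Replace the fit function so that the weights train supplies (wts) reach ranger
rangerWeight$fit <- function (x, y, wts, param, lev, last, classProbs, ...) 
{
  if (!is.data.frame(x)) 
    x <- as.data.frame(x)
  x$.outcome <- y
  # wts is NULL when train is called without weights, which ranger accepts
  out <- ranger(.outcome ~ ., data = x, mtry = param$mtry, 
                write.forest = TRUE, probability = classProbs, case.weights = wts, ...)
  if (!last) 
    out$y <- y
  out
}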

Not sure if this is a feature requesting weights in ranger or a bug when I use them inside dots.

If you need a reproducible example of the error or a pull request to method ranger I'll be happy to provide them.

Thank you very much,

@topepo topepo commented Apr 18, 2016

Just passing the weights through won't work, since they aren't the resampled version of the values (and their dimensions don't match the data). I've checked in a change that allows it:

> library(caret)
> 
> set.seed(1)
> dat <- twoClassSim(100)
> 
> set.seed(2)
> with_weights <- train(Class ~ ., data = dat, method = modelInfo, weights = (1:100)/100)
> set.seed(2)
> no_weights <- train(Class ~ ., data = dat, method = modelInfo)
> 
> with_weights
Random Forest 

100 samples
 15 predictor
  2 classes: 'Class1', 'Class2' 

No pre-processing
Resampling: Bootstrapped (25 reps) 
Summary of sample sizes: 100, 100, 100, 100, 100, 100, ... 
Resampling results across tuning parameters:

  mtry  Accuracy   Kappa    
   2    0.6906904  0.1830998
   8    0.7085037  0.2774630
  15    0.7004598  0.2775892

Accuracy was used to select the optimal model using  the largest value.
The final value used for the model was mtry = 8. 
> no_weights
Random Forest 

100 samples
 15 predictor
  2 classes: 'Class1', 'Class2' 

No pre-processing
Resampling: Bootstrapped (25 reps) 
Summary of sample sizes: 100, 100, 100, 100, 100, 100, ... 
Resampling results across tuning parameters:

  mtry  Accuracy   Kappa    
   2    0.6870326  0.1888561
   8    0.7105888  0.2957754
  15    0.7121800  0.3166938

Accuracy was used to select the optimal model using  the largest value.
The final value used for the model was mtry = 15. 
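
The point about resampled weights can be sketched in base R. This is only an illustration, not caret's internal code: the data frame, the 90-row resample, and the use of lm as a stand-in weighted model are all assumptions made for the example. The idea is that the weight vector has to be subset with the same row indices as the data before each fit.

```r
# Illustration in base R only; this is not caret's internal code.
set.seed(1)
n   <- 100
dat <- data.frame(x = rnorm(n), y = rnorm(n))
wts <- (1:n) / n                     # one weight per row of the full data

# Hypothetical resample: 90 of the 100 rows, as in one fold of 10-fold CV
idx <- sample.int(n, size = 90)

dat_resampled <- dat[idx, , drop = FALSE]
wts_resampled <- wts[idx]            # weights follow the same rows

length(wts)                          # 100: no longer matches the 90 resampled rows
length(wts_resampled)                # 90: aligned with dat_resampled

# Any weighted fit on the resample needs the subsetted weights
fit <- lm(y ~ x, data = dat_resampled, weights = wts_resampled)
```

Passing the full-length vector through the dots would hand every resample the same 100 weights, which is presumably why the weights argument of train has to be used instead.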

@LluisRamon LluisRamon commented Apr 18, 2016

Hi Max,

Now I get why passing weights directly didn't work. Thanks for the explanation.

I have seen in commit ed14146 that the ranger method now accepts case weights, so I am closing the issue.

Thank you very much.

@njain007 njain007 commented Oct 24, 2019

Hi, sorry, I didn't follow the previous explanation. I am very new to R. Could you tell me how to fix my code below? The weights range from 1 to 4000 and are not normalised. The data come from a survey. I am getting this error: "Error in rangerCpp(treetype, dependent.variable.name, data.final, variable.names, :
Not compatible with requested type: [type=character; target=double]." Thanks.

hyper_grid <- expand.grid(
  mtry = seq(10, 310, by = 50),
  node_size = seq(3, 9, by = 2),
  #sampe_size = c(0.55, 0.632, 0.70, 0.80),
  OOB_RMSE = 0
)

for (i in 1:nrow(hyper_grid)) {

  # train model
  model <- ranger(
    formula = CS4_pvt ~ . - WT,
    case.weights = "WT",
    data = traindata1,
    num.trees = 1491,
    mtry = hyper_grid$mtry[i],
    min.node.size = hyper_grid$node_size[i],
    importance = "impurity",
    seed = 123456
  )

  # add OOB error to grid
  hyper_grid$OOB_RMSE[i] <- sqrt(model$prediction.error)
}

@njain007 njain007 commented Oct 29, 2019

> Hi Max,
>
> Now I get why passing weights directly didn't work. Thanks for the explanation.
>
> I have seen in commit ed14146 that the ranger method now accepts case weights, so I am closing the issue.
>
> Thank you very much.

Hi, could you please explain how you solved the issue with using case weights? Sorry, I didn't understand the explanation. Could you please help me resolve the error below? WT2 contains decimal values.

random_forest_govt2 <- ranger(CS4_govt ~ CS22 + CS23 + TA10A + Nchild_adult + Income_person
                              + RO3 + RO5 + COPC + HHEDUC + ED6 + CS10 + CS11 + CS12 + CS8
                              + CS5 + ID11 + ED7 + ID13,
                              data = IHDS_2011_vill2_govt2, case.weights = "WT2")
random_forest_govt2

I get an error - Error in rangerCpp(treetype, dependent.variable.name, data.final, variable.names, :
Not compatible with requested type: [type=character; target=double].
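
Judging from the message, a likely cause is that case.weights receives the character string "WT2" (or "WT"), whereas ranger documents case.weights as a numeric vector with one weight per training observation. A minimal sketch of passing the column itself rather than its name follows. It assumes the built-in iris data and a made-up WT column in place of the survey data, so the variable names here are hypothetical.

```r
library(ranger)

set.seed(42)
df <- iris
df$WT <- runif(nrow(df), min = 1, max = 4000)  # made-up, non-normalised weights

fit <- ranger(
  Species ~ Sepal.Length + Sepal.Width + Petal.Length + Petal.Width,
  data = df,
  case.weights = df$WT,   # numeric vector, not the string "WT"
  num.trees = 50,
  seed = 123
)
```

Because the predictors are listed explicitly in the formula, the WT column is used only for weighting and not as a feature. For the code above, the corresponding change would be case.weights = IHDS_2011_vill2_govt2$WT2, assuming WT2 is a numeric column of that data frame.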
