no applicable method for 'predict' applied to an object of class "ranger" #56

Open
quantumlinguist opened this issue Feb 13, 2023 · 12 comments

Comments

@quantumlinguist

I have been trying to use the fastshap package but I get this error:

task 1 failed - "no applicable method for 'predict' applied to an object of class "ranger""

If I do methods(predict), predict.ranger does appear in the list.
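
A possible explanation, offered as a sketch rather than a confirmed diagnosis: methods(predict) inspects the main R session, whereas a socket-based parallel worker is a fresh R process that has not loaded the same packages, so S3 methods registered in the main session may be missing there. The snippet below uses only the parallel package and checks whether a namespace is loaded on each worker. The base package "splines" is a stand-in chosen so the example runs without extra installs; substituting "ranger" is an assumption about how this issue would be checked.

```r
library(parallel)

cl <- makeCluster(2)  # two fresh socket (PSOCK) worker sessions

# Stand-in package: replace "splines" with "ranger" to check the case in this issue
loaded_before <- unlist(clusterEvalQ(cl, "splines" %in% loadedNamespaces()))

# Load the namespace on every worker; loading a namespace also registers its S3 methods
invisible(clusterEvalQ(cl, { loadNamespace("splines"); NULL }))
loaded_after <- unlist(clusterEvalQ(cl, "splines" %in% loadedNamespaces()))

stopCluster(cl)  # always release the workers

print(loaded_before)
print(loaded_after)
```

If the first vector contains FALSE for the package that provides the predict method, the workers cannot dispatch to it until the package is loaded there.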

@bgreenwell
Owner

Hi @quantumlinguist, if this is still troubling you, would you mind sharing a reproducible example for me to run on my end?

@viola-hilbert

viola-hilbert commented Jul 27, 2023

Hi @bgreenwell, I get the same error when running exactly the code from your vignette (fastshap.Rmd).

library(fastshap)
library(ranger)
library(AmesHousing)
library(doParallel)

ames <- as.data.frame(AmesHousing::make_ames())
X <- subset(ames, select = -Sale_Price)  # features only

# Fit a random forest
set.seed(102)
(rfo <- ranger(Sale_Price ~ ., data =  ames, write.forest = TRUE))


# Prediction wrapper
pfun <- function(object, newdata) {
  predict(object, data = newdata)$predictions
}


# With parallelism ---> error "task 1 failed - "no applicable method for 'predict' applied to an object of class "ranger""
registerDoParallel(cores = 12)  # use forking with 12 cores
set.seed(5038)
system.time({  # estimate run time
  ex.ames.par <- explain(rfo, X = X, pred_wrapper = pfun, nsim = 50, 
                         adjust = TRUE, parallel = TRUE)
})

When I run

# Without parallelism --> works
set.seed(1706)
system.time({  # estimate run time
  ex.ames.nonpar <- explain(rfo, X = X, pred_wrapper = pfun, nsim = 50,
                            adjust = TRUE)
})

it works, so I guess the issue must be related to the option parallel = TRUE?

I also used parallel::detectCores() to figure out that I only have 6 cores to use, but changing the above to registerDoParallel(cores = 6) did not solve the problem.

@brandongreenwell-8451

Hi @viola-hilbert, are you getting this error on a Windows machine?

@viola-hilbert

yes!

@brandongreenwell-8451

Instead of registerDoParallel(cores = 12) can you try the following:

cl <- makeCluster(4)
registerDoParallel(cl)

But do try this out on a much smaller sample for testing! For example, pass in newdata = X[1, ] to explain a single instance.

@hlboy2333

Hi @brandongreenwell-8451, I get the same error as @viola-hilbert. I also used the code from your vignette (fastshap.Rmd) and tried the method you suggested (cl <- makeCluster(4)), but the error (no applicable method for 'predict' applied to an object of class...) still occurs when I set parallel = TRUE.
After many attempts, I found that the parallel run sometimes works, but most of the time it fails with the same error.
Is there a solution at present? Thank you very much; I look forward to your reply.

@brandongreenwell-8451

Hi @hlboy2333, what type of OS are you running this on?

@hlboy2333

> Hi @hlboy2333, what type of OS are you running this on?

It is Windows 11, and I have 16 cores to use.

@brandongreenwell-8451

Thanks @hlboy2333, I'm having trouble reproducing the issue on my end. You may just need to pass "ranger" via the .packages argument as shown below. Can you try running this and see what you get?

library(fastshap)
library(ranger)
library(AmesHousing)
library(doParallel)

ames <- as.data.frame(AmesHousing::make_ames())[1:200, ]  # try with a sample
X <- subset(ames, select = -Sale_Price)  # features only

# Fit a random forest
set.seed(102)
(rfo <- ranger(Sale_Price ~ ., data =  ames, write.forest = TRUE))

# Prediction wrapper
pfun <- function(object, newdata) {
  predict(object, data = newdata)$predictions
}

cl <- makeCluster(4) # use 4 workers
registerDoParallel(cl) # register the parallel backend

system.time({  # estimate run time
  ex.ames.par <- explain(rfo, X = X, pred_wrapper = pfun, nsim = 5, 
                         adjust = TRUE, parallel = TRUE, .packages = "ranger")
})

stopCluster(cl)  # shut down the workers when finished

@hlboy2333

Thank you so much for your help, @brandongreenwell-8451. Your method worked, not only on the code you provided but also on my own data and models (it took a while to test, so please forgive the late reply). It solved a problem that had been bothering me for nearly a day.
I am curious why you use .packages. What does this parameter do?

@brandongreenwell-8451

Hi @hlboy2333, glad it works now. This is more a feature of the foreach package, which is used under the hood. You can read more about it in the associated vignette: https://cran.r-project.org/web/packages/foreach/vignettes/foreach.html.

You can pass additional arguments to foreach via the ... param in the call to explain(), as described in the help page. I think whether you need to pass packages depends on which type of parallel processing you're using in R (e.g., snow-like or multicore-like). The former starts fresh R worker processes over sockets rather than forking, and it is what's typically used on Windows, where forking is unavailable; with it, you often have to pass in packages, etc. if the function you're running requires them.
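
To illustrate what .packages does on its own, here is a minimal sketch that calls foreach directly with a socket cluster. It is not taken from fastshap's internals. The base package "splines" is a stand-in for "ranger" so the example needs nothing beyond foreach and doParallel, and the loop body simply reports whether the package's namespace is available on the worker.

```r
library(foreach)
library(doParallel)  # also attaches parallel, which provides makeCluster()

cl <- makeCluster(2)    # socket (snow-like) workers: fresh R sessions
registerDoParallel(cl)  # register them as the foreach backend

# .packages asks foreach to load the listed packages on each worker
# before the loop body is evaluated there.
res <- foreach(i = 1:2, .combine = c, .packages = "splines") %dopar% {
  "splines" %in% loadedNamespaces()
}

stopCluster(cl)  # release the workers
print(res)
```

Passing .packages = "ranger" through explain() relies on the same mechanism, since explain() forwards extra arguments to foreach.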

@brandongreenwell-8451

I'll leave this issue open until I can generalize the vignette example to be more system agnostic.


5 participants