
Multinomial classification with tidymodels and #TidyTuesday volcano eruptions | Julia Silge #57

utterances-bot opened this issue Dec 14, 2021 · 12 comments

@utterances-bot

Multinomial classification with tidymodels and #TidyTuesday volcano eruptions | Julia Silge

Lately I’ve been publishing screencasts demonstrating how to use the tidymodels framework, from first steps in modeling to how to evaluate complex models. Today’s screencast demonstrates how to implement multiclass or multinomial classification using this week’s #TidyTuesday dataset on volcanoes.

https://juliasilge.com/blog/multinomial-volcano-eruptions/


CelMcC commented Dec 14, 2021

Thank you so much Dr Silge, this is exactly what I've been hunting for!

@conlelevn

Thanks Julia,

In this model I didn't see you tune any hyperparameters for the random forest model; is there a specific reason for that? In practice, do you usually see a significant difference in model performance before and after tuning?

@juliasilge (Owner)

@conlelevn Random forest models tend to perform pretty well without tuning, as long as you use "enough" trees (like 1000 or so). You can tune a random forest if you want to eke out a little more performance; I demonstrate how to do that here but typically you don't see a ton of dramatic improvement (unlike when you tune an xgboost model).
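
For reference, a rough sketch of the difference between the two approaches in parsnip; volcano_rec and volcano_folds below are placeholder names, not objects from the post:

library(tidymodels)

## Untuned spec: rely on "enough" trees plus the default mtry/min_n
rf_spec <- rand_forest(trees = 1000) %>%
  set_mode("classification") %>%
  set_engine("ranger")

## Tunable spec: mark mtry and min_n for tuning instead
rf_tune_spec <- rand_forest(trees = 1000, mtry = tune(), min_n = tune()) %>%
  set_mode("classification") %>%
  set_engine("ranger")

rf_tune_wf <- workflow() %>%
  add_recipe(volcano_rec) %>%   ## placeholder recipe
  add_model(rf_tune_spec)

set.seed(123)
rf_tune_res <- tune_grid(rf_tune_wf, resamples = volcano_folds, grid = 11)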

@Wenyu1024

Hi Julia,

Many thanks for the wonderful blog! In your example you show how a recipe works on the training data as a whole (since you don't tune hyperparameters). I am wondering if you can shed some light on how recipe preprocessing is applied within the resampling object used for parameter tuning.

For example, given a nested_cv process where each training set from the outer loop is used to generate a resampling object for hyperparameter tuning, how can I confirm that the upsampling is working properly, i.e. applied only to the analysis sets and not the assessment sets?

@juliasilge (Owner)

@Wenyu1024 You can read about how preprocessing works over resamples (in the context of parallel processing) in this section of Tidy Modeling with R; note the difference between parallel_over = "resamples" and parallel_over = "everything". If you are tuning in serial, it will, as expected, preprocess then fit for the resamples sequentially.

If you are using a nested resampling scheme, then you will need to set some of this up yourself, as outlined here.
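
As a rough way to sanity-check the upsampling on a single resample (a sketch only; volcano_folds and volcano_type are placeholder names, and themis::step_upsample() keeps its default skip = TRUE):

library(tidymodels)
library(themis)

## Pull one split out of a (placeholder) set of resamples
split <- volcano_folds$splits[[1]]

rec <- recipe(volcano_type ~ ., data = analysis(split)) %>%
  step_upsample(volcano_type)   ## skip = TRUE by default

prepped <- prep(rec, training = analysis(split))

## The upsampled rows show up in the data the recipe was trained on ...
bake(prepped, new_data = NULL) %>% count(volcano_type)

## ... but not when baking the assessment set, because the step is skipped there
bake(prepped, new_data = assessment(split)) %>% count(volcano_type)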


aousabdo commented Dec 8, 2022

Very useful post as always, Dr. Silge. I have learned a lot about tidymodels from your posts. Thank you very much!

@smithhelen

Hello Julia.
I was wondering how to use the vip "permute" method discussed here (koalaverse/vip#131) with multiple classes, like the volcano data. Is it possible with metric = "mauc" and then somehow specifying the pred_fun to average over the classes; or would I need to use probability = FALSE and metric = "accuracy"; or something else entirely?
Many thanks :-)

@juliasilge (Owner)

@smithhelen Hmmm, I'm not sure. Can you create a small reprex (a minimal reproducible example) for this? The goal of a reprex is to make it easier for folks to recreate your problem so that we can understand it and/or fix it. If you've never heard of a reprex before, you may want to start with the tidyverse.org help page. Once you have a reprex, I recommend posting on RStudio Community, which is a great forum for getting help with these kinds of modeling questions. Thanks! 🙌


smithhelen commented Mar 30, 2023

Thank you Julia

I'll make my question a bit clearer :-)

In this volcano example you generate vi scores using the inbuilt importance="permutation" option via set_engine. Even though a probability forest (rather than a classification forest) is grown, these vi scores are measured from the change in classification accuracy (as per the ranger documentation).

In a different example, using bivariate data, you generate vi scores using the method = "permute" option in the vip package and do not specify importance = "permutation" within set_engine. Now, for a probability forest, the vi scores are calculated with the AUC metric (metric = "auc"), versus metric = "accuracy" for a classification forest (i.e. when set_engine(..., probability = FALSE)). For the AUC method, a reference class needs to be specified for both the pred_wrapper and vi().

Here is your code for the bivariate data, where you choose the reference class to be "One" (i.e. $.pred_One and reference_class = "One"):

pred_fun <- function(object, newdata) {
  predict(object, new_data = newdata, type = "prob")$.pred_One
}
 
ranger_fit %>%
  vi(method = "permute", target = "Class", metric = "auc", nsim = 10,
     pred_wrapper = pred_fun, train = bivariate_train, reference_class = "One")

An advantage of using the vip approach is that multiple simulations can be run and a boxplot produced.

My questions:

  1. Is it possible to use the vip (method = "permute", ...) approach to calculate vi scores when there are more than two classes (as for the volcano example)? If so, what would the reference_class be?
  2. If 1. is not possible, is it sensible to grow a classification forest and use vip with metric = "accuracy" instead?

Thank you!

@juliasilge (Owner)

Ah OK @smithhelen, I don't know how/if the vip package "permute" method works for multinomial classification (although you can ask over at the vip GH repo, so maybe they can clarify). You probably want to use something like DALEX instead; you can read more about using DALEX with tidymodels here.
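
A rough sketch of what that might look like with DALEXtra's explain_tidymodels(); rf_fit, volcano_train, and volcano_type are placeholder names, and loss_cross_entropy() is just one possible loss for multiclass probability predictions:

library(DALEXtra)   ## provides explain_tidymodels() and loads DALEX

explainer <- explain_tidymodels(
  rf_fit,                                            ## placeholder fitted workflow
  data  = dplyr::select(volcano_train, -volcano_type),
  y     = volcano_train$volcano_type,
  label = "random forest"
)

set.seed(345)
vip_multi <- DALEX::model_parts(
  explainer,
  loss_function = DALEX::loss_cross_entropy,
  type = "variable_importance"
)

plot(vip_multi)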

@smithhelen

smithhelen commented Apr 3, 2023 via email

@bgreenwell

bgreenwell commented May 8, 2023

@juliasilge and @smithhelen sorry I'm late to the party. Starting work on the next version of vip now. In short, permutation importance works the same way for multiclass problems as it does for the binary and regression cases. In fact, the vip, iml, and ingredients (the DALEX package for variable importance) packages are all flexible enough to support ANY type of model; even ones built in Python. You just need to supply a suitable metric function and a corresponding prediction wrapper. Here's a somewhat minimal example using a multiclass random forest and the Brier score metric via yardstick:

library(ranger)

set.seed(1028)
rfo <- ranger(Species ~ ., data = iris, probability = TRUE)

p <- predict(rfo, data = iris)$predictions

head(p)
#         setosa   versicolor    virginica
# [1,] 1.0000000 0.0000000000 0.0000000000
# [2,] 0.9963333 0.0030000000 0.0006666667
# [3,] 1.0000000 0.0000000000 0.0000000000
# [4,] 1.0000000 0.0000000000 0.0000000000
# [5,] 1.0000000 0.0000000000 0.0000000000
# [6,] 0.9994286 0.0005714286 0.0000000000

# Multiclass Brier score
yardstick::brier_class_vec(iris$Species, estimate = p)

# Prediction wrapper; to use the multiclass Brier score, it needs to return a
# matrix of predicted probabilities
pfun <- function(object, newdata) {
  predict(object, data = newdata)$predictions
}

# Metric function; just a thin wrapper around yardstick's Brier score function
mfun <- function(actual, predicted) {
  yardstick::brier_class_vec(actual, estimate = predicted)
}

# Compute permutation importance
vi_permute(
  rfo, 
  train = iris, 
  target = "Species", 
  metric = mfun, 
  pred_wrapper = pfun,  # tells vip how to get predictions from this model
  smaller_is_better = TRUE,  # vip has no idea if smaller or larger is better
  nsim = 10
)
# # A tibble: 4 × 3
#   Variable     Importance    StDev
#   <chr>             <dbl>    <dbl>
# 1 Sepal.Length    0.00867 0.00103 
# 2 Sepal.Width     0.00223 0.000552
# 3 Petal.Length    0.149   0.00885 
# 4 Petal.Width     0.171   0.00798 

# Same, but with sorted output
vi(
  rfo, 
  method = "permute",
  train = iris, 
  target = "Species", 
  metric = mfun, 
  pred_wrapper = pfun,  # tells vip how to get predictions from this model
  smaller_is_better = TRUE,  # vip has no idea if smaller or larger is better
  nsim = 10
)
# # A tibble: 4 × 3
#   Variable     Importance    StDev
#   <chr>             <dbl>    <dbl>
# 1 Petal.Width     0.178   0.0116  
# 2 Petal.Length    0.151   0.0120  
# 3 Sepal.Length    0.00921 0.000942
# 4 Sepal.Width     0.00232 0.000662

Note that I am working to incorporate yardstick into the package to make this a bit easier, so you won't have to write your own metric function each time (but that's where the flexibility comes in). Also, I wrote vip with scale in mind, and it's seemingly faster than the alternatives, so keep that in mind. A simple benchmark can be found in our R Journal article (Figure 16). It's also parallelizable via the foreach package for larger problems.
