Fastshap with tree augmented naive bayes of caret (tan) #44

Open
PARODBE opened this issue Jun 23, 2022 · 7 comments

@PARODBE

PARODBE commented Jun 23, 2022

Hi,

Is it possible to use this library with a tan model? In my tan model the features are categorical (their values are strings).

Thanks!

@bgreenwell
Owner

Hi @PARODBE, can you provide some info as to what a “tan” model is and what R package is used to fit it?

@PARODBE
Author

PARODBE commented Jul 10, 2022

Of course! https://github.com/topepo/caret/blob/master/models/files/tan.R

This model can be trained on categorical features whose values are strings (the model is used to look up conditional probabilities). You can take any dataset before dummy-variable conversion and run fastshap on it as a test. In my case it doesn't work, but it is quite possible that I am doing something wrong, since I am much more comfortable in Python.

Thank you!

@bgreenwell
Owner

Gotcha @PARODBE, and thanks for the link. If you’d be kind enough to post a small reproducible example, I’d be happy to take a look!!

@PARODBE
Author

PARODBE commented Jul 11, 2022

Hi again,

Use any dataset with categorical (string) data: for example, one variable for the kind of animal (dogs, birds, cats, ...), another for size (high, medium, little, etc.), and an output variable such as cute / not cute. Then you can build your model like this (it is only an example):

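library(caret)  # provides trainControl(), twoClassSummary, train()

set.seed(666)

# Repeated 5-fold cross-validation, scored on ROC via class probabilities
fitControl <- trainControl(method = "repeatedcv",
                           number = 5, repeats = 50,
                           classProbs = TRUE,
                           summaryFunction = twoClassSummary,
                           verboseIter = FALSE)

# Tuning grid over the smoothing parameter and the score type
tune.grid <- expand.grid(smooth = 10^seq(-1, 2, 0.2),
                         score = c('bic', 'aic'))

alldata.tan <- caret::train(x, y,
                            method = "tan",
                            trControl = fitControl,
                            tuneGrid = tune.grid,
                            metric = "ROC",
                            maximize = TRUE)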
I am tuning the hyperparameters smooth and score.

That gives you the tan model to pass to fastshap. This is what I am doing:

p_function_G <- function(object, newdata)
  caret::predict.train(alldata.tan,
                       newdata = x,
                       type = "prob")[, "Positive"] # select G class

shap_values_G <- fastshap::explain(alldata.tan,
                                   X = x,
                                   pred_wrapper = p_function_G,
                                   nsim = 2,
                                   # select examples corresponding to category G from
                                   # the trainset used for building the model (not shown)
                                   adjust = FALSE)

But I get nothing back, so I think I am doing something wrong...
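
One possible cause, offered as a guess rather than a confirmed diagnosis: the wrapper above ignores its `object` and `newdata` arguments and always predicts on the global `x`. Sampling-based Shapley estimators compare predictions on perturbed copies of the data, so a wrapper that returns the same vector whatever it is given would make every difference zero. The sketch below shows that effect with base R only. The toy data, the `glm` stand-in model, and the names `bad_wrapper` and `good_wrapper` are all assumptions made for illustration; it does not fit a tan model or call fastshap.

```r
set.seed(666)

# Toy categorical data standing in for the real x and y (hypothetical)
n <- 200
x <- data.frame(
  animal = factor(sample(c("dog", "bird", "cat"), n, replace = TRUE)),
  size   = factor(sample(c("high", "medium", "little"), n, replace = TRUE))
)
y <- rbinom(n, 1, ifelse(x$animal == "dog", 0.8, 0.3))

# Stand-in model: a logistic regression, NOT a tan model
fit <- glm(y ~ animal + size, data = cbind(x, y = y), family = binomial)

# Same shape as the posted wrapper: ignores both arguments, uses globals
bad_wrapper <- function(object, newdata) {
  predict(fit, newdata = x, type = "response")
}

# Uses the arguments it is handed
good_wrapper <- function(object, newdata) {
  predict(object, newdata = newdata, type = "response")
}

# Perturb one feature, roughly as a permutation-based explainer would
x_perm <- x
x_perm$animal <- sample(x_perm$animal)

bad_diff  <- bad_wrapper(fit, x_perm)  - bad_wrapper(fit, x)
good_diff <- good_wrapper(fit, x_perm) - good_wrapper(fit, x)

print(all(bad_diff == 0))   # TRUE: the bad wrapper never sees the perturbation
print(any(good_diff != 0))
```

If this is indeed the problem, the analogous change in the wrapper above would be to call `predict(object, newdata = newdata, type = "prob")[, "Positive"]` instead of predicting on `alldata.tan` and `x` directly.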

@PARODBE
Author

PARODBE commented Jul 15, 2022

Have you been able to find out anything about this?

@bgreenwell
Owner

Hi @PARODBE, I have not found the time yet, but if you have a reprex I could run on my end, it would make it a lot easier to narrow down the issue and help solve your problem.

@PARODBE
Author

PARODBE commented Jul 19, 2022

Hi!

I've created a repository on my GitHub for this, with a totally artificial dataset, but it should be enough for trying fastshap: https://github.com/PARODBE/bnlearn_r_playing/tree/main

All best,
Pablo
