Integer variables converted to double during LIME #32

Closed
fennovj opened this issue Sep 15, 2017 · 0 comments

fennovj commented Sep 15, 2017

Below is my attempt at a minimal reproducible example. I tried a few caret methods and none of them failed. It does fail, however, with the ctree classifier from the party library, which strictly rejects a test set that contains doubles where the training data had integers.

library(magrittr)  # provides %>%, used in predict.myclass below

# Wrap the ctree model in a custom S3 class so lime can dispatch on it.
makeGeneric <- function(ctreemodel){
  return(structure(list(ctreemodel), class = "myclass"))
}

# Predict method that lime calls; returns one probability column per class.
predict.myclass <- function(model, newdata, type="prob", ...){
  stopifnot(type == "prob")
  predict(model[[1]], newdata, type = "prob") %>% data.frame %>% t %>% 
    data.frame("false" = 1 - ., "true" = .)
}

# Tell lime that this wrapper is a classifier.
model_type.myclass <- function(x, ...) "classification"

FT <- read.csv("https://raw.githubusercontent.com/vincentarelbundock/Rdatasets/master/csv/datasets/Titanic.csv")
FT <- na.omit(FT)
FT$Age <- as.integer(FT$Age)  # the error disappears without this line

# Train on everything except the first row, which is explained below.
ctreemodel <- party::ctree(Survived ~ PClass + Sex + Age, FT[-1,])
genericModel <- makeGeneric(ctreemodel) 

explainer <- lime::lime(FT[-1,], genericModel)
explanation <- lime::explain(FT[1,], explainer, n_labels = 1, n_features = 2) 

As you can see, I define a class that wraps the ctree model for prediction. The data.frame("false" = 1 - ., "true" = .) step only works for binary classification, but that is not the issue here, since it is easy to extend to multiclass classification. party then throws the following error:

 Error in checkData(oldData, RET) : 
  Classes of new data do not match original data 

Note that this error does not occur when I comment out the FT$Age <- as.integer(FT$Age) line.
I called browser() inside predict.myclass, and it turned out that the newdata passed by lime had its integer variable replaced by a double. The predict(model[[1]], newdata, type = "prob") call then fails, because party expects Age to be an integer, but lime has somehow converted it to a double.
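
A possible workaround until this is fixed is to coerce newdata back to the training column types before it reaches party. The sketch below is only an illustration under assumptions: restore_column_types is a hypothetical helper of my own (not part of lime or party), the small data frames are made up for the demonstration, and rounding before as.integer() is my choice and may not suit every variable.

```r
# Hypothetical helper (not part of lime or party): coerce the columns of
# `newdata` back to the storage types found in a reference data frame.
restore_column_types <- function(newdata, reference) {
  shared <- intersect(names(newdata), names(reference))
  for (col in shared) {
    ref <- reference[[col]]
    if (is.integer(ref) && !is.integer(newdata[[col]])) {
      # Round first so that a value such as 28.7 does not truncate to 28.
      newdata[[col]] <- as.integer(round(newdata[[col]]))
    } else if (is.factor(ref) && !is.factor(newdata[[col]])) {
      # Reuse the reference levels so the factor matches the training data.
      newdata[[col]] <- factor(newdata[[col]], levels = levels(ref))
    }
  }
  newdata
}

# Made-up stand-ins for the training data and for perturbed data whose
# integer column has become a double.
train <- data.frame(Age = c(29L, 2L, 30L),
                    Sex = factor(c("female", "female", "male")))
perturbed <- data.frame(Age = c(28.7, 3.2),
                        Sex = c("male", "female"),
                        stringsAsFactors = FALSE)

fixed <- restore_column_types(perturbed, train)
print(sapply(fixed, class))
```

In the example above, this would mean keeping the training data alongside the model in the "myclass" wrapper and calling restore_column_types(newdata, training_data) at the top of predict.myclass, before handing newdata to party.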
