CP observation original value isn't always apparent in variable splits #124

Closed
hbaniecki opened this issue Jul 24, 2020 · 5 comments
Labels: fixed (already fixed), invalid ❕ (This doesn't seem right)

Comments


hbaniecki commented Jul 24, 2020

Added the include_new_observation=True parameter in the Python implementation; it adds the observation's variable values to the variable splits.


pbiecek commented Jul 28, 2020

ceteris_paribus now has an argument, variable_splits_with_obs, which adds the values from new_observations to variable_splits.

The default behaviour is unchanged!

Example before/after

library(DALEX)
library(randomForest)
library(ingredients)

model_titanic_rf <- randomForest(survived ~ gender + age + fare,
                                 data = na.omit(titanic_imputed))
explain_titanic_rf <- explain(model_titanic_rf,
                              data = titanic_imputed[, -8],
                              y = titanic_imputed[, 8],
                              verbose = FALSE)

# before: default variable splits
cp1 <- ceteris_paribus(explain_titanic_rf, titanic_imputed[c(1, 2, 3, 198), ], grid_points = 5)
plot(cp1) +
  show_observations(cp1)

# after: observation values are added to the variable splits
cp1 <- ceteris_paribus(explain_titanic_rf, titanic_imputed[c(1, 2, 3, 198), ], grid_points = 5, variable_splits_with_obs = TRUE)
plot(cp1) +
  show_observations(cp1)
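
To illustrate the idea behind variable_splits_with_obs, here is a minimal base-R sketch of merging observation values into a grid of split points. It is not the ingredients implementation: the helper add_obs_to_splits, the toy data, and the quantile-based grid are all assumptions made for illustration.

```r
# Hypothetical helper (not part of ingredients): add the observed values
# of each variable to that variable's grid of split points.
add_obs_to_splits <- function(splits, observations) {
  # splits: named list of numeric grid points, one element per variable
  # observations: data frame with the observations being explained
  for (v in names(splits)) {
    splits[[v]] <- sort(unique(c(splits[[v]], observations[[v]])))
  }
  splits
}

# Toy data, invented for this sketch
data <- data.frame(age = c(2, 18, 30, 47, 74), fare = c(0, 7, 15, 30, 512))
obs  <- data.frame(age = c(42, 13), fare = c(7, 20))

# Assumed grid: 5 quantile-based points per variable
splits <- lapply(data, function(x) {
  unname(quantile(x, probs = seq(0, 1, length.out = 5)))
})

splits_with_obs <- add_obs_to_splits(splits, obs)
print(splits_with_obs)
```

With a coarse grid such as grid_points = 5, an observation's own value usually falls between grid points, so a profile evaluated only on the grid need not pass through the observation. Adding the observed values to the splits makes that point part of the profile.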

pbiecek added the fixed label Jul 28, 2020

hbaniecki commented Jul 29, 2020

This yields an error:

cp1 <- ceteris_paribus(explain_titanic_rf, titanic_imputed[c(1, 2,3, 198),], grid_points=5, variable_splits_with_obs = TRUE) 
plot(cp1) +
  show_observations(cp1, variable_type='categorical')

Also, can we add drwhy colors to the new errorbar CP?

cp1 <- ceteris_paribus(explain_titanic_rf, titanic_imputed[c(1,2,3, 198),], grid_points=5) 
plot(cp1, variable_type='categorical') 


pbiecek commented Jul 29, 2020

IMHO the

plot(cp1) +
  show_observations(cp1, variable_type='categorical')

should not work, since plot(cp1) plots profiles for continuous variables while show_observations is asked for categorical ones.


pbiecek commented Jul 29, 2020

I just noticed that none of the CP profile versions (bars, profiles, steps) uses drwhy colors.
I will change this in the next version.

thanks

@hbaniecki

My bad, this works (besides the drwhy palette):

library(DALEX)
library(randomForest)
library(ingredients)

model_titanic_rf <- randomForest(survived ~ gender + age + fare,
                                 data = na.omit(titanic_imputed))
explain_titanic_rf <- explain(model_titanic_rf,
                              data = titanic_imputed[,c(1,2,5)],
                              y = titanic_imputed[,8],
                              verbose = FALSE)

cp1 <- ceteris_paribus(explain_titanic_rf, titanic_imputed[c(1,2,3, 198),], grid_points=5) 

plot(cp1, variable_type='categorical') 

Thanks
