CP observation original value isn't always apparent in variable splits #124

Closed
hbaniecki opened this issue Jul 24, 2020 · 5 comments

hbaniecki (Member) commented Jul 24, 2020

Added the include_new_observation=True parameter in the Python implementation, which adds the observation's variable values to the variable splits.

@pbiecek pbiecek self-assigned this Jul 28, 2020
pbiecek added a commit that referenced this issue Jul 28, 2020
pbiecek (Member) commented Jul 28, 2020

ceteris_paribus now has an argument variable_splits_with_obs, which adds values from new_observation to variable_splits.

The default behaviour is unchanged!

Example before/after

 library(DALEX)
 library(randomForest)
 library(ingredients)

 # Fit a random forest on three predictors and wrap it in an explainer
 model_titanic_rf <- randomForest(survived ~ gender + age + fare,
                                  data = na.omit(titanic_imputed))
 explain_titanic_rf <- explain(model_titanic_rf,
                               data = titanic_imputed[, -8],
                               y = titanic_imputed[, 8],
                               verbose = FALSE)

 # Before: default splits only
 cp1 <- ceteris_paribus(explain_titanic_rf, titanic_imputed[c(1, 2, 3, 198), ], grid_points = 5)
 plot(cp1) +
   show_observations(cp1)

 # After: values of the selected observations are added to the splits
 cp1 <- ceteris_paribus(explain_titanic_rf, titanic_imputed[c(1, 2, 3, 198), ], grid_points = 5, variable_splits_with_obs = TRUE)
 plot(cp1) +
   show_observations(cp1)
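
For readers who want to see what "adding observation values to the splits" means for a single numeric variable, here is a minimal base-R sketch. The helper make_splits, its arguments, and the toy age vector are hypothetical and only illustrate the idea; the quantile-based grid is an assumption and may differ from how ingredients builds its splits internally.

```r
# Hypothetical helper (not part of ingredients): build split points for one
# numeric variable and optionally merge in the values of the observations.
make_splits <- function(values, observed, grid_points = 5, with_obs = FALSE) {
  # Assumed grid: evenly spaced quantiles of the variable
  probs <- seq(0, 1, length.out = grid_points)
  splits <- unique(unname(quantile(values, probs = probs, na.rm = TRUE)))
  if (with_obs) {
    # Add the observed values so each profile passes through its own point
    splits <- sort(unique(c(splits, observed)))
  }
  splits
}

age <- c(2, 18, 25, 31, 40, 47, 63, 74)  # toy data, not titanic_imputed
observed_age <- c(42, 13)                # values of the explained observations

grid_default  <- make_splits(age, observed_age)
grid_with_obs <- make_splits(age, observed_age, with_obs = TRUE)

print(grid_default)
print(grid_with_obs)
```

With the default grid, the observed values generally fall between split points, so the marked observation can sit off the plotted profile. Merging them into the grid guarantees a split at each observed value.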
@pbiecek pbiecek added the fixed label Jul 28, 2020
hbaniecki (Member, Author) commented Jul 29, 2020

This yields an error:

cp1 <- ceteris_paribus(explain_titanic_rf, titanic_imputed[c(1, 2,3, 198),], grid_points=5, variable_splits_with_obs = TRUE) 
plot(cp1) +
  show_observations(cp1, variable_type='categorical')

Also, can we add drwhy colors to the new errorbar CP?

cp1 <- ceteris_paribus(explain_titanic_rf, titanic_imputed[c(1,2,3, 198),], grid_points=5) 
plot(cp1, variable_type='categorical') 
pbiecek (Member) commented Jul 29, 2020

IMHO the

plot(cp1) +
  show_observations(cp1, variable_type='categorical')

should not work, as plot(cp1) plots profiles for continuous variables while show_observations is asked for categorical ones.

pbiecek (Member) commented Jul 29, 2020

I just noticed that none of the CP profile versions (bars, profiles, steps) uses drwhy colors.
I will change this in the next version.

thanks

hbaniecki (Member, Author) commented Jul 29, 2020

My bad, this works (besides the drwhy palette):

library(DALEX)
library(randomForest)
library(ingredients)

model_titanic_rf <- randomForest(survived ~ gender + age + fare,
                                 data = na.omit(titanic_imputed))
explain_titanic_rf <- explain(model_titanic_rf,
                              data = titanic_imputed[,c(1,2,5)],
                              y = titanic_imputed[,8],
                              verbose = FALSE)

cp1 <- ceteris_paribus(explain_titanic_rf, titanic_imputed[c(1,2,3, 198),], grid_points=5) 

plot(cp1, variable_type='categorical') 

Thanks

@hbaniecki hbaniecki closed this Jul 29, 2020