I found a pragmatic solution: run with 100, 500, 1,000, and 10,000 background points and compare the resulting SHAP summary plots. In my case, 100 performed almost the same as 10k. A few features that were adjacent in the importance ranking swapped places, but their mean SHAP values were very similar, so that doesn't concern me.
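The same comparison can also be made numerically instead of by eye. Below is a minimal sketch, assuming you already have one vector of mean absolute SHAP values per run (for example from `np.abs(shap_values).mean(axis=0)`). The helper names `ranking` and `compare_runs` and the toy numbers are hypothetical and only illustrate the idea; they are not results from the experiments described here.

```python
def ranking(importances):
    """Return feature indices ordered from most to least important."""
    return sorted(range(len(importances)), key=lambda i: -importances[i])


def compare_runs(imp_small, imp_large):
    """Compare mean |SHAP| vectors from a small and a large background set.

    Returns the ranking positions whose feature differs between the two runs,
    and the largest per-feature relative difference in mean |SHAP|
    (assumes every value in imp_large is non-zero).
    """
    rank_small = ranking(imp_small)
    rank_large = ranking(imp_large)
    changed = [pos for pos, (a, b) in enumerate(zip(rank_small, rank_large)) if a != b]
    max_rel_diff = max(abs(s - l) / l for s, l in zip(imp_small, imp_large))
    return changed, max_rel_diff


# Made-up mean |SHAP| values for four features, for illustration only.
imp_100 = [0.40, 0.21, 0.20, 0.05]   # run with 100 background points
imp_10k = [0.41, 0.20, 0.21, 0.05]   # run with 10,000 background points

changed_positions, max_rel_diff = compare_runs(imp_100, imp_10k)
print("ranking positions that differ:", changed_positions)
print("largest relative difference:", round(max_rel_diff, 3))
```

If the only positions that change hold features with nearly equal mean |SHAP| and the relative differences are small, the smaller background set is probably good enough for a summary plot.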

For reference, I have about 20 features and had ~25k samples in my explanatory set when running these experiments. Here are the approximate runtimes for each background set size (the relationship is roughly linear):

  • 100: 8.5 min
  • 500: 33 min
  • 1,000: 69 min
  • 10,000: 648 min
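To check the linear relationship against the runtimes listed above, a quick calculation helps. This is a rough sketch: fitting a line through the origin is an assumption (it ignores any fixed startup cost), and `estimate_minutes` is a hypothetical helper, so treat extrapolations as ballpark figures for this particular setup only.

```python
# Runtimes reported above: background set size -> minutes.
runtimes_min = {100: 8.5, 500: 33, 1_000: 69, 10_000: 648}

# Minutes per 100 background points; roughly constant values indicate linear scaling.
per_100 = {n: t / (n / 100) for n, t in runtimes_min.items()}

# Least-squares slope of a line through the origin: minutes per background point.
slope = sum(n * t for n, t in runtimes_min.items()) / sum(n * n for n in runtimes_min)


def estimate_minutes(n_background):
    """Rough runtime estimate for a given background set size."""
    return slope * n_background


for n, rate in per_100.items():
    print(f"{n:>6} background points: {rate:.2f} min per 100 points")
print(f"fitted slope: {slope:.4f} min per background point")
```

The per-100 rate stays between about 6.5 and 8.5 minutes across a 100x range of sizes, which is consistent with roughly linear scaling. The slightly higher rate at 100 points suggests some fixed overhead that the through-origin fit does not model.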

Doing the same thing with the number of explanatory samples will …

Answer selected by jaxondk