We are considering Shapley Value Analysis for the interpretability of our models. One crucial aspect is determining the optimal visualization plots for Shapley values and deciding the number of features to analyze.
Besides the waterfall plot, summary plot, and beeswarm plot (shared: 902c5a9), additional visualizations such as force plots and heatmaps may provide further insights into the models.
Also, we initially started with 22 features, which is few enough to analyze them all. This will become more challenging as more features are integrated. Should we consider analyzing only the top 10 or 15?
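To make the "top 10/15" idea concrete, here is a minimal sketch of one common way to rank features: by mean absolute Shapley value across samples. The helper name `top_k_features` and the synthetic matrix are assumptions for illustration only; in practice the matrix would come from the SHAP explainer applied to our models.

```python
import numpy as np


def top_k_features(shap_values, feature_names, k=10):
    """Return the k features with the largest mean |SHAP| value, highest first."""
    shap_values = np.asarray(shap_values, dtype=float)
    # Global importance of each feature: average magnitude over all samples.
    importance = np.abs(shap_values).mean(axis=0)
    # Indices sorted by decreasing importance, truncated to k.
    order = np.argsort(importance)[::-1][:k]
    return [(feature_names[i], float(importance[i])) for i in order]


# Synthetic stand-in for a (n_samples, n_features) matrix of Shapley values.
# These numbers are made up; they are not results from our models.
rng = np.random.default_rng(0)
n_samples, n_features = 50, 22
fake_shap = rng.normal(size=(n_samples, n_features)) * np.arange(1, n_features + 1)
names = [f"feature_{j}" for j in range(n_features)]

top10 = top_k_features(fake_shap, names, k=10)
for name, score in top10:
    print(f"{name}: {score:.3f}")
```

The SHAP plotting functions such as `shap.plots.beeswarm` and `shap.plots.bar` also accept a `max_display` argument, so truncating the plots themselves may not require any custom code.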
I am quite happy with the default visualization capabilities of the SHAP library. I've done some extra work in this direction (for example, using tiles for a 2D visualization of Shapley values) that we can recycle, but this is just "cosmetics". I think the default Shapley plots are already quite informative. Another question is whether we can find ways to map this information onto chemicals; this may be useful. Let's discuss in the meeting.
As for the increasing number of features, I completely agree. We need to limit the number of features for interpretation. In my opinion, feature selection to restrict to, say, up to 100 features for Shapley analysis would make sense. I've never tried this package, but it looks good: https://github.com/AutoViML/featurewiz. In any case, as always, let's first start with a good old k-best feature selector (e.g. 100-best) and take it from there.
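For reference, a k-best selector along these lines could look like the sketch below, using scikit-learn's `SelectKBest`. The synthetic dataset, the regression setting, and the `f_regression` scoring function are assumptions for illustration; for a classification task, `f_classif` or `mutual_info_classif` would be the analogous choices.

```python
from sklearn.datasets import make_regression
from sklearn.feature_selection import SelectKBest, f_regression

# Synthetic data standing in for our descriptors (assumption: regression task).
X, y = make_regression(
    n_samples=200, n_features=500, n_informative=20, random_state=0
)

# Keep the 100 features with the highest univariate F-scores.
selector = SelectKBest(score_func=f_regression, k=100)
X_selected = selector.fit_transform(X, y)

# Column indices of the retained features, useful for labelling SHAP plots later.
selected_idx = selector.get_support(indices=True)

print(X_selected.shape)  # (200, 100)
```

The model would then be trained on `X_selected`, and the Shapley analysis run on those 100 columns only.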
@miquelduranfrigola, your input and expertise would be greatly appreciated.