Review #2
The following peer review was solicited as part of the Distill review process.
The reviewer chose to remain anonymous. Distill offers reviewers a choice between anonymous review and offering reviews under their name. Non-anonymous review allows reviewers to get credit for the service they offer to the community.
Several feature attribution methods rely on an additional input (besides the one being explained) called the “baseline”. The paper discusses how the choice of baseline impacts the attributions for an input, and proposes averaging over several baselines when good individual choices do not exist. It does this in the context of a specific attribution method, “Integrated Gradients”, and a specific task, object recognition on the ImageNet dataset.
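As a rough illustration of the averaging idea summarized above, here is a minimal NumPy sketch. The toy model `f`, its weights `W`, the uniform baseline distribution, and all helper names are assumptions made for illustration; none of them come from the paper, which works with ImageNet classifiers.

```python
import numpy as np

# Hypothetical toy model (not from the paper): f(x) = sigmoid(W . x)
W = np.array([1.0, -2.0, 0.5])

def f(x):
    """Scalar model output for a single input vector."""
    return 1.0 / (1.0 + np.exp(-np.dot(W, x)))

def grad_f(x):
    """Analytic gradient of f with respect to x."""
    s = f(x)
    return s * (1.0 - s) * W

def integrated_gradients(x, baseline, steps=200):
    """Approximate integrated gradients along the straight line
    from `baseline` to `x` using a midpoint Riemann sum."""
    alphas = (np.arange(steps) + 0.5) / steps            # midpoints in (0, 1)
    path = baseline + alphas[:, None] * (x - baseline)   # shape (steps, d)
    grads = np.stack([grad_f(p) for p in path])          # shape (steps, d)
    return (x - baseline) * grads.mean(axis=0)

def averaged_attributions(x, baselines, steps=200):
    """Average the integrated-gradients attributions over several baselines."""
    return np.mean([integrated_gradients(x, b, steps) for b in baselines], axis=0)

x = np.array([1.0, 0.5, -1.0])
rng = np.random.default_rng(0)
# Assumed baseline distribution: uniform noise in [-1, 1]
baselines = rng.uniform(-1.0, 1.0, size=(16, 3))

attr = averaged_attributions(x, baselines)
print("averaged attributions:", attr)
print("sum of attributions:  ", attr.sum())
print("f(x) - mean f(x'):    ", f(x) - np.mean([f(b) for b in baselines]))
```

In this sketch the averaged attributions should sum, up to discretization error, to the model output at `x` minus the mean output over the baselines, which is one quick sanity check on any implementation of this kind.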
Below are some suggestions on improving / extending this paper:
Distill employs a reviewer worksheet to help reviewers.
The first three parts of this worksheet ask reviewers to rate a submission along certain dimensions on a scale from 1 to 5. While the scale meaning is consistently "higher is better", please read the explanations for our expectations for each score—we do not expect even exceptionally good papers to receive a perfect score in every category, and expect most papers to be around a 3 in most categories.
Any concerns or conflicts of interest that you are aware of?: No known conflicts of interest
Thank you for the detailed comments! Based on your feedback, we’ve made some changes to the article and added several new sections. In particular:
This is an important point. In our first version of the article, I think we presented some issues regarding integrated gradients in a way that made them seem like flaws in the original method, rather than design choices. Our most recent writing attempts to address this by presenting a more nuanced picture of each baseline choice, and especially by shifting the discussion of problems to be about the baseline choice rather than the integrated gradients method itself. I am open to further suggestions about how to improve in this direction.
We considered for a long time adding an expanded discussion of the axioms that integrated gradients satisfies, and ended up omitting it from our most recent draft. We feel that an extended discussion of those axioms would detract from the main point of the article, which was intended to focus on the idea of missingness. With that said, we added a footnote about how all of the various baselines we present, including those that are distributions, satisfy the same axioms integrated gradients does.
The idea of seeing which baselines generate which types of patterns in the attributions is a really interesting open question, and one we are particularly interested in thinking about. We leave it to future work :)
Related to the point above: I think there are many ways to expand the discussion around path methods and many ways to improve the method. Again, for the sake of trying to limit the scope of the article, we will leave them to future work! I do think that the questions you raise are very compelling.
For the sake of scope, we don’t include additional data types in this article, especially since they would require significant additional work to visualize. I do agree with you, though: the idea of averaging over multiple baselines should be fairly general, and I would hope to see future work in this direction.
This is another really good point that we don’t directly address in the article. I fear that doing so would open up a large can of worms about whether or not you can trust attributions that are generated by a stochastic process (I believe you can). However, I am interested in this question as well and hope to pursue it in the future.
I’ve put a fair bit of thought into this and I can’t quite convince myself that it is true. As long as the baseline has a lower network output than the explained input, doesn’t the sign retain its meaning? That is, as long as f(x) - f(x’) > 0, I think a positive attribution means an increase in output, because the output increases as we move along the path from x’ to x. I would have to formalize this intuition and run experiments to be sure.
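One possible starting point for formalizing this intuition, assuming the standard definition of integrated gradients with a straight-line path from x’ to x (the symbol \phi_i for the attribution of feature i is notation introduced here for illustration):

```latex
\[
  \phi_i(x; x') \;=\; (x_i - x'_i) \int_0^1
    \frac{\partial f\bigl(x' + \alpha (x - x')\bigr)}{\partial x_i} \, d\alpha
\]
\[
  \sum_i \phi_i(x; x') \;=\; f(x) - f(x')
\]
```

The second identity (completeness) fixes only the sum of the attributions. On its own, f(x) - f(x’) > 0 therefore says the attributions are positive in aggregate, not that each individual attribution is positive, so the meaning of an individual sign is the part that would still need to be checked.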
In general, we somewhat dodge the issue of sign in this article. I know that it’s a large omission, but it just doesn’t fit in with the rest of the article. A discussion about the sign of attributions for path methods is a much needed discussion, but I can’t find a way to elegantly include it here.
Aha! There is! Based on this feedback, our new section “Expectations, and Connections to SmoothGrad” discusses this in detail.
I hope that our new version addresses some of your concerns, especially the concerns regarding the mis-characterization of the original integrated gradients method. I feel this is an important issue and I don’t want to portray integrated gradients in an unnecessarily negative light.