Anonymous Review 1 #1
The following peer review was solicited as part of the Distill review process. Some points in this review were clarified by an editor after consulting the reviewer.
The reviewer chose to remain anonymous. Distill offers reviewers a choice between anonymous review and offering reviews under their name. Non-anonymous review allows reviewers to get credit for the service they offer to the community.
Distill is grateful to the reviewer for taking the time to review this article.
Conflicts of Interest: Reviewer disclosed no conflicts of interest.
The images shown in this paper are truly fascinating, and provide an interesting and useful manner to visualize the behavior of neurons in a neural network. Overall, this article is well written and contains many useful results, but in many areas omits background material, which makes it difficult to follow exactly what is occurring. Once clarified, this article would be highly useful to those not already familiar with the area.
There seems to be a short background paragraph that's missing from the introduction. This article immediately jumps into talking about feature visualization through optimization, but skips some important questions. What network is being used? On what dataset? Similarly, I presume that "conv2d0", "mixed3a", etc are layers of a network. If I look it up, I can see that it's Inception, but it would be good to state this explicitly. Similarly, what does "mixed4b,c" mean? Similarly, some figures are not clearly explained. In the figure talking about different objectives, what do x, y, z, and n represent? Is the layer_n the same n as the softmax[n]? (I assume not.)
I was caught off guard that after spending the majority of the paper describing how optimization can be used to produce these figures, they then say that directly optimizing for the objectives doesn't work. It might have been nice to mention this fact earlier, and just forward reference it -- if someone were to stop reading halfway through they would just think that by performing direct optimization they'd be golden. The next section does survey regularization techniques (but even then, none of the regularized figures look nearly as nice as the ones prior). This seems to be the most important part of the paper, but I feel like I get the fewest details about how this is done. It also leaves me wondering which regularization methods were used to make the earlier figures.
When discussing preconditioning, I get the feeling that this is an important aspect of generating high-quality images, but I don't actually know what is happening. How is something spatially decorrelated? What is done to minimize over this space? Similarly, what does "Let’s compare the direction of steepest descent in a decorelated parameterization to two other directions of steepest descent" mean? I would expect there is only one steepest direction. How do you pick two other steepest ones that aren't the same? What does "compare" mean, and how do you compare to two others? This sentence seems important, but I don't understand what it is trying to say. (CSS issue: footnotes 7 and 8 do not display in Chrome.) On the whole, this section could be better explained.
Minor comments to author:
Thank you for your high-quality, in-depth feedback! We went through every sentence of it and have made numerous changes to the article based on the review you provided. These can collectively be found in pull request #5 and are mentioned on a per-commit basis in this response.
We are especially grateful both for the insightful critique as well as the hints about sections that may be hard to understand—we do not just want to be factually correct, but also approachable.
We use GoogLeNet trained on ImageNet.
We added additional captioning on the hero diagram mentioning both the model and the dataset it was trained on. We also expanded the confusingly named
We have changed the class index from
We added a paragraph linking to the "The Enemy of Feature Visualization" section directly after the first diagram. We also explicitly list the challenges with feature visualization at the end of the introduction from 8ed5819 on.
We try to make this clearer in multiple areas:
On reviewing the section on preconditioning, we agree that we were trying to be very general, potentially at the expense of concreteness and approachability. We rewrote the section in 29e8b19 to be more explicit about how this technique works when applied to images. We also added additional footnotes going into more detail on the derivation of these techniques.
In 29e8b19, we rewrote this section and simplified the diagram to explain how the Fourier Transform induces a different metric, under which the direction of steepest descent differs from the regular (L2) gradient.
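To make the point about "more than one steepest direction" concrete, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the article's implementation: the 1-D signal, the toy quadratic objective, the 1/f frequency scaling, and all function names are hypothetical. If the image is parameterized as x = A θ for a linear map A, a plain gradient step on θ moves the pixels along -A Aᵀ ∇f(x) rather than along -∇f(x). Both are directions of steepest descent, just measured under different metrics.

```python
import numpy as np

def frequency_scale(n):
    # Hypothetical 1/f scaling: low frequencies get larger weights.
    freqs = np.abs(np.fft.fftfreq(n))          # cycles per sample, in [0, 0.5]
    return 1.0 / np.maximum(freqs, 1.0 / n)    # cap the weight at zero frequency

def to_pixels(theta, scale):
    # The linear map x = A @ theta, applied as a per-frequency rescaling.
    # A is real and symmetric because `scale` is real and even in frequency.
    return np.fft.ifft(scale * np.fft.fft(theta)).real

def induced_pixel_step(pixel_grad, scale):
    # A @ A.T @ g: the pixel-space change caused by a gradient step on theta.
    return np.fft.ifft(scale ** 2 * np.fft.fft(pixel_grad)).real

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

n = 64
rng = np.random.default_rng(0)
target = rng.normal(size=n)
x = np.zeros(n)
scale = frequency_scale(n)

# Toy objective f(x) = 0.5 * ||x - target||^2, whose pixel gradient is x - target.
g = x - target

l2_direction = -g                                      # steepest descent under the L2 metric
decorrelated_direction = -induced_pixel_step(g, scale) # steepest descent under the induced metric

print("cosine between the two directions:",
      cosine(decorrelated_direction, l2_direction))
```

Both vectors decrease the objective, since A Aᵀ is positive definite, but they point in different directions: the rescaled parameterization favors low-frequency changes. With an unscaled, unitary Fourier transform the two directions would coincide, so in this sketch the difference comes entirely from the frequency weighting.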
We have reworded those sections while keeping their intent to show that these areas offer ample opportunity for further work in 29e8b19.
That is right. We reordered this sentence to be more clear in 29e8b19.
We realize this kind of praise is unusual in an academic context. We are making a deliberate choice to do it because we think it fosters a healthy and collegial atmosphere. We think Nguyen, Yosinski, and their collaborators have made truly outstanding contributions and some other parts of our article could be read as critiquing their work, so it seems especially important to make it clear that we value their work.
After your prompt we removed those sections in 1c583d2.
You are right—fixed in 94ddda1.
In this case, the colon referred to the following diagram rather than a continued sentence. However, we found we could reword it to be more clear in 91edce5.
Thank you again for your time and helpful comments! We think the article was significantly improved by incorporating your feedback. :-)