Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Review 2 - Anonymous #7

colah opened this issue Mar 16, 2020 · 0 comments

Review 2 - Anonymous #7

colah opened this issue Mar 16, 2020 · 0 comments


Copy link

@colah colah commented Mar 16, 2020

The following peer review was solicited as part of the Distill review process. The review was formatted by the editor to help with readability.

The reviewer chose to keep anonymity. Distill offers reviewers a choice between anonymous review and offering reviews under their name. Non-anonymous review allows reviewers to get credit for the service them offer to the community.

Distill is grateful to the reviewer  for taking the time to write such a thorough review.

General Comments

The idea of using the Grant Tour approach to high-dimensional visualization for working with neural network representations is very good. The mathematical underpinnings are clean and clever. Most of the illustrative interactive figures are very helpful.

There are a few issues with the paper, mostly due to the writing. For example, while the writeup of the math is overall convincing there are typos that confuse, e.g. including more terms in (dx, dy, ...) than are actually there. I also think that even in the technical explanation part, more care to use citations with linear algebra and neural networks literature would be helpful. Much of the linear algebra is core material, so instead of just citing a textbook, it would be helpful to include a few sentences of explanation rather than stating something as straightforward fact. Even more so with neural networks material throughout the paper - the audience for this work could be wide, so stating things about neural network internals without citation or explanation is damaging. One example is the idea that linear methods for projection are most appropriate because of the linear nature of the networks. This is not really straightforward and I'm not convinced but it is left unsupported. One more example is that a sentence about how network components become vector inputs to this system may widen the audience. Relatedly, the enthusiasm is a bit too high for academic material. Using words like amazing and including an exclamation point are not appropriate. Perhaps the most disconcerting was an assertion at the beginning that neural networks are now the default classifier. This is far from true for many reasons. They are slow and expensive to train and require lots of data, plus they are difficult to interpret, making them a poor choice for many machine learning applications. Not only is that assertion inaccurate, it will be galling to many potential readers.

The main confusion is actually about how the user is meant to take advantage of this technique and when. This seems to be about confusion over two separate things. First, the concept of moving axes vs moving selected groups of points. These distinctions are neatly identified in the math, but I do not see how this works with the user perspective. The case has been made for moving axes to get a good view, but when would the points be more appropriate? Would I have to know which ones to look at from some other source? There is some coverage of the different uses, but it needs to be organized better to lay it out from the user perspective. The other issue is that the idea about a continuum of projections is lovely in the mathematics, but isn't really practical for understanding how a user would take advantage. In all cases it seems like the someone really needs to know exactly what they're looking for already. I think breaking this down by use cases, again by user tasks, would really clarify what this system can do. Right now, I'm left with an interesting concept, and an implementation I believe in, but no clear vision for how this helps.

Some other comments -

  • 'distribution of data from softmax layers is spherical' - without citation? In this case it's just about orthogonality and makes sense, but that strong unclear statement weakens the paragraph.
  • typos, especially of the misconjugation variety, are very common
  • watch GN^{new} vs GT^'. They seem to have been used interchangeably.

Distill employs a reviewer worksheet as a help for reviewers.

The first three parts of this worksheet ask reviewers to rate a submission along certain dimensions on a scale from 1 to 5. While the scale meaning is consistently "higher is better", please read the explanations for our expectations for each score—we do not expect even exceptionally good papers to receive a perfect score in every category, and expect most papers to be around a 3 in most categories.

Any concerns or conflicts of interest that you are aware of?: n/a
What type of contributions does this article make?: Novel results

Advancing the Dialogue Score
How significant are these contributions? 4/5
Outstanding Communication Score
Article Structure 2/5
Writing Style 3/5
Diagram & Interface Style 4/5
Impact of diagrams / interfaces / tools for thought? 4/5
Readability 4/5
Scientific Correctness & Integrity Score
Are claims in the article well supported? 3/5
Does the article critically evaluate its limitations? How easily would a lay person understand them? 3/5
How easy would it be to replicate (or falsify) the results? 3 /5
Does the article cite relevant work? 3/5
Does the article exhibit strong intellectual honesty and scientific hygiene? 3/5
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
None yet

No branches or pull requests

2 participants