The following peer review was solicited as part of the Distill review process.
The reviewer chose to keep anonymity. Distill offers reviewers a choice between anonymous review and offering reviews under their name. Non-anonymous review allows reviewers to get credit for the service they offer to the community.
Distill is grateful to the reviewer for taking the time to review this article.
Conflicts of Interest: Reviewer disclosed a minor potential conflict of interest.
I think this is just OK.
I will list the things I like about it and the things I don't like; then I'll give my thoughts as I had them while reading.
Having read some other Distill articles, I can say that I don't think this one is currently as good as the one about attention (which it is spiritually similar to). I think it is much less good than the Why Momentum Really Works article, which in my mind is the model for how these articles should be. To be precise, what I mean by 'good' is: did this help me understand the topic in a way that I wouldn't have been able to without a lot of effort on my own?
Things I Like
I like that you point out that a bunch of different models are doing something that is basically a special case of this one idea. That's a valuable contribution that helps me understand the world better.
I like that you put effort into giving good descriptions of those models, though I think it's hard to really do a thorough job of this in the space you have.
I like the idea of task representations, though I felt way too little time was spent on it.
I think the presentation is mostly nice and the diagrams are clear, though there are quite a few typos.
Things I Don't Like
To me the most interesting property of these models is the one you describe near the end:
But after reading this article I don't feel like I understand this phenomenon any better!
IMO, way too much time is spent on the examples and not enough time is spent on understanding why these layers work the way that they do. This feels more like cataloguing than explaining to me.
Detailed Comments on Text
I like this diagram and I like that there is a footnote that deals with my obvious question about conv-nets.
Something of a non-sequitur.
Suggest moving this up to before the diagram.
Can batch norm itself be described using this FiLM terminology?
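To make the question above concrete, here is a minimal sketch (my own illustration, not from the article) of how batch norm's learned affine step can be read as a FiLM transformation: normalize the features, then apply a per-feature scale (gamma) and shift (beta). In FiLM these parameters would be predicted from a conditioning input; in batch norm they are learned constants.

```python
import numpy as np

def film(x, gamma, beta):
    # Feature-wise linear modulation: scale and shift each feature.
    return gamma * x + beta

# Batch norm at inference: normalize with batch statistics, then apply
# the learned per-feature scale and shift. The second step is exactly a
# FiLM transformation whose parameters are learned constants rather than
# the output of a conditioning network. (Toy values for illustration.)
x = np.array([[1.0, 2.0], [3.0, 4.0]])
mean, var = x.mean(axis=0), x.var(axis=0)
x_hat = (x - mean) / np.sqrt(var + 1e-5)
gamma, beta = np.array([2.0, 0.5]), np.array([1.0, -1.0])
out = film(x_hat, gamma, beta)
```

Under this reading, batch norm is the degenerate case of FiLM where the modulation is unconditional, which is the kind of connection I'd want the article to spell out.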
I actually don't see why you need to contrast FiLM with attention - to me they are obviously different?
It doesn't seem like you'd lose anything by just calling this a multidimensional array...
At this point I'm not sure this is helping me understand things better?
I like all of these examples of where FiLM is being used in the sense that
I also have to say that as a reader I got a bit fatigued reading about all the examples.
OK as I continue reading I'm more convinced that the list of examples is too long in its current form.
The section on properties of the trained models is too short, I think.
Idea - chunk up an artist's life into decades or something and see if they get more separated.
Here's a thing I would like more discussion about: