
Is MLP Mixer essentially convolution in disguise? #735

Answered by rwightman
amaarora asked this question in Q&A

@amaarora I think it's clear from the Twitter convo by folks who've been at this much longer than I have that it's a touchy subject that boils down to semantics.

You can replace fully connected/dense layers with 1x1 convs pretty much anywhere, but you cannot replace a 1x1 conv with a dense layer if you expect/need the output spatial dim to vary with the input. Some CNN impls have used a 1x1 conv instead of an FC/linear layer for a long time. Convolution is clearly a more flexible operation than a dense/FC (matmul). If you reduced the form of every operation in a net to its simplest (assuming you don't need spatial output dims that vary with input, but have fixed singleton spatial dims), then all 1x1 would en…
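
A minimal sketch of the equivalence being described, assuming PyTorch (names and shapes here are illustrative, not from the thread): a `nn.Conv2d` with `kernel_size=1` computes the same per-position linear map as a `nn.Linear`, but also handles varying spatial input sizes.

```python
import torch
import torch.nn as nn

# Illustrative sketch: a dense layer and a 1x1 conv with the same weights
# produce identical outputs when applied per spatial position.
torch.manual_seed(0)

in_ch, out_ch = 8, 16
fc = nn.Linear(in_ch, out_ch)

# Build a 1x1 conv carrying the dense layer's weights.
conv = nn.Conv2d(in_ch, out_ch, kernel_size=1)
with torch.no_grad():
    conv.weight.copy_(fc.weight.view(out_ch, in_ch, 1, 1))
    conv.bias.copy_(fc.bias)

x = torch.randn(2, in_ch, 7, 7)  # N, C, H, W -- any spatial size works

# The dense layer expects channels last, so apply it per spatial position.
y_fc = fc(x.permute(0, 2, 3, 1)).permute(0, 3, 1, 2)
y_conv = conv(x)

print(torch.allclose(y_fc, y_conv, atol=1e-6))  # True
```

Note the asymmetry the answer points out: the 1x1 conv above runs unchanged on a `9x13` input, while the dense-layer path only works because we manually broadcast it over the spatial dims.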

Answer selected by amaarora