
Could patches number != MLP token mixing dimension? #4

Closed
LouiValley opened this issue Nov 26, 2021 · 2 comments
@LouiValley
I tried to change the model into a B/16 MLP-Mixer.
In this setting, the patch number (sequence length) != the MLP token-mixing dimension.
But the code reports an error at `x = layers.Add()([x, token_mixing])`, because the two operands have different shapes.

For example, with B/16 settings:
image 32×32, hidden dimension 768, P×P = 16×16, token-mixing MLP dimension = 384, channel MLP dimension = 3072.
Thus the patch number (sequence length) = (32/16)² = 4, and the token table has shape (4, 768).
When the code runs `x = layers.Add()([x, token_mixing])` in the token-mixing layer,
x shape = [4, 768], while token_mixing shape = [384, 768].

It is strange, then, that the MLP-Mixer paper can set these independently, i.e. patch number (sequence length) != MLP token-mixing dimension.
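For what it's worth, here is a minimal NumPy sketch (my own stand-in `dense` helper, not the repo's Keras code) of how the standard Mixer token-mixing block keeps the residual add shape-compatible for any token-mixing width: the block transposes, runs the MLP over the token axis, projects back to the number of patches, and transposes again.

```python
import numpy as np

rng = np.random.default_rng(0)

def dense(x, out_dim):
    # Stand-in for layers.Dense: project the last axis to out_dim.
    return x @ rng.standard_normal((x.shape[-1], out_dim))

def token_mixing(x, mlp_dim):
    # x: (num_patches, hidden_dim); mix along the token (patch) axis.
    num_patches = x.shape[0]
    y = x.T                    # (hidden_dim, num_patches)
    y = dense(y, mlp_dim)      # (hidden_dim, mlp_dim) -- any width works here
    y = np.maximum(y, 0)       # crude stand-in for GELU
    y = dense(y, num_patches)  # project back to the number of patches
    return y.T                 # (num_patches, hidden_dim) -- matches x again

x = rng.standard_normal((4, 768))  # 4 patches, hidden dim 768, as in the B/16 example
out = token_mixing(x, mlp_dim=384)
print((x + out).shape)             # residual add works: (4, 768)
```

Because the second projection targets `num_patches` rather than `mlp_dim`, the token-mixing width 384 never leaks into the output shape.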

@LouiValley (Author)

Hi, I just found an error in the code.
The `mlp_block` needs to be changed like this:

```python
def mlp_block(x, mlp_dim):
    y = layers.Dense(mlp_dim)(x)
    y = tf.nn.gelu(y)
    return layers.Dense(x.shape[-1])(y)
```

Notice: the block's output dimension needs to equal the last dimension of the initial input (`x.shape[-1]` of the unmodified input).
In the original version, `x` is reassigned before the final `Dense`, so `x.shape[-1]` there equals `mlp_dim` and the output dimension differs. I think the setting patch number = MLP token-mixing dimension made this problem easy to miss:

```python
def mlp_block(x, mlp_dim):
    x = layers.Dense(mlp_dim)(x)  # x is reassigned here
    x = tf.nn.gelu(x)
    return layers.Dense(x.shape[-1])(x)  # x.shape[-1] is now mlp_dim, not the input width
```

That is why the original paper notes: "Note that D_S is selected independently of the number of input patches."
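To make the difference concrete, here is a NumPy sketch (stand-in `dense` helpers, not the repo's actual Keras code) contrasting the two blocks on a transposed token tensor, where the last axis is the number of patches:

```python
import numpy as np

rng = np.random.default_rng(0)

def dense(x, out_dim):
    # Stand-in for layers.Dense: project the last axis to out_dim.
    return x @ rng.standard_normal((x.shape[-1], out_dim))

def mlp_block_original(x, mlp_dim):
    x = dense(x, mlp_dim)         # x reassigned: the input width is lost
    x = np.maximum(x, 0)
    return dense(x, x.shape[-1])  # x.shape[-1] == mlp_dim at this point

def mlp_block_fixed(x, mlp_dim):
    y = dense(x, mlp_dim)
    y = np.maximum(y, 0)
    return dense(y, x.shape[-1])  # x untouched: project back to the input width

x = rng.standard_normal((768, 4))        # transposed tokens: (hidden, patches)
print(mlp_block_original(x, 384).shape)  # (768, 384): the residual add would fail
print(mlp_block_fixed(x, 384).shape)     # (768, 4): the residual add works
```

The only change is keeping the intermediate activations in a separate variable so the final projection can see the untouched input width.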

@sayakpaul (Owner)

Thanks for your suggestions.
