About Output MLP head layer #13
Comments
This is just a design choice to add flexibility to the final output dimension, but you can remove it and see whether there is any performance drop. I'll close this issue if there are no other questions.
Thank you for the reply. That answer helped me a lot.
Hi @Jeff-Zilence, in the paper the outputs of the Transformer heads are concatenated and fed into an MLP projection to generate the global feature, but I could not find that operation in your code. Did I miss anything?
See the model file. It should be just linear layers.
Thanks, you mean here?
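For context, the operation discussed above (concatenating the Transformer head outputs and projecting them with a linear layer to form the global feature) can be sketched roughly as follows. This is a minimal NumPy illustration, not the repository's code; the dimensions (`num_heads`, `head_dim`, `out_dim`) are hypothetical placeholders, not the paper's actual configuration.

```python
import numpy as np

# Hypothetical dimensions -- the real values depend on the model config.
num_heads, head_dim, out_dim = 12, 64, 1000

rng = np.random.default_rng(0)
# Simulated per-head output features (e.g., the CLS output of each head).
head_feats = [rng.standard_normal(head_dim) for _ in range(num_heads)]

# Concatenate the head outputs into one vector ...
concat = np.concatenate(head_feats)      # shape: (num_heads * head_dim,)

# ... and project with a single linear layer (the "MLP head" being discussed).
W = rng.standard_normal((out_dim, concat.size)) * 0.01
b = np.zeros(out_dim)
global_feat = W @ concat + b             # shape: (out_dim,)

print(global_feat.shape)  # -> (1000,)
```

As the maintainer notes, the projection is just a linear layer; its main effect is decoupling the output feature dimension from the Transformer's internal width.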
Hello, @Jeff-Zilence
I'm wondering about the MLP head layer that is applied to the CLS token.
In BERT, the final MLP head layer is used for the classification task.
But for the CVGL task, why do you apply a final MLP head to the CLS token instead of using the CLS token itself as the output feature?
Does it make a difference whether the final MLP head is used or not?
I also wonder whether the task performance changes depending on the feature dimension of the MLP head.
Thank you so much for your reply :)