About Output MLP head layer #13
Comments
This is just a design choice to add flexibility to the final output dimension, but you can remove it and see whether there is any performance drop. I'll close this issue if there are no other questions.
Thank you for the reply. That answer helped me a lot.
Hi @Jeff-Zilence, in the paper the outputs of the Transformer heads are concatenated and fed into an MLP projection to generate the global feature, but I could not find that operation in your code. Did I miss anything?
See the model file. It should be just linear layers.
Thanks, you mean here?
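For context, the operation discussed above (concatenating the Transformer head outputs and projecting them with a linear layer to form the global feature) can be sketched roughly as follows. This is a minimal NumPy illustration, not the repository's code; the dimensions (`num_heads`, `head_dim`, `out_dim`) are hypothetical placeholders, not the paper's actual configuration.

```python
import numpy as np

# Hypothetical dimensions -- the real values depend on the model config.
num_heads, head_dim, out_dim = 12, 64, 1000

rng = np.random.default_rng(0)
# Simulated per-head output features (e.g., the CLS output of each head).
head_feats = [rng.standard_normal(head_dim) for _ in range(num_heads)]

# Concatenate the head outputs into one vector ...
concat = np.concatenate(head_feats)      # shape: (num_heads * head_dim,)

# ... and project with a single linear layer (the "MLP head" being discussed).
W = rng.standard_normal((out_dim, concat.size)) * 0.01
b = np.zeros(out_dim)
global_feat = W @ concat + b             # shape: (out_dim,)

print(global_feat.shape)  # -> (1000,)
```

As the maintainer notes, the projection is just a linear layer; its main effect is decoupling the output feature dimension from the Transformer's internal width.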
Hello, @Jeff-Zilence
I'm wondering about the MLP head layer that is applied to the CLS token.
In BERT, the final MLP head layer is used for the classification task.
But for the CVGL task, why do you apply a final MLP head to the CLS token instead of using the CLS token itself as the output feature?
Does it make a difference whether the final MLP head is used or not?
I also wonder whether the task performance changes depending on the feature dimension of the MLP head.
Thank you so much for your reply :)