Is it that LSRA combines multi-head attention and convolution in a multi-branch manner, while ConvBERT integrates convolution into the transformer blocks themselves?
If the answer is yes, what are the pros and cons of these two approaches? Do you have experiments?
LSRA is designed for machine translation and abstractive summarization. It combines dynamic convolution and multi-head attention in a two-branch manner.
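A rough PyTorch sketch of that two-branch layout (simplified for illustration, not the Lite Transformer code; the channel split and the plain depthwise convolution standing in for dynamic convolution are assumptions):

```python
import torch
import torch.nn as nn

class TwoBranchBlock(nn.Module):
    """Toy two-branch block: attention on one half of the channels, conv on the other."""
    def __init__(self, dim, num_heads=4, kernel_size=7):
        super().__init__()
        half = dim // 2
        # Branch 1: multi-head self-attention (long-range / global context).
        self.attn = nn.MultiheadAttention(half, num_heads, batch_first=True)
        # Branch 2: depthwise conv (short-range / local context); a stand-in for dynamic conv.
        self.conv = nn.Conv1d(half, half, kernel_size,
                              padding=kernel_size // 2, groups=half)
        self.proj = nn.Linear(dim, dim)

    def forward(self, x):                                   # x: (batch, seq_len, dim)
        a, c = x.chunk(2, dim=-1)                           # split channels between branches
        a, _ = self.attn(a, a, a)                           # global branch
        c = self.conv(c.transpose(1, 2)).transpose(1, 2)    # local branch
        return self.proj(torch.cat([a, c], dim=-1))         # merge the two branches

x = torch.randn(2, 16, 64)
print(TwoBranchBlock(64)(x).shape)                          # torch.Size([2, 16, 64])
```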
ConvBERT is a pre-training based model that can be fine-tuned on downstream tasks such as sentence classification. We also propose a novel span-based dynamic convolution operator and combine it with self-attention to form the mixed attention block.
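A simplified sketch of the span-based dynamic convolution idea (not the official ConvBERT implementation; the module and variable names, shapes, and the single shared kernel per position are assumptions). The convolution kernel applied at each position is generated from a local span of the input rather than a single token; in the mixed attention block this output would be concatenated with the self-attention output, which is omitted here:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SpanDynamicConv(nn.Module):
    """Toy span-based dynamic convolution: per-position kernels from a local span."""
    def __init__(self, dim, kernel_size=5):
        super().__init__()
        self.k = kernel_size
        self.query = nn.Linear(dim, dim)
        self.value = nn.Linear(dim, dim)
        # Depthwise conv summarises a local span, so the generated kernel
        # depends on a window of tokens rather than a single token.
        self.span_key = nn.Conv1d(dim, dim, kernel_size,
                                  padding=kernel_size // 2, groups=dim)
        self.kernel_gen = nn.Linear(dim, kernel_size)

    def forward(self, x):                                        # x: (B, T, D)
        q = self.query(x)
        v = self.value(x)
        span = self.span_key(x.transpose(1, 2)).transpose(1, 2)  # span summary, (B, T, D)
        # Per-position kernel generated from query * span summary, shared across channels.
        kernels = F.softmax(self.kernel_gen(q * span), dim=-1)   # (B, T, K)
        # Unfold values into local windows and apply the dynamic kernels.
        v = F.pad(v.transpose(1, 2), (self.k // 2, self.k // 2)) # (B, D, T + K - 1)
        windows = v.unfold(-1, self.k, 1)                        # (B, D, T, K)
        return torch.einsum('bdtk,btk->btd', windows, kernels)   # (B, T, D)

x = torch.randn(2, 16, 64)
print(SpanDynamicConv(64)(x).shape)                              # torch.Size([2, 16, 64])
```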
Experiments comparing span-based dynamic convolution with dynamic convolution can be found in Section 4.3, Table 2 of our paper.
There you can see that our span-based dynamic convolution outperforms dynamic convolution in this pre-training setting. It is hard, however, to directly compare LSRA with ConvBERT.
LSRA: Lite Transformer with Long-Short Range Attention.
LSRA also integrates convolution operations into transformer blocks; I'm just wondering what makes ConvBERT differ from LSRA.