
Question about span light conv #8

Closed
psy2013GitHub opened this issue Nov 23, 2020 · 1 comment

Comments

@psy2013GitHub

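Hello, I have a question about span light conv. Since tf.layers.separable_conv1d has already produced key_conv_attn_layer, a matrix that carries span information, why does it still need to be multiplied element-wise with query_layer, as in conv_attn_layer = tf.multiply(key_conv_attn_layer, query_layer)? The multiplication does not seem necessary here.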

@zihangJiang
Collaborator

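Hello. Self-attention computes [image], so one intuition behind multiplying the two is to stay consistent with self-attention: the generated kernel is likewise the product of two linear transformations of the input, passed through a softmax.
In addition, we think the generated convolution kernel can partly be understood as the relationship between the current token and its neighboring tokens, rather than carrying only information about the current span. That is why we take the element-wise product of the two and then apply a softmax to generate the convolution kernel.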
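
The reasoning in this reply can be illustrated with a minimal single-head sketch in plain Python. It is not the repository's code: the helper names (span_key, dynamic_kernel, key_only_kernel), the toy sizes, and the random weights are all assumptions made for illustration, and the separable convolution is approximated by a depthwise mix over a window followed by a pointwise projection. It contrasts a kernel generated from the span-aware key alone with one generated from the element-wise product of query and span-aware key.

```python
import math
import random

random.seed(0)

def rand_matrix(rows, cols):
    """Random weights standing in for learned parameters (illustrative only)."""
    return [[random.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]

def matvec(W, x):
    """Multiply a (rows x cols) matrix by a length-cols vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def span_key(X, t, D, P):
    """Span-aware key for position t: depthwise mix over a window, then a
    pointwise projection (a rough stand-in for a separable 1-D convolution)."""
    k, d = len(D), len(X[0])
    half = k // 2
    mixed = [0.0] * d
    for j in range(k):
        pos = t + j - half
        if 0 <= pos < len(X):  # zero padding outside the sequence ("same")
            for c in range(d):
                mixed[c] += D[j][c] * X[pos][c]
    return matvec(P, mixed)

def key_only_kernel(X, t, D, P, W_out):
    """Kernel generated from the span-aware key alone (the variant asked about)."""
    return softmax(matvec(W_out, span_key(X, t, D, P)))

def dynamic_kernel(X, t, Wq, D, P, W_out):
    """Kernel generated from query * span-aware key, then projected and softmaxed."""
    q = matvec(Wq, X[t])                     # linear map of the current token
    ks = span_key(X, t, D, P)                # linear map of the local span
    gated = [a * b for a, b in zip(q, ks)]   # element-wise product, like tf.multiply
    return softmax(matvec(W_out, gated))     # one weight per kernel position

# Toy sizes (assumed): 5 tokens, hidden size 4, kernel width 3.
X = rand_matrix(5, 4)
Wq, P = rand_matrix(4, 4), rand_matrix(4, 4)
D, W_out = rand_matrix(3, 4), rand_matrix(3, 4)

kernel = dynamic_kernel(X, 2, Wq, D, P, W_out)
baseline = key_only_kernel(X, 2, D, P, W_out)
print("query * span key:", kernel)
print("span key only:   ", baseline)
```

In both variants the result is a normalized weight per kernel position, but only dynamic_kernel changes when the query projection of the current token changes, which matches the stated intuition that the kernel should reflect the relationship between the current token and its neighbors.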
