
Is the values computation in layers.sequence.Transformer correct? (There seems to be a small issue with how V is computed in the Transformer) #490

Closed
darryyoung opened this issue Sep 7, 2022 · 0 comments · Fixed by #495

darryyoung commented Sep 7, 2022

As we know, the major components of a Transformer are Q, K, and V, which are projected from the queries' input and the keys' input. I'm wondering whether the following two lines in deepctr.layers.sequence.Transformer (lines 534 and 535) are correct:

keys = tf.tensordot(keys, self.W_key, axes=(-1, 0))
values = tf.tensordot(keys, self.W_Value, axes=(-1, 0))

After the first tensordot of keys with self.W_key is assigned back to keys, keys has already been transformed. Computing values from this keys then gives values = input * W_key * W_Value, but what we need is input * W_Value.

Should we swap the order of these two lines?

values = tf.tensordot(keys, self.W_Value, axes=(-1, 0))
keys = tf.tensordot(keys, self.W_key, axes=(-1, 0))

If this is an intentional trick for CTR prediction, or my understanding is wrong, please advise.
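
For illustration, here is a minimal standalone sketch (the shapes, seed, and weight tensors below are made up for this example, not taken from the DeepCTR source) showing that the order of the two assignments changes the result:

import tensorflow as tf

tf.random.set_seed(0)
inputs = tf.random.normal([2, 5, 8])   # (batch, seq_len, embedding_dim)
W_key = tf.random.normal([8, 8])
W_Value = tf.random.normal([8, 8])

# Current order: keys is overwritten before values is computed,
# so values ends up as inputs * W_key * W_Value.
keys = inputs
keys = tf.tensordot(keys, W_key, axes=(-1, 0))
values_buggy = tf.tensordot(keys, W_Value, axes=(-1, 0))

# Swapped order: values is computed from the original input first,
# giving the intended inputs * W_Value.
keys = inputs
values_fixed = tf.tensordot(keys, W_Value, axes=(-1, 0))
keys = tf.tensordot(keys, W_key, axes=(-1, 0))

# A nonzero difference confirms the ordering matters.
print(tf.reduce_max(tf.abs(values_buggy - values_fixed)).numpy())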

@shenweichen shenweichen linked a pull request Oct 16, 2022 that will close this issue