
Is the values computation in layers.sequence.Transformer correct? (There seems to be a small issue with how V is computed in the Transformer) #490

Closed
darryyoung opened this issue Sep 7, 2022 · 0 comments · Fixed by #495

darryyoung commented Sep 7, 2022

As we know, the major components of a Transformer are Q, K, and V, which are projected from the queries' input and the keys' input. I'm wondering whether the following two lines in deepctr.layers.sequence.Transformer (lines 534 and 535) are correct:

keys = tf.tensordot(keys, self.W_key, axes=(-1, 0))
values = tf.tensordot(keys, self.W_Value, axes=(-1, 0))

After the first tensordot of keys with self.W_key is assigned back to keys, keys has already been transformed. Computing values from this keys then gives values = input * W_key * W_Value, but what we need is input * W_Value.

Should we swap the order of these two lines?

values = tf.tensordot(keys, self.W_Value, axes=(-1, 0))
keys = tf.tensordot(keys, self.W_key, axes=(-1, 0))

If this is an intentional trick for CTR prediction, or my understanding is wrong, please advise.
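
For illustration, here is a minimal standalone sketch (the shapes, seed, and weight tensors below are made up for this example, not taken from the DeepCTR source) showing that the order of the two assignments changes the result:

import tensorflow as tf

tf.random.set_seed(0)
inputs = tf.random.normal([2, 5, 8])   # (batch, seq_len, embedding_dim)
W_key = tf.random.normal([8, 8])
W_Value = tf.random.normal([8, 8])

# Current order: keys is overwritten before values is computed,
# so values ends up as inputs * W_key * W_Value.
keys = inputs
keys = tf.tensordot(keys, W_key, axes=(-1, 0))
values_buggy = tf.tensordot(keys, W_Value, axes=(-1, 0))

# Swapped order: values is computed from the original input first,
# giving the intended inputs * W_Value.
keys = inputs
values_fixed = tf.tensordot(keys, W_Value, axes=(-1, 0))
keys = tf.tensordot(keys, W_key, axes=(-1, 0))

# A nonzero difference confirms the ordering matters.
print(tf.reduce_max(tf.abs(values_buggy - values_fixed)).numpy())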

@shenweichen shenweichen linked a pull request Oct 16, 2022 that will close this issue