As we know, the core of a transformer is the Q, K, V computation, which is generated from the query input and the sequence input. I'm wondering whether the following two lines in deepctr.layers.sequence.Transformer (lines 534 and 535) are correct:

After the tensordot of keys with self.W_key is assigned back to keys, keys has already been overwritten. When keys is then used to compute values, values becomes input * self.W_key * self.W_Value, but I think what we actually need is input * self.W_Value.

Should the order of these two lines be swapped?
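To make the report concrete, here is a minimal standalone sketch of the overwrite I mean (the tensor shapes and the plain-variable stand-ins for self.W_key / self.W_Value are my assumptions for illustration; the exact code in the layer may differ):

```python
import tensorflow as tf

# Standalone sketch of the suspected issue; shapes are assumed for
# illustration, and W_key / W_Value stand in for the layer's weights.
embedding_dim = 4
inputs = tf.random.normal([2, 5, embedding_dim])  # (batch, seq_len, dim)
W_key = tf.random.normal([embedding_dim, embedding_dim])
W_Value = tf.random.normal([embedding_dim, embedding_dim])

# Current order (lines 534-535): `keys` is reassigned first, so the second
# line computes inputs * W_key * W_Value instead of inputs * W_Value.
keys = tf.tensordot(inputs, W_key, axes=(-1, 0))
values = tf.tensordot(keys, W_Value, axes=(-1, 0))

# What I believe is intended: values projected from the original input.
values_expected = tf.tensordot(inputs, W_Value, axes=(-1, 0))

# The two results differ, which is why I think the lines should be swapped.
print(tf.reduce_max(tf.abs(values - values_expected)).numpy())  # non-zero
```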
If this is an intentional trick for CTR prediction, or my understanding is wrong, please advise.