I find in function attn_head() (in utils/layers.py)
'''
simplest self-attention possible
f_1 = tf.layers.conv1d(seq_fts, 1, 1)
f_2 = tf.layers.conv1d(seq_fts, 1, 1)
logits = f_1 + tf.transpose(f_2, [0, 2, 1])
coefs = tf.nn.softmax(tf.nn.leaky_relu(logits) + bias_mat)
'''
In my understanding, this code is equivalent to $$f_1 W_1 + f_2 W_2,$$
but in the paper, the chosen attention mechanism uses concatenation, with $$W_1 = W_2 = W.$$
Did I get something wrong?
Hello, I have encountered the same problem as you.
Also, in ./utils/layers.py, I don't understand how this code captures the correlation between f_1 and f_2.
I have read the PyTorch version of the code, and I think the two implementations differ at this point.
The PyTorch version is also sensitive to the choice of random seed; switching to a different seed can change the results substantially.
The way attention heads are implemented here is exactly equivalent to the one in the paper; it just relies heavily on TensorFlow's broadcasting semantics.
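To see the equivalence, note that the paper's attention vector $a$ can be split into two halves $a_1, a_2$, so that $a^T [W h_i \| W h_j] = a_1^T W h_i + a_2^T W h_j$. The two `conv1d(seq_fts, 1, 1)` calls play the roles of $a_1$ and $a_2$, and broadcasting `f_1 + tf.transpose(f_2, [0, 2, 1])` builds all pairwise sums at once. A minimal NumPy sketch (not the repo's TensorFlow code; all names here are illustrative) checks this equivalence:

```python
import numpy as np

# Sketch: broadcasting f_1 + f_2^T reproduces the paper's
# concatenation-based attention logits a^T [W h_i || W h_j].
rng = np.random.default_rng(0)
N, F = 4, 8                          # nodes, transformed feature size
Wh = rng.standard_normal((N, F))     # plays the role of seq_fts (W h_i per node)
a = rng.standard_normal(2 * F)       # the paper's attention vector a

a1, a2 = a[:F], a[F:]                # split a into its two halves
f_1 = Wh @ a1                        # shape (N,): a_1^T W h_i (the first conv1d)
f_2 = Wh @ a2                        # shape (N,): a_2^T W h_j (the second conv1d)

# Broadcasting: (N, 1) + (1, N) -> (N, N) matrix of pairwise logits
logits = f_1[:, None] + f_2[None, :]

# Paper's formulation, computed pair by pair: a^T [W h_i || W h_j]
concat_logits = np.array([[a @ np.concatenate([Wh[i], Wh[j]])
                           for j in range(N)] for i in range(N)])

assert np.allclose(logits, concat_logits)
```

So there is no missing weight sharing: the single shared $W$ is applied when producing `seq_fts`, and splitting $a$ into two separate 1x1 convolutions is just an algebraic rewrite of the concatenation.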