# Is it possible to retrieve the attention weights of a specific node? #608

opened this issue Jun 5, 2019

sgdantas commented Jun 5, 2019

Hey!

I was wondering if it's possible to retrieve the attention weights of a specific node. By printing the shape of `alpha`, I see that the attention is batched (number of nodes with the same degree x degree x 1):

```
def reduce_func(self, nodes):
    # reduce UDF for equation (3) & (4)
    # equation (3)
    alpha = F.softmax(nodes.mailbox['e'], dim=1)
    print(alpha.shape)
    # equation (4)
    h = torch.sum(alpha * nodes.mailbox['z'], dim=1)
    return {'h': h}
```

However, a node might be connected to other nodes with different degrees. I wanted to retrieve the attention that other nodes pay to a specific node. Is that possible?

Thanks!

mufeili commented Jun 5, 2019

It's possible. Take our PyTorch GAT implementation as an example. The edge attentions are stored in `g.edata['a_drop']`. We can fetch the attentions as follows:

```
from scipy.sparse import lil_matrix
from sklearn.preprocessing import normalize

def preprocess_attention(edge_atten, g, to_normalize=True):
    """Organize attentions in the form of csr sparse adjacency
    matrices from attention on edges.

    Parameters
    ----------
    edge_atten : torch.Tensor of shape (# edges, # heads, 1)
        Un-normalized attention on edges.
    g : dgl.DGLGraph.
    to_normalize : bool
        Whether to normalize attention values over incoming
        edges for each node.
    """
    n_nodes = g.number_of_nodes()
    num_heads = edge_atten.shape[1]
    # One (n_nodes x n_nodes) sparse matrix per attention head
    all_head_A = [lil_matrix((n_nodes, n_nodes)) for _ in range(num_heads)]
    for i in range(n_nodes):
        # Source nodes of the edges pointing into node i
        predecessors = list(g.predecessors(i))
        edges_id = g.edge_ids(predecessors, i)
        for j in range(num_heads):
            all_head_A[j][i, predecessors] = edge_atten[edges_id, j, 0].data.cpu().numpy()
    if to_normalize:
        # L1-normalize each row so the incoming attentions of a node sum to 1
        for j in range(num_heads):
            all_head_A[j] = normalize(all_head_A[j], norm='l1').tocsr()
    return all_head_A

# Take the attention from one layer as an example
# num_edges x num_heads x 1
A = self.g.edata['a_drop']
# list of length num_heads, each entry is csr of shape (num_nodes, num_nodes)
A = preprocess_attention(A, self.g)
```

Now `A[h][i, j]` will give you the attention of edge `j -> i` in head `h`. For non-existing edges this value will be zero.

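
Since `A[h][i, j]` is the attention of edge `j -> i`, the attention that other nodes pay to a specific node `k` should be column `k` of a head's matrix, while row `k` holds the weights node `k` assigns to its own predecessors. A minimal sketch of both lookups, assuming a hand-built 4-node toy matrix in place of a real `A[h]`; the helper names `attention_paid_by` and `attention_paid_to` are hypothetical and not part of DGL:

```python
import numpy as np
from scipy.sparse import csr_matrix

# Toy stand-in for one head A[h] (assumed values, not from a trained model).
# Entry [i, j] is the attention of edge j -> i.
# Edges: 0->1, 2->1, 1->2, 3->2, 0->3
A_head = csr_matrix(np.array([
    [0.0, 0.0, 0.0, 0.0],
    [0.7, 0.0, 0.3, 0.0],
    [0.0, 0.4, 0.0, 0.6],
    [1.0, 0.0, 0.0, 0.0],
]))

def attention_paid_by(A_head, node):
    """Row `node`: weights that `node` assigns to its predecessors."""
    row = A_head[node, :].tocoo()
    return dict(zip(row.col.tolist(), row.data.tolist()))

def attention_paid_to(A_head, node):
    """Column `node`: weights that other nodes assign to `node`."""
    col = A_head[:, node].tocoo()
    return dict(zip(col.row.tolist(), col.data.tolist()))

print(attention_paid_to(A_head, 0))  # nodes 1 and 3 attend to node 0
print(attention_paid_by(A_head, 2))  # node 2 attends to nodes 1 and 3
```

Both helpers return a plain `{node_id: weight}` dict containing only existing edges, so nodes of different degrees can be handled without padding.
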
sgdantas commented Jun 5, 2019

 Thank you, it works!

mufeili commented Jun 5, 2019

 Glad to know that, @sgdantas :). Let me know if you have any follow-up questions. Also, for questions like this, we encourage users to post on our discussion forum so that more users can benefit.