Do Transformers Really Perform Bad for Graph Representation

Key ideas

  • The Transformer architecture is becoming a dominant choice in NLP and CV, so why not in graphs?
  • Solution: Graphormer, which uses special positional encodings designed for graphs.
  • Problem not mentioned in the paper: the graph "positional encoding" requires computing shortest-path distances between node pairs (see the sketch below).
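
A minimal sketch of that shortest-path preprocessing, assuming networkx and a small graph; the function name, the `cutoff` cap, and the `unreachable` sentinel are my own illustrative choices, not the paper's code.

```python
import networkx as nx
import numpy as np

def spd_matrix(G: nx.Graph, cutoff: int = 20, unreachable: int = -1) -> np.ndarray:
    """All-pairs shortest-path distances, capped at `cutoff` hops."""
    nodes = list(G.nodes())
    idx = {n: i for i, n in enumerate(nodes)}
    spd = np.full((len(nodes), len(nodes)), unreachable, dtype=np.int64)
    for src, dists in nx.all_pairs_shortest_path_length(G, cutoff=cutoff):
        for dst, d in dists.items():
            spd[idx[src], idx[dst]] = d
    return spd

# Example: a 4-node path graph -> distances 0..3
print(spd_matrix(nx.path_graph(4)))
```

This is one BFS per node, i.e. O(n·(n+m)) per graph, which is the hidden cost mentioned above.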

Introduction

  • The Transformer is the most powerful neural network for modeling sequential data such as speech and natural language.
  • It is an open question whether Transformers are suitable for modeling graphs - Graphormer seems to be an affirmative answer.
  • Graphormer seems to do well on most leaderboards and benchmarks in chemistry.
  • Centrality encoding captures node importance in the graph.
  • Spatial encoding captures the structural relation between nodes - for each node pair, we assign a learnable embedding based on their spatial relation.


Preliminary

  • GNNs aim to learn representations of nodes and graphs.
  • Modern GNNs follow a learning schema that updates the representation of a node by aggregating the representations of its neighbors.
  • Aggregate-combine step: a_i^(l) = AGGREGATE^(l)({ h_j^(l-1) : j ∈ N(v_i) }), h_i^(l) = COMBINE^(l)(h_i^(l-1), a_i^(l)) (see the sketch after this list).
  • The task: graph classification.
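
A minimal numpy sketch of one aggregate-combine step, assuming mean aggregation and a linear-plus-ReLU combine; the matrix names (W_self, W_neigh) and those particular choices are illustrative, not the paper's.

```python
import numpy as np

def gnn_layer(H, A, W_self, W_neigh):
    """One aggregate-combine step.

    H: (n, d) node representations h_i^(l-1)
    A: (n, n) adjacency matrix (no self-loops)
    """
    deg = A.sum(axis=1, keepdims=True).clip(min=1)      # avoid division by zero
    agg = (A @ H) / deg                                  # AGGREGATE: mean over neighbors
    return np.maximum(H @ W_self + agg @ W_neigh, 0.0)   # COMBINE: linear + ReLU

# Tiny example: 3-node path graph with 4-dim features
rng = np.random.default_rng(0)
A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
H = rng.normal(size=(3, 4))
print(gnn_layer(H, A, rng.normal(size=(4, 4)), rng.normal(size=(4, 4))).shape)  # (3, 4)
```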

Centrality Encoding

  • A learnable embedding is added to each node's input features based on its in-degree and out-degree.
  • Formula: h_i^(0) = x_i + z^-_{deg^-(v_i)} + z^+_{deg^+(v_i)}, where z^- and z^+ are learnable embedding vectors indexed by in- and out-degree (see the sketch below).
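
A rough PyTorch sketch of this centrality encoding, assuming a degree cap `max_degree` for the embedding tables; the names and the clamping are my own, not the official Graphormer code.

```python
import torch
import torch.nn as nn

class CentralityEncoding(nn.Module):
    def __init__(self, hidden_dim: int, max_degree: int = 64):
        super().__init__()
        self.in_deg_emb = nn.Embedding(max_degree + 1, hidden_dim)   # z^-
        self.out_deg_emb = nn.Embedding(max_degree + 1, hidden_dim)  # z^+
        self.max_degree = max_degree

    def forward(self, x, in_deg, out_deg):
        # x: (n, hidden_dim) node features; in_deg/out_deg: (n,) integer degrees
        in_deg = in_deg.clamp(max=self.max_degree)
        out_deg = out_deg.clamp(max=self.max_degree)
        return x + self.in_deg_emb(in_deg) + self.out_deg_emb(out_deg)

# Tiny usage example
enc = CentralityEncoding(hidden_dim=8)
x, deg = torch.randn(5, 8), torch.tensor([1, 2, 3, 1, 0])
print(enc(x, deg, deg).shape)  # torch.Size([5, 8])
```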

Spatial Encoding

  • A bias term specific to the shortest-path distance between v_i and v_j is added to the attention score: A_ij = (h_i W_Q)(h_j W_K)^T / sqrt(d) + b_{φ(v_i, v_j)}, where b is a learnable scalar indexed by the shortest-path distance φ(v_i, v_j) (see the sketch below).
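
A rough single-head PyTorch sketch of this biased attention; `spd` is the precomputed shortest-path-distance matrix (as in the sketch under "Key ideas"), and the `max_dist` cap plus the single-head simplification are my assumptions.

```python
import torch
import torch.nn as nn

class SpatialEncodingAttention(nn.Module):
    def __init__(self, hidden_dim: int, max_dist: int = 20):
        super().__init__()
        self.q = nn.Linear(hidden_dim, hidden_dim)
        self.k = nn.Linear(hidden_dim, hidden_dim)
        self.v = nn.Linear(hidden_dim, hidden_dim)
        self.dist_bias = nn.Embedding(max_dist + 1, 1)  # one learnable scalar b_phi per distance
        self.scale = hidden_dim ** 0.5

    def forward(self, h, spd):
        # h: (n, hidden_dim) node representations; spd: (n, n) integer shortest-path distances
        scores = self.q(h) @ self.k(h).T / self.scale      # (h_i W_Q)(h_j W_K)^T / sqrt(d)
        scores = scores + self.dist_bias(spd).squeeze(-1)  # + b_{phi(v_i, v_j)}
        return torch.softmax(scores, dim=-1) @ self.v(h)

# Tiny usage example: 4-node path graph
h = torch.randn(4, 8)
spd = torch.tensor([[0, 1, 2, 3], [1, 0, 1, 2], [2, 1, 0, 1], [3, 2, 1, 0]])
print(SpatialEncodingAttention(8)(h, spd).shape)  # torch.Size([4, 8])
```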

Special node

  • A node [VNode] is added to the graph and connected to every individual node (distance of 1). The point is that the representation of the entire graph, h_G, is the node feature of [VNode] at the final layer (see the sketch below).
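
A rough PyTorch sketch of the [VNode] trick: append a virtual node connected to every real node at distance 1, then read the graph representation off its row after the Transformer layers; the distance handling and the readout index are my assumptions.

```python
import torch

def add_vnode(x, spd, vnode_feat):
    """x: (n, d) node features; spd: (n, n) distances; vnode_feat: (d,) learnable [VNode] feature."""
    n = x.size(0)
    x = torch.cat([x, vnode_feat.unsqueeze(0)], dim=0)    # (n+1, d)
    new_spd = torch.ones(n + 1, n + 1, dtype=spd.dtype)   # [VNode] is 1 hop from every node
    new_spd[:n, :n] = spd
    new_spd[n, n] = 0
    return x, new_spd

# After the Transformer layers, the graph representation h_G is the hidden
# state of the last row (the [VNode]):  h_G = h_out[-1]
```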