bigheiniu/GRENADE

GRENADE: Graph-Centric Language Model for Self-Supervised Representation Learning on Text-Attributed Graphs

This repository contains the code for the paper "GRENADE: Graph-Centric Language Model for Self-Supervised Representation Learning on Text-Attributed Graphs", accepted to Findings of EMNLP 2023.

[Figure: MainModel.pdf]

Requirements

Dependencies

Install the required packages using the following command:

pip install -r requirements.txt

Dataset

The dataset is available here. Each project folder (ogbl-citation2, ogbn-arxiv, and ogbn-products) contains several key files:

  • {project}-ogbn.torch: The dataset file, including the adjacency matrix, node classification labels, and split information.
  • {project}_text.csv or X.all.txt: The raw text content for each node.
  • mrr_edges.torch: The edges used for the link prediction task.
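
Before training, it can help to confirm that a downloaded project folder has the files listed above. The following is a minimal sketch in Python using only the standard library. The function name `missing_dataset_files` and the `root` argument are hypothetical and not part of this repository. The sketch also assumes that `{project}` is replaced literally by the folder name and that the raw text lives in either `{project}_text.csv` or `X.all.txt`; adjust the names if the downloaded files differ.

```python
from pathlib import Path

# Project folders named in this README.
PROJECTS = ("ogbl-citation2", "ogbn-arxiv", "ogbn-products")


def missing_dataset_files(root, project):
    """Return the expected file names that are absent from root/project.

    Hypothetical helper: file names follow the list in this README, with
    {project} substituted literally (an assumption).
    """
    folder = Path(root) / project
    missing = []
    # Files expected in every project folder.
    for name in (f"{project}-ogbn.torch", "mrr_edges.torch"):
        if not (folder / name).is_file():
            missing.append(name)
    # Raw node text: accept either of the two file names.
    text_names = (f"{project}_text.csv", "X.all.txt")
    if not any((folder / name).is_file() for name in text_names):
        missing.append(" or ".join(text_names))
    return missing
```

For example, `missing_dataset_files("data", "ogbn-arxiv")` returns an empty list when the folder is complete, where `data` stands for wherever the dataset was unpacked.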

Usage

Graph-Centric Language Model for Self-Supervised Pretraining

cd scripts
sh ssl_train.sh

Downstream Evaluation

The evaluation includes the following tasks:

  • MLP node classification
  • GraphSage node classification
  • Link Prediction

cd scripts
sh eval.sh
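
The file name mrr_edges.torch suggests that link prediction is scored with mean reciprocal rank (MRR); this is an assumption, and eval.sh remains the authoritative procedure. As an illustrative sketch only, MRR over a set of positive edges, each ranked against its own negative candidates, could be computed as follows. The function name and the layout of the score lists are hypothetical.

```python
def mean_reciprocal_rank(pos_scores, neg_scores):
    """Compute MRR for link prediction scores.

    pos_scores[i] is the model score of positive edge i.
    neg_scores[i] is the list of scores of its negative candidates.
    (Hypothetical layout, chosen for illustration.)
    """
    if not pos_scores or len(pos_scores) != len(neg_scores):
        raise ValueError("need one non-empty list of negatives per positive")
    total = 0.0
    for pos, negs in zip(pos_scores, neg_scores):
        # Pessimistic rank: a negative that ties the positive is ranked above it.
        rank = 1 + sum(1 for score in negs if score >= pos)
        total += 1.0 / rank
    return total / len(pos_scores)
```

With two positive edges ranked first and second among their candidates, the result is (1 + 1/2) / 2 = 0.75.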

Model Checkpoint

The node embedding checkpoint is available here.

Citation

If you use this code for your research, please cite our paper:

@misc{li2023grenade,
      title={GRENADE: Graph-Centric Language Model for Self-Supervised Representation Learning on Text-Attributed Graphs}, 
      author={Yichuan Li and Kaize Ding and Kyumin Lee},
      year={2023},
      eprint={2310.15109},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
