MLMP: Metapath-Enhanced Language Model Pretraining on Text-Attributed Heterogeneous Graphs

This repository contains the source code and datasets for MLMP: Metapath-Enhanced Language Model Pretraining on Text-Attributed Heterogeneous Graphs.

Links

Datasets
Preprocess
Pretraining
Finetuning

Datasets

To reproduce the results in our paper, first download the processed datasets. Then download bert-base-cased. Put both the datasets and the model into ./data.
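Before moving on, it can help to confirm that the downloads landed where the later steps will look for them. The helper below is a hypothetical sketch, not part of this repository. It assumes the model files sit in a folder named `bert-base-cased` directly under `./data`, which this README does not spell out, so adjust the names to match your layout.

```python
from pathlib import Path


def missing_paths(root, required):
    """Return the entries of `required` that do not exist under `root`."""
    root = Path(root)
    return [name for name in required if not (root / name).exists()]


if __name__ == "__main__":
    # Assumption: folder name for the downloaded model; change it if yours differs.
    missing = missing_paths("./data", ["bert-base-cased"])
    if missing:
        print("Missing under ./data:", ", ".join(missing))
    else:
        print("All expected entries found.")
```

Add the dataset folder names to the list once you know how the processed archives unpack.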

Preprocess

Run ./data/data_process.ipynb for the OAG-Venue dataset and ./data/data_process_googreads.ipynb for the GoodReads dataset.
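If you prefer not to open the notebooks interactively, one option is to execute them from a script with `jupyter nbconvert`. The sketch below is an illustration under assumptions: it presumes Jupyter is installed and that the notebooks run top to bottom without manual edits, which this README does not guarantee. The function names are hypothetical.

```python
import subprocess


def nbconvert_command(notebook):
    """Build a command that executes a notebook in place, without opening Jupyter."""
    return [
        "jupyter", "nbconvert",
        "--to", "notebook",
        "--execute",
        "--inplace",
        notebook,
    ]


def run_notebooks(notebooks, cwd="./data"):
    """Execute each notebook in order from `cwd`, stopping at the first failure."""
    for notebook in notebooks:
        subprocess.run(nbconvert_command(notebook), cwd=cwd, check=True)
```

For example, `run_notebooks(["data_process.ipynb", "data_process_googreads.ipynb"])` would process both datasets in turn, using the notebook filenames exactly as they appear in `./data`.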

Pretraining

Run pretraining in ./pretrain.

sh run.sh

Finetuning

Node Classification

Run node classification in ./downstream/node-classification.

sh run.sh

Link Prediction

Run link prediction in ./downstream/link-predict.

sh run.sh
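The three `sh run.sh` steps above can also be chained from a single script. This is a minimal sketch, not something shipped with the repository. It assumes each `run.sh` is meant to be launched from its own directory and that finetuning should follow pretraining, as the section order suggests. The names `STAGES`, `stage_commands`, and `run_all` are hypothetical.

```python
import subprocess

# Directories named in this README, in the order the sections present them.
STAGES = [
    "./pretrain",
    "./downstream/node-classification",
    "./downstream/link-predict",
]


def stage_commands(stages=STAGES):
    """Pair each stage directory with the command to run inside it."""
    return [(directory, ["sh", "run.sh"]) for directory in stages]


def run_all(stages=STAGES):
    """Run each stage's run.sh from its own directory; stop on the first error."""
    for cwd, cmd in stage_commands(stages):
        subprocess.run(cmd, cwd=cwd, check=True)
```

Calling `run_all()` from the repository root runs the stages in sequence. Passing a shorter list, such as `run_all(["./downstream/link-predict"])`, runs a single stage.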

