This is the implementation of our EMNLP 2022 paper:
MetaFill: Text Infilling for Meta-Path Generation on Heterogeneous Information Networks.
Please cite our paper when you use this code in your work.
❱❱❱ pip install -r requirements.txt
If you want to quickly generate meta-paths with our model, put the files in pretrained_models in this directory and run the generation script gen_metapath.sh. The k-hop meta-paths and their scores will be stored in meta_path_ft_heterographine_k.txt.
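The exact on-disk layout of the output file is defined by the generation script; as an illustration only, assuming each line of meta_path_ft_heterographine_k.txt holds a tab-separated meta-path and score, a minimal reader that ranks the generated paths could look like:

```python
def top_meta_paths(lines, k=5):
    """Parse "meta-path<TAB>score" lines and return the k highest-scoring
    meta-paths as (path, score) pairs, best first.

    The tab-separated format is an assumption for illustration; check the
    actual file produced by gen_metapath.sh.
    """
    scored = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # Split on the last tab so a path containing tabs still parses.
        path, score = line.rsplit("\t", 1)
        scored.append((path, float(score)))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    with open("meta_path_ft_heterographine_k.txt") as f:
        for path, score in top_meta_paths(f):
            print(f"{score:.3f}\t{path}")
```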
Generate the masked data for finetuning GPT-2 for text infilling:
❱❱❱ python mask_data.py
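mask_data.py produces the masked training data; its exact output format lives in that script, but the span-masking idea behind ILM-style text infilling [1] can be sketched as follows. The [BLANK], [SEP], and [ANSWER] tokens here are hypothetical placeholders, not necessarily the special tokens the script actually emits:

```python
def mask_span(text, start, length, blank="[BLANK]"):
    """Replace `length` words starting at word index `start` with a blank
    token. Returns the masked text and the removed span (the infilling
    target the model must reconstruct)."""
    words = text.split()
    answer = " ".join(words[start:start + length])
    masked = words[:start] + [blank] + words[start + length:]
    return " ".join(masked), answer


def to_ilm_example(text, start, length):
    # ILM-style training string: the context with a blank, followed by the
    # answer span. Token names are illustrative; mask_data.py may differ.
    masked, answer = mask_span(text, start, length)
    return f"{masked} [SEP] {answer} [ANSWER]"
```

For example, masking the two-word span starting at word 1 of "genes regulate protein expression" yields "genes [BLANK] expression [SEP] regulate protein [ANSWER]".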
- Follow the paper "Enabling Language Models to Fill in the Blanks" [1] to set up the environment, and put their model finetuned on arXiv abstracts under abs_ilm.
- train.sh is the script for finetuning GPT-2 on HeteroGraphine.
- train_classifier.sh is the script for training the node type classifier for HeteroGraphine.
- To train from scratch, remove the "--from-pretrained" flag in gen_metapath.sh and run it.
We use the open-source code of "Enabling Language Models to Fill in the Blanks" [1] to finetune GPT-2 for text infilling.
- [1] Donahue C., Lee M., Liang P. Enabling Language Models to Fill in the Blanks. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020: 2492-2501.