# Core Semantic First: A Top-down Approach for AMR Parsing

Code for our EMNLP 2019 paper, "Core Semantic First: A Top-down Approach for AMR Parsing" [paper][bib], by Deng Cai and Wai Lam.
## Requirements

- `python2` == Python 2.7
- `python3` == Python 3.6

Run `sh setup.sh` to install dependencies.
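Both interpreters must be on your `PATH`; a quick sanity check (a minimal sketch, nothing repo-specific):

```sh
# verify both interpreters before installing dependencies
python2 --version   # expect Python 2.7.x
python3 --version   # expect Python 3.6.x
sh setup.sh         # installs dependencies
```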
## Data Preprocessing

In the directory `preprocessing`:

1. Make a directory, for example `preprocessing/2017`, and put the files `train.txt`, `dev.txt`, and `test.txt` in it. The format of these files should be the same as our example file `preprocessing/data/dev.txt` (an illustrative example is shown below).
2. Run `sh go.sh` (you may make necessary changes in `convertingAMR.java`).
3. Run `python2 preprocess.py`.

Notes:

- We use Stanford CoreNLP to extract NER, POS, and lemma annotations; see `go.sh` and `convertingAMR.java` for details.
- We already provide the alignment information used for concept prediction in `common/out_standford` (for LDC2017T10, produced by the aligner of [Oneplus/tamr](https://github.com/Oneplus/tamr)).
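For reference, files in the standard AMR release format look roughly like the following (a minimal illustration with a made-up sentence ID; consult `preprocessing/data/dev.txt` for the exact metadata fields expected here):

```
# ::id example.1
# ::snt The boy wants to go.
(w / want-01
      :ARG0 (b / boy)
      :ARG1 (g / go-02
            :ARG0 b))
```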
## Training

In the directory `parser`:

1. Run `python3 extract.py && mv *vocab *table ../preprocessing/2017/.` to build vocabularies for the dataset in `../preprocessing/2017` (you may make necessary changes in `extract.py` and in the command line as well).
2. Run `sh train.sh`. Be patient! Checkpoints will be saved in the directory `ckpt` by default (you may make necessary changes in `train.sh`).
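Put together, an end-to-end training run looks like this (a sketch assuming the `preprocessing/2017` directory created in the previous section):

```sh
cd parser
# build vocabularies/tables and move them next to the preprocessed data
python3 extract.py && mv *vocab *table ../preprocessing/2017/.
# train; checkpoints are written to parser/ckpt by default
sh train.sh
```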
## Testing

In the directory `parser`:

Run `sh work.sh` (you should make necessary changes in `work.sh`).

- The most important argument is `--load_path`, which should be set to a specific checkpoint file, for example, `somewhere/some_ckpt`. The output file will be written to the same folder as the checkpoint, for example, `somewhere/some_ckpt_test_out`.
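For concreteness, a typical edit-and-run cycle (the checkpoint path below is illustrative, not a file shipped with the repo):

```sh
# in parser/work.sh, set --load_path to a trained checkpoint, e.g.
#   --load_path ckpt/some_ckpt
sh work.sh
# the parser output then appears next to the checkpoint:
#   ckpt/some_ckpt_test_out
```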
## Evaluation

In the directory `amr-evaluation-tool-enhanced`:

Run `python2 smatch/smatch.py --help` for the full list of options.

A large portion of the code under this directory is borrowed from [ChunchuanLv/amr-evaluation-tool-enhanced](https://github.com/ChunchuanLv/amr-evaluation-tool-enhanced); we add the following options:

```
--weighted           whether to use weighted smatch or not
--levels LEVELS      how deep to evaluate; -1 indicates unlimited, i.e., the full graph
--max_size MAX_SIZE  only consider AMR graphs with size <= max_size; -1 indicates no limit
--min_size MIN_SIZE  only consider AMR graphs with size >= min_size; -1 indicates no limit
```

For example:

- To calculate the smatch-weighted metric in our paper:

  `python2 smatch/smatch.py --pr -f parsed_data golden_data --weighted`

- To calculate the smatch-core metric in our paper:

  `python2 smatch/smatch.py --pr -f parsed_data golden_data --levels 4`
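The size filters combine with the other options in the same way; for instance, to restrict evaluation to small graphs (the threshold of 10 is illustrative, not a setting from the paper):

```sh
python2 smatch/smatch.py --pr -f parsed_data golden_data --min_size 1 --max_size 10
```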
## Pretrained Model

We release our pretrained model at Google Drive. To use the pretrained model, move the vocabulary files under `[Google Drive]/vocabs` to `preprocessing/2017/` and adjust `work.sh` accordingly (set `--load_path` to point to `[Google Drive]/model.ckpt`).
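Concretely, assuming the Google Drive archive has been downloaded and unpacked to a local directory `pretrained/` (a hypothetical path):

```sh
# move the released vocabularies next to the preprocessed data
mv pretrained/vocabs/* preprocessing/2017/
# then, in parser/work.sh, set:
#   --load_path pretrained/model.ckpt
```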
We also provide the exact model output reported in our paper. The output file and the corresponding reference file are in the `legacy` folder.
## Citation

If you find the code useful, please cite our paper:

```
@inproceedings{cai-lam-2019-core,
    title = "Core Semantic First: A Top-down Approach for {AMR} Parsing",
    author = "Cai, Deng and Lam, Wai",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)",
    month = nov,
    year = "2019",
    address = "Hong Kong, China",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/D19-1393",
    pages = "3790--3800",
}
```
## Contact

For any questions, please drop an email to Deng Cai.