This repository contains the code and dataset for the paper *The Relative Clauses AMR Parsers Hate Most* by Xiulin Yang and Nathan Schneider.
First, create a virtual environment with Python 3.8 and install the dependencies:

```
conda create -n venv python=3.8
conda activate venv
pip install -r requirements.txt
```
The parsed results for EWT can be found in the `parse_results` folder.
- `rc-types.py`: the code to classify relative clauses without distinguishing types of reduced relative clauses.
- `rrc-types.py`: the code to classify reduced relative clauses and add EUD annotations.
- `reentrancy.py`: the code to evaluate the output from am-parser.
- `reentrancy_amrlib.py`: the code to evaluate the output from other parsers that need alignment from LEAMR.
- `dep_parse_amr.py`: the code to generate dependency trees for data from AMR 3.0.
- `amrbart_postprocess.py`: the code that post-processes the parses from AMRBART.
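As background for what `reentrancy.py` and `reentrancy_amrlib.py` evaluate: a reentrancy is an AMR variable that fills more than one semantic role (e.g. the shared argument introduced by a relative clause). The following is a minimal, hypothetical sketch of counting reentrant variables from a list of AMR triples; it is an illustration of the concept, not the repository's actual evaluation code.

```python
from collections import Counter

def count_reentrancies(triples):
    """Count variables that are the target of more than one non-instance
    edge, i.e., reentrant nodes in an AMR graph."""
    targets = Counter(
        tgt for src, role, tgt in triples
        if role != ":instance"
    )
    return sum(1 for n in targets.values() if n > 1)

# AMR for "The boy wants to go": 'b' is both the wanter and the goer,
# so it is reentrant.
triples = [
    ("w", ":instance", "want-01"),
    ("b", ":instance", "boy"),
    ("g", ":instance", "go-02"),
    ("w", ":ARG0", "b"),
    ("w", ":ARG1", "g"),
    ("g", ":ARG0", "b"),
]
print(count_reentrancies(triples))  # -> 1
```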
First, go to the `eud_ewt` folder, where you will find the following files:

- `en_ewt-ud-{dev,test,train}.conllu`: downloaded from the ewt-dev-branch.
- `eud_ewt_{dev,test,train}.conllu`: the post-processed files with the recovered EUD annotations.
- `rc-types.py`: the script used to classify sentences based on their EUD annotations (for reduced relative clauses, the Cxn value in the MISC column will be xxx-red-missingdep-xxx).
- `rrc-types.py`: the script used to classify reduced relative clauses. The output is stored in the `eud_{train,dev,test}` folders (each folder contains the necessary files for each split of the EWT treebank).
- `verb_transitivity.tsv`: the TSV file that contains verb transitivity information.
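The reduced-RC marking described above can be inspected programmatically. Below is a minimal sketch of pulling Cxn values out of the MISC (10th) column of a CoNLL-U token line; the helper functions are hypothetical (not part of this repository), and the example tag reuses the `xxx-red-missingdep-xxx` pattern from the description above.

```python
def misc_cxn_values(conllu_line):
    """Extract all Cxn values from the MISC (10th) column of a
    CoNLL-U token line; MISC holds '|'-separated key=value pairs."""
    cols = conllu_line.rstrip("\n").split("\t")
    if len(cols) != 10 or cols[9] == "_":
        return []
    return [kv.split("=", 1)[1]
            for kv in cols[9].split("|")
            if kv.startswith("Cxn=")]

def is_reduced_rc(cxn_value):
    """Reduced relative clauses carry 'red' and 'missingdep' in the tag."""
    parts = cxn_value.split("-")
    return "red" in parts and "missingdep" in parts

line = "5\tstanding\tstand\tVERB\tVBG\t_\t2\tacl\t2:acl\tCxn=xxx-red-missingdep-xxx"
for value in misc_cxn_values(line):
    print(value, is_reduced_rc(value))
```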
To check whether any reduced relative clauses have been misclassified, follow this pipeline:
- Go to the `eud_{train,dev,test}` folder and check the following two documents: `{orc,oblrc}.txt`. Check whether any sentences are misclassified (you can also check the `.conllu` files if you like).
- Once you find a misclassified example, go to `eud_ewt_{train,dev,test}.conllu`; the EUD annotation of that sentence is likely wrong and should be corrected.
- After all corrections for one split, run `rc-types.py` to generate an updated `eud_ewt_split.conllu`. You need to change the PATH variable at the beginning of each script to point to the correct split.
- Once you have corrected all sentences, double-check by (1) running `rc-types.py` to get the updated `eud_ewt_split.conllu` under the `eud_ewt` folder, and (2) running `rrc-types.py` to get the updated reduced relative clause classification and rechecking that it is correct. If any annotation is still wrong, repeat the correction steps above.
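The re-run step of the pipeline above can be scripted. A minimal sketch of a hypothetical wrapper (not part of this repository) that runs both classification scripts in order, assuming each script's PATH variable has already been pointed at the split you corrected:

```python
import subprocess
import sys

def recheck(scripts=("rc-types.py", "rrc-types.py")):
    """Re-run the classification scripts after manual EUD corrections.

    Assumes the PATH variable inside each script has already been set
    to the split being corrected, as described above.
    """
    for script in scripts:
        result = subprocess.run([sys.executable, script])
        if result.returncode != 0:
            raise SystemExit(f"{script} failed; fix errors before rechecking")
```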
```
@inproceedings{yang-schneider-2024-relative,
    title = "The Relative Clauses {AMR} Parsers Hate Most",
    author = "Yang, Xiulin and
      Schneider, Nathan",
    editor = "Bonial, Claire and
      Bonn, Julia and
      Hwang, Jena D.",
    booktitle = "Proceedings of the Fifth International Workshop on Designing Meaning Representations @ LREC-COLING 2024",
    month = may,
    year = "2024",
    address = "Torino, Italia",
    publisher = "ELRA and ICCL",
    url = "https://aclanthology.org/2024.dmr-1.16",
    pages = "151--161",
    abstract = "This paper evaluates how well English Abstract Meaning Representation parsers process an important and frequent kind of Long-Distance Dependency construction, namely, relative clauses (RCs). On two syntactically parsed datasets, we evaluate five AMR parsers at recovering the semantic reentrancies triggered by different syntactic subtypes of relative clauses. Our findings reveal a general difficulty among parsers at predicting such reentrancies, with recall below 64{\%} on the EWT corpus. The sequence-to-sequence models (regardless of whether structural biases were included in training) outperform the compositional model. An analysis by relative clause subtype shows that passive subject RCs are the easiest, and oblique and reduced RCs the most challenging, for AMR parsers.",
}
```
- Add argparse and a bash script to make the code easier to run.
- Add more detailed instructions to the README.