Code for the NAACL 2024 paper "What Causes the Failure of Explicit to Implicit Discourse Relation Recognition?"
If you have any questions, please contact: willie1206@163.com
Our working environment is Python 3.8. Before running the code, please make sure all required packages are installed. You can do this by executing `sh requirements.sh`.
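For example, from the repository root:
```bash
# Install the required packages via the provided setup script
sh requirements.sh
```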
Then you need to download roberta-base from the Hugging Face Hub and put it in a local folder. In my case, I put it in "/hits/basement/nlp/liuwi/resources/pretrained_models". Please note that if you use a different path, you may need to modify the path string in the code.
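One possible way to fetch the model is shown below. This is only a sketch, not part of the repository's scripts; the target directory follows the example path above, so adjust it to your own setup.
```bash
# Download roberta-base into a local folder using git-lfs and the Hugging Face Hub.
# The target directory must match the path string configured in the code.
git lfs install
git clone https://huggingface.co/roberta-base \
    /hits/basement/nlp/liuwi/resources/pretrained_models/roberta-base
```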
For PDTB 2.0 and PDTB 3.0
Please refer to `preprocessing.py` in the ConnRel repository.
During this project, we annotated a small number of examples, which you can find in "data/dataset/anno_100".
For the Gum dataset
Since the Gum dataset is publicly available, we release the processed corpus in "data/dataset/gum7".
For E2I and I2I baselines
Simply run `sh scripts/run_E2I.sh` or `sh scripts/run_I2I.sh`. Please choose the dataset you want to run and comment out the other commands in the shell file.
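For example:
```bash
# Run the E2I baseline
sh scripts/run_E2I.sh

# Or run the I2I baseline
sh scripts/run_I2I.sh
```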
For Two Strategies
1. Prepare predictions and vectors for the cases where the input does and does not contain a connective. You can do this by running `sh scripts/run_kfold_base.sh`.
2. Use noisy filtering and joint training with connectives to improve the E2I baseline. Run `sh scripts/run_filter_joint.sh`.

Please make sure step 1 is finished before running step 2; the full sequence is shown below.
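A minimal run sequence using the scripts named above:
```bash
# Step 1: produce predictions and vectors with and without connectives
sh scripts/run_kfold_base.sh

# Step 2: noisy filtering and joint training with connectives
# (run only after step 1 has finished)
sh scripts/run_filter_joint.sh
```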
You can cite this paper as follows:
@inproceedings{liu-etal-2024-causes,
title = "What Causes the Failure of Explicit to Implicit Discourse Relation Recognition?",
author = "Liu, Wei and
Wan, Stephen and
Strube, Michael",
editor = "Duh, Kevin and
Gomez, Helena and
Bethard, Steven",
booktitle = "Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)",
month = jun,
year = "2024",
address = "Mexico City, Mexico",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2024.naacl-long.150",
pages = "2738--2753",
}