wangchunliu/Chinese-DRS-parsing

Chinese-DRS-parsing

The datasets and code for my ACL-IJCNLP 2021 paper, Input Representations for Parsing Discourse Representation Structures: Comparing English with Chinese. The code is based on the work of Rik van Noord. The English data used in the paper comes from the DRS_parsing repository.

git clone https://github.com/wangchunliu/Chinese-DRS-parsing/

Setup & Data

1. Install Marian (documentation: https://marian-nmt.github.io/docs/):

cd Chinese-DRS-parsing
git clone https://github.com/marian-nmt/marian
cd marian
git checkout b2a945c
# Build
mkdir build
cd build
cmake ..
make -j
cd ../../

2. Get the Chinese DRS data from the Chinese-DRS-data repository:

git clone https://github.com/wangchunliu/Chinese-DRS-data

To tokenise the Chinese data, run: python -m jieba -d ' ' infile > outfile
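If tokenisation needs to happen inside a Python script rather than on the command line, the jieba call can be sketched roughly as follows. This is a minimal sketch, not code from this repository: the function name `tokenise_line` and its `cut` parameter are hypothetical, and `cut` exists only so that the segmenter can be swapped out; by default it falls back to `jieba.cut`.

```python
def tokenise_line(line, cut=None, delimiter=" "):
    """Segment one line of Chinese text and join the tokens with `delimiter`.

    `cut` is a hypothetical hook: any callable that maps a string to an
    iterable of tokens. When omitted, jieba.cut is used (requires jieba).
    """
    if cut is None:
        import jieba  # imported lazily so the helper loads without jieba
        cut = jieba.cut
    # Drop the trailing newline before segmenting, then re-join the tokens.
    return delimiter.join(cut(line.rstrip("\r\n")))


def tokenise_file(infile, outfile, cut=None, delimiter=" "):
    """Tokenise `infile` line by line and write the result to `outfile`."""
    with open(infile, encoding="utf-8") as src, \
            open(outfile, "w", encoding="utf-8") as dst:
        for line in src:
            dst.write(tokenise_line(line, cut=cut, delimiter=delimiter) + "\n")
```

Calling `tokenise_file("infile", "outfile")` should produce output comparable to the command above, assuming jieba's default segmentation mode.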

To apply BPE to the Chinese data, use the commands from https://github.com/rsennrich/subword-nmt.

3. Get the evaluation scripts from the DRS_parsing repository. The scripts in this section import from clf_referee.py, so make sure DRS_parsing/evaluation/ is on your $PYTHONPATH.

git clone https://github.com/RikVN/DRS_parsing
export PYTHONPATH=${PYTHONPATH}:/your/folders/here/DRS_parsing/evaluation/
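A quick way to confirm that the export took effect is to ask Python whether it can locate `clf_referee` without actually importing it. The snippet below is only a sketch; the helper name `module_on_path` is hypothetical and not part of DRS_parsing.

```python
import importlib.util


def module_on_path(name="clf_referee"):
    """Return True if a top-level module called `name` can be found."""
    # find_spec searches sys.path (which includes $PYTHONPATH entries)
    # and returns None when no matching top-level module exists.
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    if module_on_path():
        print("clf_referee found: PYTHONPATH looks correct")
    else:
        print("clf_referee not found: check that DRS_parsing/evaluation/ "
              "is on PYTHONPATH")
```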

Training and Parsing

The script src/marian_scripts/pipeline.sh can be used to run your own experiments. Note that each experiment needs its own config file; config/marian/default_config.sh shows which settings can be overridden to create different experiments.

1. Training a model:

sh ./src/marian_scripts/pipeline.sh config/marian/silver_ci_lstm.sh

2. Parsing raw text:

sh ./src/marian_scripts/parse_raw_text.sh config/marian/silver_ci_lstm.sh $PRETRAINED_MODEL $OUTPUT_FILE $SENT_FILE

3. Evaluating F1 scores:

python ../DRS_parsing/evaluation/counter.py -f1 $CLF_OUTPUT -f2 $GOLD_DEV
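When several parser outputs need to be scored, the evaluation command above can be wrapped in a small Python helper. This is a sketch under assumptions: the function names and the file names in the usage note are hypothetical, and only the `-f1` and `-f2` flags shown above are used.

```python
import subprocess
import sys

# Path used in the evaluation command above; adjust to your checkout.
DEFAULT_COUNTER = "../DRS_parsing/evaluation/counter.py"


def counter_command(clf_output, gold_file, counter_py=DEFAULT_COUNTER):
    """Build the counter.py call for one (parser output, gold file) pair."""
    return [sys.executable, counter_py, "-f1", clf_output, "-f2", gold_file]


def evaluate_all(pairs, counter_py=DEFAULT_COUNTER):
    """Run counter.py once per (parser output, gold file) pair."""
    for clf_output, gold_file in pairs:
        cmd = counter_command(clf_output, gold_file, counter_py)
        print("Running:", " ".join(cmd))
        # check=True raises CalledProcessError if counter.py exits non-zero.
        subprocess.run(cmd, check=True)
```

For example, `evaluate_all([("run1.clf", "dev.gold"), ("run2.clf", "dev.gold")])` would score two hypothetical output files against the same gold file, one after the other.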

About

This repository accompanies the paper "Input Representations for Parsing Discourse Representation Structures: Comparing English with Chinese" (ACL 2021).
