TUPA in the CoNLL 2018 UD Shared Task

TUPA is a transition-based parser originally developed for Universal Conceptual Cognitive Annotation (UCCA), generalized here to parse Universal Dependencies (UD).

This repository contains the version of TUPA used as the submission by the HUJI team to the CoNLL 2018 UD Shared Task:

@InProceedings{hershcovich2018universal,
  author    = {Hershcovich, Daniel  and  Abend, Omri  and  Rappoport, Ari},
  title     = {Universal Dependency Parsing with a General Transition-Based DAG Parser},
  booktitle = {Proc. of CoNLL UD 2018 Shared Task},
  year      = {2018},
  url       = {http://www.cs.huji.ac.il/~danielh/udst2018.pdf}
}

System outputs on development and test treebanks, as well as trained models (including ablation experiments), are available in this release.

For more information, please see the official TUPA code repository.

Requirements

  • Python 3.6+
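
A minimal setup sketch, assuming a fresh virtual environment and the dependencies pinned in the repository's requirements.txt:

python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
pip install .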

Training

Download the UD v2.2 treebanks and extract them (e.g., to ../data/ud-treebanks-v2.2).
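
For example, assuming the downloaded archive is named ud-treebanks-v2.2.tgz (the file name is an assumption; adjust it to match your download):

mkdir -p ../data
tar -xzf ud-treebanks-v2.2.tgz -C ../data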

Run train_conll2018.sh to train a model on each treebank. For example, to train on ../data/ud-treebanks-v2.2/UD_English-EWT, run:

./train_conll2018.sh ../data/ud-treebanks-v2.2/UD_English-EWT

Or, if you have a Slurm cluster, simply run

sbatch --array=1-121 train_conll2018.sh

to train models for all treebanks.
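
The array indices 1-121 correspond to the shared task treebanks. As a rough sketch of how such an array job typically selects its input (this is an assumption about the wrapper's internals, not the actual contents of train_conll2018.sh), each task can pick its treebank by indexing into the directory listing:

TREEBANK=$(ls -d ../data/ud-treebanks-v2.2/UD_* | sed -n "${SLURM_ARRAY_TASK_ID}p")
./train_conll2018.sh "$TREEBANK"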

Parsing

Either download the pre-trained models, or train your own (see above). If you trained your own models, update their suffixes in activate_models_conll2018.sh.

To parse the test treebanks, run run_conll2018.sh. For example:

./run_conll2018.sh ../data/ud-treebanks-v2.2/UD_English-EWT

Or parse all test treebanks using Slurm:

sbatch --array=1-121 run_conll2018.sh

To parse the development treebanks, run:

./run_conll2018.sh ../data/ud-treebanks-v2.2/UD_English-EWT dev

Or parse all development treebanks using Slurm:

sbatch --array=1-121 run_conll2018.sh dev

Evaluation

Either run the models yourself (see above), or download the system outputs.

To get LAS-F1 scores for test treebanks, run:

./eval_conll2018.sh

To get LAS-F1 scores for dev treebanks, run:

./eval_conll2018.sh dev

For evaluation on enhanced dependencies, run:

./eval_enhanced_conll2018.sh
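
The wrapper scripts above score system outputs against the gold treebanks. To score a single pair of files directly, the official CoNLL 2018 evaluation script can be used (assuming you have conll18_ud_eval.py from the shared task; the file names below are placeholders):

python conll18_ud_eval.py -v gold.conllu system.conllu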

Author

  • Daniel Hershcovich

License

This package is licensed under the GPLv3 or later (see LICENSE.txt).
