Skip to content
Code for EMNLP2019 "Target Language-Aware Constrained Inference for Cross-lingual Dependency Parsing"
Python Perl Shell
Branch: master
Clone or download
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Permalink
Type Name Latest commit message Commit time
Failed to load latest commit information.
data2.2
run
src/examples
README.md

README.md

Target Language-Aware Constrained Inference for Cross-lingual Dependency Parsing

This repository contains the source code to reproduce the experiments in the EMNLP 2019 paper Target Language-Aware Constrained Inference for Cross-lingual Dependency Parsing by Tao Meng, Nanyun Peng and Kai-Wei Chang.

  • Abstract

Prior work on cross-lingual dependency parsing often focuses on capturing the commonalities between source and target languages and overlooks the potential of leveraging linguistic properties of the languages to facilitate the transfer. In this paper, we show that weak supervisions of linguistic knowledge for the target languages can improve a cross-lingual graph-based dependency parser substantially. Specifically, we explore several types of corpus linguistic statistics and compile them into corpus-wise constraints to guide the inference process during the test time. We adapt two techniques, Lagrangian relaxation and posterior regularization, to conduct inference with corpus-statistics constraints. Experiments show that the Lagrangian relaxation and posterior regularization inference improve the performances on 15 and 17 out of 19 target languages, respectively. The improvements are especially significant for target languages that have different word order features from the source language.

  • Data

Firstly, you should download the UD Tree Bank data from Universal Dependencies v2 (.conllu files), and multilingual embedding data from FastText (.vec files) and save them in ./data2.2 first. Now in ./data2.2 we only have dummy files of Hebrew(he).

  • Running experiments

Requirements

python == 2.7
pytorch == 0.3.1

WALS settings

To use WALS features to compile constraints, please refer to ./run/run_WALS.sh. The WALS is stored in pickle file WALS_extra.pkl. The model will automatically load the WALS features and compile them into C1,C2,C3 three corpus-wise constraints mentioned in paper.

./run/run_WALS.sh
Alternative arguments:
  --decode [proj/mst]     adding projective constraints or not
  --constraints_method [PR/Lagrange] 
                          algorithms

oracle settings

To use oracle settings, please refer to ./run/run_ratio.sh. It will load constraints in constraints.txt. The exact ratio is stored in ./run/model/constraint

./run/run_ratio.sh
Alternative arguments:
  --decode [proj/mst]     adding projective constraints or not
  --threshold THETA       the margin
You can’t perform that action at this time.