dependency task-type

back to main README

This is a deep biaffine parser, which is used in much the same way as the seq task type. Its one peculiarity is that it reads data from two columns. You only have to define the first column (which should contain the index of the head); the labels are then read automatically from the column directly after it:

1	Champ	_	PROPN	_	_	2	vocative	_	_
2	champ	_	VERB	_	_	0	root	_	_
3	Macaaaamp	_	NOUN	_	_	2	obj	_	_

    "dependency": {
        "task_type": "dependency",
        "column_idx": 6
    }
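To illustrate how a single `column_idx` yields both pieces of information, here is a minimal sketch of the two-column read. It is not the parser's actual reader: the function name `read_heads_and_labels` and the example sentence are hypothetical, and it only assumes tab-separated lines with the head in column `column_idx` and the label in the next column.

```python
def read_heads_and_labels(lines, column_idx=6):
    """Return one (head, label) pair per token line (hypothetical helper)."""
    pairs = []
    for line in lines:
        # Skip blank lines and comment lines, as found in conllu files.
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        head = int(cols[column_idx])      # the column named in the config
        label = cols[column_idx + 1]      # the column directly after it
        pairs.append((head, label))
    return pairs


# Made-up sentence, for illustration only.
sentence = [
    "1\tShe\t_\tPRON\t_\t_\t2\tnsubj\t_\t_",
    "2\treads\t_\tVERB\t_\t_\t0\troot\t_\t_",
    "3\tbooks\t_\tNOUN\t_\t_\t2\tobj\t_\t_",
]
pairs = read_heads_and_labels(sentence, column_idx=6)
print(pairs)
```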

This task type does not support ellipsis (empty nodes) or word splitting (multiword tokens), both of which are included in the standard UD downloads. We therefore include a script that removes these and attaches the left-over words to the dependency structure: scripts/misc/cleanconl.py takes a list of conllu files as arguments and replaces each of them with its cleaned version (warning: this overwrites the original files).
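The filtering part of such a cleanup can be sketched as follows. This is not the code of the included script, only an illustration under the assumption that the unsupported lines are the ones whose ID column is a range (such as `1-2`) or a decimal (such as `2.1`), as in the CoNLL-U format. The name `drop_unsupported_lines` is hypothetical, and re-attaching left-over words is not covered here.

```python
def drop_unsupported_lines(token_lines):
    """Keep only regular word lines of one sentence (hypothetical helper).

    Lines whose ID contains '-' (multiword token range) or '.' (empty node)
    are removed; comment lines and all other lines are kept unchanged.
    """
    kept = []
    for line in token_lines:
        if line.startswith("#"):
            kept.append(line)
            continue
        token_id = line.split("\t")[0]
        if "-" in token_id or "." in token_id:
            continue
        kept.append(line)
    return kept


# Made-up input, for illustration only.
raw = [
    "1-2\tdon't\t_\t_\t_\t_\t_\t_\t_\t_",
    "1\tdo\t_\tAUX\t_\t_\t0\troot\t_\t_",
    "2\tn't\t_\tPART\t_\t_\t1\tadvmod\t_\t_",
    "2.1\tgo\t_\tVERB\t_\t_\t_\t_\t_\t_",
]
cleaned = drop_unsupported_lines(raw)
```

Unlike the script described above, this sketch returns a new list and never touches files on disk.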

Furthermore, note that the parser does not actually use the word indices that are present in the standard UD format; it just uses the line index from the file.
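One practical consequence is that head values only point at the intended words when each word's ID agrees with its line position. A minimal check, assuming that "line index" means the 1-based position of a token line within its sentence (an interpretation, not something the parser exposes), could look like this. The helper name `ids_match_positions` is hypothetical.

```python
def ids_match_positions(token_lines):
    """Return True if every ID equals its 1-based line position (hypothetical helper)."""
    for position, line in enumerate(token_lines, start=1):
        if line.split("\t")[0] != str(position):
            return False
    return True


# Made-up sentences, for illustration only.
aligned = [
    "1\tShe\t_\tPRON\t_\t_\t2\tnsubj\t_\t_",
    "2\treads\t_\tVERB\t_\t_\t0\troot\t_\t_",
]
misaligned = [
    "1-2\tdon't\t_\t_\t_\t_\t_\t_\t_\t_",
    "1\tdo\t_\tAUX\t_\t_\t0\troot\t_\t_",
]
aligned_ok = ids_match_positions(aligned)
misaligned_ok = ids_match_positions(misaligned)
```

A sentence that fails this check still contains lines that shift the positions, so it should be cleaned before training.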