The tagging performance of DRAGNN is worse than SyntaxNet. #1347

@banyh

System information

  • the top-level directory of the model: /home/banyhong/syntaxnet_wrapper/syntaxnet_wrapper/models/syntaxnet
  • OS Platform: Linux Ubuntu 14.04
  • TensorFlow binary: Linux CPU version, 1.1.0-rc1
  • TensorFlow version: ('v1.1.0-rc1-168-g0054c39', '1.1.0-rc1')
  • Bazel version: 0.4.3

Describe the problem

The tagging performance of DRAGNN is worse than that of SyntaxNet.

I've modified dragnn/tools/parse-to-conll.py so that it also prints token.tag.

I parsed the sentence "Alice drove down the street in her car" with both the SyntaxNet tagger+parser and the DRAGNN parser. SyntaxNet tags "Alice" as NOUN++NNP, but DRAGNN tags it as ADV++RB.

Source code / logs

Modified parse-to-conll.py, lines 227 to 231:

            f.write('%s\t%s\t_\t_\t_\t_\t%d\t%s\t_\t%s\n'%(
                i + 1,
                token.word.encode('utf-8'), head,
                token.label.encode('utf-8'),
                token.tag.encode('utf-8')))

To get DRAGNN parser result:

bazel --output_user_root=bazel_root run -c opt //dragnn/tools:parse-to-conll -- \
    --parser_master_spec=/home/banyhong/syntaxnet_wrapper/syntaxnet_wrapper/models/syntaxnet/dragnn/conll17/English/parser_spec.textproto \
    --parser_checkpoint_file=/home/banyhong/syntaxnet_wrapper/syntaxnet_wrapper/models/syntaxnet/dragnn/conll17/English/checkpoint \
    --parser_resource_dir=/home/banyhong/syntaxnet_wrapper/syntaxnet_wrapper/models/syntaxnet/dragnn/conll17/English \
    --use_gold_segmentation=True \
    --input_file=/home/banyhong/syntaxnet_wrapper/syntaxnet_wrapper/models/syntaxnet/input.conll \
    --inference_beam_size=char_lstm=16,lookahead=16,tagger=64,parser=64 \
    --output_file=/home/banyhong/syntaxnet_wrapper/syntaxnet_wrapper/models/syntaxnet/output.conll

The content in output.conll is:

#Alice drove down the street in her car
1       Alice   _       _       _       _       2       nsubj   _       attribute { name: "fPOS" value: "ADV++RB" }
2       drove   _       _       _       _       0       root    _       attribute { name: "Mood" value: "Imp" } attribute { name: "VerbForm" value: "Fin" } attribute { name: "fPOS" value: "VERB++VB" }
3       down    _       _       _       _       2       compound:prt    _       attribute { name: "fPOS" value: "ADP++RP" }
4       the     _       _       _       _       5       det     _       attribute { name: "Definite" value: "Def" } attribute { name: "PronType" value: "Art" } attribute { name: "fPOS" value: "DET++DT" }
5       street  _       _       _       _       2       obj     _       attribute { name: "Number" value: "Sing" } attribute { name: "fPOS" value: "NOUN++NN" }
6       in      _       _       _       _       8       case    _       attribute { name: "fPOS" value: "ADP++IN" }
7       her     _       _       _       _       8       nmod:poss       _       attribute { name: "Gender" value: "Fem" } attribute { name: "Number" value: "Sing" } attribute { name: "Person" value: "3" } attribute { name: "Poss" value: "Yes" } attribute { name: "PronType" value: "Prs" } attribute { name: "fPOS" value: "PRON++PRP$" }
8       car     _       _       _       _       5       nmod    _       attribute { name: "Number" value: "Sing" } attribute { name: "fPOS" value: "NOUN++NN" }
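The last column above is the raw token.tag string, which makes it hard to compare with SyntaxNet's separate coarse and fine tag columns. A minimal sketch of one way to split it, assuming the string always has the `attribute { name: "..." value: "..." }` shape shown in this output and that fPOS holds the coarse and fine tags joined by `++`. The helper name `parse_tag` is hypothetical and not part of DRAGNN:

```python
import re

# Matches one serialized attribute, e.g.
#   attribute { name: "fPOS" value: "ADV++RB" }
# The format is assumed from the output above, not from a documented spec.
ATTRIBUTE_RE = re.compile(
    r'attribute\s*\{\s*name:\s*"([^"]*)"\s*value:\s*"([^"]*)"\s*\}')


def parse_tag(tag):
    """Split a token.tag string into (coarse tag, fine tag, features)."""
    attrs = dict(ATTRIBUTE_RE.findall(tag))
    # fPOS is assumed to be "<coarse>++<fine>"; fall back to "_" if absent.
    coarse, _, fine = attrs.pop("fPOS", "_++_").partition("++")
    # Remaining attributes become a CoNLL-style "Name=Value|..." string.
    feats = "|".join("%s=%s" % kv for kv in sorted(attrs.items())) or "_"
    return coarse, fine or "_", feats


if __name__ == "__main__":
    alice = 'attribute { name: "fPOS" value: "ADV++RB" }'
    print(parse_tag(alice))
```

With something like this, the tuple could be written into the coarse tag, fine tag, and feature columns of the CoNLL line instead of dumping the whole string into the last column.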

The parsing result of SyntaxNet is:

1       Alice   _       NOUN    NNP     _       2       nsubj   _       _
2       drove   _       VERB    VBD     _       0       ROOT    _       _
3       down    _       ADP     IN      _       2       prep    _       _
4       the     _       DET     DT      _       5       det     _       _
5       street  _       NOUN    NN      _       3       pobj    _       _
6       in      _       ADP     IN      _       2       prep    _       _
7       her     _       PRON    PRP$    _       8       poss    _       _
8       car     _       NOUN    NN      _       6       pobj    _       _
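
To make the disagreement concrete, here is a small sketch that lists the tokens whose tags differ between the two outputs above. The tag lists are transcribed by hand from those outputs, and the function name `tag_differences` is hypothetical. The dependency labels follow different schemes (for example obj vs. pobj), so only the tags are compared:

```python
# (word, coarse tag, fine tag), transcribed from the two outputs above.
DRAGNN = [
    ("Alice", "ADV", "RB"), ("drove", "VERB", "VB"), ("down", "ADP", "RP"),
    ("the", "DET", "DT"), ("street", "NOUN", "NN"), ("in", "ADP", "IN"),
    ("her", "PRON", "PRP$"), ("car", "NOUN", "NN"),
]
SYNTAXNET = [
    ("Alice", "NOUN", "NNP"), ("drove", "VERB", "VBD"), ("down", "ADP", "IN"),
    ("the", "DET", "DT"), ("street", "NOUN", "NN"), ("in", "ADP", "IN"),
    ("her", "PRON", "PRP$"), ("car", "NOUN", "NN"),
]


def tag_differences(a, b):
    """Return (word, tags_a, tags_b) for every token whose tags differ."""
    diffs = []
    for (word, coarse_a, fine_a), (_, coarse_b, fine_b) in zip(a, b):
        if (coarse_a, fine_a) != (coarse_b, fine_b):
            diffs.append((word, coarse_a + "/" + fine_a,
                          coarse_b + "/" + fine_b))
    return diffs


if __name__ == "__main__":
    for word, dragnn_tags, syntaxnet_tags in tag_differences(DRAGNN, SYNTAXNET):
        print("%-6s DRAGNN=%-8s SyntaxNet=%s" % (word, dragnn_tags, syntaxnet_tags))
```

On this sentence the two taggers disagree on "Alice", "drove", and "down". Only "Alice" differs in the coarse tag as well.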
