Post-processing fails if sentence has <number><space><number> #15

Open
iamanigeeit opened this issue Feb 22, 2021 · 1 comment
iamanigeeit commented Feb 22, 2021

Problem
The parser outputs a non-breaking space character (U+00A0) if the input sentence contains \d+ \d+. This causes post-processing to fail with the error penman.DecodeError: Expected ":" or "/" at position XXX

Example .pred file

# ::id 9900
# ::snt @united iCloud it's not there yet -- PLEASE HELP 917 703 1472
# ::tokens ["@united", "iCloud", "it", "'s", "not", "there", "yet", "--", "PLEASE", "HELP", "917\u00a0703\u00a01472"]
# ::lemmas ["@united", "icloud", "it", "be", "not", "there", "yet", "--", "please", "help", "917\u00a0703\u00a01472"]
# ::pos_tags ["VBN", "NN", "PRP", "VBZ", "RB", "RB", "RB", ":", "VB", "NN", "CD"]
# ::ner_tags ["O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "NUMBER"]
# ::abstract_map {}
(c0 / multi-sentence
    :snt1 (c1 / icloud
              :mod (c3 / be-located-at
                       :ARG1 (c7 / it)
                       :ARG2 (c8 / there)
                       :time (c9 / yet)))
    :snt2 (c2 / help-01
              :ARG1 (c4 / you)
              :mode imperative
              :ARG1 (c6 / book
                        :name (c10 / 917 703 1472))))

Note that in the last line, 917 703 1472 contains non-breaking spaces.

./postprocess_2.0.sh sample.txt.pred
multiprocessing.pool.RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/home/perry/anaconda3/envs/stog/lib/python3.6/multiprocessing/pool.py", line 119, in worker
    result = (True, func(*args, **kwds))
  File "/home/perry/anaconda3/envs/stog/lib/python3.6/multiprocessing/pool.py", line 44, in mapstar
    return list(map(*args))
  File "/home/perry/PycharmProjects/phd/AMR-gs-master/stog/data/dataset_readers/amr_parsing/postprocess/postprocess.py", line 16, in postprocess2
    for amr in nr.restore_file(file_path):
  File "/home/perry/PycharmProjects/phd/AMR-gs-master/stog/data/dataset_readers/amr_parsing/postprocess/node_restore.py", line 19, in restore_file
    for amr in AMRIO.read(file_path):
  File "/home/perry/PycharmProjects/phd/AMR-gs-master/stog/data/dataset_readers/amr_parsing/io.py", line 48, in read
    amr.graph = AMRGraph.decode(' '.join(graph_lines))
  File "/home/perry/PycharmProjects/phd/AMR-gs-master/stog/data/dataset_readers/amr_parsing/amr.py", line 640, in decode
    _graph = amr_codec.decode(raw_graph_string)
  File "/home/perry/anaconda3/envs/stog/lib/python3.6/site-packages/penman.py", line 172, in decode
    span, data = self._decode_penman_node(s)
  File "/home/perry/anaconda3/envs/stog/lib/python3.6/site-packages/penman.py", line 405, in _decode_penman_node
    span, data = self._decode_penman_node(s, pos=pos)
  File "/home/perry/anaconda3/envs/stog/lib/python3.6/site-packages/penman.py", line 405, in _decode_penman_node
    span, data = self._decode_penman_node(s, pos=pos)
  File "/home/perry/anaconda3/envs/stog/lib/python3.6/site-packages/penman.py", line 405, in _decode_penman_node
    span, data = self._decode_penman_node(s, pos=pos)
  File "/home/perry/anaconda3/envs/stog/lib/python3.6/site-packages/penman.py", line 427, in _decode_penman_node
    raise DecodeError('Expected ":" or "/"', string=s, pos=pos)
penman.DecodeError: Expected ":" or "/" at position 364

Workaround
Check for non-breaking spaces and replace them with - or _ in the output.
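
A minimal sketch of this workaround, assuming the .pred file is UTF-8 text and that only the graph lines (those not starting with #) need changing. The helper names sanitize_pred_lines and sanitize_pred_file are hypothetical and not part of this repo, and I have not checked how the rest of the pipeline treats a node such as 917_703_1472.

```python
NBSP = "\u00a0"  # the non-breaking space that the parser emits between digit groups


def sanitize_pred_lines(lines, replacement="_"):
    """Replace non-breaking spaces in graph lines only.

    Lines starting with '#' (the '# ::id', '# ::snt', ... metadata) are
    returned unchanged. Hypothetical helper, not part of the repo.
    """
    cleaned = []
    for line in lines:
        if not line.lstrip().startswith("#"):
            line = line.replace(NBSP, replacement)
        cleaned.append(line)
    return cleaned


def sanitize_pred_file(in_path, out_path, replacement="_"):
    """Read a .pred file, sanitize its graph lines, and write the result."""
    with open(in_path, encoding="utf-8") as f:
        lines = f.readlines()
    with open(out_path, "w", encoding="utf-8") as f:
        f.writelines(sanitize_pred_lines(lines, replacement))
```

Something like sanitize_pred_file("sample.txt.pred", "sample.clean.pred") could then be run before ./postprocess_2.0.sh, passing the cleaned file to the script instead of the original.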

@goodmami

While this repo uses an old version of penman, this issue also affects the latest version. I've created goodmami/penman#99 to track the issue there.
