Traceback (most recent call last):
  File "hoge.py", line 99, in <module>
    knp.parse("!!")
  File "/root/.local/share/virtualenvs/app-4PlAip0Q/lib/python3.8/site-packages/pyknp/knp/knp.py", line 70, in parse
    return self.parse_juman_result(juman_str, juman_format)
  File "/root/.local/share/virtualenvs/app-4PlAip0Q/lib/python3.8/site-packages/pyknp/knp/knp.py", line 97, in parse_juman_result
    return BList(knp_lines, self.pattern, juman_format)
  File "/root/.local/share/virtualenvs/app-4PlAip0Q/lib/python3.8/site-packages/pyknp/knp/blist.py", line 39, in __init__
    self.parse(spec)
  File "/root/.local/share/virtualenvs/app-4PlAip0Q/lib/python3.8/site-packages/pyknp/knp/blist.py", line 116, in parse
    synnodes = SynNodes(string)
  File "/root/.local/share/virtualenvs/app-4PlAip0Q/lib/python3.8/site-packages/pyknp/knp/syngraph.py", line 21, in __init__
    self.tagids = [int(n) for n in tagid.split(',')]
  File "/root/.local/share/virtualenvs/app-4PlAip0Q/lib/python3.8/site-packages/pyknp/knp/syngraph.py", line 21, in <listcomp>
    self.tagids = [int(n) for n in tagid.split(',')]
ValueError: invalid literal for int() with base 10: '!!'
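The last frame can be reproduced in isolation. The following is only a minimal sketch, not pyknp's code: `parse_tag_ids` and `try_parse_tag_ids` are hypothetical helpers, the first mirroring the list comprehension shown at syngraph.py line 21, and the sketch assumes the string that reached `tagid` was the raw token `!!` rather than a comma-separated list of integers.

```python
def parse_tag_ids(tagid):
    # Same expression as the failing line in the traceback:
    # split on commas and convert every piece to int.
    return [int(n) for n in tagid.split(',')]


def try_parse_tag_ids(tagid):
    # Hypothetical defensive variant: report a non-numeric field
    # as None instead of letting the ValueError propagate.
    try:
        return parse_tag_ids(tagid)
    except ValueError:
        return None


print(parse_tag_ids("0,1"))     # a well-formed ID list parses normally
print(try_parse_tag_ids("!!"))  # the input from the report is rejected
```

Whether swallowing the error is appropriate depends on why `!!` ends up in that field at all; the sketch only shows that the exception comes from `int()` receiving the surface string.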
I ran into a similar issue using Juman++ (latest release) and pyknp from pip.
Traceback (most recent call last):
  File "./benchmark-jumanpp.py", line 10, in <module>
    for word in tok.analysis(line.strip()).mrph_list():
  File "/mnt/pool/code/tokenizer-benchmark/env/lib/python3.8/site-packages/pyknp/juman/juman.py", line 89, in analysis
    return self.juman(input_str, juman_format)
  File "/mnt/pool/code/tokenizer-benchmark/env/lib/python3.8/site-packages/pyknp/juman/juman.py", line 76, in juman
    result = MList(self.juman_lines(input_str), juman_format)
  File "/mnt/pool/code/tokenizer-benchmark/env/lib/python3.8/site-packages/pyknp/juman/mlist.py", line 29, in __init__
    mrph = Morpheme(line, mid, juman_format)
  File "/mnt/pool/code/tokenizer-benchmark/env/lib/python3.8/site-packages/pyknp/juman/morpheme.py", line 79, in __init__
    self._parse_spec(spec.strip("\n"))
  File "/mnt/pool/code/tokenizer-benchmark/env/lib/python3.8/site-packages/pyknp/juman/morpheme.py", line 142, in _parse_spec
    self.hinsi_id = int(parts[4])
ValueError: invalid literal for int() with base 10: 'input'
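Here the fifth space-separated field of an output line was the word `input` rather than a numeric part-of-speech ID, which suggests that a line that is not a morpheme line reached the parser. As a rough illustration only, the check below tests whether a line has digits in the positions where a JUMAN-style morpheme line carries its ID fields. `looks_like_morpheme_line` is a hypothetical helper, not part of pyknp; the naive `split(" ")` ignores quoting, and the `stray` line is invented purely so that its fifth field is `input`.

```python
def looks_like_morpheme_line(line):
    """Return True if `line` has digits where a JUMAN-style morpheme line has IDs."""
    parts = line.rstrip("\n").split(" ")
    if len(parts) < 11:
        return False
    # Indices 4, 6, 8, 10: POS ID, sub-POS ID,
    # conjugation type ID, conjugation form ID.
    return all(parts[i].isdigit() for i in (4, 6, 8, 10))


# A typical morpheme line in JUMAN output format.
morpheme = 'すもも すもも すもも 名詞 6 普通名詞 1 * 0 * 0 "代表表記:李/すもも"'
# Invented placeholder line; only the fifth field ("input") matters.
stray = "w1 w2 w3 w4 input w6 w7 w8 w9 w10 w11"

print(looks_like_morpheme_line(morpheme))
print(looks_like_morpheme_line(stray))
```

Logging the raw lines that fail such a check would show what Juman++ actually emitted for the failing input.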
This problem seems to be fixed now.
I tested in the following environments and confirmed that pyknp works well.
JUMAN++ 1.02 / 2.0.0-rc3
KNP current HEAD of master ku-nlp/knp@2ad4f6d / 4.2
pyknp current HEAD of master 38469c8 / latest version from pip (0.4.5)
Python 3.7.9
OS macOS Big Sur (11.0.1) / Ubuntu 20.04.1
Minimal reproduction code
KNP output
JUMAN++ 1.02
KNP current HEAD of master ku-nlp/knp@165d699
pyknp current HEAD of master 6ba00ea
Python 3.8.5
OS Ubuntu 20.04