Returns only top result with NBest option #36

itmammoth · 2019-10-30T07:03:07Z

Hi,
I am trying to parse some words with the NBest option, but it doesn't seem to work correctly.

mecab 0.996
mecab-python3 0.996.2
python 3.7.4

The mecab command returns two results:

$ echo こんにゃく粉 | mecab -N2
こんにゃく粉	名詞,一般,*,*,*,*,こんにゃく粉,コンニャクコ,コンニャクコ
EOS
こんにゃく	名詞,一般,*,*,*,*,こんにゃく,コンニャク,コンニャク
粉	名詞,接尾,一般,*,*,*,粉,コ,コ
EOS

mecab-python3, on the other hand, returns only the top result:

import MeCab

tagger = MeCab.Tagger('-N2')
print(tagger.parse('こんにゃく粉'))

こんにゃく粉     名詞,一般,*,*,*,*,こんにゃく粉,コンニャクコ,コンニャクコ
EOS


polm · 2019-11-06T06:00:26Z

The -N option doesn't work that way when initializing the tagger. You need to use one of the NBest-related methods in the API instead. Example:

tagger.parseNBest(2, 'こんにゃく粉')
# =>  'こんにゃく 粉 \nこんにゃく 粉 \n'
# note: I get the same results for the top two, probably because I'm using Unidic.
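Since parseNBest returns all candidates in a single string, it can help to split that string back into one entry per candidate. The following is only a rough sketch: `split_nbest` and `nbest_candidates` are hypothetical helper names, not part of mecab-python3, and the splitting assumes the default output format where each candidate ends with an `EOS` line (as in the command-line output earlier in this thread). Wakati-style output like the result shown above has no `EOS` lines and would need a plain line split instead.

```python
from typing import List


def split_nbest(raw: str) -> List[List[str]]:
    """Split concatenated N-best output into one list of token lines per candidate.

    Assumption: default output format, where every candidate ends with 'EOS'.
    """
    candidates: List[List[str]] = []
    current: List[str] = []
    for line in raw.splitlines():
        if line == "EOS":
            # An EOS line closes the current candidate.
            candidates.append(current)
            current = []
        elif line:
            current.append(line)
    return candidates


def nbest_candidates(text: str, n: int, tagger=None) -> List[List[str]]:
    """Hypothetical wrapper: call parseNBest and split the result per candidate."""
    if tagger is None:
        # Imported lazily so split_nbest can be used without MeCab installed.
        import MeCab
        tagger = MeCab.Tagger()
    return split_nbest(tagger.parseNBest(n, text))
```

With MeCab and a dictionary installed, `nbest_candidates('こんにゃく粉', 2)` should give up to two candidate analyses. The exact tokens depend on the dictionary in use.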

In general, be careful: MeCab's command line works slightly differently from its C API. For example, on the command line every newline is treated as a sentence boundary, while in the API newlines are just whitespace.
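If per-line behavior closer to the command line is wanted from the API, one option is to split the input on newlines and parse each line separately. This is a minimal sketch under that assumption: `parse_per_line` is a hypothetical helper name, and skipping blank lines is a choice made here, not something the command line is claimed to do.

```python
from typing import Callable, List


def parse_per_line(text: str, parse: Callable[[str], str]) -> List[str]:
    """Parse each non-blank line on its own, treating newlines as sentence boundaries.

    `parse` is any function that takes a string and returns the parsed result,
    for example the bound method MeCab.Tagger().parse.
    """
    return [parse(line) for line in text.splitlines() if line.strip()]
```

Usage would look like `parse_per_line(text, MeCab.Tagger().parse)`, which returns one parsed result per input line instead of a single result for the whole text.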

itmammoth · 2019-11-08T04:01:21Z

Thanks @polm !
You saved my day.

itmammoth closed this as completed Nov 8, 2019
