In conclusion, we think this case is correct and not a bug.
Using negative integer values within the range of a 2-byte integer as cost values conforms to the IPADIC specification.
Also, the cost value assigned to each word is not necessarily based on how frequently the word is observed in the real world or in a corpus.
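To make the 2-byte range concrete, here is a minimal Python sketch. It assumes the cost is stored as a signed 16-bit integer, and the sample values are made up for illustration, not taken from the dictionary. It shows that a negative value inside that range survives a round trip unchanged, and what a genuine overflow would look like by comparison.

```python
import struct

# Range of a signed 2-byte (16-bit) integer.
INT16_MIN = -(2 ** 15)      # -32768
INT16_MAX = 2 ** 15 - 1     # 32767


def fits_int16(cost: int) -> bool:
    """Return True if the cost is representable as a signed 16-bit integer."""
    return INT16_MIN <= cost <= INT16_MAX


def roundtrip_int16(cost: int) -> int:
    """Pack the cost as a signed little-endian 16-bit value and read it back."""
    return struct.unpack("<h", struct.pack("<h", cost))[0]


def wrap_to_int16(value: int) -> int:
    """Simulate two's-complement wraparound, i.e. what an overflow would produce."""
    return (value + 2 ** 15) % 2 ** 16 - 2 ** 15


# Hypothetical cost values, for illustration only.
sample_costs = [-1500, 0, 4200]

for cost in sample_costs:
    # In-range values, negative or not, are stored and read back unchanged.
    print(cost, fits_int16(cost), roundtrip_int16(cost))

# A value outside the range would wrap around instead.
print(wrap_to_int16(40000))
```

An in-range negative cost on its own is therefore not evidence of overflow; overflow would only be a concern if the value computed before storage fell outside the 16-bit range.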
Chapter 5 (p. 79 onward) of the following book will help you understand how the different cost values are used in the analysis process.
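As a rough illustration of how costs can enter the analysis, the sketch below assumes the common minimum-cost formulation in which a candidate segmentation is scored by summing word costs and connection costs, and the lowest total wins. This is a simplified toy model, not the analyzer's actual implementation; the surfaces, context labels, and all cost values are hypothetical.

```python
def path_cost(path, connection_cost):
    """Sum word costs and connection costs along one candidate segmentation.

    `path` is a list of (surface, context_label, word_cost) tuples.
    `connection_cost` maps (left_label, right_label) to an integer cost.
    """
    total = 0
    prev_label = "BOS"  # beginning-of-sentence marker
    for _surface, label, word_cost in path:
        total += connection_cost.get((prev_label, label), 0) + word_cost
        prev_label = label
    # Close the path with the end-of-sentence connection.
    total += connection_cost.get((prev_label, "EOS"), 0)
    return total


def best_path(candidates, connection_cost):
    """Pick the candidate segmentation with the lowest total cost."""
    return min(candidates, key=lambda p: path_cost(p, connection_cost))


# Hypothetical toy data: two ways to segment the same input "ab".
connection_cost = {
    ("BOS", "N"): 50,
    ("N", "P"): -100,
    ("N", "EOS"): 20,
    ("P", "EOS"): 10,
}
one_word = [("ab", "N", -200)]
two_words = [("a", "N", 300), ("b", "P", 100)]

print(path_cost(one_word, connection_cost))
print(path_cost(two_words, connection_cost))
print(best_path([one_word, two_words], connection_cost))
```

Under this model only the relative totals matter, so a negative word cost simply makes paths through that word more preferred than they would otherwise be; the sign carries no special meaning by itself.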
First of all, thanks for the great database.
Motivation
I found that some words in the data are assigned negative costs.
Costs are lower for more frequent words, but the examples above do not seem frequent enough to be assigned such a low cost. I suspect this could be the result of an integer overflow or a sorting issue.
Goal
I would like to know:
(1) if this is a correct/intended result or a bug
(2) if correct/intended, how negative costs should be interpreted.
Can someone help me with this?