Incorrect list of words from Kanji to ON readings in words #693

Closed
andrejrenard opened this Issue Jul 25, 2016 · 2 comments

andrejrenard commented Jul 25, 2016

Try '政' with the on'yomi 'しょう'.

The first word listed is '郵政省' (ゆうせいしょう). The reading 'しょう' does occur in it, but it belongs to '省'; '政' is read 'せい' here.

@mvysny mvysny added the bug label Aug 3, 2016

Owner

mvysny commented Aug 3, 2016

Hi Andrej, thank you for letting me know. Unfortunately, this is not easy to fix: the JMDict data does not contain readings for the individual kanji in a word. Aedict can therefore only guess the readings, and in some cases it will guess incorrectly. However, there may be something I can do: perhaps I can post-process the Lucene results to actually verify the reading.
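
To illustrate what such a post-processing check might look like, here is a minimal sketch in Python. It is not Aedict's actual code. The function name `could_be_reading_of` and the positional rule are assumptions made for illustration: since every character of a word needs at least one kana of the word's reading, a match of the searched reading that leaves too few kana for the characters before or after the kanji can be discarded.

```python
def could_be_reading_of(word: str, word_reading: str, kanji: str, target: str) -> bool:
    """Return False only when `target` cannot be the reading of `kanji` in `word`.

    Hypothetical positional filter: no per-kanji reading data is used.
    """
    n = len(word_reading)
    for pos, ch in enumerate(word):
        if ch != kanji:
            continue
        before = pos                   # characters preceding the kanji
        after = len(word) - pos - 1    # characters following the kanji
        start = word_reading.find(target)
        while start != -1:
            end = start + len(target)
            # Every other character needs at least one kana of the reading;
            # a first/last kanji must sit exactly at the start/end.
            fits_before = (start == 0) if before == 0 else (start >= before)
            fits_after = (end == n) if after == 0 else (n - end >= after)
            if fits_before and fits_after:
                return True
            start = word_reading.find(target, start + 1)
    return False


# 郵政省: しょう only occurs at the very end, which leaves nothing for 省.
print(could_be_reading_of("郵政省", "ゆうせいしょう", "政", "しょう"))
# 重商政策: しょう sits in a plausible position, so this filter cannot reject it.
print(could_be_reading_of("重商政策", "じゅうしょうせいさく", "政", "しょう"))
```

A filter of this kind removes only matches that are impossible by position, so it can still let through words in which another kanji carries the searched reading.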

Owner

mvysny commented Aug 4, 2016

Fixed: the algorithm now removes obvious misreadings. However, without full kanji/reading data for a particular word, there is unfortunately no way to remove 重商政策: じゅうしょうせいさく... I'm not sure whether I should complicate the algorithm further by performing a Kanjidic lookup for possible readings... Closing as fixed for now; please reopen if there are any further glaring issues.
Fixed in Aedict 3.39.24
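
For reference, the Kanjidic-style lookup considered above could be sketched roughly as follows. This is an illustration under stated assumptions, not what Aedict does: `SAMPLE_READINGS` is a small hand-written table standing in for real Kanjidic data, and `kanji_takes_reading` is a hypothetical helper that tries to split the word's reading across its characters using only the listed candidate readings.

```python
# Hypothetical sample standing in for a Kanjidic lookup (on'yomi only, in hiragana).
SAMPLE_READINGS = {
    "郵": {"ゆう"},
    "政": {"せい", "しょう"},
    "省": {"せい", "しょう"},
    "重": {"じゅう", "ちょう"},
    "商": {"しょう"},
    "策": {"さく"},
}


def kanji_takes_reading(word, word_reading, kanji, target, readings=SAMPLE_READINGS):
    """True if some split of `word_reading` over `word` gives `kanji` the reading `target`."""

    def search(i, j, hit):
        # i: index into word, j: index into word_reading,
        # hit: whether `kanji` has already been assigned `target`.
        if i == len(word):
            return hit and j == len(word_reading)
        ch = word[i]
        # Characters missing from the table (e.g. kana) must match themselves.
        for r in readings.get(ch, {ch}):
            if word_reading.startswith(r, j):
                if search(i + 1, j + len(r), hit or (ch == kanji and r == target)):
                    return True
        return False

    return search(0, 0, False)


print(kanji_takes_reading("重商政策", "じゅうしょうせいさく", "政", "しょう"))
print(kanji_takes_reading("重商政策", "じゅうしょうせいさく", "政", "せい"))
```

With per-kanji candidates available, 重商政策 no longer matches a search for 政 read as しょう, because しょう can only be assigned to 商. A real implementation would also have to handle sound changes such as rendaku and gemination, which this sketch ignores.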

@mvysny mvysny closed this Aug 4, 2016
