in kanji view, aedict should list a few similar kanjis #788
In the kanji view, aedict could have a "not to be confused with" section to help people learn their kanjis and spot the differences between similar ones. For example, 副 should not be confused with 福.
Here is an example of a line from strokeEditDistance.csv:
This means that 天 is very similar to 夫, with a "distance" of 1. The value is not really a distance, though: the higher it is, the more similar the kanjis are. On the same line, we can see that 天 is also similar to 大, but with a lower "distance" of 0.75.
I don't think aedict needs to show the distance to the user, as it is a more or less arbitrary number, but for each kanji it should list the similar kanjis in the same order (from the most similar to the least similar).
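To make the idea concrete, here is a rough sketch of how such a line could be turned into a "not to be confused with" list. The line layout is an assumption on my part (whitespace-separated tokens: the kanji first, then alternating similar-kanji / score pairs), and the function names `parse_similarity_line` and `not_to_be_confused_with` are made up for illustration. The real CSV may well be laid out differently, so treat this as a starting point only.

```python
def parse_similarity_line(line):
    """Parse one line into (kanji, [(similar_kanji, score), ...]).

    ASSUMED layout (not confirmed here): whitespace-separated tokens,
    the kanji first, then alternating similar-kanji / score pairs.
    """
    tokens = line.split()
    kanji, rest = tokens[0], tokens[1:]
    # Walk the remaining tokens two at a time: (similar kanji, score).
    pairs = [(rest[i], float(rest[i + 1])) for i in range(0, len(rest) - 1, 2)]
    return kanji, pairs


def not_to_be_confused_with(pairs):
    """Return similar kanjis ordered from most to least similar, scores hidden."""
    ranked = sorted(pairs, key=lambda pair: pair[1], reverse=True)
    return [similar for similar, _score in ranked]


# Scores taken from the example above: 夫 scores 1, 大 scores 0.75.
kanji, pairs = parse_similarity_line("天 大 0.75 夫 1")
print(kanji, not_to_be_confused_with(pairs))
```

Sorting explicitly means the displayed order does not depend on the file already being ordered, and the score itself never reaches the user.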
I am not sure you need both files. The yehAndLiRadical file is based on radicals rather than strokes, but the two often overlap, and strokeEditDistance really gives kanjis that look alike, even if they don't share radicals. Here is an example from yehAndLiRadical and strokeEditDistance respectively:
So I would recommend only adding strokeEditDistance, but it's up to you!
And don't forget to add a link to that page from aedict :)
Keep up the good work!
Hey, that's a really nifty feature, thanks! I agree that the
Let's thus start by using
Going through the file, I think 0.7 would be more acceptable, for example
Some kanjis are relevant even with a distance of 0.5 I think. For example:
But that would include a lot of false positives. I guess you should try a value and see, maybe gather user feedback. In my app, I didn't put a limit, I included all of them, but it doesn't have the same goal.
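If it helps with experimenting, the cutoff could be kept as a single tunable parameter, something like the sketch below. The name `filter_similar` and the `min_score` parameter are hypothetical, and the only scores used are the two quoted earlier for 天 (夫 at 1 and 大 at 0.75). The 0.8 cutoff is there purely to show an entry being dropped; nobody suggested that value.

```python
def filter_similar(pairs, min_score=None):
    """Keep (kanji, score) pairs whose score is at least min_score.

    min_score=None means "no limit": every pair is kept.
    Remember that a higher score means more similar.
    """
    if min_score is None:
        return list(pairs)
    return [(kanji, score) for kanji, score in pairs if score >= min_score]


# Scores for 天 as quoted earlier in this thread.
pairs = [("夫", 1.0), ("大", 0.75)]

print(filter_similar(pairs, min_score=0.7))  # both pass a 0.7 cutoff
print(filter_similar(pairs, min_score=0.8))  # illustrative cutoff: only 夫 remains
print(filter_similar(pairs))                 # no limit: everything is kept
```

With the cutoff isolated like this, trying 0.7, then 0.5, and comparing user feedback only means changing one value.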