Confusingly similar ideographic characters can be tricky for learners of Japanese and Chinese. This web app helps by visualising several ways to compute a distance between two kanji, connecting characters that are deemed 'close together' and are thus presumably hard to tell apart.
At the center, we can see the kanji we are focusing on. On the periphery, we can see its closest neighbors, each placed at a distance that reflects its similarity to the centered kanji. The peripheral kanji are also arranged roughly by how similar they are amongst themselves, with stronger similarity indicated by thicker lines.
We intuitively understand that some kanji are more similar than others, for example a pair like (森, 林) sharing the component 木. The data visualised here takes into account the nested component structure of the kanji, including the relative position and appearance of the components, to assign a similarity score, using a mathematical framework called Optimal Transport. Here is our preprint detailing the method.
You can also select the Bag-of-Radicals Distance by Yeh and Li (2002), which uses the fraction of shared radicals, and the Stroke Edit Distance by Yencken and Baldwin (2008), which uses the number of strokes to add and remove to reach one kanji from the other. We also recommend our open toolkit for computing all kinds of kanji-related things in the programming language R, available here.
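To make the two alternative distances more concrete, here is a minimal Python sketch of the ideas as described above. It is not the code behind this website and not the exact formulation of either paper. The function names, the normalisation by the larger radical count, and the stroke codes (`h`, `v`, `l`, `r`) are assumptions chosen purely for illustration.

```python
from collections import Counter


def bag_of_radicals_distance(radicals_a, radicals_b):
    """Return 1 minus the fraction of shared radicals.

    Assumption: radicals are counted with multiplicity, and the shared
    count is divided by the size of the larger bag.
    """
    bag_a, bag_b = Counter(radicals_a), Counter(radicals_b)
    shared = sum((bag_a & bag_b).values())  # multiset intersection
    largest = max(sum(bag_a.values()), sum(bag_b.values()))
    return 1.0 - shared / largest if largest else 0.0


def stroke_edit_distance(strokes_a, strokes_b):
    """Count the strokes to remove and add to turn one sequence into the other.

    Computed from the longest common subsequence (LCS): every stroke
    outside the LCS must be either removed or added.
    """
    rows, cols = len(strokes_a), len(strokes_b)
    lcs = [[0] * (cols + 1) for _ in range(rows + 1)]
    for i in range(rows):
        for j in range(cols):
            if strokes_a[i] == strokes_b[j]:
                lcs[i + 1][j + 1] = lcs[i][j] + 1
            else:
                lcs[i + 1][j + 1] = max(lcs[i][j + 1], lcs[i + 1][j])
    common = lcs[rows][cols]
    return (rows - common) + (cols - common)


# Hypothetical decompositions: 森 as three 木, 林 as two 木.
mori_radicals = ["木", "木", "木"]
hayashi_radicals = ["木", "木"]
print(bag_of_radicals_distance(mori_radicals, hayashi_radicals))

# Hypothetical stroke codes for 木: horizontal, vertical, left, right.
tree_strokes = ["h", "v", "l", "r"]
print(stroke_edit_distance(tree_strokes * 3, tree_strokes * 2))  # 4
```

Under these assumptions, 森 and 林 share two of at most three radicals, giving a bag-of-radicals distance of about 0.33, and four strokes (one 木) have to be removed to get from 森 to 林. Neither measure looks at where the components sit within the character, which is the information the Optimal Transport approach additionally uses.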
This website is brought to you by Lennart Finke and Dominic Schuhmacher from the Spatial Stochastics group at the University of Göttingen. It uses d3.js, licensed under ISC, and Bulma, licensed under MIT. The website content is licensed under MIT and you can see the source code here.
