Number of Chinese character components #2

lancejpollard · 2019-01-12T01:32:10Z

Hi there, this looks like an interesting project. I'm wondering if it involves coming up with a set of primitives/components/radicals/etc. that are the building blocks of all Chinese characters. I'm interested to know if you found a system that handles every character, or if you still encounter new characters that throw the framework for a loop, that is, characters the framework can't account for because of some new stroke or character component.

I'd also be interested to know of a list of such components if available. I'm not sure I see them included in the data directory.

Thank you.

lancejpollard · 2019-01-12T11:38:29Z

Wondering if you have somewhere also that describes any math behind it, such as how you know that subdividing the grid will allow for covering all possible ways a stroke might be on a Chinese character. I'd be really interested to know how that works.

LingDong- · 2019-01-13T11:06:12Z

Hi there! Thanks for your interest in the project.

Yes, most characters can be broken down into a limited set of primitive components. By primitive component I mean a character whose definition does not contain that of any other character.

If a new character cannot be composed of existing primitive components, you can think of that character as a primitive component itself.

Most of these components are defined at the beginning of the rrpl.json file, but they're not deliberately organized yet. You can catch them with a regex such as "[123456780\-\|\(\)]+".
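As a rough sketch of how that regex could be applied: the snippet below assumes rrpl.json is a flat object mapping each character to a definition string, and treats a definition as primitive when the whole string matches the regex. Both are assumptions on my part, not something stated above. The helper name `primitive_components` and the sample definitions are made up for illustration and are not real entries from the file.

```python
import re

# The regex quoted in the comment above, unchanged.
PRIMITIVE = re.compile(r"[123456780\-\|\(\)]+")


def primitive_components(definitions):
    """Return the keys whose definition consists only of the regex's characters.

    Assumption: `definitions` maps a character to its definition string.
    """
    return [ch for ch, d in definitions.items() if PRIMITIVE.fullmatch(d)]


# Invented placeholder definitions, NOT real rrpl.json entries.
sample = {
    "木": "(1|2)",      # only digits and operators: counted as primitive
    "林": "(木)-(木)",  # refers to another character: not primitive
}

print(primitive_components(sample))  # ['木']
```

If the file really is a flat JSON object, the same function could be fed the result of `json.load` on it; I have not checked that against the actual file layout.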

If you would just like to see a list of common radicals in the Chinese language, check out https://zh.wikipedia.org/wiki/部首#康熙部首讀音表; they're in the 2nd column of the table.

The way the code subdivides the grid is by counting the number of "parallel" components. For example, in (A)-(B), components A and B each get 50% of the horizontal space, while in (A)-(B)-(C), the components each get 1/3 of the horizontal space, and in ((A)-(B))-(C), A and B each get 25% and C gets 50%.
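To make the arithmetic concrete, here is a minimal sketch of that subdivision rule. It is not the project's actual code: it only handles the `-` operator from the examples above, it assumes every leaf is a plain label such as `A`, and the function names (`split_top_level`, `strip_parens`, `horizontal_shares`) are hypothetical.

```python
from fractions import Fraction


def split_top_level(expr, sep="-"):
    """Split expr on sep, ignoring separators nested inside parentheses."""
    parts, depth, start = [], 0, 0
    for i, ch in enumerate(expr):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == sep and depth == 0:
            parts.append(expr[start:i])
            start = i + 1
    parts.append(expr[start:])
    return parts


def strip_parens(expr):
    """Drop outer parentheses, but only while they wrap the whole expression."""
    while expr.startswith("(") and expr.endswith(")"):
        depth = 0
        for i, ch in enumerate(expr):
            if ch == "(":
                depth += 1
            elif ch == ")":
                depth -= 1
                if depth == 0 and i < len(expr) - 1:
                    return expr  # first "(" closes early, e.g. "(A)-(B)"
        expr = expr[1:-1]
    return expr


def horizontal_shares(expr, share=Fraction(1)):
    """Give each of the n parallel parts 1/n of the parent's width, recursively."""
    expr = strip_parens(expr)
    parts = split_top_level(expr)
    if len(parts) == 1:
        return {expr: share}  # leaf component
    result = {}
    for part in parts:
        for name, s in horizontal_shares(part, share / len(parts)).items():
            result[name] = result.get(name, Fraction(0)) + s
    return result


for layout in ["(A)-(B)", "(A)-(B)-(C)", "((A)-(B))-(C)"]:
    print(layout, {k: str(v) for k, v in horizontal_shares(layout).items()})
```

Run on the three layouts from the explanation, this gives 1/2 and 1/2, then 1/3 each, then 1/4, 1/4 and 1/2, matching the percentages quoted above.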

I hope my explanation is helpful!

lancejpollard closed this as completed Jan 13, 2019
