ideas on enhancement #11

Open
farteryhr opened this issue Dec 21, 2019 · 0 comments

Some machine learning could be used to adjust the grid size of components when they are composed,
though I assume that is clearly future work.

About decomposition:
https://www.babelstone.co.uk/CJK/index.html -> ids.txt
There is an actively maintained dataset in IDS (Ideographic Description Sequence) format there; I'm not sure whether you are already using it. The entries could be parsed easily and transformed directly into your format, though I would expect quite a few of them to need hand correction. Visualizing them with this project may also help to identify errors.
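
To illustrate the parsing step, here is a minimal sketch of turning a single IDS string into a tree. It only handles the twelve classic Ideographic Description Characters (U+2FF0 to U+2FFB) and assumes the IDS string has already been extracted from the line; the column layout of ids.txt and this project's target format are not assumed here. The names `Node`, `IDC_ARITY` and `parse_ids` are made up for the example.

```python
from dataclasses import dataclass, field

# Arity of the twelve classic Ideographic Description Characters.
# U+2FF2 and U+2FF3 take three operands; the rest take two.
IDC_ARITY = {chr(cp): 2 for cp in range(0x2FF0, 0x2FFC)}
IDC_ARITY[chr(0x2FF2)] = 3
IDC_ARITY[chr(0x2FF3)] = 3


@dataclass
class Node:
    """One node of a decomposition tree (hypothetical structure)."""
    symbol: str                      # an IDC operator or a component character
    children: list = field(default_factory=list)


def _parse(ids: str, pos: int):
    """Parse one operand starting at `pos`; return (node, next position)."""
    if pos >= len(ids):
        raise ValueError("IDS ended before all operands were read")
    symbol = ids[pos]
    pos += 1
    children = []
    # A non-operator character has arity 0 and becomes a leaf.
    for _ in range(IDC_ARITY.get(symbol, 0)):
        child, pos = _parse(ids, pos)
        children.append(child)
    return Node(symbol, children), pos


def parse_ids(ids: str) -> Node:
    """Parse a complete prefix-notation IDS string into a tree."""
    node, pos = _parse(ids, 0)
    if pos != len(ids):
        raise ValueError("trailing characters after a complete IDS")
    return node


tree = parse_ids("⿱⿰木目心")   # 想 written as 相 over 心, with 相 expanded
print(tree.symbol, [child.symbol for child in tree.children])
```

Because IDS is plain prefix notation, a recursive descent like this is enough; malformed entries surface as `ValueError`, which could be one cheap way to flag lines for hand correction.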

Personally, I suggest not breaking the components up too finely or too deeply; stop at mid-level components. Many of these are most likely standalone components in origin (look them up at zdic.net or hanziyuan.net, for example), but during 隸變 (clericalization) and 楷化 (regularization into regular script) they were generalized and merged so that they now look like combinations of semantically unrelated sub-shapes. In other words, there is a good chance of fake orthogonality. For example, 它→宀匕, 宁→宀丁, 寅→宀?, but in fact none of them is really composed with 宀. Also, too many levels of composition degrade output quality.
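
One way to express this suggestion in code is a stop list of characters that are never split, plus a cap on recursion depth. The sketch below is only an illustration: the `table` mapping, the `ATOMIC` set, the default depth of 2 and the function name `decompose` are all hypothetical and are not taken from this project.

```python
# Characters treated as atomic even though they look like 宀 plus something.
ATOMIC = {"它", "宁", "寅"}


def decompose(char, table, atomic=ATOMIC, max_depth=2, depth=0):
    """Expand `char` using `table`, stopping at atomic characters,
    at characters with no table entry, or once `max_depth` is reached.

    `table` maps a character to a tuple (operator, part, part, ...).
    Returns either the character itself or a nested tuple.
    """
    if char in atomic or depth >= max_depth or char not in table:
        return char
    operator, *parts = table[char]
    return (operator, *[
        decompose(part, table, atomic, max_depth, depth + 1)
        for part in parts
    ])


# Hypothetical decomposition table for the example.
table = {
    "想": ("⿱", "相", "心"),
    "相": ("⿰", "木", "目"),
    "它": ("⿱", "宀", "匕"),
}

print(decompose("想", table))               # expanded two levels deep
print(decompose("想", table, max_depth=1))  # stops at the mid-level 相
print(decompose("它", table))               # kept whole by the stop list
```

The stop list handles the fake-orthogonality cases explicitly, while the depth cap addresses the separate concern that deep composition hurts output quality.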
