A project of mine needs word segmentation (not pinyin) #131

Closed
wintsa123 opened this issue Oct 8, 2023 · 1 comment
Labels
question Further information is requested

Comments

wintsa123 commented Oct 8, 2023

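I'd like to ask how pinyin pro implements word segmentation, that is, how it segments and annotates the text so that polyphonic characters don't get the wrong reading.
A project of mine happens to need word segmentation.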

zh-lx added the question label Oct 8, 2023
Owner

zh-lx commented Oct 8, 2023

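First of all, word segmentation needs a dictionary of words. Based on that dictionary, I initialize an Aho-Corasick (AC) automaton. In terms of performance, the AC automaton should be a relatively simple and efficient multi-pattern matching algorithm. If implementing it feels too difficult, you can also just iterate over the dictionary from the beginning and match each word against the text, though efficiency and accuracy will be somewhat lower.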
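
To make the idea above concrete, here is a minimal Python sketch of an Aho-Corasick automaton built from a small word list. It is not the library's actual code: the class name `AhoCorasick`, the method `find_all`, and the three-word dictionary are all made up for illustration. It only reports every dictionary word found in the text with its start index; choosing among overlapping matches (for example, preferring the longest word) would be a separate step that is not shown here.

```python
from collections import deque


class AhoCorasick:
    """Hypothetical minimal Aho-Corasick automaton over a word list."""

    def __init__(self, words):
        # Parallel arrays indexed by node id; node 0 is the root.
        self.children = [{}]   # char -> child node id
        self.fail = [0]        # failure link of each node
        self.output = [[]]     # dictionary words that end at each node
        for word in words:
            self._insert(word)
        self._build_failure_links()

    def _insert(self, word):
        node = 0
        for ch in word:
            if ch not in self.children[node]:
                self.children.append({})
                self.fail.append(0)
                self.output.append([])
                self.children[node][ch] = len(self.children) - 1
            node = self.children[node][ch]
        self.output[node].append(word)

    def _build_failure_links(self):
        # Breadth-first traversal; depth-1 nodes keep the root as failure link.
        queue = deque(self.children[0].values())
        while queue:
            node = queue.popleft()
            for ch, child in self.children[node].items():
                f = self.fail[node]
                while f and ch not in self.children[f]:
                    f = self.fail[f]
                self.fail[child] = self.children[f].get(ch, 0)
                # Inherit words that end at the failure target (proper suffixes).
                self.output[child] = self.output[child] + self.output[self.fail[child]]
                queue.append(child)

    def find_all(self, text):
        """Return (start_index, word) for every dictionary word found in text."""
        matches = []
        node = 0
        for i, ch in enumerate(text):
            while node and ch not in self.children[node]:
                node = self.fail[node]
            node = self.children[node].get(ch, 0)
            for word in self.output[node]:
                matches.append((i - len(word) + 1, word))
        return matches


# Illustrative dictionary only; a real one would hold many thousands of words.
ac = AhoCorasick(["重庆", "银行", "行长"])
print(ac.find_all("我在重庆银行"))  # [(2, '重庆'), (4, '银行')]
```

The text is scanned once no matter how many words the dictionary contains, whereas the simpler fallback of checking each dictionary word against the text separately gets slower as the dictionary grows.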

zh-lx closed this as completed Oct 14, 2023