Problems with handling polyphonic words #65

Closed
PrinOrange opened this issue Jun 23, 2022 · 5 comments · Fixed by #70
Labels
bug Something isn't working enhancement New feature or request

Comments

@PrinOrange

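Thanks to the project's contributors for their work! I think a few of this project's features could be improved.
The project has some small bugs in how it handles polyphonic words, and according to the documentation the multiple option cannot solve this problem.
For example: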

pinyin("朝阳");
// outputs "zhāo cháo yáng"

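Here the character "朝" is automatically detected as having two readings.
Even with the multiple option set to false, the output is still zhāo cháo yáng: three pinyin syllables on one line for a two-character word.
The word "朝阳" has two correct readings: zhao yang and chao yang are both valid. The same problem may occur with other polyphonic words.
I think all the readings of a polyphonic word could be returned together in an array.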
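To make the array proposal concrete, here is a minimal sketch of what such a return value might look like. The table PHRASE_READINGS and the function allReadings are hypothetical names invented for illustration; they are not part of the library's API, and the entries are just the examples mentioned in this issue.

```javascript
// Hypothetical phrase table: each word maps to every accepted reading.
// The entries come from the examples in this issue, not from the library's data.
const PHRASE_READINGS = new Map([
  ["朝阳", ["zhao yang", "chao yang"]],
  ["大夫", ["da fu", "dai fu"]],
]);

// Hypothetical helper: returns a copy of all known readings of a word,
// or an empty array when the word is not in the table.
function allReadings(word) {
  return [...(PHRASE_READINGS.get(word) ?? [])];
}

console.log(allReadings("朝阳")); // [ 'zhao yang', 'chao yang' ]
```

Returning whole-word readings, rather than a flat list of per-character readings, would avoid ambiguous output such as zhāo cháo yáng.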

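The second problem is that the project's dictionary of polyphonic words seems to be incomplete. For example, "增长" has two readings, zeng zhang and zeng chang, but the program only returns zeng zhang. Similarly, "大夫" can be read da fu (an official title in ancient China) or dai fu (a term for a doctor), but the project only returns dai fu.
I think this could be solved by updating the data from existing authoritative pinyin databases.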

@PrinOrange
Author

There is another pronunciation bug: for "假发" the program returns jiǎ fā, where the tone of "发" is wrong.

@PrinOrange
Author

I found some data projects that could serve as references:
Chinese phrase pinyin data: https://github.com/mozillazg/phrase-pinyin-data
Chinese character pinyin data: https://github.com/mozillazg/pinyin-data
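As a rough sketch of how phrase data like this might be folded into a dictionary, the snippet below parses text where each line has the form "word: reading" and "#" starts a comment. That line format, the rule that repeated words accumulate readings, and the name parsePhraseLines are all assumptions made for illustration; check the actual files in those repositories before relying on them. The sample input is hand-written and not copied from either project.

```javascript
// Parses text where each line looks like "word: reading" (an assumed format).
// Repeated words accumulate their distinct readings in insertion order.
function parsePhraseLines(text) {
  const table = new Map();
  for (const rawLine of text.split("\n")) {
    const line = rawLine.split("#")[0].trim(); // drop comments and whitespace
    if (!line) continue;
    const sep = line.indexOf(":");
    if (sep === -1) continue; // skip lines without a separator
    const word = line.slice(0, sep).trim();
    const reading = line.slice(sep + 1).trim();
    if (!word || !reading) continue;
    const readings = table.get(word) ?? [];
    if (!readings.includes(reading)) readings.push(reading);
    table.set(word, readings);
  }
  return table;
}

// Hand-written sample input for illustration only.
const sample = [
  "# illustrative sample, not copied from the real data files",
  "大夫: dài fu",
  "大夫: dà fū",
  "假发: jiǎ fà",
].join("\n");

const table = parsePhraseLines(sample);
console.log(table.get("大夫")); // [ 'dài fu', 'dà fū' ]
```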

@zh-lx
Owner

zh-lx commented Jun 24, 2022

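Thanks for your feedback. The problem with "朝阳" is caused by an incorrect pinyin entry in my current dictionary; I will fix it in the next couple of days.
For words like "大夫" that have multiple readings, I have not yet come up with a good output format.
I will keep improving polyphonic-word support going forward ❤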

@zh-lx zh-lx added bug Something isn't working enhancement New feature or request labels Jun 24, 2022
@duanxingyu

"负责人" gets converted to fu_zhai_ren, but the correct result is fu_ze_ren. This is also a polyphonic-word problem.
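One common way to handle cases like this is to look up the longest matching phrase first and fall back to a per-character default only when no phrase matches. The sketch below illustrates that idea under stated assumptions: PHRASES, CHAR_DEFAULT, and toPinyin are hypothetical names, the data is made up to mirror the report (with "责" defaulting to zhai), and none of it is claimed to be how the library actually works.

```javascript
// Hypothetical data: whole-phrase entries plus per-character fallback readings.
// "责" deliberately defaults to "zhai" to mirror the behavior reported above.
const PHRASES = new Map([["负责人", ["fu", "ze", "ren"]]]);
const CHAR_DEFAULT = new Map([
  ["负", "fu"],
  ["责", "zhai"],
  ["人", "ren"],
]);

// Greedy longest-match: try the longest phrase starting at position i,
// and use the per-character default only if no phrase matches.
function toPinyin(text, maxPhraseLen = 4) {
  const chars = Array.from(text);
  const out = [];
  let i = 0;
  while (i < chars.length) {
    let matched = false;
    for (let len = Math.min(maxPhraseLen, chars.length - i); len >= 2; len--) {
      const reading = PHRASES.get(chars.slice(i, i + len).join(""));
      if (reading) {
        out.push(...reading);
        i += len;
        matched = true;
        break;
      }
    }
    if (!matched) {
      // Unknown characters pass through unchanged.
      out.push(CHAR_DEFAULT.get(chars[i]) ?? chars[i]);
      i += 1;
    }
  }
  return out.join("_");
}

console.log(toPinyin("负责人")); // fu_ze_ren
console.log(toPinyin("责人")); // zhai_ren (no phrase entry, so defaults apply)
```

With this approach, fixing a word like "负责人" only requires adding a phrase entry; the per-character defaults stay untouched.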

@zh-lx
Owner

zh-lx commented Jun 27, 2022 via email

@zh-lx zh-lx linked a pull request Jul 17, 2022 that will close this issue
@zh-lx zh-lx closed this as completed Jul 17, 2022
3 participants