Some pinyin spellings are incorrect #112

Closed
hyu9999 opened this Issue Dec 20, 2017 · 4 comments

hyu9999 commented Dec 20, 2017

Environment

  • OS (Linux/macOS/Windows): Windows
  • Python version: 3.5
  • pypinyin version: 0.28.0

Problem description

Some pinyin spellings are incorrect.

Steps to reproduce

pinyin('为什么')-->[['wéi'], ['shèn'], ['mǒ']]; this should be [['wèi'], ['shén'], ['me']]
pinyin('什么')-->[['shén'], ['me']]

Owner

mozillazg commented Dec 20, 2017

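@hyu9999 Thanks for the feedback. I'll find time to clean up the dictionary data later.
If this is urgent, you can override the default pinyin with a custom dictionary: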

In [1]: from pypinyin import pinyin, load_phrases_dict

In [2]: pinyin('为什么')
Out[2]: [['wéi'], ['shèn'], ['']]

In [3]: load_phrases_dict({'为什么': [['wèi'], ['shén'], ['me']]})

In [4]: pinyin('为什么')
Out[4]: [['wèi'], ['shén'], ['me']]
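
The same workaround can be applied from a script rather than an interactive session. The following is only a minimal sketch: `build_phrase_overrides` and `install_overrides` are hypothetical helper names, not part of pypinyin, and the space-separated reading format is an assumption made for readability. The nested-list shape passed to `load_phrases_dict` is the one shown in the session above.

```python
def build_phrase_overrides(compact):
    """Convert {'为什么': 'wèi shén me'} into {'为什么': [['wèi'], ['shén'], ['me']]}."""
    overrides = {}
    for phrase, reading in compact.items():
        syllables = reading.split()
        # One syllable per character; a mismatch is most likely a typo.
        if len(syllables) != len(phrase):
            raise ValueError(
                "{!r} has {} characters but {} syllables".format(
                    phrase, len(phrase), len(syllables)))
        overrides[phrase] = [[syllable] for syllable in syllables]
    return overrides


def install_overrides(compact):
    """Register the corrected readings with pypinyin."""
    # Imported lazily so build_phrase_overrides works without pypinyin installed.
    from pypinyin import load_phrases_dict
    load_phrases_dict(build_phrase_overrides(compact))


# Corrected reading reported in this issue; more phrases can be added here.
CORRECTIONS = {
    '为什么': 'wèi shén me',
}
```

Calling `install_overrides(CORRECTIONS)` once at startup, before any call to `pinyin`, should have the same effect as the `load_phrases_dict` call in the session above.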

liangqi commented May 21, 2018

Environment

OS (Linux/macOS/Windows): macOS 10.13.4
Python version: 3.6.4
pypinyin version: 0.30.1

Steps to reproduce

pinyin('步履蹒跚')
[['bù'], ['lǚ'], ['mán'], ['shān']]
pinyin('超市')
[['cháo'], ['shì']]

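Suggestions

蹒 and 超 are not polyphonic characters; see http://www.zdic.net/z/25/js/8E52.htm and http://www.zdic.net/z/25/js/8D85.htm

Please consider publishing details of the dictionary data so that everyone can review and correct it. Thanks.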

liangqi commented May 21, 2018

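The dictionary data appears to be at https://github.com/mozillazg/phrase-pinyin-data/blob/2c39053d2de48a2f53218f9a012ca085fc9ff6ef/pinyin.txt

There is no entry for 蹒.
The entry for 超市 is “超市: cháo shì”; the tones for 超 in the other entries appear to be correct.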
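
To review every entry that contains a given character, something like the following could be used. It is a rough sketch that assumes each data line has the form `phrase: syllable syllable ...`, as in the entry quoted above, and that `#` starts a comment; `find_entries` is a hypothetical helper, not part of pypinyin or phrase-pinyin-data.

```python
def find_entries(lines, char):
    """Yield (phrase, syllables) for each entry whose phrase contains char."""
    for raw in lines:
        # Assumption: '#' introduces a comment; drop it and surrounding whitespace.
        line = raw.split('#', 1)[0].strip()
        if ':' not in line:
            continue  # blank line, comment-only line, or unexpected format
        phrase, reading = line.split(':', 1)
        phrase = phrase.strip()
        if char in phrase:
            yield phrase, reading.split()


# The only data line here is the entry quoted in this comment.
sample = ['# sample', '超市: cháo shì', '']
matches = list(find_entries(sample, '超'))
```

With a local copy of the file, `find_entries(open('pinyin.txt', encoding='utf-8'), '超')` would list the entries to check by hand.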

zgdlime commented Sep 1, 2018

@hyu9999
How can python-pinyin process input in batches?

Environment
OS: Windows 10
Python version: python-3.4.3
pypinyin version: v0.33.0

I have a UTF-8 text file, b.txt, with the following contents:
这个
进行
因为
还是
时候
看到
……
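I would like to convert these to Hanyu Pinyin. How should I do that?

Can this be done in one step, for example by batch processing or by dragging and dropping files?
Any pointers would be appreciated. Thanks!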
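
One possible approach to the batch conversion asked about here, as a minimal sketch rather than an official recipe: read the file line by line, pass each word to `pinyin`, and write the results to a second file. `convert_lines` and `convert_file` are hypothetical helpers, and the output file name and tab-separated layout are assumptions. The code avoids f-strings so that it also runs on Python 3.4.

```python
def convert_lines(lines, to_pinyin):
    """Return one 'word<TAB>pinyin' string per non-empty input line.

    to_pinyin is any callable returning a list of candidate lists,
    the shape pypinyin.pinyin returns, e.g. [['shén'], ['me']].
    """
    results = []
    for raw in lines:
        word = raw.strip()
        if not word:
            continue  # skip blank lines
        # Keep the first candidate reading for each character.
        syllables = [candidates[0] for candidates in to_pinyin(word)]
        results.append(word + '\t' + ' '.join(syllables))
    return results


def convert_file(src_path, dst_path):
    """Convert every line of a UTF-8 text file and write the results."""
    # Imported lazily so convert_lines can be tried without pypinyin installed.
    from pypinyin import pinyin
    # 'utf-8-sig' reads UTF-8 with or without a byte-order mark.
    with open(src_path, encoding='utf-8-sig') as src:
        converted = convert_lines(src, pinyin)
    with open(dst_path, 'w', encoding='utf-8') as dst:
        dst.write('\n'.join(converted) + '\n')
```

For the file described above, `convert_file('b.txt', 'b_pinyin.txt')` would write one converted line per input word; any incorrect readings would still need the custom-dictionary override mentioned earlier in the thread.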
