Dictionary contributions #666

Open
iDvel opened this issue Feb 5, 2024 · 110 comments
Labels
dict 词库相关

Comments

@iDvel
Owner

iDvel commented Feb 5, 2024

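The dictionary has been checked by scripts and extensively proofread by hand, but omissions are inevitable.
If you find missing words, wrong pronunciations, wrong characters, or an unreasonable initial ordering, open a PR directly or leave a comment here.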

@iDvel iDvel added the dict 词库相关 label Feb 5, 2024
@iDvel iDvel mentioned this issue Feb 5, 2024
@iDvel iDvel pinned this issue Feb 5, 2024
@iDvel

This comment was marked as resolved.

iDvel added a commit that referenced this issue Feb 5, 2024
@iDvel

This comment was marked as resolved.

iDvel added a commit that referenced this issue Feb 10, 2024
@iDvel

This comment was marked as resolved.

@tao659

tao659 commented Feb 13, 2024

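The two forms mainly in use are 蔄山 and 苘山 (the latter is also read man shan). Some people now write 𬜬山 as well, but rarely; it is mostly the first two. Official documents generally use 蔄山, while informal writing mostly uses 苘山.

Perhaps only locals read 苘山 as man shan.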

@iDvel
Owner Author

iDvel commented Feb 13, 2024

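Let's just include both 「𬜬山」 and 「蔄山」.
「苘 qing」 is probably a miswriting; Baidu Baike cannot be trusted at all.

Chinese characters are a real mess when it comes to this kind of standardization.
Analogical simplification turned 「蔄 man」 into 「𬜬 man」. More than ten years later, dictionaries still list 「𬜬」, while locals, including the local government, still use 「蔄」.
The standard was issued but nobody uses it, and it was never revised to follow local usage, so the result is mixed usage and neglect, with nobody in charge.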

iDvel added a commit that referenced this issue Feb 13, 2024
iDvel added a commit that referenced this issue Feb 14, 2024
@boomker
Contributor

boomker commented Feb 17, 2024

I compiled some entries with wrong pronunciations and put them in the attachment:
rime-ice_zhuyin-err.txt

iDvel pushed a commit that referenced this issue Feb 18, 2024
hegotit pushed a commit to hegotit/rime-ice that referenced this issue Feb 19, 2024
@chenbihao

This comment was marked as off-topic.

@iDvel
Owner Author

iDvel commented Feb 20, 2024

唵嘛呢叭咪 唵嘛呢嘛呢叭咪吽

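唵 does not have the reading ong, and Chinese has no characters with the syllable ong. In many TV dramas I have heard it read as an. Alternatively, treat it as a loan sound and annotate it as wong or similar, or put it directly in the English dictionary or the mixed Chinese-English dictionary.

(Annotating it as ong causes an error when compiling the dictionary pack: because the syllable is missing, the entry is dropped.)

Let's annotate 「唵嘛呢叭咪吽」 with the dictionary readings: an ma ni ba mi hong http://www.jiaodui.com/bbs/read.php?tid=10782
For now there is also a simple way: type 「六字真言」 or 「六字大明咒」 and output it via emoji.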

@iDvel

This comment was marked as off-topic.

iDvel added a commit that referenced this issue Feb 20, 2024
hegotit pushed a commit to hegotit/rime-ice that referenced this issue Feb 21, 2024
@mavsill

This comment was marked as off-topic.

@iDvel

This comment was marked as off-topic.

@mavsill

This comment was marked as off-topic.

@iDvel

This comment was marked as off-topic.

@mavsill

This comment was marked as off-topic.

@iDvel
Owner Author

iDvel commented Feb 23, 2024

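Let's just leave it as is. I tried it: the size is the same, and the speed does not seem much different.
I also moved many polyphonic characters whose readings share one meaning, such as 「熟、血」, into the tencent dictionary and let Rime annotate them automatically.
When I add words day to day I also put them in tencent; no pronunciation has to be written, which is a bit more convenient.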

@gaboolic
Contributor

#703

The dictionary contains many “犭更犬” entries; should they be changed to “㹴犬”?

luckmoon pushed a commit to luckmoon/rime-ice that referenced this issue Feb 28, 2024
@tansongchen

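While trying to use rime-ice (雾凇拼音) to develop other input schemas, I found that in some phrases the annotated reading of a character is not among that character's standalone readings: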

dropping entry '陈寅恪' with invalid syllable: que
dropping entry '放饭流歠' with invalid syllable: chu
dropping entry '解州' with invalid syllable: hai
dropping entry '解州关帝庙' with invalid syllable: hai
dropping entry '解州镇' with invalid syllable: hai
dropping entry '亠部' with invalid syllable: jiong
dropping entry '擖哧' with invalid syllable: ka
dropping entry '肋脦' with invalid syllable: de
dropping entry '艋舺' with invalid syllable: jia
dropping entry '将进酒' with invalid syllable: qiang
dropping entry '青玉案' with invalid syllable: wan
dropping entry '青玉案元夕' with invalid syllable: wan
dropping entry '通什镇' with invalid syllable: za
dropping entry '菶菶萋萋' with invalid syllable: yong
dropping entry '鲗鱼涌' with invalid syllable: ze
dropping entry '槁项黄馘' with invalid syllable: xu
dropping entry '黄馘槁项' with invalid syllable: xu
dropping entry '尨眉皓发' with invalid syllable: rong
dropping entry '泥而不滓' with invalid syllable: nie

Does this need to be fixed, i.e. should every reading used in a phrase be guaranteed to also appear under the single character?
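The consistency rule asked about here could be checked with something like the following. This is only a minimal sketch: it assumes entries are tab-separated `text<TAB>code[<TAB>weight]` lines with space-separated syllables, the function names `parse_entries` and `find_uncovered_readings` are hypothetical, and the sample single-character readings are illustrative data, not taken from the actual dictionary files.

```python
def parse_entries(lines):
    """Yield (text, syllables) from 'text<TAB>code[<TAB>weight]' lines.

    Blank lines, '#' comments, and lines without a tab are skipped.
    """
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 2:
            continue
        yield fields[0], fields[1].split()


def find_uncovered_readings(char_lines, phrase_lines):
    """Return (phrase, char, syllable) triples whose syllable is not
    listed among that character's single-character readings."""
    readings = {}
    for text, syllables in parse_entries(char_lines):
        if len(text) == 1 and len(syllables) == 1:
            readings.setdefault(text, set()).add(syllables[0])

    uncovered = []
    for text, syllables in parse_entries(phrase_lines):
        # Skip entries where characters and syllables do not line up 1:1.
        if len(text) != len(syllables):
            continue
        for char, syllable in zip(text, syllables):
            if syllable not in readings.get(char, set()):
                uncovered.append((text, char, syllable))
    return uncovered


if __name__ == "__main__":
    # Illustrative sample data (assumed, not copied from the dictionary).
    chars = ["陈\tchen", "寅\tyin", "恪\tke", "解\tjie", "解\txie", "州\tzhou"]
    phrases = ["陈寅恪\tchen yin que", "解州\thai zhou"]
    for text, char, syllable in find_uncovered_readings(chars, phrases):
        print(f"{text}: {char} annotated as {syllable}, not in its single-character readings")
```

Running a check like this before building would surface the same entries as the `dropping entry` messages above, without needing a full dictionary build.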

@tansongchen

These tencent entries have no matching pinyin, so they cannot be annotated:

E20240301 15:00:13.292201 232383 entry_collector.cc:135] Encode failure: '李到𬀪'.
E20240301 15:00:14.678122 232383 entry_collector.cc:135] Encode failure: '薄护尾𬶏'.
E20240301 15:00:14.679606 232383 entry_collector.cc:135] Encode failure: '薄身罗马诺𬶋'.
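
To collect the affected entries from a longer build log, a small filter may help. This is a rough sketch that assumes the log lines keep the `Encode failure: '...'.` shape shown above; the function name `encode_failures` is hypothetical.

```python
import re

# Matches the tail of a log line such as:
#   ... entry_collector.cc:135] Encode failure: '李到𬀪'.
ENCODE_FAILURE = re.compile(r"Encode failure: '(.+)'\.?\s*$")


def encode_failures(log_lines):
    """Return the entries reported as encode failures, in first-seen order."""
    terms = []
    for line in log_lines:
        match = ENCODE_FAILURE.search(line)
        if match and match.group(1) not in terms:
            terms.append(match.group(1))
    return terms
```

The resulting list can then be reviewed by hand, for example to add explicit pronunciations for characters that have no single-character reading.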

@leenux9527

This comment was marked as resolved.

@iDvel
Owner Author

iDvel commented Apr 18, 2024

Words to add: 二百次 利箭 示波 图纹 授职 片区 千颗 亿倍 冰碴 松明子 问清 原由 千艘 叠被 监听站 八万个 高能态

「原由」→「缘由」

iDvel added a commit that referenced this issue Apr 18, 2024
hegotit pushed a commit to hegotit/rime-ice that referenced this issue Apr 19, 2024
@shionryuu

This comment was marked as resolved.

@gaboolic

This comment was marked as resolved.

@jslcslx

This comment was marked as resolved.

@gaboolic

This comment was marked as resolved.

@uliuyt

This comment was marked as resolved.

@jslcslx

This comment was marked as resolved.

@hegotit

This comment was marked as resolved.

@gaboolic

This comment was marked as resolved.

@hegotit

This comment was marked as resolved.

mirtlecn added a commit that referenced this issue Apr 28, 2024
expoli pushed a commit to expoli/rime-ice that referenced this issue Apr 29, 2024
@kip05

This comment was marked as resolved.

expoli pushed a commit to expoli/rime-ice that referenced this issue Apr 30, 2024
@Chengxcy

This comment was marked as resolved.

mirtlecn added a commit that referenced this issue May 1, 2024
@gaboolic
Contributor

gaboolic commented May 1, 2024

Word-frequency issue: for zhongzhuan, the current order is 中专, 中转. It would be better for 中转 to rank above 中专.
daoqile yields “到起了”; the expected result is “到期了”.

@Chengxcy

Chengxcy commented May 1, 2024

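Word-frequency issue: for zhongzhuan, the current order is 中专, 中转. It would be better for 中转 to rank above 中专.
daoqile yields “到起了”; the expected result is “到期了”.

Right, “到期了” makes sense. Is “到起了” even a real usage? “到齐了” certainly is.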

@mirtlecn
Collaborator

mirtlecn commented May 1, 2024

Honestly, I do not understand where some of the very odd words in the dictionary came from either.

@iDvel
Owner Author

iDvel commented May 1, 2024

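Words like this are not in the dictionary. Because there was no dao qi le entry, and because 起了 had been typed a few times, Rime automatically composed 到 / 起了.
Let's add such entries individually as they are found.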

iDvel added a commit that referenced this issue May 1, 2024
@hoofcushion
Contributor

hoofcushion commented May 2, 2024

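These terms all refer to the configuration language built into the text editor Vim.

VimL (short for Vim Language)
Vim Script (used by the GitHub website; proper-noun form)
Vim script (used by Bram Moolenaar himself)
Vim scripts (same as above, plural form)
Vimscript (same as above, written as one word)
Vimscripts (same as above, plural form)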

@Lion176

Lion176 commented May 3, 2024

Word to add to cn_en.txt:
A+ Ajia

@Lanlan-Cat

PixPin_2024-05-03_17-35-58

The third arrow is the single arrow ↖, and here it displays the same as the emoji arrow. Is something wrong?

@kip05

kip05 commented May 3, 2024

The third arrow is the single arrow ↖, and here it displays the same as the emoji arrow. Is something wrong?

They are different symbols.

image

expoli pushed a commit to expoli/rime-ice that referenced this issue May 5, 2024
expoli pushed a commit to expoli/rime-ice that referenced this issue May 5, 2024