How to expand the vocabulary #5

Closed
gaochao19860203 opened this issue Oct 26, 2017 · 4 comments

Comments

@gaochao19860203

Hello, I'd like to ask how to expand the vocabulary. Is there a step-by-step guide?

@gaochao19860203
Author

Is this the method to use, add_word_to_vocab?
def add_word_to_vocab(word, nearby, nearby_score)
How is nearby_score computed?

@hailiang-wang
Member

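To expand the corpus, see https://github.com/Samurais/wikidata-corpus
1) Train w2v word vectors
2) Get the set of near-synonyms and their scores for the whole vocabulary
3) Generate the pkl file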
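
For readers who want to see the shape of steps 2) and 3), here is a minimal sketch using only the Python standard library. It is not the code Synonyms uses. It assumes the word vectors from step 1) are already available as a plain dict, that the score is the cosine similarity between two vectors, and that the table maps each word to a (nearby, nearby_score) pair of lists. The names `build_nearby_table`, `dump_table` and `TOY_VECTORS` are made up for illustration, and the real pkl layout has to match whatever the loading code in the Synonyms repository expects.

```python
import math
import pickle


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def build_nearby_table(vectors, topn=10):
    """Map each word to (nearby_words, nearby_scores), best match first.

    Brute force over all pairs, so only suitable for small vocabularies.
    """
    table = {}
    for word, vec in vectors.items():
        scored = [
            (other, cosine(vec, other_vec))
            for other, other_vec in vectors.items()
            if other != word
        ]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        top = scored[:topn]
        table[word] = ([w for w, _ in top], [s for _, s in top])
    return table


def dump_table(table, path):
    """Write the table to a pkl file (hypothetical layout, see note above)."""
    with open(path, "wb") as f:
        pickle.dump(table, f, protocol=pickle.HIGHEST_PROTOCOL)


# Toy vectors standing in for trained w2v output (assumption for the demo).
TOY_VECTORS = {
    "happy": [0.9, 0.1, 0.0],
    "glad": [0.8, 0.2, 0.1],
    "car": [0.0, 0.1, 0.9],
}

if __name__ == "__main__":
    table = build_nearby_table(TOY_VECTORS, topn=2)
    nearby, nearby_score = table["happy"]
    print(nearby, nearby_score)
```

With a real vocabulary the all-pairs loop is too slow, so an approximate nearest-neighbour index or the similarity query of the word2vec toolkit would normally replace it.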

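Please refer to the code in synonyms/__init__.py at https://github.com/huyingxi/Synonyms.
Part of the code currently used to expand the vocabulary modifies the word2vec source, and that part will not be open-sourced.

If you have a very large corpus, I can do the expansion work, provided the result is contributed back to synonyms once it is done.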

@gaochao19860203
Author

One more question: if I only want to add a single pair of synonyms, how should I do that? Should I submit the pair to you?

@doudouaili

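I would like to add synonyms that are defined by the user, and I hope a way to add them can be provided. Is there any way to add a score between two synonyms?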
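
One possible workaround, sketched here purely as an assumption and not as a feature of Synonyms, is to keep user-defined pairs in a small override table on the application side and consult it before falling back to the library's score. `SynonymOverrides`, `add_pair` and `base_score` are hypothetical names. `base_score` can be any callable that takes two words and returns a float.

```python
class SynonymOverrides:
    """User-defined synonym scores that take priority over a fallback scorer."""

    def __init__(self, base_score=None):
        # base_score: optional callable (word_a, word_b) -> float used
        # when no user-defined score exists for the pair.
        self._pairs = {}
        self._base_score = base_score

    @staticmethod
    def _key(a, b):
        # frozenset makes the lookup symmetric: (a, b) and (b, a) match.
        return frozenset((a, b))

    def add_pair(self, a, b, score):
        """Register a user-defined score in [0, 1] for the pair (a, b)."""
        if not 0.0 <= score <= 1.0:
            raise ValueError("score must be between 0 and 1")
        self._pairs[self._key(a, b)] = float(score)

    def score(self, a, b):
        """Return the override if present, else the fallback score or None."""
        key = self._key(a, b)
        if key in self._pairs:
            return self._pairs[key]
        if self._base_score is None:
            return None
        return self._base_score(a, b)


if __name__ == "__main__":
    # The lambda is a placeholder for a real similarity function.
    overrides = SynonymOverrides(base_score=lambda a, b: 0.1)
    overrides.add_pair("laptop", "notebook", 0.95)
    print(overrides.score("notebook", "laptop"))
    print(overrides.score("laptop", "car"))
```

This leaves the shipped pkl file untouched, so the overrides survive a library upgrade, but they only affect code that goes through the wrapper.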


3 participants