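I'm currently using jieba word segmentation to extract keywords in my system, but many of the results are not what I want. I think that if I could generate my own word-frequency dictionary, keyword extraction might work better. So my question is: how should I train jieba to generate my own dictionary?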
@hainuo Which part do you want to train: the dictionary, the HMM model, or the IDF word frequencies?
Right now even the basic domain-specific terms are missing, so I suppose I should train the dictionary first.
On September 5, 2016, at 20:12, Fukuball Lin wrote: @hainuo Which part do you want to train: the dictionary, the HMM model, or the IDF word frequencies?
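@hainuo I don't quite understand your requirement. The way jieba currently works is to segment text with the dictionary first and then use an HMM to find new words. To some extent the dictionary determines jieba's accuracy, and producing the dictionary's word frequencies is, for jieba, the training process. That said, I don't currently have a corpus that could be used for training.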
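Since producing the dictionary's word frequencies is what "training" amounts to here, one possible approach is to count words in a corpus that has already been segmented correctly. The following is a minimal Python sketch under assumptions that are not part of this thread: the corpus is a tiny, hypothetical, hand-segmented list of documents, the function names are purely illustrative, and the output follows the plain-text `word freq` and `word idf` line layouts used by the Python jieba's dictionary and IDF files. It uses only the standard library and does not call jieba itself.

```python
import math
from collections import Counter


def count_word_freq(segmented_docs):
    """Count how often each word occurs across pre-segmented documents."""
    freq = Counter()
    for doc in segmented_docs:
        freq.update(doc)
    return freq


def compute_idf(segmented_docs):
    """Compute a plain IDF, log(N / df), for every word in the corpus."""
    n_docs = len(segmented_docs)
    doc_freq = Counter()
    for doc in segmented_docs:
        doc_freq.update(set(doc))  # count each word at most once per document
    return {word: math.log(n_docs / df) for word, df in doc_freq.items()}


def format_dict_lines(freq):
    """Render 'word freq' lines, most frequent word first."""
    return [f"{word} {count}" for word, count in freq.most_common()]


def format_idf_lines(idf):
    """Render 'word idf' lines, sorted by word for a stable layout."""
    return [f"{word} {value:.6f}" for word, value in sorted(idf.items())]


# Hypothetical, hand-segmented corpus (an assumption for illustration only).
corpus = [
    ["机器", "学习", "模型"],
    ["深度", "学习", "模型"],
    ["学习", "算法"],
]

freq = count_word_freq(corpus)
idf = compute_idf(corpus)

dict_lines = format_dict_lines(freq)  # candidate lines for a dictionary file
idf_lines = format_idf_lines(idf)     # candidate lines for an IDF file
```

The resulting lines can be written to UTF-8 text files. In the Python implementation of jieba, files in these layouts are loaded with `jieba.load_userdict(path)` and `jieba.analyse.set_idf_path(path)`; other ports may expect different formats or loading calls, so check the documentation of the one in use. The quality of the counts depends entirely on the corpus, which, as noted above, is the missing piece.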